Conditional Inference Trees in R Programming

Conditional inference trees are a non-parametric class of decision trees, also known as unbiased recursive partitioning. The method performs recursive partitioning for continuous, censored, ordered, nominal, and multivariate response variables in a conditional inference framework. In R, conditional inference trees are fitted with the ctree() function from the partykit package. In this article, let's learn about conditional inference trees, the syntax of ctree(), and its use with the help of examples.

Conditional Inference Trees

A conditional inference tree is a kind of decision tree that recursively partitions the data according to the strength of association between each covariate and the response variable. It is designed to avoid the variable-selection bias that affects many classification and regression tree algorithms, which tend to favor covariates with many possible split points. This also makes it less prone to overfitting. At each node, a significance test (a permutation test) is used to select the covariate to split on: a p-value is calculated for every covariate, and the test is repeated at every step of the recursion. The algorithm is less well suited to training data with many missing values.
Algorithm:

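  1. Test the global null hypothesis of independence between the input variables and the response variable. If it cannot be rejected, stop. Otherwise, select the input variable with the strongest association with the response, that is, the one with the lowest p-value.
  2. Perform a binary split on the selected input variable.
  3. Recursively repeat steps 1 and 2 on each resulting subset.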
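
The variable-selection step can be illustrated with a small base R sketch. This is a simplified illustration, not the partykit implementation: it assumes a numeric response and numeric covariates, uses the absolute correlation as the test statistic, approximates the permutation distribution by random shuffling, and applies a Bonferroni adjustment. The helper name perm_test, the number of permutations, and the 0.05 threshold are choices made for this example only. Rows with missing values are dropped here to keep the correlation simple.

```r
# Simplified sketch of step 1 (NOT how partykit computes it internally):
# permutation test of association between each covariate and the response.
set.seed(42)

# Hypothetical helper: returns the observed statistic and a Monte Carlo p-value
perm_test <- function(x, y, n_perm = 999) {
  observed <- abs(cor(x, y))
  # Shuffling y breaks any real association between x and y
  permuted <- replicate(n_perm, abs(cor(x, sample(y))))
  c(statistic = observed,
    p_value = (sum(permuted >= observed) + 1) / (n_perm + 1))
}

air <- na.omit(airquality)
covariates <- c("Solar.R", "Wind", "Temp", "Month", "Day")

# One row per covariate, columns: statistic, p_value
results <- t(sapply(covariates, function(v) perm_test(air[[v]], air$Ozone)))

# Bonferroni adjustment because several covariates are tested at once
p_adjusted <- pmin(results[, "p_value"] * length(covariates), 1)

# Lowest adjusted p-value wins; ties are broken by the larger statistic
alpha <- 0.05
best <- order(p_adjusted, -results[, "statistic"])[1]
split_variable <- if (p_adjusted[best] < alpha) covariates[best] else NA_character_

print(round(cbind(results, p_adjusted), 3))
print(split_variable)
```

If no adjusted p-value falls below the threshold, split_variable is NA, which corresponds to stopping the recursion at that node.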

How do Conditional Inference Trees differ from Decision Trees?

A conditional inference tree is a tree-based algorithm for classification and regression. It is similar to a traditional decision tree in that ctree() also partitions the data recursively. The main difference is how the split variable is chosen: conditional inference trees use a significance test to select the input variable, rather than selecting the variable that maximizes an information measure. For example, traditional decision trees often use the Gini impurity and pick the split that reduces it the most.
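
For contrast, here is a minimal base R sketch of the kind of information measure a traditional decision tree optimizes: the reduction in Gini impurity for one candidate split. The helper names gini and gini_gain are made up for this illustration, and the threshold 1.9 on Petal.Length is just one candidate split on the iris data.

```r
# Gini impurity of a vector of class labels
gini <- function(labels) {
  p <- table(labels) / length(labels)
  1 - sum(p^2)
}

# Reduction in Gini impurity when splitting x at a threshold
gini_gain <- function(x, y, threshold) {
  left  <- y[x <= threshold]
  right <- y[x > threshold]
  n <- length(y)
  gini(y) - (length(left) / n) * gini(left) - (length(right) / n) * gini(right)
}

gain <- gini_gain(iris$Petal.Length, iris$Species, 1.9)
print(gain)
```

A traditional tree would evaluate this gain for many candidate variables and thresholds and keep the best one, whereas ctree() first chooses the variable through a significance test and only then looks for a split point.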

Implementation in R

Syntax:
ctree(formula, data)

Parameters:
formula: represents formula on the basis of which model is to be fit
data: represents dataframe containing the variables in the model
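
Besides formula and data, ctree() also takes a control argument built with ctree_control(), which sets options such as the significance level (alpha) and the minimum number of observations in a terminal node (minbucket). The sketch below is only an illustration: the values 0.01 and 10 and the object name strictTree are arbitrary choices for this example.

```r
library(partykit)

# Fit a tree with a stricter significance level and larger terminal nodes
strictTree <- ctree(Species ~ Petal.Length + Petal.Width,
                    data = iris,
                    control = ctree_control(alpha = 0.01, minbucket = 10))

print(strictTree)

# Number of terminal nodes and depth of the fitted tree
width(strictTree)
depth(strictTree)
```

A smaller alpha makes the significance test harder to pass, so the tree tends to stop splitting earlier.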



Example 1:

In this example, let's use the regression approach of conditional inference trees on the airquality dataset, which ships with R in the datasets package. The fitted tree groups the observations into different ozone levels based on environmental conditions such as temperature and wind. This helps in understanding how the ozone value behaves under different environmental conditions.

Step 1: Installing the required packages.

# Install the required 
# Package for function
install.packages("partykit")

Step 2: Loading the required package.

# Load the library
library(partykit)

Step 3: Creating the regression model of the conditional inference tree.

air <- subset(airquality, !is.na(Ozone))
airConInfTree <- ctree(Ozone ~ ., 
                       data = air)

Step 4: Print regression model.

# Print model
print(airConInfTree)

Output:

Model formula:
Ozone ~ Solar.R + Wind + Temp + Month + Day

Fitted party:
[1] root
|   [2] Temp <= 82
|   |   [3] Wind <= 6.9: ...
|   |   [4] Wind > 6.9
|   |   |   [5] Temp <= 77: ...
|   |   |   [6] Temp > 77: 31.143 (n = 21, err = 4620.6)
|   [7] Temp > 82
|   |   [8] Wind <= 10.3: ...
|   |   [9] Wind > 10.3: 48.714 (n = 7, err = 1183.4)

Number of inner nodes:    4
Number of terminal nodes: 5

Step 5: Plotting the graph.



# Output to be present as PNG file
png(file = "conditionalRegression.png")
  
# Plotting graph
plot(airConInfTree)
  
# Save the file
dev.off()

Output:
[Image: conditionalRegression.png, the plotted regression tree]

Explanation:
Running the code above produces a plot of the conditional inference tree in which each terminal node shows the ozone values for the corresponding environmental conditions as a box plot. In the output image, Node 5 shows the lowest ozone values. Following the path to that node shows that ozone is lowest when Temp <= 77 and Wind > 6.9.
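
To check the reading of the plot numerically, the terminal node of each observation can be obtained with predict() and the ozone values averaged per node. This is a small sketch that refits the model from Step 3 so it can run on its own; the names node_id and node_means are introduced only for this example.

```r
library(partykit)

# Same model as in Step 3
air <- subset(airquality, !is.na(Ozone))
airConInfTree <- ctree(Ozone ~ ., data = air)

# Terminal node id for every training observation
node_id <- predict(airConInfTree, type = "node")

# Mean ozone per terminal node
node_means <- tapply(air$Ozone, node_id, mean)
print(round(node_means, 3))

# Node with the lowest mean ozone
print(names(node_means)[which.min(node_means)])

# Predicted ozone for a few rows, here the first three observations
predict(airConInfTree, newdata = air[1:3, ])
```

For a regression tree, the default prediction for an observation is the mean response of the terminal node it falls into.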

Example 2:

In this example, let's use the classification approach of conditional inference trees on the iris dataset, which ships with R in the datasets package. The fitted tree classifies the species of iris plants on the basis of petal length and petal width.

Step 1: Installing the required packages.

# Install the required 
# Package for function
install.packages("partykit")

Step 2: Loading the required package.

# Load the library
library(partykit)

Step 3: Creating the classification model of the conditional inference tree.

irisConInfTree <- ctree(Species ~ ., 
                        data = iris)

Step 4: Print classification model

# Print model
print(irisConInfTree)

Output:

Model formula:
Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width

Fitted party:
[1] root
|   [2] Petal.Length <= 1.9: ...
|   [3] Petal.Length > 1.9
|   |   [4] Petal.Width <= 1.7
|   |   |   [5] Petal.Length <= 4.8: ...
|   |   |   [6] Petal.Length > 4.8: versicolor (n = 8, err = 50.0%)
|   |   [7] Petal.Width > 1.7: virginica (n = 46, err = 2.2%)

Number of inner nodes:    3
Number of terminal nodes: 4

Step 5: Plotting the graph

# Output to be present as PNG file
png(file = "conditionalClassification.png",
    width = 1200, height = 400)
  
# Plotting graph
plot(irisConInfTree)
  
# Save the file
dev.off()

Output:
[Image: conditionalClassification.png, the plotted classification tree]

Explanation:
After executing the above code, the iris plants are classified into species based on petal length and petal width. As the graph shows, the setosa species has a petal length <= 1.9.
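
How well the tree separates the species can be checked by comparing its predictions with the true labels. The sketch below refits the model from Step 3 so that it runs on its own; the names predicted, confusion, and accuracy are introduced only for this example, and the accuracy is measured on the training data, so it is an optimistic estimate.

```r
library(partykit)

# Same model as in Step 3
irisConInfTree <- ctree(Species ~ ., data = iris)

# Predicted class for every training observation
predicted <- predict(irisConInfTree)

# Confusion matrix: rows are predictions, columns are true species
confusion <- table(Predicted = predicted, Actual = iris$Species)
print(confusion)

# Share of correctly classified observations (training data only)
accuracy <- mean(predicted == iris$Species)
print(accuracy)
```

Calling predict() with type = "prob" instead returns the class proportions of the terminal node for each observation.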



