
CART (Classification And Regression Tree) in Machine Learning

CART (Classification And Regression Trees) is a variation of the decision tree algorithm that can handle both classification and regression tasks. Scikit-Learn uses the CART algorithm to train decision trees (also called “growing” trees). CART was introduced by Leo Breiman, Jerome Friedman, Richard Olshen, and Charles Stone in 1984.

CART (Classification And Regression Tree) for Decision Tree

CART is a predictive algorithm used in machine learning that explains how the values of a target variable can be predicted from other variables. It is a decision tree in which each fork is a split on a predictor variable and each leaf node holds a prediction for the target variable.



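The term CART serves as a generic term for two categories of decision trees: classification trees, where the target variable is categorical, and regression trees, where the target variable is continuous.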

In a decision tree, nodes are split into sub-nodes based on a threshold value of an attribute. The root node holds the whole training set and is split into two by choosing the best attribute and threshold value. The resulting subsets are then split using the same logic. This continues until every subset is pure or the tree reaches a stopping limit such as the maximum number of leaves.
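As a minimal sketch of threshold-based splitting, the snippet below fits scikit-learn's DecisionTreeClassifier on a made-up one-feature dataset and prints the learned rules with export_text. The data and the feature name "x" are assumptions chosen only for illustration.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy data (made up for illustration): one numeric feature, two classes
X = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
y = [0, 0, 0, 1, 1, 1]

tree = DecisionTreeClassifier(criterion="gini", random_state=0)
tree.fit(X, y)

# Each internal node in the printout tests "feature <= threshold"
print(export_text(tree, feature_names=["x"]))
```

Because the two classes are cleanly separated, a single split is enough here and both resulting subsets are pure, so the tree stops growing.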



CART Algorithm

Classification and Regression Trees (CART) is a decision tree algorithm that is used for both classification and regression tasks. It is a supervised learning algorithm that learns from labelled data to predict unseen data.

How does the CART algorithm work?

The CART algorithm builds the tree by repeatedly choosing a split, partitioning the data, and stopping once a stopping rule is met.

For classification, CART uses Gini impurity to choose each split. It searches for the split that makes the sub-nodes as homogeneous as possible according to the Gini index criterion.

Gini index/Gini impurity

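The Gini index is the metric CART uses for classification tasks. It is based on the sum of squared probabilities of each class, and it measures the probability that a randomly chosen element would be misclassified if it were labelled at random according to the class distribution in the node. It is a variation of the Gini coefficient. It works on categorical target variables, treats each outcome as either “success” or “failure”, and hence performs binary splits only.

The Gini index varies from 0 to 1. A value of 0 means all elements in the node belong to a single class, and larger values mean the elements are spread more evenly across classes. For two classes the maximum is 0.5; values close to 1 are approached only when elements are spread across many classes.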

Mathematically, we can write Gini Impurity as follows:

$$Gini = 1 - \sum_{i=1}^{n} p_i^2$$

where $p_i$ is the probability of an object being classified to a particular class $i$, and $n$ is the number of classes.
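Gini impurity is one minus the sum of squared class probabilities, which can be computed directly from the labels in a node. The helper name gini_impurity below is hypothetical, and the snippet is a small sketch rather than how any particular library implements it.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 - sum(p_i^2), where p_i is the share of class i in labels."""
    n = len(labels)
    if n == 0:
        return 0.0  # convention for an empty node (an assumption of this sketch)
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


pure_node = gini_impurity(["yes", "yes", "yes", "yes"])   # single class
mixed_node = gini_impurity(["yes", "yes", "no", "no"])    # two classes, evenly mixed
print(pure_node, mixed_node)
```

A pure node scores 0, and an evenly mixed two-class node scores 0.5, the maximum for two classes.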

CART for Classification

A classification tree is an algorithm where the target variable is categorical. The algorithm is used to identify the “class” within which the target variable is most likely to fall. Classification trees are used when the dataset needs to be split into classes that belong to the response variable (like yes or no).

CART for classification is a decision tree learning algorithm that creates a tree-like structure to predict class labels. The tree consists of nodes, which represent different decision points, and branches, which represent the possible results of those decisions. Predicted class labels are present at each leaf node of the tree.

How Does CART for Classification Work?

CART for classification works by recursively splitting the training data into smaller and smaller subsets based on certain criteria. The goal is to split the data in a way that minimizes the impurity within each subset. Impurity is a measure of how mixed the classes are in a particular subset. For classification tasks, CART uses Gini impurity.
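To make the split search concrete, here is a sketch that scans candidate thresholds on one numeric feature and keeps the one with the lowest weighted Gini impurity of the two subsets. The function names, the midpoint-threshold convention, and the fruit-size data are assumptions made for illustration.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best binary split on one feature."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        # Weight each subset's impurity by its share of the samples
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Made-up data: fruit size in cm and fruit type
sizes = [2.0, 3.0, 4.0, 7.0, 8.0, 9.0]
fruit = ["lime", "lime", "lime", "apple", "apple", "apple"]
threshold, score = best_threshold(sizes, fruit)
print(threshold, score)
```

On this data the best threshold falls between the largest lime and the smallest apple, where both subsets become pure.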

CART for Regression

A regression tree is an algorithm where the target variable is continuous and the tree is used to predict its value. Regression trees are used when the response variable is continuous, for example the temperature of the day.

CART for regression is a decision tree learning method that creates a tree-like structure to predict continuous target variables. The tree consists of nodes that represent different decision points and branches that represent the possible outcomes of those decisions. Predicted values for the target variable are stored in each leaf node of the tree.

How Does CART Work for Regression?

Regression CART works by splitting the training data recursively into smaller subsets based on specific criteria. The objective is to split the data in a way that minimizes the residual error in each subset, typically the sum of squared differences between the target values and the subset's mean.
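A common criterion for regression splits is the sum of squared errors (SSE) around each subset's mean. The sketch below assumes that criterion and searches one feature for the threshold that gives the lowest total SSE. The function names and the temperature readings are made up for illustration.

```python
def sse(values):
    """Sum of squared differences between each value and the mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_regression_split(x, y):
    """Return (threshold, total_sse) minimizing SSE(left) + SSE(right)."""
    pairs = sorted(zip(x, y))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [target for _, target in pairs[:i]]
        right = [target for _, target in pairs[i:]]
        total = sse(left) + sse(right)
        if total < best[1]:
            best = (threshold, total)
    return best


# Made-up data: hour index and temperature reading
hours = [1, 2, 3, 10, 11, 12]
temps = [10.0, 11.0, 12.0, 30.0, 31.0, 32.0]
threshold, total = best_regression_split(hours, temps)
print(threshold, total)
```

Each side of the chosen split would then predict the mean of its own target values.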

Pseudo-code of the CART algorithm

d = 0, endtree = 0
Node(0) = 1, Node(1) = 0, Node(2) = 0
while endtree < 1
    if Node(2^d - 1) + Node(2^d) + .... + Node(2^(d+1) - 2) = 2 - 2^(d+1)
        endtree = 1
    else
        do i = 2^d - 1, 2^d, .... , 2^(d+1) - 2
            if Node(i) > -1
                Split tree
            else
                Node(2i+1) = -1
                Node(2i+2) = -1
            end if
        end do
    end if
    d = d + 1
end while
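The pseudo-code above grows the tree level by level over an array of nodes. As a hedged alternative view, the sketch below grows a small classification tree recursively instead, which is a different but common way to express the same idea. All names (gini, find_split, grow, predict), the nested-dict representation, and the max_depth stopping rule are assumptions of this sketch.

```python
from collections import Counter


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def find_split(rows, labels):
    """Try every feature and midpoint threshold; return the lowest weighted Gini."""
    best = None  # (score, feature_index, threshold)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(row[f] for row in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(rows, labels) if row[f] <= t]
            right = [lab for row, lab in zip(rows, labels) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def grow(rows, labels, depth=0, max_depth=3):
    """Return a nested dict; leaves hold the majority class of their subset."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or depth == max_depth:
        return {"leaf": majority}
    split = find_split(rows, labels)
    if split is None:  # all rows identical, nothing left to split on
        return {"leaf": majority}
    _, f, t = split
    left = [i for i, row in enumerate(rows) if row[f] <= t]
    right = [i for i, row in enumerate(rows) if row[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": grow([rows[i] for i in left], [labels[i] for i in left],
                     depth + 1, max_depth),
        "right": grow([rows[i] for i in right], [labels[i] for i in right],
                      depth + 1, max_depth),
    }


def predict(node, row):
    """Walk from the root to a leaf by following the threshold tests."""
    while "leaf" not in node:
        go_left = row[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]


# Made-up data: two numeric features, two classes
rows = [[1.0, 5.0], [2.0, 6.0], [8.0, 5.5], [9.0, 6.5]]
labels = ["a", "a", "b", "b"]
tree = grow(rows, labels)
print(tree)
print(predict(tree, [1.5, 6.0]))
```

Recursion stops when a subset is pure, when no further split exists, or when the assumed depth limit is reached.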

CART model representation

CART models are formed by picking input variables and evaluating split points on those variables until an appropriate tree is produced.

Steps to create a Decision Tree using the CART algorithm:

Decision Tree CART Implementations

The code below implements the CART algorithm for classifying fruits based on their color and size. It first encodes the categorical data using a LabelEncoder and then trains a CART classifier on the encoded data. Finally, it predicts the fruit type for a new instance and decodes the result back to its original categorical value.

from sklearn.tree import DecisionTreeClassifier
from sklearn.preprocessing import LabelEncoder
 
# Define the features and target variable
features = [
    ["red", "large"],
    ["green", "small"],
    ["red", "small"],
    ["yellow", "large"],
    ["green", "large"],
    ["orange", "large"],
]
target_variable = ["apple", "lime", "strawberry", "banana", "grape", "orange"]
 
# Flatten the features list for encoding
flattened_features = [item for sublist in features for item in sublist]
 
# Use a single LabelEncoder for all features and target variable
le = LabelEncoder()
le.fit(flattened_features + target_variable)
 
# Encode features and target variable
encoded_features = [le.transform(item) for item in features]
encoded_target = le.transform(target_variable)
 
# Create a CART classifier
clf = DecisionTreeClassifier()
 
# Train the classifier on the training set
clf.fit(encoded_features, encoded_target)
 
# Predict the fruit type for a new instance
new_instance = ["red", "large"]
encoded_new_instance = le.transform(new_instance)
predicted_fruit_type = clf.predict([encoded_new_instance])
decoded_predicted_fruit_type = le.inverse_transform(predicted_fruit_type)
print("Predicted fruit type:", decoded_predicted_fruit_type[0])

                    

Output:

Predicted fruit type: apple
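Label encoding maps each category to an integer, and the tree then treats those integers as ordered values. One alternative, shown here only as a sketch, is to one-hot encode the features with scikit-learn's OneHotEncoder and pass the string labels to the classifier directly. The variable names are assumptions, and the data repeats the fruit example above.

```python
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Same toy fruit data as in the example above
features = [
    ["red", "large"],
    ["green", "small"],
    ["red", "small"],
    ["yellow", "large"],
    ["green", "large"],
    ["orange", "large"],
]
target = ["apple", "lime", "strawberry", "banana", "grape", "orange"]

# One binary column per category, so no artificial ordering is introduced
encoder = OneHotEncoder(handle_unknown="ignore")
X = encoder.fit_transform(features)

# scikit-learn classifiers accept string class labels as the target
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, target)

prediction = clf.predict(encoder.transform([["red", "large"]]))
print("Predicted fruit type:", prediction[0])
```

Since every training row has a distinct color and size combination, the tree can separate all six fruits, and the queried instance matches the apple row.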

Advantages of CART

Limitations of CART

Applications of the CART algorithm

Frequently Asked Questions (FAQs)

1. What is CART (classification and regression tree)?

CART is a decision tree algorithm that can be used for both classification and regression tasks. It works by recursively partitioning the data into smaller and smaller subsets based on certain criteria. The goal is to create a tree structure that can accurately predict the target variable for new data points.

2. What is a regression tree in machine learning?

A regression tree is a type of decision tree that is used to predict continuous target variables. It works by partitioning the data into smaller and smaller subsets based on certain criteria, and then predicting the average value of the target variable within each subset.
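As a small illustration of the averaging behaviour, the sketch below fits scikit-learn's DecisionTreeRegressor with a single split on made-up data, so each prediction is the mean target value of the subset the input falls into. The data and variable names are assumptions.

```python
from sklearn.tree import DecisionTreeRegressor

# Made-up data: two well-separated groups of target values
X = [[1], [2], [3], [10], [11], [12]]
y = [10.0, 11.0, 12.0, 30.0, 31.0, 32.0]

# max_depth=1 allows exactly one split, giving two leaves
reg = DecisionTreeRegressor(max_depth=1, random_state=0)
reg.fit(X, y)

pred_low = reg.predict([[2]])[0]    # falls in the leaf holding 10, 11, 12
pred_high = reg.predict([[11]])[0]  # falls in the leaf holding 30, 31, 32
print(pred_low, pred_high)
```

Every input that lands in the same leaf receives the same predicted value, which is why a regression tree's output is piecewise constant.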

3. What is the difference between a regression tree and a classification tree?

A regression tree is used to predict continuous target variables, while a classification tree is used to predict categorical target variables. Regression trees predict the average value of the target variable within each subset, while classification trees predict the most likely class for each data point.

4. What is the difference between CART and a decision tree?

CART is one specific decision tree algorithm. There are other decision tree algorithms, such as ID3 and C4.5, that use different splitting criteria and pruning techniques.

5. What is the main difference between classification and regression?

Classification is the task of assigning a category to an instance, while regression is the task of predicting a continuous value. For example, classification could be used to predict whether an email is spam or not spam, while regression could be used to predict the price of a house based on its size, location, and amenities.


