How to Calculate Information Gain in a Decision Tree?

Last Updated : 13 Feb, 2024

Answer: To calculate information gain in a decision tree, subtract the weighted average entropy of child nodes from the entropy of the parent node.

To calculate information gain in a decision tree, follow these steps:

Calculate the Entropy of the Parent Node:
- Compute the entropy of the parent node using the formula: Entropy = $-\sum_{i=1}^{c} p_i \log_2(p_i)$
- Where $p_i$ is the proportion of instances belonging to class $i$, and $c$ is the number of classes.
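As a quick worked instance of the entropy formula, assume a hypothetical parent node with 14 instances, 9 in one class and 5 in the other (the counts are illustrative only, not taken from a specific dataset):

```latex
\begin{aligned}
\text{Entropy} &= -\tfrac{9}{14}\log_2\tfrac{9}{14} - \tfrac{5}{14}\log_2\tfrac{5}{14} \\
               &\approx 0.4098 + 0.5305 \\
               &\approx 0.940
\end{aligned}
```

For reference, a node containing a single class has entropy 0, and a two-class node split exactly 50/50 has entropy 1.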
Split the Data:
- Split the dataset into subsets based on the values of a selected attribute (feature).
Calculate the Entropy of Child Nodes:
- For each subset (child node), calculate its entropy using the same entropy formula that was applied to the parent node.
Calculate the Weighted Average Entropy of Child Nodes:
- Calculate the weighted average entropy of the child nodes using the formula: Weighted Average Entropy = $\sum_{j=1}^{m}\frac{N_j}{N}\times \text{Entropy}(j)$
- Where $N_j$ is the number of instances in the $j$-th child node, $N$ is the total number of instances, and $m$ is the number of child nodes.
Calculate Information Gain:
- Information Gain is the difference between the entropy of the parent node and the weighted average entropy of the child nodes: Information Gain = Entropy(Parent) − Weighted Average Entropy(Children)
Select the Attribute with the Highest Information Gain:
- Choose the attribute (feature) that yields the highest information gain as the splitting criterion for the current node in the decision tree.
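
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`entropy`, `information_gain`, `best_attribute`), the list-of-dicts data layout, and the tiny `rows` dataset are all assumptions made up for this example, and only categorical attributes are handled.

```python
import math
from collections import Counter


def entropy(labels):
    """Base-2 entropy of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # Sum of -p_i * log2(p_i) over the classes present in the node
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, attribute, target):
    """Entropy(parent) minus the weighted average entropy of the children."""
    parent_labels = [row[target] for row in rows]

    # Split the data: one child node per distinct value of the attribute
    children = {}
    for row in rows:
        children.setdefault(row[attribute], []).append(row[target])

    # Weight each child's entropy by its share of the instances (N_j / N)
    weighted = sum(
        len(labels) / len(rows) * entropy(labels)
        for labels in children.values()
    )
    return entropy(parent_labels) - weighted


def best_attribute(rows, attributes, target):
    """Attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy dataset: 3 "yes" and 3 "no" labels, so Entropy(parent) = 1
rows = [
    {"outlook": "sunny",    "windy": "no",  "play": "no"},
    {"outlook": "sunny",    "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no",  "play": "yes"},
    {"outlook": "rainy",    "windy": "no",  "play": "yes"},
    {"outlook": "rainy",    "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]

for attr in ("outlook", "windy"):
    print(attr, f"{information_gain(rows, attr, 'play'):.3f}")
# outlook 0.667
# windy 0.082

print(best_attribute(rows, ["outlook", "windy"], "play"))  # outlook
```

Here splitting on `outlook` leaves two pure children and one mixed child, so it removes most of the parent's entropy, whereas splitting on `windy` leaves both children nearly as mixed as the parent. That is why `outlook` would be chosen as the splitting attribute for this node.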