Dynamic Programming | Set 26 (Largest Independent Set Problem)
Given a Binary Tree, find the size of the Largest Independent Set (LIS) in it. A subset of the tree's nodes is an independent set if there is no edge between any two nodes of the subset.
For example, consider the following binary tree. The largest independent set (LIS) is {10, 40, 60, 70, 80} and the size of the LIS is 5.
A Dynamic Programming solution solves a given problem using solutions to its subproblems in a bottom-up manner. Can the given problem be solved using solutions to subproblems? If yes, what are the subproblems? Can we find the largest independent set size (LISS) for a node X if we know the LISS for all descendants of X? If a node is part of the LIS, then its children cannot be part of the LIS, but its grandchildren can be. The following is the optimal substructure property.
1) Optimal Substructure:
Let LISS(X) indicate the size of the largest independent set of a tree with root X.
LISS(X) = MAX { (1 + sum of LISS for all grandchildren of X),
(sum of LISS for all children of X) }
The idea is simple: there are two possibilities for every node X. Either X is a member of the set or it is not. If X is a member, then the value of LISS(X) is 1 plus the sum of LISS of all grandchildren. If X is not a member, then the value is the sum of LISS of all children.
2) Overlapping Subproblems
The following is a recursive implementation that simply follows the recursive structure mentioned above.
// A naive recursive implementation of the Largest Independent Set problem
#include <stdio.h>
#include <stdlib.h>

// A utility function to find the max of two integers
int max(int x, int y) { return (x > y) ? x : y; }

/* A binary tree node has data, a pointer to the left child and a pointer
   to the right child */
struct node
{
    int data;
    struct node *left, *right;
};

// The function returns the size of the largest independent set in a given
// binary tree
int LISS(struct node *root)
{
    if (root == NULL)
        return 0;

    // Calculate size excluding the current node
    int size_excl = LISS(root->left) + LISS(root->right);

    // Calculate size including the current node
    int size_incl = 1;
    if (root->left)
        size_incl += LISS(root->left->left) + LISS(root->left->right);
    if (root->right)
        size_incl += LISS(root->right->left) + LISS(root->right->right);

    // Return the maximum of the two sizes
    return max(size_incl, size_excl);
}

// A utility function to create a node
struct node* newNode(int data)
{
    struct node* temp = (struct node *) malloc(sizeof(struct node));
    temp->data = data;
    temp->left = temp->right = NULL;
    return temp;
}

// Driver program to test the above functions
int main()
{
    // Construct a tree shaped like the one in the above diagram
    // (the node values used here differ from the diagram's)
    struct node *root = newNode(20);
    root->left = newNode(8);
    root->left->left = newNode(4);
    root->left->right = newNode(12);
    root->left->right->left = newNode(10);
    root->left->right->right = newNode(14);
    root->right = newNode(22);
    root->right->right = newNode(25);

    printf("Size of the Largest Independent Set is %d\n", LISS(root));
    return 0;
}
Output:
Size of the Largest Independent Set is 5
The time complexity of the above naive recursive approach is exponential, because the function computes the same subproblems again and again. For example, in the example tree, LISS of the node with value 50 is evaluated both for node 10 and for node 20, since 50 is a grandchild of 10 and a child of 20.
Since the same subproblems are solved again, this problem has the Overlapping Subproblems property. So the LISS problem has both properties of a dynamic programming problem. As with other typical Dynamic Programming (DP) problems, recomputation of the same subproblems can be avoided by storing the solutions to subproblems and solving them in a bottom-up manner.
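To make the recomputation visible, here is a minimal sketch that instruments the naive recursion with a call counter and runs it on a chain-shaped tree. The names call_count, LISS_counted, make_chain and calls_for_chain are made up for this illustration and are not part of the article's code; no main() is included, so call calls_for_chain() from your own driver.

```c
#include <stdio.h>
#include <stdlib.h>

struct node { int data; struct node *left, *right; };

/* Hypothetical counter (not in the article): number of LISS_counted() calls */
static long call_count = 0;

static int max2(int x, int y) { return (x > y) ? x : y; }

/* Same naive recursion as in the article, plus the call counter */
int LISS_counted(const struct node *root)
{
    call_count++;
    if (root == NULL)
        return 0;
    int size_excl = LISS_counted(root->left) + LISS_counted(root->right);
    int size_incl = 1;
    if (root->left)
        size_incl += LISS_counted(root->left->left) + LISS_counted(root->left->right);
    if (root->right)
        size_incl += LISS_counted(root->right->left) + LISS_counted(root->right->right);
    return max2(size_incl, size_excl);
}

/* Build a chain of n nodes in which every node has only a left child */
struct node *make_chain(int n)
{
    struct node *root = NULL;
    for (int i = 0; i < n; i++) {
        struct node *t = malloc(sizeof *t);
        if (t == NULL)
            exit(EXIT_FAILURE);
        t->data = i;
        t->left = root;   /* the previous chain hangs below the new node */
        t->right = NULL;
        root = t;
    }
    return root;
}

/* Number of calls the naive recursion makes on a chain of n nodes */
long calls_for_chain(int n)
{
    struct node *root = make_chain(n);
    call_count = 0;
    LISS_counted(root);
    while (root) {              /* release the chain */
        struct node *next = root->left;
        free(root);
        root = next;
    }
    return call_count;
}
```

On a chain, a call on n nodes recurses into chains of n-1 and n-2 nodes, so the count should grow roughly like the Fibonacci numbers as n increases.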
The following is a C implementation of the Dynamic Programming based solution. In this solution, an additional field ‘liss’ is added to the tree nodes. The initial value of ‘liss’ is 0 for all nodes. The recursive function LISS() calculates ‘liss’ for a node only if it is not already set, so each node is solved once (memoization).
/* Dynamic programming based program for the Largest Independent Set problem */
#include <stdio.h>
#include <stdlib.h>

// A utility function to find the max of two integers
int max(int x, int y) { return (x > y) ? x : y; }

/* A binary tree node has data, a cached result, a pointer to the left child
   and a pointer to the right child */
struct node
{
    int data;
    int liss;   // cached LISS of the subtree rooted here (0 = not computed yet)
    struct node *left, *right;
};

// A memoized function that returns the size of the largest independent set
// in a given binary tree
int LISS(struct node *root)
{
    if (root == NULL)
        return 0;

    // Already computed for this node
    if (root->liss)
        return root->liss;

    // A leaf forms an independent set of size 1
    if (root->left == NULL && root->right == NULL)
        return (root->liss = 1);

    // Calculate size excluding the current node
    int liss_excl = LISS(root->left) + LISS(root->right);

    // Calculate size including the current node
    int liss_incl = 1;
    if (root->left)
        liss_incl += LISS(root->left->left) + LISS(root->left->right);
    if (root->right)
        liss_incl += LISS(root->right->left) + LISS(root->right->right);

    // Store the maximum of the two sizes and return it
    root->liss = max(liss_incl, liss_excl);
    return root->liss;
}

// A utility function to create a node
struct node* newNode(int data)
{
    struct node* temp = (struct node *) malloc(sizeof(struct node));
    temp->data = data;
    temp->left = temp->right = NULL;
    temp->liss = 0;
    return temp;
}

// Driver program to test the above functions
int main()
{
    // Construct a tree shaped like the one in the above diagram
    // (the node values used here differ from the diagram's)
    struct node *root = newNode(20);
    root->left = newNode(8);
    root->left->left = newNode(4);
    root->left->right = newNode(12);
    root->left->right->left = newNode(10);
    root->left->right->right = newNode(14);
    root->right = newNode(22);
    root->right->right = newNode(25);

    printf("Size of the Largest Independent Set is %d\n", LISS(root));
    return 0;
}
Output:
Size of the Largest Independent Set is 5
Time Complexity: O(n) where n is the number of nodes in given Binary tree.
The following extensions to the above solution can be tried as exercises.
1) Extend the above solution for n-ary tree.
2) The above solution modifies the given tree structure by adding an additional field ‘liss’ to tree nodes. Extend the solution so that it doesn’t modify the tree structure.
3) The above solution only returns size of LIS, it doesn’t print elements of LIS. Extend the solution to print all nodes that are part of LIS.
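One possible starting point for exercises 2 and 3 is sketched below. It is not the article's method: instead of caching a value in each node, it returns two values per subtree (best size with the subtree root included and with it excluded), then walks the tree again to pick the members. The names struct pair, solve, lissSize and collect are made up for this illustration, and no main() is included.

```c
#include <stdio.h>
#include <stdlib.h>

struct node { int data; struct node *left, *right; };

/* Hypothetical helper type: best sizes for one subtree */
struct pair {
    int incl;   /* best size when the subtree root is in the set */
    int excl;   /* best size when the subtree root is not in the set */
};

static int max2(int a, int b) { return (a > b) ? a : b; }

struct node *newNode(int data)
{
    struct node *temp = malloc(sizeof *temp);
    if (temp == NULL)
        exit(EXIT_FAILURE);
    temp->data = data;
    temp->left = temp->right = NULL;
    return temp;
}

/* One postorder pass; the tree itself is never modified */
struct pair solve(const struct node *root)
{
    struct pair p = {0, 0};
    if (root == NULL)
        return p;
    struct pair l = solve(root->left);
    struct pair r = solve(root->right);
    p.incl = 1 + l.excl + r.excl;                           /* children must be excluded */
    p.excl = max2(l.incl, l.excl) + max2(r.incl, r.excl);   /* children are free */
    return p;
}

int lissSize(const struct node *root)
{
    struct pair p = solve(root);
    return max2(p.incl, p.excl);
}

/* Preorder walk that stores the data of chosen nodes in out[] (the caller
   must provide room for every node). parent_taken says whether the parent
   was chosen. solve() is recomputed at every node to keep the sketch short,
   so this step costs O(n * height) rather than O(n). */
void collect(const struct node *root, int parent_taken, int *out, int *count)
{
    if (root == NULL)
        return;
    struct pair p = solve(root);
    int take = !parent_taken && p.incl >= p.excl;
    if (take)
        out[(*count)++] = root->data;
    collect(root->left, take, out, count);
    collect(root->right, take, out, count);
}
```

Calling collect(root, 0, out, &count) with count initialised to 0 fills out[] with one largest independent set; when several sets tie, only one of them is reported. The same two-value idea should carry over to an n-ary tree by summing over all children instead of just left and right.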
Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.

Intelligent
Isn't the size of the largest independent set equal to MAX(# of nodes at even depths, # of nodes at odd depths)? If yes, a BFS-based solution will do it with O(n) complexity.
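A quick check suggests the answer is no in general. The sketch below counts the nodes at even and odd depths for the tree built in the article's driver and lets you compare that with the article's recursive LISS. The helpers countByParity and buildExampleTree are made up for this illustration; no main() is included.

```c
#include <stdio.h>
#include <stdlib.h>

struct node { int data; struct node *left, *right; };

static int max2(int x, int y) { return (x > y) ? x : y; }

struct node *newNode(int data)
{
    struct node *temp = malloc(sizeof *temp);
    if (temp == NULL)
        exit(EXIT_FAILURE);
    temp->data = data;
    temp->left = temp->right = NULL;
    return temp;
}

/* The naive recursion from the article */
int LISS(const struct node *root)
{
    if (root == NULL)
        return 0;
    int size_excl = LISS(root->left) + LISS(root->right);
    int size_incl = 1;
    if (root->left)
        size_incl += LISS(root->left->left) + LISS(root->left->right);
    if (root->right)
        size_incl += LISS(root->right->left) + LISS(root->right->right);
    return max2(size_incl, size_excl);
}

/* counts[0] receives the number of nodes at even depth, counts[1] at odd depth */
void countByParity(const struct node *root, int depth, int counts[2])
{
    if (root == NULL)
        return;
    counts[depth % 2]++;
    countByParity(root->left, depth + 1, counts);
    countByParity(root->right, depth + 1, counts);
}

/* The same tree that the article's driver program builds */
struct node *buildExampleTree(void)
{
    struct node *root = newNode(20);
    root->left = newNode(8);
    root->left->left = newNode(4);
    root->left->right = newNode(12);
    root->left->right->left = newNode(10);
    root->left->right->right = newNode(14);
    root->right = newNode(22);
    root->right->right = newNode(25);
    return root;
}
```

On this tree both depth classes contain 4 nodes, while LISS returns 5: a largest set can mix levels, here the root together with leaves at depths 2 and 3.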
// A naive recursive implementation of Largest Independent Set problem
#include <stdio.h>
#include <stdlib.h>
// A utility function to find max of two integers
int max(int x, int y) { return (x > y)? x: y; }
/* A binary tree node has data, pointer to left child and a pointer to
right child */
struct node
{
    int data;
    struct node *left, *right;
};
static int index = 0;
// A utility function to create a node
struct node* newNode( int data )
{
    struct node* temp = (struct node *) malloc( sizeof(struct node) );
    temp->data = data;
    temp->left = temp->right = NULL;
    return temp;
}
void inorder(node* root, node** arr)
{
    if(root == NULL)
        return;
    inorder(root->left,arr);
    arr[index] = root;
    index++;
    inorder(root->right,arr);
    return;
}
int LISS(node* root)
{
    node** arr;
    arr = (node**)malloc(sizeof(node)*50);
    inorder(root,arr);
    int arr1[50] = {0};
    arr1[0] = 0;
    for(int i=0; i<index; i++)
    {
        for(int j=0; j<i; j++)
        {
            if((arr[i]->left != arr[j]) && (arr[i]->right != arr[j]))
                arr1[i] = max(arr1[i], 1+arr1[j]);
        }
    }
    int res = 0;
    for(int i=0; i<index; i++)
    {
        res = max(res,arr1[i]);
    }
    return res;
}
// Driver program to test above functions
int main()
{
    // Let us construct the tree given in the above diagram
    struct node *root = newNode(20);
    root->left = newNode(8);
    root->left->left = newNode(4);
    root->left->right = newNode(12);
    root->left->right->left = newNode(10);
    root->left->right->right = newNode(14);
    root->right = newNode(22);
    root->right->right = newNode(25);
    printf ("Size of the Largest Independent Set is %d ", LISS(root));
    return 0;
}
Can someone please explain how to extend it to print the nodes of the LIS, and how we can use memoization without changing the structure of the node? Thanks.
Here's my take on the problem: a simple traversal-based solution. Let me know if there's any logical error or any case I missed.
This works bottom-up. I assume that all leaf nodes are part of the solution and work upward from there.
Algorithm (every node contains a bool print):
1) Do a postorder traversal. If the node is a leaf, set print = true.
2) If node->left is NULL, check the right child's print value:
3) set current->print = !node->right->print.
4) If node->right is NULL, set current->print = !node->left->print.
5) If there are both left and right children, check whether they are printed.
6) If either of them is printed, skip the current node; otherwise print it.
// Assumes struct node has an extra field "bool print" (needs <stdbool.h>)
void postorder(struct node* node)
{
    if (node == NULL)
        return;
    postorder(node->left);
    postorder(node->right);
    if (node->left == NULL && node->right == NULL)
    {
        node->print = true;
    }
    else if (node->left == NULL && node->right != NULL)
    {
        node->print = !node->right->print;
    }
    else if (node->right == NULL && node->left != NULL)
    {
        node->print = !(node->left->print);
    }
    else
    {
        node->print = !(node->left->print || node->right->print);
    }
    if (node->print)
        printf("%d ", node->data);
}
If this is found to be logically correct, it can easily be extended to get the count. If we are not allowed to change the structure of the node, a simple solution would be to push the bool values onto a stack, pop one value per child of the current node, and compare them.
I am pretty sure this approach is correct, a lot more intuitive, and easier to implement.
Without modifying the original tree:
#include <stdio.h>
#include <stdlib.h>

int max(int x, int y) { return (x > y) ? x : y; }

struct node
{
    int data;
    struct node *left, *right;
};

// Returns LISS of the subtree rooted at root. On return, *lcref and *rcref
// hold the LISS of root's left and right children respectively.
int LISSUtil(struct node *root, int *lcref, int *rcref)
{
    if (root == NULL) {
        *lcref = *rcref = 0;
        return 0;
    }
    if (root->left == NULL && root->right == NULL) {
        *lcref = *rcref = 0;
        return 1;
    }

    // LISS of the four grandchildren, filled in by the recursive calls
    int lcl, lcr, rcl, rcr;

    // Calculate size excluding the current node
    int lc = LISSUtil(root->left, &lcl, &lcr);
    int rc = LISSUtil(root->right, &rcl, &rcr);
    int liss_excl = lc + rc;

    // Calculate size including the current node
    int liss_incl = 1 + lcl + lcr + rcr + rcl;

    *lcref = lc;
    *rcref = rc;

    // Return the maximum of the two sizes
    return max(liss_incl, liss_excl);
}

int LISS(struct node *root)
{
    int lc, rc;
    return LISSUtil(root, &lc, &rc);
}

// A utility function to create a node
struct node* newNode(int data)
{
    struct node* temp = (struct node *) malloc(sizeof(struct node));
    temp->data = data;
    temp->left = temp->right = NULL;
    return temp;
}

// Driver program to test the above functions
int main()
{
    // Let us construct the tree given in the above diagram
    struct node *root = newNode(20);
    root->left = newNode(8);
    root->left->left = newNode(4);
    root->left->right = newNode(12);
    root->left->right->left = newNode(10);
    root->left->right->right = newNode(14);
    root->right = newNode(22);
    root->right->right = newNode(25);

    printf("Size of the Largest Independent Set is %d\n", LISS(root));
    return 0;
}
What if the two arguments to max happen to be equal? In that case we will have two sets of the same size. For example, if we add 90 as a child of the last inorder node, i.e. 60, then both [40, 70, 80, 10, 60] and [40, 70, 80, 30, 90] are largest independent sets.
Your program works fine.
But at the time of printing we need to take care of this case if it is required to display all the maximum-size sets.
There is no need to have these lines in the second solution; it still produces the correct result without them:
if (root->left == NULL && root->right == NULL)
    return (root->liss = 1);
Cheers
It's a recent TC Div 2 hard problem.