Given a rooted tree of N nodes (assume the root is node 1) and Q queries, each of the form (Val, Node), the task for each query is to find the number of nodes with values smaller than Val in the sub-tree of Node, including Node itself.
Note that the node values are distinct. Here the value of a node is its label, so the values are exactly 1 to N.
Input: N = 7, Q = 3
Val = 4, Node = 4
Val = 7, Node = 6
Val = 5, Node = 1
Given tree (root 1): node 1 has children 4 and 6; node 4 has children 2, 3 and 5; node 6 has child 7.
Output: 2 1 4
Explanation: For the given queries:
Q1 -> Val = 4, Node = 4. The required nodes are 2 and 3, hence the output is 2.
Q2 -> Val = 7, Node = 6. The required node is 6 only, hence the output is 1.
Q3 -> Val = 5, Node = 1. The required nodes are 1, 2, 3 and 4, hence the output is 4.
Naive approach: A simple way to solve this problem is to run a DFS from the given node for each query and count the nodes whose value is smaller than the given value. The DFS must not step back to the parent of the given node, so that only its sub-tree is visited.
Time complexity: O(N*Q), where Q is the number of queries and N is the number of nodes in the tree.
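The naive approach can be sketched as follows. This is a minimal illustration, assuming the tree is stored as an undirected adjacency list indexed from 1 and that the value of a node is its label; the names Graph, fillParents, countSmaller and naiveQuery are hypothetical and not part of the original article.

```cpp
#include <cassert>
#include <vector>

// Assumption: undirected adjacency list, valid indices 1..N.
using Graph = std::vector<std::vector<int>>;

// Records the parent of every node with one DFS from the root.
void fillParents(const Graph& adj, int node, int parent, std::vector<int>& par) {
    par[node] = parent;
    for (int next : adj[node])
        if (next != parent) fillParents(adj, next, node, par);
}

// Counts nodes with value < val in the sub-tree of `node`.
// `parent` is skipped so the DFS never leaves the sub-tree.
int countSmaller(const Graph& adj, int node, int parent, int val) {
    int count = (node < val) ? 1 : 0;
    for (int next : adj[node])
        if (next != parent) count += countSmaller(adj, next, node, val);
    return count;
}

// Answers a single query (val, node) in O(N).
int naiveQuery(const Graph& adj, const std::vector<int>& par, int val, int node) {
    assert(node >= 1 && node < static_cast<int>(adj.size()));
    return countSmaller(adj, node, par[node], val);
}
```

After calling fillParents once from the root (with 0 as the root's parent), naiveQuery on the sample tree returns 2, 1 and 4 for the three sample queries.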
Efficient approach: Counting nodes in a sub-tree can be reduced to counting elements in a contiguous segment of an array. To build such a representation, run a DFS from the root and append each node to an array twice: once when the DFS first enters the node and once when it leaves the node for the last time. This representation is known as the Euler Tour of the tree.
The Euler Tour of the above tree will be:
1 4 2 2 3 3 5 5 4 6 7 7 6 1
This representation has the property that the sub-tree of every node X occupies exactly the segment between the first and last occurrence of X in the array, and each node appears exactly twice. So counting the entries smaller than Val between the first and last occurrence of Node gives twice the answer to that query.
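A short sketch of how such a tour could be built is given below. It assumes an undirected adjacency list indexed from 1 and uses 1-based tour positions; the struct EulerTour and the functions tourDfs and buildEulerTour are illustrative names, not taken from the original article.

```cpp
#include <cassert>
#include <vector>

struct EulerTour {
    std::vector<int> tour;   // node labels, each pushed on entry and on exit
    std::vector<int> start;  // start[x]: 1-based position of the first x
    std::vector<int> end;    // end[x]: 1-based position of the last x
};

// Appends `node` on entry and again on exit, recording both positions.
void tourDfs(const std::vector<std::vector<int>>& adj, int node, int parent,
             EulerTour& et) {
    et.tour.push_back(node);
    et.start[node] = static_cast<int>(et.tour.size());
    for (int next : adj[node])
        if (next != parent) tourDfs(adj, next, node, et);
    et.tour.push_back(node);
    et.end[node] = static_cast<int>(et.tour.size());
}

// Builds the tour of the tree rooted at `root` in O(N).
EulerTour buildEulerTour(const std::vector<std::vector<int>>& adj, int root) {
    assert(root >= 1 && root < static_cast<int>(adj.size()));
    EulerTour et;
    et.start.assign(adj.size(), 0);
    et.end.assign(adj.size(), 0);
    tourDfs(adj, root, 0, et);
    return et;
}
```

For the sample tree, with children visited in the order 4, 6 under node 1 and 2, 3, 5 under node 4, this produces the tour listed above, with start[4] = 2 and end[4] = 9. The segment from position 2 to 9 holds 4 2 2 3 3 5 5 4, which contains four entries smaller than 4, twice the answer to the first query.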
Using this representation, the queries can be processed offline with a binary indexed tree (BIT), at a cost of O(log(N)) per query once the tour entries and the queries have been sorted:
- Store the index of the first and last occurrence of each node in the tour in two arrays, start and end, so that start[X] and end[X] are these indices for node X. This can be done in O(N) during the DFS.
- For every entry of the Euler Tour, store a pair (indexInTree, indexInTour), where indexInTree is the node (and hence its value) and indexInTour is its position in the tour. Sort these pairs by indexInTree. Let this array be sortedTour.
- Similarly, keep the queries as pairs (Val, Node) and sort them by Val. Let this array be sortedQuery.
- Initialize a Binary Indexed Tree of size 2N with all entries set to 0. Let this be bit.
- Then proceed as follows, maintaining one pointer into sortedTour and one into sortedQuery.
- For the first query in sortedQuery, select the entries of sortedTour having indexInTree < Val and increment bit at each of their indexInTour positions. The answer to that query is half of the sum of bit from start[Node] to end[Node].
- For the next query in sortedQuery, select any entries of sortedTour not selected before that have indexInTree < Val, increment bit at their indexInTour positions, and answer the query in the same way.
- Repeating this for every query answers all of them in O((N + Q) log(N)) overall: sorting the 2N tour entries and the Q queries, at most 2N updates of bit, and two prefix-sum lookups per query.
Below is the C++ implementation of the above approach:
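The listing that follows is a minimal sketch of the steps above rather than a definitive implementation. It assumes the value of a node is its label and that the tree is given as an undirected adjacency list indexed from 1; the names Fenwick, eulerDfs and answerQueries are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Binary indexed tree over positions 1..n: point update, prefix sum.
struct Fenwick {
    std::vector<int> tree;
    explicit Fenwick(int n) : tree(n + 1, 0) {}
    void add(int pos, int delta) {
        for (; pos < static_cast<int>(tree.size()); pos += pos & -pos)
            tree[pos] += delta;
    }
    int prefix(int pos) const {
        int sum = 0;
        for (; pos > 0; pos -= pos & -pos) sum += tree[pos];
        return sum;
    }
};

// Builds the Euler Tour and the 1-based start/end positions of each node.
void eulerDfs(const std::vector<std::vector<int>>& adj, int node, int parent,
              std::vector<int>& tour, std::vector<int>& start,
              std::vector<int>& end) {
    tour.push_back(node);
    start[node] = static_cast<int>(tour.size());
    for (int next : adj[node])
        if (next != parent) eulerDfs(adj, next, node, tour, start, end);
    tour.push_back(node);
    end[node] = static_cast<int>(tour.size());
}

// queries[i] = (Val, Node); the result is in the original query order.
std::vector<int> answerQueries(const std::vector<std::vector<int>>& adj, int root,
                               const std::vector<std::pair<int, int>>& queries) {
    const int n = static_cast<int>(adj.size()) - 1;
    std::vector<int> tour, start(n + 1, 0), end(n + 1, 0);
    eulerDfs(adj, root, 0, tour, start, end);

    // sortedTour: (indexInTree, indexInTour), ordered by indexInTree.
    std::vector<std::pair<int, int>> sortedTour;
    for (int i = 0; i < static_cast<int>(tour.size()); ++i)
        sortedTour.push_back({tour[i], i + 1});
    std::sort(sortedTour.begin(), sortedTour.end());

    // Query indices ordered by Val, so answers can be written back in place.
    std::vector<int> order(queries.size());
    for (int i = 0; i < static_cast<int>(order.size()); ++i) order[i] = i;
    std::sort(order.begin(), order.end(), [&](int a, int b) {
        return queries[a].first < queries[b].first;
    });

    Fenwick bit(2 * n);
    std::vector<int> answer(queries.size(), 0);
    std::size_t ptr = 0;
    for (int qi : order) {
        const int val = queries[qi].first;
        const int node = queries[qi].second;
        assert(node >= 1 && node <= n);
        // Activate every tour entry whose node value is below Val.
        while (ptr < sortedTour.size() && sortedTour[ptr].first < val) {
            bit.add(sortedTour[ptr].second, 1);
            ++ptr;
        }
        // Each qualifying node is counted twice inside [start, end].
        const int doubled = bit.prefix(end[node]) - bit.prefix(start[node] - 1);
        answer[qi] = doubled / 2;
    }
    return answer;
}
```

Calling answerQueries on the sample tree with root 1 and the queries (4, 4), (7, 6), (5, 1) yields 2, 1 and 4. Sorting query indices instead of the queries themselves is one way to return the answers in input order.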
Output: 2 1 4