Find LCA in Binary Tree using RMQ
This article describes how to find the LCA of two nodes in a tree by reducing the problem to an RMQ problem.
The Lowest Common Ancestor (LCA) of two nodes u and v in a rooted tree T is the node farthest from the root that has both u and v as descendants (a node counts as a descendant of itself).
For example, in the diagram below, the LCA of node 4 and node 9 is node 2.
There are many approaches to solving the LCA problem, and they differ in their time and space complexities. Some of them do not involve a reduction to RMQ.
Range Minimum Query (RMQ) is used on arrays to find the position of the element with the minimum value between two specified indices. RMQ can be solved in several ways; this article uses a segment tree. With a segment tree, preprocessing takes O(n) time and each range minimum query takes O(log n) time. Storing the segment tree requires O(n) extra space.
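To make the RMQ building block concrete, here is a minimal Python sketch of a segment tree that returns the index (not the value) of the minimum in a range. The class name `MinIndexSegmentTree` and its method names are illustrative assumptions, not taken from the article.

```python
class MinIndexSegmentTree:
    """Answers 'index of the minimum in values[lo..hi]' (inclusive bounds)."""

    def __init__(self, values):
        self.values = list(values)
        self.n = len(self.values)
        # 4 * n slots is a simple, safe upper bound for this recursive layout.
        self.tree = [0] * (4 * self.n)
        if self.n:
            self._build(1, 0, self.n - 1)

    def _better(self, i, j):
        # Index of the smaller value; ties go to the first argument.
        return i if self.values[i] <= self.values[j] else j

    def _build(self, node, lo, hi):
        if lo == hi:
            self.tree[node] = lo
            return
        mid = (lo + hi) // 2
        self._build(2 * node, lo, mid)
        self._build(2 * node + 1, mid + 1, hi)
        self.tree[node] = self._better(self.tree[2 * node],
                                       self.tree[2 * node + 1])

    def query(self, lo, hi, node=1, node_lo=0, node_hi=None):
        """Return the index of the minimum value in values[lo..hi]."""
        if node_hi is None:
            node_hi = self.n - 1
        if lo <= node_lo and node_hi <= hi:
            return self.tree[node]          # segment fully inside the range
        mid = (node_lo + node_hi) // 2
        if hi <= mid:
            return self.query(lo, hi, 2 * node, node_lo, mid)
        if lo > mid:
            return self.query(lo, hi, 2 * node + 1, mid + 1, node_hi)
        left = self.query(lo, hi, 2 * node, node_lo, mid)
        right = self.query(lo, hi, 2 * node + 1, mid + 1, node_hi)
        return self._better(left, right)


values = [5, 2, 7, 1, 8, 3]
rmq = MinIndexSegmentTree(values)
print(rmq.query(0, 2))  # 1, because values[1] == 2 is the minimum of [5, 2, 7]
print(rmq.query(0, 5))  # 3, because values[3] == 1 is the overall minimum
```

Returning an index rather than a value matters for the reduction: the index found in one array can then be used to look up an entry in a second, parallel array.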
Reduction of LCA to RMQ:
The idea is to traverse the tree from the root with an Euler tour (a traversal made without lifting the pencil): a DFS that records a node when it is first reached, as in preorder, and again each time the traversal returns to it from one of its children.
Observation: The LCA of nodes 4 and 9 is node 2, which happens to be the node closest to the root amongst all those encountered between the visits of 4 and 9 during a DFS of T. This observation is the key to the reduction. Let's rephrase: the LCA is the node at the smallest level, and the only node at that level, amongst all the nodes that occur between any occurrence of u and any occurrence of v in the Euler tour of T.
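The observation can be checked by hand on a small tree. The sketch below assumes a hypothetical 9-node tree (described by the `children` dictionary) chosen to be consistent with the stated result that the LCA of 4 and 9 is 2. The function name `euler_tour` is likewise an assumption, and a plain linear scan is used in place of a real RMQ structure.

```python
def euler_tour(children, root):
    """Return (euler, level): nodes in Euler-tour order and their depths."""
    euler, level = [], []

    def visit(node, depth):
        euler.append(node)
        level.append(depth)
        for child in children.get(node, []):
            visit(child, depth + 1)
            # Back from a child: the parent is recorded again.
            euler.append(node)
            level.append(depth)

    visit(root, 0)
    return euler, level


# Assumed example tree: 1 is the root, 2 and 3 are its children, and so on.
children = {1: [2, 3], 2: [4, 5], 3: [6, 7], 5: [8, 9]}
euler, level = euler_tour(children, 1)
print(euler)  # [1, 2, 4, 2, 5, 8, 5, 9, 5, 2, 1, 3, 6, 3, 7, 3, 1]
print(level)  # [0, 1, 2, 1, 2, 3, 2, 3, 2, 1, 0, 1, 2, 1, 2, 1, 0]

# Between the first occurrences of 4 and 9, find the entry with minimum level.
i, j = sorted((euler.index(4), euler.index(9)))
k = min(range(i, j + 1), key=lambda idx: level[idx])
print(euler[k])  # 2
```

For V = 9 nodes the tour has 2*V - 1 = 17 entries, and the minimum level between the two occurrences is reached exactly once, at node 2.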
We require three arrays for implementation:
- Nodes visited in order of Euler tour of T
- Level of each node visited in Euler tour of T
- Index of the first occurrence of a node in Euler tour of T (since any occurrence would be good, let’s track the first one)
Algorithm:
- Do an Euler tour of the tree, and fill the Euler, level and first occurrence arrays.
- Using the first occurrence array, get the indices corresponding to the two nodes. These are the endpoints of the range in the level array that is passed to the RMQ algorithm to find the minimum value.
- Once the RMQ algorithm returns the index of the minimum level in the range, use that index in the Euler tour array to determine the LCA.
Below is the implementation of the above algorithm.
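A minimal Python sketch of these steps follows. It assumes a non-empty binary tree with keys 1 to V; the class names `Node` and `LCA`, the helper names, and the example tree are illustrative assumptions rather than a reference implementation.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


class LCA:
    def __init__(self, root, num_nodes):
        self.euler = []                      # keys in Euler-tour order
        self.level = []                      # level of each Euler-tour entry
        self.first = [-1] * (num_nodes + 1)  # first Euler index of each key
        self._tour(root, 0)
        self.size = len(self.level)
        self.seg = [0] * (4 * self.size)     # segment tree of indices into level
        if self.size:
            self._build(1, 0, self.size - 1)

    def _tour(self, node, depth):
        if node is None:
            return
        self.first[node.key] = len(self.euler)  # each node is entered once
        self.euler.append(node.key)
        self.level.append(depth)
        for child in (node.left, node.right):
            if child is not None:
                self._tour(child, depth + 1)
                self.euler.append(node.key)     # back at the parent
                self.level.append(depth)

    def _min_index(self, i, j):
        return i if self.level[i] <= self.level[j] else j

    def _build(self, pos, lo, hi):
        if lo == hi:
            self.seg[pos] = lo
            return
        mid = (lo + hi) // 2
        self._build(2 * pos, lo, mid)
        self._build(2 * pos + 1, mid + 1, hi)
        self.seg[pos] = self._min_index(self.seg[2 * pos], self.seg[2 * pos + 1])

    def _query(self, pos, lo, hi, qlo, qhi):
        if qlo <= lo and hi <= qhi:
            return self.seg[pos]
        mid = (lo + hi) // 2
        if qhi <= mid:
            return self._query(2 * pos, lo, mid, qlo, qhi)
        if qlo > mid:
            return self._query(2 * pos + 1, mid + 1, hi, qlo, qhi)
        return self._min_index(self._query(2 * pos, lo, mid, qlo, qhi),
                               self._query(2 * pos + 1, mid + 1, hi, qlo, qhi))

    def query(self, u, v):
        """Return the key of the LCA of the nodes with keys u and v."""
        i, j = sorted((self.first[u], self.first[v]))
        return self.euler[self._query(1, 0, self.size - 1, i, j)]


# Assumed example tree, consistent with the LCA of 4 and 9 being 2.
root = Node(1,
            Node(2, Node(4), Node(5, Node(8), Node(9))),
            Node(3, Node(6), Node(7)))
lca = LCA(root, num_nodes=9)
u, v = 4, 9
print(f"The LCA of node {u} and node {v} is node {lca.query(u, v)}.")
```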
Output:
The LCA of node 4 and node 9 is node 2.
Note:
- We assume that the nodes queried are present in the tree.
- We also assume that if there are V nodes in the tree, then the keys (or data) of these nodes are in the range from 1 to V.
Time Complexity:
- Euler tour: The number of nodes is V. For a tree, E = V-1. The Euler tour (DFS) takes O(V+E), which is O(2*V), which can be written as O(V).
- Segment Tree construction: O(n) where n = V + E = 2*V – 1.
- Range Minimum Query: O(log n)
Overall, this method takes O(n) time for preprocessing and O(log n) time per query. Therefore, it is useful when we have a single tree on which we want to perform a large number of LCA queries. (Note that LCA is useful for finding the shortest path between two nodes of a binary tree.)
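As an illustration of the shortest-path use, the number of edges between u and v equals depth(u) + depth(v) - 2 * depth(LCA(u, v)), because the only path between them climbs from u to the LCA and then descends to v. The sketch below uses a naive parent-climbing LCA as a stand-in for the RMQ-based query, purely to stay short; the function names and the example tree are assumptions.

```python
def depths_and_parents(children, root):
    """Compute the depth and parent of every node with an iterative DFS."""
    depth, parent = {root: 0}, {root: None}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            depth[child] = depth[node] + 1
            parent[child] = node
            stack.append(child)
    return depth, parent


def naive_lca(u, v, depth, parent):
    # Stand-in for the RMQ-based query: O(height) per call.
    while depth[u] > depth[v]:
        u = parent[u]
    while depth[v] > depth[u]:
        v = parent[v]
    while u != v:
        u, v = parent[u], parent[v]
    return u


def distance(u, v, depth, parent):
    """Number of edges on the path between u and v."""
    w = naive_lca(u, v, depth, parent)
    return depth[u] + depth[v] - 2 * depth[w]


# Assumed example tree, consistent with the LCA of 4 and 9 being 2.
children = {1: [2, 3], 2: [4, 5], 3: [6, 7], 5: [8, 9]}
depth, parent = depths_and_parents(children, 1)
print(distance(4, 9, depth, parent))  # 3, along the path 4 -> 2 -> 5 -> 9
```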
Space Complexity:
- Euler tour array: O(n) where n = 2*V – 1
- Node Levels array: O(n)
- First Occurrences array: O(V)
- Segment Tree: O(n)
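Another observation is that adjacent elements in the level array always differ by exactly 1, because each step of the Euler tour moves either to a child or back to the parent. The RMQ instance produced by the reduction is therefore of a restricted (±1) kind, which can be answered in O(1) time per query after O(n) preprocessing. The reduction also works in the other direction: a general RMQ problem can be converted to an LCA problem.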
This article is contributed by Yash Varyani.