Find LCA in Binary Tree using RMQ
The article describes an approach to solving the problem of finding the LCA of two nodes in a tree by reducing it to an RMQ problem.
The Lowest Common Ancestor (LCA) of two nodes u and v in a rooted tree T is defined as the node located farthest from the root that has both u and v as descendants (a node is considered a descendant of itself).
For example, in the below diagram, the LCA of node 4 and node 9 is node 2.
There can be many approaches to solving the LCA problem. The approaches differ in their time and space complexities. Here is a link to a couple of them (these do not involve a reduction to RMQ).
Range Minimum Query (RMQ) is used on arrays to find the position of an element with the minimum value between two specified indices. Different approaches to solving RMQ have been discussed here and here. In this article, the Segment Tree-based approach is discussed. With a segment tree, preprocessing takes O(n) time and each range minimum query takes O(log n) time. The extra space required to store the segment tree is O(n).
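To make the RMQ building block concrete, here is a minimal sketch of a segment tree that returns the index (not the value) of the minimum in a range. The class name `MinIndexSegmentTree`, its method names, and the sample array are illustrative assumptions, not part of the original article.

```python
class MinIndexSegmentTree:
    """Segment tree that returns the index of the minimum value in a range."""

    def __init__(self, values):
        self.values = list(values)
        self.n = len(self.values)
        # 4 * n slots are always enough for a recursively built segment tree.
        self.tree = [0] * (4 * self.n)
        if self.n:
            self._build(1, 0, self.n - 1)

    def _better(self, i, j):
        # Pick the index holding the smaller value; ties go to the first index.
        return i if self.values[i] <= self.values[j] else j

    def _build(self, node, lo, hi):
        if lo == hi:
            self.tree[node] = lo
            return
        mid = (lo + hi) // 2
        self._build(2 * node, lo, mid)
        self._build(2 * node + 1, mid + 1, hi)
        self.tree[node] = self._better(self.tree[2 * node], self.tree[2 * node + 1])

    def query(self, left, right, node=1, lo=0, hi=None):
        """Index of the minimum of values[left..right], both ends inclusive."""
        if hi is None:
            hi = self.n - 1
        if left <= lo and hi <= right:
            return self.tree[node]
        mid = (lo + hi) // 2
        if right <= mid:
            return self.query(left, right, 2 * node, lo, mid)
        if left > mid:
            return self.query(left, right, 2 * node + 1, mid + 1, hi)
        return self._better(
            self.query(left, right, 2 * node, lo, mid),
            self.query(left, right, 2 * node + 1, mid + 1, hi),
        )


# Sample array chosen for illustration only.
levels = [0, 1, 2, 1, 2, 3, 2, 3, 2, 1, 0]
rmq = MinIndexSegmentTree(levels)
print(rmq.query(2, 7))  # index of the minimum among levels[2..7]
```

Returning an index rather than a value matters for the reduction: the index found in one array is later used to look up a node in another array of the same length.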
Reduction of LCA to RMQ:
The idea is to traverse the tree from the root with an Euler tour (a traversal made without lifting the pencil). This is a DFS-type traversal similar to preorder, except that a node is recorded when it is first visited and again each time the traversal returns to it from a child.
The LCA of nodes 4 and 9 is node 2, which is the node closest to the root among all nodes encountered between the visits to 4 and 9 during a DFS of T. This observation is the key to the reduction. Restated: the LCA of u and v is the node at the smallest level among all nodes that occur between any occurrence of u and any occurrence of v in the Euler tour of T, and it is the only node at that level in that range (although it may appear there more than once).
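The observation can be checked by brute force on a small tree. The sketch below assumes a hypothetical tree (the `children` dictionary) chosen so that the LCA of 4 and 9 is 2, as in the example; the actual tree in the article's diagram may differ. It scans the tour linearly instead of using an RMQ structure.

```python
# Hypothetical example tree (child lists keyed by node), chosen so that
# the LCA of 4 and 9 is 2, matching the example in the text.
children = {1: [2, 3], 2: [4, 5], 3: [6, 7], 5: [8, 9]}


def euler_tour(root):
    """Return (node, level) pairs in Euler tour order."""
    tour = []

    def visit(node, level):
        tour.append((node, level))
        for child in children.get(node, []):
            visit(child, level + 1)
            # Record the parent again each time we return from a child.
            tour.append((node, level))

    visit(root, 0)
    return tour


tour = euler_tour(1)
nodes = [node for node, _ in tour]

# Any occurrence of each node works; the first one is used here.
i, j = nodes.index(4), nodes.index(9)
lo, hi = min(i, j), max(i, j)

# The shallowest node between the two occurrences is the LCA.
lca_node, lca_level = min(tour[lo:hi + 1], key=lambda pair: pair[1])
print(nodes)
print(lca_node)
```

The linear scan over `tour[lo:hi + 1]` is exactly the step that an RMQ structure replaces with a logarithmic-time query.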
We require three arrays for implementation:
- Nodes visited in order of Euler tour of T
- The level of each node visited in the Euler tour of T
- Index of the first occurrence of a node in Euler tour of T (since any occurrence would be good, let’s track the first one)
Algorithm:
- Do an Euler tour of the tree, and fill the Euler tour, level, and first occurrence arrays.
- Using the first occurrence array, get the indices corresponding to the two nodes. These are the endpoints of the range in the level array that is passed to the RMQ algorithm to find the minimum value.
- Once the RMQ algorithm returns the index of the minimum level in the range, use that index in the Euler tour array to obtain the LCA.
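The steps above can be sketched end to end as follows. This is a minimal sketch under stated assumptions, not a reproduction of the article's own listing: the `Node` and `LCAFinder` names are hypothetical, keys are assumed to be 1..V, and the sample tree is one that is consistent with the example (LCA of 4 and 9 is 2).

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


class LCAFinder:
    """Answers LCA queries on a binary tree whose keys are 1..V."""

    def __init__(self, root, num_nodes):
        self.euler = []                      # keys in Euler tour order
        self.level = []                      # level of each Euler tour entry
        self.first = [-1] * (num_nodes + 1)  # first Euler index of each key
        self._tour(root, 0)
        self.seg = [0] * (4 * len(self.level))
        self._build(1, 0, len(self.level) - 1)

    def _tour(self, node, depth):
        self.first[node.key] = len(self.euler)
        self.euler.append(node.key)
        self.level.append(depth)
        for child in (node.left, node.right):
            if child is not None:
                self._tour(child, depth + 1)
                # Revisit the parent after returning from each child.
                self.euler.append(node.key)
                self.level.append(depth)

    def _min_index(self, i, j):
        # Index with the smaller level; ties go to the first index.
        return i if self.level[i] <= self.level[j] else j

    def _build(self, pos, lo, hi):
        if lo == hi:
            self.seg[pos] = lo
            return
        mid = (lo + hi) // 2
        self._build(2 * pos, lo, mid)
        self._build(2 * pos + 1, mid + 1, hi)
        self.seg[pos] = self._min_index(self.seg[2 * pos], self.seg[2 * pos + 1])

    def _query(self, pos, lo, hi, left, right):
        if left <= lo and hi <= right:
            return self.seg[pos]
        mid = (lo + hi) // 2
        if right <= mid:
            return self._query(2 * pos, lo, mid, left, right)
        if left > mid:
            return self._query(2 * pos + 1, mid + 1, hi, left, right)
        return self._min_index(
            self._query(2 * pos, lo, mid, left, right),
            self._query(2 * pos + 1, mid + 1, hi, left, right),
        )

    def lca(self, u, v):
        # Range endpoints come from the first occurrence array.
        left, right = sorted((self.first[u], self.first[v]))
        idx = self._query(1, 0, len(self.level) - 1, left, right)
        return self.euler[idx]


# Hypothetical tree consistent with the example in the text.
root = Node(1,
            Node(2, Node(4), Node(5, Node(8), Node(9))),
            Node(3, Node(6), Node(7)))
finder = LCAFinder(root, 9)
print(finder.lca(4, 9))
```

The tour and the segment tree are built once in the constructor, so every later call to `lca` only pays for one range query.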
Below is the implementation of the above algorithm.
Output:
The LCA of node 4 and node 9 is node 2.
Note:
- We assume that the nodes queried are present in the tree.
- We also assume that if there are V nodes in the tree, then the keys (or data) of these nodes are in the range from 1 to V.
Time Complexity:
- Euler tour: The number of nodes is V, and for a tree, E = V - 1. The Euler tour (a DFS) takes O(V + E) time, which is O(V).
- Segment tree construction: O(n), where n = V + E = 2*V - 1.
- Range minimum query: O(log n)
Overall, this method takes O(n) time for preprocessing and O(log n) time per query, so q queries cost O(n + q log n) in total. It is therefore useful when a large number of LCA queries must be answered on a single tree. (Note that LCA is useful for finding the shortest path between two nodes of a binary tree.)
Auxiliary Space:
- Euler tour array: O(n), where n = 2*V - 1
- Node levels array: O(n)
- First occurrences array: O(V)
- Segment tree: O(n)
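Another observation is that adjacent elements in the level array differ by exactly 1, because each step of the Euler tour moves either to a child or back to the parent. RMQ on such an array is a restricted form of the problem that can be solved more efficiently than general RMQ. Combined with the conversion of a general RMQ problem into an LCA problem, this yields faster RMQ algorithms.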