# Longest Common Extension / LCE | Set 2 (Reduction to RMQ)

The Longest Common Extension (LCE) problem considers a string **s** and computes, for each pair (L, R), the longest substring of **s** that starts at both L and R. In other words, for each query we have to report the length of the longest common prefix of the suffixes starting at indexes L and R.

**Example:**

**String** : “abbababba”

**Queries:** LCE(1, 2), LCE(1, 6) and LCE(0, 5)

Find the length of the longest common prefix of the suffixes starting at each pair of indexes: **(1, 2), (1, 6) and (0, 5)**.

With 0-based indexing, the suffixes starting at 1 and 2 are “bbababba” and “bababba”, which share only “b”, so LCE(1, 2) = 1. The suffixes starting at 1 and 6 share “bba”, so LCE(1, 6) = 3, and the suffixes starting at 0 and 5 share “abba”, so LCE(0, 5) = 4.
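The example values can be checked by comparing characters directly. The following is a minimal sketch, not the method of this set; the helper name `lceBruteForce` is hypothetical and the indexes are assumed to be 0-based.

```cpp
#include <cassert>
#include <string>

// Brute-force LCE: count how many characters match when reading
// the string simultaneously from positions L and R.
// lceBruteForce is a hypothetical helper name used only for illustration.
int lceBruteForce(const std::string &s, int L, int R)
{
    const int n = static_cast<int>(s.size());
    assert(0 <= L && L < n && 0 <= R && R < n);

    int len = 0;
    while (L + len < n && R + len < n && s[L + len] == s[R + len])
        ++len;
    return len;
}
```

Calling `lceBruteForce("abbababba", 1, 6)` compares “bbababba” with “bba” and stops when the shorter suffix runs out, giving 3.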

In Set 1, we explained the naive method for finding the length of the LCE of a string over many queries. In this set we show how the LCE problem can be reduced to an RMQ (range minimum query) problem, which lowers the asymptotic time complexity compared with the naive method.

**Reduction of LCE to RMQ**

Let the input string be **S** and the queries be of the form **LCE(L, R)**. Let the suffix array for S be **Suff[]** and the lcp array be **lcp[]**.

The longest common extension between two suffixes S_{L} and S_{R} of S can be obtained from the lcp array in the following way.

- Let low be the rank of S_{L} among the suffixes of S (that is, Suff[low] = L).
- Let high be the rank of S_{R} among the suffixes of S. Without loss of generality, we assume that low < high.
- Then the longest common extension of S_{L} and S_{R} is lcp(low, high) = min_{low <= k < high} lcp[k].
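To make the three steps concrete, here is a small sketch that builds the arrays by direct comparison and then takes the minimum over the lcp range. It is for illustration only and is much slower than a real suffix array construction. The names `LceTables`, `buildTablesNaive` and `lceByRangeMin` are assumptions made for this example, and lcp[k] is taken to be the common prefix length of the suffixes of rank k and k + 1.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical container for the three arrays used by the reduction.
struct LceTables
{
    std::vector<int> suff;    // suff[k]    = start index of the suffix of rank k
    std::vector<int> invSuff; // invSuff[i] = rank of the suffix starting at i
    std::vector<int> lcp;     // lcp[k]     = LCP of suffixes suff[k] and suff[k+1]
};

// Builds all three arrays by comparing suffixes directly (illustration only).
LceTables buildTablesNaive(const std::string &s)
{
    const int n = static_cast<int>(s.size());
    LceTables t;

    // Sort the start positions by the suffixes they denote.
    t.suff.resize(n);
    std::iota(t.suff.begin(), t.suff.end(), 0);
    std::sort(t.suff.begin(), t.suff.end(), [&s](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });

    // Invert the suffix array to get the rank of every suffix.
    t.invSuff.assign(n, 0);
    for (int k = 0; k < n; ++k)
        t.invSuff[t.suff[k]] = k;

    // lcp between each pair of neighbours in sorted order.
    t.lcp.assign(n, 0);
    for (int k = 0; k + 1 < n; ++k)
    {
        int a = t.suff[k], b = t.suff[k + 1], len = 0;
        while (a + len < n && b + len < n && s[a + len] == s[b + len])
            ++len;
        t.lcp[k] = len;
    }
    return t;
}

// LCE(L, R) = minimum of lcp[k] over low <= k < high.
int lceByRangeMin(const LceTables &t, int L, int R)
{
    const int n = static_cast<int>(t.suff.size());
    assert(0 <= L && L < n && 0 <= R && R < n);
    if (L == R)
        return n - L; // a suffix matches itself completely

    const int low = std::min(t.invSuff[L], t.invSuff[R]);
    const int high = std::max(t.invSuff[L], t.invSuff[R]);
    return *std::min_element(t.lcp.begin() + low, t.lcp.begin() + high);
}
```

For “abbababba” this gives Suff[] = {8, 3, 5, 0, 7, 2, 4, 6, 1}, invSuff[] = {3, 8, 5, 1, 6, 2, 7, 4, 0} and lcp[] = {1, 2, 4, 0, 2, 3, 1, 3, 0}. For LCE(1, 2) the ranks are 8 and 5, so low = 5, high = 8 and the answer is min(lcp[5], lcp[6], lcp[7]) = min(3, 1, 3) = 1.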

**Proof:** Let S_{L} = s_{L}…s_{L+c}…s_{n} and S_{R} = s_{R}…s_{R+c}…s_{n}, and let c be the length of the longest common extension of S_{L} and S_{R} (i.e. s_{L}…s_{L+c-1} = s_{R}…s_{R+c-1}). We assume that the string S has a sentinel character, so that no suffix of S is a prefix of any other suffix of S but itself.

- If low = high - 1, then i = low and lcp[low] = c is the longest common extension of S_{L} and S_{R}, and we are done.
- If low < high - 1, then select i such that lcp[i] is the minimum value in the interval [low, high - 1] of the lcp array. We then have two possible cases:
  - If c < lcp[i], we have a contradiction, because s_{L}…s_{L+lcp[i]-1} = s_{R}…s_{R+lcp[i]-1} by the definition of the LCP table and the fact that the entries of lcp correspond to sorted suffixes of S.
  - If c > lcp[i], let j = Suff[i], so that S_{j} is the suffix associated with position i. S_{j} is such that s_{j}…s_{j+lcp[i]-1} = s_{L}…s_{L+lcp[i]-1} and s_{j}…s_{j+lcp[i]-1} = s_{R}…s_{R+lcp[i]-1}, but since s_{L}…s_{L+c-1} = s_{R}…s_{R+c-1} with c > lcp[i], the suffix array would be wrongly sorted, which is a contradiction.

  Therefore we have c = lcp[i].

Thus we have reduced our longest common extension query to a range minimum-query over a range in lcp.
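Once the query is a range minimum over lcp, any RMQ structure can answer it. As one illustration, a sparse table preprocesses the array in O(N.logN) time and space and then answers each range minimum in O(1). This is a swapped-in technique shown only as a sketch; it is not the direct scan implemented in this set, and the class name `SparseTableRMQ` is an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sparse table for range-minimum queries over a fixed array.
// table[j][i] holds the minimum of a[i .. i + 2^j - 1].
class SparseTableRMQ
{
public:
    explicit SparseTableRMQ(const std::vector<int> &a)
    {
        const int n = static_cast<int>(a.size());

        // logs[x] = floor(log2(x)) for x >= 1
        logs.assign(n + 1, 0);
        for (int i = 2; i <= n; ++i)
            logs[i] = logs[i / 2] + 1;

        // Level 0 is the array itself; each level doubles the window.
        table.push_back(a);
        for (int j = 1; (1 << j) <= n; ++j)
        {
            const int half = 1 << (j - 1);
            std::vector<int> row(n - (1 << j) + 1);
            for (int i = 0; i + (1 << j) <= n; ++i)
                row[i] = std::min(table[j - 1][i], table[j - 1][i + half]);
            table.push_back(row);
        }
    }

    // Minimum of a[lo .. hi - 1]; requires lo < hi.
    int query(int lo, int hi) const
    {
        assert(0 <= lo && lo < hi && hi <= static_cast<int>(table[0].size()));
        const int j = logs[hi - lo];
        // Two overlapping windows of length 2^j cover the whole range.
        return std::min(table[j][lo], table[j][hi - (1 << j)]);
    }

private:
    std::vector<std::vector<int>> table;
    std::vector<int> logs;
};
```

With low and high taken from the inverse suffix array as described above (low < high), LCE(L, R) would be `rmq.query(low, high)` on a `SparseTableRMQ rmq(lcp)`. The trade-off is the extra O(N.logN) memory compared with scanning the range directly.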

**Algorithm**

- To find low and high, we first compute the suffix array and then derive the inverse suffix array from it.
- We also need the lcp array, so we use Kasai’s Algorithm to build it from the suffix array.
- Once these are ready, for each query we find the minimum value in the lcp array over the indexes low to high - 1 (as proved above).

The minimum value is the length of the LCE for that query.

Implementation

```cpp
// A C++ program to find the length of the longest common
// extension using the Direct Minimum algorithm
#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>
using namespace std;

// Structure to represent a query of the form (L, R)
struct Query
{
    int L, R;
};

// Structure to store information of a suffix
struct suffix
{
    int index;   // To store original index
    int rank[2]; // To store ranks and next rank pair
};

// A utility function to get minimum of two numbers
int minVal(int x, int y) { return (x < y)? x: y; }

// A utility function to get maximum of two numbers
int maxVal(int x, int y) { return (x > y)? x: y; }

// A comparison function used by sort() to compare
// two suffixes. Compares two pairs, returns true if
// the first pair is smaller
bool cmp(const suffix &a, const suffix &b)
{
    return (a.rank[0] == b.rank[0])?
           (a.rank[1] < b.rank[1]):
           (a.rank[0] < b.rank[0]);
}

// This is the main function that takes a string 'txt'
// of size n as an argument, builds and returns the
// suffix array for the given string
vector<int> buildSuffixArray(const string &txt, int n)
{
    // A vector to store suffixes and their indexes
    // (a vector is used because variable-length arrays
    // are not standard C++)
    vector<suffix> suffixes(n);

    // Store suffixes and their indexes in an array
    // of structures.
    // The structure is needed to sort the suffixes
    // alphabetically and maintain their old indexes
    // while sorting
    for (int i = 0; i < n; i++)
    {
        suffixes[i].index = i;
        suffixes[i].rank[0] = txt[i] - 'a';
        suffixes[i].rank[1] =
            ((i+1) < n)? (txt[i + 1] - 'a'): -1;
    }

    // Sort the suffixes using the comparison function
    // defined above.
    sort(suffixes.begin(), suffixes.end(), cmp);

    // At this point, all suffixes are sorted according
    // to first 2 characters. Let us sort suffixes
    // according to first 4 characters, then first 8
    // and so on

    // This array is needed to get the index in suffixes[]
    // from original index. This mapping is needed to get
    // next suffix.
    vector<int> ind(n);

    for (int k = 4; k < 2*n; k = k*2)
    {
        // Assigning rank and index values to first suffix
        int rank = 0;
        int prev_rank = suffixes[0].rank[0];
        suffixes[0].rank[0] = rank;
        ind[suffixes[0].index] = 0;

        // Assigning rank to suffixes
        for (int i = 1; i < n; i++)
        {
            // If first rank and next ranks are same as
            // that of previous suffix in array, assign
            // the same new rank to this suffix
            if (suffixes[i].rank[0] == prev_rank &&
                suffixes[i].rank[1] == suffixes[i-1].rank[1])
            {
                prev_rank = suffixes[i].rank[0];
                suffixes[i].rank[0] = rank;
            }
            else // Otherwise increment rank and assign
            {
                prev_rank = suffixes[i].rank[0];
                suffixes[i].rank[0] = ++rank;
            }
            ind[suffixes[i].index] = i;
        }

        // Assign next rank to every suffix
        for (int i = 0; i < n; i++)
        {
            int nextindex = suffixes[i].index + k/2;
            suffixes[i].rank[1] = (nextindex < n)?
                                  suffixes[ind[nextindex]].rank[0]: -1;
        }

        // Sort the suffixes according to first k characters
        sort(suffixes.begin(), suffixes.end(), cmp);
    }

    // Store indexes of all sorted suffixes in the suffix array
    vector<int> suffixArr;
    for (int i = 0; i < n; i++)
        suffixArr.push_back(suffixes[i].index);

    // Return the suffix array
    return suffixArr;
}

/* To construct and return LCP */
vector<int> kasai(const string &txt, const vector<int> &suffixArr,
                  vector<int> &invSuff)
{
    int n = suffixArr.size();

    // To store LCP array
    vector<int> lcp(n, 0);

    // Fill values in invSuff[]
    for (int i = 0; i < n; i++)
        invSuff[suffixArr[i]] = i;

    // Initialize length of previous LCP
    int k = 0;

    // Process all suffixes one by one starting from
    // first suffix in txt[]
    for (int i = 0; i < n; i++)
    {
        /* If the current suffix is at n-1, then we don't
           have next substring to consider. So lcp is not
           defined for this substring, we put zero. */
        if (invSuff[i] == n-1)
        {
            k = 0;
            continue;
        }

        /* j contains index of the next substring to
           be considered to compare with the present
           substring, i.e., next string in suffix array */
        int j = suffixArr[invSuff[i]+1];

        // Directly start matching from k'th index as
        // at-least k-1 characters will match
        while (i+k < n && j+k < n && txt[i+k] == txt[j+k])
            k++;

        lcp[invSuff[i]] = k; // lcp for the present suffix.

        // Deleting the starting character from the string.
        if (k > 0)
            k--;
    }

    // return the constructed lcp array
    return lcp;
}

// A utility function to find longest common extension
// from index - L and index - R. The vectors are passed
// by reference so that a query does not copy them.
int LCE(const vector<int> &lcp, const vector<int> &invSuff,
        int n, int L, int R)
{
    // Handle the corner case
    if (L == R)
        return (n-L);

    int low = minVal(invSuff[L], invSuff[R]);
    int high = maxVal(invSuff[L], invSuff[R]);

    // Minimum of lcp[low], ..., lcp[high-1]
    int length = lcp[low];

    for (int i = low+1; i < high; i++)
    {
        if (lcp[i] < length)
            length = lcp[i];
    }

    return (length);
}

// A function to answer queries of longest common extension
void LCEQueries(const string &str, int n, Query q[], int m)
{
    // Build a suffix array
    vector<int> suffixArr = buildSuffixArray(str, str.length());

    // An auxiliary array to store inverse of suffix array
    // elements. For example if suffixArr[0] is 5, then
    // invSuff[5] would store 0. This is used to get next
    // suffix string from suffix array.
    vector<int> invSuff(n, 0);

    // Build a lcp vector
    vector<int> lcp = kasai(str, suffixArr, invSuff);

    for (int i = 0; i < m; i++)
    {
        int L = q[i].L;
        int R = q[i].R;

        printf("LCE (%d, %d) = %d\n", L, R,
               LCE(lcp, invSuff, n, L, R));
    }

    return;
}

// Driver Program to test above functions
int main()
{
    string str = "abbababba";
    int n = str.length();

    // LCE Queries to answer
    Query q[] = {{1, 2}, {1, 6}, {0, 5}};
    int m = sizeof(q)/sizeof(q[0]);

    LCEQueries(str, n, q, m);

    return (0);
}
```

Output:

    LCE (1, 2) = 1
    LCE (1, 6) = 3
    LCE (0, 5) = 4

**Analysis of Reduction to RMQ method**

**Time Complexity:**

- Constructing the suffix array and the lcp array takes **O(N.logN)** time, provided the suffix array is built in O(N.logN). The implementation above calls a comparison sort in each of its O(logN) doubling rounds, so as written it runs in O(N.logN.logN). Kasai’s Algorithm takes O(N).
- Answering each query takes **O(|invSuff[R] - invSuff[L]|)** time.

Hence the overall time complexity is **O(N.logN + Q.(|invSuff[R] - invSuff[L]|))**

where,

**Q** = Number of LCE Queries.

**N** = Length of the input string.

**invSuff[]** = Inverse suffix array of the input string.

Although this may seem like an inefficient algorithm, in practice it generally outperforms the other algorithms for answering LCE queries.

We will give a detailed description of the performance of this method in the next set.

**Auxiliary Space:** We use **O(N)** auxiliary space to store the lcp, suffix and inverse suffix arrays.

**Reference:**

http://www.sciencedirect.com/science/article/pii/S1570866710000377

This article is contributed by **Rachit Belwariar**.