
CSES Solution - Repeating Substring

A repeating substring is a substring that occurs in two (or more) locations in the string; the occurrences may overlap. Your task is to find the longest repeating substring in a given string, or print -1 if no substring repeats.

Examples:

Input: s = "cabababc"
Output: abab

Input: s = "babababb"
Output: babab

Approach:

The solution is based on two main concepts: Suffix Arrays and Longest Common Prefix (LCP) Arrays.

Suffix Arrays: A suffix array lists the starting positions of all suffixes of a string, ordered by the lexicographical order of the suffixes. Sorting places suffixes that begin with the same characters next to each other, so every repeated substring shows up as a common prefix of neighbouring suffixes.
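To make the definition concrete, here is a minimal sketch that builds the suffix array of the first example by sorting the suffixes directly. The helper name `naive_suffix_array` is an assumption made for illustration; this quadratic approach is only meant for small strings and is not the construction used in the implementation further down.

```python
def naive_suffix_array(text):
    """Return the start indices of all suffixes of text, in sorted order."""
    # Sorting by the suffix itself is simple but slow: each comparison
    # may scan O(n) characters.
    return sorted(range(len(text)), key=lambda i: text[i:])


s = "cabababc"
sa = naive_suffix_array(s)
for start in sa:
    print(start, s[start:])
```

For "cabababc" the three suffixes starting with "a" come first, then the three starting with "b", then "c" and the whole string.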

Longest Common Prefix (LCP) Array: The LCP array stores, for each pair of consecutive suffixes in the suffix array, the length of their longest common prefix. Any substring that occurs twice is a common prefix of two suffixes, and the longest such prefix is always attained by some adjacent pair in sorted order, so the largest LCP value is the length of the longest repeating substring.
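As a small illustration of this definition, the sketch below computes the LCP values for the first example by comparing adjacent suffixes character by character. The helper name `common_prefix_length` is hypothetical, and the list here has n - 1 entries (one per adjacent pair), whereas the implementation further down keeps an extra trailing 0.

```python
def common_prefix_length(a, b):
    """Length of the longest common prefix of strings a and b."""
    k = 0
    while k < len(a) and k < len(b) and a[k] == b[k]:
        k += 1
    return k


s = "cabababc"
# Naive suffix array: sort start indices by the suffix they begin.
sa = sorted(range(len(s)), key=lambda i: s[i:])
# lcp[i] compares the suffixes at sorted positions i and i + 1.
lcp = [common_prefix_length(s[sa[i]:], s[sa[i + 1]:])
       for i in range(len(sa) - 1)]

best = max(range(len(lcp)), key=lambda i: lcp[i])
print(lcp)
print(s[sa[best]:sa[best] + lcp[best]])
```

The largest entry is 4, shared by the suffixes starting at positions 1 and 3, which gives "abab".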

The core logic of the solution is as follows:

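  • Step 1: Build the suffix array of the string by prefix doubling, the idea behind the Manber-Myers algorithm. The suffixes are first sorted by their first character, then by their first 2 characters, then 4, and so on, until every suffix has a distinct rank. Each round reuses the ranks from the previous round, so two suffixes can be compared in O(1). Because this implementation runs a comparison sort in each of the O(log n) rounds, it takes O(n log^2 n) time overall; replacing the sort with a radix sort would give O(n log n).
  • Step 2: Once the suffix array is built, build the LCP array with Kasai's algorithm. The suffixes are visited in the order they appear in the string, and each one is compared character by character with its successor in the suffix array. The match length found for one suffix, minus one, is a valid starting point for the next, so the total work is O(n).
  • Step 3: After the LCP array is built, the maximum value in the LCP array is the length of the longest repeating substring. If the maximum is LCP[i], the substring itself is the first LCP[i] characters of the suffix starting at suffixArray[i].
  • Step 4: If the maximum LCP is 0, it means there are no repeating substrings, so the program outputs -1. Otherwise, it prints the longest repeating substring.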
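The doubling in Step 1 is easier to follow with a trace. The sketch below is an illustrative Python version, not the implementation given later: the function name `doubling_rounds` and the use of -1 as the rank of a suffix that has run out of characters are assumptions made for this example. It records, for each round, the gap, the current order of the suffixes, and how many distinct ranks exist.

```python
def doubling_rounds(text):
    """Sort suffixes by their first 1, 2, 4, ... characters.

    Returns one (gap, order, distinct_ranks) tuple per round.
    """
    n = len(text)
    if n == 0:
        return []
    rank = [ord(c) for c in text]  # initial rank: the character code
    order = list(range(n))
    gap = 1
    rounds = []
    while True:
        def key(i):
            # (rank of the first gap characters,
            #  rank of the next gap characters, or -1 if the suffix ends)
            return (rank[i], rank[i + gap] if i + gap < n else -1)

        order.sort(key=key)
        # Re-rank: equal keys share a rank, a larger key gets the next one.
        new_rank = [0] * n
        for idx in range(1, n):
            prev, cur = order[idx - 1], order[idx]
            new_rank[cur] = new_rank[prev] + (key(prev) < key(cur))
        rank = new_rank
        distinct = rank[order[-1]] + 1
        rounds.append((gap, order[:], distinct))
        if distinct == n:  # every suffix has its own rank: fully sorted
            break
        gap *= 2
    return rounds


rounds = doubling_rounds("cabababc")
for gap, order, distinct in rounds:
    print(gap, order, distinct)
```

On "cabababc" the order happens to be correct after the first round, but the ranks only become distinct in the third round (gap 4), which is when the loop can stop.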

Below is the implementation of the above approach:

C++:

#include <bits/stdc++.h>
using namespace std;

#define int long long
#define endl '\n'

// Maximum size of the string
const int maxSize = 1e5 + 5;

// Arrays for suffix array, position, temporary, and longest
// common prefix
int suffixArray[maxSize], position[maxSize],
    temporary[maxSize], LCP[maxSize];

// Variables for gap between characters and size of the
// string
int gap, n;

// The input string
string s;

// Function to compare two suffixes
bool compareSuffixes(int x, int y)
{
    // Compare function for sorting suffix array
    if (position[x] != position[y])
        return position[x] < position[y];
    x += gap;
    y += gap;
    return (x < n && y < n) ? position[x] < position[y]
                            : x > y;
}

// Function to build the suffix array
void buildSuffixArray()
{
    // Build the suffix array using the Manber-Myers
    // algorithm

    // Initialize suffix array and position array
    for (int i = 0; i < n; i++)
        suffixArray[i] = i, position[i] = s[i];

    // Iterate over gaps (powers of 2) for sorting
    for (gap = 1;; gap <<= 1) {
        // Sort suffix array using the current gap
        sort(suffixArray, suffixArray + n, compareSuffixes);

        // Update temporary array based on the sorted suffix
        // array
        for (int i = 0; i < n - 1; i++)
            temporary[i + 1]
                = temporary[i]
                  + compareSuffixes(suffixArray[i],
                                    suffixArray[i + 1]);

        // Update position array with temporary values
        for (int i = 0; i < n; i++)
            position[suffixArray[i]] = temporary[i];

        // Check if all suffixes are at their correct
        // position
        if (temporary[n - 1] == n - 1)
            break;
    }
}

// Function to build the longest common prefix array
void buildLCP()
{
    // Build the Longest Common Prefix (LCP) array

    // Iterate over the original string to compute LCP
    // values
    for (int i = 0, k = 0; i < n; i++)
        if (position[i] != n - 1) {
            // Get the next suffix in the sorted order
            int j = suffixArray[position[i] + 1];

            // Compare characters and update LCP, stopping at
            // the end of either suffix
            while (i + k < n && j + k < n
                   && s[i + k] == s[j + k])
                k++;

            // Set LCP value and decrement if non-zero
            LCP[position[i]] = k;
            if (k)
                k--;
        }
     
}

// Main function
signed main()
{

    // Read the input string
    s = "cabababc";
    n = s.size();

    // Build the suffix array and longest common prefix
    // array
    buildSuffixArray();
    buildLCP();

    // Find the index of the maximum element in the LCP
    // array
    int maxIndex = max_element(LCP, LCP + n) - LCP;

    // If the maximum LCP value is 0, no repeating substring
    // exists
    if (LCP[maxIndex] == 0)
        return cout << -1, 0;

    // Output the longest repeating substring
    cout << s.substr(suffixArray[maxIndex], LCP[maxIndex]);
}

Java:

import java.util.Arrays;

public class Main {
    // Maximum size of the string
    static final int maxSize = (int) Math.pow(10, 5) + 5;

    // Variables for gap between characters and size of the string
    static int gap = 0;
    static int n = 0;

    // The input string
    static String s = "";

    // Function to compare two suffixes
    static boolean compareSuffixes(int x, int y, int[] position) {
        // Compare function for sorting suffix array
        if (position[x] != position[y]) {
            return position[x] < position[y];
        }
        x += gap;
        y += gap;
        return (x < n && y < n) ? position[x] < position[y] : x > y;
    }

    // Function to build the suffix array
    static int[][] buildSuffixArray() {
        // Initialize suffix array and position array
        Integer[] suffixArray = new Integer[n];
        for (int i = 0; i < n; i++) {
            suffixArray[i] = i;
        }
        int[] position = new int[n];
        for (int i = 0; i < n; i++) {
            position[i] = s.charAt(i);
        }

        // Iterate over gaps (powers of 2) for sorting
        gap = 1;
        while (true) {
            // Sort suffix array using the current gap; return 0 for ties
            // so the comparator satisfies the Comparator contract
            Arrays.sort(suffixArray, (x, y) -> compareSuffixes(x, y, position) ? -1
                    : (compareSuffixes(y, x, position) ? 1 : 0));

            // Update temporary array based on the sorted suffix array
            int[] temporary = new int[n];
            for (int i = 0; i < n - 1; i++) {
                temporary[i + 1] = temporary[i] + (compareSuffixes(suffixArray[i], suffixArray[i + 1], position) ? 1 : 0);
            }

            // Update position array with temporary values
            for (int i = 0; i < n; i++) {
                position[suffixArray[i]] = temporary[i];
            }

            // Stop once every suffix has a distinct rank
            if (temporary[n - 1] == n - 1) {
                break;
            }

            gap <<= 1;
        }

        return new int[][] {Arrays.stream(suffixArray).mapToInt(Integer::intValue).toArray(), position};
    }

    // Function to build the longest common prefix array
    static int[] buildLCP(int[] suffixArray, int[] position) {
        // Build the Longest Common Prefix (LCP) array

        // Iterate over the original string to compute LCP values
        int[] LCP = new int[n];
        int k = 0;
        for (int i = 0; i < n; i++) {
            if (position[i] != n - 1) {
                // Get the next suffix in the sorted order
                int j = suffixArray[position[i] + 1];

                // Compare characters and update LCP
                while (i + k < n && j + k < n && s.charAt(i + k) == s.charAt(j + k)) {
                    k++;
                }

                // Set LCP value and decrement if non-zero
                LCP[position[i]] = k;
                if (k > 0) {
                    k--;
                }
            }
        }
        return LCP;
    }

    public static void main(String[] args) {
        // Read the input string
        s = "cabababc";
        n = s.length();

        // Build the suffix array
        int[][] result = buildSuffixArray();
        int[] suffixArray = result[0];
        int[] position = result[1];

        // Build the longest common prefix array
        int[] LCP = buildLCP(suffixArray, position);

        // Find the index of the maximum element in the LCP array
        int maxIndex = 0;
        for (int i = 1; i < LCP.length; i++) {
            if (LCP[i] > LCP[maxIndex]) {
                maxIndex = i;
            }
        }

        // If the maximum LCP value is 0, no repeating substring exists
        if (LCP[maxIndex] == 0) {
            System.out.println(-1);
        } else {
            // Output the longest repeating substring
            System.out.println(s.substring(suffixArray[maxIndex], suffixArray[maxIndex] + LCP[maxIndex]));
        }
    }
}

Python:

# Maximum size of the string
maxSize = 10**5 + 5

# Variables for gap between characters and size of the string
gap = 0
n = 0

# The input string
s = ""

# Function to compare two suffixes
def compareSuffixes(x, y, position):
    # Compare function for sorting suffix array
    if position[x] != position[y]:
        return position[x] < position[y]
    x += gap
    y += gap
    return position[x] < position[y] if (x < n and y < n) else x > y

# Function to build the suffix array
def buildSuffixArray():
    global gap, n

    # Initialize suffix array and position array
    suffixArray = [i for i in range(n)]
    position = [ord(c) for c in s]

    # Iterate over gaps (powers of 2) for sorting
    gap = 1
    while True:
        # Sort by (rank of first half, rank of second half); -1 marks a
        # suffix that ends before the second half begins
        suffixArray.sort(
            key=lambda x: (position[x], position[x + gap] if x + gap < n else -1))

        # Update temporary array based on the sorted suffix array
        temporary = [0] * n
        for i in range(n - 1):
            temporary[i + 1] = temporary[i] + compareSuffixes(suffixArray[i], suffixArray[i + 1], position)

        # Update position array with temporary values
        for i in range(n):
            position[suffixArray[i]] = temporary[i]

        # Stop once every suffix has a distinct rank
        if temporary[n - 1] == n - 1:
            break

        gap <<= 1

    return suffixArray, position

# Function to build the longest common prefix array
def buildLCP(suffixArray, position):
    # Build the Longest Common Prefix (LCP) array

    # Iterate over the original string to compute LCP values
    LCP = [0] * n
    k = 0
    for i in range(n):
        if position[i] != n - 1:
            # Get the next suffix in the sorted order
            j = suffixArray[position[i] + 1]

            # Compare characters and update LCP
            while i + k < n and j + k < n and s[i + k] == s[j + k]:
                k += 1

            # Set LCP value and decrement if non-zero
            LCP[position[i]] = k
            if k:
                k -= 1
    return LCP

# Read the input string
s = "cabababc"
n = len(s)

# Build the suffix array
suffixArray, position = buildSuffixArray()

# Build the longest common prefix array
LCP = buildLCP(suffixArray, position)

# Find the index of the maximum element in the LCP array
maxIndex = LCP.index(max(LCP))

# If the maximum LCP value is 0, no repeating substring exists
if LCP[maxIndex] == 0:
    print(-1)
else:
    # Output the longest repeating substring
    print(s[suffixArray[maxIndex]: suffixArray[maxIndex] + LCP[maxIndex]])

JavaScript:

// Maximum size of the string
const maxSize = 10 ** 5 + 5;

// Variables for gap between characters and size of the string
let gap = 0;
let n = 0;

// The input string
let s = "";

// Function to compare two suffixes
function compareSuffixes(x, y, position) {
    // Compare function for sorting suffix array
    if (position[x] !== position[y]) {
        return position[x] < position[y];
    }
    x += gap;
    y += gap;
    return (x < n && y < n) ? position[x] < position[y] : x > y;
}

// Function to build the suffix array
function buildSuffixArray() {
    // Initialize suffix array and position array
    let suffixArray = [...Array(n).keys()];
    let position = [...s].map(c => c.charCodeAt(0));

    // Iterate over gaps (powers of 2) for sorting
    gap = 1;
    while (true) {
        // Sort by rank of the first half, then by rank of the second half
        // (-1 marks a suffix that ends before the second half begins)
        suffixArray.sort((x, y) => {
            if (position[x] !== position[y]) {
                return position[x] - position[y];
            }
            const rankX = x + gap < n ? position[x + gap] : -1;
            const rankY = y + gap < n ? position[y + gap] : -1;
            return rankX - rankY;
        });

        // Update temporary array based on the sorted suffix array
        let temporary = Array(n).fill(0);
        for (let i = 0; i < n - 1; i++) {
            temporary[i + 1] = temporary[i] + (compareSuffixes(suffixArray[i], suffixArray[i + 1], position) ? 1 : 0);
        }

        // Update position array with temporary values
        for (let i = 0; i < n; i++) {
            position[suffixArray[i]] = temporary[i];
        }

        // Stop once every suffix has a distinct rank
        if (temporary[n - 1] === n - 1) {
            break;
        }

        gap <<= 1;
    }

    return [suffixArray, position];
}

// Function to build the longest common prefix array
function buildLCP(suffixArray, position) {
    // Build the Longest Common Prefix (LCP) array

    // Iterate over the original string to compute LCP values
    let LCP = Array(n).fill(0);
    let k = 0;
    for (let i = 0; i < n; i++) {
        if (position[i] !== n - 1) {
            // Get the next suffix in the sorted order
            let j = suffixArray[position[i] + 1];

            // Compare characters and update LCP
            while (i + k < n && j + k < n && s[i + k] === s[j + k]) {
                k++;
            }

            // Set LCP value and decrement if non-zero
            LCP[position[i]] = k;
            if (k !== 0) {
                k--;
            }
        }
    }
    return LCP;
}

// Read the input string
s = "cabababc";
n = s.length;

// Build the suffix array
const [suffixArray, position] = buildSuffixArray();

// Build the longest common prefix array
const LCP = buildLCP(suffixArray, position);

// Find the index of the maximum element in the LCP array
const maxIndex = LCP.indexOf(Math.max(...LCP));

// If the maximum LCP value is 0, no repeating substring exists
if (LCP[maxIndex] === 0) {
    console.log(-1);
} else {
    // Output the longest repeating substring
    console.log(s.substring(suffixArray[maxIndex], suffixArray[maxIndex] + LCP[maxIndex]));
}

Output
abab

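Time Complexity: O(n log^2 n), since each of the O(log n) doubling rounds runs an O(n log n) comparison sort; building the LCP array adds only O(n).
Auxiliary Space: O(n)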
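When testing a suffix-array solution, it can help to compare its output against a slow but obviously correct reference on small inputs. The sketch below is one such brute-force check; the function name `longest_repeat_bruteforce` is hypothetical, and its roughly cubic running time makes it suitable only for short strings, not for the full problem limits. When several substrings share the maximum length, it may return a different one than the suffix-array solution, so comparing lengths is the safer check.

```python
def longest_repeat_bruteforce(text):
    """Return a longest substring occurring at least twice, or "-1"."""
    n = len(text)
    # Try lengths from longest to shortest; the first repeat found wins.
    for length in range(n - 1, 0, -1):
        seen = set()
        for start in range(n - length + 1):
            piece = text[start:start + length]
            if piece in seen:
                return piece
            seen.add(piece)
    return "-1"


print(longest_repeat_bruteforce("cabababc"))
print(longest_repeat_bruteforce("babababb"))
```

Both examples from the top of the article can be checked this way, along with a string such as "abc" that has no repeat.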

