Sum of all LCP of maximum length by selecting any two Strings at a time

Given a list of strings, the task is to pair up the strings, using each string in at most one pair, so that the sum of the LCP (Longest Common Prefix) lengths of the chosen pairs is as large as possible, and to report that sum.

Examples:

Input: str = ["babab", "ababb", "abbab", "aaaaa", "babaa", "babbb"]
Output: 6
Explanation:
Choose 1st (babab) and 5th (babaa) string => length of LCP (baba) = 4,
Choose 2nd (ababb) and 3rd (abbab) string => length of LCP (ab) = 2
Sum of LCP = 4 + 2 = 6



Input: str = ["aa", "aaaa", "aaaaaaaa", "aaaabaaaa", "aaabaaa"]
Output: 7
Explanation:
Choose 3rd (aaaaaaaa) and 4th string (aaaabaaaa) => length of LCP (aaaa) = 4,
Choose 2nd (aaaa) and 5th (aaabaaa) string => length of LCP (aaa) = 3
Sum of LCP = 4 + 3 = 7
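
The LCP lengths quoted in the examples can be reproduced with a small helper. The following is a minimal sketch; the function name lcpLength is an assumption made for illustration and does not appear in the implementations further down.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Length of the longest common prefix of a and b.
// lcpLength is a hypothetical helper name used only for this sketch.
std::size_t lcpLength(const std::string& a, const std::string& b)
{
    // The common prefix cannot be longer than the shorter string.
    const std::size_t limit = std::min(a.size(), b.size());
    std::size_t i = 0;
    while (i < limit && a[i] == b[i]) {
        ++i;
    }
    return i;
}
```

For instance, lcpLength("babab", "babaa") gives 4 and lcpLength("aaaa", "aaabaaa") gives 3, matching the pairs chosen in the explanations above.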

Naive Approach:

  • Sort the list of strings in decreasing order of length.
  • Take the first string in the list, compute the length of its Longest Common Prefix with every remaining string, and store the lengths in an array.
  • Add the maximum value in the array to a variable answer, then remove both strings of that pair from the list.
  • Repeat the above steps until the list is empty or only one string is left.
  • The variable answer then holds the required sum of LCP lengths.

Time Complexity: O(M*N^2), where M = maximum string length and N = number of strings.
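
The steps of the naive approach can be sketched roughly as follows. This is only an illustration of the greedy procedure as described; the names naiveLcpSum and commonPrefix are assumptions, and ties are broken by taking the first best partner, which the description leaves open.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Length of the longest common prefix of a and b (hypothetical helper).
static std::size_t commonPrefix(const std::string& a, const std::string& b)
{
    std::size_t i = 0;
    while (i < a.size() && i < b.size() && a[i] == b[i]) {
        ++i;
    }
    return i;
}

// Greedy pairing following the naive steps (hypothetical function name).
std::size_t naiveLcpSum(std::vector<std::string> words)
{
    // Longest strings first; stable_sort keeps the input order
    // among strings of equal length.
    std::stable_sort(words.begin(), words.end(),
                     [](const std::string& x, const std::string& y) {
                         return x.size() > y.size();
                     });

    std::size_t answer = 0;
    while (words.size() >= 2) {
        // Find the partner with the longest common prefix
        // with the first string (first one wins on ties).
        std::size_t best = 0;
        std::size_t bestIdx = 1;
        for (std::size_t j = 1; j < words.size(); ++j) {
            const std::size_t len = commonPrefix(words[0], words[j]);
            if (len > best) {
                best = len;
                bestIdx = j;
            }
        }
        answer += best;

        // Remove the pair; erase the partner first because it
        // has the higher index.
        words.erase(words.begin() + static_cast<std::ptrdiff_t>(bestIdx));
        words.erase(words.begin());
    }
    return answer;
}
```

On the two example inputs above this sketch returns 6 and 7 respectively. Each round scans the remaining strings once, which is where the quadratic factor in N comes from.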

Efficient Approach:
An efficient solution can be obtained using a Trie data structure. Each trie node keeps a counter, visited, that records how many of the inserted strings pass through that node, i.e. how many strings share the prefix ending at that node.
Following are the steps:

  • Insert every string of the list into the trie character by character, so that each character of a string gets its own trie node, and increment visited on every node along the path.
  • Prefixes of maximum length correspond to the deepest nodes, so pairs are counted starting from the deepest nodes of the trie.
  • Use a depth-first search (DFS) traversal that processes the children of a node before the node itself.
  • If the visited value of a node is more than one, two or more unpaired strings share the common prefix up to that node.
  • Form as many pairs as possible at that node (visited / 2) and add the depth of the node, multiplied by the number of pairs, to a variable count.
  • Subtract the paired strings from the visited value of the current node and of its ancestors, so that strings already paired are not used again.
  • Repeat the above steps for all nodes and return the value of count.
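
To make the role of the visited counter concrete, the sketch below computes, for every non-empty prefix, how many input strings start with it. This is the value that visited holds on the corresponding trie node right after insertion. It uses a std::map keyed by the prefix instead of a real trie, purely for illustration; the name prefixCounts is an assumption.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// For every non-empty prefix, count how many input strings start with it.
// prefixCounts is a hypothetical name; a map stands in for the trie here.
std::map<std::string, int> prefixCounts(const std::vector<std::string>& words)
{
    std::map<std::string, int> counts;
    for (const std::string& w : words) {
        // Every prefix of w is "visited" once when w is inserted.
        for (std::size_t len = 1; len <= w.size(); ++len) {
            ++counts[w.substr(0, len)];
        }
    }
    return counts;
}
```

For the first example, the prefix "baba" gets a count of 2, so one pair is formed at depth 4. The prefix "bab" has a count of 3, but two of those strings are already paired deeper down, so no further pair is formed there.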

Below is the implementation of the above approach:

C++

// C++ program to find Sum of all LCP
// of maximum length by selecting
// any two Strings at a time
  
#include <bits/stdc++.h>
using namespace std;
  
class TrieNode {
public:
    char val;
  
    // A map holds the child pointers so that
    // nodes are allocated only for characters
    // that actually occur, which keeps the
    // program space efficient
    map<char, TrieNode*> children;
  
    // Counts the number of times the node
    // is visited while making the trie
    int visited;
  
    // Initially visited value for all
    // nodes is zero
    TrieNode(char x)
    {
        val = x;
        visited = 0;
    }
};
  
class Trie {
public:
    TrieNode* head;
  
    // The head (root) node holds '\0'; every
    // string is inserted below it
    Trie()
    {
        head = new TrieNode('\0');
    }
  
    // Function to insert the strings in
    // the trie
    void addWord(string s)
    {
        TrieNode* temp = head;
        const int n = static_cast<int>(s.size());
  
        for (int i = 0; i < n; i++) {
  
            // Inserting character-by-character
            char ch = s[i];
  
            // If the node of ch is not present in
            // map make a new node and add in map
            if (!temp->children[ch]) {
                temp->children[ch] = new TrieNode(ch);
            }
            temp = temp->children[ch];
            temp->visited++;
        }
    }
  
    // Recursive function to calculate the
    // answer; ans is passed by reference.
    // Returns the number of strings in this
    // subtree that have already been paired
    int dfs(TrieNode* node, int& ans, int depth)
    {
        // Number of strings passing through this
        // node that were already paired deeper
        // in the trie
        int vis = 0;
        for (const auto& child : node->children) {
            vis += dfs(child.second, ans, depth + 1);
        }
  
        // Remove the already paired strings so
        // that visited counts only the strings
        // still unpaired at this node
        node->visited -= vis;
        int string_pair = 0;
  
        // If node->visited > 1, at least two
        // unpaired strings share the prefix
        // ending at this node
        if (node->visited > 1) {
  
            // Number of pairs that can be formed
            // at this node
            string_pair = (node->visited / 2);
            ans += (depth * string_pair);
  
            // Updating visited variable of current node
            node->visited -= (2 * string_pair);
        }
  
        // Return the total number of strings
        // paired in this subtree so that the
        // parent can update its own count
        return (2 * string_pair + vis);
    }
  
    // Function to run the dfs function for the
    // first time and give the answer variable
    int dfshelper()
    {
  
        // Stores the final answer
        // as sum of all depths
        int ans = 0;
        dfs(head, ans, 0);
        return ans;
    }
};
  
// Driver Function
int main()
{
    Trie T;
    string str[]
        = { "babab", "ababb", "abbab",
            "aaaaa", "babaa", "babbb" };
  
    int n = 6;
    for (int i = 0; i < n; i++) {
        T.addWord(str[i]);
    }
    int ans = T.dfshelper();
    cout << ans << endl;
  
    return 0;
}



Java


// Java program to find Sum of all LCP
// of maximum length by selecting
// any two Strings at a time
import java.util.*;
  
class GFG
{
  
static class TrieNode 
{
    char val;
  
    // A map holds the child references so that
    // nodes are allocated only for characters
    // that actually occur, which keeps the
    // program space efficient
    HashMap<Character, TrieNode> children;
  
    // Counts the number of times the node
    // is visited while making the trie
    int visited;
  
    // Initially visited value for all
    // nodes is zero
    TrieNode(char x)
    {
        val = x;
        visited = 0;
        children = new HashMap<>();
    }
}
  
static class Trie 
{
  
    TrieNode head;
    int ans;
  
    // The head (root) node holds '\0'; every
    // String is inserted below it
    Trie()
    {
        head = new TrieNode('\0');
        ans = 0;
    }
  
    // Function to insert the Strings in
    // the trie
    void addWord(String s)
    {
        TrieNode temp = head;
        int n = s.length();
  
        for (int i = 0; i < n; i++)
        {
  
            // Inserting character-by-character
            char ch = s.charAt(i);
  
            // If the node of ch is not present in
            // map make a new node and add in map
            if (temp.children.get(ch) == null)
            {
                temp.children.put(ch, new TrieNode(ch));
            }
            temp = temp.children.get(ch);
            temp.visited++;
        }
    }
  
    // Recursive function to calculate the
    // answer, which is accumulated in the
    // field ans. Returns the number of Strings
    // in this subtree that are already paired
    int dfs(TrieNode node, int depth)
    {
        // Number of Strings passing through this
        // node that were already paired deeper
        // in the trie
        int vis = 0;
        for (TrieNode child : node.children.values())
        {
            vis += dfs(child, depth + 1);
        }
  
        // Remove the already paired Strings so
        // that visited counts only the Strings
        // still unpaired at this node
        node.visited -= vis;
        int String_pair = 0;
  
        // If node.visited > 1, at least two
        // unpaired Strings share the prefix
        // ending at this node
        if (node.visited > 1)
        {
  
            // Number of pairs that can be formed
            // at this node
            String_pair = (node.visited / 2);
            ans += (depth * String_pair);
  
            // Updating visited variable of current node
            node.visited -= (2 * String_pair);
        }
  
        // Return the total number of Strings
        // paired in this subtree so that the
        // parent can update its own count
        return (2 * String_pair + vis);
    }
  
    // Function to run the dfs function for the
    // first time and give the answer variable
    int dfshelper()
    {
  
        // Stores the final answer
        // as sum of all depths
        ans = 0;
        dfs(head, 0);
        return ans;
    }
}
  
// Driver code
public static void main(String args[])
{
    Trie T = new Trie();
    String str[]
        = { "babab", "ababb", "abbab",
            "aaaaa", "babaa", "babbb" };
  
    int n = 6;
    for (int i = 0; i < n; i++) 
    {
        T.addWord(str[i]);
    }
    int ans = T.dfshelper();
    System.out.println( ans );
}
}
// This code is contributed by Arnab Kundu



Output:

6

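Time Complexity:
For inserting all the strings in the trie: O(M*N)
For performing the trie traversal: O(M*N), since the trie has at most M*N nodes and each node is visited once
Therefore, overall Time complexity: O(M*N), treating the alphabet size as a constant, where:

N = Number of strings
M = Length of the largest string

Auxiliary Space: O(M*N) for the trie nodes, plus O(M) for the recursion stack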
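
The size of the trie can be checked directly: every distinct non-empty prefix of the input strings corresponds to exactly one trie node (not counting the head node), so the node count can never exceed the total number of characters, which is at most M*N. The sketch below counts the nodes with a std::set rather than by building the trie; the name trieNodeCount is an assumption made for illustration.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Number of trie nodes (excluding the head node) that inserting
// the given words would create. trieNodeCount is a hypothetical name.
std::size_t trieNodeCount(const std::vector<std::string>& words)
{
    // Each distinct non-empty prefix maps to exactly one trie node.
    std::set<std::string> prefixes;
    for (const std::string& w : words) {
        for (std::size_t len = 1; len <= w.size(); ++len) {
            prefixes.insert(w.substr(0, len));
        }
    }
    return prefixes.size();
}
```

For the six strings used in the driver code this gives 20 nodes, below the bound M*N = 5*6 = 30, because strings with common prefixes share nodes.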







Improved By : andrew1234