Find alphabetical order such that words can be considered sorted

Given an array of words, find any ordering of the English alphabet under which the given words are sorted in increasing order. If no such ordering exists, output "Impossible".

Examples:

Input :  words[] = {"zy", "ab"}
Output : zabcdefghijklmnopqrstuvwxy
Basically we need to make sure that 'z' comes
before 'a'.

Input :  words[] = {"geeks", "gamers", "coders", 
                    "everyoneelse"}
Output : zyxwvutsrqponmlkjihgceafdb

Input :  words[] = {"marvel", "superman", "spiderman", 
                    "batman"}
Output : zyxwvuptrqonmsbdlkjihgfeca

Naive approach: The brute-force approach is to try every possible order and check whether any of them makes the given list of words sorted. Since the English alphabet has 26 letters, there are 26! candidate permutations. If every pair of words is checked to verify an order, the complexity becomes O(26!*N^2), which is far beyond any practical time limit.
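
The verification step that the brute force would run for each candidate order can be sketched as below. This is a minimal illustration, not code from the original article: the helper name isSortedUnder is hypothetical, and it assumes lowercase words and an order string that is a permutation of 'a' to 'z'. It compares adjacent words only, which is enough because the ordering is transitive.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (an assumption for illustration): returns true if
// `words` is sorted in non-decreasing order under the candidate alphabet
// `order`, a 26-character permutation of 'a'..'z'.
bool isSortedUnder(const std::string& order,
                   const std::vector<std::string>& words)
{
    // rank[c] = position of letter c in the candidate alphabet
    std::array<int, 26> rank{};
    for (int i = 0; i < 26; ++i)
        rank[order[i] - 'a'] = i;

    for (std::size_t w = 1; w < words.size(); ++w) {
        const std::string& a = words[w - 1];
        const std::string& b = words[w];

        // Skip the common prefix of the two adjacent words
        std::size_t j = 0;
        while (j < a.size() && j < b.size() && a[j] == b[j])
            ++j;

        if (j == a.size())
            continue;      // a is a prefix of b (or equal to it): in order
        if (j == b.size())
            return false;  // b is a proper prefix of a: out of order
        if (rank[a[j] - 'a'] > rank[b[j] - 'a'])
            return false;  // first differing letters are in the wrong order
    }
    return true;
}
```

The same helper can also be used to validate the output of a faster solution against the examples above.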

Using topological sort: This solution requires knowledge of graphs and their representation as adjacency lists, DFS, and topological sorting.

In the required order, each letter must be printed before every letter that has lower priority than it. This closely resembles topological sorting, where a vertex must be printed before all of its adjacent vertices. Let’s treat each letter of the alphabet as a node in a directed graph, with an edge from A to B (A -> B) if A must precede B in the order. The algorithm can be formulated as follows:

  1. If n is 1, then any order is valid.
  2. Take the first two words and find the first index at which their letters differ. The letter in the first word must precede the letter in the second word, so add an edge from the former to the latter.
  3. If no such index exists, one word is a prefix of the other. The first word must then be no longer than the second; otherwise no valid order exists.
  4. Move on to the next adjacent pair (the second word becomes the first, and the third word becomes the second). Repeat steps 2, 3 and 4 until all (n-1) adjacent pairs have been processed.
  5. Topologically sort the graph: repeatedly take a letter with no remaining incoming edges, output it, and remove its outgoing edges.
  6. Check whether all the nodes were visited. If the graph contains a cycle, the nodes on the cycle are never visited, because none of them can be output before another node on the cycle that must precede it. In that case the word list contradicts itself and no order exists.
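
Steps 2 to 4 can be illustrated in isolation with the sketch below. It is not part of the original program: the helper name collectConstraints and its out-parameter are assumptions chosen for illustration, and lowercase input is assumed. It records one "x must precede y" pair per adjacent pair of words and reports failure when a word is followed by its own proper prefix.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (an assumption for illustration): fills `edges` with
// pairs (x, y) meaning "x must precede y". Returns false if some word is
// followed by one of its own proper prefixes, in which case no order exists.
bool collectConstraints(const std::vector<std::string>& words,
                        std::vector<std::pair<char, char>>& edges)
{
    edges.clear();
    for (std::size_t i = 1; i < words.size(); ++i) {
        const std::string& prev = words[i - 1];
        const std::string& cur = words[i];

        // Find the first index where the two adjacent words differ
        std::size_t j = 0;
        while (j < prev.size() && j < cur.size() && prev[j] == cur[j])
            ++j;

        if (j < prev.size() && j < cur.size())
            edges.emplace_back(prev[j], cur[j]);  // step 2: prev[j] precedes cur[j]
        else if (prev.size() > cur.size())
            return false;                         // step 3: longer word before its prefix
    }
    return true;
}
```

For the third example above ("marvel", "superman", "spiderman", "batman"), this yields the constraints m -> s, u -> p and s -> b, which are the only edges the topological sort has to respect.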

/* C++ program to find an ordering of the alphabet
so that the given list of words is considered
sorted */
#include <algorithm>
#include <cstring>
#include <iostream>
#include <stack>
#include <string>
#include <vector>
using namespace std;
#define MAX_CHAR 26
  
void findOrder(vector<string> v)
{
    int n = v.size();
  
    /* If there is at most one word, then any order works
    (this also avoids reading v[0] from an empty list) */
    if (n <= 1) {
        cout << "abcdefghijklmnopqrstuvwxyz";
        return;
    }
  
    /* Adjacency list of 26 characters*/
    vector<int> adj[MAX_CHAR];
  
    /* Array tracking the number of edges that are 
    inward to each node*/
    vector<int> in(MAX_CHAR, 0);
  
    // Traverse through all words in given array
    string prev = v[0];
  
    /* (n-1) loops because we already acquired the 
    first word in the list*/
    for (int i = 1; i < n; ++i) {
        string s = v[i];
  
        /* Find first such letter in the present string that is different 
        from the letter in the previous string at the same index*/
        int j;
        for (j = 0; j < min(prev.length(), s.length()); ++j)
            if (s[j] != prev[j])
                break;
  
        if (j < min(prev.length(), s.length())) {
  
            /* The letter in the previous string precedes the one
            in the present string, hence add the letter in the present
            string as the child of the letter in the previous string*/
            adj[prev[j] - 'a'].push_back(s[j] - 'a');
  
            /* The number of inward pointing edges to the node representing 
            the letter in the present string increases by one*/
            in[s[j] - 'a']++;
  
            /* Assign present string to previous string for the next 
            iteration. */
            prev = s;
            continue;
        }
  
        /* If there exists no such letter then the string length of 
        the previous string must be less than or equal to the 
        present string, otherwise no such order exists*/
        if (prev.length() > s.length()) {
            cout << "Impossible";
            return;
        }
  
        /* Assign present string to previous string for the next
        iteration */
        prev = s;
    }
  
    /* Topological ordering requires the source nodes 
    that have no parent nodes*/
    stack<int> stk;
    for (int i = 0; i < MAX_CHAR; ++i)
        if (in[i] == 0)
            stk.push(i);
  
    /* Vector storing required order (anyone that satisfies) */
    vector<char> out;
  
    /* Array to keep track of visited nodes */
    bool vis[26];
    memset(vis, false, sizeof(vis));
  
    /* Process source nodes one at a time (a Kahn-style
    topological sort driven by a stack) */
    while (!stk.empty()) {
  
        /* Acquire the index of the present character */
        int x = stk.top();
        stk.pop();
  
        /* Mark as visited */
        vis[x] = true;
  
        /* Insert character to output vector */
        out.push_back(x + 'a');
  
        for (int i = 0; i < adj[x].size(); ++i) {
            if (vis[adj[x][i]])
                continue;
  
            /* Since we have already included the present 
            character in the order, the number of edges inward 
            to this child node can be reduced*/
            in[adj[x][i]]--;
  
            /* If all inward edges have been removed, 
            we can include this node as a source node*/
            if (in[adj[x][i]] == 0)
                stk.push(adj[x][i]);
        }
    }
  
    /* Check if all nodes (letters) have been visited.
    No order is possible if any one is unvisited*/
    for (int i = 0; i < MAX_CHAR; ++i)
        if (!vis[i]) {
            cout << "Impossible";
            return;
        }
  
    for (int i = 0; i < out.size(); ++i)
        cout << out[i];
}
  
// Driver code
int main()
{
    vector<string> v{ "efgh", "abcd" };
    findOrder(v);
    return 0;
}

Output :

zyxwvutsrqponmlkjihgfeadcb

The complexity of this approach is O(N*|S|) + O(V+E), where |V| = 26 (one node per letter of the alphabet) and |E| < N (each adjacent pair of words contributes at most one edge). Hence the overall complexity is O(N*|S|+N), which simplifies to O(N*|S|). Here |S| is the maximum length of a word.


