Search strings with the help of a given pattern in an array of strings

Prerequisite: Trie | (Insert and Search)

Given an array of strings words[] and a partial string str, the task is to find all the strings in words[] that match str.

A partial string is a string with some missing characters, each marked by ‘.’. For example, “..ta” is a string of length 4 that ends with “ta” and has two missing characters, at indices 0 and 1.

Examples:

Input: words[] = [“moon”, “soon”, “month”, “date”, “data”], str = “..on”
Output: [“moon”, “soon”]
Explanation:
“moon” and “soon” match the given partial string “..on”.



Input: words[] = [“date”, “data”, “month”], str = “d.t.”
Output: [“date”, “data”]
Explanation:
“date” and “data” match the given partial string “d.t.”.
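
The matching rule used in these examples can be sketched as a direct character-by-character check. This is only a brute-force baseline to make the rule concrete, not the trie approach described below; the names matchesPattern and filterByPattern are hypothetical and do not appear in the article.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if `word` has the same length as `pattern` and agrees
// with it at every position where the pattern is not '.'.
bool matchesPattern(const std::string& word, const std::string& pattern)
{
    if (word.size() != pattern.size())
        return false;
    for (std::size_t i = 0; i < word.size(); i++) {
        if (pattern[i] != '.' && pattern[i] != word[i])
            return false;
    }
    return true;
}

// Hypothetical helper: collects every matching word, keeping the
// order in which the words appear in the input.
std::vector<std::string> filterByPattern(const std::vector<std::string>& words,
                                         const std::string& pattern)
{
    std::vector<std::string> result;
    for (const std::string& w : words) {
        if (matchesPattern(w, pattern))
            result.push_back(w);
    }
    return result;
}
```

This check compares the pattern against every word on each query. The trie below shares common prefixes between words, so a query can abandon a whole group of words as soon as one character fails to match.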

Approach:

The idea is to use a Trie to solve the given problem. The structure of a trie node is given below, followed by the steps:

struct TrieNode 
{
     struct TrieNode* children[26];
     bool endOfWord;
};

The following picture explains the construction of the trie using the keys “date”, “data”, “moon”, “month” and “soon” from the examples above:

           root
      /     |      \
     d      m       s
     |      |       |
     a      o       o
     |      |  \    |
     t      o   n   o
  /  |      |   |   |
 e   a      n   t   n
                |
                h

Every node is a TrieNode with pointer links to its children, one for each next character of the words added. Pointers at positions whose characters are not present are NULL. A node where a word ends has endOfWord set to true; in this example these are exactly the leaf nodes.

Steps:

  1. Insert all the available words into the trie using the addWord() method.
  2. Every character of the word to be added is inserted as an individual TrieNode. The children array is an array of 26 TrieNode pointers, each index representing a character of the English alphabet.
  3. When a new word is added, check for each character whether the TrieNode pointer for that letter exists. If it does, proceed to the next character. If not, create a new TrieNode, make the pointer point to it, and repeat the process for the next character at this new node. Set endOfWord to true on the TrieNode reached by the last character.
  4. To search for a key, check for the presence of a TrieNode at the index marked by the current character. If it is present, move down that branch and repeat the process for the next character. When searching for a partial string, if a ‘.’ is found, follow every non-NULL pointer in the children array, so that each available character takes the place of the ‘.’ once.
  5. If the pointer is NULL at any point, return not found. Otherwise check endOfWord at the last TrieNode: if it is false, return not found; otherwise the word is found.
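
Steps 1 to 3 and the plain (no ‘.’) case of steps 4 and 5 can be sketched as below. This is a minimal sketch and not the article's implementation: the default member initializers on TrieNode and the free functions insertWord and containsWord are assumptions added for illustration, and every word is assumed to consist only of lowercase letters ‘a’ to ‘z’.

```cpp
#include <string>

// Same layout as the TrieNode above, with default member initializers
// (an addition) so that a new node starts with no children.
struct TrieNode {
    TrieNode* children[26] = {};  // all pointers start as nullptr
    bool endOfWord = false;
};

// Steps 1-3: walk down the trie, creating missing nodes on the way.
// Nodes are never freed here, to keep the sketch short.
void insertWord(TrieNode* root, const std::string& word)
{
    TrieNode* cur = root;
    for (char c : word) {
        int index = c - 'a';  // assumes 'a' <= c <= 'z'
        if (cur->children[index] == nullptr)
            cur->children[index] = new TrieNode();
        cur = cur->children[index];
    }
    cur->endOfWord = true;
}

// Steps 4-5 for a key without '.': follow one pointer per character
// and check endOfWord on the node reached by the last character.
bool containsWord(const TrieNode* root, const std::string& key)
{
    const TrieNode* cur = root;
    for (char c : key) {
        cur = cur->children[c - 'a'];
        if (cur == nullptr)
            return false;  // pointer location is empty: not found
    }
    return cur->endOfWord;
}
```

With “date”, “data” and “month” inserted, containsWord reports “date” as present but “dat” as absent: the path d, a, t exists, yet endOfWord is false on its last node.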

Below is the implementation of the above approach:

C++

// C++ program for the above approach
#include <bits/stdc++.h>
using namespace std;
  
// Dictionary Class
class Dictionary {
public:
    // Initialize your data structure
    Dictionary* children[26];
    bool endOfWord;
  
    // Constructor
    Dictionary()
    {
        this->endOfWord = false;
        for (int i = 0; i < 26; i++) {
            this->children[i] = NULL;
        }
    }
  
    // Adds a word into a data structure
    void addWord(string word)
    {
        // Crawl pointer points the object
        // in reference
        Dictionary* pCrawl = this;
  
        // Traverse the characters of the word
        // (assumes lowercase letters 'a' to 'z')
        for (size_t i = 0; i < word.length(); i++) {
            int index = word[i] - 'a';
            if (!pCrawl->children[index])
                pCrawl->children[index]
                    = new Dictionary();
  
            pCrawl = pCrawl->children[index];
        }
        pCrawl->endOfWord = true;
    }
  
    // Function that returns if the word
    // is in the data structure or not
  
    // A word can contain a dot character '.'
    // to represent any one letter
    void search(string word, bool& found,
                string curr_found = "",
                int pos = 0)
    {
        Dictionary* pCrawl = this;
  
        if (pos == word.length()) {
            if (pCrawl->endOfWord) {
                cout << "Found: "
                     << curr_found << "\n";
                found = true;
            }
            return;
        }
  
        if (word[pos] == '.') {
  
            // Iterate over every letter and
            // proceed further by replacing
            // the character in place of '.'
            for (int i = 0; i < 26; i++) {
                if (pCrawl->children[i]) {
                    pCrawl
                        ->children[i]
                        ->search(word,
                                 found,
                                 curr_found + char('a' + i),
                                 pos + 1);
                }
            }
        }
        else {
  
            // Check if pointer at character
            // position is available,
            // then proceed
            if (pCrawl->children[word[pos] - 'a']) {
                pCrawl
                    ->children[word[pos] - 'a']
                    ->search(word,
                             found,
                             curr_found + word[pos],
                             pos + 1);
            }
        }
        return;
    }
  
    // Utility function for search operation
    void searchUtil(string word)
    {
        Dictionary* pCrawl = this;
  
        cout << "\nSearching for \""
             << word << "\"\n";
        bool found = false;
        pCrawl->search(word, found);
        if (!found)
            cout << "No Word Found...!!\n";
    }
};
  
// Function that search the given pattern
void searchPattern(string arr[], int N,
                   string str)
{
    // Object of the class Dictionary
    Dictionary* obj = new Dictionary();
  
    for (int i = 0; i < N; i++) {
        obj->addWord(arr[i]);
    }
  
    // Search pattern
    obj->searchUtil(str);
}
  
// Driver Code
int main()
{
    // Given an array of words
    string arr[] = { "data", "date", "month" };
  
    int N = 3;
  
    // Given pattern
    string str = "d.t.";
  
    // Function Call
    searchPattern(arr, N, str);
}

Output:

Searching for "d.t."
Found: data
Found: date
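
The program above prints each match as it is found. Since the problem statement asks for the list of matching strings, one possible variant collects them into a vector instead. This is a sketch under assumptions, not the article's code: the PatternTrie type, its collect helper and the use of std::unique_ptr for node ownership are all hypothetical choices, and words and patterns are assumed to contain only lowercase letters (plus ‘.’ in patterns).

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical variant of the Dictionary class that returns matches.
struct PatternTrie {
    struct Node {
        std::unique_ptr<Node> children[26];  // freed automatically
        bool endOfWord = false;
    };
    Node root;

    void addWord(const std::string& word)
    {
        Node* cur = &root;
        for (char c : word) {
            std::unique_ptr<Node>& child = cur->children[c - 'a'];
            if (!child)
                child.reset(new Node());
            cur = child.get();
        }
        cur->endOfWord = true;
    }

    // Returns every stored word matching the pattern, in alphabetical
    // order, because children are visited from 'a' to 'z'.
    std::vector<std::string> search(const std::string& pattern) const
    {
        std::vector<std::string> out;
        std::string current;
        collect(&root, pattern, 0, current, out);
        return out;
    }

private:
    static void collect(const Node* node, const std::string& pattern,
                        std::size_t pos, std::string& current,
                        std::vector<std::string>& out)
    {
        if (pos == pattern.size()) {
            if (node->endOfWord)
                out.push_back(current);
            return;
        }
        // A '.' tries all 26 children; a letter tries only its own.
        int lo = 0, hi = 25;
        if (pattern[pos] != '.')
            lo = hi = pattern[pos] - 'a';
        for (int i = lo; i <= hi; i++) {
            const Node* child = node->children[i].get();
            if (!child)
                continue;
            current.push_back(char('a' + i));
            collect(child, pattern, pos + 1, current, out);
            current.pop_back();
        }
    }
};
```

For the words of the first example, search("..on") yields “moon” then “soon”, and search("d.t.") yields “data” then “date”. Passing the pattern by const reference and reusing one current buffer avoids copying a string on every recursive call.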

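Time Complexity: O(N*L) to build the trie, where N is the number of strings and L is the length of the longest string. Searching for a pattern of length M follows a single path of M nodes when the pattern contains no ‘.’. Each ‘.’ makes the search branch into every existing child, so in the worst case (a pattern made only of ‘.’) every trie node up to depth M is visited, which is at most O(N*M) nodes.
Auxiliary Space: O(26*N*L) for the trie in the worst case, plus O(M) for the recursion stack during a search.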
