Find all strings that match a specific pattern in a dictionary

Given a dictionary of words, find all strings that match the given pattern, where every character in the pattern is uniquely mapped to a character of the word, that is, the mapping between pattern characters and word characters is one-to-one.

Examples:

Input:
dict = ["abb", "abc", "xyz", "xyy"]
pattern = "foo"
Output: [xyy abb]
xyy and abb have the same character at
indices 1 and 2 and a different one at
index 0, like the pattern.

Input:
dict = ["abb", "abc", "xyz", "xyy"]
pattern = "mno"
Output: [abc xyz]
abc and xyz consist of all distinct characters,
like the pattern.

Input:
dict = ["abb", "abc", "xyz", "xyy"]
pattern = "aba"
Output: []
The pattern has the same character at indices 0 and 2.
No word in the dictionary follows the pattern.

Input:
dict = ["abab", "aba", "xyz", "xyx"]
pattern = "aba"
Output: [aba xyx]
aba and xyx have the same character at
indices 0 and 2, like the pattern.

Method 1:

Approach: The aim is to find whether the word has the same structure as the pattern. One way to do this is to compute a hash (an encoding) of the word and of the pattern and compare the two. Each distinct character is assigned an integer in order of first appearance, the integers are written out in the order in which the characters occur, and the resulting string of integers is the hash. A word is reported as a match when its hash equals the hash of the pattern.



Example:

Word='xxyzzaabcdd'
Pattern='mmnoopplfmm'
For Word:
map['x']=1;
map['y']=2;
map['z']=3;
map['a']=4;
map['b']=5;
map['c']=6;
map['d']=7;
Hash for Word="11233445677"

For Pattern:
map['m']=1;
map['n']=2;
map['o']=3;
map['p']=4;
map['l']=5;
map['f']=6;
Hash for Pattern="11233445611"
The hash of the word is not equal to the hash
of the pattern, so this word is not included
in the answer.

Algorithm :

  1. Encode the pattern as described above and store the resulting hash in a string variable hash.
  2. Algorithm to encode:
    1. Initialise a counter i=0, which supplies a distinct integer for each distinct character.
    2. Read the string; if the current character is not yet mapped to an integer, map it to the counter value and increment the counter.
    3. Concatenate the integer mapped to the current character to the hash string. (If a string can contain more than ten distinct characters, append a separator after each integer so that multi-digit integers remain unambiguous.)
  3. Now read each word and compute its hash with the same algorithm.
  4. If the hash of the current word is equal to the hash of the pattern, include the word in the final answer.

Pseudo Code:

Declare map
int i = 0
hash_pattern = ""
for character in pattern:
   if character is not a key in map:
      map[character] = i++
   // append a separator as well if more than
   // ten distinct characters are possible
   hash_pattern += to_string(map[character])

for word in dictionary:
   if word.length == pattern.length:
      Declare a new, empty map
      i = 0
      hash_word = ""
      for character in word:
         if character is not a key in map:
            map[character] = i++
         hash_word += to_string(map[character])

      if hash_word == hash_pattern:
         print word

C++

// C++ program to print all
// the strings that match the
// given pattern where every
// character in the pattern is
// uniquely mapped to a character
// in the dictionary
#include <bits/stdc++.h>
using namespace std;
  
// Function to encode given string
string encodeString(string str)
{
    unordered_map<char, int> map;
    string res = "";
    int i = 0;
  
    // for each character in given string
    for (char ch : str) {
  
        // If the character is occurring
        // for the first time, assign next
        // unique number to that char
        if (map.find(ch) == map.end())
            map[ch] = i++;
  
        // append the number associated
        // with current character into the
        // output string, followed by a
        // separator so that multi-digit
        // numbers stay unambiguous
        res += to_string(map[ch]);
        res += ',';
    }
  
    return res;
}
  
// Function to print all the
// strings that match the
// given pattern where every
// character in the pattern is
// uniquely mapped to a character
// in the dictionary
void findMatchedWords(unordered_set<string> dict,
                      string pattern)
{
    // len is length of the pattern
    int len = pattern.length();
  
    // Encode the string
    string hash = encodeString(pattern);
  
    // for each word in the dictionary
    for (string word : dict) {
        // If size of pattern is same as
        // size of current dictionary word
        // and both pattern and the word
        // has same hash, print the word
        if (word.length() == len
            && encodeString(word) == hash)
            cout << word << " ";
    }
}
  
// Driver code
int main()
{
    unordered_set<string> dict = { "abb", "abc",
                                   "xyz", "xyy" };
    string pattern = "foo";
  
    findMatchedWords(dict, pattern);
  
    return 0;
}


Java

// Java program to print all the
// strings that match the
// given pattern where every
// character in the pattern is
// uniquely mapped to a character
// in the dictionary
import java.io.*;
import java.util.*;
  
class GFG {
  
    // Function to encode given string
    static String encodeString(String str)
    {
        HashMap<Character, Integer> map = new HashMap<>();
        String res = "";
        int i = 0;
  
        // for each character in given string
        char ch;
        for (int j = 0; j < str.length(); j++) {
            ch = str.charAt(j);
  
            // If the character is occurring for the first
            // time, assign next unique number to that char
            if (!map.containsKey(ch))
                map.put(ch, i++);
  
            // append the number associated with current
            // character into the output string, followed by a
            // separator so multi-digit numbers stay unambiguous
            res += map.get(ch) + ",";
        }
  
        return res;
    }
  
    // Function to print all
    // the strings that match the
    // given pattern where every
    // character in the pattern is
    // uniquely mapped to a character
    // in the dictionary
    static void findMatchedWords(
        String[] dict, String pattern)
    {
        // len is length of the pattern
        int len = pattern.length();
  
        // encode the string
        String hash = encodeString(pattern);
  
        // for each word in the dictionary array
        for (String word : dict) {
            // If size of pattern is same
            // as size of current
            // dictionary word and both
            // pattern and the word
            // has same hash, print the word
            if (word.length() == len
                && encodeString(word).equals(hash))
                System.out.print(word + " ");
        }
    }
  
    // Driver code
    public static void main(String args[])
    {
        String[] dict = { "abb", "abc",
                          "xyz", "xyy" };
        String pattern = "foo";
  
        findMatchedWords(dict, pattern);
    }
  
    // This code is contributed
    // by rachana soma
}


C#

// C# program to print all the strings
// that match the given pattern where
// every character in the pattern is
// uniquely mapped to a character in the dictionary
using System;
using System.Collections.Generic;
public class GFG {
  
    // Function to encode given string
    static String encodeString(String str)
    {
        Dictionary<char, int> map = new Dictionary<char, int>();
        String res = "";
        int i = 0;
  
        // for each character in given string
        char ch;
        for (int j = 0; j < str.Length; j++) {
            ch = str[j];
  
            // If the character is occurring for the first
            // time, assign next unique number to that char
            if (!map.ContainsKey(ch))
                map.Add(ch, i++);
  
            // append the number associated with current
            // character into the output string, followed by a
            // separator so multi-digit numbers stay unambiguous
            res += map[ch] + ",";
        }
  
        return res;
    }
  
    // Function to print all the
    // strings that match the
    // given pattern where every
    // character in the pattern is
    // uniquely mapped to a character
    // in the dictionary
    static void findMatchedWords(String[] dict, String pattern)
    {
        // len is length of the pattern
        int len = pattern.Length;
  
        // encode the string
        String hash = encodeString(pattern);
  
        // for each word in the dictionary array
        foreach(String word in dict)
        {
            // If size of pattern is same as
            // size of current dictionary word
            // and both pattern and the word
            // has same hash, print the word
            if (word.Length == len && encodeString(word).Equals(hash))
                Console.Write(word + " ");
        }
    }
  
    // Driver code
    public static void Main(String[] args)
    {
        String[] dict = { "abb", "abc", "xyz", "xyy" };
        String pattern = "foo";
  
        findMatchedWords(dict, pattern);
    }
}
  
// This code is contributed by 29AjayKumar


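Output:

xyy abb

The Java and C# programs iterate over an array and print "abb xyy". The C++ program iterates over an unordered_set, whose iteration order is unspecified, so the two words may appear in either order.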

Complexity Analysis:

  • Time Complexity: O(N*K).
    Here ‘N’ is the number of words and ‘K’ is the length of the pattern. Every word of length K is traversed once to create its hash; words of any other length are skipped.
  • Auxiliary Space: O(K).
    The map holds at most K characters and the hash string holds one number per character. Both are rebuilt for every word, so the space does not grow with N.

Method 2:

Approach: A more direct approach maps the letters of the pattern to the corresponding letters of the word instead of building a hash for each word. If the current pattern character has not been mapped yet, map it to the corresponding character of the word; if it has already been mapped, check that the character it was mapped to earlier equals the current character of the word. For the mapping to be unique in both directions, also make sure that no two pattern characters are mapped to the same word character. The example below makes this easier to follow.

Example:



Word='xxyzzaa'
Pattern='mmnoopp'
Step 1-: map['m'] = x
Step 2-: 'm' is already mapped to some value, 
check whether that value is equal to current 
character of word-:YES ('m' is mapped to x).
Step 3-: map['n'] = y
Step 4-: map['o'] = z 
Step 5-: 'o' is already mapped to some value, 
check whether that value is equal to current 
character of word-:YES ('o' is mapped to z).
Step 6-: map['p'] = a
Step 7-: 'p' is already mapped to some value, 
check whether that value is equal to current 
character of word-: YES ('p' is mapped to a).
No contradiction so current word matches the pattern

Algorithm :

  1. Create two character arrays: one maps each character of the pattern to the corresponding character of the word, the other records which pattern character each character of the word is already taken by.
  2. First check whether the word and the pattern have equal length; if not, move on to the next word.
  3. If the lengths are equal, traverse the pattern. If the current pattern character has not been mapped yet and the corresponding word character is not taken by another pattern character, map them to each other.
  4. Otherwise, check whether the pattern character is mapped to the current character of the word, and the word character to the current pattern character.
  5. If not, the word does not follow the given pattern.
  6. If the word follows the pattern up to the last character, print the word.

Pseudo Code:

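for word in dictionary:
   if word.length != pattern.length:
      continue with the next word
   char to_word[128] = {0}      // pattern character -> word character
   char to_pattern[128] = {0}   // word character -> pattern character
   matched = true
   for i in 0 .. pattern.length - 1:
      p = pattern[i], w = word[i]
      if to_word[p] == 0 and to_pattern[w] == 0:
         to_word[p] = w
         to_pattern[w] = p
      else if to_word[p] != w or to_pattern[w] != p:
         matched = false
         break the loop

   if matched:
      print word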

C++


// C++ program to print all
// the strings that match the
// given pattern where every
// character in the pattern is
// uniquely mapped to a character
// in the dictionary
#include <bits/stdc++.h>
using namespace std;
  
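// Returns true if the characters of pattern and
// word can be paired one-to-one, position by position.
// Assumes neither string contains the '\0' character.
bool check(string pattern, string word)
{
    if (pattern.length() != word.length())
        return false;
  
    // ch[p]: word character that pattern character p maps to
    // used[w]: pattern character that word character w is taken by
    char ch[256] = { 0 };
    char used[256] = { 0 };
  
    int len = word.length();
  
    for (int i = 0; i < len; i++) {
        // cast so that chars above 127 cannot give a negative index
        unsigned char p = pattern[i], w = word[i];
  
        if (ch[p] == 0 && used[w] == 0) {
            ch[p] = word[i];
            used[w] = pattern[i];
        }
        else if (ch[p] != word[i] || used[w] != pattern[i])
            return false;
    }
  
    return true;
}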
  
// Function to print all the
// strings that match the
// given pattern where every
// character in the pattern is
// uniquely mapped to a character
// in the dictionary
void findMatchedWords(unordered_set<string> dict,
                      string pattern)
{
    // for each word in the dictionary
    for (string word : dict) {
  
        if (check(pattern, word))
            cout << word << " ";
    }
}
  
// Driver code
int main()
{
    unordered_set<string> dict = { "abb", "abc", "xyz", "xyy" };
    string pattern = "foo";
  
    findMatchedWords(dict, pattern);
  
    return 0;
}
  
// This code is contributed by Ankur Goel

Output:

xyy abb

Complexity Analysis:

  • Time Complexity: O(N*K), where ‘N’ is the number of words and ‘K’ is the length of the pattern.
    Each word of length K is traversed once; words of any other length are rejected immediately.
  • Auxiliary Space: O(1).
    The mapping uses only fixed-size character arrays, not a hash map, so the extra space does not depend on N or K.

This article is contributed by Aditya Goel.