Word Break Problem | DP-32

Given an input string and a dictionary of words, find out if the input string can be segmented into a space-separated sequence of dictionary words. See following examples for more details.
This is a famous Google interview question, also being asked by many other companies now a days.

Consider the following dictionary 
{ i, like, sam, sung, samsung, mobile, ice, 
  cream, icecream, man, go, mango}

Input:  ilike
Output: Yes 
The string can be segmented as "i like".

Input:  ilikesamsung
Output: Yes
The string can be segmented as "i like samsung" 
or "i like sam sung".

Recursive implementation:
The idea is simple, we consider each prefix and search it in dictionary. If the prefix is present in dictionary, we recur for rest of the string (or suffix). If the recursive call for suffix r

filter_none

edit
close

play_arrow

link
brightness_4
code

def wordBreak(wordList, word):
    if word == '':
        return True
    else:
        wordLen = len(word)
        return any([(word[:i] in wordList) and wordBreak(wordList, word[i:]) for i in range(1, wordLen+1)])

chevron_right


eturns true, we return true, otherwise we try next prefix. If we have tried all prefixes and none of them resulted in a solution, we return false.



We strongly recommend to see substr function which is used extensively in following implementations.

C++

filter_none

edit
close

play_arrow

link
brightness_4
code

// A recursive program to test whether a given 
// string can be segmented into space separated 
// words in dictionary
#include <iostream>
using namespace std;
  
/* A utility function to check whether a word is 
  present in dictionary or not. An array of strings 
  is used for dictionary.  Using array of strings for
  dictionary is definitely not a good idea. We have 
  used for simplicity of the program*/
int dictionaryContains(string word)
{
    string dictionary[] = {"mobile","samsung","sam","sung",
                            "man","mango","icecream","and",
                             "go","i","like","ice","cream"};
    int size = sizeof(dictionary)/sizeof(dictionary[0]);
    for (int i = 0; i < size; i++)
        if (dictionary[i].compare(word) == 0)
           return true;
    return false;
}
  
// returns true if string can be segmented into space 
// separated words, otherwise returns false
bool wordBreak(string str)
{
    int size = str.size();
  
    // Base case
    if (size == 0)  return true;
  
    // Try all prefixes of lengths from 1 to size
    for (int i=1; i<=size; i++)
    {
        // The parameter for dictionaryContains is 
        // str.substr(0, i) which is prefix (of input 
        // string) of length 'i'. We first check whether 
        // current prefix is in  dictionary. Then we 
        // recursively check for remaining string
        // str.substr(i, size-i) which is suffix of  
        // length size-i
        if (dictionaryContains( str.substr(0, i) ) &&
            wordBreak( str.substr(i, size-i) ))
            return true;
    }
  
    // If we have tried all prefixes and 
    // none of them worked
    return false;
}
  
// Driver program to test above functions
int main()
{
    wordBreak("ilikesamsung")? cout <<"Yes\n": cout << "No\n";
    wordBreak("iiiiiiii")? cout <<"Yes\n": cout << "No\n";
    wordBreak("")? cout <<"Yes\n": cout << "No\n";
    wordBreak("ilikelikeimangoiii")? cout <<"Yes\n": cout << "No\n";
    wordBreak("samsungandmango")? cout <<"Yes\n": cout << "No\n";
    wordBreak("samsungandmangok")? cout <<"Yes\n": cout << "No\n";
    return 0;
}

chevron_right


Java

filter_none

edit
close

play_arrow

link
brightness_4
code

import java.util.*;
  
// Recursive implementation of 
// word break problem in java
public class WordBreakProblem
{
  
    // set to hold dictionary values
    private static Set<String> dictionary = new HashSet<>();
      
    public static void main(String []args)
    {
          
        // array of strings to be added in dictionary set.
        String temp_dictionary[] = {"mobile","samsung","sam","sung"
                            "man","mango","icecream","and"
                            "go","i","like","ice","cream"};
                              
        // loop to add all strings in dictionary set 
        for (String temp :temp_dictionary)
        {
            dictionary.add(temp);
        }
          
        // sample input cases
        System.out.println(wordBreak("ilikesamsung"));
        System.out.println(wordBreak("iiiiiiii"));
        System.out.println(wordBreak(""));
        System.out.println(wordBreak("ilikelikeimangoiii"));
        System.out.println(wordBreak("samsungandmango"));
        System.out.println(wordBreak("samsungandmangok"));
          
    }
      
    // returns true if the word can be segmented into parts such
    // that each part is contained in dictionary
    public static boolean wordBreak(String word)
    {
        int size = word.length();
          
        // base case
        if (size == 0)
        return true;
          
        //else check for all words
        for (int i = 1; i <= size; i++)
        {
            // Now we will first divide the word into two parts ,
            // the prefix will have a length of i and check if it is 
            // present in dictionary ,if yes then we will check for 
            // suffix of length size-i recursively. if both prefix and 
            // suffix are present the word is found in dictionary.
  
            if (dictionary.contains(word.substring(0,i)) && 
                    wordBreak(word.substring(i,size)))
            return true;
        }
          
        // if all cases failed then return false
        return false;
    }
}
  
// This code is contributed by Sparsh Singhal

chevron_right


Output:

Yes
Yes
Yes
Yes
Yes
No

Dynamic Programming
Why Dynamic Programming? The above problem exhibits overlapping sub-problems. For example, see the following partial recursion tree for string “abcde” in worst case.

wordBreak

filter_none

edit
close

play_arrow

link
brightness_4
code

// A Dynamic Programming based program to test whether a given string can
// be segmented into space separated words in dictionary
#include <iostream>
#include <string.h>
using namespace std;
  
/* A utility function to check whether a word is present in dictionary or not.
  An array of strings is used for dictionary.  Using array of strings for
  dictionary is definitely not a good idea. We have used for simplicity of
  the program*/
int dictionaryContains(string word)
{
    string dictionary[] = {"mobile","samsung","sam","sung","man","mango",
                           "icecream","and","go","i","like","ice","cream"};
    int size = sizeof(dictionary)/sizeof(dictionary[0]);
    for (int i = 0; i < size; i++)
        if (dictionary[i].compare(word) == 0)
           return true;
    return false;
}
  
// Returns true if string can be segmented into space separated
// words, otherwise returns false
bool wordBreak(string str)
{
    int size = str.size();
    if (size == 0)   return true;
  
    // Create the DP table to store results of subroblems. The value wb[i]
    // will be true if str[0..i-1] can be segmented into dictionary words,
    // otherwise false.
    bool wb[size+1];
    memset(wb, 0, sizeof(wb)); // Initialize all values as false.
  
    for (int i=1; i<=size; i++)
    {
        // if wb[i] is false, then check if current prefix can make it true.
        // Current prefix is "str.substr(0, i)"
        if (wb[i] == false && dictionaryContains( str.substr(0, i) ))
            wb[i] = true;
  
        // wb[i] is true, then check for all substrings starting from
        // (i+1)th character and store their results.
        if (wb[i] == true)
        {
            // If we reached the last prefix
            if (i == size)
                return true;
  
            for (int j = i+1; j <= size; j++)
            {
                // Update wb[j] if it is false and can be updated
                // Note the parameter passed to dictionaryContains() is
                // substring starting from index 'i' and length 'j-i'
                if (wb[j] == false && dictionaryContains( str.substr(i, j-i) ))
                    wb[j] = true;
  
                // If we reached the last character
                if (j == size && wb[j] == true)
                    return true;
            }
        }
    }
  
    /* Uncomment these lines to print DP table "wb[]"
     for (int i = 1; i <= size; i++)
        cout << " " << wb[i]; */
  
    // If we have tried all prefixes and none of them worked
    return false;
}
  
// Driver program to test above functions
int main()
{
    wordBreak("ilikesamsung")? cout <<"Yes\n": cout << "No\n";
    wordBreak("iiiiiiii")? cout <<"Yes\n": cout << "No\n";
    wordBreak("")? cout <<"Yes\n": cout << "No\n";
    wordBreak("ilikelikeimangoiii")? cout <<"Yes\n": cout << "No\n";
    wordBreak("samsungandmango")? cout <<"Yes\n": cout << "No\n";
    wordBreak("samsungandmangok")? cout <<"Yes\n": cout << "No\n";
    return 0;
}

chevron_right


Output:

Yes
Yes
Yes
Yes
Yes
No

Optimized Dynamic Programming:
In this approach, apart from the dp table, we also maintain all the indexes which have matched earlier. Then we will check the substrings from those indexes to the current index. If anyone of that matches then we can divide the string up to that index.

In this program, we are using some extra space. However, its time complexity is O(n*s) where s is the length of the largest string in the dictionary and n is the length of the given string.

filter_none

edit
close

play_arrow

link
brightness_4
code

// A Dynamic Programming based program to test
// whether a given string can  be segmented into
// space separated words in dictionary
#include <bits/stdc++.h>
using namespace std;
  
/* A utility function to check whether a word
   is present in dictionary or not. An array of 
   strings is used for dictionary.  Using array
   of strings for dictionary is definitely not 
   a good idea. We have used for simplicity of
   the program*/
int dictionaryContains(string word)
{
    string dictionary[] = { "mobile", "samsung", "sam",
                            "sung", "man", "mango",
                            "icecream", "and", "go",
                            "i", "like", "ice", "cream" };
    int size = sizeof(dictionary) / sizeof(dictionary[0]);
    for (int i = 0; i < size; i++)
        if (dictionary[i].compare(word) == 0)
            return true;
    return false;
}
  
// Returns true if string can be segmented into space
// separated words, otherwise returns false
bool wordBreak(string s)
{
    int n = s.size();
    if (n == 0)
        return true;
  
    // Create the DP table to store results of subroblems.
    // The value dp[i] will be true if str[0..i] can be
    // segmented into dictionary words, otherwise false.
    vector<bool> dp(n + 1, 0); // Initialize all values
    // as false.
  
    // matched_index array represents the indexes for which
    // dp[i] is true. Initially only -1 element is present
    // in this array.
    vector<int> matched_index;
    matched_index.push_back(-1);
  
    for (int i = 0; i < n; i++) {
        int msize = matched_index.size();
  
        // Flag value which tells that a substring matches
        // with given words or not.
        int f = 0;
  
        // Check all the substring from the indexes matched
        // earlier. If any of that substring matches than
        // make flag value = 1;
        for (int j = msize - 1; j >= 0; j--) {
  
            // sb is substring starting from matched_index[j]
            // + 1  and of length i - matched_index[j]
            string sb = s.substr(matched_index[j] + 1, i - matched_index[j]);
  
            if (dictionaryContains(sb)) {
                f = 1;
                break;
            }
        }
  
        // If substring matches than do dp[i] = 1 and
        // push that index in matched_index array.
        if (f == 1) {
            dp[i] = 1;
            matched_index.push_back(i);
        }
    }
    return dp[n - 1];
}
  
// Driver code
int main()
{
    wordBreak("ilikesamsung") ? cout << "Yes\n" : cout << "No\n";
    wordBreak("iiiiiiii") ? cout << "Yes\n" : cout << "No\n";
    wordBreak("") ? cout << "Yes\n" : cout << "No\n";
    wordBreak("ilikelikeimangoiii") ? cout << "Yes\n" : cout << "No\n";
    wordBreak("samsungandmango") ? cout << "Yes\n" : cout << "No\n";
    wordBreak("samsungandmangok") ? cout << "Yes\n" : cout << "No\n";
    return 0;
}

chevron_right


Output:

Yes
Yes
Yes
Yes
Yes
No

Word Break Problem | (Trie solution)

Exercise:
The above solutions only finds out whether a given string can be segmented or not. Extend the above Dynamic Programming solution to print all possible partitions of input string.

Examples:

Input: ilikeicecreamandmango
Output: 
i like ice cream and man go
i like ice cream and mango
i like icecream and man go
i like icecream and mango

Input: ilikesamsungmobile
Output:
i like sam sung mobile
i like samsung mobile

Refer below post for solution of exercise.
Word Break Problem using Backtracking

Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above



My Personal Notes arrow_drop_up

Improved By : sparsh singhal, ChongSun