
Python | Check if given words appear together in a list of sentences

Last Updated : 10 May, 2023

Given a list of sentences ‘sentence’ and a list of words ‘words’, write a Python program to find the sentences that contain all the words in ‘words’ and return them in a list.

Examples:

Input : sentence = ['I love tea', 'He hates tea', 'We love tea']
        words = ['love', 'tea']
Output : ['I love tea', 'We love tea']

Input : sentence = ['python coder', 'geeksforgeeks', 'coder in geeksforgeeks']
        words = ['coder', 'geeksforgeeks']
Output : ['coder in geeksforgeeks']

Approach #1 : Using List comprehension

A list comprehension first computes a boolean value for each sentence, indicating whether every word in ‘words’ occurs in it, and stores these values in ‘res’. A second comprehension then returns the sentences whose entry in ‘res’ is True. Note that the ‘in’ operator on strings is a substring test, so ‘tea’ is also found inside ‘steam’.

Python3




# Python3 program to Check if given words
# appear together in a list of sentence
 
 
def check(sentence, words):
    res = [all([k in s for k in words]) for s in sentence]
    return [sentence[i] for i in range(0, len(res)) if res[i]]
 
 
# Driver code
sentence = ['python coder', 'geeksforgeeks', 'coder in geeksforgeeks']
words = ['coder', 'geeksforgeeks']
print(check(sentence, words))


Output

['coder in geeksforgeeks']

Approach #2 : List comprehension (Alternative way)

For each sentence, collect in a list ‘k’ the words from ‘words’ that occur in it. If the length of ‘k’ equals the length of ‘words’, every word was found, so the sentence is appended to ‘res’.

Python3




# Python3 program to Check if given words
# appear together in a list of sentence
 
 
def check(sentence, words):
    res = []
    for substring in sentence:
        k = [w for w in words if w in substring]
        if (len(k) == len(words)):
            res.append(substring)
 
    return res
 
 
# Driver code
sentence = ['python coder', 'geeksforgeeks', 'coder in geeksforgeeks']
words = ['coder', 'geeksforgeeks']
print(check(sentence, words))


Output

['coder in geeksforgeeks']

Approach #3 : Python map()

map() applies a function to each sentence; the function splits the sentence into words and checks whether every word in ‘words’ is among them. This yields a boolean value for each sentence, stored in ‘res’. The matching sentences are then selected from ‘res’ as in approach #1. Because the sentence is split first, this approach matches whole words only.

Python3




# Python3 program to Check if given words
# appear together in a list of sentence
 
 
def check(sentence, words):
    res = list(map(lambda x: all(map(lambda y: y in x.split(),
                                     words)), sentence))
    return [sentence[i] for i in range(0, len(res)) if res[i]]
 
 
# Driver code
sentence = ['python coder', 'geeksforgeeks', 'coder in geeksforgeeks']
words = ['coder', 'geeksforgeeks']
print(check(sentence, words))


Output

['coder in geeksforgeeks']
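
Approaches #1 and #2 test each word against the raw sentence string, while approach #3 tests it against the split sentence. The two give the same answer on the sample input, but not in general. The following is a minimal sketch of the difference; the function names check_substring and check_whole_word and the sample sentences are illustrative, not part of the original programs.

```python
def check_substring(sentences, words):
    # 'w in s' on a string is a substring test (approaches #1 and #2)
    return [s for s in sentences if all(w in s for w in words)]


def check_whole_word(sentences, words):
    # 'w in s.split()' is a membership test on whole words (approach #3)
    return [s for s in sentences if all(w in s.split() for w in words)]


sentences = ['I love steam', 'I love tea']
words = ['love', 'tea']
print(check_substring(sentences, words))   # ['I love steam', 'I love tea']
print(check_whole_word(sentences, words))  # ['I love tea']
```

Which behaviour is wanted depends on the task: substring matching also accepts ‘tea’ inside ‘steam’, while whole-word matching rejects it.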

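Approach #4 : Using count()+len()+for loop

For each sentence, count how many of the words occur in it at least once, using the string method count(). If that count equals len(words), every word is present and the sentence is appended to the result list.
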

Python3




# Python3 program to Check if given words
# appear together in a list of sentence
 
 
def check(sentence, words):
    res = []
    for i in sentence:
        c = 0
        for j in words:
            if(i.count(j) >= 1):
                c += 1
        if(c == len(words)):
            res.append(i)
    return res
 
 
# Driver code
sentence = ['python coder', 'geeksforgeeks', 'coder in geeksforgeeks']
words = ['coder', 'geeksforgeeks']
print(check(sentence, words))


Output

['coder in geeksforgeeks']

Approach #5 : Using built-in issubset

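This solution converts the list of words into a set and, for each sentence, converts the words of that sentence into a set as well. It then checks whether the set of words is a subset of the sentence's set and, if it is, adds the sentence to the result list. Set membership tests take constant time on average, so issubset() runs in time proportional to the number of words being checked instead of rescanning the sentence for every word. This can make the approach faster than the previous ones when there are many words.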

Python3




def check(sentence, words):
    # list to store sentences that contain all words
    res = []
    # convert list of words into set for faster lookup
    words_set = set(words)
    # iterate through each sentence
    for s in sentence:
        # convert sentence into set for faster lookup
        sentence_set = set(s.split())
        # check if list of words is a subset of sentence set
        # if it is, add sentence to result list
        if words_set.issubset(sentence_set):
            res.append(s)
    return res
 
# Driver code
sentence = ['python coder', 'geeksforgeeks', 'coder in geeksforgeeks']
words = ['coder', 'geeksforgeeks']
print(check(sentence, words))
#This code is contributed by Edula Vinay Kumar Reddy


Output

['coder in geeksforgeeks']

Time complexity: O(n*m) where n is the number of sentences and m is the length of the longest sentence
Auxiliary Space: O(n*m)
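
The set-based check compares exact tokens, so ‘tea.’ or ‘Tea’ would not match ‘tea’. The following is a hedged sketch of one way to normalise case and punctuation before the subset test. The function name check_normalized and the use of the pattern \w+ to extract words are assumptions made for illustration; they are not part of the original solution.

```python
import re


def check_normalized(sentences, words):
    # lower-case the target words once
    wanted = {w.lower() for w in words}
    result = []
    for s in sentences:
        # extract runs of word characters, ignoring case and punctuation
        tokens = set(re.findall(r"\w+", s.lower()))
        # '<=' on sets is the operator form of issubset()
        if wanted <= tokens:
            result.append(s)
    return result


sentences = ['I love Tea.', 'He hates tea', 'We LOVE tea!']
print(check_normalized(sentences, ['love', 'tea']))
# ['I love Tea.', 'We LOVE tea!']
```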

Approach #6: Using filter and lambda

Step-by-step algorithm:

  1. Define a function named check which takes two arguments – sentence and words.
  2. In the function, use the filter() function to iterate over the sentence list and keep only the sentences that contain all the words in the words list.
  3. To test each sentence, build a set from the words list and call its issubset() method on the split sentence; it returns True if every word in the words list is present in the sentence.
  4. Return the filtered sentences as a list.

Python3




#function to check if given words appear together in a list of sentence
def check(sentence, words):
    return list(filter(lambda s: set(words).issubset(s.split()), sentence))
 
 
# Driver code
sentence = ['python coder', 'geeksforgeeks', 'coder in geeksforgeeks']
words = ['coder', 'geeksforgeeks']
print(check(sentence, words))
#This code is contributed by Zaid Khan


Output

['coder in geeksforgeeks']

Time Complexity: filter() calls the lambda once per sentence, and each call splits the sentence and builds a set from the words. The overall time complexity is therefore O(n*(m+k)), where n is the number of sentences, m is the length of the longest sentence and k is the number of words.
Auxiliary Space: O(n) for the new list that stores the filtered sentences, plus O(m+k) temporary space for the split sentence and the set of words.

Approach #7: Using Regular Expression

This approach uses a regular expression to check whether a sentence contains all the words in the list of words. It builds a pattern from the words using positive lookahead, compiles it, and matches it against each sentence in the list of sentences. If the sentence matches the pattern, it is added to the result list.

Algorithm:

  • Create an empty list called “result” to store the sentences that contain all the words in the list of words.
  • Build a regular expression pattern from the words in the list of words using positive lookahead.
  • Compile the regular expression pattern.
  • Iterate through each sentence in the list of sentences.
  • Match the regular expression pattern with the sentence.
  • If the sentence matches the pattern, add it to the result list.
  • Return the result list.

Python3




import re
 
def check(sentence, words):
    # list to store sentences that contain all words
    result = []
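    # create the pattern: one positive lookahead per word, so the words
    # may appear in any order; re.escape guards against regex metacharacters
    pattern = ''.join('(?=.*{})'.format(re.escape(w)) for w in words)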
    # compile the regular expression pattern
    regex = re.compile(pattern)
    # iterate through each sentence
    for s in sentence:
        # match the regular expression pattern with the sentence
        if regex.match(s):
            result.append(s)
    return result
 
# Driver code
sentence = ['I love tea', 'He hates tea', 'We love tea']
words = ['love', 'tea']
print(check(sentence, words))
 
sentence = ['python coder', 'geeksforgeeks', 'coder in geeksforgeeks']
words = ['coder', 'geeksforgeeks']
print(check(sentence, words))


Output

['I love tea', 'We love tea']
['coder in geeksforgeeks']

Time Complexity:
The pattern is matched against each sentence once, so the overall time complexity of this approach is O(mn), where m is the number of sentences and n is the length of the regular expression pattern, treating the length of a single sentence as bounded.

Space Complexity:
The compiled pattern and the result list are the only extra storage, so the overall space complexity of this approach is O(n+k), where n is the length of the regular expression pattern and k is the number of sentences that contain all the words in the list of words.
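
A lookahead-based pattern can also be made to match whole words in any order. The following is a minimal sketch, assuming one (?=.*\bword\b) lookahead per word with re.escape() applied to each word; the function name check_whole_words and the sample sentences are hypothetical and chosen only for illustration.

```python
import re


def check_whole_words(sentences, words):
    # one lookahead per word: each must be satisfiable from the start of
    # the sentence, independently of the others, so order does not matter;
    # \b restricts each match to a whole word
    pattern = ''.join(r'(?=.*\b{}\b)'.format(re.escape(w)) for w in words)
    regex = re.compile(pattern)
    return [s for s in sentences if regex.match(s)]


sentences = ['tea I love', 'I love steam', 'We love tea']
print(check_whole_words(sentences, ['love', 'tea']))
# ['tea I love', 'We love tea']
```

Here ‘tea I love’ is accepted even though the words appear in reverse order, and ‘I love steam’ is rejected because ‘tea’ occurs only inside another word.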


