Python program for most frequent word in Strings List

Last Updated : 15 Apr, 2023

Given a list of strings, write a Python program to find the word that occurs most often across all of them.

Example:

Input : test_list = ["gfg is best for geeks", "geeks love gfg", "gfg is best"]
Output : gfg
Explanation : gfg occurs 3 times, more than any other word across all the strings.

Input : test_list = ["geeks love gfg", "geeks are best"]
Output : geeks
Explanation : geeks occurs 2 times, more than any other word across all the strings.

Method #1 : Using loop + max() + split() + defaultdict()

In this method, we split each string into words using split() and count each word's occurrences in a defaultdict(int). Finally, max() is called with key=temp.get to return the word with the highest count.

Python3




# Python3 code to demonstrate working of
# Most frequent word in Strings List
# Using loop + max() + split() + defaultdict()
from collections import defaultdict
 
# initializing list
test_list = ["gfg is best for geeks", "geeks love gfg", "gfg is best"]
 
# printing original list
print("The original list is : " + str(test_list))
 
temp = defaultdict(int)
 
# counting occurrences of each word
for sub in test_list:
    for wrd in sub.split():
        temp[wrd] += 1
 
# getting word with max frequency
res = max(temp, key=temp.get)
 
# printing result
print("Word with maximum frequency : " + str(res))


Output

The original list is : ['gfg is best for geeks', 'geeks love gfg', 'gfg is best']
Word with maximum frequency : gfg

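Time Complexity: O(n), where n is the total number of words across all strings (each dictionary update is O(1) on average)
Auxiliary Space: O(k), where k is the number of distinct words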
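
When two words share the highest count, max() with a key returns the first maximal key it encounters, which for a dictionary is insertion order (guaranteed from Python 3.7). The sketch below wraps Method #1 in a function to illustrate this; the function name and the second sample list are made up for illustration.

```python
from collections import defaultdict


def most_frequent_word(strings):
    """Return the most frequent word; ties go to the word seen first."""
    counts = defaultdict(int)
    for sentence in strings:
        for word in sentence.split():
            counts[word] += 1
    # max() scans keys in insertion order and keeps the first maximum
    return max(counts, key=counts.get)


print(most_frequent_word(["geeks love gfg", "geeks are best"]))  # geeks
print(most_frequent_word(["b a", "a b"]))  # b (tie, 'b' was seen first)
```

Note that max() raises ValueError on an empty dictionary, so this sketch assumes the input contains at least one word.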

Method #2 : Using list comprehension + mode()

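In this method, we flatten all the words into a single list using a list comprehension and find the most frequent one using statistics.mode(). On Python 3.8 and later, mode() returns the first value encountered when several words tie; earlier versions raise StatisticsError in that case.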

Python3




# Python3 code to demonstrate working of
# Most frequent word in Strings List
# Using list comprehension + mode()
from statistics import mode
 
# initializing list
test_list = ["gfg is best for geeks", "geeks love gfg", "gfg is best"]
 
# printing original list
print("The original list is : " + str(test_list))
 
# getting all words
temp = [wrd for sub in test_list for wrd in sub.split()]
 
# getting most frequent word
res = mode(temp)
 
# printing result
print("Word with maximum frequency : " + str(res))


Output

The original list is : ['gfg is best for geeks', 'geeks love gfg', 'gfg is best']
Word with maximum frequency : gfg
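
If all tied words are wanted rather than just one, statistics.multimode() (available from Python 3.8) returns every value with the highest count, in order of first appearance. A minimal sketch, with a hypothetical function name and a made-up second input:

```python
from statistics import multimode


def most_frequent_words(strings):
    """Return a list of all words that share the highest frequency."""
    words = [word for sentence in strings for word in sentence.split()]
    return multimode(words)


print(most_frequent_words(["gfg is best for geeks", "geeks love gfg", "gfg is best"]))
# ['gfg']
print(most_frequent_words(["geeks love gfg", "gfg love geeks"]))
# ['geeks', 'love', 'gfg']
```

Unlike mode(), multimode() returns an empty list for empty input instead of raising an error.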

Method #3: Using list() and Counter()

  • Append all words to an empty list and compute the frequency of each word using Counter().
  • Find the maximum count and return the corresponding word.

Below is the implementation:

Python3




# Python3 code to demonstrate working of
# Most frequent word in Strings List
 
from collections import Counter
 
# function which returns
# most frequent word
def mostFrequentWord(words):
   
    # Taking empty list
    lis = []
    for i in words:
       
        # Getting all words
        for j in i.split():
            lis.append(j)
             
    # Calculating frequency of all words
    freq = Counter(lis)
     
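    # find max count and remember that word
    # (max_count avoids shadowing the built-in max)
    max_count = 0
    word = None
    for i in freq:
        if freq[i] > max_count:
            max_count = freq[i]
            word = i

    # return only after every word has been checked
    return word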
 
 
# Driver code
# initializing strings list
words = ["gfg is best for geeks", "geeks love gfg", "gfg is best"]
 
# printing original list
print("The original list is : " + str(words))
 
# passing the list to the mostFrequentWord function
# printing result
print("Word with maximum frequency : " + mostFrequentWord(words))
# This code is contributed by vikkycirus


Output

The original list is : ['gfg is best for geeks', 'geeks love gfg', 'gfg is best']
Word with maximum frequency : gfg

Methods #2 and #3 both build a list of all the words before counting, so their complexity is the same:

Time Complexity: O(n), where n is the total number of words across all strings

Space Complexity: O(n)
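
The intermediate list in Method #3 is not strictly needed, because a Counter can be updated one string at a time. A minimal sketch of that variation; the function name most_frequent_word_streaming and the choice to return None for empty input are assumptions made here, not part of the original method.

```python
from collections import Counter


def most_frequent_word_streaming(strings):
    """Count words string by string, without building a combined list."""
    freq = Counter()
    for sentence in strings:
        # update() adds the counts of the words in this sentence
        freq.update(sentence.split())
    if not freq:
        return None  # assumption: no words means no answer
    return max(freq, key=freq.get)


words = ["gfg is best for geeks", "geeks love gfg", "gfg is best"]
print("Word with maximum frequency : " + most_frequent_word_streaming(words))
```

This keeps the extra memory proportional to the number of distinct words rather than the total number of words.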

Method #4: Using Counter() and reduce()
This approach uses the most_common() method of the Counter class from the collections module together with the reduce() function from the functools module:

Python3




from collections import Counter
from functools import reduce
 
def most_frequent_word(test_list):
    all_words = reduce(lambda a, b: a + b, [sub.split() for sub in test_list])
    word_counts = Counter(all_words)
    return word_counts.most_common(1)[0][0]
 
test_list = ["gfg is best for geeks", "geeks love gfg", "gfg is best"]
print("The original list is: ", test_list)
print("Word with most frequency: ", most_frequent_word(test_list))


Output

The original list is:  ['gfg is best for geeks', 'geeks love gfg', 'gfg is best']
Word with most frequency:  gfg

Explanation:

We use reduce() to concatenate the per-string word lists into a single list of all words.
We then create a Counter object from that list to count the frequency of each word.
Finally, we call most_common(1) to get the word with the highest frequency and return it.

Time complexity: O(n^2 * k) in the worst case, where n is the number of strings in test_list and k is the average number of words in each string. Each a + b inside reduce() copies the accumulated list, so the concatenation step dominates; the counting itself is O(n * k).

Auxiliary Space: O(n * k), since we store all the words in a list before creating the Counter object.
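
Repeated list concatenation inside reduce() copies the accumulated list on every step. One way to avoid that, sketched below, is to flatten the word lists lazily with itertools.chain.from_iterable() and feed the result straight to Counter. The function name most_frequent_word_chained is hypothetical, and this is a swapped-in flattening technique rather than the reduce() approach described above.

```python
from collections import Counter
from itertools import chain


def most_frequent_word_chained(strings):
    """Flatten the per-string word lists lazily, then count them."""
    # chain.from_iterable yields each word once, with no list copying
    all_words = chain.from_iterable(s.split() for s in strings)
    word_counts = Counter(all_words)
    # most_common(1) returns a one-element list of (word, count) pairs
    return word_counts.most_common(1)[0][0]


test_list = ["gfg is best for geeks", "geeks love gfg", "gfg is best"]
print("Word with most frequency: ", most_frequent_word_chained(test_list))
```

As with the original, this assumes the input contains at least one word; most_common(1) returns an empty list otherwise, and the indexing would fail.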

Method #5: Using heapq:

  1. We split each string in the input list into words using split() inside a list comprehension, which gives a list of word lists, all_words.
  2. We flatten all_words with a generator expression and pass the words to Counter. A Counter is a dictionary subclass that stores the frequency of each element.
  3. We use heapq.nlargest() with key=word_counts.get to get the word with the highest frequency.
  4. We return the most frequent word.

Python3




import heapq
from collections import Counter
 
def most_frequent_word(test_list):
    all_words = [sub.split() for sub in test_list]
    word_counts = Counter(word for sublist in all_words for word in sublist)
    return heapq.nlargest(1, word_counts, key=word_counts.get)[0]
 
test_list = ["gfg is best for geeks", "geeks love gfg", "gfg is best"]
print("The original list is: ", test_list)
print("Word with most frequency: ", most_frequent_word(test_list))
#This code is contributed by Pushpa.


Output

The original list is:  ['gfg is best for geeks', 'geeks love gfg', 'gfg is best']
Word with most frequency:  gfg

Time complexity: O(n), where n is the total number of words in the input list. Creating the Counter object takes O(n). heapq.nlargest() with a count of 1 then takes O(k), where k is the number of unique words; in general, selecting the m largest items takes O(k log m).

Auxiliary space: O(n), since all_words stores every word from the input; the Counter itself needs O(k) for the k unique words.
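
heapq.nlargest() becomes more useful when more than one word is wanted. The sketch below generalizes Method #5 to return the top m words with their counts; the function name top_words and the parameter m are assumptions introduced for illustration.

```python
import heapq
from collections import Counter


def top_words(strings, m):
    """Return the m most frequent (word, count) pairs, highest count first."""
    word_counts = Counter(word for s in strings for word in s.split())
    # compare the pairs by their count, which is the second item
    return heapq.nlargest(m, word_counts.items(), key=lambda item: item[1])


test_list = ["gfg is best for geeks", "geeks love gfg", "gfg is best"]
print("Top 2 words: ", top_words(test_list, 2))
```

For this input the first pair is ('gfg', 3); the second pair has a count of 2, which three words share. Counter.most_common(m) offers the same functionality directly.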


