Given a data set, we can find the k most frequent words in it.
A solution to this problem is already available as Find the k most frequent words from a file, but in Python we can solve it more concisely and efficiently with the help of a high-performance standard library module.
To do this, we'll use collections, a module of high-performance container datatypes. It provides several specialized container datatypes, and we will use the Counter class from it.
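As a quick illustration of what Counter does, here is a minimal sketch. The word list is a hypothetical sample, not taken from the article's data set.

```python
from collections import Counter

# Hypothetical sample list, used only for illustration.
words = ["red", "blue", "red", "green", "blue", "red"]

counts = Counter(words)        # maps each distinct word to its count
print(counts["red"])           # 3
print(counts["yellow"])        # 0, missing keys count as zero
print(counts.most_common(2))   # [('red', 3), ('blue', 2)]
```

Counter is a dict subclass, so the counts can also be read and updated like ordinary dictionary entries.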
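Input : "John is the son of John second. Second son of John second is William second."
Output : [('second', 4), ('John', 3), ('son', 2), ('is', 2)]
Explanation :
1. The string is converted into a list of words like this : ['John', 'is', 'the', 'son', 'of', 'John', 'second', 'Second', 'son', 'of', 'John', 'second', 'is', 'William', 'second']
2. Now 'most_common(4)' returns the four most frequent words and their counts as tuples.
Note : The count of 4 for 'second' treats 'Second' and 'second' as the same word, and the list above assumes the full stops have been removed. A plain split() keeps 'second.' as a separate word. 'of' also occurs twice, so which of the tied words appear in the result depends on how ties are ordered.

Input : "geeks for geeks is for geeks. By geeks and for the geeks."
Output : [('geeks', 5), ('for', 3)]
Explanation : most_common(2) returns the two most frequent words and their counts. The count of 5 for 'geeks' again assumes the full stops have been removed.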
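The counts in the examples above only come out as shown if punctuation is removed before counting. The sketch below shows one possible way to do that with a regular expression. The helper name normalized_words is hypothetical, and lowercasing every word is an assumption made here, not something the article specifies.

```python
import re
from collections import Counter

def normalized_words(text):
    # Lowercase the text, then keep only runs of letters and apostrophes,
    # which drops full stops and other punctuation.
    return re.findall(r"[a-z']+", text.lower())

text = "geeks for geeks is for geeks. By geeks and for the geeks."

# A plain split() leaves 'geeks.' as a separate token.
print(text.split().count("geeks"))     # 3

words = normalized_words(text)
print(Counter(words).most_common(2))   # [('geeks', 5), ('for', 3)]
```

Note that lowercasing also turns names such as 'John' into 'john', so whether to apply it depends on the data set.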
- Import the Counter class from the collections module.
- Split the string into a list using split(); it returns a list of words.
- Pass the list to the Counter constructor to create a Counter instance.
- The method most_common(k) of Counter returns a list of the k most frequent words together with their counts.
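The steps above can be sketched as follows. This is a minimal illustration rather than the article's original listing: the function name k_most_frequent and the sample string are hypothetical, and the sample deliberately contains no punctuation so that a plain split() is enough.

```python
from collections import Counter

def k_most_frequent(text, k):
    # Step 2: split the string into a list of words on whitespace
    words = text.split()
    # Step 3: build a Counter from the list of words
    counter = Counter(words)
    # Step 4: return the k most frequent words with their counts
    return counter.most_common(k)

# Hypothetical sample string, used only for illustration.
sample = "the cat and the dog and the bird"
print(k_most_frequent(sample, 2))   # [('the', 3), ('and', 2)]
```

From Python 3.7 onwards, words with equal counts are returned in the order they were first encountered.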
Output of the Python implementation of the above approach on the given data set :
[('Geeks', 5), ('to', 4), ('and', 4), ('article', 3)]