Find the most repeated word in a text file

  • Difficulty Level : Medium
  • Last Updated : 25 Oct, 2021

Python provides built-in functions for creating, writing, and reading files. It can handle two types of files: normal text files and binary files (written in binary language, 0s and 1s).

  • Text files: In this type of file, each line of text is terminated with a special character called EOL (End of Line), which is the newline character ('\n') in Python by default.
  • Binary files: In this type of file, there is no line terminator, and the data is stored as raw bytes rather than as human-readable text.
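
The difference is easiest to see by reading the same file in both modes. The following is a minimal sketch, not part of the original program: it writes a small file to a temporary directory and reads it back in text mode ("r") and in binary mode ("rb"). The file name `sample.txt`, the helper `read_both_modes`, and the file contents are assumptions made for illustration.

```python
import os
import tempfile


def read_both_modes(text):
    """Write text to a temporary file, then read it back as str and as bytes."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "sample.txt")  # hypothetical file name
        # newline="\n" writes the newline character unchanged on every platform
        with open(path, "w", encoding="utf-8", newline="\n") as f:
            f.write(text)
        # Text mode decodes the bytes and returns a str
        with open(path, "r", encoding="utf-8") as f:
            as_text = f.read()
        # Binary mode returns the raw bytes without decoding
        with open(path, "rb") as f:
            as_bytes = f.read()
    return as_text, as_bytes


as_text, as_bytes = read_both_modes("first line\nsecond line\n")
print(type(as_text).__name__, as_text.splitlines())
print(type(as_bytes).__name__, as_bytes)
```

In text mode the result is a str that can be split into lines and words directly; in binary mode it is a bytes object that would have to be decoded first.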

This article works with a text (.txt) file in Python. The program below finds the most repeated word in that file.

Approach:

  • Read the content of the file as input.
  • Save each word in a list after removing spaces and punctuation from the input string.
  • Find the frequency of each word.
  • Print the word with the highest frequency.
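
The word-splitting step can be sketched on an in-memory string before any file is involved. This is only an illustration under stated assumptions: the helper name `tokenize` and the sample sentence are hypothetical, and `string.punctuation` is used here to strip all ASCII punctuation from the ends of each token, which is broader than removing only commas and periods.

```python
import string


def tokenize(text):
    """Lowercase text, split on whitespace, and strip surrounding punctuation."""
    words = []
    for token in text.lower().split():
        word = token.strip(string.punctuation)
        if word:  # skip tokens that consisted only of punctuation
            words.append(word)
    return words


print(tokenize("Well, well. All is well that ends well!"))
# ['well', 'well', 'all', 'is', 'well', 'that', 'ends', 'well']
```

Calling split() without an argument splits on any run of whitespace, so newlines and repeated spaces do not produce empty strings in the list.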

Input File:

Below is the implementation of the above approach:

Python3




# Python program to find the most repeated word
# in a text file

frequent_word = ""
frequency = 0
words = []

# Open the file "gfg.txt" in reading mode. The with
# statement closes the file automatically.
with open("gfg.txt", "r") as f:

    # Traversing file line by line
    for line in f:

        # Lowercase the line, remove commas and periods,
        # and split it into words. split() with no argument
        # splits on any whitespace, so the trailing newline
        # and empty strings are not kept as words.
        line_word = line.lower().replace(',', '').replace('.', '').split()

        # Adding them to list words
        for w in line_word:
            words.append(w)

# Finding the max occurred word
for i in range(0, len(words)):

    # Declaring count
    count = 1

    # Count the later occurrences of words[i]
    for j in range(i + 1, len(words)):
        if words[i] == words[j]:
            count = count + 1

    # If the count is more than the highest frequency
    # seen so far, remember this word
    if count > frequency:
        frequency = count
        frequent_word = words[i]

print("Most repeated word: " + frequent_word)
print("Frequency: " + str(frequency))

Output:

Most repeated word: well
Frequency: 3
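
The nested loops above compare every word with every later word, so the running time grows quadratically with the number of words. As an alternative technique (not the method used in the program above), the standard library's collections.Counter counts all words in a single pass. The sketch below assumes the words have already been collected into a list; the helper name `most_repeated` and the sample list are hypothetical.

```python
from collections import Counter


def most_repeated(words):
    """Return (word, frequency) for the most common word, or ("", 0) if empty."""
    if not words:
        return "", 0
    # most_common(1) returns a one-element list of (item, count) pairs
    word, frequency = Counter(words).most_common(1)[0]
    return word, frequency


words = ["all", "is", "well", "that", "ends", "well"]
print(most_repeated(words))
# ('well', 2)
```

When several words share the highest count, most_common returns the one that was encountered first, which matches the behaviour of the loop-based version.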
