
Python Program to Find the Number of Unique Words in Text File

Last Updated : 07 Nov, 2022

Given a text file, write a Python program to find the number of unique words in it. Here a word counts as unique if it occurs exactly once in the file, ignoring case.

Examples:

Input: gfg.txt
Output: 18

Contents of gfg.txt: GeeksforGeeks was created with a goal in mind to 
 provide well written well thought and well
explained solutions for selected questions

Explanation:
The frequency of each word in the file is {'geeksforgeeks': 1, 'was': 1, 'created': 1, 
'with': 1, 'a': 1, 'goal': 1,
'in': 1, 'mind': 1, 'to': 1, 'provide': 1, 'well': 3, 'written': 1, 'thought': 1, 
'and': 1, 'explained': 1,
'solutions': 1, 'for': 1, 'selected': 1, 'questions': 1}

Every word except 'well' occurs exactly once, so the count of unique words is 18.
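
Note that "unique" here means a word that occurs exactly once, which is not the same as the number of distinct words. The following is a minimal sketch of that distinction on the sample text. It uses `collections.Counter` from the standard library in place of a hand-built dictionary, and the variable names (`text`, `freq`, `once`, `distinct`) are illustrative assumptions, not part of the original program.

```python
from collections import Counter

# Sample text from gfg.txt (assumed whitespace-separated, no punctuation).
text = ("GeeksforGeeks was created with a goal in mind to "
        "provide well written well thought and well "
        "explained solutions for selected questions")

# Lowercase the text, split on whitespace, and count each word.
freq = Counter(text.lower().split())

# Words that occur exactly once ("unique" in this article's sense).
once = [word for word, n in freq.items() if n == 1]

# Distinct words, regardless of how often each occurs.
distinct = len(freq)

print(len(once))     # 18
print(distinct)      # 19
print(freq["well"])  # 3
```

If the goal were the number of distinct words instead, `len(set(text.lower().split()))` would give the same 19 without building the frequency table.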

Approach:

  •  Create a file object using the open function and pass the filename as a parameter.
  •  Read the contents in the file as a string using the read() function and convert the string to lowercase using the lower() function.
  •  Split the file contents into words using the split() function.
  •  Create a dictionary for counting the number of occurrences of each word.
  •  Create a counter variable to count the number of unique words.
  •  Traverse the dictionary and increment the counter for every word that occurs exactly once.
  •  Close the file object.
  •  Return the count.

Below is the implementation of the above approach.

Python3




# Function to count  the number of unique words
# in the given text file.
  
  
def countUniqueWords(fileName):
    # Create a file object using open
    # function and pass filename as parameter.
    file = open(fileName, 'r')
    # Read file contents as string and convert to lowercase.
    read_file = file.read().lower()
    words_in_file = read_file.split()  
    # Creating a dictionary for counting number of occurrences.
    count_map = {}
    for i in words_in_file:
        if i in count_map:
            count_map[i] += 1  
        else:
            count_map[i] = 1
    count = 0
    # Traverse the dictionary and increment
    # the counter for every unique word.
    for i in count_map:
        if count_map[i] == 1:
            count += 1
    file.close()
    return count  # Return the count.
  
  
# Creating sample text file for testing
with open("gfg.txt", "w") as file:  
    file.write("GeeksforGeeks was created with\
    a goal in mind to provide well written well \
    thought and well explained solutions\
    for selected questions")
  
print('Number of unique words in the file are:',
      countUniqueWords('gfg.txt'))


Output:

Number of unique words in the file are: 18
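
As an alternative, the same count can be sketched with a `with` statement, so the file is closed even if reading fails, and with `collections.Counter` updated one line at a time, so the whole file is never held in memory as a single string. The function name `count_words_occurring_once` and the temporary file used for the demonstration are assumptions made for this example; this is one possible variant under those assumptions, not a replacement for the implementation above.

```python
import os
import tempfile
from collections import Counter


def count_words_occurring_once(file_name):
    """Return how many words occur exactly once in the file, ignoring case."""
    freq = Counter()
    # The with statement closes the file even if an error is raised.
    with open(file_name, "r") as f:
        for line in f:
            # Update the running word counts one line at a time.
            freq.update(line.lower().split())
    # Count the words whose frequency is exactly 1.
    return sum(1 for n in freq.values() if n == 1)


# Demonstration: write the sample text to a temporary file and count.
sample = ("GeeksforGeeks was created with a goal in mind to\n"
          "provide well written well thought and well\n"
          "explained solutions for selected questions\n")
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

result = count_words_occurring_once(path)
os.remove(path)  # Clean up the temporary file.
print("Number of unique words in the file are:", result)  # 18
```

Like the original, this sketch splits on whitespace only, so punctuation attached to a word (for example "well," versus "well") would be counted as part of the word.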

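Complexity analysis:

Here n is the number of words in the file.

Time complexity: O(n), since each word is processed once and dictionary lookups and insertions take O(1) time on average.

Space complexity: O(n), for the list of words and the dictionary of counts.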


