Python Program to Find the Number of Unique Words in Text File

Given a text file, write a Python program to find the number of unique words in it. Here, a word counts as unique if it occurs exactly once in the file, ignoring case.

Examples:



Input: gfg.txt
Output: 18

Contents of gfg.txt:
GeeksforGeeks was created with a goal in mind to provide well written
well thought and well explained solutions for selected questions

Explanation:
The word frequencies in the file are:
{'geeksforgeeks': 1, 'was': 1, 'created': 1, 'with': 1, 'a': 1, 'goal': 1,
 'in': 1, 'mind': 1, 'to': 1, 'provide': 1, 'well': 3, 'written': 1,
 'thought': 1, 'and': 1, 'explained': 1, 'solutions': 1, 'for': 1,
 'selected': 1, 'questions': 1}

Of the 19 distinct words, only 'well' occurs more than once, so the count of unique words is 18.
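
The frequency table above can be reproduced in a few lines. The following is a minimal sketch that uses `collections.Counter` from the standard library instead of a hand-built dictionary. The variable names (`text`, `frequencies`, `once`) are chosen here for illustration and do not come from the original program.

```python
from collections import Counter

# Sample text from the example above, kept inline for illustration.
text = ("GeeksforGeeks was created with a goal in mind to "
        "provide well written well thought and well "
        "explained solutions for selected questions")

# Lowercase the text and split on whitespace, then tally each word.
frequencies = Counter(text.lower().split())

# Words that occur exactly once are the "unique" words in this article's sense.
once = [word for word, n in frequencies.items() if n == 1]

print(frequencies["well"])  # 3
print(len(frequencies))     # 19 distinct words
print(len(once))            # 18 words that occur exactly once
```

Note the difference between the last two numbers: there are 19 distinct words, but only 18 of them occur exactly once.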

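Approach:

1. Open the file and read its contents as a string, converted to lowercase.
2. Split the string into words on whitespace.
3. Build a dictionary that maps each word to its number of occurrences.
4. Count the dictionary entries whose value is 1 and return that count.

Below is the implementation of the above approach.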






# Function to count the number of unique words
# (words that occur exactly once) in the given text file.


def countUniqueWords(fileName):
    # Open the file with a context manager so it is closed automatically,
    # then read its contents as a string and convert to lowercase.
    with open(fileName, 'r') as file:
        read_file = file.read().lower()
    # Split the contents into words on whitespace.
    words_in_file = read_file.split()
    # Build a dictionary that counts the occurrences of each word.
    count_map = {}
    for word in words_in_file:
        if word in count_map:
            count_map[word] += 1
        else:
            count_map[word] = 1
    # Traverse the dictionary and increment the counter
    # for every word that occurs exactly once.
    count = 0
    for word in count_map:
        if count_map[word] == 1:
            count += 1
    return count  # Return the count.


# Creating sample text file for testing
with open("gfg.txt", "w") as file:
    file.write("GeeksforGeeks was created with "
               "a goal in mind to provide well written well "
               "thought and well explained solutions "
               "for selected questions")

print('Number of unique words in the file are:',
      countUniqueWords('gfg.txt'))

Output:

Number of unique words in the file are: 18

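Complexity analysis:

Here, n is the number of words in the file.

Time complexity: O(n), since each word is processed once and dictionary lookups and updates take constant time on average.

Space complexity: O(n), for the list of words and the dictionary of counts.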
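
The bounds above come from a single pass over the words plus one pass over the dictionary. As a hedged sketch of a variation, the file can also be read line by line, so that only one line and the frequency table are held in memory at a time rather than the whole file content. The function name `count_words_occurring_once` and the file name `sample_words.txt` are hypothetical and used only for this illustration.

```python
import os
import tempfile
from collections import Counter


def count_words_occurring_once(file_name):
    """Return how many words occur exactly once in the file (case-insensitive)."""
    frequencies = Counter()
    with open(file_name, "r") as handle:
        # Iterating over the file object yields one line at a time.
        for line in handle:
            frequencies.update(line.lower().split())
    # One more pass over the distinct words to count those seen once.
    return sum(1 for n in frequencies.values() if n == 1)


# Hypothetical sample file, written to a temporary directory
# so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample_words.txt")
    with open(path, "w") as handle:
        handle.write("well written well thought\nand well explained\n")
    result = count_words_occurring_once(path)

print(result)  # 4
```

The running time is still linear in the number of words. In the worst case, where every word is different, the frequency table still grows to n entries.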

