Python | Remove all duplicate words from a given sentence

Given a sentence containing n words/strings, remove all duplicate words/strings so that each word appears only once.

Examples:  

Input : Geeks for Geeks
Output : Geeks for

Input : Python is great and Java is also great
Output : Python is great and Java also

We can solve this problem quickly using Python's collections.Counter class. The approach is simple:

1) Split the input sentence on spaces to get a list of words. 
2) Create a dictionary using Counter, which has the words as keys and their frequencies as values. 
3) Since dictionary keys are unique, join the keys with spaces to form a single string. 

Python




from collections import Counter

def remove_duplicates(sentence):

    # split the input string on spaces
    words = sentence.split(" ")

    # create a dictionary using Counter, which has
    # the words as keys and their frequencies as values
    unique_words = Counter(words)

    # dictionary keys are unique, so joining them with
    # spaces gives the sentence without duplicate words
    s = " ".join(unique_words.keys())
    print(s)

# Driver program
if __name__ == "__main__":
    sentence = 'Python is great and Java is also great'
    remove_duplicates(sentence)


Output

Python is great and Java also

Time Complexity: O(N)
Auxiliary Space: O(N)
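
Because Counter also records how often each word occurs, the same object can report which words were duplicated. The following is a minimal sketch of that idea; the helper name duplicated_words is made up for illustration and is not part of the solution above.

```python
from collections import Counter

def duplicated_words(sentence):
    # Count every whitespace-separated word in the sentence
    counts = Counter(sentence.split())
    # Keep only the words seen more than once, in first-seen order
    return [word for word, count in counts.items() if count > 1]

print(duplicated_words('Python is great and Java is also great'))
```

This relies on Counter being a dict subclass, so in Python 3.7 and later its keys come back in insertion order.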

Method 2: Using a loop and count()

Python




# Program using only built-in string and list methods
s = "Python is great and Java is also great"
l = s.split()
k = []
for i in l:

    # l.count(i) >= 1 always holds for a word taken from 'l',
    # so the 'not in k' test is what keeps 'k' free of duplicates
    if l.count(i) >= 1 and i not in k:
        k.append(i)
print(' '.join(k))


Output

Python is great and Java also

Time Complexity: O(N*N)
Auxiliary Space: O(N)
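
One detail worth knowing when counting words: str.count() counts substring matches anywhere in the string, while list.count() counts whole elements only. The short sketch below illustrates the difference on a made-up sentence; the variable names are illustrative.

```python
sentence = "this is fine"
words = sentence.split()

# str.count counts substring matches: "is" also occurs inside "this"
substring_hits = sentence.count("is")

# list.count counts whole elements only
word_hits = words.count("is")

print(substring_hits, word_hits)
```

Counting on the list of words is therefore the safer choice whenever the actual number of occurrences matters.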

Method 3: A shorter implementation using dict.fromkeys():

Python3




# Python3 program
 
string = 'Python is great and Java is also great'
 
print(' '.join(dict.fromkeys(string.split())))


Output

Python is great and Java also

Time Complexity: O(N)
Auxiliary Space: O(N)

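Method 4: Using set() 

A set does not preserve insertion order, so the order of the words in the output may differ from the input and can vary between runs.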

Python3




string = 'Python is great and Java is also great'
print(' '.join(set(string.split())))


Output

Java also great and Python is

Time Complexity: O(N)
Auxiliary Space: O(N)
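
If the original word order matters, one possible workaround is to sort the set by each word's first position in the input. This is only a sketch: words.index() scans the list for every word, so the overall cost rises to O(N*N). The name unique_in_order is illustrative.

```python
string = 'Python is great and Java is also great'
words = string.split()

# set() removes duplicates but loses the original order;
# sorting by first index in 'words' restores it
unique_in_order = sorted(set(words), key=words.index)

print(' '.join(unique_in_order))
```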

Method 5: Using operator.countOf()

Python3




# Program using operator.countOf()
import operator as op
s = "Python is great and Java is also great"
l = s.split()
k = []
for i in l:
    # If condition is used to store unique string
    # in another list 'k'
    if (op.countOf(l,i)>=1 and (i not in k)):
        k.append(i)
print(' '.join(k))


Output

Python is great and Java also

Time Complexity: O(N*N), since op.countOf() and the 'not in k' check each scan a list for every word
Auxiliary Space: O(N)

Method 6: Using a loop and a result list

This method loops over each word of the sentence and stores the unique words in a separate list, using an if condition to check whether the word is already present in that list.

Follow the steps below to implement the above idea:

  • Split the given sentence into words/strings and store them in a list.
  • Create an empty list to store the distinct words/strings.
  • Iterate over the list of words/strings, and for each word, check if it is already in the result list.
  • If the word is not in the result list, append it.
  • If the word is already in the result list, skip it.
  • Finally, join the words in the result list using a space and return the resulting string as the output.

Below is the implementation of the above approach:

Python3




def remove_duplicates(sentence):
    words = sentence.split(" ")
    result = []
    for word in words:
        if word not in result:
            result.append(word)
    return " ".join(result)
 
sentence = "Python is great and Java is also great"
print(remove_duplicates(sentence))


Output

Python is great and Java also

Time complexity: O(n^2) because of the list result that stores unique words, which is searched for every word in the input sentence. 
Auxiliary space: O(n) because we are storing unique words in the result list.
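
The quadratic cost comes from searching a list. A common variation, sketched below, keeps a set next to the result list so that each membership test takes O(1) time on average, bringing the total to O(n) while still preserving first-seen order. The function name remove_duplicates_fast is made up for this example.

```python
def remove_duplicates_fast(sentence):
    seen = set()      # words already emitted, for O(1) average lookups
    result = []       # unique words in first-seen order
    for word in sentence.split():
        if word not in seen:
            seen.add(word)
            result.append(word)
    return " ".join(result)

print(remove_duplicates_fast("Python is great and Java is also great"))
```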

Method 7: Using recursion

Algorithm:

  1. Split the input sentence into words.
  2. If there is only one word, return it.
  3. If the first word also appears among the remaining words, drop it and call the function recursively on the remaining words.
  4. Otherwise, concatenate the first word with the result of calling the function recursively on the remaining words.
  5. Return the final result as a string.

Because a word is dropped whenever it appears again later, this method keeps the last occurrence of each duplicated word, so its output order differs from that of the other methods.

Python3




def remove_duplicates(sentence):
    words = sentence.split(" ")
    if len(words) == 1:
        return words[0]
    if words[0] in words[1:]:
        return remove_duplicates(" ".join(words[1:]))
    else:
        return words[0] + " " + remove_duplicates(" ".join(words[1:]))
 
sentence = "Python is great and Java is also great"
print(remove_duplicates(sentence))


Output

Python and Java is also great

Time complexity:
O(n^2), where n is the number of words in the input sentence. There is one recursive call per word, and each call checks whether the first word is present in the rest of the words using the in operator, which takes O(n) time in the worst case. The split() and join() in each call are also O(n).

Space complexity:
The maximum depth of the recursive call stack equals the number of words in the input sentence, so the stack is O(n) deep. Each pending call also keeps its own list of the remaining words alive, so the total memory held at the deepest point of the recursion can grow to O(n^2) in the worst case.
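
For comparison, here is a sketch of a recursive variant that keeps the first occurrence of each word instead, by passing along a set of the words seen so far. It works on a list of words rather than a string; the function name and the seen parameter are assumptions made for this example, not part of the method above.

```python
def remove_duplicates_keep_first(words, seen=None):
    # 'seen' collects the words already kept by earlier calls
    if seen is None:
        seen = set()
    # Base case: nothing left to process
    if not words:
        return []
    head, rest = words[0], words[1:]
    if head in seen:
        # Duplicate of an earlier word: skip it
        return remove_duplicates_keep_first(rest, seen)
    seen.add(head)
    return [head] + remove_duplicates_keep_first(rest, seen)

sentence = "Python is great and Java is also great"
print(" ".join(remove_duplicates_keep_first(sentence.split())))
```

Like the version above, this recurses once per word, so very long sentences can hit Python's recursion limit.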

Method 8: Using reduce()

  1. The remove_duplicates function takes an input string and splits it into a list of words using the split() method. This takes O(n) time, where n is the length of the input string.
  2. An empty list serves as the initial accumulator for the unique words.
  3. The reduce() function from the functools module walks over the list of words, calling the lambda function once per word.
  4. The lambda function checks whether the word y is already in the accumulator list x, and returns either x unchanged or a new list x + [y] with the word appended.
  5. Finally, the function returns a string joined from the list of unique words using the join() method. This takes O(n) time, where n is the length of the output string.

Python3




from functools import reduce
 
def remove_duplicates(input_str):
    words = input_str.split()
    unique_words = reduce(lambda x, y: x if y in x else x + [y], words, [])
    return ' '.join(unique_words)
 
input_str = 'Python is great and Java is also great'
print(remove_duplicates(input_str))
#This code is contributed by Vinay Pinjala.


Output

Python is great and Java also

Time complexity: O(n^2), where n is the number of words in the input string. reduce() visits each word once, and for each word the lambda checks whether it already exists in the list of unique words, which takes O(n) time in the worst case.

Auxiliary space: O(n), because the function stores all the unique words in the output list. In the worst case, when there are no duplicates in the input string, the output list is as large as the input list.



Last Updated : 18 Mar, 2023