Python | Percentage similarity of lists

When working with Python lists, we sometimes need to measure how similar one list is to another. A common measure is the number of distinct elements the two lists share, divided by the number of distinct elements across both lists, expressed as a percentage. Let’s discuss several ways to compute it.

Method #1: Using “|” operator + “&” operator + set()

The standard way to calculate the similarity between two lists is to convert them to sets, count the common elements (intersection) and the distinct elements overall (union), and compute their quotient. The result is then multiplied by 100 to get the percentage.

Python3




# Python3 code to demonstrate working of
# Percentage similarity of lists
# using "|" operator + "&" operator + set()
 
# initialize lists
test_list1 = [1, 4, 6, 8, 9, 10, 7]
test_list2 = [7, 11, 12, 8, 9]
 
# printing original lists
print("The original list 1 is : " + str(test_list1))
print("The original list 2 is : " + str(test_list2))
 
# Percentage similarity of lists
# using "|" operator + "&" operator + set()
res = len(set(test_list1) & set(test_list2)) / len(set(test_list1) | set(test_list2)) * 100
 
# printing result
print("Percentage similarity among lists is : " + str(res))


Output

The original list 1 is : [1, 4, 6, 8, 9, 10, 7]
The original list 2 is : [7, 11, 12, 8, 9]
Percentage similarity among lists is : 33.33333333333333

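Time Complexity: O(n + m) on average, where n and m are the lengths of the two lists. Building the sets and computing their intersection and union each take linear time.
Auxiliary Space: O(n + m), since the sets built from the lists, and their union, can hold every element of both lists.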
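
One pitfall is worth illustrating here: the set operators & and | are not interchangeable with the boolean keywords and / or. The short sketch below reuses the sample lists to show the difference; the variable names are illustrative only.

```python
# Compare the set operators with the boolean keywords on the sample lists.
a = set([1, 4, 6, 8, 9, 10, 7])
b = set([7, 11, 12, 8, 9])

intersection = a & b   # elements present in both sets
union = a | b          # elements present in either set

# `and` / `or` do not combine sets: they return one operand unchanged.
and_result = a and b   # a is non-empty (truthy), so this is simply b
or_result = a or b     # a is non-empty (truthy), so this is simply a

print(sorted(intersection))              # [7, 8, 9]
print(len(union))                        # 9
print(and_result is b, or_result is a)   # True True
```

With the keywords, the ratio would compare the sizes of the two original sets (5 / 7 here) rather than the intersection and union (3 / 9).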

Method #2: Using set() + intersection() + union()

Find the intersection and union of the two lists as sets, then calculate the percentage from their sizes.

Step-by-step approach:

  • First, define the two original lists. 
  • Use the set data type to find the number of common elements in both lists.
  • Do this by taking the intersection of the two sets, and finding its length.
  • Use the set data type again to find the total number of unique elements in both lists. 
  • Do this by taking the union of the two sets, and finding its length.
  • Calculate the percentage similarity by dividing the number of common elements by the total number of unique elements, and multiplying the result by 100. 
  • Use the format method to print the result with two decimal places. The resulting output shows that the two lists have a similarity of 33.33%.

Python3




# Define the two original lists
original_list1 = [1, 4, 6, 8, 9, 10, 7]
original_list2 = [7, 11, 12, 8, 9]
 
# Find the number of common elements in both lists
common_elements = set(original_list1).intersection(set(original_list2))
num_common_elements = len(common_elements)
 
# Find the total number of unique elements in both lists
total_elements = set(original_list1).union(set(original_list2))
num_total_elements = len(total_elements)
 
# Calculate the percentage similarity
percentage_similarity = (num_common_elements / num_total_elements) * 100
 
# Print the result
print("Percentage similarity among lists is : {:.2f}".format(percentage_similarity))


Output

Percentage similarity among lists is : 33.33

Time complexity: O(n), where n is the total number of elements in both lists. Creating the sets takes O(n) time, and the intersection and union operations are also linear in the sizes of the sets on average. The final arithmetic takes constant time. Therefore, the overall time complexity is O(n).

Auxiliary Space: O(n), where n is the total number of elements in both lists. The sets built from the lists, along with their intersection and union, can together hold a number of elements proportional to n. The variables that store the counts take constant space. Therefore, the overall space complexity is O(n).
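
The same calculation can be wrapped in a reusable function. This is a minimal sketch, not part of the original method: the function name percentage_similarity is hypothetical, and returning 0.0 for two empty lists is an assumption made only to avoid a division by zero.

```python
def percentage_similarity(list1, list2):
    """Return the share of distinct elements common to both lists, as a percentage."""
    set1, set2 = set(list1), set(list2)
    union = set1 | set2
    if not union:
        # Assumption: two empty lists are treated as 0% similar,
        # which avoids dividing by zero.
        return 0.0
    return len(set1 & set2) / len(union) * 100


print("{:.2f}".format(percentage_similarity([1, 4, 6, 8, 9, 10, 7], [7, 11, 12, 8, 9])))  # 33.33
print(percentage_similarity([], []))  # 0.0
```

This ratio of intersection size to union size is also known as the Jaccard index.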

Method #3: Using Counter() from collections module

Step-by-step approach:

  • Import Counter from collections module.
  • Initialize one Counter object for each input list.
  • Calculate the intersection of the two Counter objects (the & operator) to get a new Counter that holds, for each common element, the smaller of its two counts.
  • Expand the resulting Counter into a list with elements().
  • Calculate the percentage similarity as the ratio of the length of the common elements list to the length of the set of all elements in both input lists. If the lists contain duplicates, repeated common elements are counted more than once, so the result can differ from the set-based methods.
  • Print the percentage similarity

Python3




from collections import Counter
 
original_list1 = [1, 4, 6, 8, 9, 10, 7]
original_list2 = [7, 11, 12, 8, 9]
 
counter1 = Counter(original_list1)
counter2 = Counter(original_list2)
 
common_elements = list((counter1 & counter2).elements())
percentage_similarity = len(common_elements) / len(set(original_list1 + original_list2)) * 100
 
print("Percentage similarity among lists is : {:.2f}".format(percentage_similarity))
# This code is contributed by Jyothi pinjala


Output

Percentage similarity among lists is : 33.33

Time Complexity:
The time complexity of this algorithm is O(n), where n is the total number of elements in both input lists. The Counter objects are created in O(n) time and the intersection operation also takes O(n) time in the worst case.

Auxiliary Space:
The space complexity of this algorithm is O(n), where n is the total number of elements in both input lists. The Counter objects are created to store the count of each element in each input list, and the common elements list can have a maximum of n elements.
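
If repeated elements should count consistently in both the numerator and the denominator, one option is a multiset version of the same ratio. The sketch below is an assumption about what might be wanted, not part of the method above: the function name multiset_similarity is hypothetical, and it relies on Counter supporting & (minimum of counts) and | (maximum of counts).

```python
from collections import Counter


def multiset_similarity(list1, list2):
    """Percentage overlap that respects how often each element occurs."""
    c1, c2 = Counter(list1), Counter(list2)
    shared = sum((c1 & c2).values())  # minimum count of each element
    total = sum((c1 | c2).values())   # maximum count of each element
    # Assumption: two empty lists count as 0% similar.
    return shared / total * 100 if total else 0.0


print(multiset_similarity([1, 1, 2], [1, 1, 3]))  # 50.0
```

For lists without duplicates, such as the sample lists used in this article, this gives the same value as the set-based methods.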

Method #4: Using list comprehension and set()

  • Define the two original lists
  • Create a list comprehension to get all the common elements in both lists
  • Find the total number of unique elements in both lists using set union
  • Calculate the percentage similarity using the common elements count and total elements count
  • Print the result

Python3




# Define the two original lists
original_list1 = [1, 4, 6, 8, 9, 10, 7]
original_list2 = [7, 11, 12, 8, 9]
 
# Create a list comprehension to get all the common elements in both lists
common_elements = [element for element in original_list1 if element in original_list2]
 
# Find the total number of unique elements in both lists using set union
total_elements = set(original_list1).union(set(original_list2))
num_total_elements = len(total_elements)
 
# Calculate the percentage similarity using the common elements count and total elements count
percentage_similarity = (len(common_elements) / num_total_elements) * 100
 
# Print the result
print("Percentage similarity among lists is : {:.2f}".format(percentage_similarity))


Output

Percentage similarity among lists is : 33.33

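Time complexity: O(n*m), where n and m are the lengths of the two lists, because each membership test against original_list2 scans that list. Converting original_list2 to a set first reduces this to O(n+m) on average.
Auxiliary space: O(p), where p is the number of unique elements in both lists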
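
A variant of this approach keeps the list comprehension but tests membership against a set. It is a sketch under stated assumptions: the function name similarity_with_set_lookup is hypothetical, and iterating over the distinct values of the first list is a deliberate change so that duplicates are not counted more than once.

```python
def similarity_with_set_lookup(list1, list2):
    """Percentage similarity using a comprehension with set-based membership tests."""
    set1, set2 = set(list1), set(list2)
    # Iterating over set1 (not list1) counts each common value only once,
    # and `element in set2` is an average constant-time lookup.
    common = [element for element in set1 if element in set2]
    total = len(set1 | set2)
    # Assumption: two empty lists count as 0% similar.
    return len(common) / total * 100 if total else 0.0


print("{:.2f}".format(similarity_with_set_lookup([1, 4, 6, 8, 9, 10, 7], [7, 11, 12, 8, 9])))  # 33.33
print("{:.2f}".format(similarity_with_set_lookup([7, 7, 7], [7])))  # 100.00
```

With the plain list comprehension, the second call would count the value 7 three times and report 300.00, which is why deduplicating first is the safer choice when inputs may contain repeats.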



Last Updated : 10 May, 2023