
Python | Percentage similarity of lists

When working with Python lists, we sometimes need to measure how similar one list is to another. A common measure is the number of distinct elements the two lists share, divided by the number of distinct elements they contain in total, expressed as a percentage. Let’s discuss a few ways in which this task can be performed.

Method #1: Using the “|” operator + “&” operator + set()

The standard way to calculate the similarity between two lists is to convert them to sets, count the common elements (the intersection) and the distinct elements overall (the union), and compute their quotient. The result is then multiplied by 100 to get the percentage.




# Python3 code to demonstrate working of
# Percentage similarity of lists
# using "|" operator + "&" operator + set()
 
# initialize lists
test_list1 = [1, 4, 6, 8, 9, 10, 7]
test_list2 = [7, 11, 12, 8, 9]
 
# printing original lists
print("The original list 1 is : " + str(test_list1))
print("The original list 2 is : " + str(test_list2))
 
# Percentage similarity of lists
# using "|" operator + "&" operator + set()
res = len(set(test_list1) & set(test_list2)) / len(set(test_list1) | set(test_list2)) * 100
 
# printing result
print("Percentage similarity among lists is : " + str(res))

Output
The original list 1 is : [1, 4, 6, 8, 9, 10, 7]
The original list 2 is : [7, 11, 12, 8, 9]
Percentage similarity among lists is : 33.33333333333333

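Time Complexity: O(m + n), where m and n are the lengths of the two lists. Each list is converted to a set once, and the intersection and union are linear in the sizes of the sets.
Auxiliary Space: O(m + n), for the sets built from the two lists.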
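
The quotient described above (shared distinct elements over all distinct elements) is the Jaccard index. A minimal sketch of it as a reusable function follows. The name jaccard_percentage is hypothetical, and treating two empty lists as 100% similar is an assumption made here to avoid dividing by zero, not something the method itself prescribes.

```python
def jaccard_percentage(first, second):
    """Return the percentage of distinct values shared by two lists."""
    set1, set2 = set(first), set(second)
    union = set1 | set2
    if not union:
        # Assumption: two empty lists are treated as identical.
        return 100.0
    return len(set1 & set2) / len(union) * 100


print(round(jaccard_percentage([1, 4, 6, 8, 9, 10, 7], [7, 11, 12, 8, 9]), 2))  # 33.33
print(jaccard_percentage([], []))  # 100.0
```

Because the lists are converted to sets, neither the order of the elements nor repeated values affect the result.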

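Method #2: Using set() + intersection() + union()

Find the intersection and the union of the two lists, then calculate the percentage from their sizes.

Step-by-step approach:

1. Convert both lists to sets and take their intersection to get the common elements.
2. Take the union of the two sets to get all distinct elements.
3. Divide the size of the intersection by the size of the union and multiply by 100.
4. Print the result formatted to two decimal places.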




# Define the two original lists
original_list1 = [1, 4, 6, 8, 9, 10, 7]
original_list2 = [7, 11, 12, 8, 9]
 
# Find the number of common elements in both lists
common_elements = set(original_list1).intersection(set(original_list2))
num_common_elements = len(common_elements)
 
# Find the total number of unique elements in both lists
total_elements = set(original_list1).union(set(original_list2))
num_total_elements = len(total_elements)
 
# Calculate the percentage similarity
percentage_similarity = (num_common_elements / num_total_elements) * 100
 
# Print the result
print("Percentage similarity among lists is : {:.2f}".format(percentage_similarity))

Output
Percentage similarity among lists is : 33.33

Time complexity: O(n), where n is the total number of elements in both lists. Creating the sets takes O(n) time, the intersection and union are linear in the sizes of the sets, and the final arithmetic takes constant time.

Auxiliary Space: O(n), where n is the total number of elements in both lists. The sets built from the lists, together with their intersection and union, can hold at most every element of the input, while the variables that store the two counts take constant space.
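
One practical difference between the method form and the operator form is that intersection() and union() accept any iterable, so the second list does not have to be converted first, whereas & and | need a set on both sides. The short sketch below illustrates this with the same sample data; the variable names are chosen for illustration only.

```python
list1 = [1, 4, 6, 8, 9, 10, 7]
list2 = [7, 11, 12, 8, 9]

# The method form accepts a plain list as its argument.
common = set(list1).intersection(list2)
combined = set(list1).union(list2)
print(sorted(common))  # [7, 8, 9]
print(len(combined))   # 9

# The operator form needs a set on both sides and raises otherwise.
operator_error = None
try:
    set(list1) & list2
except TypeError as error:
    operator_error = error
    print("TypeError:", error)
```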

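Method #3: Using Counter() from the collections module

Step-by-step approach:

1. Build a Counter for each list.
2. Intersect the two counters with & to get the common elements.
3. Divide the number of common elements by the number of elements in the union and multiply by 100.
4. Print the result formatted to two decimal places.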




from collections import Counter
 
original_list1 = [1, 4, 6, 8, 9, 10, 7]
original_list2 = [7, 11, 12, 8, 9]
 
counter1 = Counter(original_list1)
counter2 = Counter(original_list2)
 
# "&" keeps the minimum count of each value and "|" keeps the maximum count,
# so repeated values are counted the same way in both totals
common_elements = list((counter1 & counter2).elements())
percentage_similarity = len(common_elements) / sum((counter1 | counter2).values()) * 100
 
print("Percentage similarity among lists is : {:.2f}".format(percentage_similarity))
# This code is contributed by Jyothi pinjala

Output
Percentage similarity among lists is : 33.33

Time Complexity: O(n), where n is the total number of elements in both input lists. Building the Counter objects takes O(n) time, and intersecting them also takes O(n) time in the worst case.

Auxiliary Space: O(n), where n is the total number of elements in both input lists. The Counter objects store a count for each distinct element of each list, and the list of common elements holds at most n elements.
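
Counter differs from set once a value is repeated: & keeps the minimum count of each value and | keeps the maximum. The sketch below uses that to compute a similarity that counts repeats, and compares it with the set-based figure. The helper name multiset_percentage and the treatment of two empty lists as 100% similar are assumptions made for this illustration.

```python
from collections import Counter


def multiset_percentage(first, second):
    """Percentage similarity that takes repeated values into account."""
    counter1, counter2 = Counter(first), Counter(second)
    # & keeps the minimum count per value, | keeps the maximum count.
    shared = sum((counter1 & counter2).values())
    total = sum((counter1 | counter2).values())
    # Assumption: two empty lists are treated as identical.
    return shared / total * 100 if total else 100.0


a = [1, 1, 2]
b = [1, 2, 2]
print(multiset_percentage(a, b))                         # 50.0
print(len(set(a) & set(b)) / len(set(a) | set(b)) * 100)  # 100.0
```

The set-based measure reports the two lists as identical because both contain only the values 1 and 2, while the Counter-based measure also accounts for how often each value occurs.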

Method #4: Using list comprehension and set()




# Define the two original lists
original_list1 = [1, 4, 6, 8, 9, 10, 7]
original_list2 = [7, 11, 12, 8, 9]
 
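# Convert both lists to sets so that duplicates are counted once
# and each membership test takes constant time on average
set1 = set(original_list1)
set2 = set(original_list2)
 
# Use a list comprehension to collect the elements common to both lists
common_elements = [element for element in set1 if element in set2]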
 
# Find the total number of unique elements in both lists using set union
total_elements = set(original_list1).union(set(original_list2))
num_total_elements = len(total_elements)
 
# Calculate the percentage similarity using the common elements count and total elements count
percentage_similarity = (len(common_elements) / num_total_elements) * 100
 
# Print the result
print("Percentage similarity among lists is : {:.2f}".format(percentage_similarity))

Output
Percentage similarity among lists is : 33.33

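Time complexity: O(n + m), where n and m are the lengths of the two lists, provided the membership test in the list comprehension runs against a set. Testing membership against original_list2 itself scans the list each time and costs O(n * m).
Auxiliary space: O(n + m) in the worst case, for the sets and the list of common elements.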
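
As a quick sanity check, the four approaches can be run side by side on the sample lists, which contain no repeated values. This is only a sketch: the variable names are illustrative, and the Counter line uses the multiset union as its denominator, which is one possible reading of that method.

```python
from collections import Counter

list1 = [1, 4, 6, 8, 9, 10, 7]
list2 = [7, 11, 12, 8, 9]
set1, set2 = set(list1), set(list2)

# Method 1: set operators
by_operators = len(set1 & set2) / len(set1 | set2) * 100
# Method 2: set methods
by_methods = len(set1.intersection(set2)) / len(set1.union(set2)) * 100
# Method 3: Counter intersection and union
counts1, counts2 = Counter(list1), Counter(list2)
by_counter = sum((counts1 & counts2).values()) / sum((counts1 | counts2).values()) * 100
# Method 4: list comprehension with set lookups
by_comprehension = len([x for x in set1 if x in set2]) / len(set1 | set2) * 100

results = [by_operators, by_methods, by_counter, by_comprehension]
assert len(set(results)) == 1  # all four agree on this input
print("{:.2f}".format(results[0]))  # 33.33
```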
