Python | Records Union

Last Updated : 02 May, 2023

Sometimes, while working with data, we need to combine two lists of records into one list that contains every distinct record from either list, i.e. their union. This is a common problem, and records are usually represented as tuples. Let's discuss some ways in which this problem can be solved.

Method #1: Using list comprehension

A list comprehension can perform this task in one line instead of an explicit loop. We take every element of the first list, then iterate over the second list and keep only the elements that do not already occur in the first. Concatenating the two results gives the union.

Python3

# Python3 code to demonstrate working of 
# Records Union 
# Using list comprehension 
 
# Initializing lists 
test_list1 = [('gfg', 1), ('is', 2), ('best', 3)] 
test_list2 = [('i', 3), ('love', 4), ('gfg', 1)] 
 
# printing original lists 
print("The original list 1 is : " + str(test_list1)) 
print("The original list 2 is : " + str(test_list2)) 
 
# Records Union 
# Using list comprehension 
res1 = [ele1 for ele1 in test_list1] 
res2 = [ele2 for ele2 in test_list2 if ele2 not in res1] 
res = res1 + res2 
 
# printing result 
print("The union of data records is : " + str(res)) 

Output

The original list 1 is : [('gfg', 1), ('is', 2), ('best', 3)]
The original list 2 is : [('i', 3), ('love', 4), ('gfg', 1)]
The union of data records is : [('gfg', 1), ('is', 2), ('best', 3), ('i', 3), ('love', 4)]

Time complexity: O(M*N), where M and N are the lengths of the two lists, since each of the N elements of the second list is checked against a list of M elements.
Auxiliary space: O(M+N), as the result holds at most M+N records.
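
Each not in test against a list scans that list, which becomes slow for long inputs. The following is a minimal sketch of one way to avoid this, assuming the records are hashable (tuples of hashable values are). The function name union_preserving_order is hypothetical and not part of the original method. Unlike the comprehension above, it also drops duplicates that occur within a single list.

```python
from itertools import chain


def union_preserving_order(first, second):
    """Return each distinct record once, in order of first appearance."""
    seen = set()   # records already emitted; requires hashable records
    result = []
    for record in chain(first, second):
        if record not in seen:   # average O(1) lookup in a set
            seen.add(record)
            result.append(record)
    return result


test_list1 = [('gfg', 1), ('is', 2), ('best', 3)]
test_list2 = [('i', 3), ('love', 4), ('gfg', 1)]
print(union_preserving_order(test_list1, test_list2))
# [('gfg', 1), ('is', 2), ('best', 3), ('i', 3), ('love', 4)]
```

With set lookups, the expected running time drops to O(M+N) while the original order is kept.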

Method #2: Using set.union()

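This task can also be performed more concisely using the built-in set union. We first convert each list of records to a set and then combine them using the union() function. Sets are unordered, so the order of records in the result may differ from the input lists and can vary between runs.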

Python3

# Python3 code to demonstrate working of
# Records Union using set.union()
 
# Initializing lists
test_list1 = [('gfg', 1), ('is', 2), ('best', 3)]
test_list2 = [('i', 3), ('love', 4), ('gfg', 1)]
 
# Printing original lists
print("The original list 1 is : " + str(test_list1))
print("The original list 2 is : " + str(test_list2))
 
# Records Union using set.union()
res = list(set(test_list1).union(set(test_list2)))
 
# Printing the result
print("The union of data records is : " + str(res))

Output

The original list 1 is : [('gfg', 1), ('is', 2), ('best', 3)]
The original list 2 is : [('i', 3), ('love', 4), ('gfg', 1)]
The union of data records is : [('best', 3), ('i', 3), ('gfg', 1), ('is', 2), ('love', 4)]

Time complexity: O(M+N) on average, where M and N are the lengths of the two lists, as building the sets and taking their union visits each record once.
Auxiliary space: O(M+N) for the sets and the resulting list.
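
Because a set has no defined order, the printed result of this method is not reproducible. As a small sketch, assuming the records can be compared with each other, the union can be sorted to get a stable order. The | operator used here is the operator form of union() for two sets.

```python
test_list1 = [('gfg', 1), ('is', 2), ('best', 3)]
test_list2 = [('i', 3), ('love', 4), ('gfg', 1)]

# set | set is equivalent to set.union() for two sets
union_set = set(test_list1) | set(test_list2)

# sorted() returns a list in a deterministic order
res = sorted(union_set)
print(res)
# [('best', 3), ('gfg', 1), ('i', 3), ('is', 2), ('love', 4)]
```

Sorting adds an O(K log K) step, where K is the number of distinct records.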

Method #3: Using loop and not in operator

The two lists can be combined into a single list, and its elements then copied into a new list using a loop. The not in operator checks whether an element is already present in the new list; if it is not, the element is appended.

Python3

# Python3 code to demonstrate working of
# Records Union
 
# Initializing lists
test_list1 = [('gfg', 1), ('is', 2), ('best', 3)]
test_list2 = [('i', 3), ('love', 4), ('gfg', 1)]
 
# Printing original lists
print("The original list 1 is : " + str(test_list1))
print("The original list 2 is : " + str(test_list2))
 
# Records Union
x = []
 
x.extend(test_list1)
x.extend(test_list2)
 
res = []
 
for i in x:
    if i not in res:
        res.append(i)
 
# Printing the result
print("The union of data records is : " + str(res))

Output

The original list 1 is : [('gfg', 1), ('is', 2), ('best', 3)]
The original list 2 is : [('i', 3), ('love', 4), ('gfg', 1)]
The union of data records is : [('gfg', 1), ('is', 2), ('best', 3), ('i', 3), ('love', 4)]
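
One property of the loop with not in is that it only relies on equality, so it also works when the records are not hashable, for example when they are lists instead of tuples. The sketch below illustrates this under that assumption. The helper name union_by_equality is hypothetical. The price is a quadratic running time, since every membership test scans the result list.

```python
def union_by_equality(*record_lists):
    """Union that relies only on ==, so records may be unhashable."""
    result = []
    for records in record_lists:
        for record in records:
            if record not in result:   # linear scan of result
                result.append(record)
    return result


list_records1 = [['gfg', 1], ['is', 2]]
list_records2 = [['is', 2], ['best', 3]]
print(union_by_equality(list_records1, list_records2))
# [['gfg', 1], ['is', 2], ['best', 3]]

# A set-based union cannot handle these records
try:
    set(list_records1)
except TypeError as exc:
    print("set() fails on unhashable records:", exc)
```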

Method #4: Using collections.Counter

Python3

from collections import Counter
 
# Initializing lists
test_list1 = [('gfg', 1), ('is', 2), ('best', 3)]
test_list2 = [('i', 3), ('love', 4), ('gfg', 1)]
  
# printing original lists
print("The original list 1 is : " + str(test_list1))
print("The original list 2 is : " + str(test_list2))
  
# Records Union
res = list(Counter(test_list1) | Counter(test_list2))
 
# Printing the result
print("The union of data records is : " + str(res))
 
# This code is contributed by Edula Vinay Kumar Reddy

Output

The original list 1 is : [('gfg', 1), ('is', 2), ('best', 3)]
The original list 2 is : [('i', 3), ('love', 4), ('gfg', 1)]
The union of data records is : [('gfg', 1), ('is', 2), ('best', 3), ('i', 3), ('love', 4)]

This method uses the | operator to perform a union between two Counter objects. The Counter class is a dict subclass from the collections module that counts the occurrences of elements in a list. The | operator returns a new Counter that contains every element present in either input, each with the maximum of its two counts. Converting the result with list() keeps only the distinct records and discards the counts. Because Counter is a dict subclass, the records keep the order in which they were first encountered (Python 3.7+).

Time Complexity: O(n)
Auxiliary Space: O(n)
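
To make the behaviour of | on counts concrete, here is a small sketch with a record that is repeated in one list. The sample data is made up for illustration. It contrasts | (maximum of the counts) with + (sum of the counts) and shows that list() returns only the distinct records.

```python
from collections import Counter

a = Counter([('gfg', 1), ('gfg', 1), ('is', 2)])
b = Counter([('gfg', 1), ('best', 3)])

union = a | b   # per-record maximum of the two counts
total = a + b   # per-record sum of the two counts

print(union[('gfg', 1)])   # 2
print(total[('gfg', 1)])   # 3

# list() iterates over the keys, so the counts are dropped
distinct = list(union)
print(len(distinct))       # 3
```

If the multiplicities matter, union.elements() yields each record as many times as its count.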

Method #5: Using heapq

Algorithm:

Import the heapq module.
Initialize two lists test_list1 and test_list2 with some data.
Print the original lists.
Sort each list, since heapq.merge() expects sorted inputs, and merge them using heapq.merge().
Pass the merged records to dict.fromkeys() so that each whole record becomes a key, which eliminates duplicates.
Convert the dictionary keys back to a list of tuples.
Print the resulting list.

Python3

import heapq
 
# Initializing lists
test_list1 = [('gfg', 1), ('is', 2), ('best', 3)]
test_list2 = [('i', 3), ('love', 4), ('gfg', 1)]
 
# printing original lists
print("The original list 1 is : " + str(test_list1))
print("The original list 2 is : " + str(test_list2))
 
# Records Union using heapq.merge()
# heapq.merge() expects sorted inputs, so each list is sorted first.
# dict.fromkeys() uses the whole record as the key; plain dict() would
# key on the first field only and drop records that share it.
res = list(dict.fromkeys(heapq.merge(sorted(test_list1), sorted(test_list2))))
 
# printing result
print("The union of data records is : " + str(res))
 
# This code is contributed by Rayudu.

Output

The original list 1 is : [('gfg', 1), ('is', 2), ('best', 3)]
The original list 2 is : [('i', 3), ('love', 4), ('gfg', 1)]
The union of data records is : [('best', 3), ('gfg', 1), ('i', 3), ('is', 2), ('love', 4)]

Time complexity: O(N log N), where N is the total number of records, when the input lists have to be sorted first. heapq.merge() itself takes O(N log k) for k already-sorted inputs (k = 2 here), and building the dictionary and the final list of tuples takes O(N).

Auxiliary Space: O(N), as the dictionary and the final list of tuples can contain every record from the input lists. heapq.merge() is lazy and does not build an intermediate merged list, but the dictionary still has to hold all distinct records.
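
Two details matter when heapq.merge() is combined with a dictionary: the inputs must already be sorted, and the way the dictionary is built decides what counts as a duplicate. The sketch below, using made-up sample data, compares dict(), which treats the first field of each pair as the key, with dict.fromkeys(), which uses the whole record as the key.

```python
import heapq

records1 = [('gfg', 1), ('is', 2)]
records2 = [('gfg', 5), ('is', 2)]

# Sort each input so that heapq.merge() yields a sorted stream
merged = list(heapq.merge(sorted(records1), sorted(records2)))

# dict() keys on the first field: ('gfg', 1) is overwritten by ('gfg', 5)
by_first_field = list(dict(merged).items())

# dict.fromkeys() keys on the whole record: both 'gfg' records survive
by_whole_record = list(dict.fromkeys(merged))

print(by_first_field)    # [('gfg', 5), ('is', 2)]
print(by_whole_record)   # [('gfg', 1), ('gfg', 5), ('is', 2)]
```

For a union of whole records, the second form is the one that matches the results of the set-based methods.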
