
How to Remove Duplicate Elements from NumPy Array

Last Updated : 25 Sep, 2023

This article shows how to remove duplicate elements from a 1-D NumPy array and duplicate rows from a 2-D NumPy array.

Input1:  [1 2 3 4 5 1 2 3 1 2 9]
Output1: [1 2 3 4 5 9]
Explanation: The duplicate elements have been removed from the input NumPy array.
Input2: [ [1 2 3 4 5]
[0 1 2 9 8]
[1 4 9 8 5]
[1 2 3 4 5]
[0 1 2 9 8] ]
Output2: [[0 1 2 9 8]
[1 2 3 4 5]
[1 4 9 8 5]]
Explanation: The duplicate rows have been removed from the 2-D NumPy array.

Remove Duplicate Element from 1-D NumPy Array

Below are the methods by which we can remove duplicate elements from 1-D NumPy Array in Python:

Remove Duplicate Element using Python set()

Python's set() stores only unique elements, so converting the array to a set drops the duplicates. In this example, we use this property to remove duplicate elements from a 1-D NumPy array. Note that a set is unordered, so the order of the elements in the result is not guaranteed.

Python3




import numpy as np
 
#declaring original array
org = np.array([1, 2, 3, 4, 5, 1, 2, 3, 1, 2 , 9])
 
#displaying the original array
print("Original Array : ")
print(org,"\n")
 
#using the set operation
new = set(org)
new = np.array(list(new))
 
#displaying the new array with updated/unique elements
print("New Array : ")
print(new)


Output

Original Array : 
[1 2 3 4 5 1 2 3 1 2 9]
New Array :
[1 2 3 4 5 9]

Time complexity: O(n), where n is the number of elements
Auxiliary Space: O(n), where n is the number of elements

Remove Duplicate Elements using numpy unique()

In this example, we use the numpy.unique() function to remove the duplicate elements from the NumPy array. It returns the unique elements in sorted order.

Python3




import numpy as np
 
#declaring original array
org = np.array([1, 2, 3, 4, 5, 1, 2, 3, 1, 2 , 9])
 
#displaying the original array
print("Original Array : ")
print(org,"\n")
 
#using the .unique()
new = np.unique(org)
 
#displaying the new array with updated/unique elements
print("New Array : ")
print(new)


Output

Original Array : 
[1 2 3 4 5 1 2 3 1 2 9]
New Array :
[1 2 3 4 5 9]

Time complexity: O(N*log N)
Auxiliary Space: O(N)
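
Because numpy.unique() sorts its result, the original order of appearance is lost. As a minimal sketch (the sample array and variable names here are illustrative, not part of the example above), the optional return_index and return_counts arguments can be used to recover the first-occurrence order and to count how often each value occurs:

```python
import numpy as np

org = np.array([3, 1, 2, 3, 1, 9, 2])

# Sorted unique values, index of each value's first occurrence, and counts
values, first_idx, counts = np.unique(org, return_index=True, return_counts=True)

# Sorting the first-occurrence indices restores the order of appearance
in_order = org[np.sort(first_idx)]

print(values)    # [1 2 3 9]
print(counts)    # [2 2 2 1]
print(in_order)  # [3 1 2 9]
```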

Remove Duplicate Elements using iteration

In this example, we are using iteration to remove duplicate elements from NumPy Array.

Python3




import numpy as np
 
#declaring original array
org = np.array([1, 2, 3, 4, 5, 1, 2, 3, 1, 2 , 9])
 
#displaying the original array
print("Original Array : ")
print(org,"\n")
 
l = []
for i in org:
    if i not in l:
        l.append(i)
 
new = np.array(l)
 
#displaying the new array with updated/unique elements
print("New Array : ")
print(new)


Output

Original Array : 
[1 2 3 4 5 1 2 3 1 2 9]
New Array :
[1 2 3 4 5 9]


Time complexity : O(N*M) where N is the number of elements and M is the number of unique elements in the array.
Auxiliary Space : O(N) where N is the number of elements in the array.
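
The membership test on a list is what makes the loop above O(N*M). A possible variation, sketched here with the hypothetical helper names seen and unique_vals, keeps a set of the values already encountered so that each lookup takes constant time on average while the order of first appearance is still preserved:

```python
import numpy as np

org = np.array([1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 9])

seen = set()      # values already encountered (average O(1) lookup)
unique_vals = []  # unique values in order of first appearance
for value in org.tolist():  # tolist() yields plain Python ints
    if value not in seen:
        seen.add(value)
        unique_vals.append(value)

new = np.array(unique_vals)
print(new)  # [1 2 3 4 5 9]
```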

Remove Duplicate Element in 2-D NumPy Array

Below are the methods by which we can remove duplicate rows from a 2-D NumPy array:

Remove Duplicate Element Using np.unique()

In this example, we use the numpy.unique() function to remove duplicate rows from the NumPy array. It works the same way as in the 1-D case, except that we pass axis=0 so that whole rows are compared and only duplicate rows are removed, leaving the column data within each row intact. The unique rows are returned in sorted order.

Python3




import numpy as np
 
org = np.array([
    [1,2,3,4,5],
    [0,1,2,9,8],
    [1,4,9,8,5],
    [1,2,3,4,5],
    [0,1,2,9,8]])
     
#displaying the original array
print("Original Array : ")
print(org,"\n")
     
#using the .unique() with axis=0 to compare whole rows
new = np.unique(org,axis=0)
 
#displaying the new array with updated/unique elements
print("New Array : ")
print(new)


Output

Original Array : 
[[1 2 3 4 5]
[0 1 2 9 8]
[1 4 9 8 5]
[1 2 3 4 5]
[0 1 2 9 8]]
New Array :
[[0 1 2 9 8]
[1 2 3 4 5]
[1 4 9 8 5]]


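Time complexity : O(N*M*log(N)) where N is the number of rows and M is the number of columns, since the N rows are sorted and each comparison looks at up to M elements.
Auxiliary Space : O(N*M)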
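
If the rows should stay in their order of first appearance rather than in sorted order, one option is to ask numpy.unique() for the first-occurrence indices as well. The following is a minimal sketch under that assumption; the name first_idx is illustrative:

```python
import numpy as np

org = np.array([
    [1, 2, 3, 4, 5],
    [0, 1, 2, 9, 8],
    [1, 4, 9, 8, 5],
    [1, 2, 3, 4, 5],
    [0, 1, 2, 9, 8]])

# first_idx[k] is the position in org of the first occurrence of the k-th unique row
_, first_idx = np.unique(org, axis=0, return_index=True)

# Index with the sorted positions to keep rows in order of first appearance
new = org[np.sort(first_idx)]
print(new)
```

Here the result contains the rows [1 2 3 4 5], [0 1 2 9 8] and [1 4 9 8 5], in that order.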

Remove Duplicate Element in 2-D NumPy Array Using set()

In this example, we use set() as in the 1-D case. A NumPy row is not hashable, so each row is first converted to a tuple, which is hashable and, unlike a frozenset, keeps the order of its elements. A list collects the rows in order of first appearance.

Python3




import numpy as np
 
org = np.array([
    [1,2,3,4,5],
    [0,1,2,9,8],
    [1,4,9,8,5],
    [1,2,3,4,5],
    [0,1,2,9,8]])
     
#displaying the original array
print("Original Array : ")
print(org,"\n")
     
# Track the rows already seen as tuples (hashable, order-preserving)
seen = set()
rows = []
for i in org:
    key = tuple(i)
    if key not in seen:
        seen.add(key)
        rows.append(i)
 
# Convert the collected unique rows back to a NumPy array
new = np.array(rows)
 
#displaying the new array with updated/unique elements
print("New Array : ")
print(new)


Output

Original Array : 
[[1 2 3 4 5]
[0 1 2 9 8]
[1 4 9 8 5]
[1 2 3 4 5]
[0 1 2 9 8]]

New Array :
[[1 2 3 4 5]
[0 1 2 9 8]
[1 4 9 8 5]]


Time complexity : O(N*M) where N is the number of rows and M is the number of columns in the original array.
Auxiliary Space : O(N*M) where N is the number of rows and M is the number of columns in the original array.
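
The choice of hashable key matters when rows are stored in a set. As a small illustration (the arrays a and b are made up for this sketch), a tuple keeps the element order of a row, whereas a frozenset discards both order and repeated values, so two different rows holding the same values would be treated as duplicates:

```python
import numpy as np

a = np.array([1, 4, 9, 8, 5])
b = np.array([5, 8, 9, 4, 1])  # same values as a, different order

# Tuples compare element by element, so the two rows stay distinct
tuple_equal = tuple(a.tolist()) == tuple(b.tolist())

# Frozensets ignore order (and repeated values), so the rows look identical
set_equal = frozenset(a.tolist()) == frozenset(b.tolist())

print(tuple_equal)  # False
print(set_equal)    # True
```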

Remove Duplicate Element in 2-D NumPy Array Using iteration

We can also use iteration, as in the 1-D case. This is the simplest (naive) approach: iterate through the rows of the 2-D NumPy array and keep only the first occurrence of each row.

Python3




import numpy as np
 
org = np.array([
    [1,2,3,4,5],
    [0,1,2,9,8],
    [1,4,9,8,5],
    [1,2,3,4,5],
    [0,1,2,9,8]])
     
#displaying the original array
print("Original Array : ")
print(org,"\n")
     
new = [] #defining a new array    
     
#iterating through each element of org array
for i in org:
    if list(i) not in new:
        new.append(list(i))
 
new = np.array(new)
 
#displaying the new array with updated/unique elements
print("New Array : ")
print(new)


Output

Original Array : 
[[1 2 3 4 5]
[0 1 2 9 8]
[1 4 9 8 5]
[1 2 3 4 5]
[0 1 2 9 8]]
New Array :
[[1 2 3 4 5]
[0 1 2 9 8]
[1 4 9 8 5]]


Time complexity : O(N*M*R) where N is the number of rows, M is the number of columns and R is the number of unique rows in array.
Auxiliary Space : O(N*M) where N is the number of rows and M is the number of columns in the original array.

Remove Duplicate Element Using numpy.lexsort() and np.diff()

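We can use numpy.lexsort() and np.diff() to eliminate the repeated rows. numpy.lexsort() returns the indices that sort the rows by several keys, with the last key as the primary sort key; here the keys are the columns of the array, so the last column is compared first. These indices are then used to obtain the sorted array, in which duplicate rows are adjacent. np.diff() computes the discrete difference between consecutive rows, so a row whose difference from the previous row is all zeros is a duplicate and is dropped.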

Python3




import numpy as np
 
org = np.array([
    [1,2,3,4,5],
    [0,1,2,9,8],
    [1,4,9,8,5],
    [1,2,3,4,5],
    [0,1,2,9,8]])
     
#displaying the original array
print("Original Array : ")
print(org,"\n")
     
new = np.lexsort(org.T) #passing the transpose of org so that each column is a sort key
 
 
new01 =  org[new,:]
#reorder the rows of org with the indices in "new" so that duplicate rows are adjacent
 
#keep the first row, then every row that differs from the row before it
x = np.concatenate(([True], np.any(np.diff(new01, axis=0), axis=1)))
result=np.array(new01[x])
 
#displaying the new array with updated/unique elements
print("Result Array : ")
print(result)


Output

Original Array : 
[[1 2 3 4 5]
[0 1 2 9 8]
[1 4 9 8 5]
[1 2 3 4 5]
[0 1 2 9 8]]
Result Array :
[[1 2 3 4 5]
[1 4 9 8 5]
[0 1 2 9 8]]

Time complexity : O(M * N * log(N)) where N is the number of rows and M is the number of columns in the array.
Auxiliary Space : O(N*M) where N is the number of rows and M is the number of columns in the original array.
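
To make the individual steps easier to follow, here is a minimal sketch on a smaller, made-up array. It reverses the keys passed to numpy.lexsort() so that column 0 becomes the primary key (an assumption made for readability; the example above sorts by the last column first), and the intermediate names order, changed and keep are illustrative:

```python
import numpy as np

rows = np.array([
    [2, 1],
    [1, 3],
    [2, 1],
    [1, 2]])

# lexsort treats the LAST key as primary, so reverse the columns
# to sort by column 0 first, then by column 1
order = np.lexsort(rows.T[::-1])
sorted_rows = rows[order]

# A row starts a new group if it differs from the row above it
changed = np.any(np.diff(sorted_rows, axis=0) != 0, axis=1)
keep = np.concatenate(([True], changed))

result = sorted_rows[keep]
print(order)   # [3 1 0 2]
print(result)
```

The result holds the rows [1 2], [1 3] and [2 1]; the second copy of [2 1] is dropped because its difference from the row above is all zeros.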


