
How to Remove Duplicate Elements from NumPy Array

Last Updated : 25 Sep, 2023

In this article, we will see how to remove duplicate elements from a NumPy array. We will cover both 1-D arrays, where duplicate values are removed, and 2-D arrays, where duplicate rows are removed.

Input1:  [1 2 3 4 5 1 2 3 1 2 9]
Output1: [1 2 3 4 5 9]
Explanation: In this example, we have removed duplicate elements from our input NumPy Array
Input2: [ [1 2 3 4 5]
[0 1 2 9 8]
[1 4 9 8 5]
[1 2 3 4 5]
[0 1 2 9 8] ]
Output2: [[0 1 2 9 8]
[1 2 3 4 5]
[1 4 9 8 5]]
Explanation: In this example, we have removed duplicate rows from the 2-D NumPy Array

Remove Duplicate Element from 1-D NumPy Array

Below are the methods by which we can remove duplicate elements from 1-D NumPy Array in Python:

Remove Duplicate Element using Python set()

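A Python set() stores each value only once, so converting an array to a set eliminates duplicate elements. In this example, we use this property to remove the duplicates from our 1-D NumPy Array. Note that a set does not guarantee any particular order of its elements, so the order of the result should not be relied upon.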

Python3




import numpy as np
 
#declaring original array
org = np.array([1, 2, 3, 4, 5, 1, 2, 3, 1, 2 , 9])
 
#displaying the original array
print("Original Array : ")
print(org,"\n")
 
#using the set operation
new = set(org)
new = np.array(list(new))
 
#displaying the new array with updated/unique elements
print("New Array : ")
print(new)


Output

Original Array : 
[1 2 3 4 5 1 2 3 1 2 9]
New Array :
[1 2 3 4 5 9]

Time complexity: O(n), where n is the number of elements
Auxiliary Space: O(n), where n is the number of elements
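
When the order of first appearance matters, one possible variant is to use dictionary keys instead of a set. The following is a minimal sketch, assuming Python 3.7 or later (where dictionaries keep insertion order); the sample array is chosen only for illustration.

```python
import numpy as np

org = np.array([9, 1, 2, 9, 3, 1])

# dict keys are unique and keep insertion order (Python 3.7+),
# so the first occurrence of each value is kept.
# .tolist() converts the NumPy scalars to plain Python ints first.
new = np.array(list(dict.fromkeys(org.tolist())))

print(new)  # [9 1 2 3]
```

Like the set() approach, this runs in O(n) time on average, but the result follows the order of the input.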

Remove Duplicate Elements using numpy unique()

In this example, we are using numpy.unique() function to remove the duplicate elements from the NumPy Array. It returns the unique elements in sorted order.

Python3




import numpy as np
 
#declaring original array
org = np.array([1, 2, 3, 4, 5, 1, 2, 3, 1, 2 , 9])
 
#displaying the original array
print("Original Array : ")
print(org,"\n")
 
#using the .unique()
new = np.unique(org)
 
#displaying the new array with updated/unique elements
print("New Array : ")
print(new)


Output

Original Array : 
[1 2 3 4 5 1 2 3 1 2 9]
New Array :
[1 2 3 4 5 9]

Time complexity: O(N*log N)
Auxiliary Space: O(N)
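
numpy.unique() can also report where each unique value first occurs and how often it appears, through its return_index and return_counts parameters. The sketch below uses an illustrative array and shows one way to recover the order of first appearance from the sorted result.

```python
import numpy as np

org = np.array([9, 1, 2, 9, 3, 1])

# np.unique sorts its result; return_index gives the position of the
# first occurrence of each unique value in the original array.
values, first_idx, counts = np.unique(org, return_index=True, return_counts=True)
print(values.tolist())   # [1, 2, 3, 9]
print(counts.tolist())   # [2, 1, 1, 2]

# Sorting those positions restores first-occurrence order.
in_order = org[np.sort(first_idx)]
print(in_order.tolist())  # [9, 1, 2, 3]
```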

Remove Duplicate Elements using iteration

In this example, we are using iteration to remove duplicate elements from NumPy Array.

Python3




import numpy as np
 
#declaring original array
org = np.array([1, 2, 3, 4, 5, 1, 2, 3, 1, 2 , 9])
 
#displaying the original array
print("Original Array : ")
print(org,"\n")
 
l = []
for i in org:
    if i not in l:
        l.append(i)
 
new = np.array(l)
 
#displaying the new array with updated/unique elements
print("New Array : ")
print(new)


Output

Original Array : 
[1 2 3 4 5 1 2 3 1 2 9]
New Array :
[1 2 3 4 5 9]


Time complexity : O(N*M) where N is the number of elements and M is the number of unique elements in the array.
Auxiliary Space : O(N) where N is the number of elements in the array.
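
The membership test on a list is what makes the loop cost O(N*M). As a sketch of one possible refinement (the names seen and kept are illustrative), the values already emitted can be tracked in a set, whose membership test is O(1) on average, while a list keeps the result in order.

```python
import numpy as np

org = np.array([1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 9])

seen = set()   # values already emitted (O(1) average membership test)
kept = []      # unique values in first-occurrence order
for value in org.tolist():
    if value not in seen:
        seen.add(value)
        kept.append(value)

new = np.array(kept)
print(new)  # [1 2 3 4 5 9]
```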

Remove Duplicate Element in 2-D NumPy Array

Below are the methods by which we can remove duplicate rows from a 2-D NumPy Array:

Remove Duplicate Element Using np.unique()

In this example, we are using numpy.unique() function to remove the duplicate rows from the NumPy Array. It works the same way as for a 1-D NumPy Array, except that we have to specify that whole rows should be compared. We do that by passing axis=0, so that only duplicate rows are removed and the column data within each row is left intact. The resulting rows are returned in sorted order.

Python3




import numpy as np
 
org = np.array([
    [1,2,3,4,5],
    [0,1,2,9,8],
    [1,4,9,8,5],
    [1,2,3,4,5],
    [0,1,2,9,8]])
     
#displaying the original array
print("Original Array : ")
print(org,"\n")
     
#using the .unique() with axis=0 to compare whole rows
new = np.unique(org,axis=0)
 
#displaying the new array with updated/unique elements
print("New Array : ")
print(new)


Output

Original Array : 
[[1 2 3 4 5]
 [0 1 2 9 8]
 [1 4 9 8 5]
 [1 2 3 4 5]
 [0 1 2 9 8]]
New Array :
[[0 1 2 9 8]
 [1 2 3 4 5]
 [1 4 9 8 5]]


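Time complexity : O(N*M*log N) where N is the number of rows and M is the number of columns, since the N rows are sorted and each row comparison can inspect up to M elements.
Auxiliary Space : O(N*M)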
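
To make the role of the axis parameter concrete, the following sketch (with an illustrative array that has a repeated column) contrasts axis=1, which removes duplicate columns, with the default behaviour, which flattens the array and returns unique scalars.

```python
import numpy as np

org = np.array([
    [1, 2, 1, 3],
    [4, 5, 4, 6],
    [7, 8, 7, 9]])

# axis=1 treats each column as one item, so the repeated column
# (first and third columns are identical) is dropped.
unique_cols = np.unique(org, axis=1)
print(unique_cols.tolist())  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Without axis, the array is flattened and unique scalars are returned.
flat = np.unique(org)
print(flat.tolist())  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```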

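Remove Duplicate Element in 2-D NumPy Array Using set()

In this example, we are using set() as we did for the 1-D array. A NumPy row is not hashable, so each row is first converted to a tuple, which keeps the order of the elements inside the row. A frozenset() is not suitable here, because it discards that order and any repeated values within a row, so the rows returned may differ from the original rows. The set records the rows already seen, and a list keeps the unique rows in their order of first appearance.

Python3




import numpy as np
 
org = np.array([
    [1,2,3,4,5],
    [0,1,2,9,8],
    [1,4,9,8,5],
    [1,2,3,4,5],
    [0,1,2,9,8]])
     
#displaying the original array
print("Original Array : ")
print(org,"\n")
     
# Track the rows seen so far as tuples (hashable and order-preserving)
seen = set()
rows = []
for i in org:
    key = tuple(i)
    if key not in seen:
        seen.add(key)
        rows.append(i)
 
# Convert the list of unique rows back to NumPy array
new = np.array(rows)
 
#displaying the new array with updated/unique elements
print("New Array : ")
print(new)


Output

Original Array : 
[[1 2 3 4 5]
 [0 1 2 9 8]
 [1 4 9 8 5]
 [1 2 3 4 5]
 [0 1 2 9 8]]

New Array :
[[1 2 3 4 5]
 [0 1 2 9 8]
 [1 4 9 8 5]]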


Time complexity : O(N*M) where N is the number of rows and M is the number of columns in the original array.
Auxiliary Space : O(N*M) where N is the number of rows and M is the number of columns in the original array.
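
The choice of hashable key matters when rows contain the same values in a different order. The sketch below, using an illustrative array, keys each row by a tuple and collects the unique rows with dict.fromkeys (insertion-ordered in Python 3.7+); it also shows why a frozenset key would treat two different rows as equal.

```python
import numpy as np

org = np.array([
    [1, 2, 3],
    [3, 2, 1],
    [1, 2, 3]])

# tuple(row) is hashable and keeps the order of elements inside a row,
# so [1, 2, 3] and [3, 2, 1] stay distinct rows.
unique_rows = np.array(list(dict.fromkeys(map(tuple, org.tolist()))))
print(unique_rows.tolist())  # [[1, 2, 3], [3, 2, 1]]

# A frozenset ignores element order, so both rows collapse to one key.
same_key = frozenset([1, 2, 3]) == frozenset([3, 2, 1])
print(same_key)  # True
```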

Remove Duplicate Element in 2-D NumPy Array Using iteration

We can also use iteration, as we did in the 1-D part. It is the simplest, naive approach: we iterate through the rows of the 2-D NumPy array and keep a row only if it has not been seen before, so every row that appears more than once is kept a single time.

Python3




import numpy as np
 
org = np.array([
    [1,2,3,4,5],
    [0,1,2,9,8],
    [1,4,9,8,5],
    [1,2,3,4,5],
    [0,1,2,9,8]])
     
#displaying the original array
print("Original Array : ")
print(org,"\n")
     
new = [] #defining a new array    
     
#iterating through each element of org array
for i in org:
    if list(i) not in new:
        new.append(list(i))
 
new = np.array(new)
 
#displaying the new array with updated/unique elements
print("New Array : ")
print(new)


Output

Original Array : 
[[1 2 3 4 5]
 [0 1 2 9 8]
 [1 4 9 8 5]
 [1 2 3 4 5]
 [0 1 2 9 8]]
New Array :
[[1 2 3 4 5]
 [0 1 2 9 8]
 [1 4 9 8 5]]


Time complexity : O(N*M*R) where N is the number of rows, M is the number of columns and R is the number of unique rows in array.
Auxiliary Space : O(N*M) where N is the number of rows and M is the number of columns in the original array.

Remove Duplicate Element Using numpy.lexsort() and np.diff()

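We can use numpy.lexsort() and np.diff() to eliminate the repeated rows. numpy.lexsort() performs an indirect sort on multiple keys and returns the array of indices that sorts them, with the last key acting as the primary sort key. These indices can then be used to obtain the sorted array, in which identical rows are adjacent. np.diff() computes the discrete difference between consecutive elements along the given axis, so a row whose difference from the previous row is all zeros is a duplicate and can be dropped.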

Python3




import numpy as np
 
org = np.array([
    [1,2,3,4,5],
    [0,1,2,9,8],
    [1,4,9,8,5],
    [1,2,3,4,5],
    [0,1,2,9,8]])
     
#displaying the original array
print("Original Array : ")
print(org,"\n")
     
new = np.lexsort(org.T) #passing the transpose of org, so each column of org is a sort key
 
 
new01 =  org[new,:]
#reorder the rows of org using the indices in "new", so that equal rows become adjacent
 
x = np.concatenate(([True], np.any(np.diff(new01, axis=0), axis=1)))
result=np.array(new01[x])
 
#displaying the new array with updated/unique elements
print("Result Array : ")
print(result)


Output

Original Array : 
[[1 2 3 4 5]
 [0 1 2 9 8]
 [1 4 9 8 5]
 [1 2 3 4 5]
 [0 1 2 9 8]]
Result Array :
[[1 2 3 4 5]
 [1 4 9 8 5]
 [0 1 2 9 8]]

Time complexity : O(M * N * log(N)) where N is the number of rows and M is the number of columns in the array.
Auxiliary Space : O(N*M) where N is the number of rows and M is the number of columns in the original array.
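
The individual steps can be easier to follow on a small array. The sketch below uses an illustrative 4x2 array and prints the intermediate results; reversing the transposed keys is an optional choice made here so that the first column becomes the primary sort key.

```python
import numpy as np

rows = np.array([
    [2, 1],
    [1, 9],
    [2, 1],
    [1, 3]])

# np.lexsort treats the LAST key as the primary sort key. Reversing the
# transposed columns makes column 0 primary, i.e. ordinary row order.
order = np.lexsort(rows.T[::-1])
sorted_rows = rows[order]
print(sorted_rows.tolist())  # [[1, 3], [1, 9], [2, 1], [2, 1]]

# The difference along axis 0 is all zeros exactly where a row
# repeats the row before it.
changed = np.any(np.diff(sorted_rows, axis=0) != 0, axis=1)
mask = np.concatenate(([True], changed))   # always keep the first row
print(mask.tolist())         # [True, True, True, False]

result = sorted_rows[mask]
print(result.tolist())       # [[1, 3], [1, 9], [2, 1]]
```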


