Find duplicate rows in a binary matrix

  • Difficulty Level : Medium
  • Last Updated : 13 Dec, 2021

Given a binary matrix (every element is either 0 or 1), print the 1-based positions of the rows that duplicate a row appearing earlier in the matrix.
Examples: 
 

Input : {1, 1, 0, 1, 0, 1},
        {0, 0, 1, 0, 0, 1},
        {1, 0, 1, 1, 0, 0},
        {1, 1, 0, 1, 0, 1},
        {0, 0, 1, 0, 0, 1},
        {0, 0, 1, 0, 0, 1}.

Output :
There is a duplicate row at position: 4 
There is a duplicate row at position: 5 
There is a duplicate row at position: 6 

 

This problem is mainly an extension of finding unique rows in a binary matrix.
A Simple Solution is to traverse the rows one by one and, for every row, check whether it matches any earlier row. If it does, print its position.
Time complexity : O(ROW^2 x COL) 
Auxiliary Space : O(1)
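
The simple solution can be sketched as follows. This is a minimal illustration rather than part of the original article: the function name duplicateRowsBruteForce and the use of std::vector are assumptions made for the example, and the function returns the positions instead of printing them.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the article): returns the 1-based positions
// of rows that repeat an earlier row, comparing every pair of rows directly.
std::vector<int> duplicateRowsBruteForce(const std::vector<std::vector<int>>& mat)
{
    std::vector<int> positions;
    for (std::size_t i = 0; i < mat.size(); i++)
    {
        // Row i is a duplicate if any earlier row j < i has the same contents.
        for (std::size_t j = 0; j < i; j++)
        {
            if (mat[i] == mat[j])   // element-wise comparison, O(COL)
            {
                positions.push_back(static_cast<int>(i) + 1);
                break;              // report each duplicate row only once
            }
        }
    }
    return positions;
}
```

Called on the example matrix above, it returns the positions 4, 5 and 6. The two nested loops over the rows plus the O(COL) row comparison give the O(ROW^2 x COL) bound.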
Optimal solution using Trie
A Trie is an efficient data structure for storing and retrieving keys drawn from a small character set. Searching takes time proportional to the key length, which is optimal.
The approach is to insert the rows of the matrix into a binary trie one by one. If a row being inserted is already present in the trie, we know that it is a duplicate row.
 

C++




// C++ program to find duplicate rows
// in a binary matrix.
#include <cstdio>

const int MAX = 100;

/* Trie node: one child per bit value */
struct Trie
{
    bool leaf;          // true if a complete row ends at this node
    Trie* children[2];
};

/* allocate and initialise a new Trie node */
Trie* getNewTrieNode()
{
    Trie* node = new Trie;
    node->children[0] = node->children[1] = nullptr;
    node->leaf = false;
    return node;
}

/* insert a row into the Trie;
   returns false if the row was already present, true otherwise */
bool insert(Trie* head, const bool* arr, int N)
{
    Trie* curr = head;

    for (int i = 0; i < N; i++)
    {
        /* create a new path if it does not exist */
        if (curr->children[arr[i]] == nullptr)
            curr->children[arr[i]] = getNewTrieNode();

        curr = curr->children[arr[i]];
    }

    /* if the row already exists, return false */
    if (curr->leaf)
        return false;

    /* mark the final node as a leaf and return true */
    curr->leaf = true;
    return true;
}

/* release every node of the Trie */
void freeTrie(Trie* node)
{
    if (node == nullptr)
        return;
    freeTrie(node->children[0]);
    freeTrie(node->children[1]);
    delete node;
}

void printDuplicateRows(bool mat[][MAX], int M, int N)
{
    Trie* head = getNewTrieNode();

    /* insert each row into the Trie and check for duplicates */
    for (int i = 0; i < M; i++)
    {
        // insert() returns false if the row already exists
        if (!insert(head, mat[i], N))
            std::printf("There is a duplicate row"
                        " at position: %d \n", i + 1);
    }

    freeTrie(head);
}
 
/*driver function to check*/
int main()
{
    bool mat[][MAX] =
    {
        {1, 1, 0, 1, 0, 1},
        {0, 0, 1, 0, 0, 1},
        {1, 0, 1, 1, 0, 0},
        {1, 1, 0, 1, 0, 1},
        {0, 0, 1, 0, 0, 1},
        {0, 0, 1, 0, 0, 1},
    };
 
    printDuplicateRows(mat, 6, 6);
    return 0;
}

Output: 
 

There is a duplicate row at position: 4 
There is a duplicate row at position: 5 
There is a duplicate row at position: 6 

Another approach without using a Trie (does not work for a large number of columns)
Another approach is to convert each row to its decimal equivalent and check whether a new row has the same decimal equivalent as an earlier row; if so, it is a duplicate row. It does not work if the number of columns is large, because the value must fit in a fixed-width integer (for example, fewer than 32 columns for a 32-bit int).

Here is the implementation of the above approach.

C++




#include<iostream>
#include<vector>
#include<set>
using namespace std;
vector<int> repeatedRows(const vector<vector<int>>& matrix, int M, int N)
{
    // decimal equivalents of the rows seen so far
    set<int> s;

    // 0-based indices of the repeated rows
    vector<int> res;

    for (int i = 0; i < M; i++) {
        // calculate the decimal equivalent of the row;
        // column j contributes bit j, so N must be smaller
        // than the number of bits in an int
        int no = 0;
        for (int j = 0; j < N; j++) {
            no += (matrix[i][j] << j);
        }

        /*
        Rows with the same decimal equivalent are identical,
        so check in the set whether this value was seen before:
        if yes, add the row to the result;
        otherwise insert the value into the set.
        */
        if (s.find(no) != s.end()) {
            res.push_back(i);
        }
        else {
            s.insert(no);
        }
    }

    return res;
}
int main() {
 
  vector<vector<int>>matrix={
        {1, 1, 0, 1, 0, 1},
        {0, 0, 1, 0, 0, 1},
        {1, 0, 1, 1, 0, 0},
        {1, 1, 0, 1, 0, 1},
        {0, 0, 1, 0, 0, 1},
        {0, 0, 1, 0, 0, 1},};
     
  int m=matrix.size();
  int n=matrix[0].size();
  vector<int>res=repeatedRows(matrix,m,n);
  for(int e:res){
     cout<< "There is a duplicate row at position: "<<e+1 << '\n';
  }
     
   
    return 0;
}
Output
There is a duplicate row at position: 4
There is a duplicate row at position: 5
There is a duplicate row at position: 6

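Time Complexity = O(M*N + M log M), since each lookup or insertion in a std::set costs O(log M); with a hash set such as std::unordered_set it is O(M*N) on average.

Space Complexity = O(M), where M is the number of rows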
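
If the number of columns is too large for an integer, one possible workaround is to keep the same "remember what was seen" idea but use the whole row as the key. The sketch below is an illustration and not part of the original article: it swaps the integer key for a std::string of '0' and '1' characters stored in a std::unordered_set, and the function name duplicateRowsByKey is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the article): returns the 1-based positions
// of rows that repeat an earlier row, using the row itself as a string key
// so that the number of columns is not limited by an integer's width.
std::vector<int> duplicateRowsByKey(const std::vector<std::vector<int>>& mat)
{
    std::unordered_set<std::string> seen;
    std::vector<int> positions;

    for (std::size_t i = 0; i < mat.size(); i++)
    {
        // Build the key: one character per column.
        std::string key;
        key.reserve(mat[i].size());
        for (int bit : mat[i])
            key.push_back(bit ? '1' : '0');

        // insert() reports via .second whether the key was newly added;
        // false means an identical row was seen earlier.
        if (!seen.insert(key).second)
            positions.push_back(static_cast<int>(i) + 1);
    }
    return positions;
}
```

The trade-off is memory: the set stores up to M keys of N characters each, so the auxiliary space grows to O(M*N) instead of O(M), while the running time stays O(M*N) on average.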

This article is contributed by Pranav.