Find duplicate rows in a binary matrix

Given a binary matrix (every element is either 0 or 1), print the positions of the rows that are duplicates of rows already present in the matrix.

Examples:

Input : {1, 1, 0, 1, 0, 1},
        {0, 0, 1, 0, 0, 1},
        {1, 0, 1, 1, 0, 0},
        {1, 1, 0, 1, 0, 1},
        {0, 0, 1, 0, 0, 1},
        {0, 0, 1, 0, 0, 1}

Output :
There is a duplicate row at position: 4 
There is a duplicate row at position: 5 
There is a duplicate row at position: 6 

This problem is mainly an extension of find unique rows in a binary matrix.



A Simple Solution is to traverse all rows one by one. For every row, check whether it matches any earlier row. If it does, print its position.

Time complexity : O(ROW^2 x COL)
Auxiliary Space : O(1)
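
The simple solution can be sketched as below. This is only an illustrative sketch: the function name findDuplicateRowsNaive and the use of std::vector are assumptions made for the example, not part of the original article. It returns the 1-based positions instead of printing them, so the result vector itself takes extra space beyond the O(1) used by the comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (name chosen for this sketch): returns the 1-based
// positions of rows that repeat an earlier row.
std::vector<int> findDuplicateRowsNaive(const std::vector<std::vector<int>>& mat)
{
    std::vector<int> positions;
    for (std::size_t i = 1; i < mat.size(); i++)
    {
        // Assumption: the matrix is rectangular.
        assert(mat[i].size() == mat[0].size());

        // Compare row i with every earlier row; each comparison is O(COL).
        for (std::size_t j = 0; j < i; j++)
        {
            if (mat[i] == mat[j])
            {
                positions.push_back(static_cast<int>(i) + 1);
                break; // report each duplicate row only once
            }
        }
    }
    return positions;
}
```

For the example matrix above, this function returns the positions 4, 5 and 6.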

Optimal solution using Trie
A trie is an efficient data structure for storing and retrieving data when the character set is small. Searching for a key takes time proportional to the key length.
The approach is to insert the rows of the matrix into a binary trie one by one. If a row being inserted is already present in the trie, then we know that it is a duplicate row.

Time complexity : O(ROW x COL)
Auxiliary Space : O(ROW x COL)

// C++ program to find duplicate rows
// in a binary matrix.
#include <cstdio>   // printf, NULL
  
const int MAX = 100;
  
/* A node of the binary trie */
struct Trie
{
    bool leaf;          // true if a complete row ends at this node
    Trie* children[2];
};
  
/* Allocates a new trie node with no children */
Trie* getNewTrieNode()
{
    Trie* node = new Trie;
    node->children[0] = node->children[1] = NULL;
    node->leaf = false;
    return node;
}
  
/* Inserts a row of N bits into the trie.
   Returns true if the row is new, false if it was already present. */
bool insert(Trie*& head, bool* arr, int N)
{
    Trie* curr = head;
  
    for (int i = 0; i < N; i++)
    {
        /* create a new path if it does not exist */
        if (curr->children[arr[i]] == NULL)
            curr->children[arr[i]] = getNewTrieNode();
  
        curr = curr->children[arr[i]];
    }
  
    /* if the row already exists, return false */
    if (curr->leaf)
        return false;
  
    /* mark this node as the end of a row and return true */
    return (curr->leaf = true);
}
  
void printDuplicateRows(bool mat[][MAX], int M, int N)
{
    Trie* head = getNewTrieNode();
  
    /* insert each row into the trie and check for duplicates */
    for (int i = 0; i < M; i++)
  
        // If already exists
        if (!insert(head, mat[i], N))
            printf("There is a duplicate row"
                  " at position: %d \n", i+1);
  
}
  
/* driver function to test the above code */
int main()
{
    bool mat[][MAX] =
    {
        {1, 1, 0, 1, 0, 1},
        {0, 0, 1, 0, 0, 1},
        {1, 0, 1, 1, 0, 0},
        {1, 1, 0, 1, 0, 1},
        {0, 0, 1, 0, 0, 1},
        {0, 0, 1, 0, 0, 1},
    };
  
    printDuplicateRows(mat, 6, 6);
    return 0;
}


Output:

There is a duplicate row at position: 4 
There is a duplicate row at position: 5 
There is a duplicate row at position: 6 

Another approach without using a Trie (does not work for a large number of columns)
Another approach is to convert each row to its decimal equivalent and check whether a new row has the same decimal equivalent as an earlier row. If it does, it is a duplicate row. This will not work if the number of columns is large, because the value must fit in an integer type.
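
A minimal sketch of this idea follows. The function name findDuplicateRowsByValue, the 64-bit integer type and the use of std::unordered_set to remember values already seen are all assumptions made for illustration, since the article does not specify them. With a 64-bit integer the sketch is limited to at most 64 columns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical helper (name chosen for this sketch): returns the 1-based
// positions of rows whose integer value was already seen in an earlier row.
std::vector<int> findDuplicateRowsByValue(const std::vector<std::vector<int>>& mat)
{
    std::vector<int> positions;
    std::unordered_set<std::uint64_t> seen;

    for (std::size_t i = 0; i < mat.size(); i++)
    {
        // Assumptions: at most 64 columns, and all rows have equal width
        // (otherwise {0, 1} and {1} would map to the same value).
        assert(mat[i].size() <= 64);
        assert(mat[i].size() == mat[0].size());

        // Read the row as a binary number, most significant bit first.
        std::uint64_t value = 0;
        for (int bit : mat[i])
            value = (value << 1) | static_cast<std::uint64_t>(bit != 0);

        // insert() reports whether the value was newly added.
        if (!seen.insert(value).second)
            positions.push_back(static_cast<int>(i) + 1);
    }
    return positions;
}
```

For the example matrix above, the rows map to the values 53, 9, 44, 53, 9 and 9, so the function returns the positions 4, 5 and 6.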

This article is contributed by Pranav.