Array range queries for searching an element

Given an array of N elements and Q queries of the form L R X, determine for each query whether the element X exists in the array between the indices L and R (inclusive).

Prerequisite : Mo’s Algorithm

Examples :

Input : N = 5
        arr = [1, 1, 5, 4, 5]
        Q = 3
        1 3 2
        2 5 1
        3 5 5         
Output : No
         Yes
         Yes
Explanation (indices in this example are 1-based) :
For the first query, 2 does not exist between the indices 1 and 3.
For the second query, 1 exists between the indices 2 and 5.
For the third query, 5 exists between the indices 3 and 5.

Naive Approach :
The naive method would be to traverse the elements from L to R for each query, linearly searching for X. In the worst case, there can be N elements from L to R, hence the worst case time complexity for each query would be O(N). Therefore, for all the Q queries, the time complexity would turn out to be O(Q*N).
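
As a point of reference, the linear scan described above can be sketched as follows. This is only a minimal illustration; the function name existsInRange and the use of 0-based inclusive indices are assumptions made for this sketch.

```cpp
#include <cassert>
#include <vector>

// Naive check: scan a[l..r] (0-based, inclusive) for the value x.
// Costs O(r - l + 1) per query, i.e. O(N) in the worst case.
// existsInRange is a hypothetical name used only for this sketch.
bool existsInRange(const std::vector<int>& a, int l, int r, int x)
{
    // Precondition: the range must lie inside the array.
    assert(0 <= l && l <= r && r < static_cast<int>(a.size()));

    for (int i = l; i <= r; i++)
        if (a[i] == x)
            return true;
    return false;
}
```

Calling existsInRange once per query gives the O(Q*N) total mentioned above.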

Using Union-Find Method :
This method checks only one element from each run of consecutive equal values. If X is not equal to that value, the algorithm skips the remaining elements of the run and continues the traversal with the next different element. It is therefore useful only when the array contains long runs of consecutive equal elements; if no two adjacent elements are equal, each query still takes O(N) time.



Algorithm :

  1. Merge all the consecutive equal elements in one group.
  2. While processing a query, start from R. Let index = R.
  3. Compare a[index] with X. If they are equal, then print “Yes”
    and break out of traversing the rest of the range. Else, skip all
    the consecutive elements belonging to the group of a[index]. Index
    becomes equal to one less than the index of the root of this group.

  4. Continue the above step either till X is found or till
    index becomes less than L.

  5. If index becomes less than L, print “No”.
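
Because every group is a contiguous run, the same steps can also be sketched without a union-find structure by storing, for each index, the leftmost index of its run. The sketch below swaps the union-find for such a plain array; the names buildRunStart, runStart and existsSkippingRuns are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <vector>

// runStart[i] = leftmost index of the run of equal values containing i.
// This plays the role of the "root" of the group in the steps above.
std::vector<int> buildRunStart(const std::vector<int>& a)
{
    std::vector<int> runStart(a.size());
    for (int i = 0; i < static_cast<int>(a.size()); i++)
        runStart[i] = (i > 0 && a[i] == a[i - 1]) ? runStart[i - 1] : i;
    return runStart;
}

// Walks from r down to l (0-based, inclusive), testing one element per run.
bool existsSkippingRuns(const std::vector<int>& a,
                        const std::vector<int>& runStart,
                        int l, int r, int x)
{
    assert(a.size() == runStart.size());
    assert(0 <= l && l <= r && r < static_cast<int>(a.size()));

    int index = r;
    while (index >= l) {
        if (a[index] == x)
            return true;
        // Skip the rest of this run: jump just left of its first index.
        index = runStart[index] - 1;
    }
    return false;
}
```

For the array [1, 1, 5, 4, 5], the run starts are [0, 0, 2, 3, 4], so only indices 0 and 1 share a group.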

Below is the C++ implementation of the above idea.

// Program to determine if the element
// exists for different range queries
#include <bits/stdc++.h>
using namespace std;

// Structure to represent a query range
struct Query
{
    int L, R, X;
};

const int maxn = 100;

int root[maxn];

// Find the root of the group containing
// the element at index x
int find(int x)
{
    return x == root[x] ? x : root[x] =
                find(root[x]);
}

// merge the two groups containing elements
// at indices x and y into one group
void uni(int x, int y)
{
    int p = find(x), q = find(y);
    if (p != q) {
        root[p] = root[q];
    }
}

void initialize(int a[], int n, Query q[], int m)
{
    // make n subsets with every
    // element as its root
    for (int i = 0; i < n; i++)
        root[i] = i;

    // consecutive elements equal in value are
    // merged into one single group
    for (int i = 1; i < n; i++)
        if (a[i] == a[i - 1])
            uni(i, i - 1);
}

// Driver code
int main()
{
    int a[] = { 1, 1, 5, 4, 5 };
    int n = sizeof(a) / sizeof(a[0]);
    Query q[] = { { 0, 2, 2 }, { 1, 4, 1 },
                  { 2, 4, 5 } };
    int m = sizeof(q) / sizeof(q[0]);
    initialize(a, n, q, m);

    for (int i = 0; i < m; i++)
    {
        int flag = 0;
        int l = q[i].L, r = q[i].R, x = q[i].X;
        int p = r;

        while (p >= l)
        {

            // check if the current element in
            // consideration is equal to x or not
            // if it is equal, then x exists in the range
            if (a[p] == x)
            {
                flag = 1;
                break;
            }
            p = find(p) - 1;
        }

        // Print if x exists or not
        if (flag != 0)
            cout << x << " exists between [" << l 
                 << ", " << r << "] " << endl;
        else
            cout << x << " does not exist between [" 
                << l << ", " << r  << "] " << endl;
    }
}
Output:

2 does not exist between [0, 2] 
1 exists between [1, 4] 
5 exists between [2, 4]

Efficient Approach(Using Mo’s Algorithm) :
Mo’s algorithm is one of the finest applications of square root decomposition.
It is based on the idea of using the answer to the previous query to compute the answer to the current query. This works because, if F([L, R]) is known, then F([L + 1, R]), F([L – 1, R]), F([L, R + 1]) and F([L, R – 1]) can each be computed in O(F) time, where O(F) denotes the cost of adding or removing a single element (here, one frequency update).

If the queries are answered in the order they are asked, the time complexity does not improve over the naive approach. To reduce it considerably, the queries are divided into blocks and then sorted. The exact procedure for sorting the queries is as follows :

  • Denote BLOCK_SIZE = sqrt(N)
  • All the queries with the same L/BLOCK_SIZE are put in the same block
  • Within a block, the queries are sorted based on their R values
  • The sort function thus compares two queries, Q1 and Q2 as follows:
    Q1 must come before Q2 if:
    1. L1/BLOCK_SIZE<L2/BLOCK_SIZE
    2. L1/BLOCK_SIZE=L2/BLOCK_SIZE and R1<R2
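
The ordering rule above can be sketched in isolation as follows. This is an illustrative sketch that passes the block size to a lambda instead of using a global variable; the names RangeQuery and sortForMo are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical query type for this sketch: range [L, R] and value X.
struct RangeQuery {
    int L, R, X;
};

// Sorts queries by block of L first, then by R within the same block.
// n is the size of the array the queries refer to.
void sortForMo(std::vector<RangeQuery>& queries, int n)
{
    assert(n > 0);
    // BLOCK_SIZE = sqrt(N), clamped to at least 1 to avoid division by zero.
    const int blockSize = std::max(1, static_cast<int>(std::sqrt(n)));

    std::sort(queries.begin(), queries.end(),
              [blockSize](const RangeQuery& q1, const RangeQuery& q2) {
                  const int b1 = q1.L / blockSize;
                  const int b2 = q2.L / blockSize;
                  if (b1 != b2)
                      return b1 < b2;   // rule 1: earlier block first
                  return q1.R < q2.R;   // rule 2: same block, smaller R first
              });
}
```

For example, with N = 9 (block size 3), the ranges [4, 8], [0, 7], [2, 3], [5, 6] are reordered to [2, 3], [0, 7], [5, 6], [4, 8].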

After sorting the queries, the next step is to compute the answer to the first query and then answer the rest of the queries in sorted order, so the answers are produced in that order rather than the order in which the queries were given. To determine whether a particular element exists, check the frequency of the element in that range. A non-zero frequency confirms the existence of the element in that range.
To store the frequencies of the elements, an STL map is used in the following code.
In the example given (with 0-based indices), the first query after sorting the array of queries is {0, 2, 2}. Hash the frequencies of the elements in [0, 2] and then look up the frequency of the element 2 in the map. Since 2 occurs 0 times, print “No”.
While processing the next query, which is {1, 4, 1} in this case, decrement the frequencies of the elements in the range [0, 1) and increment the frequencies of the elements in the range [3, 4]. This step gives the frequencies of the elements in [1, 4], and the map then shows that 1 exists in this range.
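
The transition between two consecutive queries can be sketched as a single helper that moves the current window and updates the frequency map. This is a minimal sketch, assuming the window is kept half-open as [currL, currR); the name moveWindow is hypothetical. Unlike a version that shrinks first, it extends the window before shrinking it, so no count ever becomes negative.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Moves the half-open window [currL, currR) so that it covers a[L..R]
// (0-based, inclusive), updating freq one element at a time.
void moveWindow(const std::vector<int>& a, std::map<int, int>& freq,
                int& currL, int& currR, int L, int R)
{
    assert(0 <= L && L <= R && R < static_cast<int>(a.size()));

    // Extend first so that counts never dip below zero.
    while (currL > L)     freq[a[--currL]]++;  // grow to the left
    while (currR <= R)    freq[a[currR++]]++;  // grow to the right
    while (currL < L)     freq[a[currL++]]--;  // shrink from the left
    while (currR > R + 1) freq[a[--currR]]--;  // shrink from the right
}
```

Starting from an empty window (currL = currR = 0) on [1, 1, 5, 4, 5], moving to [0, 2] yields the counts {1: 2, 5: 1}; moving on to [1, 4] removes a[0] and adds a[3] and a[4], giving {1: 1, 4: 1, 5: 2}.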

Time complexity :
The pre-processing part, that is, sorting the queries, takes O(m log m) time.
The index variable for R changes at most O(n * sqrt(n)) times throughout the run, and that for L changes its value at most O(m * sqrt(n)) times. Hence, processing all queries takes O(n * sqrt(n)) + O(m * sqrt(n)) = O((m + n) * sqrt(n)) index moves. Each move performs one update on the STL map, which costs O(log n), so the implementation below runs in O((m + n) * sqrt(n) * log n) time overall.

Below is the C++ implementation of the above idea :

// CPP code to determine if the element
// exists for different range queries
#include <bits/stdc++.h>

using namespace std;

// Variable to represent block size.
// This is made global, so compare() 
// of sort can use it.
int block;

// Structure to represent a query range
struct Query 
{
    int L, R, X;
};

// Function used to sort all queries so
// that all queries of same block are
// arranged together and within a block,
// queries are sorted in increasing order 
// of R values.
bool compare(Query x, Query y)
{
    // Different blocks, sort by block.
    if (x.L / block != y.L / block)
        return x.L / block < y.L / block;

    // Same block, sort by R value
    return x.R < y.R;
}

// Determines if the element is present for all
// query ranges. m is number of queries
// n is size of array a[].
void queryResults(int a[], int n, Query q[], int m)
{
    // Find block size
    block = (int)sqrt(n);

    // Sort all queries so that queries of same
    // blocks are arranged together.
    sort(q, q + m, compare);

    // Initialize current L, current R
    int currL = 0, currR = 0;

    // To store the frequencies of 
    // elements of the given range
    map<int, int> mp;

    // Traverse through all queries
    for (int i = 0; i < m; i++) {
        
        // L and R values of current range
        int L = q[i].L, R = q[i].R, X = q[i].X;

        // Decrement frequencies of extra elements
        // of previous range. For example if previous
        // range is [0, 3] and current range is [2, 5],
        // then the frequencies of a[0] and a[1] are decremented
        while (currL < L) 
        {
            mp[a[currL]]--;
            currL++;
        }

        // Increment frequencies of elements of current Range
        while (currL > L) 
        {
            mp[a[currL - 1]]++;
            currL--;
        }
        while (currR <= R) 
        {
            mp[a[currR]]++;
            currR++;
        }

        // Decrement frequencies of elements of previous
        // range.  For example when previous range is [0, 10] 
        // and current range is [3, 8], then frequencies of 
        // a[9] and a[10] are decremented
        while (currR > R + 1) 
        {
            mp[a[currR - 1]]--;
            currR--;
        }

        // Print if X exists or not
        if (mp[X] != 0)
            cout << X << " exists between [" << L
                 << ", " << R << "] " << endl;
        else
            cout << X << " does not exist between [" 
                 << L << ", " << R << "] " << endl;
    }
}

// Driver program
int main()
{
    int a[] = { 1, 1, 5, 4, 5 };
    int n = sizeof(a) / sizeof(a[0]);
    Query q[] = { { 0, 2, 2 }, { 1, 4, 1 }, { 2, 4, 5 } };
    int m = sizeof(q) / sizeof(q[0]);
    queryResults(a, n, q, m);
    return 0;
}
Output:

2 does not exist between [0, 2] 
1 exists between [1, 4] 
5 exists between [2, 4]



