# Count of distinct numbers in an Array in a range for Online Queries using Merge Sort Tree

• Last Updated : 26 May, 2022

Given an array arr[] of size N and Q queries of the form [L, R], the task is to find the number of distinct values in this array in the given range. Examples:

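Input: arr[] = {4, 1, 9, 1, 3, 3}, Q = {{1, 3}, {1, 5}}

Output: 3 4

Explanation: The queries here use 1-based indices. For query {1, 3}, the elements are {4, 1, 9}, so the count of distinct elements is 3. For query {1, 5}, the elements are {4, 1, 9, 1, 3}, so the count of distinct elements is 4.

Input: arr[] = {4, 2, 1, 1, 4}, Q = {{2, 4}, {3, 5}}

Output: 2 2

Explanation: For query {2, 4}, the elements are {2, 1, 1}, so the count is 2. For query {3, 5}, the elements are {1, 1, 4}, so the count is 2.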

Naive Approach: For every query, iterate over the array from L to R and insert the elements into a set. The size of the set is the number of distinct elements from L to R.

Time Complexity: O(Q * N)

Efficient Approach: The idea is to use a Merge Sort Tree to solve this problem.
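Before moving on to the tree, the naive baseline described above can be sketched as follows. The function name `countDistinctNaive` and the 0-based inclusive bounds are assumptions made for illustration, and `std::unordered_set` is used here as one possible choice of set.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the article): counts the distinct values
// in arr[l..r], where l and r are 0-based and inclusive.
int countDistinctNaive(const std::vector<int>& arr, int l, int r) {
    assert(0 <= l && l <= r && r < static_cast<int>(arr.size()));
    // Inserting the whole range into a set discards duplicates.
    std::unordered_set<int> seen(arr.begin() + l, arr.begin() + r + 1);
    return static_cast<int>(seen.size());
}
```

For arr = {4, 1, 9, 1, 3, 3}, `countDistinctNaive(arr, 0, 2)` returns 3 and `countDistinctNaive(arr, 0, 4)` returns 4. A baseline like this is also useful for cross-checking a faster solution on small inputs.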

1. For each position, we store the index of the next occurrence of its element in a temporary array.
2. Then, for every query from L to R, we count the positions in the range L to R whose value in the temporary array is greater than R.

Step 1: Take an array next_right, where next_right[i] holds the index of the next occurrence, to the right of i, of the value arr[i]. Initialize every entry to N (the length of the array).

Step 2: Build a Merge Sort Tree from the next_right array and answer the queries on it. Counting the distinct elements from L to R is equivalent to counting the positions i from L to R with next_right[i] greater than R. This works because each distinct value in the range has exactly one last occurrence inside it, and that is the only occurrence whose next_right value exceeds R.
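To make the equivalence in Step 2 concrete, here is a minimal sketch that applies the counting rule with a plain loop instead of a tree. The name `countDistinctLinear` and the 0-based inclusive bounds are assumptions for illustration; the Merge Sort Tree only speeds up this same count.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: counts positions i in [l, r] (0-based, inclusive)
// whose next occurrence lies beyond r, i.e. next_right[i] > r.
// Each such position is the last occurrence of its value inside the range.
int countDistinctLinear(const std::vector<int>& next_right, int l, int r) {
    assert(0 <= l && l <= r && r < static_cast<int>(next_right.size()));
    int count = 0;
    for (int i = l; i <= r; ++i) {
        if (next_right[i] > r) {
            ++count;
        }
    }
    return count;
}
```

For arr = {4, 1, 9, 1, 3, 3} the next_right array is {6, 3, 6, 6, 5, 6}. Calling `countDistinctLinear` on it with l = 0 and r = 4 gives 4: the first 1 (index 1) is skipped because its next occurrence (index 3) is still inside the range.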

Construction of Merge Sort Tree from given array

• We divide the current segment into two halves as long as it is longer than 1, and call the same procedure on both halves. For each segment, we store its elements in sorted order, obtained by merging the sorted lists of its two halves as in merge sort.
• The tree is a Full Binary Tree because every segment longer than 1 is always divided into two halves.
• Since the constructed tree is a full binary tree with N leaves, there are N-1 internal nodes, so the total number of nodes is 2*N - 1.

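Here is an example. Let the array be 1 5 2 6 9 4 7 1. Each row below is one level of the tree, from the root (the whole array, sorted) down to the leaves (the original elements).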

|1 1 2 4 5 6 7 9|
|1 2 5 6|1 4 7 9|
|1 5|2 6|4 9|1 7|
|1|5|2|6|9|4|7|1|
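The levels shown above can be reproduced with a short sketch that relies on `std::merge` from the standard library. The names `buildTree` and `makeMergeSortTree`, and the 1-based node numbering with children 2k and 2k + 1, are assumptions for illustration rather than a fixed convention.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Sketch: tree[node] holds the sorted values of arr[start..end].
// The root is node 1; the children of node k are 2k and 2k + 1.
void buildTree(std::vector<std::vector<int>>& tree,
               const std::vector<int>& arr, int node, int start, int end) {
    if (start == end) {
        tree[node] = {arr[start]};  // a leaf stores a single element
        return;
    }
    int mid = start + (end - start) / 2;
    buildTree(tree, arr, 2 * node, start, mid);
    buildTree(tree, arr, 2 * node + 1, mid + 1, end);
    // std::merge combines two sorted ranges into one sorted range.
    std::merge(tree[2 * node].begin(), tree[2 * node].end(),
               tree[2 * node + 1].begin(), tree[2 * node + 1].end(),
               std::back_inserter(tree[node]));
}

// Convenience wrapper; 4 * n slots is a commonly used safe upper bound
// for this node numbering.
std::vector<std::vector<int>> makeMergeSortTree(const std::vector<int>& arr) {
    assert(!arr.empty());
    std::vector<std::vector<int>> tree(4 * arr.size());
    buildTree(tree, arr, 1, 0, static_cast<int>(arr.size()) - 1);
    return tree;
}
```

With the array 1 5 2 6 9 4 7 1, node 1 of the resulting tree holds 1 1 2 4 5 6 7 9, and nodes 2 and 3 hold 1 2 5 6 and 1 4 7 9, matching the first two rows of the diagram.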

Construction of next_right array

• For every position, we store the index of the next occurrence of its element to the right.
• If the position holds the last occurrence of its element, we store N (the length of the array).

Example:

arr = [2, 3, 2, 3, 5, 6]
next_right = [2, 3, 6, 6, 6, 6]
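One way to compute this array is a single right-to-left pass that remembers where each value was last seen. The sketch below assumes 0-based indices; the names `buildNextRight` and `last_seen` are illustrative only.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical helper: next_right[i] is the index of the next occurrence
// of arr[i] to the right of i, or n if there is none.
std::vector<int> buildNextRight(const std::vector<int>& arr) {
    const int n = static_cast<int>(arr.size());
    std::vector<int> next_right(n, n);        // default: no later occurrence
    std::unordered_map<int, int> last_seen;   // value -> nearest index to the right
    for (int i = n - 1; i >= 0; --i) {
        auto it = last_seen.find(arr[i]);
        if (it != last_seen.end()) {
            next_right[i] = it->second;
        }
        last_seen[arr[i]] = i;
    }
    return next_right;
}
```

Looking the value up with `find` instead of comparing a stored index against 0 avoids any ambiguity between "not seen yet" and "seen at index 0".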

Below is the implementation of the above approach:

## C++

    // C++ implementation to find the count of distinct
    // elements in a range L to R for Q queries
    #include <algorithm>
    #include <iostream>
    #include <unordered_map>
    #include <vector>
    using namespace std;

    // Function to merge the sorted lists of the
    // left and right children into their parent
    void merge(vector<vector<int> >& tree, int treeNode)
    {
        int len1 = tree[2 * treeNode].size();
        int len2 = tree[2 * treeNode + 1].size();
        int index1 = 0, index2 = 0;

        // Fill the parent in such a way that its
        // values remain sorted, as in merge sort
        while (index1 < len1 && index2 < len2) {

            // If the element on the left part
            // is greater than the right part
            if (tree[2 * treeNode][index1]
                > tree[2 * treeNode + 1][index2]) {
                tree[treeNode].push_back(
                    tree[2 * treeNode + 1][index2]);
                index2++;
            }
            else {
                tree[treeNode].push_back(
                    tree[2 * treeNode][index1]);
                index1++;
            }
        }

        // Insert the leftover elements
        // from the left part
        while (index1 < len1) {
            tree[treeNode].push_back(
                tree[2 * treeNode][index1]);
            index1++;
        }

        // Insert the leftover elements
        // from the right part
        while (index2 < len2) {
            tree[treeNode].push_back(
                tree[2 * treeNode + 1][index2]);
            index2++;
        }
    }

    // Recursive function to build the merge sort tree
    // by merging the sorted segments in sorted order
    void build(vector<vector<int> >& tree,
               const vector<int>& arr,
               int start, int end, int treeNode)
    {
        // Base case: a leaf holds a single element
        if (start == end) {
            tree[treeNode].push_back(arr[start]);
            return;
        }
        int mid = (start + end) / 2;

        // Building the left tree
        build(tree, arr, start, mid, 2 * treeNode);

        // Building the right tree
        build(tree, arr, mid + 1, end, 2 * treeNode + 1);

        // Merges the right tree and the left tree
        merge(tree, treeNode);
    }

    // Function similar to the query() method of a
    // segment tree: counts the values greater than
    // right among positions left..right
    int query(const vector<vector<int> >& tree,
              int treeNode, int start, int end,
              int left, int right)
    {
        // Current segment is out of the range
        if (start > right || end < left) {
            return 0;
        }

        // Current segment lies completely
        // inside the range
        if (start >= left && end <= right) {

            // The elements are in sorted order, so the
            // number of elements greater than right can
            // be found by binary search (upper_bound)
            return tree[treeNode].end()
                   - upper_bound(tree[treeNode].begin(),
                                 tree[treeNode].end(), right);
        }

        int mid = (start + end) / 2;

        // Query on the left tree
        int op1 = query(tree, 2 * treeNode,
                        start, mid, left, right);

        // Query on the right tree
        int op2 = query(tree, 2 * treeNode + 1,
                        mid + 1, end, left, right);
        return op1 + op2;
    }

    // Driver Code
    int main()
    {
        vector<int> arr = { 1, 2, 1, 4, 2 };
        int n = arr.size();

        // next_right[i] defaults to n,
        // meaning no later occurrence
        vector<int> next_right(n, n);

        // Initialising the tree
        vector<vector<int> > tree(4 * n);

        // Maps a value to the nearest index
        // to the right where it occurs
        unordered_map<int, int> ump;

        // Construction of the next_right array,
        // scanning from right to left
        for (int i = n - 1; i >= 0; i--) {
            if (ump.count(arr[i])) {
                next_right[i] = ump[arr[i]];
            }
            ump[arr[i]] = i;
        }

        // Building the merge sort tree
        // from the next_right array
        build(tree, next_right, 0, n - 1, 1);

        int ans;

        // Queries use zero-based indexing.
        // Each query visits O(log N) nodes and does a
        // binary search in each, i.e. O(log^2 N) time

        // First query
        int left1 = 0;
        int right1 = 2;
        ans = query(tree, 1, 0, n - 1, left1, right1);
        cout << ans << endl;

        // Second query
        int left2 = 1;
        int right2 = 4;
        ans = query(tree, 1, 0, n - 1, left2, right2);
        cout << ans << endl;
    }

Output:

2
3

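Time Complexity: O(N log N) to build the tree and O(log² N) per query, since each query visits O(log N) nodes and runs a binary search in each of them. The total is O(N log N + Q log² N).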
