# Count of distinct numbers in an Array in a range for Online Queries using Merge Sort Tree

• Last Updated : 26 May, 2022

Given an array arr[] of size N and Q queries of the form [L, R], the task is to find the number of distinct values in this array in the given range. Examples:

Input: arr[] = {4, 1, 9, 1, 3, 3}, Q = {{1, 3}, {1, 5}} Output: 3 4 Explanation: For query {1, 3}, elements are {4, 1, 9}. Therefore, count of distinct elements = 3 For query {1, 5}, elements are {4, 1, 9, 1, 3}. Therefore, count of distinct elements = 4 Input: arr[] = {4, 2, 1, 1, 4}, Q = {{2, 4}, {3, 5}} Output: 3 2

Naive Approach: A simple solution is that for every Query, iterate array from L to R and insert elements in a set. Finally, the Size of the set gives the number of distinct elements from L to R. Time Complexity: O(Q * N) Efficient Approach: The idea is to use Merge Sort Tree to solve this problem.

1. We will store the next occurrence of the element in a temporary array.
2. Then for every query from L to R, we will find the number of elements in the temporary array whose values are greater than R in range L to R.

Step 1: Take an array next_right, where next_right[i] holds the next right index of the number i in the array a. Initialize this array as N(length of the array). Step 2: Make a Merge Sort Tree from next_right array and make queries. Queries to calculate the number of distinct elements from L to R is equivalent to find the number of elements from L to R which are greater than R.

Construction of Merge Sort Tree from given array

• We start with a segment arr[0 . . . n-1].
• Every time we divide the current segment into two halves if it has not yet become a segment of length 1. Then call the same procedure on both halves, and for each such segment, we store the sorted array in each segment as in merge sort.
• Also, the tree will be a Full Binary Tree because we always divide segments into two halves at every level.
• Since the constructed tree is always a full binary tree with n leaves, there will be N-1 internal nodes. So the total number of nodes will be 2*N – 1.

Here is an example. Say 1 5 2 6 9 4 7 1 be an array.

```|1 1 2 4 5 6 7 9|
|1 2 5 6|1 4 7 9|
|1 5|2 6|4 9|1 7|
|1|5|2|6|9|4|7|1|```

Construction of next_right array

• We store the next right occurrence of every element.
• If the element has the last occurrence then we store ‘N'(Length of the array) Example:
```arr = [2, 3, 2, 3, 5, 6];
next_right = [2, 3, 6, 6, 6, 6]```

Below is the implementation of the above approach:

## C++

 `// C++ implementation to find``// count of distinct elements``// in a range L to R for Q queries` `#include ``using` `namespace` `std;` `// Function to merge the right``// and the left tree``void` `merge(vector<``int``> tree[],``                 ``int` `treeNode)``{``    ``int` `len1 =``      ``tree[2 * treeNode].size();``    ``int` `len2 =``      ``tree[2 * treeNode + 1].size();``    ``int` `index1 = 0, index2 = 0;` `    ``// Fill this array in such a``    ``// way such that values``    ``// remain sorted similar to mergesort``    ``while` `(index1 < len1 && index2 < len2) {` `        ``// If the element on the left part``        ``// is greater than the right part``        ``if` `(tree[2 * treeNode][index1] >``              ``tree[2 * treeNode + 1][index2]) {` `            ``tree[treeNode].push_back(``                ``tree[2 * treeNode + 1][index2]``                ``);``            ``index2++;``        ``}``        ``else` `{``            ``tree[treeNode].push_back(``                ``tree[2 * treeNode][index1]``                ``);``            ``index1++;``        ``}``    ``}` `    ``// Insert the leftover elements``    ``// from the left part``    ``while` `(index1 < len1) {``        ``tree[treeNode].push_back(``            ``tree[2 * treeNode][index1]``            ``);``        ``index1++;``    ``}` `    ``// Insert the leftover elements``    ``// from the right part``    ``while` `(index2 < len2) {``        ``tree[treeNode].push_back(``            ``tree[2 * treeNode + 1][index2]``            ``);``        ``index2++;``    ``}``    ``return``;``}` `// Recursive function to build``// segment tree by merging the``// sorted segments in sorted way``void` `build(vector<``int``> tree[],``    ``int``* arr, ``int` `start, ``int` `end,``                  ``int` `treeNode)``{``    ``// Base case``    ``if` `(start == end) {``        ``tree[treeNode].push_back(``            ``arr[start]);``        ``return``;``    ``}``    ``int` `mid = (start + end) / 2;` `    ``// Building the left tree``    ``build(tree, arr, start,``          ``mid, 2 * treeNode);` `    ``// Building the right tree``    ``build(tree, arr, mid + 1, end,``                 ``2 * treeNode + 1);` `    ``// Merges the right tree``    ``// and left tree``    ``merge(tree, treeNode);``    ``return``;``}` `// Function similar to query() method``// as in segment tree``int` `query(vector<``int``> tree[],``     ``int` `treeNode, ``int` `start, ``int` `end,``            ``int` `left, ``int` `right)``{` `    ``// Current segment is out of the range``    ``if` `(start > right || end < left) {``        ``return` `0;``    ``}``    ``// Current segment completely``    ``// lies inside the range``    ``if` `(start >= left && end <= right) {` `        ``// as the elements are in sorted order``        ``// so number of elements greater than R``        ``// can be find using binary``        ``// search or upper_bound``        ``return` `tree[treeNode].end() -``          ``upper_bound(tree[treeNode].begin(),``            ``tree[treeNode].end(), right);``    ``}` `    ``int` `mid = (start + end) / 2;` `    ``// Query on the left tree``    ``int` `op1 = query(tree, 2 * treeNode,``              ``start, mid, left, right);``    ``// Query on the Right tree``    ``int` `op2 = query(tree, 2 * treeNode + 1,``            ``mid + 1, end, left, right);``    ``return` `op1 + op2;``}` `// Driver Code``int` `main()``{` `    ``int` `n = 5;``    ``int` `arr[] = { 1, 2, 1, 4, 2 };` `    ``int` `next_right[n];``    ``// Initialising the tree``    ``vector<``int``> tree[4 * n];` `    ``unordered_map<``int``, ``int``> ump;` `    ``// Construction of next_right``    ``// array to store the``    ``// next index of occurrence``    ``// of elements``    ``for` `(``int` `i = n - 1; i >= 0; i--) {``        ``if` `(ump[arr[i]] == 0) {``            ``next_right[i] = n;``            ``ump[arr[i]] = i;``        ``}``        ``else` `{``            ``next_right[i] = ump[arr[i]];``            ``ump[arr[i]] = i;``        ``}``    ``}``    ``// building the mergesort tree``    ``// by using next_right array``    ``build(tree, next_right, 0, n - 1, 1);` `    ``int` `ans;``    ``// Queries one based indexing``    ``// Time complexity of each``    ``// query is log(N)` `    ``// first query``    ``int` `left1 = 0;``    ``int` `right1 = 2;``    ``ans = query(tree, 1, 0, n - 1,``                  ``left1, right1);``    ``cout << ans << endl;` `    ``// Second Query``    ``int` `left2 = 1;``    ``int` `right2 = 4;``    ``ans = query(tree, 1, 0, n - 1,``                  ``left2, right2);``    ``cout << ans << endl;``}`

Output:

```2
3```

Time Complexity: O(Q*log N)

My Personal Notes arrow_drop_up