
# Count of distinct numbers in an Array in a range for Online Queries using Merge Sort Tree

• Last Updated : 27 Jun, 2020

Given an array arr[] of size N and Q queries of the form [L, R], the task is to find the number of distinct values in this array in the given range.

Examples:

Input: arr[] = {4, 1, 9, 1, 3, 3}, Q = {{1, 3}, {1, 5}}
Output:
3
4
Explanation:
For query {1, 3}, elements are {4, 1, 9}. Therefore, count of distinct elements = 3
For query {1, 5}, elements are {4, 1, 9, 1, 3}. Therefore, count of distinct elements = 4

Input: arr[] = {4, 2, 1, 1, 4}, Q = {{2, 4}, {3, 5}}
Output:
2
2


Naive Approach: For every query, iterate over the array from L to R and insert each element into a set. The size of the set is then the number of distinct elements from L to R.

Time Complexity: O(Q * N) with a hash-based set; an ordered set adds a log N factor.
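The naive approach can be sketched as below. This is a minimal illustration, not part of the original article: the helper name countDistinctNaive is hypothetical, and it assumes 0-based, inclusive bounds and a hash-based set (std::unordered_set).

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical helper: counts the distinct values in arr[l..r]
// (0-based, inclusive) by inserting the whole range into a hash set.
int countDistinctNaive(const std::vector<int>& arr, int l, int r)
{
    // Precondition: the range must lie inside the array.
    assert(0 <= l && l <= r && r < static_cast<int>(arr.size()));

    // The range constructor inserts every element of arr[l..r];
    // duplicates are dropped by the set.
    std::unordered_set<int> seen(arr.begin() + l, arr.begin() + r + 1);
    return static_cast<int>(seen.size());
}
```

For arr[] = {4, 1, 9, 1, 3, 3}, the 1-based queries {1, 3} and {1, 5} from the first example correspond to countDistinctNaive(arr, 0, 2) and countDistinctNaive(arr, 0, 4).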

Efficient Approach: The idea is to use a Merge Sort Tree to solve this problem.

1. For every position, store the index of the next occurrence of its element in a temporary array.
2. Then, for every query from L to R, count the positions in the range L to R whose value in the temporary array is greater than R.

Step 1: Take an array next_right, where next_right[i] holds the index of the next occurrence of arr[i] to the right of position i. Initialize every entry to N (the length of the array), which marks "no later occurrence".
Step 2: Build a Merge Sort Tree from the next_right array and answer the queries on it. A position i in [L, R] holds the last occurrence of its value inside the range exactly when next_right[i] > R, and every distinct value has exactly one such position. So the number of distinct elements from L to R equals the number of positions from L to R whose next_right value is greater than R.

Construction of Merge Sort Tree from given array

• Starting from the whole array, split the current segment into two halves as long as its length is greater than 1, and apply the same procedure to both halves. Each node stores the sorted contents of its segment, obtained by merging the sorted lists of its two children as in merge sort.
• The tree is a Full Binary Tree, because every segment of length greater than 1 is split into exactly two halves.
• A full binary tree with N leaves has N-1 internal nodes, so the total number of nodes is 2*N - 1.

Here is an example for the array 1 5 2 6 9 4 7 1:

```
|1 1 2 4 5 6 7 9|
|1 2 5 6|1 4 7 9|
|1 5|2 6|4 9|1 7|
|1|5|2|6|9|4|7|1|
```
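The construction above can also be sketched with std::merge doing the merge step. This is a compact illustration rather than a definitive implementation: the names Tree, buildNode and buildMergeSortTree are hypothetical, and the layout (root at index 1, children of a node at 2*node and 2*node + 1, storage of size 4*N) is an assumption borrowed from the usual segment tree convention.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

using Tree = std::vector<std::vector<int>>;

// Recursively fills tree[node] with the sorted contents of arr[start..end].
void buildNode(Tree& tree, const std::vector<int>& arr,
               int node, int start, int end)
{
    // A segment of length 1 is already sorted.
    if (start == end) {
        tree[node] = {arr[start]};
        return;
    }
    int mid = start + (end - start) / 2;
    buildNode(tree, arr, 2 * node, start, mid);
    buildNode(tree, arr, 2 * node + 1, mid + 1, end);

    // Merge the two sorted children into this node, as in merge sort.
    std::merge(tree[2 * node].begin(), tree[2 * node].end(),
               tree[2 * node + 1].begin(), tree[2 * node + 1].end(),
               std::back_inserter(tree[node]));
}

// Builds the whole tree; the root is tree[1].
Tree buildMergeSortTree(const std::vector<int>& arr)
{
    assert(!arr.empty());
    Tree tree(4 * arr.size());
    buildNode(tree, arr, 1, 0, static_cast<int>(arr.size()) - 1);
    return tree;
}
```

Called on the example array 1 5 2 6 9 4 7 1, the root tree[1] holds the top row of the diagram, and tree[2] and tree[3] hold the two halves of the second row.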

Construction of next_right array

• For every position, store the index of the next occurrence of its element to the right.
• If the position holds the last occurrence of its element, store N (the length of the array).
Example (0-based indices):
```
arr = [2, 3, 2, 3, 5, 6]
next_right = [2, 3, 6, 6, 6, 6]
```
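One way to build next_right is a single right-to-left pass that remembers, for each value, the nearest index already seen. The sketch below assumes 0-based indices; the helper name buildNextRight and the map name lastSeen are hypothetical.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical helper: returns next_right for arr, where next_right[i]
// is the index of the next occurrence of arr[i] to the right of i,
// or N (the array length) if there is none.
std::vector<int> buildNextRight(const std::vector<int>& arr)
{
    const int n = static_cast<int>(arr.size());
    std::vector<int> nextRight(n, n);

    // value -> nearest index to the right of the current position
    std::unordered_map<int, int> lastSeen;

    for (int i = n - 1; i >= 0; --i) {
        auto it = lastSeen.find(arr[i]);
        if (it != lastSeen.end()) {
            nextRight[i] = it->second;
        }
        // Invariant: the stored index always lies strictly to the right.
        assert(nextRight[i] > i);
        lastSeen[arr[i]] = i;
    }
    return nextRight;
}
```

Looking the value up with find() avoids treating a stored index of 0 as "not seen", which is a risk when a default-inserted 0 doubles as the sentinel.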

Below is the implementation of the above approach:

## C++

```cpp
// C++ implementation to find the count of distinct
// elements in a range L to R for Q queries

#include <algorithm>
#include <iostream>
#include <unordered_map>
#include <vector>
using namespace std;

// Function to merge the sorted lists of the
// left and the right child into treeNode
void merge(vector<vector<int>>& tree, int treeNode)
{
    int len1 = tree[2 * treeNode].size();
    int len2 = tree[2 * treeNode + 1].size();
    int index1 = 0, index2 = 0;

    // Fill this node so that the values
    // remain sorted, as in merge sort
    while (index1 < len1 && index2 < len2) {

        // If the element on the left part
        // is greater than the right part
        if (tree[2 * treeNode][index1]
            > tree[2 * treeNode + 1][index2]) {
            tree[treeNode].push_back(
                tree[2 * treeNode + 1][index2]);
            index2++;
        }
        else {
            tree[treeNode].push_back(
                tree[2 * treeNode][index1]);
            index1++;
        }
    }

    // Insert the leftover elements
    // from the left part
    while (index1 < len1) {
        tree[treeNode].push_back(
            tree[2 * treeNode][index1]);
        index1++;
    }

    // Insert the leftover elements
    // from the right part
    while (index2 < len2) {
        tree[treeNode].push_back(
            tree[2 * treeNode + 1][index2]);
        index2++;
    }
}

// Recursive function to build the merge sort tree
// by merging the sorted segments of the children
void build(vector<vector<int>>& tree,
           const vector<int>& arr,
           int start, int end, int treeNode)
{
    // Base case: a segment of length 1
    if (start == end) {
        tree[treeNode].push_back(arr[start]);
        return;
    }
    int mid = (start + end) / 2;

    // Building the left tree
    build(tree, arr, start, mid, 2 * treeNode);

    // Building the right tree
    build(tree, arr, mid + 1, end, 2 * treeNode + 1);

    // Merge the left tree and the right tree
    merge(tree, treeNode);
}

// Function similar to the query() method of a
// segment tree: counts the values greater than
// right among positions left..right
int query(const vector<vector<int>>& tree,
          int treeNode, int start, int end,
          int left, int right)
{
    // Current segment is out of the range
    if (start > right || end < left) {
        return 0;
    }

    // Current segment lies completely
    // inside the range
    if (start >= left && end <= right) {

        // The elements are in sorted order, so the
        // number of elements greater than right can
        // be found by binary search (upper_bound)
        return tree[treeNode].end()
               - upper_bound(tree[treeNode].begin(),
                             tree[treeNode].end(), right);
    }

    int mid = (start + end) / 2;

    // Query on the left tree
    int op1 = query(tree, 2 * treeNode,
                    start, mid, left, right);

    // Query on the right tree
    int op2 = query(tree, 2 * treeNode + 1,
                    mid + 1, end, left, right);
    return op1 + op2;
}

// Driver Code
int main()
{
    vector<int> arr = { 1, 2, 1, 4, 2 };
    int n = arr.size();

    // next_right[i] = index of the next occurrence
    // of arr[i], or n if there is none
    vector<int> next_right(n, n);

    // Initialising the tree
    vector<vector<int>> tree(4 * n);

    // Maps a value to the index of its nearest
    // occurrence to the right of the current position
    unordered_map<int, int> ump;

    // Construction of the next_right array,
    // scanning from right to left
    for (int i = n - 1; i >= 0; i--) {
        auto it = ump.find(arr[i]);
        if (it != ump.end()) {
            next_right[i] = it->second;
        }
        ump[arr[i]] = i;
    }

    // Building the merge sort tree
    // from the next_right array
    build(tree, next_right, 0, n - 1, 1);

    int ans;
    // Queries use zero-based indexing.
    // Each query takes O(log^2 N): O(log N) nodes
    // are visited, with one binary search per node

    // First query
    int left1 = 0;
    int right1 = 2;
    ans = query(tree, 1, 0, n - 1, left1, right1);
    cout << ans << endl;

    // Second query
    int left2 = 1;
    int right2 = 4;
    ans = query(tree, 1, 0, n - 1, left2, right2);
    cout << ans << endl;
}
```
Output:
```
2
3
```

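Time Complexity: O(N*log N + Q*log^2 N). Building the tree takes O(N*log N). Each query visits O(log N) nodes and runs one binary search of cost O(log N) in each, so a query costs O(log^2 N).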
