How to handle duplicates in a Binary Search Tree?
In a Binary Search Tree (BST), all keys in the left subtree of a key must be smaller, and all keys in its right subtree must be greater. So a Binary Search Tree, by definition, has distinct keys.
How can we allow duplicates, so that every insertion adds one more occurrence of a key and every deletion removes one occurrence?
A Simple Solution is to allow equal keys on the right side (we could also choose the left side). For example, consider the insertion of keys 12, 10, 20, 9, 11, 10, 12, 12 into an empty Binary Search Tree:
          12
        /    \
      10      20
     /  \    /
    9   11  12
       /      \
     10        12
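As a rough illustration of the simple solution, the following Python sketch sends any key equal to a node's key into that node's right subtree. The names `Node`, `insert` and `inorder` are assumptions made for this example, not part of the original text.

```python
class Node:
    """BST node for the simple scheme: each duplicate is a separate node."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key; a key equal to a node's key descends into its right subtree."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:  # key >= root.key, so duplicates go right
        root.right = insert(root.right, key)
    return root


def inorder(root):
    """Return the keys in sorted order, duplicates included."""
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)


root = None
for k in [12, 10, 20, 9, 11, 10, 12, 12]:
    root = insert(root, k)
print(inorder(root))  # [9, 10, 10, 11, 12, 12, 12, 20]
```

Note that the second 10 ends up as the left child of 11, and the duplicate 12s end up under 20, which matches the tree drawn above.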
A Better Solution is to augment every tree node to store a count, together with the regular fields like the key and the left and right pointers.
Insertion of keys 12, 10, 20, 9, 11, 10, 12, 12 into an empty Binary Search Tree would then create the following tree.
          12(3)
         /     \
      10(2)    20(1)
      /   \
    9(1)  11(1)

The count of each key is shown in brackets.
This approach has the following advantages over the simple approach above.
- The height of the tree does not grow with the number of duplicates. Note that most BST operations (search, insert and delete) have time complexity O(h), where h is the height of the BST. So if we are able to keep the height small, we need fewer key comparisons.
- Search, insert and delete become easier to implement. We can use the same insert, search and delete algorithms with small modifications.
- This approach is also suited to self-balancing BSTs (AVL Tree, Red-Black Tree, etc.). These trees involve rotations, and a rotation may violate the BST property of the simple solution, because an equal key can end up on either the left or the right side after a rotation.
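The first and third points can be illustrated with a small Python sketch. It assumes the simple "duplicates go right" scheme, and the helper names (`insert_dup_right`, `height`, `rotate_left`) are hypothetical. Inserting the same key five times produces a chain of height 5, where the counting approach would need a single node. A left rotation on a tree holding two equal keys then leaves a duplicate in the left subtree.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def insert_dup_right(root, key):
    """Simple scheme: keys equal to a node's key go into its right subtree."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert_dup_right(root.left, key)
    else:
        root.right = insert_dup_right(root.right, key)
    return root


def height(root):
    """Height measured in nodes on the longest root-to-leaf path."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


def rotate_left(x):
    """Standard left rotation: x's right child becomes the new subtree root."""
    y = x.right
    x.right = y.left
    y.left = x
    return y


# Five copies of the same key form a right-leaning chain.
chain = None
for _ in range(5):
    chain = insert_dup_right(chain, 7)
chain_height = height(chain)
print(chain_height)  # 5

# Two equal keys: after a left rotation the duplicate sits on the left.
root = insert_dup_right(insert_dup_right(None, 12), 12)
root = rotate_left(root)
duplicate_on_left = root.left is not None and root.left.key == root.key
print(duplicate_on_left)  # True
```

After the rotation, a search or delete routine that only looks to the right for equal keys would miss the duplicate now stored on the left.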
Below is an implementation of a normal Binary Search Tree that stores a count with every key. The code is adapted from the standard insert and delete code for a BST; only the parts that handle duplicates are changed, and the rest of the code is the same.
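One way to write this in Python is sketched below. It is a minimal sketch rather than a definitive implementation: `Node`, `insert`, `delete_node`, `min_value_node` and `inorder` are illustrative names chosen for this example. The driver at the end inserts the keys from the example above and then deletes 20, 12 and 9.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.count = 1  # number of occurrences of key
        self.left = None
        self.right = None


def insert(node, key):
    if node is None:
        return Node(key)
    if key == node.key:
        node.count += 1  # duplicate: bump the count, add no node
    elif key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    return node


def min_value_node(node):
    """Leftmost node of a non-empty subtree."""
    while node.left is not None:
        node = node.left
    return node


def delete_node(root, key):
    """Delete one occurrence of key and return the new subtree root."""
    if root is None:
        return None
    if key < root.key:
        root.left = delete_node(root.left, key)
    elif key > root.key:
        root.right = delete_node(root.right, key)
    else:
        if root.count > 1:  # duplicate: just decrement the count
            root.count -= 1
            return root
        if root.left is None:
            return root.right
        if root.right is None:
            return root.left
        # Two children: move the inorder successor (key and count) up,
        # then unlink the successor node entirely, whatever its count was.
        succ = min_value_node(root.right)
        root.key, root.count = succ.key, succ.count
        succ.count = 1
        root.right = delete_node(root.right, succ.key)
    return root


def inorder(root):
    """Return entries formatted as key(count), in sorted key order."""
    if root is None:
        return []
    return inorder(root.left) + [f"{root.key}({root.count})"] + inorder(root.right)


root = None
for k in [12, 10, 20, 9, 11, 10, 12, 12]:
    root = insert(root, k)
print("Inorder traversal of the given tree")
print(" ".join(inorder(root)))
for k in [20, 12, 9]:
    print("Delete", k)
    root = delete_node(root, k)
    print("Inorder traversal of the modified tree")
    print(" ".join(inorder(root)))
```

One design choice deserves a note: when the deleted node has two children, the successor's count is copied up along with its key, and the successor's count is reset to 1 before the recursive call, so that the call removes the successor node itself instead of only decrementing its count.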
Output:
Inorder traversal of the given tree
9(1) 10(2) 11(1) 12(3) 20(1)
Delete 20
Inorder traversal of the modified tree
9(1) 10(2) 11(1) 12(3)
Delete 12
Inorder traversal of the modified tree
9(1) 10(2) 11(1) 12(2)
Delete 9
Inorder traversal of the modified tree
10(2) 11(1) 12(2)
Time Complexity: Search, insert and delete each take O(h) time, where h is the height of the BST. For a plain (unbalanced) BST, h can be as large as the number of distinct keys in the worst case.
Auxiliary Space: O(h), which is required for the stack of recursive function calls.