# Overview of Data Structures | Set 2 (Binary Tree, BST, Heap and Hash)

We have already discussed an overview of Array, Linked List, Queue and Stack. This article discusses the following data structures.

**5. Binary Tree** **6. Binary Search Tree** **7. Binary Heap** **8. Hashing**

**Binary Tree**

Unlike arrays, linked lists, stacks, and queues, which are linear data structures, trees are hierarchical data structures.

A binary tree is a tree data structure in which each node has at most two children, which are referred to as the left child and the right child. It is implemented mainly using links (pointers).

**Binary Tree Representation:** A tree is represented by a pointer to the topmost node in the tree. If the tree is empty, then the value of the root is NULL. A Binary Tree node contains the following parts.

1. Data

2. Pointer to left child

3. Pointer to the right child
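The three parts listed above map directly onto a C struct. The following is a minimal sketch, assuming integer data; the names `Node`, `newNode`, and `freeTree` are illustrative and not part of any library.

```c
#include <assert.h>
#include <stdlib.h>

/* One node of a binary tree: a payload plus links to both children. */
struct Node {
    int data;
    struct Node *left;   /* NULL when there is no left child */
    struct Node *right;  /* NULL when there is no right child */
};

/* Allocates a leaf node holding `data`; the caller owns the memory. */
struct Node *newNode(int data) {
    struct Node *node = malloc(sizeof *node);
    assert(node != NULL);  /* sketch only: abort on allocation failure */
    node->data = data;
    node->left = NULL;
    node->right = NULL;
    return node;
}

/* Frees every node of the tree rooted at `root`. */
void freeTree(struct Node *root) {
    if (root == NULL) return;
    freeTree(root->left);
    freeTree(root->right);
    free(root);
}
```

An empty tree is simply a `struct Node *` whose value is NULL; a tree is built by assigning the result of `newNode` to a parent's `left` or `right` field.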

A Binary Tree can be traversed in two ways:

Depth First Traversal: Inorder (Left-Root-Right), Preorder (Root-Left-Right), and Postorder (Left-Right-Root)

Breadth-First Traversal: Level Order Traversal
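Level order traversal is commonly written with a queue: visit a node, then enqueue its children. The sketch below assumes an array-backed queue and a caller-supplied output buffer; `levelOrder`, `out`, and `capacity` are illustrative names.

```c
#include <assert.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *left;
    struct Node *right;
};

/*
 * Writes the keys of the tree in level order into out[] and returns how
 * many were written. `capacity` should be at least the number of nodes;
 * if it is smaller, the output is truncated.
 */
int levelOrder(const struct Node *root, int out[], int capacity) {
    if (root == NULL || capacity <= 0) return 0;

    /* Each node enters the queue at most once, so `capacity` slots suffice. */
    const struct Node **queue = malloc((size_t)capacity * sizeof *queue);
    assert(queue != NULL);  /* sketch only: abort on allocation failure */

    int head = 0, tail = 0, count = 0;
    queue[tail++] = root;

    while (head < tail) {
        const struct Node *cur = queue[head++];
        out[count++] = cur->data;
        if (cur->left != NULL && tail < capacity) queue[tail++] = cur->left;
        if (cur->right != NULL && tail < capacity) queue[tail++] = cur->right;
    }

    free(queue);
    return count;
}
```

Because every node is enqueued and dequeued once, the traversal takes O(n) time and O(n) extra space for the queue.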

**Binary Tree Properties:**

- The maximum number of nodes at level ‘l’ = 2^{l}, where the root is at level 0.
- Maximum number of nodes = 2^{h + 1} – 1. Here h is the height of the tree. Height is considered as the maximum number of edges on a path from root to leaf.
- Minimum possible height = ceil(Log_{2}(n+1)) - 1, where n is the number of nodes.
- In a non-empty Binary Tree, the number of leaf nodes is always one more than the number of nodes with two children.
- Time Complexity of Tree Traversal: O(n)
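The counting formulas above can be checked with a few lines of integer arithmetic. This is a sketch under the article's conventions (root at level 0, height measured in edges); the function names are illustrative, and the shifts assume the results fit in an `unsigned long`.

```c
#include <assert.h>

/* Maximum number of nodes at level l (root is level 0): 2^l. */
unsigned long maxNodesAtLevel(unsigned l) {
    return 1UL << l;
}

/* Maximum number of nodes in a tree of height h (in edges): 2^(h+1) - 1. */
unsigned long maxNodes(unsigned h) {
    return (1UL << (h + 1)) - 1;
}

/*
 * Minimum height (in edges) of a binary tree with n >= 1 nodes.
 * This is the smallest h with 2^(h+1) - 1 >= n, which equals
 * ceil(log2(n + 1)) - 1, computed here without floating point.
 */
unsigned minHeight(unsigned long n) {
    assert(n >= 1);
    unsigned h = 0;
    while (maxNodes(h) < n) h++;
    return h;
}
```

For instance, a tree of height 2 holds at most 7 nodes, so 7 nodes fit in height 2 while 8 nodes need at least height 3.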

**Basic Operations on Binary Tree:**

- Inserting an element.
- Removing an element.
- Searching for an element.
- Traversing the tree.

**Auxiliary Operations on Binary Tree:**

- Finding the height of the tree.
- Finding the level of a given node.
- Finding the size of the entire tree.
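Each auxiliary operation is a short recursion over the node structure. The sketch below assumes height is counted in edges (so an empty tree has height -1) and the root is at level 0; `height`, `size`, and `levelOf` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>

struct Node {
    int data;
    struct Node *left;
    struct Node *right;
};

/* Height in edges: -1 for an empty tree, 0 for a single node. */
int height(const struct Node *root) {
    if (root == NULL) return -1;
    int lh = height(root->left);
    int rh = height(root->right);
    return 1 + (lh > rh ? lh : rh);
}

/* Total number of nodes in the tree. */
int size(const struct Node *root) {
    if (root == NULL) return 0;
    return 1 + size(root->left) + size(root->right);
}

/*
 * Level of the first node found holding `key`, or -1 if the key is absent.
 * Call with level = 0 for the root.
 */
int levelOf(const struct Node *root, int key, int level) {
    assert(level >= 0);
    if (root == NULL) return -1;
    if (root->data == key) return level;
    int found = levelOf(root->left, key, level + 1);
    if (found != -1) return found;
    return levelOf(root->right, key, level + 1);
}
```

All three functions visit each node at most once, so they run in O(n) time.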

**Applications of Binary Tree:**

- Huffman coding trees are used in data compression algorithms.
- Priority queues are commonly implemented with a binary heap, a kind of binary tree, which gives the maximum or minimum in O(1) time and removes it in O(log n) time.
- Compilers use expression trees, which are another application of binary trees.

**Binary Tree Traversals:**

**Preorder Traversal:** Here, the order is root, left subtree, right subtree. The root node is visited first, then its left subtree, and finally its right subtree.

**Inorder Traversal:** Here, the order is left subtree, root, right subtree. The left subtree is visited first, then the root node, and finally the right subtree.

**Postorder Traversal:** Here, the order is left subtree, right subtree, root. The left subtree is visited first, then the right subtree, and finally the root node.
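The three depth-first orders differ only in where the root is visited relative to its two subtrees. A minimal sketch, assuming the visited keys are collected into a caller-supplied array (`out` and the counter `n` are illustrative names):

```c
#include <assert.h>
#include <stddef.h>

struct Node {
    int data;
    struct Node *left;
    struct Node *right;
};

/*
 * Each function appends the visited keys to out[] and advances *n.
 * out[] must have room for every node in the tree.
 */
void preorder(const struct Node *root, int out[], int *n) {
    assert(out != NULL && n != NULL);
    if (root == NULL) return;
    out[(*n)++] = root->data;        /* root */
    preorder(root->left, out, n);    /* left subtree */
    preorder(root->right, out, n);   /* right subtree */
}

void inorder(const struct Node *root, int out[], int *n) {
    assert(out != NULL && n != NULL);
    if (root == NULL) return;
    inorder(root->left, out, n);     /* left subtree */
    out[(*n)++] = root->data;        /* root */
    inorder(root->right, out, n);    /* right subtree */
}

void postorder(const struct Node *root, int out[], int *n) {
    assert(out != NULL && n != NULL);
    if (root == NULL) return;
    postorder(root->left, out, n);   /* left subtree */
    postorder(root->right, out, n);  /* right subtree */
    out[(*n)++] = root->data;        /* root */
}
```

For a tree with root 1, children 2 and 3, and 4 and 5 as the children of 2, these produce 1 2 4 5 3 (preorder), 4 2 5 1 3 (inorder), and 4 5 2 3 1 (postorder).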

**Examples:** One reason to use binary trees, or trees in general, is to store data that forms a hierarchy. They are useful in file systems, where each file is located in a particular directory and there is a specific hierarchy of files and directories. Another example is storing hierarchical objects: the JavaScript Document Object Model treats an HTML page as a tree, with the nesting of tags as parent-child relations.

**Binary Search Tree**

A Binary Search Tree (BST) is a tree whose main function is to search for a specific element efficiently.

Binary Search Tree is a Binary Tree with the following additional properties:

1. The left subtree of a node contains only nodes with keys less than the node’s key.

2. The right subtree of a node contains only nodes with keys greater than the node’s key.

3. The left and right subtree each must also be a binary search tree.

**Binary Search Tree Declaration**

    struct BinarySearchTree {
        int data;
        struct BinarySearchTree* left;
        struct BinarySearchTree* right;
    };

Since a BST is a Binary Tree, its node declaration is the same as that of a Binary Tree node.

**Primary BST Operations:**

- Finding minimum or maximum element.
- Deleting a particular element from the tree.
- Inserting a particular element in the tree.
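The primary operations all follow the same idea: compare the key with the current node and descend left or right. The sketch below is one possible implementation, assuming integer keys and ignoring duplicate insertions; the function names are illustrative, and deletion of a node with two children uses the inorder successor, which is one of several common choices.

```c
#include <assert.h>
#include <stdlib.h>

struct BinarySearchTree {
    int data;
    struct BinarySearchTree *left;
    struct BinarySearchTree *right;
};

/* Inserts key and returns the (possibly new) root; duplicates are ignored. */
struct BinarySearchTree *insert(struct BinarySearchTree *root, int key) {
    if (root == NULL) {
        struct BinarySearchTree *node = malloc(sizeof *node);
        assert(node != NULL);  /* sketch only: abort on allocation failure */
        node->data = key;
        node->left = node->right = NULL;
        return node;
    }
    if (key < root->data)
        root->left = insert(root->left, key);
    else if (key > root->data)
        root->right = insert(root->right, key);
    return root;
}

/* Returns the node holding key, or NULL if it is absent. */
struct BinarySearchTree *search(struct BinarySearchTree *root, int key) {
    while (root != NULL && root->data != key)
        root = (key < root->data) ? root->left : root->right;
    return root;
}

/* The leftmost node holds the minimum key. */
struct BinarySearchTree *findMin(struct BinarySearchTree *root) {
    while (root != NULL && root->left != NULL)
        root = root->left;
    return root;
}

/* The rightmost node holds the maximum key. */
struct BinarySearchTree *findMax(struct BinarySearchTree *root) {
    while (root != NULL && root->right != NULL)
        root = root->right;
    return root;
}

/* Deletes key (if present) and returns the new root. */
struct BinarySearchTree *deleteKey(struct BinarySearchTree *root, int key) {
    if (root == NULL) return NULL;
    if (key < root->data) {
        root->left = deleteKey(root->left, key);
    } else if (key > root->data) {
        root->right = deleteKey(root->right, key);
    } else if (root->left != NULL && root->right != NULL) {
        /* Two children: copy the inorder successor, then delete it. */
        struct BinarySearchTree *succ = findMin(root->right);
        root->data = succ->data;
        root->right = deleteKey(root->right, succ->data);
    } else {
        /* Zero or one child: splice the node out. */
        struct BinarySearchTree *child = root->left ? root->left : root->right;
        free(root);
        return child;
    }
    return root;
}
```

Every function walks a single root-to-leaf path (deletion may walk one more for the successor), which is why the cost is proportional to the height of the tree.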

**Auxiliary BST Operations:**

- Finding kth smallest element.
- Checking whether a given binary tree is a BST.
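Both auxiliary operations can be sketched with short recursions. The BST check below passes down the open interval each subtree must stay within, and the kth smallest search is an inorder walk that stops at the kth visited node. The helper names are illustrative, and the code assumes distinct integer keys.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct BinarySearchTree {
    int data;
    struct BinarySearchTree *left;
    struct BinarySearchTree *right;
};

/* True when every key in the subtree lies strictly between lo and hi. */
static int isBSTInRange(const struct BinarySearchTree *node,
                        long long lo, long long hi) {
    if (node == NULL) return 1;
    if (node->data <= lo || node->data >= hi) return 0;
    return isBSTInRange(node->left, lo, node->data) &&
           isBSTInRange(node->right, node->data, hi);
}

/* Returns 1 if the tree satisfies the BST ordering, 0 otherwise. */
int isBST(const struct BinarySearchTree *root) {
    /* Bounds just outside the int range so any int key is allowed. */
    return isBSTInRange(root, (long long)INT_MIN - 1, (long long)INT_MAX + 1);
}

/* Inorder walk that counts visited nodes and stops at the k-th one. */
static const struct BinarySearchTree *kthHelper(
        const struct BinarySearchTree *node, int k, int *seen) {
    if (node == NULL) return NULL;
    const struct BinarySearchTree *hit = kthHelper(node->left, k, seen);
    if (hit != NULL) return hit;
    if (++(*seen) == k) return node;
    return kthHelper(node->right, k, seen);
}

/* Node with the k-th smallest key (k starts at 1), or NULL if k > size. */
const struct BinarySearchTree *kthSmallest(
        const struct BinarySearchTree *root, int k) {
    assert(k >= 1);
    int seen = 0;
    return kthHelper(root, k, &seen);
}
```

Checking only each node against its immediate children is not enough for the BST test; a key deep in the left subtree must still be smaller than every ancestor it sits to the left of, which is what the interval captures.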

Time Complexities:

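- Search: O(h)
- Insertion: O(h)
- Deletion: O(h)
- Extra Space: O(n) for pointers

Here h is the height of the BST and n is the number of nodes in the BST.

If a Binary Search Tree is height balanced, then h = O(Log n). Self-balancing BSTs such as AVL Tree and Red-Black Tree make sure that the height of the BST remains O(Log n). A Splay Tree does not bound the height, but it guarantees O(Log n) amortized time per operation.

A balanced BST provides moderate access/search: O(Log n) is quicker than the O(n) scan of a Linked List and slower than O(1) index access in an array.

A balanced BST provides moderate insertion/deletion: O(Log n) is quicker than an array, which must shift elements in O(n), and slower than a Linked List, which inserts or deletes in O(1) once the position is known.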

**Examples:** Its main use is in search applications where data is constantly entering and leaving and needs to be printed in sorted order. One example is an e-commerce website where new products are added, products go out of stock, and all products are listed in sorted order.

**Binary Heap**

A Binary Heap is a Binary Tree with the following properties.

1) It’s a complete tree (all levels are completely filled except possibly the last level, and the last level has all keys as far left as possible). This property makes a Binary Heap suitable for storage in an array.

2) A Binary Heap is either a Min Heap or a Max Heap. In a Min Binary Heap, the key at the root must be the minimum among all keys present in the Binary Heap. The same property must be recursively true for all nodes in the Binary Tree. A Max Binary Heap is similar, with the maximum key at the root. A Binary Heap is mainly implemented using an array.

- Get Minimum in Min Heap: O(1) [or Get Maximum in Max Heap]
- Extract Minimum in Min Heap: O(Log n) [or Extract Maximum in Max Heap]
- Decrease Key in Min Heap: O(Log n) [or Increase Key in Max Heap]
- Insert: O(Log n)
- Delete: O(Log n)
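Because a heap is a complete tree, it can live in a plain array: for the node at index i, the parent is at (i - 1) / 2 and the children are at 2i + 1 and 2i + 2. The following is a minimal Min Heap sketch under that layout, assuming a fixed capacity; `HEAP_CAPACITY` and the function names are illustrative.

```c
#include <assert.h>

#define HEAP_CAPACITY 64  /* hypothetical fixed capacity for this sketch */

struct MinHeap {
    int keys[HEAP_CAPACITY];
    int size;
};

static void swapInts(int *a, int *b) {
    int t = *a;
    *a = *b;
    *b = t;
}

/* O(1): the minimum always sits at the root, index 0. */
int getMin(const struct MinHeap *h) {
    assert(h->size > 0);
    return h->keys[0];
}

/* O(log n): append at the end, then sift up while smaller than the parent. */
void heapInsert(struct MinHeap *h, int key) {
    assert(h->size < HEAP_CAPACITY);
    int i = h->size++;
    h->keys[i] = key;
    while (i > 0 && h->keys[(i - 1) / 2] > h->keys[i]) {
        swapInts(&h->keys[(i - 1) / 2], &h->keys[i]);
        i = (i - 1) / 2;
    }
}

/* O(log n): move the last key to the root, then sift it down. */
int extractMin(struct MinHeap *h) {
    assert(h->size > 0);
    int min = h->keys[0];
    h->keys[0] = h->keys[--h->size];
    int i = 0;
    for (;;) {
        int left = 2 * i + 1, right = 2 * i + 2, smallest = i;
        if (left < h->size && h->keys[left] < h->keys[smallest]) smallest = left;
        if (right < h->size && h->keys[right] < h->keys[smallest]) smallest = right;
        if (smallest == i) break;
        swapInts(&h->keys[i], &h->keys[smallest]);
        i = smallest;
    }
    return min;
}
```

Repeatedly calling `extractMin` returns the keys in non-decreasing order, which is the basis of heap sort and of priority queues.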

**Example:** Used in implementing efficient priority queues, which in turn are used for scheduling processes in operating systems. Priority Queues are also used in Dijkstra’s and Prim’s graph algorithms.

The Heap data structure can be used to efficiently find the k smallest (or largest) elements in an array, merge k sorted arrays, find the median of a stream, etc.

A Heap is a special-purpose data structure: it is not suited to searching for an arbitrary element, which takes O(n) time because the heap property orders keys only along root-to-leaf paths.

**Hashing**

Hashing is a popular technique for storing and retrieving data as fast as possible. The main reason for using hashing is that search, insert, and delete take O(1) time on average.

**Why use Hashing?**

In a balanced binary search tree, searching for, inserting, or deleting an element takes O(Log n) time. An application may need these operations to be faster, and this is where hashing comes into play. In hashing, all the above operations can be performed in O(1), i.e. constant time, on average. It is important to understand that the worst-case time complexity for hashing remains O(n); only the average-case time complexity is O(1).

**Hash Function:** A function that converts a given large key, such as a phone number, to a small practical integer value. The mapped integer value is used as an index in the hash table. In simple terms, a hash function transforms a given key into a specific slot index. Ideally, it maps every possible key to a unique slot index; a hash function that does so is known as a perfect hash function. It is very difficult to create a perfect hash function, so our job as programmers is to create a hash function that produces as few collisions as possible. Collisions are discussed below.

A good hash function should have the following properties:

1) Efficiently computable.

2) Should uniformly distribute the keys (each table position equally likely for each key).

3) Should minimize collisions.

4) Should have a low load factor (number of items in the table divided by the size of the table).

For example, for phone numbers, a bad hash function is to take the first three digits. A better function is to consider the last three digits. Please note that this may not be the best hash function. There may be better ways.
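The two phone number hash functions mentioned above can be sketched as follows. The code assumes 10-digit phone numbers stored as integers and a table of 1000 slots; both are assumptions for illustration, and `hashFirstThree`, `hashLastThree`, and `PHONE_TABLE_SIZE` are hypothetical names.

```c
#include <assert.h>

#define PHONE_TABLE_SIZE 1000  /* hypothetical: one slot per 3-digit value */

/*
 * Poor choice: the first three digits. Numbers from the same region or
 * operator tend to share a prefix, so they pile into the same slot.
 * Assumes a 10-digit number.
 */
unsigned hashFirstThree(unsigned long long phone) {
    assert(phone >= 1000000000ULL && phone <= 9999999999ULL);
    return (unsigned)(phone / 10000000ULL);
}

/* Better: the last three digits, which vary more between numbers. */
unsigned hashLastThree(unsigned long long phone) {
    return (unsigned)(phone % PHONE_TABLE_SIZE);
}
```

For the numbers 9876543210 and 9876500001, `hashFirstThree` returns 987 for both (a collision), while `hashLastThree` returns 210 and 1.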

**Hash Table:** An array that stores pointers to the records corresponding to a given key, such as a phone number. An entry in the hash table is NIL if no existing key has a hash function value equal to the index of the entry. In simple terms, a hash table is a generalization of an array. It stores a collection of data in such a way that items are easy to find later, which makes searching for an element very efficient.

**Collision Handling:** Since a hash function gets us a small number for a key which is a big integer or string, there is the possibility that two keys result in the same value. The situation where a newly inserted key maps to an already occupied slot in the hash table is called collision and must be handled using some collision handling technique. Following are the ways to handle collisions:

**Chaining:** The idea is to make each cell of the hash table point to a linked list of records that have the same hash function value. Chaining is simple but requires additional memory outside the table.
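A minimal sketch of chaining, assuming integer keys, a small fixed number of buckets, and a simple modulo hash; `NUM_BUCKETS`, `bucketOf`, and the `chained*` function names are illustrative choices, not part of any library.

```c
#include <assert.h>
#include <stdlib.h>

#define NUM_BUCKETS 7  /* hypothetical small table size */

struct Entry {
    int key;
    struct Entry *next;  /* next record with the same hash value */
};

struct ChainedTable {
    struct Entry *buckets[NUM_BUCKETS];  /* NULL marks an empty chain */
};

static unsigned bucketOf(int key) {
    return (unsigned)key % NUM_BUCKETS;
}

/* Returns 1 if key is in the table, 0 otherwise. */
int chainedContains(const struct ChainedTable *t, int key) {
    for (const struct Entry *e = t->buckets[bucketOf(key)]; e != NULL; e = e->next)
        if (e->key == key) return 1;
    return 0;
}

/* Inserts at the head of the chain; returns 0 if key was already present. */
int chainedInsert(struct ChainedTable *t, int key) {
    if (chainedContains(t, key)) return 0;
    struct Entry *e = malloc(sizeof *e);
    assert(e != NULL);  /* sketch only: abort on allocation failure */
    unsigned b = bucketOf(key);
    e->key = key;
    e->next = t->buckets[b];
    t->buckets[b] = e;
    return 1;
}

/* Unlinks and frees the entry holding key; returns 0 if it was absent. */
int chainedDelete(struct ChainedTable *t, int key) {
    struct Entry **link = &t->buckets[bucketOf(key)];
    while (*link != NULL) {
        if ((*link)->key == key) {
            struct Entry *dead = *link;
            *link = dead->next;
            free(dead);
            return 1;
        }
        link = &(*link)->next;
    }
    return 0;
}
```

With 7 buckets, the keys 10, 17, and 24 all hash to bucket 3 and end up on the same linked list, so a lookup in that bucket walks at most three entries.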

**Open Addressing:** In open addressing, all elements are stored in the hash table itself. Each table entry contains either a record or NIL. When searching for an element, we examine table slots one by one until the desired element is found or it is clear that the element is not in the table.
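The article does not name a probing scheme, so the sketch below assumes linear probing, the simplest one: on a collision, try the next slot, wrapping around at the end. Deleted slots are marked with a tombstone so that later searches do not stop early. All names, including `PROBE_TABLE_SIZE`, are illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define PROBE_TABLE_SIZE 11  /* hypothetical small table size */

enum SlotState { EMPTY, OCCUPIED, DELETED };  /* EMPTY plays the role of NIL */

struct Slot {
    int key;
    enum SlotState state;
};

struct ProbeTable {
    struct Slot slots[PROBE_TABLE_SIZE];
};

static unsigned homeSlot(int key) {
    return (unsigned)key % PROBE_TABLE_SIZE;
}

/* Returns the index holding key, or -1 if key is not in the table. */
int probeFind(const struct ProbeTable *t, int key) {
    assert(t != NULL);
    unsigned start = homeSlot(key);
    for (unsigned i = 0; i < PROBE_TABLE_SIZE; i++) {
        unsigned idx = (start + i) % PROBE_TABLE_SIZE;
        if (t->slots[idx].state == EMPTY) return -1;  /* never-used slot ends the probe */
        if (t->slots[idx].state == OCCUPIED && t->slots[idx].key == key)
            return (int)idx;
    }
    return -1;
}

/* Returns 1 on success, 0 if key is already present or the table is full. */
int probeInsert(struct ProbeTable *t, int key) {
    if (probeFind(t, key) != -1) return 0;
    unsigned start = homeSlot(key);
    for (unsigned i = 0; i < PROBE_TABLE_SIZE; i++) {
        unsigned idx = (start + i) % PROBE_TABLE_SIZE;
        if (t->slots[idx].state != OCCUPIED) {  /* EMPTY or DELETED can be reused */
            t->slots[idx].key = key;
            t->slots[idx].state = OCCUPIED;
            return 1;
        }
    }
    return 0;
}

/* Marks the slot as DELETED (a tombstone); returns 0 if key was absent. */
int probeDelete(struct ProbeTable *t, int key) {
    int idx = probeFind(t, key);
    if (idx == -1) return 0;
    t->slots[idx].state = DELETED;
    return 1;
}
```

With 11 slots, the keys 5, 16, and 27 all have home slot 5, so they occupy slots 5, 6, and 7. Deleting 16 leaves a tombstone in slot 6, and a search for 27 still probes past it to slot 7.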

- Space: O(n)
- Search: O(1) [Average], O(n) [Worst Case]
- Insertion: O(1) [Average], O(n) [Worst Case]
- Deletion: O(1) [Average], O(n) [Worst Case]

Hashing seems better than a BST for all these operations, but in hashing the elements are unordered, while a BST stores elements in an ordered manner. Also, a BST is easy to implement, whereas a good hash function can sometimes be very complex to design. In a BST, we can also efficiently find the floor and ceil of values.

**Example:** Hashing can be used to remove duplicates from a set of elements and to find the frequency of all items. For example, web browsers can check visited URLs using hashing, and firewalls can hash IP addresses to detect spam. Hashing can be used in any situation where we want search(), insert() and delete() in O(1) time on average.

This article is contributed by **Abhiraj Smit**.