Overview of Data Structures | Set 2 (Binary Tree, BST, Heap and Hash)
We have discussed Overview of Array, Linked List, Queue and Stack. In this article following Data Structures are discussed.
Attention reader! Don’t stop learning now. Get hold of all the important DSA concepts with the DSA Self Paced Course at a student-friendly price and become industry ready. To complete your preparation from learning a language to DS Algo and many more, please refer Complete Interview Preparation Course.
Unlike Arrays, Linked Lists, Stack, and queues, which are linear data structures, trees are hierarchical data structures.
A binary tree is a tree data structure in which each node has at most two children, which are referred to as the left child and the right child. It is implemented mainly using Links.
Binary Tree Representation: A tree is represented by a pointer to the topmost node in the tree. If the tree is empty, then the value of the root is NULL. A Binary Tree node contains the following parts.
2. Pointer to left child
3. Pointer to the right child
A Binary Tree can be traversed in two ways:
Depth First Traversal: Inorder (Left-Root-Right), Preorder (Root-Left-Right), and Postorder (Left-Right-Root)
Breadth-First Traversal: Level Order Traversal
Binary Tree Properties:
The maximum number of nodes at level ‘l’ = 2l. Maximum number of nodes = 2h + 1 – 1. Here h is height of a tree. Height is considered as the maximum number of edges on a path from root to leaf. Minimum possible height = ceil(Log2(n+1)) - 1 In Binary tree, number of leaf nodes is always one more than nodes with two children. Time Complexity of Tree Traversal: O(n)
Binary Search Tree
In Binary Search Tree is a Binary Tree with the following additional properties:
1. The left subtree of a node contains only nodes with keys less than the node’s key.
2. The right subtree of a node contains only nodes with keys greater than the node’s key.
3. The left and right subtree each must also be a binary search tree.
Search : O(h) Insertion : O(h) Deletion : O(h) Extra Space : O(n) for pointers h: Height of BST n: Number of nodes in BST If Binary Search Tree is Height Balanced, then h = O(Log n) Self-Balancing BSTs such as AVL Tree, Red-Black Tree and Splay Tree make sure that height of BST remains O(Log n)
BST provides moderate access/search (quicker than Linked List and slower than arrays).
BST provides moderate insertion/deletion (quicker than Arrays and slower than Linked Lists).
Examples: Its main use is in search applications where data is constantly entering/leaving and data needs to be printed in sorted order. For example in implementation in E-commerce websites where a new product is added or product goes out of stock and all products are listed in sorted order.
A Binary Heap is a Binary Tree with the following properties.
1) It’s a complete tree (All levels are completely filled except possibly the last level and the last level has all keys as left as possible). This property of Binary Heap makes them suitable to be stored in an array.
2) A Binary Heap is either Min Heap or Max Heap. In a Min Binary Heap, the key at the root must be minimum among all keys present in Binary Heap. The same property must be recursively true for all nodes in Binary Tree. Max Binary Heap is similar to Min Heap. It is mainly implemented using an array.
Get Minimum in Min Heap: O(1) [Or Get Max in Max Heap] Extract Minimum Min Heap: O(Log n) [Or Extract Max in Max Heap] Decrease Key in Min Heap: O(Log n) [Or Decrease Key in Max Heap] Insert: O(Log n) Delete: O(Log n)
Example: Used in implementing efficient priority queues, which in turn are used for scheduling processes in operating systems. Priority Queues are also used in Dijkstra’s and Prim’s graph algorithms.
The Heap data structure can be used to efficiently find the k smallest (or largest) elements in an array, merging k sorted arrays, a median of a stream, etc.
Heap is a special data structure and it cannot be used for searching a particular element.
HashingHash Function: A function that converts a given big input key to a small practical integer value. The mapped integer value is used as an index in the hash table. A good hash function should have the following properties
1) Efficiently computable.
2) Should uniformly distribute the keys (Each table position equally likely for each key)
Hash Table: An array that stores pointers to records corresponding to a given phone number. An entry in a hash table is NIL if no existing phone number has a hash function value equal to the index for the entry.
Collision Handling: Since a hash function gets us a small number for a key which is a big integer or string, there is the possibility that two keys result in the same value. The situation where a newly inserted key maps to an already occupied slot in the hash table is called collision and must be handled using some collision handling technique. Following are the ways to handle collisions:
Chaining: The idea is to make each cell of the hash table point to a linked list of records that have the same hash function value. Chaining is simple but requires additional memory outside the table.
Open Addressing: In open addressing, all elements are stored in the hash table itself. Each table entry contains either a record or NIL. When searching for an element, we one by one examine table slots until the desired element is found or it is clear that the element is not in the table.
Space : O(n) Search : O(1) [Average] O(n) [Worst case] Insertion : O(1) [Average] O(n) [Worst Case] Deletion : O(1) [Average] O(n) [Worst Case]
Hashing seems better than BST for all the operations. But in hashing, elements are unordered and in BST elements are stored in an ordered manner. Also, BST is easy to implement but hash functions can sometimes be very complex to generate. In BST, we can also efficiently find floor and ceil of values.
Example: Hashing can be used to remove duplicates from a set of elements. Can also be used to find the frequency of all items. For example, in web browsers, we can check visited URLs using hashing. In firewalls, we can use hashing to detect spam. We need to hash IP addresses. Hashing can be used in any situation where want search() insert() and delete() in O(1) time.
This article is contributed by Abhiraj Smit. Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.