Implementation of Hash Table in C/C++ using Separate Chaining
Hashing is a technique that maps a large set of data to a small set of data, using a hash function to perform the mapping. The process is irreversible: the original key cannot be recovered from its hashed value, because many keys from the large set must share values in the small set. Such sharing is called a collision, and it is unavoidable whenever the input set is larger than the output set. As an analogy, suppose we have three buckets that hold 1 L of water each, and 5 L of water to store. The water cannot all fit without some bucket receiving more than it can hold, and that overflow plays the role of a collision. URL shorteners are an example of hashing, as they map long URLs to short ones.
Some Examples of Hash Functions:
- key % number of buckets
- ASCII value of character * PrimeNumber^x, where x = 1, 2, 3, …, n
- You can write your own hash function, but a good hash function should produce few collisions.
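The two example hash functions listed above can be sketched in C++ roughly as follows. The names `moduloHash` and `polynomialHash` are hypothetical, the prime 31 is the one this article uses later, and reducing modulo the bucket count at every step is an assumption made here to avoid integer overflow (it also assumes the bucket count is nonzero and fits in 32 bits).

```cpp
#include <cstddef>
#include <string>

// Integer keys: bucket index = key % number of buckets.
std::size_t moduloHash(unsigned long long key, std::size_t bucketCount) {
    return static_cast<std::size_t>(key % bucketCount);
}

// String keys: sum of (character code * 31^x) for x = 1..n,
// reduced modulo bucketCount at every step so nothing overflows.
std::size_t polynomialHash(const std::string& key, std::size_t bucketCount) {
    const unsigned long long prime = 31;
    unsigned long long sum = 0;
    unsigned long long factor = prime % bucketCount;  // 31^1 mod bucketCount
    for (unsigned char ch : key) {
        sum = (sum + (ch % bucketCount) * factor) % bucketCount;
        factor = (factor * prime) % bucketCount;      // next power of 31
    }
    return static_cast<std::size_t>(sum);
}
```

For instance, `moduloHash(17, 5)` gives bucket 2, and `polynomialHash("ab", 100)` gives bucket 85, since 97 * 31 + 98 * 961 = 97185.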
The value returned by the hash function is the bucket index for a key in the separate chaining method. Each index in the array is called a bucket because it holds the head of a linked list of key-value pairs.
Rehashing reduces collisions as the number of elements in the hash table grows. It allocates a new array of double the size and moves the elements of the previous array into it, much like the way a vector grows internally in C++. The hash function must depend on the capacity of the table so that it reflects the change when the capacity increases. Because the capacity is part of the hash function, keys moved from the previous array generally receive different bucket indexes in the new one. Rehashing is generally done when the load factor exceeds 0.5. The steps are:
- Double the size of the array.
- Copy the elements of the previous array to the new array. The hash function is applied again to each node while copying, so the elements are spread over more buckets, which reduces collisions.
- Delete the previous array from memory and point the hash map's internal array pointer to the new array.
- Generally, Load Factor = number of elements in Hash Map / total number of buckets (capacity).
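The rehashing steps above might look like the following in C++. This is a minimal sketch, not the article's original listing: `Node`, `HashMap`, `bucketIndex`, `loadFactor`, `rehash` and `insert` are hypothetical names, the 0.5 threshold is the one mentioned above, and the sketch relinks the existing nodes into the new array instead of copying them.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Node {
    std::string key;
    std::string value;
    Node* next;
};

struct HashMap {
    std::vector<Node*> buckets;  // each entry is the head of a chain
    std::size_t size = 0;        // number of stored elements

    explicit HashMap(std::size_t capacity) : buckets(capacity, nullptr) {}
    HashMap(const HashMap&) = delete;
    HashMap& operator=(const HashMap&) = delete;
    ~HashMap() {
        for (Node* head : buckets) {
            while (head != nullptr) {
                Node* next = head->next;
                delete head;
                head = next;
            }
        }
    }
};

// The index depends on the capacity, so it changes when the table grows.
std::size_t bucketIndex(const std::string& key, std::size_t capacity) {
    unsigned long long sum = 0, factor = 31 % capacity;
    for (unsigned char ch : key) {
        sum = (sum + (ch % capacity) * factor) % capacity;
        factor = (factor * 31) % capacity;
    }
    return static_cast<std::size_t>(sum);
}

// Load factor = number of elements / number of buckets.
double loadFactor(const HashMap& map) {
    return static_cast<double>(map.size) / static_cast<double>(map.buckets.size());
}

void rehash(HashMap& map) {
    // Step 1: an array of double the size.
    std::vector<Node*> bigger(map.buckets.size() * 2, nullptr);
    // Step 2: move every node, recomputing its index for the new capacity.
    for (Node* head : map.buckets) {
        while (head != nullptr) {
            Node* next = head->next;
            std::size_t i = bucketIndex(head->key, bigger.size());
            head->next = bigger[i];
            bigger[i] = head;
            head = next;
        }
    }
    // Step 3: the map now owns the new array; the old one is freed
    // when `bigger` goes out of scope.
    map.buckets.swap(bigger);
}

void insert(HashMap& map, const std::string& key, const std::string& value) {
    std::size_t i = bucketIndex(key, map.buckets.size());
    map.buckets[i] = new Node{key, value, map.buckets[i]};  // insert at head
    ++map.size;
    if (loadFactor(map) > 0.5) rehash(map);
}
```

Starting from 4 buckets, the third insertion raises the load factor to 0.75, so the table grows to 8 buckets at that point.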
A collision occurs when the bucket index computed for a key is not empty, that is, a linked list head is already present at that index. Two or more keys then map to the same bucket index.
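To make the idea concrete, here is a small sketch that uses the `key % number of buckets` hash from earlier. The function names `chainKeys` and `countCollisions` are hypothetical, and `std::forward_list` stands in for the hand-written linked list.

```cpp
#include <cstddef>
#include <forward_list>
#include <vector>

// Places each key in bucket (key % bucketCount), at the head of that chain.
std::vector<std::forward_list<unsigned>> chainKeys(const std::vector<unsigned>& keys,
                                                   std::size_t bucketCount) {
    std::vector<std::forward_list<unsigned>> buckets(bucketCount);
    for (unsigned key : keys) {
        buckets[key % bucketCount].push_front(key);
    }
    return buckets;
}

// Counts insertions that landed in a bucket that was already occupied.
std::size_t countCollisions(const std::vector<unsigned>& keys, std::size_t bucketCount) {
    std::vector<bool> occupied(bucketCount, false);
    std::size_t collisions = 0;
    for (unsigned key : keys) {
        std::size_t i = key % bucketCount;
        if (occupied[i]) {
            ++collisions;
        } else {
            occupied[i] = true;
        }
    }
    return collisions;
}
```

With 5 buckets, the keys 7 and 12 both give index 2, so inserting 7, 12 and 3 produces one collision and bucket 2 ends up holding the chain 12 -> 7.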
Major Functions in our Program
- Hash Function
Implementation without Rehashing:
Output:
Manish
Anjali
Vartika
Mayank
GeeksforGeeks
Oops! No data found.
After deletion :
Oops! No data found.
- insertion: Inserts the key-value pair at the head of a linked list which is present at the given bucket index.
- hashFunction: Gives the bucket index for the given key. Our hash function = ASCII value of character * primeNumber^x. The prime number in our case is 31, and x increases from 1 to n for consecutive characters in a key. The per-character terms are added together and reduced modulo the number of buckets to obtain the index.
- deletion: Deletes key-value pair from the hash table for the given key. It deletes the node from the linked list which holds the key-value pair.
- search: Looks up the given key and returns the value stored for it.
- This implementation does not use the rehashing concept. It is a fixed-sized array of linked lists.
- Key and value both are strings in the given example.
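The functions described above could be put together as in the following sketch. It is not the article's original listing: the class name `HashTable` and the method names are assumptions, `search` returns a null pointer where the original program prints a "no data found" message, and the capacity is assumed to be nonzero.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class HashTable {
    struct Node {
        std::string key;
        std::string value;
        Node* next;
    };
    std::vector<Node*> buckets_;  // fixed-size array of chain heads

    // Sum of (character code * 31^x), x = 1..n, modulo the bucket count.
    std::size_t hashFunction(const std::string& key) const {
        const std::size_t capacity = buckets_.size();
        unsigned long long sum = 0, factor = 31 % capacity;
        for (unsigned char ch : key) {
            sum = (sum + (ch % capacity) * factor) % capacity;
            factor = (factor * 31) % capacity;
        }
        return static_cast<std::size_t>(sum);
    }

public:
    explicit HashTable(std::size_t capacity) : buckets_(capacity, nullptr) {}
    HashTable(const HashTable&) = delete;
    HashTable& operator=(const HashTable&) = delete;
    ~HashTable() {
        for (Node* head : buckets_) {
            while (head != nullptr) {
                Node* next = head->next;
                delete head;
                head = next;
            }
        }
    }

    // insertion: the new node becomes the head of the chain at its bucket.
    void insert(const std::string& key, const std::string& value) {
        std::size_t i = hashFunction(key);
        buckets_[i] = new Node{key, value, buckets_[i]};
    }

    // search: pointer to the stored value, or nullptr if the key is absent.
    const std::string* search(const std::string& key) const {
        for (const Node* n = buckets_[hashFunction(key)]; n != nullptr; n = n->next) {
            if (n->key == key) return &n->value;
        }
        return nullptr;
    }

    // deletion: unlinks the first node holding the key; true if one was removed.
    bool remove(const std::string& key) {
        Node** link = &buckets_[hashFunction(key)];
        while (*link != nullptr) {
            if ((*link)->key == key) {
                Node* victim = *link;
                *link = victim->next;
                delete victim;
                return true;
            }
            link = &(*link)->next;
        }
        return false;
    }
};
```

Because insertion always adds at the head without scanning the chain, inserting an existing key again shadows the older entry rather than replacing it. A caller that wants update semantics would search first.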
Time Complexity and Space Complexity:
The time complexity of hash table insertion, search, and deletion is O(1) on average, provided the hash function spreads keys evenly over the buckets and the load factor stays bounded by a constant.
- Time Complexity of Insertion: In the average case it is constant. In the worst case, it is linear.
- Time Complexity of Search: In the average case it is constant. In the worst case, it is linear.
- Time Complexity of Deletion: In the average case it is constant. In the worst case, it is linear.
- Space Complexity: O(n) for n stored elements, in addition to the bucket array itself.
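The average-case bounds can be sketched as follows, under the standard assumption (not stated in this article) that each key is equally likely to land in any of the m buckets, with n stored elements and load factor α:

```latex
\alpha = \frac{n}{m}, \qquad
\mathbb{E}[\text{chain length}] = n \cdot \frac{1}{m} = \alpha, \qquad
T_{\text{search}} = T_{\text{delete}} = O(1 + \alpha)
```

If rehashing keeps α at or below 0.5, the expected cost is O(1). In the worst case every key lands in the same bucket, the chain has length n, and the cost is O(n).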