Program for hashing with chaining
In hashing, a hash function maps keys to some values. But these hash functions may lead to collisions, that is, two or more keys getting mapped to the same value. Separate chaining handles such collisions. The idea is to make each cell of the hash table point to a linked list of records that have the same hash function value.
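For illustration, in C++ such a table can be declared as an array of linked lists; a minimal sketch, assuming an arbitrary bucket count of 7:

```cpp
#include <list>
#include <vector>

int main() {
    const int N = 7;                      // number of buckets (arbitrary choice)
    std::vector<std::list<int>> table(N); // table[i] is the chain for bucket i
    table[27 % N].push_back(27);          // key 27 chains into bucket 27 % 7 = 6
    return 0;
}
```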
Let’s create a hash function such that our hash table has ‘N’ buckets.
To insert a node into the hash table, we first need to find the hash index for the given key, which is calculated using the hash function.
Example: hashIndex = key % noOfBuckets (e.g., key 27 with 7 buckets gives hashIndex 6)
Insert: Move to the bucket corresponding to the above-calculated hash index and insert the new node at the end of the list.
Delete: To delete a node from the hash table, calculate the hash index for the key, move to the bucket corresponding to that index, and search the list in that bucket to find and remove the node with the given key (if found).
Please refer to Hashing | Set 2 (Separate Chaining) for details.
We use std::list in C++, which is internally implemented as a doubly linked list (fast insertion and deletion).
Method – 1 :
This method has no concept of rehashing. It uses a fixed-size array, i.e., a fixed number of buckets.
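Below is a minimal sketch of this method, assuming 7 buckets and the sample keys 15, 11, 27 and 8; the class and function names (Hash, insertItem, displayHash) are illustrative:

```cpp
#include <iostream>
#include <list>
#include <vector>
using namespace std;

class Hash {
    int noOfBuckets;         // fixed number of buckets
    vector<list<int>> table; // each bucket is a chain (linked list)

public:
    Hash(int b) : noOfBuckets(b), table(b) {}

    int hashFunction(int key) { return key % noOfBuckets; }

    // Insert: append the key at the end of its bucket's list
    void insertItem(int key) {
        table[hashFunction(key)].push_back(key);
    }

    // Delete: search the bucket's list and remove the key if present
    void deleteItem(int key) {
        table[hashFunction(key)].remove(key);
    }

    // Print every bucket followed by its chain
    void displayHash() {
        for (int i = 0; i < noOfBuckets; i++) {
            cout << i;
            for (int x : table[i])
                cout << " --> " << x;
            cout << "\n";
        }
    }
};

int main() {
    Hash h(7); // 7 buckets
    for (int key : {15, 11, 27, 8})
        h.insertItem(key);
    h.displayHash();
    return 0;
}
```

Running this sketch prints each bucket and its chain: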
0
1 --> 15 --> 8
2
3
4 --> 11
5
6 --> 27
- Search: O(1 + n/m)
- Delete: O(1 + n/m)
where n = total number of elements in the hash table and m = size of the hash table (number of buckets)
- Here n/m is the load factor.
- The load factor (α) must be kept as small as possible.
- If the load factor increases, the possibility of collisions increases.
- The load factor is a trade-off between space and time.
- Assuming a uniform distribution of keys:
- Expected chain length: O(α)
- Expected time to search: O(1 + α)
- Expected time to insert/delete: O(1 + α)
For example, with n = 100 elements spread over m = 20 buckets, α = 5, so a search is expected to touch about 1 + 5 nodes.
Auxiliary Space: O(1), since no extra space is used.
Method – 2 :
Let’s discuss another method where there is no bound on the number of buckets: the number of buckets grows whenever the load factor exceeds 0.5.
We rehash when the load factor becomes greater than 0.5. In rehashing, we double the size of the array and insert all the existing values into the new, doubled-size array using the hash function. The hash function must also change, since it depends on the number of buckets; after rehashing it therefore behaves differently from before.
- Our hash function is: (ASCII value of character * prime^x) % total number of buckets, where x is the character's position in the string. In this case the prime number is 31.
- Load factor = number of elements in the hash map / total number of buckets.
- Our keys are strings in this case.
- We can write our own hash function, but it must depend on the size of the array, so that rehashing changes its output and the number of collisions is reduced. A sketch is shown after this list.
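Below is a minimal sketch of this method, assuming an initial capacity of 4 buckets; the class name HashMap, the helper names, and the final missing_key lookup are illustrative assumptions, not the original program:

```cpp
#include <iostream>
#include <list>
#include <string>
#include <utility>
#include <vector>
using namespace std;

class HashMap {
    int numBuckets;                          // current number of buckets
    int numElements;                         // number of stored (key, value) pairs
    vector<list<pair<string, int>>> buckets; // each bucket is a chain of pairs

    // Polynomial hash: sum over characters of (ascii * 31^x) % mod,
    // where x is the character's position in the string.
    int hashFunction(const string& key, int mod) {
        long long h = 0, p = 1;
        for (char c : key) {
            h = (h + c * p) % mod;
            p = (p * 31) % mod;
        }
        return (int)h;
    }

    // Rehash: double the bucket count and re-insert every pair;
    // the hash function now takes values mod the new bucket count.
    void rehash() {
        vector<list<pair<string, int>>> old = move(buckets);
        numBuckets *= 2;
        buckets.clear();
        buckets.resize(numBuckets);
        for (auto& chain : old)
            for (auto& kv : chain)
                buckets[hashFunction(kv.first, numBuckets)].push_back(kv);
    }

public:
    HashMap() : numBuckets(4), numElements(0), buckets(4) {}

    void insert(const string& key, int value) {
        buckets[hashFunction(key, numBuckets)].push_back({key, value});
        numElements++;
        if ((double)numElements / numBuckets > 0.5) // load factor check
            rehash();
    }

    void search(const string& key) {
        for (auto& kv : buckets[hashFunction(key, numBuckets)])
            if (kv.first == key) {
                cout << "Value of " << key << " : " << kv.second << "\n";
                return;
            }
        cout << "Oops!! Data not found.\n";
    }
};

int main() {
    HashMap mp;
    mp.insert("GeeksForGeeks", 11);
    mp.insert("ITT", 5);
    mp.insert("Manish", 16);
    mp.insert("Vartika", 14);
    mp.insert("elite_Programmer", 4);
    mp.insert("pluto14", 14);

    for (string key : {"GeeksForGeeks", "ITT", "Manish", "Vartika",
                       "elite_Programmer", "pluto14"})
        mp.search(key);
    mp.search("missing_key"); // a key that was never inserted
    return 0;
}
```

Running this driver on the sample pairs produces the output shown below.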
Value of GeeksForGeeks : 11
Value of ITT : 5
Value of Manish : 16
Value of Vartika : 14
Value of elite_Programmer : 4
Value of pluto14 : 14
Oops!! Data not found.
Complexity analysis of Insert:
- Time Complexity: O(N) in the worst case, because we check the load factor on every insert, and when it exceeds 0.5 we call the rehashing function, which takes O(N) time. (Averaged over many inserts, the doubling makes the cost amortized O(1).)
- Space Complexity: O(N), because rehashing creates a new array of doubled size and copies all the elements into it.
Complexity analysis of Search:
- Time Complexity: O(N) in the worst case, when all N keys land in the same bucket and we must scan that whole linked list; with a good hash function the expected time is O(1 + α).
- Space Complexity: O(1), since searching uses no extra space.