Prerequisites: Hashing Introduction and Collision handling by separate chaining
How hashing works:
For insertion of a key(K) – value(V) pair into a hash map, 2 steps are required:
- K is converted into a small integer (called its hash code) using a hash function.
- The hash code is used to find an index (hashCode % arrSize) and the entire linked list at that index(Separate chaining) is first searched for the presence of the K already.
- If found, it’s value is updated and if not, the K-V pair is stored as a new node in the list.
Complexity and Load Factor
- For the first step, the time taken depends on the K and the hash function.
For example, if the key is a string “abcd”, then it’s hash function may depend on the length of the string. But for very large values of n, the number of entries into the map, and length of the keys is almost negligible in comparison to n so hash computation can be considered to take place in constant time, i.e, O(1). - For the second step, traversal of the list of K-V pairs present at that index needs to be done. For this, the worst case may be that all the n entries are at the same index. So, time complexity would be O(n). But, enough research has been done to make hash functions uniformly distribute the keys in the array so this almost never happens.
- So, on an average, if there are n entries and b is the size of the array there would be n/b entries on each index. This value n/b is called the load factor that represents the load that is there on our map.
- This Load Factor needs to be kept low, so that number of entries at one index is less and so is the complexity almost constant, i.e., O(1).
Rehashing:
Rehashing is the process of increasing the size of a hashmap and redistributing the elements to new buckets based on their new hash values. It is done to improve the performance of the hashmap and to prevent collisions caused by a high load factor.
When a hashmap becomes full, the load factor (i.e., the ratio of the number of elements to the number of buckets) increases. As the load factor increases, the number of collisions also increases, which can lead to poor performance. To avoid this, the hashmap can be resized and the elements can be rehashed to new buckets, which decreases the load factor and reduces the number of collisions.
During rehashing, all elements of the hashmap are iterated and their new bucket positions are calculated using the new hash function that corresponds to the new size of the hashmap. This process can be time-consuming but it is necessary to maintain the efficiency of the hashmap.
Why rehashing?
Rehashing is needed in a hashmap to prevent collision and to maintain the efficiency of the data structure.
As elements are inserted into a hashmap, the load factor (i.e., the ratio of the number of elements to the number of buckets) increases. If the load factor exceeds a certain threshold (often set to 0.75), the hashmap becomes inefficient as the number of collisions increases. To avoid this, the hashmap can be resized and the elements can be rehashed to new buckets, which decreases the load factor and reduces the number of collisions. This process is known as rehashing.
Rehashing can be costly in terms of time and space, but it is necessary to maintain the efficiency of the hashmap.
How Rehashing is done?
Rehashing can be done as follows:
- For each addition of a new entry to the map, check the load factor.
- If it’s greater than its pre-defined value (or default value of 0.75 if not given), then Rehash.
- For Rehash, make a new array of double the previous size and make it the new bucketarray.
- Then traverse to each element in the old bucketArray and call the insert() for each so as to insert it into the new larger bucket array.
Program to implement Rehashing:
#include <iostream> #include <vector> #include <functional> class Map {
private :
class MapNode {
public :
int key;
int value;
MapNode* next;
MapNode( int key, int value) {
this ->key = key;
this ->value = value;
this ->next = NULL;
}
};
// The bucket array where
// the nodes containing K-V pairs are stored
std::vector<MapNode*> buckets;
// No. of pairs stored - n
int size;
// Size of the bucketArray - b
int numBuckets;
// Default loadFactor
double DEFAULT_LOAD_FACTOR = 0.75;
int getBucketInd( int key) {
// Using the inbuilt function from the object class
int hashCode = std::hash< int >()(key);
// array index = hashCode%numBuckets
return (hashCode % numBuckets);
}
public :
Map() {
numBuckets = 5;
buckets.resize(numBuckets);
std::cout << "HashMap created" << std::endl;
std::cout << "Number of pairs in the Map: " << size << std::endl;
std::cout << "Size of Map: " << numBuckets << std::endl;
std::cout << "Default Load Factor : " << DEFAULT_LOAD_FACTOR << std::endl;
}
void insert( int key, int value) {
// Getting the index at which it needs to be inserted
int bucketInd = getBucketInd(key);
// The first node at that index
MapNode* head = buckets[bucketInd];
// First, loop through all the nodes present at that index
// to check if the key already exists
while (head != NULL) {
// If already present the value is updated
if (head->key == key) {
head->value = value;
return ;
}
head = head->next;
}
// new node with the K and V
MapNode* newElementNode = new MapNode(key, value);
// The head node at the index
head = buckets[bucketInd];
// the new node is inserted
// by making it the head
// and it's next is the previous head
newElementNode->next = head;
buckets[bucketInd] = newElementNode;
std::cout << "Pair(" << key << ", " << value << ") inserted successfully." << std::endl;
// Incrementing size
// as new K-V pair is added to the map
size++;
// Load factor calculated
double loadFactor = (1 * size) / numBuckets;
std::cout << "Current Load factor = " << loadFactor << std::endl;
// If the load factor is > 0.75, rehashing is done
if (loadFactor > DEFAULT_LOAD_FACTOR) {
std::cout << loadFactor << " is greater than " << DEFAULT_LOAD_FACTOR << std::endl;
std::cout << "Therefore Rehashing will be done." << std::endl;
// Rehash
rehash();
std::cout << "New Size of Map: " << numBuckets << std::endl;
}
std::cout << "Number of pairs in the Map: " << size << std::endl;
} void rehash() {
std::cout << "\n***Rehashing Started***\n" << std::endl;
// The present bucket list is made temp
std::vector<MapNode*> temp = buckets;
// New bucketList of double the old size is created
buckets.resize(2 * numBuckets);
for ( int i = 0; i < 2 * numBuckets; i++) {
// Initialised to null
buckets[i] = NULL;
}
// Now size is made zero
// and we loop through all the nodes in the original bucket list(temp)
// and insert it into the new list
size = 0;
numBuckets *= 2;
for ( int i = 0; i < temp.size(); i++) {
// head of the chain at that index
MapNode* head = temp[i];
while (head != NULL) {
int key = head->key;
int val = head->value;
// calling the insert function for each node in temp
// as the new list is now the bucketArray
insert(key, val);
head = head->next;
}
}
std::cout << "***Rehashing Done***\n" << std::endl;
} }; int main() {
Map map; // Inserting elements map.insert(1, 1); map.insert(2, 2); map.insert(3, 3); map.insert(4, 4); map.insert(5, 5); map.insert(6, 6); map.insert(7, 7); map.insert(8, 8); map.insert(9, 9); map.insert(10, 10); return 0;
} |
// Java program to implement Rehashing import java.util.ArrayList;
class Map<K, V> {
class MapNode<K, V> {
K key;
V value;
MapNode<K, V> next;
public MapNode(K key, V value)
{
this .key = key;
this .value = value;
next = null ;
}
}
// The bucket array where
// the nodes containing K-V pairs are stored
ArrayList<MapNode<K, V> > buckets;
// No. of pairs stored - n
int size;
// Size of the bucketArray - b
int numBuckets;
// Default loadFactor
final double DEFAULT_LOAD_FACTOR = 0.75 ;
public Map()
{
numBuckets = 5 ;
buckets = new ArrayList<>(numBuckets);
for ( int i = 0 ; i < numBuckets; i++) {
// Initialising to null
buckets.add( null );
}
System.out.println( "HashMap created" );
System.out.println( "Number of pairs in the Map: " + size);
System.out.println( "Size of Map: " + numBuckets);
System.out.println( "Default Load Factor : " + DEFAULT_LOAD_FACTOR + "\n" );
}
private int getBucketInd(K key)
{
// Using the inbuilt function from the object class
int hashCode = key.hashCode();
// array index = hashCode%numBuckets
return (hashCode % numBuckets);
}
public void insert(K key, V value)
{
// Getting the index at which it needs to be inserted
int bucketInd = getBucketInd(key);
// The first node at that index
MapNode<K, V> head = buckets.get(bucketInd);
// First, loop through all the nodes present at that index
// to check if the key already exists
while (head != null ) {
// If already present the value is updated
if (head.key.equals(key)) {
head.value = value;
return ;
}
head = head.next;
}
// new node with the K and V
MapNode<K, V> newElementNode = new MapNode<K, V>(key, value);
// The head node at the index
head = buckets.get(bucketInd);
// the new node is inserted
// by making it the head
// and it's next is the previous head
newElementNode.next = head;
buckets.set(bucketInd, newElementNode);
System.out.println( "Pair(" + key + ", " + value + ") inserted successfully.\n" );
// Incrementing size
// as new K-V pair is added to the map
size++;
// Load factor calculated
double loadFactor = ( 1.0 * size) / numBuckets;
System.out.println( "Current Load factor = " + loadFactor);
// If the load factor is > 0.75, rehashing is done
if (loadFactor > DEFAULT_LOAD_FACTOR) {
System.out.println(loadFactor + " is greater than " + DEFAULT_LOAD_FACTOR);
System.out.println( "Therefore Rehashing will be done.\n" );
// Rehash
rehash();
System.out.println( "New Size of Map: " + numBuckets + "\n" );
}
System.out.println( "Number of pairs in the Map: " + size);
System.out.println( "Size of Map: " + numBuckets + "\n" );
}
private void rehash()
{
System.out.println( "\n***Rehashing Started***\n" );
// The present bucket list is made temp
ArrayList<MapNode<K, V> > temp = buckets;
// New bucketList of double the old size is created
buckets = new ArrayList<MapNode<K, V> >( 2 * numBuckets);
for ( int i = 0 ; i < 2 * numBuckets; i++) {
// Initialised to null
buckets.add( null );
}
// Now size is made zero
// and we loop through all the nodes in the original bucket list(temp)
// and insert it into the new list
size = 0 ;
numBuckets *= 2 ;
for ( int i = 0 ; i < temp.size(); i++) {
// head of the chain at that index
MapNode<K, V> head = temp.get(i);
while (head != null ) {
K key = head.key;
V val = head.value;
// calling the insert function for each node in temp
// as the new list is now the bucketArray
insert(key, val);
head = head.next;
}
}
System.out.println( "\n***Rehashing Ended***\n" );
}
public void printMap()
{
// The present bucket list is made temp
ArrayList<MapNode<K, V> > temp = buckets;
System.out.println( "Current HashMap:" );
// loop through all the nodes and print them
for ( int i = 0 ; i < temp.size(); i++) {
// head of the chain at that index
MapNode<K, V> head = temp.get(i);
while (head != null ) {
System.out.println( "key = " + head.key + ", val = " + head.value);
head = head.next;
}
}
System.out.println();
}
//Function to get an element from Map
public V getValue(K key){
//Get actual index of the key
int actualIndex = getBucketInd(key);
MapNode<K,V> temp = buckets.get(actualIndex);
//Search for key in list
while (temp != null ){
if (temp.key == key){
return temp.value;
}
temp = temp.next;
}
return null ;
}
} public class GFG {
public static void main(String[] args)
{
// Creating the Map
Map<Integer, String> map = new Map<Integer, String>();
// Inserting elements
map.insert( 1 , "Geeks" );
map.printMap();
map.insert( 2 , "forGeeks" );
map.printMap();
map.insert( 3 , "A" );
map.printMap();
map.insert( 4 , "Computer" );
map.printMap();
map.insert( 5 , "Portal" );
map.printMap();
//Get element from Map
int key = 4 ;
String value = map.getValue(key);
System.out.println( "Value at key " + key + " is: " + value);
}
} |
# Python3 program to implement Rehashing class Map :
class MapNode:
def __init__( self ,key,value):
self .key = key
self .value = value
self . next = None
# The bucket array where
# the nodes containing K-V pairs are stored
buckets = list ()
# No. of pairs stored - n
size = 0
# Size of the bucketArray - b
numBuckets = 0
# Default loadFactor
DEFAULT_LOAD_FACTOR = 0.75
def __init__( self ):
Map .numBuckets = 5
Map .buckets = [ None ] * Map .numBuckets
print ( "HashMap created" )
print ( "Number of pairs in the Map: " + str ( Map .size))
print ( "Size of Map: " + str ( Map .numBuckets))
print ( "Default Load Factor : " + str ( Map .DEFAULT_LOAD_FACTOR) + "\n" )
def getBucketInd( self ,key):
# Using the inbuilt function from the object class
hashCode = hash (key)
# array index = hashCode%numBuckets
return (hashCode % Map .numBuckets)
def insert( self ,key,value):
# Getting the index at which it needs to be inserted
bucketInd = self .getBucketInd(key)
# The first node at that index
head = Map .buckets[bucketInd]
# First, loop through all the nodes present at that index
# to check if the key already exists
while (head ! = None ):
# If already present the value is updated
if (head.key = = key):
head.value = value
return
head = head. next
# new node with the K and V
newElementNode = Map .MapNode(key, value)
# The head node at the index
head = Map .buckets[bucketInd]
# the new node is inserted
# by making it the head
# and it's next is the previous head
newElementNode. next = head
Map .buckets[bucketInd] = newElementNode
print ( "Pair(\" {} \", \" {} \") inserted successfully." . format (key,value))
# Incrementing size
# as new K-V pair is added to the map
Map .size + = 1
# Load factor calculated
loadFactor = ( 1 * Map .size) / Map .numBuckets
print ( "Current Load factor = " + str (loadFactor))
# If the load factor is > 0.75, rehashing is done
if (loadFactor > Map .DEFAULT_LOAD_FACTOR):
print ( str (loadFactor) + " is greater than " + str ( Map .DEFAULT_LOAD_FACTOR))
print ( "Therefore Rehashing will be done." )
# Rehash
self .rehash()
print ( "New Size of Map: " + str ( Map .numBuckets))
print ( "Number of pairs in the Map: " + str ( Map .size))
print ( "Size of Map: " + str ( Map .numBuckets))
def rehash( self ):
print ( "\n***Rehashing Started***\n" )
# The present bucket list is made temp
temp = Map .buckets
# New bucketList of double the old size is created
buckets = ( 2 * Map .numBuckets)
for i in range ( 2 * Map .numBuckets):
# Initialised to null
Map .buckets.append( None )
# Now size is made zero
# and we loop through all the nodes in the original bucket list(temp)
# and insert it into the new list
Map .size = 0
Map .numBuckets * = 2
for i in range ( len (temp)):
# head of the chain at that index
head = temp[i]
while (head ! = None ):
key = head.key
val = head.value
# calling the insert function for each node in temp
# as the new list is now the bucketArray
self .insert(key, val)
head = head. next
print ( "\n***Rehashing Ended***" )
def printMap( self ):
# The present bucket list is made temp
temp = Map .buckets
print ( "Current HashMap:" )
# loop through all the nodes and print them
for i in range ( len (temp)):
# head of the chain at that index
head = temp[i]
while (head ! = None ):
print ( "key = \" {} \", val = {}" . format (head.key,head.value))
head = head. next
print ()
if __name__ = = '__main__' :
# Creating the Map
map = Map ()
# Inserting elements
map .insert( 1 , "Geeks" )
map .printMap()
map .insert( 2 , "forGeeks" )
map .printMap()
map .insert( 3 , "A" )
map .printMap()
map .insert( 4 , "Computer" )
map .printMap()
map .insert( 5 , "Portal" )
map .printMap()
# This code is contributed by Amartya Ghosh |
using System;
using System.Collections.Generic;
public class Map {
private class MapNode {
public int key;
public int value;
public MapNode next;
public MapNode( int key, int value)
{
this .key = key;
this .value = value;
this .next = null ;
}
}
private List<MapNode> buckets;
private int size;
private int numBuckets;
private double DEFAULT_LOAD_FACTOR = 0.75;
public Map()
{
numBuckets = 5;
buckets = new List<MapNode>(numBuckets);
for ( int i = 0; i < numBuckets; i++) {
buckets.Add( null );
}
}
private int getBucketInd( int key)
{
int hashCode = key.GetHashCode();
return (hashCode % numBuckets);
}
public void insert( int key, int value)
{
int bucketInd = getBucketInd(key);
MapNode head = buckets[bucketInd];
while (head != null ) {
if (head.key == key) {
head.value = value;
return ;
}
head = head.next;
}
MapNode newElementNode = new MapNode(key, value);
head = buckets[bucketInd];
newElementNode.next = head;
buckets[bucketInd] = newElementNode;
size++;
double loadFactor = (1.0 * size) / numBuckets;
if (loadFactor > DEFAULT_LOAD_FACTOR) {
Console.WriteLine(loadFactor
+ " is greater than "
+ DEFAULT_LOAD_FACTOR);
Console.WriteLine(
"Therefore Rehashing will be done." );
rehash();
Console.WriteLine( "New Size of Map: "
+ numBuckets);
}
Console.WriteLine( "Number of pairs in the Map: "
+ size);
}
private void rehash()
{
Console.WriteLine( "\n***Rehashing Started***\n" );
// The present bucket list is made temp
List<MapNode> temp = buckets;
// New bucketList of double the old size is created
numBuckets *= 2;
buckets = new List<MapNode>(numBuckets);
for ( int i = 0; i < numBuckets; i++) {
buckets.Add( null );
}
// Now size is made zero
// and we loop through all the nodes in the original
// bucket list(temp) and insert it into the new list
size = 0;
for ( int i = 0; i < temp.Count; i++) {
// head of the chain at that index
MapNode head = temp[i];
while (head != null ) {
int key = head.key;
int val = head.value;
// calling the insert function for each node
// in temp as the new list is now the
// bucketArray
insert(key, val);
head = head.next;
}
}
Console.WriteLine( "***Rehashing Done***\n" );
}
} class Program {
static void Main( string [] args)
{
Map map = new Map();
// Inserting elements
map.insert(1, 1);
map.insert(2, 2);
map.insert(3, 3);
map.insert(4, 4);
map.insert(5, 5);
map.insert(6, 6);
map.insert(7, 7);
map.insert(8, 8);
map.insert(9, 9);
map.insert(10, 10);
}
} //This Code is Contributed by NarasingaNikhil |
// javascript program to implement Rehashing class Map { constructor() {
// The bucket array where
// the nodes containing K-V pairs are stored
this .buckets = [];
this .numBuckets = 5;
// No. of pairs stored - n
this .size = 0;
// Default loadFactor
this .DEFAULT_LOAD_FACTOR = 0.75;
for (let i = 0; i < this .numBuckets; i++) {
this .buckets.push( null );
}
}
getBucketInd(key) {
//Using the inbuilt function from the object class
let hashCode = key.toString().hashCode();
// array index = hashCode%numBuckets
return (hashCode % this .numBuckets);
}
insert(key, value) {
//Getting the index at which it needs to be inserted
let bucketInd = this .getBucketInd(key);
// The first node at that index
let head = this .buckets[bucketInd];
// First, loop through all the nodes present at that index
// to check if the key already exists
while (head) {
//If already present the value is updated
if (head.key == key) {
head.value = value;
return ;
}
head = head.next;
}
//new node with the K and V let newElementNode = new MapNode(key, value);
//The head node at the index
head = this .buckets[bucketInd];
// the new node is inserted
// by making it the head
// and it's next is the previous head
newElementNode.next = head;
this .buckets[bucketInd] = newElementNode;
this .size++;
let loadFactor = (1.0 * this .size) / this .numBuckets;
if (loadFactor > this .DEFAULT_LOAD_FACTOR) {
console.log(loadFactor
+ " is greater than "
+ this .DEFAULT_LOAD_FACTOR);
//If the load factor is > 0.75, rehashing is done
console.log(
"Therefore Rehashing will be done." );
//Rehash this .rehash();
console.log( "New Size of Map: "
+ this .numBuckets);
}
console.log( "Number of pairs in the Map: "
+ this .size);
}
rehash() {
console.log( "\n***Rehashing Started***\n" );
//he present bucket list is made temp let temp = this .buckets;
// New bucketList of double the old size is created
this .numBuckets *= 2;
//Initialised to null
this .buckets = [];
for (let i = 0; i < this .numBuckets; i++) {
this .buckets.push( null );
}
//Now size is made zero
// and we loop through all the nodes in the original bucket list(temp)
// and insert it into the new list
this .size = 0;
for (let i = 0; i < temp.length; i++) {
let head = temp[i];
while (head) {
let key = head.key;
let val = head.value;
//calling the insert function for each node in temp // as the new list is now the bucketArray
this .insert(key, val);
head = head.next;
}
}
console.log( "***Rehashing Done***\n" );
}
} class MapNode { constructor(key, value) {
this .key = key;
this .value = value;
this .next = null ;
}
} String.prototype.hashCode = function () {
let hash = 0;
if ( this .length == 0) {
return hash;
}
for (let i = 0; i < this .length; i++) {
let char = this .charCodeAt(i);
hash = ((hash << 5) - hash) + char;
hash = hash & hash; // Convert to 32bit integer
}
return hash;
} let map = new Map();
// Inserting elements map.insert(1, 1); map.insert(2, 2); map.insert(3, 3); map.insert(4, 4); map.insert(5, 5); map.insert(6, 6); map.insert(7, 7); map.insert(8, 8); map.insert(9, 9); map.insert(10, 10); //Code is contributed by NarasingaNikhil |
HashMap created Number of pairs in the Map: 0 Size of Map: 5 Default Load Factor : 0.75 Pair(1, Geeks) inserted successfully. Current Load factor = 0.2 Number of pairs in the Map: 1 Size of Map: 5 Current HashMap: key = 1, val = Geeks Pair(2, forGeeks) inserted successfully. Current Load factor = 0.4 Number of pairs in the Map: 2 Size of Map: 5 Current HashMap: key = 1, val = Geeks key = 2, val = forGeeks Pair(3, A) inserted successfully. Current Load factor = 0.6 Number of pairs in the Map: 3 Size of Map: 5 Current HashMap: key = 1, val = Geeks key = 2, val = forGeeks key = 3, val = A Pair(4, Computer) inserted successfully. Current Load factor = 0.8 0.8 is greater than 0.75 Therefore Rehashing will be done. ***Rehashing Started*** Pair(1, Geeks) inserted successfully. Current Load factor = 0.1 Number of pairs in the Map: 1 Size of Map: 10 Pair(2, forGeeks) inserted successfully. Current Load factor = 0.2 Number of pairs in the Map: 2 Size of Map: 10 Pair(3, A) inserted successfully. Current Load factor = 0.3 Number of pairs in the Map: 3 Size of Map: 10 Pair(4, Computer) inserted successfully. Current Load factor = 0.4 Number of pairs in the Map: 4 Size of Map: 10 ***Rehashing Ended*** New Size of Map: 10 Number of pairs in the Map: 4 Size of Map: 10 Current HashMap: key = 1, val = Geeks key = 2, val = forGeeks key = 3, val = A key = 4, val = Computer Pair(5, Portal) inserted successfully. Current Load factor = 0.5 Number of pairs in the Map: 5 Size of Map: 10 Current HashMap: key = 1, val = Geeks key = 2, val = forGeeks key = 3, val = A key = 4, val = Computer key = 5, val = Portal
The time complexity of the insert operation is O(1) and the
Auxiliary space : O(n).
The time complexity of the rehash operation is O(n) and the
Auxiliary space: O(n).