We strongly recommend reading the post below as a prerequisite for this one.

Hashing | Set 1 (Introduction)

**What is Collision?**

A hash function maps a key, which may be a big integer or a string, to a small number, so two different keys can produce the same value. A collision occurs when a newly inserted key maps to a slot in the hash table that is already occupied; it must be handled using some collision handling technique.

**What are the chances of collisions with a large table?**

Collisions are very likely even if we have a big table to store keys. An important observation here is the Birthday Paradox: with only 23 people, the probability that at least two of them share a birthday is about 50%, assuming 365 equally likely birthdays.
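To make the birthday figure concrete, here is a minimal sketch that computes the chance of at least one collision, assuming every key is equally likely to land in any slot. The function name `collisionProbability` is illustrative and does not come from any library.

```cpp
#include <cassert>

// Probability that at least two of `keys` uniformly random keys fall into the
// same one of `slots` slots. With slots == 365 this is the birthday problem.
// Assumption: every slot is equally likely for every key.
double collisionProbability(int keys, int slots) {
    assert(keys >= 0 && slots > 0);
    if (keys > slots) return 1.0;  // pigeonhole: a collision is certain
    double allDistinct = 1.0;
    for (int i = 0; i < keys; ++i) {
        // The (i + 1)-th key must avoid the i slots that are already taken.
        allDistinct *= static_cast<double>(slots - i) / slots;
    }
    return 1.0 - allDistinct;
}
```

Calling `collisionProbability(23, 365)` gives a value just above 0.5, while 22 people stay just below it. The same function can be applied to a hash table by passing the number of keys and the number of slots.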

**How to handle Collisions?**

There are mainly two methods to handle collisions:

1) Separate Chaining

2) Open Addressing

In this article, only separate chaining is discussed. Open addressing is discussed in the next post.

**Separate Chaining:**

The idea is to make each cell of the hash table point to a linked list of records that have the same hash function value.

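Let us consider a simple hash function, “**key mod 7**”, and the sequence of keys 50, 700, 76, 85, 92, 73, 101. These keys hash to slots 1, 0, 6, 1, 1, 3 and 3 respectively, so 50, 85 and 92 end up in the chain at slot 1, and 73 and 101 end up in the chain at slot 3.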

**C++ program for hashing with chaining**
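The following is a minimal sketch of separate chaining for integer keys, not a definitive implementation. It assumes the hash function is key mod m, as in the example above; the class name `ChainedHashTable` and its member names are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Illustrative hash table of ints that resolves collisions by chaining.
class ChainedHashTable {
public:
    explicit ChainedHashTable(int bucketCount)
        : buckets_(static_cast<std::size_t>(bucketCount)) {
        assert(bucketCount > 0);
    }

    // Hash function: key mod m. The extra "+ m" keeps the result
    // non-negative for negative keys (an addition beyond the article).
    int slotFor(int key) const {
        int m = static_cast<int>(buckets_.size());
        return ((key % m) + m) % m;
    }

    // Append the key to the chain of its slot; duplicates are not filtered.
    void insert(int key) { buckets_[slotFor(key)].push_back(key); }

    // Walk only the chain of the key's slot.
    bool contains(int key) const {
        for (int stored : buckets_[slotFor(key)]) {
            if (stored == key) return true;
        }
        return false;
    }

    // Remove the first occurrence of the key; returns false if it is absent.
    bool remove(int key) {
        std::list<int>& chain = buckets_[slotFor(key)];
        for (auto it = chain.begin(); it != chain.end(); ++it) {
            if (*it == key) {
                chain.erase(it);
                return true;
            }
        }
        return false;
    }

    std::size_t chainLength(int slot) const { return buckets_[slot].size(); }

private:
    std::vector<std::list<int>> buckets_;  // one linked list per slot
};
```

Constructing `ChainedHashTable table(7)` and inserting 50, 700, 76, 85, 92, 73, 101 leaves three keys in the chain at slot 1 and two in the chain at slot 3. `std::list` is used here to mirror the linked-list description; other containers would also work.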

**Advantages:**

1) Simple to implement.

2) The hash table never fills up; we can always add more elements to a chain.

3) Less sensitive to the hash function and load factor.

4) It is mostly used when it is unknown how many keys will be inserted or deleted, and how frequently.

**Disadvantages:**

1) Cache performance of chaining is poor because keys are stored in linked lists. Open addressing provides better cache performance because everything is stored in the same table.

2) Wastage of space (some parts of the hash table are never used).

3) If a chain becomes long, search time can become O(n) in the worst case.

4) Uses extra space for links.

**Performance of Chaining:**

Performance of hashing can be evaluated under the assumption that each key is equally likely to be hashed to any slot of the table (simple uniform hashing).

m = Number of slots in the hash table

n = Number of keys to be inserted in the hash table

Load factor α = n/m

Expected time to search = O(1 + α)

Expected time to insert/delete = O(1 + α)

Time complexity of search, insert and delete is O(1) if α is O(1)
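The quantities above can be measured on a concrete key set. The sketch below assumes non-negative integer keys and the hash function key mod m; `ChainStats` and `measureChains` are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct ChainStats {
    double loadFactor;         // alpha = n / m
    std::size_t longestChain;  // most keys a single search may have to scan
};

// Hash every key with key mod m and report the load factor together with
// the length of the longest chain. Assumption: keys are non-negative.
ChainStats measureChains(const std::vector<int>& keys, std::size_t slots) {
    assert(slots > 0);
    std::vector<std::size_t> chainLength(slots, 0);
    for (int key : keys) {
        assert(key >= 0);
        ++chainLength[static_cast<std::size_t>(key) % slots];
    }
    ChainStats stats;
    stats.loadFactor =
        static_cast<double>(keys.size()) / static_cast<double>(slots);
    stats.longestChain =
        *std::max_element(chainLength.begin(), chainLength.end());
    return stats;
}
```

For the keys 50, 700, 76, 85, 92, 73, 101 with m = 7, the load factor is 1 and the longest chain holds 3 keys. For the keys 7, 14, 21, 28, 35, 42, 49 the load factor is also 1, yet every key lands in slot 0 and the single chain holds all 7 keys. This shows that O(1 + α) is an expected cost under simple uniform hashing, not a worst-case guarantee.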

**Next Post:**

Open Addressing for Collision Handling

**References:**

http://courses.csail.mit.edu/6.006/fall09/lecture_notes/lecture05.pdf
