Difference between Nested Loop Join and Hash Join

1. Nested Loop Join :
This is a type of physical join algorithm that is used in case of joining 2 relations. This join is an internal join technique, meaning that we cannot see the join. This is the simplest of all types of joins. This is the best-suited algorithm for small data and smaller transactions. In the case of 2 relations named R and S, the algorithm for the Nested Loop join would be as follows :

For each record x of R read in, do
Use the index on B for S
Get all the matching records (having B=x.A)

2. Hash Join :
Hash Join is also a type of physical join algorithm that is used in case of joining two tables internally. The join being an internal join technique means that we cannot see the join. The join selection is done automatically by the query optimizer. The hash join is performed using two steps, the build, and probe. In the case of 2 relations named R and S, the algorithm for Hash join would be as follows :

Hash records of R, one by one, using A values
(Use the same M buckets and same hash function h)
Hash matching pair of records into the same bucket

Difference between Nested Loop Join and Hash Join :

S.No. Nested Loop Join Hash Join
1. It is processed by forming an outer loop within an inner loop after which the inner loop is individually processed for the fewer entries that it has. It is specifically used in case of joining of larger tables.
2. The nested join has the least performance in case of large tables. It has best performance in case of large and sorted and non-indexed inputs.
3. There are two phases in this, outer and inner loop processing. The two phases in this are build and probe.
4. Steps involved include identifying an outer driving table and assigning the inner table to it, and processing the rows of inner table for every outer table row. The steps involved are building a Hash table on a small table. It is used to probe the hash value of the Hash table is applicable for each element in the second row.
5. Index range scan is done here. Full-table scan of the smaller table is done in case of hash join.
6. This uses lesser RAM resources. It uses more RAM resources.
7. It is the most common type of join. It is not as common as the nested loop join.
8. Least number of comparisons are required in case of nested loop join. It needs more comparisons than the nested loop join thereby using more RAM resources.
9. It is the fastest join algorithm due to least number of comparisons. It is not as fast due to more number of comparisons.
10. It is better than all other types of join for small transactions and small data. It is not as good as nested loop join in case of smaller data.
11. It is of three types, namely, nested loop join, indexed nested loop join and Temporary indexed nested loop join. Its types are classic hash join, Grace hash join, hybrid hash join, hash anti join, hash semi-join, recursive hash join and hash bailout.
12. It is not automatically selected. This join is automatically selected in case there is no specific reason to adopt other types of join algorithms. It is also known as the go-to guy of all the join operators.

Attention reader! Don’t stop learning now. Get hold of all the important CS Theory concepts for SDE interviews with the CS Theory Course at a student-friendly price and become industry ready.

My Personal Notes arrow_drop_up

Check out this Author's contributed articles.

If you like GeeksforGeeks and would like to contribute, you can also write an article using contribute.geeksforgeeks.org or mail your article to contribute@geeksforgeeks.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.

Please Improve this article if you find anything incorrect by clicking on the "Improve Article" button below.

Article Tags :
Practice Tags :

Be the First to upvote.

Please write to us at contribute@geeksforgeeks.org to report any issue with the above content.