Difference between Bloom filters and Hashtable
Hashtable is designed to use a special function called the Hash function which is used to map a given value with a particular key for faster access of elements. It is used where fast lookups are required.(Under reasonable assumptions, average time for element lookup in a hash table is O(1) ). Dictionary in Python is implemented using HashTables. Java also implements HashTable class.
Some applications of hashing can be found here.
A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. It is used where we just need to know the element belongs to the object or not. A bloom filter uses k hash functions and array of n bits, where array bit set to 0, means element doesn’t exist and 1 indicates that element is present. Some applications of bloom filters are:
- Google Bigtable, Apache HBase and Apache Cassandra and PostgreSQL use Bloom filters to reduce the disk lookups for non-existent rows or columns. Avoiding costly disk lookups considerably increases the performance of a database query operation.
- The Google Chrome web browser used to use a Bloom filter to identify malicious URLs. Any URL was first checked against a local Bloom filter, and only if the Bloom filter returned a positive result was a full check of the URL performed (and the user warned, if that too returned a positive result).
Let’s see the difference between hashtables and bloom filters:
S No. Hash Tables Bloom Filters 1 In hash table the object gets stored to the bucket(index position in the hashtable) the hash function maps to. Bloom filters doesn’t store the associated object. It just tells whether it is there in the bloom filter or not. 2 Hash tables are less space efficient. Bloom filters are more space efficient. it’s size is even the less than the associated object which it is mapping. 3 Supports deletions. It is not possible to delete elements from bloom filters. 4 Hashtables give accurate results. Bloom filters have small false positive probability. ( False positive means it might be in bloom filter but actually it is not.) 5 In a hashtable either we should implement multiple hash functions or have a strong hash function to minimize collisions. A bloom filter uses many hash functions. There is no need to handle collisions. 6 Hashtables are used in compiler operations, programming languages(hash table based data structures),password verification, etc. Bloom filters find application in network routers, in web browsers(to detect the malicious urls), in password checkers(to not a set a weak or guessable or list of forbidden passwords), etc.
HashTables and bloom filters are closely related to each other, therefore, it is wise to compare these two data structures and use them wisely as per your application/need demands.