Utility of Hashing In Recent Technologies

Last Updated : 18 Oct, 2021

In one sentence, hashing is the transformation of a key (the input) into a different value. A function, called a hash function, takes the original data as input and performs some calculations on it so that the result bears no obvious resemblance to the original. The output is called the hash code (or simply the hash). We can then use this hash to locate the data quickly instead of going over the whole storage.
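To make the idea of hash-based lookup concrete, here is a minimal sketch of a hash table with chaining. The names `simple_hash` and `HashTable`, the multiplier 31, and the bucket count are illustrative assumptions, not something the article prescribes. The hash of a key picks one bucket, so a lookup only scans that bucket rather than every stored item.

```python
def simple_hash(key, num_buckets):
    """Toy polynomial hash (assumed for illustration): maps a string to a bucket index."""
    total = 0
    for ch in key:
        total = (total * 31 + ord(ch)) % num_buckets
    return total


class HashTable:
    """Minimal hash table using separate chaining (a list per bucket)."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def put(self, key, value):
        bucket = self.buckets[simple_hash(key, len(self.buckets))]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        # Only the one bucket selected by the hash is searched.
        bucket = self.buckets[simple_hash(key, len(self.buckets))]
        for existing_key, value in bucket:
            if existing_key == key:
                return value
        return default
```

In practice we would normally use a built-in such as Python's `dict`, which is based on the same principle.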

Benefits of Hashing:

Hashing is useful when we need to know whether two files are the same without comparing their contents by hand. When a file is transferred, the user can compute the hash of the received copy and compare it with the hash of the original. If the two values are equal, the file was almost certainly not corrupted or changed in transit. Hashing also gives us a fast and flexible way to retrieve data: a hash table can look up a key in constant time on average, which most other data structures cannot match.
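The file comparison described above can be sketched as follows. This is a minimal example that assumes SHA-256 as the hash function. The helper names `file_digest` and `same_content` and the chunk size are hypothetical choices made for illustration.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_content(path_a, path_b):
    """True if both files produce the same digest."""
    return file_digest(path_a) == file_digest(path_b)
```

Equal digests mean the files are identical with overwhelming probability, while different digests prove the files differ.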

We can see various applications of hashing in our day-to-day lives. Let's look into a few applications of hashing:

Blockchain

The foundation of cryptocurrency is the blockchain, which is a chain of individual blocks of transaction data. When we are dealing with a cryptocurrency (for example, Bitcoin), the details of the transactions are passed to a hashing algorithm (SHA-256). The result is a hash value of constant length from which we cannot guess what the actual data was. Each block also stores the hash of the previous block, so the blocks are linked together. If we change the data in a particular block, its hash changes and no longer matches the value stored in the next block, so every block after it would have to be recomputed as well. This is what makes a blockchain effectively immutable and its data trustworthy.
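The linking of blocks can be illustrated with a toy hash chain. This is only a sketch under simplifying assumptions: each block holds a single string of data, the first block points at an all-zero hash, and the names `block_hash`, `build_chain` and `is_valid` are made up for this example. Real blockchains add much more, such as proof of work and Merkle trees.

```python
import hashlib

GENESIS_PREV = "0" * 64  # assumed placeholder "previous hash" for the first block


def block_hash(prev_hash, data):
    """SHA-256 over the previous block's hash concatenated with this block's data."""
    return hashlib.sha256((prev_hash + data).encode("utf-8")).hexdigest()


def build_chain(transactions):
    """Build a list of blocks, each storing the hash of the block before it."""
    chain = []
    prev = GENESIS_PREV
    for data in transactions:
        h = block_hash(prev, data)
        chain.append({"prev_hash": prev, "data": data, "hash": h})
        prev = h
    return chain


def is_valid(chain):
    """Check every link and every stored hash in the chain."""
    prev = GENESIS_PREV
    for block in chain:
        if block["prev_hash"] != prev:
            return False  # link to the previous block is broken
        if block_hash(block["prev_hash"], block["data"]) != block["hash"]:
            return False  # block data no longer matches its stored hash
        prev = block["hash"]
    return True
```

If we edit the data of a middle block, `is_valid` fails at that block. If we also recompute that block's hash, it fails at the next block instead, because the stored `prev_hash` there no longer matches.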

AWS S3 Bucket

Amazon S3 can use the MD5 hashing algorithm to validate uploads. When we upload a file and supply its MD5 digest (in the Content-MD5 header), S3 computes the MD5 of the data it received and compares the two values. If the validation fails, the data received is not what the user sent, so S3 returns an error message and the object is not stored.

Suppose a different user needs a file from the bucket. That user first looks up the object's ETag in S3, then computes the MD5 value of the downloaded file and checks whether the two match. Comparing these values tells us whether the object was tampered with in transit or was not completely downloaded. Note that the ETag is a plain MD5 digest only in some cases; for example, objects uploaded with multipart upload have an ETag that is not a simple MD5 of the content.

Thus, hashing lets us verify that our file was not tampered with or corrupted while being uploaded or downloaded.
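The download-side check can be sketched locally as below. This is a minimal sketch that assumes the ETag really is the MD5 hex digest of the object, which holds only for some objects (not for multipart uploads, for instance). It does not call any AWS API; the helpers `md5_hex` and `matches_etag` are hypothetical, and the ETag string is assumed to have been fetched already.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """MD5 hex digest of the downloaded bytes."""
    return hashlib.md5(data).hexdigest()


def matches_etag(data: bytes, etag: str) -> bool:
    """Compare the local MD5 with an ETag value.

    ETag values are returned wrapped in double quotes, so strip them first.
    Assumption: the ETag is a plain MD5 digest of the object.
    """
    return md5_hex(data) == etag.strip('"').lower()
```

MD5 is adequate here for catching accidental corruption, but it is not collision resistant, so it should not be relied on against a deliberate attacker.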

Hashing Passwords

When a user signs up for an account and chooses a password, the password is not stored as plain text. Instead, it is converted to a hash, a string with no apparent connection to the password, and that hash is stored in the database. Why shouldn't we store the password as text? Because if an attacker gets access to the database, they would also get the real password, which the user may have used for other accounts (the reason why most websites warn against using the same password everywhere). When a user wants to log in, the password they enter is hashed and compared with the stored hash to verify that it is correct.
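The sign-up and log-in flow can be sketched as follows. This example goes slightly beyond the description above: it assumes a random per-user salt and the PBKDF2 key-derivation function from Python's standard library, which is a common practice but not something the article specifies. The function names and the iteration count are illustrative assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor, chosen only for illustration


def hash_password(password, salt=None):
    """Return (salt, digest) for a password; generates a fresh salt at sign-up."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the entered password with the stored salt and compare at log-in."""
    _, digest = hash_password(password, salt)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(digest, expected_digest)
```

Only the salt and the digest are stored in the database, so a leaked table does not directly reveal the passwords.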

Tokenization

Credit card tokenization replaces the customer's card data with an alphanumeric ID (the token) that has no intrinsic value and no visible relation to the account's owner. Tokens themselves do not contain any useful consumer data. Instead, a token refers to the location where the user's data is stored at the customer's bank. For instance, when the token is derived by hashing the card number, we can check whether a token corresponds to a card simply by comparing the hash of the card with the token. Also, the token cannot be reversed, because no table is stored that holds both the card and the token.

Rabin-Karp Algorithm

Michael O. Rabin and Richard M. Karp formulated a way of doing string search using hashing. Its average time complexity is O(M + N), where N is the length of the text and M is the length of the pattern, compared to O(MN) for the naive approach (the worst case of Rabin-Karp is still O(MN) when many hashes collide). The algorithm works with two hash values: one for the pattern, computed once, and one for a window of the text with the same length as the pattern. We keep shifting this window towards the end of the text, updating its hash at each step from the previous value rather than recomputing it from scratch. Whenever the two hashes match, we still need to compare the actual strings, because different strings may have the same hash. We continue this until we reach the end of the text.
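The procedure above can be sketched as follows. This is a minimal version that assumes a polynomial rolling hash. The `base` and `mod` values and the function name `rabin_karp` are illustrative choices, not part of the original description.

```python
def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
    """Return the start indices of every occurrence of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Weight of the leading character in a window of length m.
    high = pow(base, m - 1, mod)

    # Hash of the pattern and of the first window of the text.
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod

    matches = []
    for i in range(n - m + 1):
        # Equal hashes may be a collision, so confirm with a real comparison.
        if p_hash == t_hash and text[i:i + m] == pattern:
            matches.append(i)
        if i < n - m:
            # Roll the window: drop text[i], append text[i + m].
            t_hash = ((t_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return matches
```

For example, `rabin_karp("abracadabra", "abra")` returns `[0, 7]`. Updating the window hash takes constant time per shift, which is where the O(M + N) average running time comes from.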
