How to Generate MD5 Checksum for Files in Java?

  • Difficulty Level : Medium
  • Last Updated : 29 Jul, 2021

A checksum (often referred to as a hash) is an alphanumeric value, i.e. a sequence of letters and digits, that is computed from the contents of a file and in practice identifies those contents. Checksums are generally used to check the integrity of files downloaded from an external source. If you know the checksum of the original version, you can run a checksum utility on your copy to confirm that it is identical. For example, before backing up your files you can generate their checksums, and then verify them again once you download the files to another device. The checksum will be different if a file has been corrupted or altered in the process.

MD5 and the SHA family are the most widely used checksum algorithms. When verifying a checksum, you must use the same algorithm that was used to generate it. For example, the MD5 checksum of a file is completely different from its SHA-256 checksum.


To produce a checksum, you run a program that puts that file through an algorithm. Typical algorithms used for this include MD5, SHA-1, SHA-256, and SHA-512.

Each of these algorithms is a cryptographic hash function that takes an input of any size and generates a fixed-length value, usually written as a hexadecimal string, regardless of the size of the file.
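As a rough illustration of the fixed-length property, the sketch below hashes a three-byte input and a one-million-byte input with each algorithm and prints the length of the resulting hexadecimal string. It uses Java's MessageDigest class, which this article introduces further down. The class name DigestLengthDemo and the helper hexDigest are hypothetical names chosen for this example, and it assumes the JDK in use ships all four algorithms, as standard OpenJDK builds do.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class DigestLengthDemo {

    // Hypothetical helper: hash the given bytes with the named
    // algorithm and return the result as lowercase hexadecimal
    static String hexDigest(String algorithm, byte[] data)
        throws NoSuchAlgorithmException
    {
        byte[] hash = MessageDigest.getInstance(algorithm).digest(data);
        StringBuilder sb = new StringBuilder();
        for (byte b : hash) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args)
        throws NoSuchAlgorithmException
    {
        byte[] small = "abc".getBytes(StandardCharsets.UTF_8);
        byte[] large = new byte[1_000_000]; // one million zero bytes

        String[] algorithms = { "MD5", "SHA-1", "SHA-256", "SHA-512" };
        for (String algorithm : algorithms) {
            // the two lengths are the same for a given algorithm
            System.out.println(algorithm + ": "
                + hexDigest(algorithm, small).length() + " / "
                + hexDigest(algorithm, large).length()
                + " hex characters");
        }
    }
}
```

With a standard JDK this should report 32 hex characters for MD5, 40 for SHA-1, 64 for SHA-256 and 128 for SHA-512, for both inputs.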



NOTE:

  1. Even small changes in the file will produce a different checksum.
  2. These cryptographic hash functions are not flawless, though. Security researchers have discovered “collisions” in the MD5 and SHA-1 functions: pairs of different files that produce the same MD5 or SHA-1 hash. This is highly unlikely to happen by mere accident, but an attacker may use the technique to disguise a malicious file as a valid one.
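The first note can be seen with a small experiment. The sketch below hashes two strings that differ only by a trailing full stop; the class name SmallChangeDemo and the helper md5Hex are hypothetical names used only for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class SmallChangeDemo {

    // Hypothetical helper: MD5 of a string's UTF-8 bytes as hex
    static String md5Hex(String text) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] hash = md.digest(text.getBytes(StandardCharsets.UTF_8));
        StringBuilder sb = new StringBuilder();
        for (byte b : hash) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args)
        throws NoSuchAlgorithmException
    {
        String original = "The quick brown fox jumps over the lazy dog";
        String altered = original + "."; // one extra character

        // prints 9e107d9d372bb6826bd81d3542a419d6
        System.out.println(md5Hex(original));

        // prints e4d909c290d0fb1ca068ffaddf22cbd0
        System.out.println(md5Hex(altered));
    }
}
```

The two checksums share no obvious resemblance even though the inputs differ by a single character, which is what makes a checksum useful for detecting corruption.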

Generating Checksum in Java

Java provides built-in support for these hash functions through the MessageDigest class in the java.security package. Message digests are secure one-way hash functions that take data of arbitrary size and produce a hash value of fixed length. Note that hashing is not encryption: a digest cannot be reversed to recover the original data.

  • We first instantiate a MessageDigest object by passing the name of a valid hashing algorithm to MessageDigest.getInstance().
  • We then update this object until the complete file has been read. We could instead call digest(byte[] input), which performs a final update with the whole file at once. However, if the file is very large we might not have enough memory to hold it as a byte array, and this could result in java.lang.OutOfMemoryError: Java heap space.
  • So it is better to read the data in parts and update the MessageDigest with each part.
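The two approaches in the list above can be compared side by side. The following is a minimal sketch, assuming a small temporary file created just for the demonstration; the class name ChunkedDigestDemo and both helper methods are hypothetical. It shows that feeding the file to update() in chunks yields the same digest as passing the whole byte array to digest(byte[]), while only ever holding one buffer in memory.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

class ChunkedDigestDemo {

    // Reads the entire file into memory first; simple, but the
    // whole file must fit in the heap
    static byte[] digestWholeFile(Path path)
        throws IOException, NoSuchAlgorithmException
    {
        MessageDigest md = MessageDigest.getInstance("MD5");
        return md.digest(Files.readAllBytes(path));
    }

    // Reads the file 8 KB at a time, so memory use stays constant
    static byte[] digestInChunks(Path path)
        throws IOException, NoSuchAlgorithmException
    {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(path)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read);
            }
        }
        return md.digest();
    }

    public static void main(String[] args)
        throws IOException, NoSuchAlgorithmException
    {
        Path temp = Files.createTempFile("checksum-demo", ".bin");
        try {
            Files.write(temp, new byte[100_000]);
            // prints true: both approaches give the same digest
            System.out.println(Arrays.equals(
                digestWholeFile(temp), digestInChunks(temp)));
        } finally {
            Files.deleteIfExists(temp);
        }
    }
}
```

The buffer size of 8192 bytes is an arbitrary choice for this sketch; any positive size produces the same digest.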

Once all updates are complete, one of the digest methods is called to complete the hash computation. Whenever a digest method is called, the MessageDigest object is reset to its initial state. The digest method returns the hash as an array of raw bytes, so we convert each byte into its two-digit hexadecimal representation. The resulting string is the checksum.
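Both points in the paragraph above, the byte-to-hexadecimal conversion and the reset performed by digest(), can be checked with a short sketch. The class name HexAndResetDemo and the helper toHex are hypothetical, and String.format is used here as one possible alternative way to produce two hex digits per byte.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class HexAndResetDemo {

    // Hypothetical helper: two lowercase hex digits per byte.
    // Java bytes are signed, so "& 0xff" maps -1 to 255 ("ff")
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args)
        throws NoSuchAlgorithmException
    {
        // leading zeros are kept: prints 0aff00
        System.out.println(toHex(new byte[] { 10, -1, 0 }));

        MessageDigest md = MessageDigest.getInstance("MD5");
        md.update("abc".getBytes(StandardCharsets.UTF_8));

        // MD5 of "abc": prints 900150983cd24fb0d6963f7d28e17f72
        System.out.println(toHex(md.digest()));

        // digest() reset the object, so this is the MD5 of empty
        // input: prints d41d8cd98f00b204e9800998ecf8427e
        System.out.println(toHex(md.digest()));
    }
}
```

Because of the reset, the same MessageDigest object can be reused for another file after digest() has been called.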

Example:

// Java program to Generate MD5 Checksum for Files
 
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
 
public class GFG {
 
    // main declares NoSuchAlgorithmException because
    // MessageDigest.getInstance() throws it when the given
    // string does not name a supported hashing algorithm
   
    public static void main(String[] args)
        throws IOException, NoSuchAlgorithmException
    {
 
        // create a file object referencing any file from
        // the system of which checksum is to be generated
        File file = new File("C:\\Users\\Raghav\\Desktop\\GFG.txt");
 
        // instantiate a MessageDigest Object by passing
        // string "MD5" this means that this object will use
        // MD5 hashing algorithm to generate the checksum
        MessageDigest mdigest = MessageDigest.getInstance("MD5");
 
        // Get the checksum
        String checksum = checksum(mdigest, file);
 
        // print out the checksum
        System.out.println(checksum);
    }
 
    // this method returns the complete hash of the given
    // file as a hexadecimal string
    private static String checksum(MessageDigest digest,
                                   File file)
        throws IOException
    {
        // Create byte array to read data in chunks
        byte[] byteArray = new byte[1024];
        int bytesCount = 0;
 
        // open the file in a try-with-resources block so the
        // stream is closed even if reading fails
        try (FileInputStream fis = new FileInputStream(file)) {
 
            // read the data from file and update that data in
            // the message digest
            while ((bytesCount = fis.read(byteArray)) != -1) {
                digest.update(byteArray, 0, bytesCount);
            }
        }
 
        // store the bytes returned by the digest() method
        byte[] bytes = digest.digest();
 
        // digest() returns raw bytes, so we need to convert
        // each byte into two hexadecimal digits
 
        // for this we create an object of StringBuilder
        // since it allows us to update the string, i.e. it is
        // mutable
        StringBuilder sb = new StringBuilder();
       
        // loop through the bytes array
        for (int i = 0; i < bytes.length; i++) {
           
            // mask the byte to the range 0-255, add 0x100 so
            // the hex string always has three digits, then
            // drop the first digit; this keeps leading zeros
            sb.append(Integer
                    .toString((bytes[i] & 0xff) + 0x100, 16)
                    .substring(1));
        }
 
        // finally we return the complete hash
        return sb.toString();
    }
}

Output (the value depends on the contents of the file):

8eeecb74627e963d65d10cbf92a2b7c9
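To tie the program back to the integrity check described at the start of the article, a generated checksum is normally compared with a published one. The sketch below is one possible way to do that, not part of the original program: the class name ChecksumVerifier and the helpers checksum and matches are hypothetical, and the demonstration uses a temporary file containing the text "hello world" so that the expected MD5 value is known in advance.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class ChecksumVerifier {

    // Hypothetical helper: hex checksum of a file, read in chunks
    static String checksum(String algorithm, Path path)
        throws IOException, NoSuchAlgorithmException
    {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(path)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                md.update(buffer, 0, read);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Hypothetical helper: true if the file's checksum equals the
    // expected hex string, ignoring case and surrounding spaces
    static boolean matches(String algorithm, Path path, String expectedHex)
        throws IOException, NoSuchAlgorithmException
    {
        return checksum(algorithm, path)
            .equalsIgnoreCase(expectedHex.trim());
    }

    public static void main(String[] args)
        throws IOException, NoSuchAlgorithmException
    {
        Path temp = Files.createTempFile("verify-demo", ".txt");
        try {
            Files.write(temp,
                "hello world".getBytes(StandardCharsets.UTF_8));

            // MD5 of the text "hello world"
            String published = "5eb63bbbe01eeed093cb22bb8f5acdc3";

            // prints true: same algorithm, unchanged contents
            System.out.println(matches("MD5", temp, published));

            // prints false: an MD5 value never matches SHA-256
            System.out.println(matches("SHA-256", temp, published));
        } finally {
            Files.deleteIfExists(temp);
        }
    }
}
```

The second call shows why the verifying side has to use the same algorithm that produced the published checksum.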

 



