
Birthday attack in Cryptography

Last Updated : 31 Aug, 2023

Prerequisite – Birthday paradox 
A birthday attack is a type of cryptographic attack that belongs to the class of brute-force attacks. It exploits the mathematics behind the birthday problem in probability theory. Its success depends on the fact that, among a set of random attempts, a collision between any two of them is far more likely than a match against one fixed target value, as described by the birthday paradox.

Birthday paradox problem – 
Let us consider the example of a classroom of 30 students and a teacher. The teacher wishes to find pairs of students that have the same birthday, so the teacher asks for everyone's birthday. Intuitively, the chance of finding such a pair may seem small. For example, if the teacher fixes a particular date, say October 10, then the probability that at least one student is born on that day is $1 - (364/365)^{30}$, which is about 7.9%. However, the probability that at least two students share a birthday (on any date) is around 70%, using the following formula: 
 

$1 - \dfrac{365!}{(365 - n)! \cdot 365^{n}}$  (substituting $n = 30$ here) 
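
As a quick numerical check of the two figures quoted above, the following Python sketch evaluates both expressions for a class of 30 students. The function names are illustrative and not part of the original article, and `math.perm` assumes Python 3.8 or later.

```python
import math

DAYS = 365  # non-leap year, every birthday equally likely

def prob_someone_born_on_fixed_day(n):
    """Probability that at least one of n people is born on one given date."""
    return 1 - ((DAYS - 1) / DAYS) ** n

def prob_shared_birthday(n):
    """Probability that at least two of n people share a birthday."""
    # math.perm(365, n) equals 365! / (365 - n)!
    return 1 - math.perm(DAYS, n) / DAYS ** n

print(f"fixed date, n = 30: {prob_someone_born_on_fixed_day(30):.3f}")
print(f"any shared date, n = 30: {prob_shared_birthday(30):.3f}")
```

The first value comes out near 0.079 and the second near 0.706, which matches the 7.9% and 70% figures in the text.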

Derivation of the above term: 

Assumptions – 
1. Assuming a non-leap year (hence 365 days). 
2. Assuming that a person has an equally likely chance of being born on any day of the year. 
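Let us consider n = 2. 
$P(\text{two people have the same birthday}) = 1 - P(\text{two people have different birthdays})$ 
$= 1 - \frac{365}{365} \cdot \frac{364}{365}$ 
$= 1 - \frac{364}{365}$ 
$= \frac{1}{365}$ 
So for n people, the probability that all of them have different birthdays is: 
$P(n \text{ people have different birthdays}) = \frac{365}{365} \cdot \frac{365-1}{365} \cdot \frac{365-2}{365} \cdots \frac{365-n+1}{365} = \frac{365!}{(365-n)! \cdot 365^{n}}$ 
Subtracting this from 1 gives the probability that at least two of the n people share a birthday, which is the formula stated earlier.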
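
The product form of the derivation translates directly into a loop. The sketch below (helper names are hypothetical) multiplies the factors one at a time and then searches for the smallest group size at which a shared birthday becomes more likely than not.

```python
def prob_all_different(n, days=365):
    """P(n people all have different birthdays) as the product of (days - k) / days."""
    p = 1.0
    for k in range(n):
        p *= (days - k) / days
    return p

def smallest_group_for(target, days=365):
    """Smallest n whose shared-birthday probability reaches `target`."""
    n = 1
    while 1 - prob_all_different(n, days) < target:
        n += 1
    return n

print(1 - prob_all_different(2))   # the n = 2 case, equal to 1/365
print(smallest_group_for(0.5))     # prints 23
```

Only 23 people are needed for a better-than-even chance, far fewer than the 365 possible birthdays would suggest.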

Hash function – 
A hash function H is a transformation that takes a variable-size input m and returns a fixed-size string called the hash value, h = H(m). Hash functions used in cryptography must satisfy the following requirements: 
 

  • The input is of variable length,
  • The output has a fixed length,
  • H(x) is relatively easy to compute for any given x,
  • H(x) is one-way,
  • H(x) is collision-free. 
     

A hash function H is said to be one-way if it is hard to invert, where "hard to invert" means that given a hash value h, it is computationally infeasible to find some input x such that H(x) = h. This property is also called preimage resistance.

If, given a message x, it is computationally infeasible to find a message y not equal to x such that H(x) = H(y), then H is said to be a weakly collision-free hash function (second-preimage resistance). 

A strongly collision-free hash function H is one for which it is computationally infeasible to find any two distinct messages x and y such that H(x) = H(y) (collision resistance). The birthday attack targets this last property.
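
The first requirements in the list (variable-length input, fixed-length output, easy to compute) can be observed directly with Python's standard `hashlib` module. This is a minimal sketch that assumes SHA-256 as the example function; the wrapper name `H` simply mirrors the notation above.

```python
import hashlib

def H(message: bytes) -> str:
    """Hex-encoded SHA-256 digest, used here as the example hash function H."""
    return hashlib.sha256(message).hexdigest()

short_digest = H(b"a")
long_digest = H(b"a" * 1_000_000)

print(len(short_digest), len(long_digest))  # 64 64 (64 hex digits = 256 bits)
print(H(b"contract") == H(b"contract"))     # True: same input, same hash value
print(H(b"contract") == H(b"contracT"))     # False: a small change alters the hash
```

One-wayness and collision resistance cannot be demonstrated by running code; they are statements about how hard it is for an attacker to find suitable inputs.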

Let $H: M \to \{0, 1\}^n$ be a hash function with $|M| \gg 2^n$.

The following generic algorithm finds a collision using $O(2^{n/2})$ hash evaluations. 

Algorithm: 
 

  1. Choose $2^{n/2}$ random messages in M: $m_1, m_2, \ldots, m_{2^{n/2}}$
  2. For $i = 1, 2, \ldots, 2^{n/2}$ compute $t_i = H(m_i) \in \{0, 1\}^n$
  3. Look for a collision ($t_i = t_j$ with $m_i \ne m_j$). If none is found, go back to step 1 
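
The three steps can be sketched in Python against a deliberately weakened hash. The assumptions here are not from the original text: the toy function keeps only the top `n_bits` bits of SHA-256 so that the search finishes quickly, messages are 16 random bytes, and the collision check is done while the batch is being filled rather than afterwards.

```python
import hashlib
import os

def H(message: bytes, n_bits: int) -> int:
    """Toy hash H: M -> {0,1}^n, built by keeping the top n bits of SHA-256."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return digest >> (256 - n_bits)

def find_collision(n_bits: int):
    """Repeat batches of 2^(n/2) random messages until two of them collide."""
    batch_size = 2 ** (n_bits // 2)
    while True:
        seen = {}  # maps t_i to the message m_i that produced it
        for _ in range(batch_size):          # steps 1 and 2
            m = os.urandom(16)
            t = H(m, n_bits)
            if t in seen and seen[t] != m:   # step 3
                return seen[t], m
            seen[t] = m
        # no collision in this batch: go back to step 1

m1, m2 = find_collision(24)
print(m1.hex(), m2.hex(), H(m1, 24) == H(m2, 24))
```

With a 24-bit output each batch holds only 4096 messages, yet a batch succeeds a substantial fraction of the time, so a handful of rounds is normally enough.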
     

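We consider the following experiment. From a set of H values, we choose n values uniformly at random, allowing repetitions. Let p(n; H) be the probability that at least one value is chosen more than once during this experiment (here n is the number of values drawn, not the output length of the hash). This probability is: 

$p(n; H) = 1 - \frac{H-1}{H} \cdot \frac{H-2}{H} \cdots \frac{H-n+1}{H}$  (with $H = 365$ this is the birthday probability derived earlier)

and it can be approximated as:

$p(n; H) \approx 1 - e^{-n(n-1)/(2H)} \approx 1 - e^{-n^2/(2H)}$

Consequently, a collision becomes likely once the number of draws is on the order of $\sqrt{H}$. For a hash function with $H = 2^b$ possible outputs, this is about $2^{b/2}$ evaluations.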
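
To see how close the exponential approximation is, the sketch below compares it with the exact product for the birthday case H = 365, and solves the approximation for the number of draws that gives a 50% collision chance. The function names are illustrative only.

```python
import math

def p_exact(n, H):
    """Exact probability of at least one repeat among n uniform draws from H values."""
    p_distinct = 1.0
    for k in range(1, n):
        p_distinct *= (H - k) / H
    return 1 - p_distinct

def p_approx(n, H):
    """Approximation 1 - e^(-n(n-1)/(2H))."""
    return 1 - math.exp(-n * (n - 1) / (2 * H))

def draws_for_half(H):
    """Approximate n with p(n; H) = 1/2, obtained from n^2 / (2H) = ln 2."""
    return math.sqrt(2 * H * math.log(2))

for n in (10, 23, 30):
    print(n, round(p_exact(n, 365), 4), round(p_approx(n, 365), 4))

print(draws_for_half(365))                  # close to 22.5
print(draws_for_half(2 ** 128) / 2 ** 64)   # close to 1.18, i.e. about 1.18 * 2^64 draws
```

The last line shows the square-root behaviour for a 128-bit output: roughly $2^{64}$ hash evaluations give an even chance of a collision.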

Digital signature susceptibility – 
Digital signatures can be susceptible to birthday attacks. A message m is typically signed by first computing H(m), where H is a cryptographic hash function, and then using some secret key to sign H(m). Suppose Alice wants to trick Bob into signing a fraudulent contract. Alice prepares a fair contract m and a fraudulent one m'. She then finds a number of positions where m can be changed without changing the meaning, such as inserting commas, adding empty lines, using one versus two spaces after a sentence, or substituting synonyms. By combining these changes she can create a huge number of variations of m which are all fair contracts. 

Similarly, Alice creates a huge number of variations of the fraudulent contract m'. She then hashes all of these variations until she finds a version of the fair contract and a version of the fraudulent contract that have the same hash value, that is, H(m) = H(m'). Alice can now present the fair version m to Bob for signing. After Bob has signed, Alice takes the signature and attaches it to the fraudulent contract. Because both versions hash to the same value, the signature appears to prove that Bob signed the fraudulent contract. 
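
Alice's search can be sketched as follows. Everything specific here is an assumption made for illustration: the two contract texts are invented, the only "harmless change" used is one versus two spaces between words, and the hash is SHA-256 truncated to 24 bits so that a match can be found in well under a second.

```python
import hashlib
from itertools import product

FAIR = "Alice will pay Bob one hundred dollars for the delivery of ten widgets by Friday"
FRAUD = "Bob will pay Alice one hundred thousand dollars for the delivery of ten widgets by Friday"

def tiny_hash(text, bits=24):
    """Toy hash: the top `bits` bits of SHA-256 (far too short for real use)."""
    digest = int.from_bytes(hashlib.sha256(text.encode()).digest(), "big")
    return digest >> (256 - bits)

def variants(text):
    """Yield every version of `text` with one or two spaces in each gap between words."""
    words = text.split()
    for gaps in product((" ", "  "), repeat=len(words) - 1):
        yield words[0] + "".join(gap + word for gap, word in zip(gaps, words[1:]))

def find_matching_pair(fair, fraud, bits=24):
    """Return (fair_variant, fraud_variant) with equal toy hashes, or None."""
    table = {tiny_hash(v, bits): v for v in variants(fair)}
    for v in variants(fraud):
        h = tiny_hash(v, bits)
        if h in table:
            return table[h], v
    return None

pair = find_matching_pair(FAIR, FRAUD)
if pair is not None:
    fair_version, fraud_version = pair
    print(repr(fair_version))
    print(repr(fraud_version))
    print(tiny_hash(fair_version) == tiny_hash(fraud_version))
```

Both returned texts read exactly like the originals apart from spacing, which is why a signature over the hash of one can be transferred to the other.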

To avoid such an attack, the output of the hash function should be long enough that the birthday attack becomes computationally infeasible, that is, about twice as many bits as are needed to resist an ordinary brute-force attack.
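
The effect of output length on the cost of a generic birthday attack can be tabulated with a few lines of Python. The output sizes below are the standard digest lengths of the named functions; the helper name is hypothetical, and the figures ignore any attacks that exploit weaknesses of a specific function.

```python
# Standard output lengths in bits
OUTPUT_BITS = {
    "MD5": 128,
    "SHA-1": 160,
    "SHA-256": 256,
    "SHA3-512": 512,
}

def birthday_work_bits(output_bits):
    """log2 of the approximate number of hashes needed for a generic collision."""
    return output_bits // 2

for name, bits in OUTPUT_BITS.items():
    print(f"{name}: {bits}-bit output, about 2^{birthday_work_bits(bits)} hash evaluations")
```

Doubling the output length doubles the exponent of the attacker's work, which is why longer digests are the standard defence.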
 

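Example :

Let's demonstrate a birthday attack in Python using the MD5 hash function. A collision on the full 128-bit MD5 digest would take about $2^{64}$ attempts with this generic method, which is not feasible in a simple script, so the demonstration compares only the first 8 hex digits (32 bits) of each digest.

Python

import hashlib
import random

# Number of leading hex digits of the MD5 digest that are compared.
# 8 hex digits = 32 bits, so a collision is expected after roughly 2^16 attempts.
TRUNCATED_HEX_DIGITS = 8

# Function to generate a random string of a given length
def generate_random_string(length):
    charset = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
    return ''.join(random.choice(charset) for _ in range(length))

# Function to perform the Birthday Attack on the truncated hash
def birthday_attack():
    hash_dict = {}  # maps truncated hash value -> input string that produced it
    num_attempts = 0

    while True:
        num_attempts += 1
        random_string = generate_random_string(10)
        digest = hashlib.md5(random_string.encode()).hexdigest()
        hash_value = digest[:TRUNCATED_HEX_DIGITS]

        # A collision needs two different inputs with the same hash value
        if hash_value in hash_dict and hash_dict[hash_value] != random_string:
            print(f"Collision found after {num_attempts} attempts!")
            print(f"Original String 1: {hash_dict[hash_value]}")
            print(f"Original String 2: {random_string}")
            break

        hash_dict[hash_value] = random_string

# Example usage
if __name__ == "__main__":
    birthday_attack()


Explanation:

  1. The 'generate_random_string()' function generates a random string of a given length containing uppercase and lowercase letters, as well as digits.
  2. The 'birthday_attack()' function performs the actual attack. It keeps generating random strings, calculates their MD5 hash values using the 'hashlib.md5()' function, keeps the first 'TRUNCATED_HEX_DIGITS' hex digits, and checks whether that value is already present in 'hash_dict'. If a collision is found (two different inputs with the same truncated hash), it prints both strings and exits the loop.

Output :

The attempt count and the two strings differ from run to run, so they are shown here as placeholders:

Collision found after <number of attempts> attempts!
Original String 1: <first string>
Original String 2: <second string>

In this example, we simulated a birthday attack by generating random strings and comparing their truncated MD5 hash values. With 32 bits compared, a collision typically appears after on the order of $2^{16}$ attempts (tens of thousands), far fewer than the $2^{32}$ possible values. The same square-root saving applies to a full-length digest, which is why the output length of a hash function determines its resistance to birthday attacks. MD5 is additionally affected by dedicated collision attacks that are much faster than the generic birthday bound. For these reasons, it is generally recommended to use hash functions with longer outputs and no known collision attacks, such as SHA-256 or SHA-3, in practical applications.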


