Blockchain Merkle Trees

A Merkle tree, also known as a hash tree, is a tree in which each leaf node is labeled with the hash of a data block and each non-leaf node is labeled with the hash of its child nodes' labels. This article discusses the following topics in detail:

  1. What is a Cryptographic Hash?
  2. What is Hash Pointer?
  3. Blockchain Structure
  4. Block Structure
  5. Merkle Tree Structure
  6. How Do Merkle Trees Work?
  7. Why Are Merkle Trees Important for Blockchain?
  8. Proof of Membership
  9. Merkle Proofs
  10. Simplified Payment Verification (SPV)
  11. Advantages of Merkle Tree

Let’s discuss each of these topics in detail.



What is a Cryptographic Hash?

A cryptographic hash function maps a variable-length input to a fixed-size digest. Hash functions are an important cryptographic primitive and are used extensively in blockchains. For example, SHA-256 always produces a 256-bit digest, whatever the length of its input.
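
The fixed-size property can be checked with a short sketch using Python's standard hashlib module. The helper name sha256_hex and the sample inputs are assumptions made for illustration only.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Illustrative inputs of very different lengths (1 byte and 1,000,000 bytes).
short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

# Each hex character encodes 4 bits, so 64 characters correspond to 256 bits.
print(len(short_digest) * 4, len(long_digest) * 4)  # 256 256

# Changing a single character of the input gives a different digest.
print(sha256_hex(b"hello") != sha256_hex(b"hellp"))  # True
```

Whatever the input size, the digest length stays at 256 bits, which is what makes digests convenient as compact labels for data.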



What is Hash Pointer?

A regular pointer stores the memory address of data, so the data can be accessed through it. A hash pointer stores, along with the location of the data, the cryptographic hash of that data. A hash pointer therefore both points to the data and lets us verify that the data has not changed. Hash pointers can be used to build many kinds of data structures, such as blockchains and Merkle trees.

Blockchain Structure

A blockchain combines two hash-based data structures:

  1. Linked list: This is the structure of the blockchain itself, which is a linked list of hash pointers. A regular linked list consists of nodes, and each node has two parts: data and a pointer to the next node. In a blockchain, the regular pointer is replaced with a hash pointer, and each block points back to the block before it.
  2. Merkle tree: A Merkle tree is a binary tree built from hash pointers. It is named after its creator, Ralph Merkle.

Blockchain as linked list with hash pointers
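
A linked list of hash pointers can be sketched in a few lines of Python. This is a simplified illustration, not how any real blockchain serializes its blocks: the Block class, its fields, and the "prev_hash|data" encoding are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    data: str
    prev_hash: Optional[str]  # hash pointer: digest of the previous block

    def digest(self) -> str:
        # Hypothetical encoding: hash the previous digest together with the data.
        payload = f"{self.prev_hash}|{self.data}".encode()
        return hashlib.sha256(payload).hexdigest()


def build_chain(items: List[str]) -> List[Block]:
    """Link each block to its predecessor through the predecessor's digest."""
    chain: List[Block] = []
    prev = None
    for item in items:
        block = Block(item, prev)
        chain.append(block)
        prev = block.digest()
    return chain


def is_valid(chain: List[Block]) -> bool:
    """Check each stored hash pointer against the recomputed digest of the previous block."""
    return all(chain[i].prev_hash == chain[i - 1].digest() for i in range(1, len(chain)))


chain = build_chain(["block 0", "block 1", "block 2"])
print(is_valid(chain))  # True

chain[1].data = "tampered"  # modify an earlier block
print(is_valid(chain))  # False
```

Changing the data of one block changes its digest, so the hash pointer stored in the following block no longer matches and the modification is detected.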

Block Structure

1. Block header: The header contains metadata of the block, i.e., information about the block itself. Among other fields, it holds the Merkle root of the block's transactions.

2. Merkle tree: A Merkle tree is a binary tree formed by hash pointers, and named after its creator, Ralph Merkle.

Each block comprises a block header and a Merkle tree

Merkle Tree Structure

Structure of Merkle tree

1. A blockchain can potentially have thousands of blocks with thousands of transactions in each block. Therefore, memory space and computing power are two main challenges.

2. It is therefore desirable to verify transactions using as little data as possible, which reduces CPU processing and provides better security. This is exactly what Merkle trees offer.

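3. In a Merkle tree, transaction hashes are grouped into pairs. The hash of each pair is computed and stored in the parent node. The parent nodes are then grouped into pairs, and their hashes are stored one level up in the tree. This continues up to the root of the tree. The different types of nodes in a Merkle tree are leaf nodes, which hold the hashes of individual transactions; non-leaf nodes, which hold the hash of their two children; and the root node, whose hash is the Merkle root.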

4. Bitcoin uses the SHA-256 hash function (applied twice at each step) to hash transaction data repeatedly until the Merkle root is obtained.

5. Further, a Merkle tree is binary in nature. This means that the number of nodes at each level below the root needs to be even for the tree to be constructed properly. If a level has an odd number of nodes, the last hash is duplicated to make the number even.
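
The pairing and duplication rules can be sketched as follows. This is a simplified illustration that uses a single SHA-256 per node and plain byte strings as stand-ins for transactions; the function names are assumptions, and the code does not reproduce Bitcoin's exact serialization.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: List[bytes]) -> bytes:
    """Hash pairs level by level until a single hash (the root) remains."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [sha256(tx) for tx in transactions]  # leaf nodes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: duplicate the last hash
        # Each parent is the hash of its two children concatenated.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


three = [b"T1", b"T2", b"T3"]
padded = [b"T1", b"T2", b"T3", b"T3"]

# With three leaves, the last hash is duplicated, so both lists give the same root.
print(merkle_root(three) == merkle_root(padded))  # True
```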

How Do Merkle Trees Work?

Binary tree direction vs Merkle tree direction

Example: Consider a block with four transactions: T1, T2, T3, and T4. These four transactions are stored in the Merkle tree by the following steps:

Step 1: The hash of each transaction is computed.

H1 = Hash(T1), H2 = Hash(T2), H3 = Hash(T3), H4 = Hash(T4).

Step 2: The hashes computed are stored in leaf nodes of the Merkle tree. 

Step 3: Now the non-leaf nodes are formed. To form these nodes, leaf nodes are paired together from left to right, and the hash of each pair is calculated. First, the hash of H1 and H2 is computed to form H12. Similarly, H34 is computed from H3 and H4. H12 is the parent node of H1 and H2, and H34 is the parent node of H3 and H4. These are non-leaf nodes. Here + denotes concatenation of the two hashes.

H12 = Hash(H1 + H2) 

H34 = Hash(H3 + H4)

Step 4: Finally H1234 is computed by pairing H12 and H34. H1234 is the only hash remaining. This means we have reached the root node and therefore H1234 is the Merkle root.

H1234 = Hash(H12 + H34)

A Merkle tree works by hashing pairs of child nodes again and again until only one hash remains.
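
The four steps can be followed literally in Python. In this sketch, Hash is assumed to be a single SHA-256, + is byte concatenation, and the byte strings b"T1" to b"T4" are placeholders for real transaction data.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash function used for every node (assumed here to be SHA-256)."""
    return hashlib.sha256(data).digest()


# Placeholder transaction data.
t1, t2, t3, t4 = b"T1", b"T2", b"T3", b"T4"

# Steps 1 and 2: hash each transaction; these hashes are the leaf nodes.
h1, h2, h3, h4 = h(t1), h(t2), h(t3), h(t4)

# Step 3: pair the leaves from left to right and hash each pair.
h12 = h(h1 + h2)
h34 = h(h3 + h4)

# Step 4: hash the remaining pair to obtain the Merkle root.
h1234 = h(h12 + h34)

print(h1234.hex())
```

The order of concatenation matters: h(h34 + h12) would give a different value from h(h12 + h34), which is why nodes are always paired from left to right.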

Why Are Merkle Trees Important for Blockchain?

Merkle trees use a one-way hash function extensively, and this hashing separates the proof of the data from the data itself.

Proof of Membership

A very interesting feature of the Merkle tree is that it provides proof of membership.

Example: A miner wants to prove that a particular transaction belongs to a Merkle tree. The miner needs to present this transaction together with the hashes of the sibling nodes along the path between the transaction and the root. The rest of the tree can be ignored, because these hashes are enough to recompute the hashes all the way up to the root.

Proof of membership: verifying the presence of transactions in blocks using the Merkle tree.

If there are n leaf nodes in the tree, then only about log2(n) hashes need to be examined. Hence, even if there are a large number of nodes in the Merkle tree, proof of membership can be computed in a relatively short time.
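
To get a feel for how slowly the proof grows, the following sketch computes the number of hashes in a membership proof. It assumes a binary tree in which odd levels are padded by duplicating the last hash, so the proof holds one hash per level below the root. The function name proof_length is an assumption for illustration.

```python
def proof_length(num_leaves: int) -> int:
    """Number of sibling hashes needed to prove membership among `num_leaves` leaves."""
    if num_leaves < 1:
        raise ValueError("the tree needs at least one leaf")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floating point.
    return (num_leaves - 1).bit_length()


for n in (4, 1024, 1_000_000):
    print(n, proof_length(n))
# 4 2
# 1024 10
# 1000000 20
```

Even with a million leaves, a proof in this model needs only 20 hashes.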

Merkle Proofs

A Merkle proof is used to:

  1. Decide whether data belongs to a particular Merkle tree.
  2. Prove that data belongs to a set without the need to store the whole set.
  3. Prove that certain data is included in a larger data set without revealing the larger data set or its subsets.

A Merkle proof is checked by hashing a node's hash together with its sibling's hash and climbing up the tree in this way until the root hash, which is or can be publicly known, is obtained.

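Consider a Merkle tree built over four transactions a, b, c, and d, with leaf hashes Ha, Hb, Hc, and Hd, intermediate hashes Hab and Hcd, and root Habcd.

Let us say we need to prove that transaction ‘a’ is part of this Merkle tree. Everyone in the network knows the hash function used to build the tree.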

  1. H(a) = Ha as per the diagram.
  2. The hash of Ha and Hb will be Hab, which will be stored in an upper-level node.
  3. Finally hash of Hab and Hcd will give Habcd. This is the Merkle root obtained by us.
  4. By comparing the obtained Merkle root and the Merkle root already available within the block header, we can verify the presence of transaction ‘a’ in this block.
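
The steps above can be sketched in Python. This is a minimal illustration under stated assumptions: a single SHA-256 per node, byte strings standing in for the transactions a, b, c, and d, and hypothetical helper names (build_levels, make_proof, verify). Each proof entry records a sibling hash and whether that sibling sits on the left.

```python
import hashlib
from typing import List, Tuple

Proof = List[Tuple[bytes, bool]]  # (sibling hash, sibling is the left child)


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Return every level of the tree, from the leaf hashes up to the root."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        level = list(levels[-1])
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd levels
        levels[-1] = level
        levels.append([h(level[i] + level[i + 1]) for i in range(0, len(level), 2)])
    return levels


def make_proof(levels: List[List[bytes]], index: int) -> Proof:
    """Collect the sibling hash at each level on the path from a leaf to the root."""
    proof: Proof = []
    for level in levels[:-1]:
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf: bytes, proof: Proof, root: bytes) -> bool:
    """Recompute the root from the leaf and the sibling hashes, then compare."""
    current = h(leaf)
    for sibling, sibling_is_left in proof:
        current = h(sibling + current) if sibling_is_left else h(current + sibling)
    return current == root


leaves = [b"a", b"b", b"c", b"d"]
levels = build_levels(leaves)
root = levels[-1][0]  # Habcd
proof = make_proof(levels, 0)  # proof for 'a': [Hb, Hcd]

print(len(proof))  # 2
print(verify(b"a", proof, root))  # True
print(verify(b"x", proof, root))  # False
```

The proof for ‘a’ contains only Hb and Hcd, so the verifier never sees ‘b’, ‘c’, or ‘d’ themselves.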

From the above example, it is clear that to verify the presence of ‘a’, neither ‘a’ nor ‘b’, ‘c’, and ‘d’ have to be revealed; the hashes Ha, Hb, and Hcd are sufficient. A Merkle proof therefore provides an efficient and simple method of verifying inclusion, and is synonymous with “proof of inclusion”.

A sorted Merkle tree is a tree where all the data blocks are ordered using an ordering function. This ordering can be alphabetical, lexicographical, numerical, etc.

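Proof of Non-Membership:

With a sorted Merkle tree, it is also possible to show that an item is not in the tree: membership proofs for the two adjacent leaves immediately before and after the position where the item would appear show that nothing lies between them.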

Coinbase Transaction:

A coinbase transaction is a unique Bitcoin transaction that is included in the Merkle tree of every block in the blockchain. It creates new coins and also contains a coinbase parameter that miners can use to insert arbitrary data into the blockchain.

Simplified Payment Verification (SPV)

Advantages of Merkle Tree

  1. Efficient verification: Merkle trees offer efficient verification of the integrity and validity of data and significantly reduce the amount of memory required for verification. A proof does not require a huge amount of data to be transmitted across the blockchain network. This quick verification of transactions enables the trustless transfer of cryptocurrency in a peer-to-peer, distributed system.
  2. Little delay: Because proofs are small, verification adds little delay to the transfer of data across the network. Merkle trees are used extensively in the computations that keep cryptocurrencies functioning.
  3. Less disk space: A verifier needs to keep only the Merkle root and the hashes in a proof, rather than every transaction, so verification needs less disk space than storing the full data set.
  4. Unaltered transfer of data: The Merkle root helps in making sure that the blocks sent across the network are whole and unaltered.
  5. Tampering detection: A Merkle tree lets miners check whether any transaction has been tampered with.
    • Since each parent node stores the hash of its children, any change to a transaction, such as the amount to be debited or the address to which the payment must be made, propagates to the hashes in the upper levels and finally to the Merkle root.
    • The miner can recompute the Merkle root from the transactions in the data part of a block, compare it with the Merkle root in the header, and easily detect this tampering.
  6. Time complexity: Compare verifying that a transaction is in a block whose transactions are arranged in a Merkle tree with searching a block whose transactions are arranged in a linked list:
    • Merkle tree (verification with a Merkle proof): O(log n), where n is the number of transactions in a block.
    • Linked list search: O(n), where n is the number of transactions in a block.
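
Tampering detection can be illustrated with a short sketch. The transaction strings, the merkle_root helper, and the use of a single SHA-256 per node are assumptions made for the example; header_root stands for the Merkle root stored in the block header.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions: List[bytes]) -> bytes:
    """Recompute the Merkle root, duplicating the last hash on odd levels."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [h(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical transactions and the root recorded in the block header.
transactions = [b"alice->bob:5", b"bob->carol:2", b"carol->dave:1", b"dave->erin:7"]
header_root = merkle_root(transactions)

# An attacker changes the amount in one transaction.
tampered = list(transactions)
tampered[1] = b"bob->carol:200"

print(merkle_root(transactions) == header_root)  # True
print(merkle_root(tampered) == header_root)  # False
```

Changing a single transaction changes its leaf hash, every hash on the path above it, and therefore the recomputed root, which no longer matches the root in the header.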
