# Huffman Decoding

• Difficulty Level : Hard
• Last Updated : 11 Nov, 2017

Huffman Encoding was discussed in a previous post. This post covers decoding.

Examples:

```
Input Data : AAAAAABCCCCCCDDEEEEE
Frequencies : A: 6, B: 1, C: 6, D: 2, E: 5
Encoded Data :
0000000000001100101010101011111111010101010
Huffman Tree: '#' is the special character used
              for internal nodes as character field
              is not needed for internal nodes.
               #(20)
             /       \
        #(12)         #(8)
       /     \       /    \
    A(6)    C(6)  E(5)    #(3)
                         /    \
                      B(1)    D(2)
Code of 'A' is '00', code of 'C' is '01', ..
Decoded Data : AAAAAABCCCCCCDDEEEEE

Input Data : geeksforgeeks
Characters with their Huffman codes
e 10, f 1100, g 011, k 00, o 010, r 1101, s 111
Encoded Huffman data :
01110100011111000101101011101000111
Decoded Huffman Data
geeksforgeeks
```

To decode the encoded data we require the Huffman tree. We iterate through the binary encoded data. To find the character corresponding to the current bits, we use the following simple steps.

1. Start from the root and repeat the following until a leaf is found.
2. If the current bit is 0, move to the left child of the current node.
3. If the current bit is 1, move to the right child of the current node.
4. When a leaf node is reached, print the character of that leaf node, then continue iterating over the encoded data, starting again from step 1.

The code below performs full Huffman Encoding and Decoding of a given input string. It encodes the string and saves the result in the variable encodedString, then decodes encodedString and prints the original string.

```cpp
// C++ program to encode and decode a string using
// Huffman Coding.
#include <iostream>
#include <map>
#include <queue>
#include <string>
#include <vector>
using namespace std;

// maps each character to its Huffman code
map<char, string> codes;

// stores the frequency of each character of the input data
map<char, int> freq;

// A Huffman tree node
struct MinHeapNode
{
    char data;                 // One of the input characters
    int freq;                  // Frequency of the character
    MinHeapNode *left, *right; // Left and right child

    MinHeapNode(char data, int freq)
    {
        left = right = nullptr;
        this->data = data;
        this->freq = freq;
    }
};

// comparator that turns the priority queue into a min-heap on freq
struct compare
{
    bool operator()(MinHeapNode* l, MinHeapNode* r)
    {
        return (l->freq > r->freq);
    }
};

// true if the node has no children, i.e. it holds an input character
bool isLeaf(MinHeapNode* node)
{
    return node->left == nullptr && node->right == nullptr;
}

// utility function to print characters along with
// their Huffman codes
void printCodes(MinHeapNode* root, string str)
{
    if (!root)
        return;
    if (isLeaf(root))
        cout << root->data << ": " << str << "\n";
    printCodes(root->left, str + "0");
    printCodes(root->right, str + "1");
}

// utility function to store characters along with
// their Huffman codes in the C++ STL map `codes`.
// Leaves are detected by their missing children rather than
// by the '$' marker, so '$' may appear in the input.
void storeCodes(MinHeapNode* root, string str)
{
    if (root == nullptr)
        return;
    if (isLeaf(root))
        // a one-symbol input has a single-leaf tree; give it code "0"
        codes[root->data] = str.empty() ? "0" : str;
    storeCodes(root->left, str + "0");
    storeCodes(root->right, str + "1");
}

// STL priority queue that holds the tree nodes, ordered
// by frequency (smallest first)
priority_queue<MinHeapNode*, vector<MinHeapNode*>, compare> minHeap;

// function to build the Huffman tree and store it
// in minHeap
void HuffmanCodes()
{
    MinHeapNode *left, *right, *top;
    for (map<char, int>::iterator v = freq.begin(); v != freq.end(); v++)
        minHeap.push(new MinHeapNode(v->first, v->second));
    while (minHeap.size() != 1)
    {
        left = minHeap.top();
        minHeap.pop();
        right = minHeap.top();
        minHeap.pop();
        // '$' is a placeholder character for internal nodes
        top = new MinHeapNode('$', left->freq + right->freq);
        top->left = left;
        top->right = right;
        minHeap.push(top);
    }
    storeCodes(minHeap.top(), "");
}

// utility function to map each character to its
// frequency in the input string
void calcFreq(const string& str)
{
    for (size_t i = 0; i < str.size(); i++)
        freq[str[i]]++;
}

// function iterates through the encoded string s
// if s[i]=='1' then move to node->right
// if s[i]=='0' then move to node->left
// if leaf node append the node->data to our output string
string decode_file(MinHeapNode* root, const string& s)
{
    // single-leaf tree: every bit stands for the same character
    if (isLeaf(root))
        return string(s.size(), root->data);

    string ans = "";
    MinHeapNode* curr = root;
    for (size_t i = 0; i < s.size(); i++)
    {
        if (s[i] == '0')
            curr = curr->left;
        else
            curr = curr->right;

        // reached leaf node
        if (isLeaf(curr))
        {
            ans += curr->data;
            curr = root;
        }
    }
    return ans;
}

// Driver program to test above functions
int main()
{
    string str = "geeksforgeeks";
    string encodedString, decodedString;
    calcFreq(str);
    HuffmanCodes();
    cout << "Characters with their Huffman codes:\n";
    for (auto v = codes.begin(); v != codes.end(); v++)
        cout << v->first << ' ' << v->second << endl;

    for (auto i : str)
        encodedString += codes[i];

    cout << "\nEncoded Huffman data:\n" << encodedString << endl;

    decodedString = decode_file(minHeap.top(), encodedString);
    cout << "\nDecoded Huffman Data:\n" << decodedString << endl;
    return 0;
}
```

Output:

```
Characters with their Huffman codes:
e 10
f 1100
g 011
k 00
o 010
r 1101
s 111

Encoded Huffman data:
01110100011111000101101011101000111

Decoded Huffman Data:
geeksforgeeks
```

Characters with equal frequencies may receive different, equally optimal codes depending on how the priority queue breaks ties, so the exact codes can vary between standard library implementations.

Comparing Input file size and Output file size:
We can compare the size of the input file with the size of the Huffman encoded output file in a simple way. Let's say our input is the string “geeksforgeeks”, stored in a file input.txt.
Input File Size:

```
Input: "geeksforgeeks"
Total number of characters, i.e. input length: 13
Size: 13 character occurrences * 8 bits = 104 bits or 13 bytes.
```

Output File Size:

```
Input: "geeksforgeeks"
------------------------------------------------
Character |  Frequency |  Binary Huffman Value |
------------------------------------------------
    e     |      4     |         10            |
    f     |      1     |         1100          |
    g     |      2     |         011           |
    k     |      2     |         00            |
    o     |      1     |         010           |
    r     |      1     |         1101          |
    s     |      2     |         111           |
------------------------------------------------

So to calculate output size:
e: 4 occurrences * 2 bits = 8 bits
f: 1 occurrence  * 4 bits = 4 bits
g: 2 occurrences * 3 bits = 6 bits
k: 2 occurrences * 2 bits = 4 bits
o: 1 occurrence  * 3 bits = 3 bits
r: 1 occurrence  * 4 bits = 4 bits
s: 2 occurrences * 3 bits = 6 bits

Total Sum: 35 bits, i.e. 5 bytes after rounding up to whole bytes
```

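Hence, encoding reduces the data from 104 bits to 35 bits, a saving of about 66% (not counting the space needed to store the Huffman tree or code table that the decoder requires).
The above method can also be used to determine the value of N, i.e. the length of the encoded data in bits.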

This article is contributed by Harshit Sidhwa.