Huffman coding is a technique that compresses the size of data. For example, a .zip file of a text document is a compressed version of the original document. This so-called lossless data compression...


Huffman coding is a technique that compresses the size of data. For example, a .zip file of a text document is a compressed version of the original document. This so-called lossless data compression is a result of Huffman coding. Although ASCII and Unicode use bit strings having a fixed length—8 and 16, respectively—to represent symbols, Huffman codes have variable lengths. These codes are based on the frequency of occurrence of a symbol in a given data set. Symbols that occur frequently have shorter codes than those occurring less frequently. A binary tree—called a Huffman tree—is used to generate these codes. For example, let’s encode some text composed only of the letters A through E. Suppose these letters occur with the following frequencies:


 A, 12 times;


 B, 3 times;


 C, 1 time;


D, 9 times; and


 E, 15 times.


We need to arrange these letters by their frequencies in increasing order. To do so, we associate each letter with its frequency of occurrence and add these pairs to a collection, such as a sorted list or a priority queue. The result is shown in Figure 25-12a. Now we remove the two entries having the lowest frequencies and make them leaves in a binary tree. The parent of these leaves is a node containing the sum of the frequencies in the leaves, as illustrated in Figure 25-12b. Since the parent contains only a frequency, the letter portion of the node is null, which is shown as


 • in the figure. We now add the contents of the parent to our list or queue, as shown to the right of the tree in Figure 25-12b.


The steps in creating a binary tree for Huffman coding


When we remove the next two entries from the list, we create a new leaf containing D 9 and join it to the existing tree with a new root containing the sum of the frequencies in its two children. The result is given in Figure 25-12c. Note that we then place the contents of this new root in its correct order within the remaining data. Parts d and e of the figure show the remaining steps in this process. Figure 25-12f shows the resulting binary tree with its left links and right links labelled with 0 and 1, respectively. To encode a character, you begin at a leaf and traverse the tree to its root, recording 0s and 1s in reverse order according to the left and right branches that you take. The Huffman codes for our example are as follows: A is 10, B is 1101, C is 1100, D is 111, and E is 0. To decode a Huffman code, you traverse the tree from its root to a leaf by taking a left branch for each 0 encountered and a right branch for each 1 encountered. Write a program that reads a text file of alphabetic data, creates a Huffman tree, and uses the tree to compress the file. Your program should then take the compressed file and, using your tree, decode it.

Nov 16, 2021
SOLUTION.PDF

Get Answer To This Question

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here