# Computer Science 102

Spring 2003

Second draft due: THURS night, APR 10

Introduction

In this draft you will finish the implementation of Huffman coding.

Given a set of characters and their corresponding frequencies, an optimal coding scheme is produced by Huffman coding. A binary tree is constructed so that each leaf in the tree and only the leaves contain a character. We consider all left edges of the tree (edges from a node to its left child) to be labelled 0, and all right edges to be labelled 1. The code for any character is the string of bits produced by traversing the edges of the tree from the root to the leaf that contains that character. To produce the best coding, characters with high frequencies should be high in the tree so that they will have short codes. For example consider the following tree:

```

0 /   \ 1
/     \
E     0/ \  1
H   /\
0 /  \  1
L     B

```

The codes is the above tree are: E = 0, H = 10, L = 110, B = 111.

A Huffman tree is built in the following way: we create a tree node for each character. The node stores the character and the frequency. Initially the data is stored in a heap as single tree nodes with the left and right children being null, just as it appeared in draft #2. Then we repeatedly combine the trees with the two lowest frequencies but now we make them the children of a new node that has the combined frequencies as its frequency and has an arbitrary character for the letter. The subtree with the smaller frequency becomes the left subtree of this new node, and the other one becomes the right subtree.

Your ADT for this draft must be changed to incorporate two new methods, decode and printCode.

Here are the procedures to be written in your ADT:

1. readOrdList. Already done in draft #2.

2. printIt() prints the characters and corresponding frequencies on the sorted linked list before any nodes are combined. Already done in draft #2.

3. combineTree(TreeNode p1, TreeNode p2) This is almost identical to combine for draft #2 except that in addition to giving a value to the combined frequency, it sets the left field of the node created (its type is TreeNode) to p1, and its right field to p2.

4. private void reduce() from draft 2, which now calls CombineTree instead of Combine.

5. reduceIt Already done in draft #2.

6. private void printCode(TreeNode tree, String s) a recursive routine that prints the Huffman code for each letter in the tree. Do the concatenation of '0' and '1' to stream in the actual parameter. printCode is called by the method printCode() that has no parameters.

7. private void decode(TreeNode t, String binary) which takes the text file code that consists of compressed binary code (the text file for this on the WEB at huffcode.txt ) and recursively decodes it using the Huffman tree just built. This is called by the method decode() --no parameters -- that reads the text file.

Each procedure should contain preconditions and postconditions. The data for this program to build the tree is the same as in draft #2. The text file with the compressed binary file will be on the WEB.

Your main method should call an instance of the object HuffADTb and should

• Read the data from the file and form the Huffman tree.
• Print the Huffman code for each letter using PrintTheCode.
• Decode the compressed binary file.