# Programming Assignment 3 Computer Science 102

Spring 2002

Second draft due: WED, APRIL, 17, 11:59 pm.

Introduction

As discussed in draft#1, we are studying an efficient scheme for for compressing text files called Huffman coding. We again exploit the fact that not all characters appear with the same frequency in the text, by encoding rarely used characters with long codes and frequently used ones with short codes.

Given a set of characters and their corresponding frequencies, an optimal coding scheme is produced by Huffman coding. It uses a binary tree for encoding. In Draft #2, however, we use a heap as a priority queue instead of a linked list. The big Oh behavior for constructing a heap is O(nlog(n)); whereas, its O(n^2) for a list

Draft #2

We will use a heap implementation of a priority queue to write and test the initial versions of some of the methods to be used in draft #3. Note that here we use a min-heap in this project instead of the max-heap we used in class. Also, deleteMin() will replace the deleteMax() that is on the lecture links on the web. Here is the skeleton of the class you will be using:

```class HuffADTa
//solves draft #2 for project #3
{   int last;//the # of nodes in heap. This decreases as succesive pairs of
//minimum nodes are extracted
final int MAX = 11;//there are 11 pairs of letter-frequency entries

Object[] ob;

private class HuffRecord//defines the data frequency record
{  public String letter;
public int freq;
}

private class TreeNode//defines the treelets comprising the list
{  public HuffRecord data;
public TreeNode left, right;
}
}
```

where HuffRecord and TreeNode are embedded classes. The keyword public is optional because it is the default. Here are the procedures to be written in your ADT class:

1. public void ReadOrdList( String file), where file is a file name, e.g., "c:\\huff.txt". This is similar to the method of tha same name that you used in draft #1. It:
• reads a record consisting of a sequence of characters and their associated frequencies from the file.
• calls private void pairs(String record, HuffRecord[] stream) and sends the record to it. The method pairs constructs stream, an array of objects of type HuffRecord. Since stream is an object, any changes made to it are reflected in the calling method.
• passes the array stream to initialize, which assigns the two fields of stream[], i.e., freq and letter to the corresponding fields of an instance of TreeNode, let's say aux. It assigns this to an element of an array of Object, ob[]. This array is an instance variable which represents the heap.
2. Method write prints the elements of the array ob In order to do this, you must of course cast ob[j] as an instance of TreeNode and then dereference it before you print it.

3. method shift() shifts the elements one position, so that parent =child/2 has meaning.

4. Method build creates a heap by calling shiftUp(j) in a loop. To facilitate your understanding of the heap process let's call shiftUp(j) by the name insert instead.

5. Method reduceIt() calls reduce in a while loop while an instance variable last is greater than one. last represents the value of the highest subscript of the heap as the heap is shrunken,

6. private void reduce() is similar to the method of the same name in draft #1. It
• calls DeleteMin twice to obtain the element of the heap with minimum frequency, i.e., the root. The first call should return a pointer p1 of type Object. The counter last is then decremeneted. The second call should return a pointer p2.

• calls Combine with the above two pointers as parameters.
7. private void combine(TreeNode p1, TreeNode p2) creates a new instance aux of TreeNode as in draft #1 whose frequency field is the sum of the frequencies in the objects pointed to by p1 and p2, and assigns aux to ob[last]. Then places the element in its proper place in the heap by calling insert(last).

8. private Object deleteMin(int n) swaps the top of the heap, the first element in ob[], with the last element in it. It then reheapifies the heap by calling shiftDown(n-1). This method is defined in the heap sort done in class.

Each procedure should contain preconditions and postconditions. The data for this program is posted on the web at huff.txt ; Or if for some reason you cannot download the data from the web, you can make your own file. The data is: A 40 E 45 G 6 H 8 I 34 L 12 N 9 O 20 S 15 V 2 W 3

Your main method should create an instance of the class huffADTa and should

• Read the data from the file and form the heap.
• Print the elements of the heap.
• Reduce the heap to the final element.
The output is on the WEB. In the output, X is the character placed in the letter field for elements produced by combining two other elements.