Programming Assignment 3 Computer Science 102

Spring 2005

First draft due: TH, MAR31 , 11:59 pm. 11:59 pm.


In this assignment we will study an efficient scheme for for compressing a text file. It is called Huffman coding. A simple method to encode text as a string of 0s and 1s is to represent each character as a unique string of 8 binary digits (bits). A text message is thus translated into a string of bits. By exploting the fact that not all characters appear with the same frequency in the text, we can encode rarely used characters with long codes and frequently used ones with short codes.

Given a set of characters and their corresponding frequencies, an optimal coding scheme is produced by Huffman coding. It uses a binary tree for encoding. In Draft #1, however, we simply use a linked list.

Draft #1

We will use a linked list implementation to write and test the initial versions of some of the procedures to be used in draft #2. here is the skeleton of the class (it's on the WEB) you will be using:

class HuffADTa
//solves draft #1 for project #3
{   ListNode list;
    final int MAX = 11;//there are 11 pairs of letter-frequency entries

    private class HuffRecord//defines the data frequency record
    {  public String letter; 
       public int freq;
    private class TreeNode//defines the treelets comprising the list
   {  public HuffRecord data;
      public TreeNode left, right;
   private class ListNode//defines the linked list. Each node is a treelet
   { public TreeNode info;
     public ListNode next;

where HuffRecord, TreeNode, and ListNode are all embedded classes. The keyword public is optional because it is the default. If you wish, you may use the program we did in class displayed at:

since it contains these three embedded classes, and add the following procedures to the program:

  1. private ListNode delFirst() removes the first element of the linked list pointed to by list, and returns the pointer to that element in let's say smallfreqNode.

  2. private void combine(ListNode p1, ListNode p2) creates a new node whose frequency is the sum of the frequencies in the nodes pointed to by p1 and p2, and inserts that new node in its proper place in the linked list list by calling insertList.

  3. private void reduce()

  4. public void reduceIt() calls reduce and prints the resulting linked list until there is only one node remaining on the list. The frequency field of this node will automatically contain the sum of all the frequencies on the original linked list.

Each procedure should contain preconditions and postconditions. The data for this program is posted on the web at huff.txt ; Or if for some reason you cannot download the data from the web, you can make your own file. The data is: A 40 E 45 G 6 H 8 I 34 L 12 N 9 O 20 S 15 V 2 W 3

Your main method should create an instance of the class and should

The output is on the WEB. In the output, X is the character placed in the name field for nodes produced by combining two other nodes.