Programming Assignment 3 Computer Science 102

Spring 2001

First draft due: MON, APRIL, 2, 11:59 pm.

Introduction

In this assignment we will study an efficient scheme for for compressing a text file. It is called Huffman coding. A simple method to encode text as a string of 0s and 1s is to represent each character as a unique string of 8 binary digits (bits). A text message is thus translated into a string of bits. By exploting the fact that not all characters appear with the same frequency in the text, we can encode rarely used characters with long codes and frequently used ones with short codes.

Given a set of characters and their corresponding frequencies, an optimal coding scheme is produced by Huffman coding. It uses a binary tree for encoding. In Draft #1, however, we simply use a linked list.

Draft #1

We will use a linked list implementation to write and test the initial versions of some of the procedures to be used in draft #2. here is the skeleton of the class (it's on the WEB) you will be using:

class HuffADTa
//solves draft #1 for project #3
{   ListNode list;
    final int MAX = 11;//there are 11 pairs of letter-frequency entries

    private class HuffRecord//defines the data frequency record
    {  public String letter; 
       public int freq;
    }
    
    private class TreeNode//defines the treelets comprising the list
   {  public HuffRecord data;
      public TreeNode left, right;
   }
   
   private class ListNode//defines the linked list. Each node is a treelet
   { public TreeNode info;
     public ListNode next;
   }
}

where HuffRecord, TreeNode, and ListNode are all embedded classes. The keyword public is optional because it is the default. Here are the procedures to be written in your ADT class:

  1. public void ReadOrdList( String file), where file is a file name, e.g., "c:\\huff.txt". This method:
  2. private void createList( HuffRecord[] stream) creates objects of type TreeNode, places the elements of stream into the data fields of this object and using insertList inserts these objects as nodes in the linked list pointed to by list.
  3. private void insertList(TreeNode tree). This uses two variables current and follows of type ListNode where follows chases current until the value of tree.data.freq determines where the node whose info field is tree is inserted in the list. It uses push(tree) if the frequency is the smallest on the list to insert a node at the beginning of the list, or insert( follow, tree) to insert a node in any point in the rest of the sorted list, where follow is the pointer to the node that has the highest frequency smaller that that of the new node, i.e., the pointer to the preceding node in the linked list. The node representing the frequency-letter pair is either created in Push or in Insert.

  4. private ListNode delFirst() removes the first element of the linked list pointed to by list, and returns the pointer to that element in smallfreqNode.

  5. private void combine(ListNode p1, ListNode p2) creates a new node whose frequency is the sum of the frequencies in the nodes pointed to by p1 and p2, and inserts that new node in its proper place in the linked list list by calling insertList.

  6. private void reduce()

  7. public void reduceIt() calls reduce and prints the resulting linked list until there is only one node remaining on the list. The frequency field of this node will automatically contain the sum of all the frequencies on the original linked list.

  8. public void printIt() prints the values of the letter and freq fields for each node on the linked list.

Each procedure should contain preconditions and postconditions. The data for this program is posted on the web at huff.txt ; Or if for some reason you cannot download the data from the web, you can make your own file. The data is: A 40 E 45 G 6 H 8 I 34 L 12 N 9 O 20 S 15 V 2 W 3

Your main method should create an instance of the class huffADTa and should

The main method and the output are on the WEB. In the output, X is the character placed in the name field for nodes produced by combining two other nodes.