Programming Assignment 3 Computer Science 102

Spring 1997

First draft due: Sun, April, 6, 9:00 pm.


In this assignment we will study an efficient scheme for for compressing a text file. It is called Huffman coding. A simple method to encode text as a string of 0s and 1s is to represent each character as a unique string of 8 binary digits (bits). A text message is thus translated into a string of bits. By exploting the fact that not all characters appear with the same frequency in the text, we can encode rarely used characters with long codes and frequently used ones with short codes.

Given a set of characters and their corresponding frequencies, an optimal coding scheme is produced by Huffman coding. It uses a binary tree for encoding.

Draft #1

We will use a linked list implementation to write and test the initial versions of some of the procedures to be used in draft #2. here is the data structure you will using:

unit huff3ADT;


type HuffmanRecord = record
        name : char;
        freq : integer

     point_tnode = ^tnode
     tnode = record
        Data : HuffmanRecord
        left, right : point_tnode    

     pointer = ^  ListNode;
     ListNode = record         
        info : point_tnode;
        next : pointer
     RecordFile = File of  HuffmanRecord;
     huffman_type = object
        PROCEDURE print;
        PROCEDURE ReadOrdList( var DataFile : RecordFile);
        PROCEDURE Reduce_it;
            list : pointer;
     end {object};


Here are the procedures to be written in your ADT:

  1. InsertList(var list : pointer; newPtr : point_tnode); which creates a record of type ListNode with the info field set to set to newPtr and inserts it into the linked list pointed to by list, in its proper place. This linked list is sorted by frequencies, i.e., newPtr^.data.freq. It uses push( list, newPtr) if the frequency is the smallest on the list to insert a node at the beginning of the list, or insert( follow, newPtr) to insert a node in any point in the rest of the sorted list, where follow is the pointer to the node that has the highest frequency smaller that that of the new node, i.e., the pointer to the preceding node in the linked list.

  2. ReadOrdList( var DataFile : RecordFile); which repeatedly until there is no more data to be read. List is the private field and ReadOrdListis a method of the object.

  3. DelFirst(var list, smallfreqNode : pointer) which removes the first element of the linked list pointed to be list, and returns the pointer to that element in smallfreqNode.

  4. Combine( var list : pointer; p1, p2 : pointer) which creates a new node whose frequency is the sum of the frequencies in the nodes pointed to by p1 and p2, and inserts that new node in its proper place in the linked list list by calling InsertList.

  5. reduce( var list : pointer) which gets as input list

  6. Reduce_it calls reduce and prints the resulting linked list until there is only one node remaining on the list. The frequency field of this node will automatically contain the sum of all the frequencies on the original linked list.

Each procedure should contain preconditions and postconditions. The data for this program will contain the frequencies and will be mailed to you via majordomo. Or you can make your own binary file. The data is: A 40 E 45 G 6 H 8 I 34 L 12 N 9 O 20 S 15 V 2 W 3.

Your client should call an instance of the object huffman_type and should