# Programming Assignment 3 Computer Science 102

Spring 1997

First draft (preliminary version) due: SUN, APRIL, 5, 12:00 pm.

Final draft due: SUN, APRIL 12, 12:00 pm

Introduction

In this assignment we will study an efficient scheme for for compressing a text file. It is called Huffman coding. A simple method to encode text as a string of 0s and 1s is to represent each character as a unique string of 8 binary digits (bits). A text message is thus translated into a string of bits. By exploting the fact that not all characters appear with the same frequency in the text, we can encode rarely used characters with long codes and frequently used ones with short codes.

Given a set of characters and their corresponding frequencies, an optimal coding scheme is produced by Huffman coding. It uses a binary tree for encoding. In Draft #1, however, we simply use a linked list.

Draft #1

We will use a linked list implementation to write and test the initial versions of some of the procedures to be used in draft #2. here is the data structure (it's on the WEB) you will using:

```unit huff3ADT;

INTERFACE

type HuffmanRecord = record
name : char;
freq : integer
end;

point_tnode = ^tnode
tnode = record
Data : HuffmanRecord
left, right : point_tnode
end;

pointer = ^  ListNode;
ListNode = record
info : point_tnode;
next : pointer
end;
RecordFile = File of  HuffmanRecord;
huffman_type = object
PROCEDURE print;
PROCEDURE ReadOrdList( var DataFile : RecordFile);
PROCEDURE Reduce_it;
private
list : pointer;
end {object};

IMPLEMENTATION
end.
```

1. InsertList(var list : pointer; newPtr : point_tnode); which creates a record of type ListNode with the info field set to set to newPtr and inserts it into the linked list pointed to by list, in its proper place. This linked list is sorted by frequencies, i.e., newPtr^.data.freq. It uses push( list, newPtr) if the frequency is the smallest on the list to insert a node at the beginning of the list, or insert( follow, newPtr) to insert a node in any point in the rest of the sorted list, where follow is the pointer to the node that has the highest frequency smaller that that of the new node, i.e., the pointer to the preceding node in the linked list. The node representing the record is either created in Push or in Insert.

2. ReadOrdList( var DataFile : RecordFile); which repeatedly
• reads a sequence of characters and their associated frequencies from a file
• creates a record of type point_tnode which holds the information read in.
• and inserts this new node into the linked list list by calling InsertList
until there is no more data to be read. List is the private field and ReadOrdListis a method of the object.

3. DelFirst(var list, smallfreqNode : pointer) which removes the first element of the linked list pointed to by list, and returns the pointer to that element in smallfreqNode.

4. Combine( var list : pointer; p1, p2 : pointer) which creates a new node whose frequency is the sum of the frequencies in the nodes pointed to by p1 and p2, and inserts that new node in its proper place in the linked list list by calling InsertList.

5. reduce( var list : pointer) which gets as input list
• calls DelFirst twice to obtain the two nodes within the linked list of minimum frequency, the first call should return a pointer p1, and the second call should return a pointer p2.
• calls Combine with list and the above two pointers as parameters.

6. Reduce_it calls reduce and prints the resulting linked list until there is only one node remaining on the list. The frequency field of this node will automatically contain the sum of all the frequencies on the original linked list.

Each procedure should contain preconditions and postconditions. The data for this program will contain the frequencies and will be mailed to you via majordomo. Or you can make your own binary file (see Section 15.3, page 665 in the Turbo Pascal text). The data is: A 40 E 45 G 6 H 8 I 34 L 12 N 9 O 20 S 15 V 2 W 3 and the binary file is on the WEB.

Your client should call an instance of the object huffman_type and should

• Read the data from the file and form the linked list.