Programming Assignment 3 Computer Science 102
First draft due: WED, APRIL, 5,
In this assignment we will study an efficient scheme for for compressing
a text file. It is called Huffman coding. A simple method to encode
text as a string of 0s and 1s is to represent each character as a unique
string of 8 binary digits (bits). A text message is thus translated into
a string of bits. By exploting the fact that not all characters appear
with the same frequency in the text, we can encode rarely used characters
with long codes and frequently used ones with short codes.
Given a set of characters and their corresponding frequencies, an optimal
coding scheme is produced by Huffman coding. It uses a binary
tree for encoding. In Draft #1, however, we simply use a linked list.
We will use a linked list implementation to write and test the
initial versions of some of the procedures to be used in draft #2.
here is the data structure (it's on the WEB) you will using:
type HuffmanRecord = record
name : char;
freq : integer
point_tnode = ^tnode
tnode = record
Data : HuffmanRecord
left, right : point_tnode
pointer = ^ ListNode;
ListNode = record
info : point_tnode;
next : pointer
RecordFile = File of HuffmanRecord;
huffman_type = object
PROCEDURE ReadOrdList( var DataFile : RecordFile);
list : pointer;
Here are the procedures to be written in your ADT:
sort(var list : pointer; newPtr : point_tnode); which
creates a record of type ListNode with the info
field set to set to newPtr and inserts it into the linked list
pointed to by list, in its proper place. This linked list is
sorted by frequencies, i.e., newPtr^.data.freq. It uses
push( list, newPtr) if the frequency is the smallest on the list
to insert a node at the beginning of the list,
or insert( follow) to insert a node in any point in the
rest of the sorted list, where follow is the pointer to the node
that has the highest frequency smaller that that of the new node, i.e.,
the pointer to the preceding node in the linked list. The node
representing the record is either created in Push or in
ReadOrdList( var DataFile : RecordFile);
until there is no more data to be read. List is the private
field and ReadOrdListis a method of the object.
- reads a sequence of characters and their associated frequencies
from a file
- creates a record of type point_tnode which holds the
information read in.
- and inserts this new node into the linked list list
by calling sort
DelFirst(var list, smallfreqNode : pointer) which removes
the first element of the linked list pointed to by list,
and returns the pointer to that element in smallfreqNode.
Combine( var list : pointer; p1, p2 : pointer) which creates
a new node whose frequency is the sum of the frequencies in the
nodes pointed to by p1 and p2, and inserts that new node
in its proper place in the linked list list by calling
reduce( var list : pointer) which gets as input list
- calls DelFirst twice to obtain the two nodes within
the linked list of minimum frequency, the first call should return
a pointer p1, and the second call should return
a pointer p2.
- calls Combine with list and
the above two pointers as parameters.
calls reduce and prints the resulting linked list
until there is only one node remaining on the list. The
frequency field of this node will automatically contain the sum of all the
frequencies on the original linked list.
Each procedure should contain preconditions and postconditions.
The data for this program will contain the frequencies and will be mailed
to you via majordomo. Or you can make your own binary file
(see Section 15.3, page 665 in the Turbo Pascal text). The
data is: A 40 E 45 G 6 H 8 I 34 L 12 N 9 O 20 S 15 V 2 W 3
and the binary file is on the WEB. Note that you cannot read a binary
file simply by opening the link and viewing it using a browser. You must
download it and then write a Pascal program to read it.
Your client should call an instance of the object huffman_type
The client and the output are on the WEB. In the output, $
is the character placed in the name field for nodes produced
by combining two other nodes.
- Read the data from the file and form the linked list.
- Print the linked list.
- Reduce the list to the final node.