Spring 2007
Due THURS 11:59pm APR 12
the 2nd midterm will bw MON APR 16.
Introduction
As discussed in draft#1, we are studying an efficient scheme for for compressing text files called Huffman coding. We again exploit the fact that not all characters appear with the same frequency in the text, by encoding rarely used characters with long codes and frequently used ones with short codes.
Given a set of characters and their corresponding frequencies, an optimal coding scheme is produced by Huffman coding. It uses a binary tree for encoding. In Draft #3, however, we use a heap as a priority queue instead of a linked list. The big Oh behavior for removing the smallest value (the root) from the heap is O(1) -- it's done in constant time -- and the behavior for reheaping is O(log(n)) since it only depends on the height; whereas, the big Oh behavior for removing the minimum values from a linked list and repositioning the combined treelet is O(n). The big Oh behavior for doing this for n nodes is O(n*log(n)) for the heap and O(n^2) for the linked list. Please use the command line to execute your program and read from it the two file names, namely, huff.txt (it contains the letters and frequencies) and huffcode.txt (the "compressed" binary file). The command line can be obtained from Run on the Start menu by typing cmd. The two file names should be used as global variables in your program. Note that this draft can be done with static methods.
Draft #3
We will use a heap implementation of a priority queue to this program. Note that here we use a min-heap in this project instead of the max-heap we used in class. Also, deleteMin() will replace the deleteMax() that is on the lecture links on the web. Here is the skeleton of the class you will be using:
public class Hw3c //solves draft #3 for project #3 { static int last; //the # of nodes in heap. This decreases as succesive pairs of //minimum nodes are extracted final static int MAX = 11;//there are 11 pairs of letter-frequency entries static Comparable x = new Comparable[MAX + 1]//one more element for shift up private static class HuffRecord//defines the data frequency record { public String letter; public int freq; } private static class TreeNode implements Comparable //defines the treelets comprising the list { public HuffRecord data; public TreeNode left, right; } }
where HuffRecord and TreeNode are, as before, embedded classes. The keyword public is optional because it is the default. The class TreeNode must of course contain a compareTo method and should also contain a toString() method. Here are the methods to be altered or added to your class of draft #2:
Each procedure should contain preconditions and postconditions. The data for this program is the same as for draft #2.
Your main method should: