CSCI 620  Program 5: Huffman Compression, Step 2

1 The Goals and the Purpose.
1. To use an array and a heap.
2. To implement the rst two phases of a Hu man Encoding application.
A Hu man code can be used to compress a le. It replaces xed-size, one-byte characters by variable- length codes (strings of bits). The codes for the most frequently used characters will be shorter than 8 bits, those for unusual characters may be longer. A new code is de ned for each le.
This helps to keep the codes short, since no codes are generated for characters that are not used in the le. The input le is read twice, once to generate the code, then again to encode the le. This assignment and the next two form one large project to generate and use a Huffman code. It is essential
to debug this part before going on to the rest of the project.
2 Modules to Create
Class Tally You have a complete and working Tally class from Program 2. Include your Tally class as part of this project.
Class Node and Tree
 De ne a tree Node with four data members: a char, an int, and two Node*s. Also typedef Tree to be a Node*.
 Write Node constructor that takes two parameters: a char (the input character), and an int (its frequency). Use the parameters and two nullptrs to initialize the Node object.
 Write a second Node constructor that takes two di erent parameters: two Node*s (the left and right sons of the new node). Use the parameters to initialize the pointer elds of the Node object.
Set the char eld to any arbitrary visible char value: you will never use it but you need to be able to see it on a printout. Set the frequency eld to the sum of the frequencies of the left and right sons.
 You may also need a default constructor for Node. If so, use =default.
 For this assignment, de ne a null default destructor. A real destructor will be needed in the last phase of this program, but not at this stage. The output of this phase will be a binary tree of Nodes that contains every Node that you have allocated.
 Write a print function that displays all four elds. The pointers will be printed in hex if you use
<<.
 Write accessor functions if and only if you need them.
A Heap Class Adapt the function de nitions from the class Heap example to implement a MIN-heap that stores Trees instead of numbers.
 Your Heap should have one data member: a vector<Node*.
 Write a Heap constructor with one parameter, an array of type Tally. See details below.
 Implement the downHeap, buildHeap, and printHeap functions rst, and debug that much.
 After your heap works, implement push() and pop() to put things into the heap and take them out.
 Also write a function reduceHeap() that calls push() and pop() to reduce the heap to one element. See details below.

The Heap Constructor
 Start by pushing a dummy Node* into the vector to occupy subscript 0. We want the real data
to start at subscript 1.
 Then process the data in the tally vector:
{ Take each non-zero Tally in the array.
{ Use the character and the counter to create a new Node*.
{ Push the Node* into the vector.
 Call the buildHeap function to arrange the data in heap order, according to the frequency of the
letters, with the least frequent letter in slot 1 of the vector
reduceHeap() The goal of this function is to combine all the Nodes in the heap into one Hu man
tree. Repeat this until there is only one node left:
 Call pop() to remove the node at the root, replace it by the last node in the heap, and perform
a downHeap() operation to re-establish heap order. This node will be the left son of a new node.
 Call pop() again to remove the node at the root, replace it by the last node in the heap, and
perform a downHeap() operation to re-establish heap order. This node will be the right son of a
new node.
 Create a new Node with these to Node*s.
 Push the new node onto the end of the heap and perform an upHeap() operation to re-establish
heap order.
The main Function
1. Create a Tally object and use it, as you did in P2, to tally the characters in an input le. Main
should own the Tally array.
2. Construct a Heap object using the lled Tally array. The constructor will call buildHeap().
When the heap constructor nishes, print the heap array and make sure the heap is correct. Then go
on to the next phase:
1. Call reduceHeap() to unify the nodes in the Heap.
2. Print the resulting tree that is in subscript 1 of the heap. To do this, you will need to do a
recursive tree walk. In this recursion, print the contents of a node then indent four columns and
recursively print the let son, then the right son. Return from the recursion when the tree you are
trying to print is a nullptr; print \|||-" instead. (Remember: every leaf node has two nullptr
sons.)
3 Future Work
The last two phases of the Hu man project are:
 P6: Do a recursive treewalk to generate a code. Write the code to a binary le.
 P7: Reopen the original text le. Encode each letter in it. Write the encodings to a binary le.
In a real Hu man project, the code and the encoded message are written to the same le. I am asking
you to keep them separate for reasons related to debugging and grading.
Powered by