Part 1
- [5 points] Insert the values 13,9,10,21,11,16,4,12,6 into an initially empty AVL tree in order. Show only the final tree after all insertions. Then, delete 10,9,6 in given order. Show only the final tree after all deletion operations.
- [5 points] Insert 15,13,9,5,12,8,7,4,0,6,2,1,3 into an empty min-heap. Show only the final heap as a tree (not as an array) after all insertions.
- [15 points] Let T1 and T2 be AVL trees such that the largest key in T1 is less than the smallest key in T2. Give an algorithm (in psuedocode) for procedure JOIN-AVLTREES(T1, T2) that joins these two AVL trees. The runtime should be O(logn), where n is the size of the resulting AVL tree. Justify in a few of sentences the correctness and efficiency of your algorithm.
Part 2 Running Median
Use the given file names and function signatures during implementation. Feel free to write helper functions to accomplish certain tasks but remember to give meaningful function names.
- [7,5 points] Implement an array-based max-heap data structure named as MaxHeap for maintaining a list of integer values that supports the following operations:
Listing 2: MaxHeap
void insert(int value); // inserts integer into heap int peek(); // returns the value of the max element int extractMax(); // retrieves and removes the max element int size(); // returns the number of elements in the heap
Put your code into MaxHeap.h and MaxHeap.cpp files.
- [7,5 points] Implement an array-based min-heap data structure named as MinHeap for maintaining a list of integer values that supports the following operations:
Listing 3: MinHeap
void insert(int value); // inserts integer into heap int peek(); // returns the value of the min element int extractMin(); // retrieves and removes the min element int size(); // returns the number of elements in the heap
Put your code into MinHeap.h and MinHeap.cpp files.
- [20 points] Design a MedianHeap data structure reusing MaxHeap and MinHeap, which supports getting median of the elements in O(1) time and inserting an element in O(logn) If the length of the list is even, choose the larger value of the two middle elements as the median.
Listing 4: MedianHeap
void insert(int value); //inserts an element into MedianHeap int findMedian(); //returns the value of the median
Put your code into MedianHeap.h and MedianHeap.cpp files.
Part 3 Huffman Coding
Huffman coding, in essence, is a lossless data compression algorithm. Consider the data which is a sequence of characters encoded in ASCII. In ASCII, every character is encoded with the same number of bits: 8 bits per character. The common characters, e.g., alphanumeric characters, punctuation, control characters, etc., use only 7 bits; there are 128 different characters that can be encoded with 7 bits. Huffman coding on the other hand, compresses data by using fewer bits to encode more frequently occurring characters so that not all characters are encoded with 8 bits.
Suppose that we have a 100,000 character data file that we wish to store. We observe that the characters in the file occur with the frequencies given by Table 1. That is, only 6 different characters appear, and the character a occurs 45,000 times. If we store it without compression we would need 700,000 bits. However with Huffman coding, we can compress the file to (45 1 + 13 3 + 12 3 + 16 3 + 9 4 + 5 4)1,000 = 224,000 bits.
Table 1: A character-coding
a b c d e f
Frequency (in thousands) 45 13 12 16 9 5
ASCII 1100001 1100010 1100011 1100100 1100101 1100110
Variable-length codeword 0 101 100 111 1101 1100
The main goal is to create a tree that corresponds to coding schemes that will be used to encode/decode a file. Huffman trees are a type of binary tree used in performing this kind of conversion. A Huffman tree includes all the characters of your data on the leaf nodes, with most frequent characters being closer to the root. The distance and path away from the root is used to determine what bits (what 0s and 1s) are used to represent each character. The Huffman tree of Table 1 is as follows.
In the pseudocode that follows, we assume that C is a set of n characters and that each character c in C is an object with an attribute c.freq giving its frequency. The algorithm uses a min-priority queue Q, keyed on the freq attribute, to identify the two least-frequent objects to merge together.
Algorithm 1: HUFFMAN(C)
- n = |C|;
- Q = C;
- for i = 1 to n 1 do
4allocate a new node z;
5z.left = x = EXTRACT_MIN(Q);
6z.right = y = EXTRACT_MIN(Q);
7z.freq = x.freq + y.freq;
8INSERT(Q,z);
- end
- return EXTRACT_MIN(Q)
The steps of creating the Huffman tree of Table I is given in the next page as an example [1]. Your first objective should be to implement the min-heap data structure as an array of pointers in which each pointer points to a MinHeapNode.
Listing 5: MinHeapNode
struct MinHeapNode
{
char character; // An input character int freq; // Frequency of the character
MinHeapNode *left, *right; // Left and right child
};
Put your min-heap code into HuffmanHeap.h and HuffmanHeap.cpp files. After you are finished, transform your heap into a min-priority queue implementing Huffman coding and put the codes into HuffmanQueue.h and HuffmanQueue.cpp files. Design the Huffman coding algorithm and put your code into HuffmanCode.h and HuffmanCode.cpp files.
Add a main function to the main.cpp file which reads an input file of characters and their frequencies (i.e., e 9), and writes their coding schemes (i.e., e 1101) on an output file. Take input and output file names as arguments. At the end, write a basic makefile which compiles all your code and creates an executable file named hw3.
References
[1] T.M. Cormen, C. E. Leiserson, R. L. Rivest & C. Stein. Introduction to Algorithms,
3rd edition. The MIT Press. 2001. 432
Reviews
There are no reviews yet.