The assignment is to input movie and review data, allowing the user to search this data quickly for movies by ID or name. Heres a screenshot of the final program (one possible execution run):
The only data structure you may use are binary search trees, in particular the BST class you have developed in the HW and labs.
Programming Environment
You are free to program in any environment you want; final submissions will be collected using
Gradescope. We are releasing an initial main.cpp file, and sample input files, on Codio. Login to Codio and open the project cs251-project02-bst-movie-search. You can export the provided files using the Project menu, Export as Zip. Or you can work on Codio using the provided makefile: open a terminal window via the Tools menu, and make build to compile and make run to run.
If you have not yet created a Codio account, register using your UIC email address and join the class (CS 251 Fall 2019) via the following URL: https://codio.com/p/joinclass?token=brigadeamerica
Part 1 of 5: binarysearchtree<TKey, TValue>
From the most recent HW and lab exercises, you should have a working binarysearchtree<TKey> class in a header file bst.h. The first part of this assignment is to extend your binarysearchtree class so that it works in terms of (key, value) pairs. At this point we have only stored keys in the tree. In reality, trees store a key for ordering / searching, and a separate value for any data associated with the key. For example, your UIC Netid is a key, while your name and address are values associated with your Netid. A value can be a simple integer, or more typically a structure of values (e.g. Name, Address, Phone, Email, etc.).
In the context of this assignment, youll need to store movie information in the tree, e.g. the publication year of the movie, along with review data. So youll want to define a structure to hold the data:
struct MovieData
{ int PubYear; int Num5Stars; int Num4Stars; int Num3Stars; int Num2Stars; int Num1Stars;
};
Now, to store movie data using the movie id as the key, you declare a binary search tree as follows:
binarysearchtree<int,MovieData> bstMoviesByID;
In other words, each entry in the tree is now a (key, value) pair where the key is an integer movie id, and the value is a MovieData struct. The tree is still ordered and searched using the key, the only differences are that insert will insert both (key, value), and search will return a pointer to the value so it can be read / written.
The good news is these changes are fairly straightforward, since only a few modifications are needed. To begin, the template clause at the top of bst.h must change to include both key and value:
template<typename TKey, typename TValue> class binarysearchtree
Next youll need to update the struct NODE to store both the key and the value:
struct NODE
{
TKey Key;
TValue Value; NODE* Left;
NODE* Right;
};
Keep in mind the tree is still ordered and searched by the key, so nothing with regards to how the tree is built or searched is changed. For the insert function, the only difference is that when the node is inserted, the value must also be stored in the new node. This means the value must be passed as a parameter:
void insert(TKey key, TValue value) { .
. // same as before, though be sure to store value along with key .
}
The final change involves the search function, which currently returns true / false to denote whether the key was found or not found. Now, if the key is found, we want to return a pointer (yes, a C-style pointer) to the value so it can be read or written. If the key is not found, search will return nullptr. Heres the declaration:
TValue* search(TKey key) { .
. // return pointer to value if found, nullptr if not . // NOTE: to return pointer: return &(cur->Value) .
}
Make these changes, and then test using different (key, value) combinations. Example:
binarysearchtree<int,int> bst1; binarysearchtree<int,string> bst2; binarysearchtree<string,int> bst3;
bst1.insert(123, 456); .
.
.
int* value = bst1.search(123);
cout << *value << endl; // should output 456 *value = 789; // change value to 789
I would recommend modifying your inorder function to output both the key and the value, and use inorder to help confirm your functions are working correctly.
Part 2 of 5: binarysearchtree<TKey, TValue> copy constructor
Since we are going to use the binarysearchtree class in a program, we need a copy constructor for parameter passing to work properly. Add a copy constructor in the public section of your bst.h file:
//
// copy constructor:
//
binarysearchtree(binarysearchtree& other) {
.
.
.
}
Like we did in the previous project, the goal of the copy constructor is to make a complete and independent copy of the other tree. Recall that you can make your life easier if you take advantage of the functions already available in the class [ Hint: insert ]. To make a copy, you have to traverse the entire other tree, and visit every node. Your inorder function does this as well, so your copy constructor should mimic how inorder works including the public-private approach. However, think very carefully about how the recursive traversal should be done: inorder, preorder, or postorder? Humm..
How to test your work? Make a copy of the tree, change the copy, and make sure the original is unchanged. Heres some code to get you started:
binarysearchtree<int,int> bst1; bst.insert(123, 456); .
. // insert more (key, value) pairs .
binarysearchtree<int, int> bst2 = bst1; // copy construct:
cout << bst1.size() << vs. << bst2.size() << endl; cout << bst1.height() << vs. << bst2.height() << endl;
int* value1 = bst1.search(123); int* value2 = bst2.search(123);
cout << *value1 << vs. << *value2 << endl; // both 456
*value2 = 789; // this should only change bst2:
cout << *value1 << vs. << *value2 << endl; // 456 vs. 789
Part 3 of 5: binarysearchtree<TKey, TValue> submission
Submit your binarysearchtree class to Gradescope for grading. Your work will receive a tentative grade for correctness, and then will undergo manual review for commenting, readability, and approach. This part of the assignment is worth 30 points, and commenting/readability/approach count for 10% of that 30 points.
Part 4 of 5: movie search program
Once you have a working BST class, use this class to input movie data and efficiently search this data in O(lgN) time. Since the input data is in random order, tree balancing is not required. Heres another screenshot of the program in
action ->
The movies input file consists of N>0 lines, where each line consists of 3 values: integer movie id, integer publication year, and string movie name. Heres the start of the provided movies1.txt file:
119 1995 Heat
87 1952 Singin in the Rain 6 2008 The Dark Knight .
.
.
104 1988 Die Hard
The data is in random order, both in terms of the movie id, and the movie name. This makes it suitable for building binary search trees using the id as a key, and the name as a key. In this assignment, youll want to create two different trees so you can lookup by id or name in O(lgN) time. Make no assumptions about the input data, as well test your work using different movie files.
The reviews input file consists of N>0 lines, where each line consists of 3 values: an integer review id, an integer movie id, and the integer rating in the range 1..5. The movie id refers to the movie for which this rating is associated, and the rating is the # of stars given in this review. Heres the start of the provided reviews1.txt file:
140035 188 5
133994 78 4
123308 199 2 .
.
.
289471 78 4
Once again, make no assumptions about the input data, as well test your work using different review files.
One of the design decisions youll need to make is how to associate each review with the correct movie. Thats where the movie id comes in. As you input the review data, lookup the movie id in your binary search tree, and store the rating in the value portion of the tree. In other words, use the reviews movie id as the search key, and store the reviews rating in the value associated with the key. When you are done processing the reviews, you should have a binary search tree that contains all the review data, organized by movie id or name (or both). The goal is to be able to find a movies information in O(lgN) time using either the movies id, or the movies name. This implies your key may be a simple type like integer or string, but your value will most likely be a struct.
When searching for a movie, note that the user can enter a numeric movie id, or a string name. Use the getline function to obtain the input, and then determine if the input is numeric or not:
string input;
cout << Enter a movie id or name (or # to quit)> ; getline(cin, input); // read entire input line
It is not sufficient to simply check the first character of the input to see if its numeric some movie names start with numeric values such as 12 Angry Men. Youll need to write a little function to loop over the string and check all the characters; strings are objects, use their length() function to obtain the # of characters.
The program should interact with the user until the sentinel # is input. Your program output must match the output shown in the screenshots *exactly*, otherwise the autograding system will report your program as incorrect. Its okay for your program to be case-sensitive, i.e. Finding Nemo != finding nemo.
Additional program requirements:
- You must use your own binarysearchtree class, a solution will not be provided and you cannot use that of another individual. The entire program must be your own work.
- No other data structures no arrays, no vectors, no stacks, no queues, nothing. The *only* data structure you may use is your own binarysearchtree<TKey, TValue>.
- Create multiple trees to ensure that all search and insert operations are O(lgN). Tree balancing is not required as the input data is random.
- You can only open and read each input file once; you cannot solve the program by reading the data from the input files over and over again.
Reviews
There are no reviews yet.