For this assignment you will be implementing a hash table data structure for storing different types of structs. The context of the assignment is a simulated book store that will track its books for sale and the actual sales themselves through two different hash tables. Towards the end of the assignment we will also explore a (sorta related) interesting computational problem we will solve using a simple, non-recursive brute-force algorithm. Perhaps in your Algorithms course you will study it (or a similar problem) using a more efficient, recursive algorithm.
Data Structures
The assignment relies on three data structures given as structs:
struct Book { string isbn; // 13-character, null-terminated string [14 bytes] string title; // 25-byte buffer to store the null-terminated title string author; // 25-byte buffer to store the null-terminated author int times_sold; // 4-byte integer the stores # of sales of this book
}
Note that a Book struct requires 14 + 25 + 25 + 4 = 68 bytes. We will assume that the entire struct is word-aligned, which guarantees that times_sold will also be word-aligned. In this assignment we will also assume that a string containing an ISBN consists only of the digit characters (no dashes, spaces, Xs, etc.)
struct BookSale { string isbn; // 13-character, null-terminated string [14 bytes] byte[2] padding; // 2 bytes of zeros int customer_id; // 4-byte customer ID# int sale_date; // date of sale as # of days since 1/1/1600 [4 bytes] int sale_price; // 4 bytes
}
Note that a BookSale struct requires 14 + 2 + 4 + 4 + 4 = 28 bytes. The date of the sale is encoded as a 4-byte integer that provides the number of days that have passed since January 1, 1600. For example, January 5, 1600 would be represented by the number 4. January 1, 1600 itself would be represented by the number 0. November 8, 2020 would be represented by the number 153,714. You will write functions to convert a date-string in the form YYYY-MM-DD (e.g., 2020-11-08) to this numerical encoding. The inspiration for this time format is the UNIX timestam p system.
Instances of the above structs will be stored in two separate hash table data structures, as defined below. The hash table itself will be word-aligned.
struct Hashtable { int capacity; // maximum # of elements in hashtable [4 bytes] int size; // # of elements in hashtable [4 bytes] int element_size; // size of one element in the hashtable [4 bytes] object[] elements; // objects stored in the hashtable
// consumes (capacity*element_size) bytes
}
A hash table in this assignment consists of an array of structs, not an array of pointers to structs. Note how this differs from Java, wherein what we call an array of objects is really an array of references to objects. In C and MIPS it is possible to create an array of pointers to objects, and we will explore that option in the last function of the assignment. But for the functions in the current assignment, we will store the structs directly in the arrays themselves, the reason being that one of the purposes of this assignment is to deal with arrays of varying element sizes.
In the elements array, an empty entry, i.e., an entry in the array that has never stored a struct, is filled entirely with zeros. A deleted (sometimes called available) entry is one that once stored an object that has since been deleted (removed) from the hash table. A deleted element is represented by
(capacity*element_size), one-byte, twos complement representations of -1 (i.e., 0xFF). That is, the entire entry in the array consists of one-byte -1s. Due to a quirk of how we will be using the hash table for this assignment, it will be sufficient to load the first byte of an entry in the elements array and check if that byte equals 0xFF. If so, that entry can be assumed to be deleted. Similarly, to check if an entry in elements is empty, it will be sufficient to load the first byte and check if it equals zero. Every 13-character ISBN must start with 9, so the first character of a non-empty, non-deleted object in elements cannot be 0. In sum, the first byte of a Book struct or a BookSale struct must be the number 0, the number -1, or the ASCII code for the character 9.
Part 1: Copy a Memory Buffer
int memcpy(byte[] dest, byte[] src, int n)
The memcpy function copies n bytes of data from address src to address dest. You may assume that the dest buffer is at least n bytes in size. This function will be used in later functions to copy structs from one place in memory to another. Therefore, you should code for generality and not be concerned about the significance of the bytes that the function copies. For this reason, your code should use lbu to read the bytes from memory.
The function takes the following arguments, in this order:
- dest: the address to copy bytes to
- src: the address to copy bytes from
- n: the number of bytes to copy (must be greater than 0)
Note that src and dest are not necessarily word-aligned.
Returns in $v0:
- n if n > 0, or
- -1 if n 0
Additional requirements:
- The function may only change the contents of the dest buffer and no other parts of main memory.
- Only the first n bytes memory starting at dest may be changed. All other bytes in dest must be left unchanged.
Examples:
Function Arguments | Return Value | Updated dest |
XXXXXXX, ABCDEFGHIJ, 3 | 3 | ABCXXXX |
XXXXXXX, ABCDEFGHIJ, 7 | 7 | ABCDEFG |
XXXXXXX, ABCDEFGHIJ, -3 | -1 | XXXXXXX |
XXXXXXX, ABCDEFGHIJ, 0 | -1 | XXXXXXX |
Part 2: Compare Two Strings
int strcmp(string str1, string str2)
The strcmp function takes two null-terminated strings (either or both of which could be empty) and returns an integer that indicates the lexicographic order of the strings. The function begins by comparing the first character of each string. If they are equal to each other, it continues with the following pairs until the characters differ or until a terminating null-character is reached. Note that the strings can be of different lengths.
The function takes the following arguments, in this order:
- str1: the starting address of the first string
- str2: the starting address of the second string
Returns in $v0:
- the difference between the ASCII values of the first mismatch, if any: str1[n] str2[n], where n is the index of the first mismatch; or
- 0 if the contents of both strings are identical (including the case where they are both empty strings); or
- length of str1 if str2 is an empty string but str1 is non-empty; or
- negated length of str2 if str1 is an empty string but str2 is non-empty
Additional requirements:
- The function must not write any changes to main memory.
Examples:
Function Arguments | Return Value |
ABCD, ABCGG | -3 |
WHOOP!, WHOA | 14 |
Intel, pentium | -39 |
STONY, BROOK | 17 |
, mouse | -5 |
lonely guy, | 10 |
Wolfie, Wolfie | 0 |
, | 0 |
happy, Z | 14 |
WOLF, WOLFIE | -73 |
StonyBrook, Stony | 66 |
Part 3: Initialize a Hash Table
int initialize_hashtable(Hashtable* hashtable, int capacity, int element_size)
The initialize_hashtable function takes a pointer to (i.e., address of) an uninitialized Hashtable struct and performs the following steps:
- sets the structs capacity field to the capacity argument
- sets the structs size field to 0
- sets the struct element_size field to the element_size argument
- fills the entirety of the structs elements field to 0 (i.e., all capacity * element_size bytes)
Assume that the uninitialized block of memory referenced by hashtable is large enough to store a hash table of the required size in bytes.
The function takes the following arguments, in this order:
- hashtable: a pointer to an uninitialized block of memory
- capacity: the number of elements in the hash tables elements array
- element_size: the size of a single struct stored in one element of the hash tables elements
Returns in $v0:
- -1 if capacity is less than 1 or element_size is less than 1; or 0 otherwise
Additional requirements:
- The function may only change the contents of the hashtable buffer and no other parts of memory.
Example #1: initializing a hash table to store Book structs.
hashtable = pointer to buffer of 352 (12+5*68) uninitialized bytes of memory capacity = 5 element_size = 68
Return value in $v0: 0
Resulting contents of hashtable :
5 0 68 (capacity, size, element_size)
elements[] array contents in hexadecimal, truncated:
0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
1: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
2: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
3: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
4: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Example #2: initializing a hash table to store BookSale structs.
hashtable = pointer to buffer of 320 (12+11*28) uninitialized bytes of memory capacity = 11 element_size = 28
Return value in $v0: 0
Resulting contents of hashtable :
11 0 28 (capacity, size, element_size)
elements[] array contents in hexadecimal, truncated:
0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
1: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
2: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
3: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 4: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
.
. .
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Example #3: failed attempt to initialize a hash table to store BookSale structs. hashtable = pointer to buffer of some unknown number of uninitialized bytes of memory capacity = -3 element_size = 28
Return value in $v0: -1
Contents of hashtable are the same as before the function call.
Part 4: Compute the Hash Code for an ISBN in a Given Hash Table
int hash_book(Hashtable* books, string isbn)
The hash_book function takes a pointer to a valid Hashtable struct named books and an ISBN-13 value called isbn for which we want to compute the hash function. The hash function itself is very simple: sum the ASCII codes of the 13 characters in isbn and take the modulus of dividing that sum by the capacity of the books hash table (e.g., sum of the ASCII codes mod the capacity).
The function takes the following arguments, in this order:
- books: the starting address of a valid Hashtable struct
- isbn: a 13-character, null-terminated string that represents a valid ISBN
Returns in $v0:
- The value of the hash function when evaluated for the provided ISBN.
Additional requirements:
- The function may not write any changes to main memory.
Example #1:
hashtable = pointer to a valid hash table with capacity = 9 isbn = 9780671028370
Return value in $v0: 7
Example #2:
hashtable = pointer to a valid hash table with capacity = 17 isbn = 9780071401940
Return value in $v0: 11
Example #3:
hashtable = pointer to a valid hash table with capacity = 13
isbn = 1123121401940
Return value in $v0: 3
Part 5: Retrieve a Book from a Hashtable of Book Structs
int, int get_book(Hashtable* books, string isbn) -> (int, int)
The get_book function attempts to find a struct inside books with the given ISBN and returns the index of that struct, as well as the number of entries in books.elements that needed to be inspected to find that struct by way of linear probing. If an empty entry is encountered during probing, this means that no book of the given ISBN is in the hash table. If a deleted entry is encountered during probing, the linear probing must continue. If no struct has the given ISBN, the function returns -1 in $v0 and the number of inspected entries of elements in $v1.
The function takes the following arguments, in this order:
- books: the starting address of a valid Hashtable struct
- isbn: a 13-character null-terminated string that represents a valid ISBN
Returns in $v0:
- The index in elements where the book was found; or
- -1 if the hash table does not contain a book with the given ISBN
Returns in $v1:
- The number of entries in the elements array that were accessed before a book of the given
ISBN was found or was determined not to be in the elements array.
Additional requirements:
- The function may not write any changes to main memory.
- The function must call hash_book and strcmp .
The examples in this document are not comprehensive nor cover all possible classes of inputs!
Example #1: Attempt to get a book from an empty hash table
isbn = 9780553214830
Contents of books hash table:
9 0 68 (capacity, size, element_size)
0: empty
1: empty
2: empty
3: empty
4: empty
5: empty
6: empty
7: empty
8: empty
Return value in $v0: -1
Return value in $v1: 1
Example #2: Get a book that is present in the hash table at the expected index
isbn = 9780440060670
Contents of books hash table:
7 5 68 (capacity, size, element_size)
0: 9780345501330 Fairy Tail, Vol. 1 (Fair Hiro Mashima, William Fl 0
1: empty
2: 9780670032080 Financial Peace Revisite Dave
Ramsey 0
3: 9780440060670 The Other Side of Midnig Sidney Sheldon 0
4: 9780064408330 Joey Pigza Swallowed the Jack
Gantos 0
5: deleted
6: 9781416971700 Out of My Mind Sharon M.
Draper 0
Return value in $v0: 3
Return value in $v1: 1
Example #3: Attempt to get a book that is not present in the hash table; expected location is empty
isbn = 9780007201780
Contents of books hash table:
7 5 68 (capacity, size, element_size)
0: 9780345501330 Fairy Tail, Vol. 1 (Fair Hiro Mashima, William Fl 0
1: empty
2: 9780670032080 Financial Peace Revisite Dave
Ramsey 0 3: 9780440060670 The Other Side of Midnig Sidney
Sheldon 0
4: 9780064408330 Joey Pigza Swallowed the Jack
Gantos 0
5: deleted
6: 9781416971700 Out of My Mind Sharon M.
Draper 0
Return value in $v0: -1
Return value in $v1: 1
Example #4: Get a book that is not present in the hash table; expected location is deleted (hash table not full); probing required
isbn = 9780060182980
Contents of books hash table:
7 5 68 (capacity, size, element_size)
0: 9780345501330 Fairy Tail, Vol. 1 (Fair Hiro Mashima, William Fl 0
1: empty
2: 9780670032080 Financial Peace Revisite Dave
Ramsey 0 3: 9780440060670 The Other Side of Midnig Sidney
Sheldon 0
4: 9780064408330 Joey Pigza Swallowed the Jack
Gantos 0
5: deleted
6: 9781416971700 Out of My Mind Sharon M.
Draper 0
Return value in $v0: -1
Return value in $v1: 6
Example #5: Get a book that is present in a full hash table
isbn = 9780064408330
Contents of books hash table:
7 7 68 (capacity, size, element_size)
0: 9780345501330 Fairy Tail, Vol. 1 (Fair Hiro Mashima, William Fl 0
1: 9780345419580 Moreta: Dragonlady of Pe Anne
McCaffrey 0
2: 9780670032080 Financial Peace Revisite Dave
Ramsey 0 3: 9780440060670 The Other Side of Midnig Sidney
Sheldon 0
4: 9780064408330 Joey Pigza Swallowed the Jack
Gantos 0
5: 9780140168130 Big Sur Jack Kerouac, Aram Saroy 0
6: 9781416971700 Out of My Mind Sharon M.
Draper 0
Return value in $v0: 4
Return value in $v1: 1
Example #6: Get a book that is present in the hash table at a probed index; one or more deleted slots encountered during probing
isbn = 9780140168130
Contents of books hash table:
7 5 68 (capacity, size, element_size)
0: 9780345501330 Fairy Tail, Vol. 1 (Fair Hiro Mashima, William Fl 0
1: 9780345419580 Moreta: Dragonlady of Pe Anne
McCaffrey 0
2: deleted
3: 9780440060670 The Other Side of Midnig Sidney
Sheldon 0
4: deleted
5: 9780140168130 Big Sur Jack Kerouac, Aram Saroy 0
6: 9781416971700 Out of My Mind Sharon M.
Draper 0
Return value in $v0: 5
Return value in $v1: 6
Example #7: Attempt to get a book that is not present in the hash table; hash table is full
isbn = 9780316905750
Contents of books hash table:
7 7 68 (capacity, size, element_size)
0: 9780345501330 Fairy Tail, Vol. 1 (Fair Hiro Mashima, William Fl 0
1: 9780345419580 Moreta: Dragonlady of Pe Anne
McCaffrey 0
2: 9781516865870 Beacon 23: The Complete Hugh
Howey 0
3: 9780440060670 The Other Side of Midnig Sidney
Sheldon 0
4: 9781573451990 Rumors of War (Children Dean
Hughes 0
5: 9780140168130 Big Sur Jack Kerouac,
Aram Saroy 0
6: 9781416971700 Out of My Mind Sharon M.
Draper 0
Return value in $v0: -1
Return value in $v1: 7
Part 6: Insert a Book Struct into a Hashtable of Books
int, int add_book(Hashtable* books, string isbn, string title, string author)
The add_book function attempts to insert a new Book struct into a given hash table by initializing the contents of the struct with the provided ISBN, title and author. It also sets the times_sold field of the struct to 0. It begins by checking if the hash table is full. If so, the function returns (-1, -1) and makes no changes to the hash tables contents. Next, it checks if a book of the given ISBN is already in the hash table. If so, the function returns the return values of get_book(books, isbn) as its own return values. Otherwise, the function attempts to insert the struct at books.elements[hash_book(isbn)]. Failing that, it performs linear probing to find the next empty or deleted entry in the elements array and inserts the struct at that location. If needed, probing wraps-around from the last entry in elements to the first. The function returns in $v0 the index where the struct was inserted. It returns in $v1 the number of entries in elements it had to read before finding an empty or deleted entry. For example, if it must skip over 3 occupied slots before finding an empty one, $v1 would be 4 in that case. If a book is successfully added to the hash table, books.size is incremented by 1.
Note that at most 24 characters of a books title or author can be stored in a Book struct. Therefore, any characters in title or author beyond the 24th character are dropped. The function must then write a null-terminator for the truncated string.
When writing the title and author string arguments into the Book struct, if the string is shorter than 24 characters in length, the function pads out the remainder of the 25 bytes of the title or author field by appending null-terminators. For instance, if title were MIPS Rocks, which is only 10 characters in length, the function would write the following as the title field in the new Book struct:
MIPS Rocks
The function takes the following arguments, in this order:
- books: the starting address of a valid Hashtable of Book structs
- isbn: a 13-character null-terminated string that represents a valid ISBN
- title: a non-empty, null-terminated string that represents a books title
- author: a non-empty, null-terminated string that represents a books author
Returns in $v0:
- The index in elements where the new book was added (or found, if the book was already in the hash table); or
- -1 if the hash table was full before the function call.
Returns in $v1:
- The number of entries in the elements array that were accessed before the book was added to the hash table; or
- -1 if the hash table was full before the function call.
Additional requirements:
- The function may only change the contents of the books hash table and no other parts of memory.
- The function must call get_book, hash_book and memcpy.
Example #1: Add a book to an empty hash table.
isbn = 9780553214830 title = The Declaration of Independence author = Founding Fathers
Return value in $v0: 4
Return value in $v1: 1
Resulting books contents:
9 1 68 (capacity, size, element_size)
0: empty
1: empty
2: empty
3: empty
4: 9780553214830 The Declaration of Indep Founding
Fathers 0
5: empty
6: empty
7: empty
8: empty
(Note: null-terminators are appended as needed to pad the title and author fields to 25 bytes each.
The final 4-byte word in each struct is the times_sold field, which is 0 in this case.)
Example #2: Add a book to a hash table with existing values; no collisions
isbn = 9781620610090 title = Opal (Lux, #3) author = Jennifer L. Armentrout books contents before function call:
9 4 68 (capacity, size, element_size)
0: empty
1: 9780446695660 Double Whammy Carl
Hiaasen 0
2: empty
3: empty
4: 9780553214830 The Declaration of Indep Founding
Fathers 14 5: 9788433914260 Hollywood Charles
Bukowski 0
6: 9780345527780 Wicked Business (Lizzy &