Part 2: Radix-sort on strings
The class random generator has been updated (random_generator.h and random generator.cpp) by a member function which generates random strings of up to a given length with characters between a and z.
- char* random_string(int max, int& m)
The function picks a string length m at random in between 1 and max (1<=m<= max). The length m of the random string is stored in the second parameter. The function then allocates m + 1 characters. The first m characters (0m-1) are chosen at random in between the characters a
and z. The m-th character is set to 0 in order to mark the end of the string.
A string in C is represented by an array of characters (e.g. char *st). Each character is an ASCII representation of a specific integer value. The character a is a representation of the value 97 according to the ASCII table. The individual characters of the array/string can therefore be viewed as a digit.
- Implement an insertion sort algorithm for strings that sorts a given array of strings according to the character at position d. It is necessary to include the length of each string (array A len) as it is unclear whether or not the digit d A non-existing digit d is interpreted as a 0 in the sorting process. Add a parameter int d and int* A_len to the algorithm arguments and modify the given insertion sort algorithm accordingly. This algorithm should be implemented in the method below:
void insertion_sort_digit(char** A, int* A_len, int l, int r, int d)
Notes:
- Do not copy the characters itself, but only the pointer to the string/array of characters instead.
- Do not fill the strings with 0s to make them of equal size.
- Do not compute the length of a string with strlen. Use the int array A_len
- If you move a string within the array of strings then you must update the array containing the length of the strings accordingly.
- Use this modified insertion sort algorithm to implement radix sort for an array of strings. Measure the runtime performance for arrays of random strings 5 times for every combination of array size n = 2500; 5000; 7500; 10000 and length of the random strings m = 25; 35; 45. Discuss your results. (You might have to adjust the value for n dependent on your computers speed, but allow each test to take up to a couple of minutes). This algorithm should be implemented in the method below:
void radix_sort_is(char** A, int* A_len, int n, int m)
- Develop a counting sort algorithm for strings that sorts a given array of strings according to the character at position d. As for the insertion sort on digit d, a non-existing digit d is treated as a 0 throughout the counting sort. This algorithm should be implemented in the method below:
void counting_sort_digit(char** A, int* A_len, char** B, int* B_len, int n, int d)
Notes:
- Do not copy the characters itself, but only the pointer to the string/array of characters, instead.
- Do not fill the strings with 0s to make them of equal size.
- Do not compute the length of a string with strlen. Use the int array A_len
- When moving the string into its correct position you also have to move the length of the string into its correct position.
- You can use int C[256] to store the counters/positions.
- Use this new counting sort algorithm to implement radix sort for an array of strings. Measure the runtime performance for arrays of random strings 5 times for every combination of array size n = 50000; 100000; 500000; 750000 and length of the random strings m = 25; 35; 45. Discuss your results. (You might have to adjust the value for n dependent on your computers speed, but allow each test to take up to a couple of minutes). This algorithm should be implemented in the method below:
void radix_sort_cs(char** A, int* A_len, int n, int m)
Remarks:
- You are not allowed to use code from online resources. Your submission will be tested against that, and will receive a 0, and a report to the Graduate Academic Integrity Board if it is detected.
- Your report has to be typed, and submitted in a pdf file. You should provide figures that plot the running times in relation to the input size parameters.
- You might have to adjust the value for n depending on your computers speed, but allow each test to take up to a couple of minutes.
- Start with smaller values of n and m and stop if one instance of the algorithm takes more than 5 min to complete (the insertion sort implementations will hit that limit early on).
- Discuss your results means that in your report you have to present the results in a table, and analyze and interpret your findings properly. What do the experiments tell you?
- The programming, testing and the experimentation will take some time. Start early.
- Feel free to use the provided source code for your implementation. You have to document your code.
- A Makefile is provided to build the code in the Virtual Box provided.
- Your code has to compile, and run at the Virtual box shared with you in order to be graded.
- Your methods should be added in the sort.h/sort.cpp files. You must keep the same file structure, makefile that is provided in the skeleton code.
Reviews
There are no reviews yet.