Part 2: Radix-sort on strings
The class random generator has been updated (random_generator.h and random generator.cpp) by a member function which generates random strings of up to a given length with characters between a and z.
- char* random_string(int max, int& m)
The function picks a string length m at random in between 1 and max (1<=m<= max). The length m of the random string is stored in the second parameter. The function then allocates m + 1 characters. The first m characters (0m-1) are chosen at random in between the characters a and z. The m-th character is set to 0 in order to mark the end of the string.
A string in C is represented by an array of characters (e.g. char *st). Each character is an ASCII representation of a specific integer value. The character a is a representation of the value 97 according to the ASCII table. The individual characters of the array/string can therefore be viewed as a digit.
- Implement an insertion sort algorithm for strings that sorts a given array of strings according to the character at position d. It is necessary to include the length of each string (array A len) as it is unclear whether or not the digit d exists. A non-existing digit d is interpreted as a 0 in the sorting process. Add a parameter int d and int* A_len to the algorithm arguments and modify the given insertion sort algorithm accordingly. This algorithm should be implemented in the method below:
void insertion_sort_digit(char** A, int* A_len, int l, int r, int d)
Notes:
- Do not copy the characters itself, but only the pointer to the string/array of characters instead.
- Do not fill the strings with 0s to make them of equal size.
- Do not compute the length of a string with strlen. Use the int array A_len
- If you move a string within the array of strings then you must update the array containing the length of the strings accordingly.
- Use this modified insertion sort algorithm to implement radix sort for an array of strings. Measure the runtime performance for arrays of random strings 10 times for every combination of array size n = 10000; 25000; 50000; 75000; 100000 and length of the random strings m = 25; 50; 75. Discuss your results. (You might have to adjust the value for n dependent on your computers speed, but allow each test to take up to a couple of minutes). This algorithm should be implemented in the method below:
void radix_sort_is(char** A, int* A_len, int n, int m)
- Develop a counting sort algorithm for strings that sorts a given array of strings according to the character at position d. As for the insertion sort on digit d, a non-existing digit d is treated as a 0 throughout the counting sort. This algorithm should be implemented in the method below:
void counting_sort_digit(char** A, int* A_len, char** B, int* B_len, int n, int d)
Notes:
- Do not copy the characters itself, but only the pointer to the string/array of characters, instead.
- Do not fill the strings with 0s to make them of equal size.
- Do not compute the length of a string with strlen. Use the int array A_len
- When moving the string into its correct position you also have to move the length of the string into its correct position.
- You can use int C[256] to store the counters/positions.
- Use this new counting sort algorithm to implement radix sort for an array of strings. Measure the runtime performance for arrays of random strings 10 times for every combination of array size n = 100000; 250000; 500000; 750000; 1000000 and length of the random strings m = 25; 50; 75. Discuss your results. (You might have to adjust the value for n dependent on your computers speed, but allow each test to take up to a couple of minutes). This algorithm should be implemented in the method below:
void radix_sort_cs(char** A, int* A_len, int n, int m)
Remarks:
- You are not allowed to use code from online resources. Your submission will be tested against that, and will receive a 0, and a report to the Graduate Academic Integrity Board if it is detected.
- Your report has to be typed, and submitted in a pdf file. You should provide figures that plot the running times in relation to the input size parameters.
- You might have to adjust the value for n depending on your computers speed, but allow each test to take up to a couple of minutes.
- Start with smaller values of n and m and stop if one instance of the algorithm takes more than 5 min to complete (the insertion sort implementations will hit that limit early on).
- Discuss your results means that in your report you have to present the results in a table, and analyze and interpret your findings properly. What do the experiments tell you?
- The programming, testing and the experimentation will take some time. Start early.
- Feel free to use the provided source code for your implementation. You have to document your code.
- A Makefile is provided to build the code in the Virtual Box provided.
- Your code has to compile, and run at the Virtual box shared with you in order to be graded.
- Your methods should be added in the sort.h/sort.cpp files. You must keep the same file structure, makefile that is provided in the skeleton code.
Reviews
There are no reviews yet.