Name: [Solved] CSCE221-Assignment 2

In this assignment, the sort program reads a sequence of integers either from the screen (standard input) or from a file, and outputs the sorted sequence to the screen (standard output) or to a file. The program can be configured to show total running time and/or total number of comparisons done in the sort.
This program does not have a menu but takes arguments from the command line. The interface code is already complete in the template programs, so you only have to know how to execute the program from the command line.

The program usage is as follows. Note that options do not need to be specified in a fixed order.

Usage:

./sort [-a ALGORITHM] [-f INPUTFILE] [-o OUTPUTFILE] [-h] [-d] [-p] [-t] [-c]

Example:

./sort -h

./sort -a S -f input.txt -o output.txt -d -t -c -p

./sort -a I -t -c

./sort

Options:

-a ALGORITHM: Use ALGORITHM to sort.

ALGORITHM is a single character representing an algorithm:

S for selection sort

B for bubble sort

I for insertion sort

H for shell sort

R for radix sort

-f INPUTFILE: Obtain integers from INPUTFILE instead of STDIN

-o OUTPUTFILE: Place output data into OUTPUTFILE instead of STDOUT

-h: Display this help and exit

-d: Display input: unsorted integer sequence

-p: Display output: sorted integer sequence

-t: Display running time of the chosen algorithm in milliseconds

-c: Display number of comparisons (excluding radix sort)

Format of the input data. The first line of the input contains a number n, which is the number of integers to sort. Each of the following n lines contains one integer to sort. Here is an example of input data:

5 // this is the number of lines below = number of integers to sort

-8

-2

Format of the output data. The sorted integers are printed one per line in increasing order. Here is the output corresponding to the above input:

-8

-2

(50 points) Your tasks include implementing the following five sorting algorithms in corresponding cpp files.
- selection sort in selection-sort.cpp
- insertion sort in insertion-sort.cpp
- bubble sort in bubble-sort.cpp
- shell sort in shell-sort.cpp
- radix sort in radix-sort.cpp
  1. Implement a radix sort that can sort values from 0 to (2¹⁶ − 1) but accepts input from −2¹⁵ to (2¹⁵ − 1).
  2. About radix sort of negative numbers: you can shift the input to non-negative numbers by adding a number that makes the smallest negative number zero. Apply radix sort, then reverse the shift to recover the original values.
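The shift-then-sort idea can be sketched as follows. This is a minimal illustration, not the template's code: the name radixSort16 is hypothetical, and it assumes a fixed shift of 2¹⁵ (which maps the whole allowed input range onto 0 to 2¹⁶ − 1) rather than computing the shift from the smallest input value.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// LSD radix sort for values in [-2^15, 2^15 - 1].
// Each value is shifted by +2^15 into [0, 2^16 - 1], sorted with two stable
// counting-sort passes (low byte, then high byte), and shifted back.
void radixSort16(std::vector<int>& a) {
    const int kOffset = 1 << 15;   // assumed fixed shift for the 16-bit input range
    std::vector<std::uint32_t> keys(a.size()), tmp(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        keys[i] = static_cast<std::uint32_t>(a[i] + kOffset);

    for (int shift = 0; shift < 16; shift += 8) {
        std::size_t count[257] = {0};
        // Histogram of the current byte, stored one slot to the right.
        for (std::uint32_t k : keys) ++count[((k >> shift) & 0xFF) + 1];
        // Prefix sums: count[b] becomes the first output index for byte b.
        for (int b = 0; b < 256; ++b) count[b + 1] += count[b];
        // Stable scatter: equal bytes keep their relative order.
        for (std::uint32_t k : keys) tmp[count[(k >> shift) & 0xFF]++] = k;
        keys.swap(tmp);
    }

    for (std::size_t i = 0; i < a.size(); ++i)
        a[i] = static_cast<int>(keys[i]) - kOffset;   // reverse the shift
}
```

No element-to-element comparisons occur, which is why the comparison counter does not apply to this algorithm.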

Generate sets of 10², 10³, 10⁴, and 10⁵ integers in three different orders.
- random order
- increasing order
- decreasing order

HINT: The standard library <cstdlib> provides functions srand() and rand() to generate random numbers.
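One possible way to build the three kinds of input with srand() and rand() is sketched below. The names Order and makeInput are hypothetical, and the sketch assumes that all values should stay within −2¹⁵ to 2¹⁵ − 1 so that the same data can be fed to radix sort.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <vector>

enum class Order { Random, Increasing, Decreasing };

// Builds n integers in the requested order, all within [-2^15, 2^15 - 1].
// Increasing and decreasing sequences may contain repeated values when
// n exceeds the 65536 distinct values available in that range.
std::vector<int> makeInput(std::size_t n, Order order, unsigned seed = 1) {
    std::vector<int> a(n);
    std::srand(seed);
    for (std::size_t i = 0; i < n; ++i) {
        unsigned r;
        if (order == Order::Random) {
            // RAND_MAX may be as small as 32767, so two calls are combined.
            r = (static_cast<unsigned>(std::rand()) * 32768u +
                 static_cast<unsigned>(std::rand())) % 65536u;
        } else {
            // Non-decreasing ramp spread over the 16-bit range.
            r = static_cast<unsigned>(static_cast<unsigned long long>(i) * 65536ULL / n);
        }
        a[i] = static_cast<int>(r) - 32768;
    }
    if (order == Order::Decreasing) std::reverse(a.begin(), a.end());
    return a;
}
```

Calling makeInput with sizes 100, 1000, 10000, and 100000 for each of the three orders would yield the 12 sequences.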

Measure the average number of comparisons (excluding radix sort) and the average running time of each algorithm on the 12 integer sequences.
- (20 points) Insert additional code into each sort (excluding radix sort) to count the number of comparisons performed on input integers. The following tips should help you determine how many comparisons are performed.
  1. You will measure 3 times for each algorithm on each sequence and take the average.
  2. Insert the code that increments the number of comparisons, num_cmps++, typically in an if or a loop statement.

Remember that C++ uses the short-circuit rule for evaluating boolean expressions. A way to count comparisons accurately is to use comma expressions. For instance: while (i < n && (num_cmps++, a[i] < b))

HINT: If you modify sort.cpp and run several sorting algorithms in succession, you have to call resetNumCmps() to reset the number of comparisons between every two calls to s->sort().

Modify the code in sort.cpp so that it repeatedly measures the running time of s->sort().
1. You will measure roughly 10⁷ times for each algorithm on each sequence and take the average. You have to run the same number of rounds for each algorithm on each sequence, and make sure that each result is not 0.
2. When you measure the running time of sorting algorithms, please reuse the input array but fill it with different numbers. Do not allocate a new array every time; that will dramatically slow down the program.
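A repeated measurement that reuses one buffer might look like the sketch below. The name averageSortMs is hypothetical, and std::sort is only a stand-in for the call to s->sort(); the template's actual interface is not reproduced here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <ctime>
#include <vector>

// Average running time, in milliseconds, over `rounds` sorts of n integers.
// The same buffer is refilled each round instead of being reallocated.
double averageSortMs(std::size_t n, int rounds) {
    std::vector<int> buf(n);                    // allocated once, reused every round
    std::clock_t total = 0;
    for (int r = 0; r < rounds; ++r) {
        for (std::size_t i = 0; i < n; ++i)
            buf[i] = std::rand();               // fresh numbers in the same array
        std::clock_t t1 = std::clock();
        std::sort(buf.begin(), buf.end());      // stand-in for s->sort()
        std::clock_t t2 = std::clock();
        total += t2 - t1;                       // refill time is excluded
    }
    return 1000.0 * static_cast<double>(total) / CLOCKS_PER_SEC / rounds;
}
```

Summing clock ticks over many rounds before dividing helps avoid a result of 0 when a single sort is shorter than the clock's resolution.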

To time a certain part of the program, you may use the function clock() defined in the header file <ctime>, or gettimeofday() defined in <sys/time.h>. Here are examples of how to use these functions. The timing part is also completed in the template programs. However, you will apply these functions in future assignments.

The example using clock() in <ctime>:

#include <ctime>

clock_t t1, t2;
t1 = clock(); // start timing

/* operations you want to measure the running time */

t2 = clock(); // end of timing
double diff = 1000.0 * (double)(t2 - t1) / CLOCKS_PER_SEC; // seconds converted to milliseconds
cout << "The timing is " << diff << " ms" << endl;

The example using gettimeofday() in <sys/time.h>:

#include <sys/time.h>

struct timeval start, end;

gettimeofday(&start,0); // start timing

/* operations you want to measure the running time */
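The listing above stops before the end-of-timing call. A hedged sketch of how such a measurement is commonly completed on a POSIX system follows; the helper names elapsedMs and timeOnceMs are hypothetical and not part of the template.

```cpp
#include <sys/time.h>

// Elapsed wall-clock time between two gettimeofday() samples, in milliseconds.
double elapsedMs(const timeval& start, const timeval& end) {
    return (end.tv_sec - start.tv_sec) * 1000.0 +
           (end.tv_usec - start.tv_usec) / 1000.0;
}

// Times one call of `work` and returns the elapsed milliseconds.
template <typename F>
double timeOnceMs(F work) {
    timeval start, end;
    gettimeofday(&start, nullptr);   // start timing
    work();                          // operations you want to measure
    gettimeofday(&end, nullptr);     // end of timing
    return elapsedMs(start, end);
}
```

Unlike clock(), which reports processor time, gettimeofday() reports wall-clock time, so the two can differ on a loaded machine.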

Report

Write a report that includes all of the following elements.

A brief description of assignment purpose, assignment description, how to run your programs, what to input and output.

The purpose of this assignment was to gain a better understanding of the relative efficiency of different sorting algorithms on a variety of input types, both in terms of number of comparisons and time taken. We were assigned to implement five sorting algorithms (selection sort, insertion sort, bubble sort, shell sort, and radix sort) and to observe their running times experimentally. The program is run with the command ./sort [-a ALGORITHM] [-f INPUTFILE] [-o OUTPUTFILE] [-h] [-d] [-p] [-t] [-c]. The command line arguments select the algorithm, the input file, and the output file, and control whether the program displays the help text, the input, the output, the running time in milliseconds, and the number of comparisons. Comparisons are not reported for radix sort because it is a non-comparison-based sorting algorithm. The user should have the data in an input file, specified in the command line arguments passed to the main function in the program. Further, we learned the basic use of git and GitHub and gained some experience programming as part of a group.

Explanation of splitting the program into classes and a description of C++ object oriented features or generic programming used in this assignment.

We split the program into classes so that we could have a general sort base class with a virtual sort() method, from which each sorting algorithm derives as a child class. This allowed us to handle each sorting algorithm separately while keeping a common interface for the user. The object-oriented features used are inheritance and virtual dispatch: a pointer to the base class is bound to one of the child classes based upon the command line inputs, and the call s->sort() runs the selected algorithm. As for generic programming, the algorithms themselves are general in the sense that they can be implemented in a similar fashion regardless of language.

Briefly describe the features of each of the five sorting algorithms.

Radix Sort- Radix sort applies a series of counting sorts from the least significant to the most significant digit of each value, which requires the counting sort to be stable. It takes signed integers between −2¹⁵ and (2¹⁵ − 1). It sorts negative numbers by adding an offset that makes all values non-negative and then subtracting the same offset at the end of the process. This particular implementation uses a byte-level counting sort, which proceeds in the usual manner: it uses three arrays, counts the occurrences of each byte value, accumulates running totals in the count array, and then uses those totals as indices to place each value in its output cell. It can sort in linear time under the given constraints on the data.

Shell Sort- Shell sort is a generalization of insertion sort. It first runs insertion sort on the subsequences formed by elements a fixed gap apart, then repeats with progressively smaller gaps, finishing with a gap of 1, which is an ordinary insertion sort. Because the earlier passes move out-of-place elements over long distances cheaply, the final passes operate on data that is already nearly sorted and therefore do little work. On input that is already sorted, each pass makes only one comparison per element, so the total cost is proportional to n times the number of passes.

Bubble Sort- Bubble sort combs linearly from the first element in the array to the last element of interest, comparing each element to its upper neighbor and swapping them when they are out of order, so that the largest element it finds is carried to the top. After each pass, the highest element of interest is in its final position, so the search is thereafter restricted to the elements still considered unsorted. If a pass makes no swaps, the array is already sorted and the algorithm can stop, which gives O(n) time on increasing input; otherwise it takes O(n²) time.

Insertion Sort- Insertion sort, similarly to bubble sort, combs linearly through the array. However, instead of carrying the largest element to the top, it carries each element down toward the bottom until it reaches its place among the elements already sorted. An element y at [a+1] is compared to its neighbor x at [a]; if x is larger, the two are swapped. Then y, now at [a], is compared to the element at [a − 1], and so on until y meets an element that is not larger. Repeating this for every element sorts the array. It runs in O(n²) time in the average and worst cases and in O(n) time when the input is already sorted.

Selection Sort- Selection sort fills the positions of the array from the smallest element to the largest. On each iteration it compares the first unsorted element to every other unsorted element in the list and places the smallest one it finds at the front of the unsorted part. It performs the same n(n − 1)/2 comparisons regardless of input order, so both its best and worst cases take O(n²) time, which makes it inefficient for large inputs.

Theoretical Analysis. Theoretically analyze the time complexity of the sorting algorithms with input integers in decreasing, random and increasing orders and fill the second table. Fill in the first table with the time complexity of the sorting algorithms when inputting the best case, average case and worst case. Some of the input orders are exactly the best case, average case and worst case of the sorting algorithms. State what input orders correspond to which cases. You should use big-O asymptotic notation when writing the time complexity (running time).

Complexity	best	average	worst
Selection Sort	O(n²)	O(n²)	O(n²)
Insertion Sort	O(n)	O(n²)	O(n²)
Bubble Sort	O(n)	O(n²)	O(n²)
Shell Sort	O(n log n)	O(n (log n)²)	O(n (log n)²)
Radix Sort	O(n)	O(n)	O(n) or O(k)
Complexity	inc	ran	dec
Selection Sort	O(n²)	O(n²)	O(n²)
Insertion Sort	O(n)	O(n²)	O(n²)
Bubble Sort	O(n)	O(n²)	O(n²)
Shell Sort	O(n log n)	O(n (log n)²)	O(n (log n)²)
Radix Sort	O(n)	O(n) or O(k)	O(n) or O(k)

inc: increasing order; dec: decreasing order; ran: random order

(a) Briefly describe the experiments. Present the experimental running times (RT) and number of comparisons (#COMP) performed on input data using the following tables.

For the experiments, we created lists of incrementing, random and decrementing values. We extensively tested each algorithm and took an average of the run times. This rigorous testing led us to the following results:

Running times (RT) in milliseconds:

n	Selection inc	Selection ran	Selection dec	Insertion inc	Insertion ran	Insertion dec	Bubble inc	Bubble ran	Bubble dec
100	0.043	0.052	0.046	0.003	0.031	0.05	0.003	0.098	0.078
10³	3.34	3.51	3.44	0.012	2.23	4.47	0.011	7.21	7.67
10⁴	175	196	279	0.134	143	235	0.092	375	329
10⁵	19332	21780	32388	0.713	11359	23823	0.598	44540	49056

n	Shell inc	Shell ran	Shell dec	Radix inc	Radix ran	Radix dec
100	0.007	0.028	0.12	0.062	0.056	0.049
10³	0.107	0.428	0.185	0.33	0.328	0.404
10⁴	1.143	4.88	2.16	2.28	1.09	1.91
10⁵	10.205	55.9	19	20.43	20.56	20
Number of comparisons (#COMP):

n	Selection inc	Selection ran	Selection dec	Insertion ran	Insertion inc	Insertion dec
100	4950	4950	4950	2608	99	5049
10³	499500	499500	499500	250035	999	500499
10⁴	49995000	49995000	49995000	25222691	9999	49994956
10⁵	4999950000	4999950000	4999950000	2506169125	99999	4999950203

n	Bubble ran	Bubble inc	Bubble dec	Shell ran	Shell inc	Shell dec
100	4929	99	4950	902	503	668
10³	499490	999	499500	15683	8006	11716
10⁴	49991084	9999	49994999	258010	120005	169245
10⁵	4999508670	99999	4999950000	4259397	1500006	2196626

inc: increasing order; dec: decreasing order; ran: random order

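As we can see, both the run time and the number of comparisons increase with the number of elements in the list.

The numbers of comparisons match the expected Big-O behavior. For example, selection sort always performs n(n − 1)/2 comparisons (4950 for n = 100), while insertion sort and bubble sort perform n − 1 comparisons on increasing input (99 for n = 100).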

(a) For each of the five sort algorithms, graph the running times over the three input cases (inc, ran, dec) versus the input sizes (n); and for each of the first four algorithms graph the numbers of comparisons versus the input sizes, totaling in 9 graphs.

All of these graphs use a base-10 logarithmic scale on both axes.

HINT: To get a better view of the plots, use logarithmic scales for both x and y axes. Just to clarify, the last graph (Shell Sort) should be (Comparisons) on the y-axis.

(a) To compare performance of the sorting algorithms you need to have another 3 graphs to plot the results of all sorts for the running times for each of the input cases (inc, ran, dec) separately.

HINT: To get a better view of the plots, use logarithmic scales for both x and y axes.

Comment on how the experimental results relate to the theoretical analysis and explain any discrepancies you note. Do your computational results match the theoretical analysis you learned from class or textbook? Justify your answer. Also compare radix sort's running time with the running times of the four comparison-based algorithms.

We can see a correlation between the theoretical and experimental results. For Selection Sort the Big-O asymptotic function is n². From our results we can see that 4950 comparisons are performed to sort 100 elements in all three cases. Even though this is only about half of n², it equals n(n − 1)/2, and Big-O notation ignores constant factors, so our experimental value agrees with the theoretical bound. Another algorithm we can analyze is Bubble Sort. From the theoretical predictions, the running time should be O(n) for increasing lists and O(n²) for the average and worst cases. Our experimental results follow this pattern: for 10⁵ inputs the best case ran in 0.598 ms, while the average and worst cases took 44540 ms and 49056 ms.

The Radix Sort run time is significantly lower than that of the three quadratic comparison-based algorithms (selection, insertion, and bubble sort) on random and decreasing input, and similar to the Shell Sort run times. For the average case, we see that Radix Sort performs better than Shell Sort.
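One way to make this comparison quantitative is to estimate the growth exponent from two measurements, which is the slope of the line on a log-log plot. The helper below is a hypothetical sketch and was not part of the original experiments.

```cpp
#include <cmath>

// Empirical growth exponent p in t ~ n^p from two (n, t) measurements:
// p = log(t2 / t1) / log(n2 / n1), the slope between them on a log-log plot.
double growthExponent(double n1, double t1, double n2, double t2) {
    return std::log(t2 / t1) / std::log(n2 / n1);
}
```

Applied to the measured times for increasing input at n = 10⁴ and n = 10⁵, selection sort (175 ms and 19332 ms) gives an exponent of about 2.04, consistent with O(n²), while bubble sort (0.092 ms and 0.598 ms) gives about 0.81, consistent with roughly linear growth.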

Give your observations and conclusion. For instance, which sorting algorithm seems to perform better on which case? Do the experimental results agree with the theoretical analysis you learned from class or textbook? What factors can affect your experimental results?

For small input sizes (10² and 10³), Shell Sort and Radix Sort seem to be the best algorithms overall, and Shell Sort needs the fewest comparisons on random and decreasing input. Both run in under a millisecond at these sizes. This agrees with our analysis of O(n (log n)²) for the average and worst cases of Shell Sort and O(n) for all cases of Radix Sort. When we use larger data sets (10⁴ and 10⁵), Radix Sort has the smallest run times on random input. This agrees with our theoretical analysis, which states that its worst case should be O(n). One factor that could be affecting our experimental results is that our Radix Sort was implemented with byte-level operations, which are faster to compute than other arithmetic operations regardless of the inherent efficiency of the algorithm. Other factors could be varied computational load on the servers at the times that these algorithms were tested, or even the uncertainty about whether identical processors were handling our algorithms, given that the server is something of a black box. One last factor is that we ran most of the comparisons on a laptop and not on TAMU's Unix servers. These discrepancies gave us different run times because of the different hardware in the separate machines.
