For this assignment you should work in the same group that you are working with for the tree assignment. This assignment is very amenable to group work since much of it involves gathering data.
The files for this assignment can be found in ~ola/cps100/sort.
About sortall.cc
The program sortall.cc uses templated functions to sort. Any class/type that supports the standard comparison operations (<=, >=, ==, !=, >, <) and both input and output operations using streams can be sorted, although I/O isn't necessary to sort. There are four parts to this assignment.
You are to run the program and time how long selection sort, insertion sort, and bubble sort take for arrays of size 500 to 5,000 in increments of 500. You will then graph this data using the gnuplot graphing program, which will generate plots like the one below (you can use other plotting packages if you know how to use them). Some thought about how to organize the runs of the program will minimize the time you spend generating data.
In order to use gnuplot you must set up data files as pairs of x,y coordinates. If selection sort of 500 ints takes 0.523 seconds and of 1000 ints takes 0.9873 seconds then a data file called selectint.data should be created whose first two lines are:
500 0.523
1000 0.9873
For this part of the assignment you will create nine data files: three for selection sort, three for insertion sort, and three for bubble sort. For each sort you'll sort three different kinds of element. Each file will contain 10 lines of data, with each line consisting of a number of array elements (500 to 5,000) and the time to sort an array of that many elements, as described above. Name these files selectint.data, selectstr.data, and so on, depending on whether you're sorting ints or strings. The files are automatically generated by the sorting classes; you supply a file name when the sorting-class variable is constructed.
plot "insertint.data" with linespoints 1 1, \
"selectint.data" with linespoints 3 3, "bubbleint.data" with linespoints 5 5
The number-pairs (1 1, 3 3, and 5 5) indicate the style of line and point to use. You can add another file by using a comma after the "5 5", specifying bubblestr.data, and specifying "linespoints 2 2", for example. Note that the backslash \ can be used to continue typing a long line on multiple lines (gnuplot treats this as one line). This command should generate a plot on the screen. Then you should type
Finally you will create a version of the plot to print by typing
You can then print the plot by typing lpr squareplot.ps or print squareplot.ps. To quit gnuplot, simply type "quit".
These are "smart" pointers: swapping or moving them takes less time because only the pointers are moved, rather than entire strings being re-copied.
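The idea can be illustrated with a small wrapper class. This is a sketch under assumptions: the class name StringPtr, its member names, and the function SortPointers are hypothetical and are not taken from the files provided for the assignment.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical wrapper: stores a pointer to a string but compares by the
// string it points to, so a sort written with operator< orders the pointers.
class StringPtr
{
  public:
    StringPtr(const std::string* s = 0) : myString(s) {}
    const std::string& Get() const { return *myString; }

    bool operator<(const StringPtr& rhs) const
    {
        return *myString < *rhs.myString;
    }
    bool operator==(const StringPtr& rhs) const
    {
        return *myString == *rhs.myString;
    }

  private:
    const std::string* myString;   // not owned; the strings live elsewhere
};

// Insertion sort over the wrappers. Every shift copies one pointer,
// never the characters of a string.
void SortPointers(std::vector<StringPtr>& a)
{
    for (std::size_t i = 1; i < a.size(); i++) {
        StringPtr hold = a[i];
        std::size_t j = i;
        while (j > 0 && hold < a[j - 1]) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = hold;
    }
}
```

The original vector of strings is left in its original order; only the vector of wrappers is rearranged.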
Timing the faster sorts
After implementing all these faster sorts you are to run the program sortall so that it sorts arrays of size 10,000, 20,000, ..., 100,000 using these faster sorts (but NOT using the O(n^2) sorts, which will take too long). You can either plot the data or include the data as part of your README file, showing how long each sort takes for ints, strings, and smart string pointers.
Range of Numbers and Pivot Function
After creating the table/data above you should run the program again, but constrain the range of numbers used in the sorts to be less than 100 by invoking the program via sortall 100. Summarize any significant changes in the runtimes of the different sorts when the range of numbers is constrained as compared to when the range is larger (recall that the default range is less than 10,000). Give a brief explanation as to why changes occur or why no changes occur.
You should then change the function Pivot used by quicksort. Rather than splitting the array into two sections (one less than or equal to the pivot and one greater), the array should be split into three sections: one less than the pivot, one equal to the pivot, and one greater than the pivot. The section equal to the pivot does NOT need to be recursively sorted. To do this you may need to change the parameters of the function Pivot. Re-run for JUST quicksort using your changed pivot function; be sure to account for these results in your README file. You will probably want to implement a CheckSorted function to determine if your new quicksort is working. It will be difficult to earn full credit without implementing such a function. CheckSorted might, for example, take two vectors and decide if one is a sorted version of the other one.
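One possible shape for the three-section split and the checking function is sketched below for ints. The names Pivot3 and Quick3, the two extra reference parameters, and the choice of the first element as pivot are all assumptions made for illustration; the Pivot function in sortall.cc may be organized differently. CheckSorted here relies on std::sort as a trusted reference.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Splits a[first..last] into three sections around the pivot a[first]:
//   a[first..lt-1] < pivot,  a[lt..gt] == pivot,  a[gt+1..last] > pivot.
// lt and gt are returned through the (assumed) extra reference parameters.
void Pivot3(std::vector<int>& a, int first, int last, int& lt, int& gt)
{
    int pivot = a[first];
    lt = first;
    gt = last;
    int i = first;
    while (i <= gt) {
        if (a[i] < pivot) {
            std::swap(a[lt], a[i]);
            lt++;
            i++;
        } else if (a[i] > pivot) {
            std::swap(a[i], a[gt]);
            gt--;          // a[i] is now unexamined, so i does not advance
        } else {
            i++;
        }
    }
}

// Quicksort that recurses only on the "less" and "greater" sections.
void Quick3(std::vector<int>& a, int first, int last)
{
    if (first >= last) return;
    int lt, gt;
    Pivot3(a, first, last, lt, gt);
    Quick3(a, first, lt - 1);
    Quick3(a, gt + 1, last);
    // a[lt..gt] all equal the pivot and are already in their final places
}

// Returns true if result is a sorted version of original: same elements,
// in nondecreasing order.
bool CheckSorted(const std::vector<int>& original,
                 const std::vector<int>& result)
{
    std::vector<int> expected(original);
    std::sort(expected.begin(), expected.end());
    return expected == result;
}
```

Keeping a copy of the array before sorting and passing both copies to CheckSorted catches a sort that loses or duplicates elements as well as one that leaves them out of order.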
BucketSort
Bucket sort works when sorting integers in a limited range (and, on computers, all integers are in a limited range). In the routine BucketSort this range is specified by the additional parameter radix, as noted in the comments of the routine. (This means that you CANNOT use BucketSort with the class SortBench as the class is written, since the BucketSort function doesn't have the right signature/prototype.) For example, if all the numbers being sorted are in the range 0 to 9 (the value of radix would be 10), then the diagram below shows how "bucket" counts are determined from an array and then used to "sort" the array.
Note that the count in each bucket indicates how many occurrences of each number appear in the original array and can be used in a straightforward manner to "store" numbers in the sorted array. The numbers are not re-arranged as with other sorts; instead, the count array is used to generate an array that has the same number of occurrences of each number that appeared in the original array.
When sortall is invoked it interprets any argument as the radix used to determine the range of numbers (see the main routine). For example, sortall 1000 indicates that all numbers will be in the range 0 to 999. The default radix is 10,000.
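The counting idea described in this section can be sketched as follows. The name BucketSortSketch and its signature are assumptions made for illustration; consult the comments of the BucketSort routine in sortall.cc for the actual prototype.

```cpp
#include <cstddef>
#include <vector>

// Sketch of the counting idea. Assumes every element of a lies in the
// range 0..radix-1; values outside that range are not handled.
void BucketSortSketch(std::vector<int>& a, int radix)
{
    // count[v] = number of occurrences of v in the original array
    std::vector<int> count(radix, 0);
    for (std::size_t k = 0; k < a.size(); k++) {
        count[a[k]]++;
    }

    // Regenerate the array: count[0] zeros, then count[1] ones, and so on.
    std::size_t index = 0;
    for (int value = 0; value < radix; value++) {
        for (int c = 0; c < count[value]; c++) {
            a[index] = value;
            index++;
        }
    }
}
```

For an array of n elements this does work proportional to n + radix and performs no comparisons between elements.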
Shell Sort
Shell sort is described in Weiss. The basic idea is to do a sequence of insertion sorts, but to "look" at elements that are far apart. In insertion sort an element is inserted into its proper position relative to all other elements by examining all other elements. In shell sort, an element might be inserted into the proper position relative to every 100th element rather than every element. Then elements are inserted into proper position relative to every 50th element, every 25th element, and so on until, at the last stage of shell sort, a regular insertion sort takes place. Because many elements are moved before this final stage, the sort is much more efficient than insertion sort. There are many more details of this algorithm in Weiss. The increments used in this version of shell sort are known as Hibbard's increments; they are of the form 1, 3, 7, ..., 2^k - 1.
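A minimal sketch of shell sort with Hibbard's increments is given below. The names HibbardIncrements and ShellSortSketch are hypothetical, and starting from the largest increment smaller than the array size is an assumption; the version in sortall.cc and the presentation in Weiss may differ in detail.

```cpp
#include <cstddef>
#include <vector>

// Returns Hibbard's increments smaller than n, largest first,
// for example 7, 3, 1 when n is 10. Always includes 1.
std::vector<std::size_t> HibbardIncrements(std::size_t n)
{
    std::size_t gap = 1;
    while (2 * gap + 1 < n) gap = 2 * gap + 1;   // 1, 3, 7, 15, ...

    std::vector<std::size_t> gaps;
    // Integer division maps 2^k - 1 to 2^(k-1) - 1: 7 -> 3 -> 1 -> 0.
    for (; gap > 0; gap /= 2) gaps.push_back(gap);
    return gaps;
}

// For each increment, insertion-sort the elements that are gap apart.
// The final pass, with gap 1, is an ordinary insertion sort.
template <class T>
void ShellSortSketch(std::vector<T>& a)
{
    std::size_t n = a.size();
    std::vector<std::size_t> gaps = HibbardIncrements(n);
    for (std::size_t g = 0; g < gaps.size(); g++) {
        std::size_t gap = gaps[g];
        for (std::size_t i = gap; i < n; i++) {
            T hold = a[i];
            std::size_t j = i;
            while (j >= gap && hold < a[j - gap]) {
                a[j] = a[j - gap];
                j -= gap;
            }
            a[j] = hold;
        }
    }
}
```

Only operator< is used on the elements, so the same function works for ints, strings, or any other type with that operator.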
You should submit your modified program sortall.cc and a README file containing the analysis of the routines as described above. Submit these using