Mini-Lab: Sorting Algorithms

 


Introduction

In this mini-lab, you will experiment with several different algorithms to solve the same problem (sorting numbers), and analyze the performance of those algorithms.



Experimental Running Times for Sorting Algorithms

In this section, you will collect and compare running times for various sorting algorithms. You will use Excel to record and analyze your data. If you are not familiar with Excel, you should ask the lab instructor or TA for help.
  1. Inside the Knet\Dragon\cls_compsci\CS107 folder, there is a SortingLab9 folder. Copy it to your M: drive.

  2. Run Excel and open your copy of the file SortingLab9. Fill in your name at the top.

  3. In the SortingLab9 folder, double-click on the sort icon. This program will sort one of several types of data sets using several different sorting algorithms. First, you will be asked to select what type of data set to sort (random data, already sorted data, data in reverse order, or data containing sorted clumps within it). Choose random data. Next, you will be asked to select the size of the data set. Choose the smallest size (10,000 values). The program will run four different sorting algorithms (bubble sort, insertion sort, quicksort, and selection sort) on a set of 10,000 randomly arranged values, and report the running time for each algorithm. Record these running times in the spreadsheet.

  4. Repeat this for all data set sizes (but always with random data).

  5. Review Questions: Based on the running times you've collected, which of the four algorithms seems like the best (fastest)? Which seems the worst (slowest)? Type your answers in the appropriate spaces in the spreadsheet.

    Now you will collect additional data to see how the algorithm running times are affected by the initial arrangement of the input values.

  6. Use sort to collect running times for the four sorting algorithms on already sorted data with the six smallest data set sizes. Enter these times in the second table in the spreadsheet.

  7. Review Questions: Which algorithm seems best for data that is already sorted? (This is useful when your data is almost, but not completely, sorted, such as when you are adding a small number of values to a large sorted set.) Which algorithm seems worst? Enter your answers in the appropriate spaces in the spreadsheet.

    In class we saw a number of common functions with very different performance characteristics. Now we will determine which of those common functions most closely models the performance of your fastest sorting algorithm on random data. We will do this by looking at how the running time for the algorithm varies with the data set size N, and how that compares to various common functions of N.

  8. The third table in the spreadsheet contains columns labeled with various functions of N: N, N log2N, N2, N2 log2N, and N3. A few of the columns have a formula entered for the first row. Create a formula for each of the remaining columns and then fill down to calculate the values for the remaining rows.

  9. Enter the running times for the algorithm you selected as best for random data in the column labeled T (for time) in the third table.

  10. Determine which function is proportional to the running times of your algorithm. If a function is proportional to T, then no matter what the value of N is, the ratio of T to that function will be the same (or approximately the same).

  11. Now that we know which function is proportional to T, and the constant ratio between that function and T, we can calculate a formula for T. For example, if T were proportional to N3 and the ratio T/(N3) is getting closer and closer to 0.07, then we have
    T/(N3) = 0.07.
    Multiplying each side by N3, we have
    T = 0.07(N3).

  12. Review Questions:

  13. Save and print this spreadsheet.
    NOTE: Be sure to use Print Preview before printing to make sure you are printing the right thing.