# Assignment 3: Efficiency and Experimentation Solution

Problem Overview

This assignment is a departure from previous assignments insomuch as its focus is not on program construction, but is instead on experimentation, analysis, critical thinking, and applying the scientific method.

The provided resources for this assignment are the following.

A3.jar — A jar file containing the TimingLab class for which you have to experimentally determine the big-Oh running time of a method, and the SortingLab class for which you have to experimentally determine the sorting algorithms implemented in various methods.

TimingLabClient.java — Source code of a Java class that illustrates basic use of the TimingLab class. This class is for illustration only. You can modify it for your own purposes or you can use it as an example to create a similar class for your own purposes.

SortingLabClient.java — Source code of a class that illustrates basic calls to public methods of SortingLab. This class is for illustration only. You can modify it for your own purposes or you can use it as an example to create a similar class for your own purposes.

There are two parts to this assignment. Each has its own deliverable and is to be done independently of the other.

Part A: Experimental Discovery of Running Time

You must develop and perform a repeatable experimental procedure that will allow you to empirically discover the big-Oh time complexity of the timeTrial(int N) method in the TimingLab class. The parameter N represents the problem size. Thus, by iteratively calling timeTrial with successively larger values of N, you can collect timing data that is useful for characterizing the method’s time complexity.

The constructor TimingLab(int key) will create a TimingLab object whose timeTrial method is tailored specifically to the given key value. You will use your group number in Canvas as the key required by the constructor. For example, group 42 would invoke the constructor TimingLab(42) and group 23 would invoke the constructor TimingLab(23). This will create two distinct objects whose timeTrial

methods would (potentially) have different time complexities.

You are guaranteed that no matter what key value is used, the associated time complexity will be proportional to Nk for some positive integer k. Thus, you can take advantage of the following property of polynomial time complexity functions T(N).

T(N) µ Nk =)

T(2N)

T(N)

µ (2N)k

Nk =

2kNk

Nk = 2k (1)

1

COMP 2210

This property tells us that as N is successively doubled, the ratio of the method’s running time on the current problem size to the method’s running time on the previous problem size (i.e., T(2N)/T(N)) converges to a numerical constant, call it R, that is equal to 2k, and thus k = log2R.

As an example, consider the data shown in Table 1. Each row in this table records data for a given run of some method being timed. The first column (N) records the problem size, the second column (Time) records the time taken for the method to run on this problem size, the third column (R) is the ratio discussed above (i.e., Timei/Timei1), and the third column (k) is log2R. From Property 1 and Table 1 we can hypothesize that the method being timed has O(N4) time complexity.

Table 1: Running-time data and calculations

N Time R k

8 0.04 — —

16 0.08 2.25 1.17

32 0.84 10.37 3.37

64 7.59 9.03 3.18

128 113.56 14.97 3.91

256 1829.28 16.11 4.01

512 29689.21 16.23 4.02

You are required to use System.nanoTime() to generate timing data like that shown in Table 1. Running time values must be expressed in seconds.

To document your work you must write a lab report that fully describes the experiment used to discover the big-Oh time complexity of the given method. The lab report must discuss the experimental procedure

(what you did), data collection (all the data you gathered), data analysis (what you did with the data), and interpretation of the data (what conclusions you drew from the data). The lab report must be well written,

it must exhibit a high degree of professionalism, and it must be structured like the provided sample. Your experiment must be described in enough detail that it could be reproduced by someone else.

Part B: Experimental Identification of Sorting Algorithms

You must develop and perform a repeatable experimental procedure that will allow you to empirically discover the sorting algorithms implemented by the five methods of the SortingLab class—sort1, sort2, sort3, sort4, sort5. The five sorting algorithms implemented are merge sort, randomized quicksort, non-randomized quicksort, selection sort, and insertion sort.

Insertion sort, selection sort, and merge sort are implemented exactly as described in lecture and the associated note set. Quicksort has two implementations, randomized and non-randomized. Both implementations

choose the pivot in the same way: the pivot is always the left-most element (i.e., a[left]) of the current partition. Both algorithms also use the same algorithm for the partitioning operation (although it is slightly different than the one presented in lecture). The two implementations are different, however in the following way. The randomized quicksort implementation makes the worst case probabilistically

unlikely by randomly permuting the the array elements before the quicksort algorithm begins.

The non-randomized quicksort exposes the algorithm’s worst case by never shuffling the array elements.

To generate timing data you are required to use System.nanoTime(). Running time values must be expressed in seconds. Although timing data will be an important part of your experimental procedure, it will (likely) not be sufficient to distinguish between merge sort and randomized quicksort. To do this you

should think about stability.

The constructor SortingLab(int key) will create a SortingLab object whose sort method implementations are tailored specifically to the given key value. You will use your group number in Canvas as the key required by the constructor. For example, group 42 would invoke the constructor SortingLab(42)

and group 23 would invoke the constructor SortingLab(23), creating two distinct objects whose sort methods would have different implementations.

To document your work you must write a lab report that fully describes the experiments used to discover the the sorting algorithm associated with each of the five sort methods. The lab report must discuss the experimental procedure (what you did), data collection (all the data you gathered), data analysis (what

you did with the data), and interpretation of the data (what conclusions you drew from the data). The lab report must be well written, it must exhibit a high degree of professionalism, and it must be structured like the provided sample. Your experiment must be described in enough detail that it could be reproduced

by someone else.

This assignment is a departure from previous assignments insomuch as its focus is not on program construction, but is instead on experimentation, analysis, critical thinking, and applying the scientific method.

The provided resources for this assignment are the following.

A3.jar — A jar file containing the TimingLab class for which you have to experimentally determine the big-Oh running time of a method, and the SortingLab class for which you have to experimentally determine the sorting algorithms implemented in various methods.

TimingLabClient.java — Source code of a Java class that illustrates basic use of the TimingLab class. This class is for illustration only. You can modify it for your own purposes or you can use it as an example to create a similar class for your own purposes.

SortingLabClient.java — Source code of a class that illustrates basic calls to public methods of SortingLab. This class is for illustration only. You can modify it for your own purposes or you can use it as an example to create a similar class for your own purposes.

There are two parts to this assignment. Each has its own deliverable and is to be done independently of the other.

Part A: Experimental Discovery of Running Time

You must develop and perform a repeatable experimental procedure that will allow you to empirically discover the big-Oh time complexity of the timeTrial(int N) method in the TimingLab class. The parameter N represents the problem size. Thus, by iteratively calling timeTrial with successively larger values of N, you can collect timing data that is useful for characterizing the method’s time complexity.

The constructor TimingLab(int key) will create a TimingLab object whose timeTrial method is tailored specifically to the given key value. You will use your group number in Canvas as the key required by the constructor. For example, group 42 would invoke the constructor TimingLab(42) and group 23 would invoke the constructor TimingLab(23). This will create two distinct objects whose timeTrial

methods would (potentially) have different time complexities.

You are guaranteed that no matter what key value is used, the associated time complexity will be proportional to Nk for some positive integer k. Thus, you can take advantage of the following property of polynomial time complexity functions T(N).

T(N) µ Nk =)

T(2N)

T(N)

µ (2N)k

Nk =

2kNk

Nk = 2k (1)

1

COMP 2210

This property tells us that as N is successively doubled, the ratio of the method’s running time on the current problem size to the method’s running time on the previous problem size (i.e., T(2N)/T(N)) converges to a numerical constant, call it R, that is equal to 2k, and thus k = log2R.

As an example, consider the data shown in Table 1. Each row in this table records data for a given run of some method being timed. The first column (N) records the problem size, the second column (Time) records the time taken for the method to run on this problem size, the third column (R) is the ratio discussed above (i.e., Timei/Timei1), and the third column (k) is log2R. From Property 1 and Table 1 we can hypothesize that the method being timed has O(N4) time complexity.

Table 1: Running-time data and calculations

N Time R k

8 0.04 — —

16 0.08 2.25 1.17

32 0.84 10.37 3.37

64 7.59 9.03 3.18

128 113.56 14.97 3.91

256 1829.28 16.11 4.01

512 29689.21 16.23 4.02

You are required to use System.nanoTime() to generate timing data like that shown in Table 1. Running time values must be expressed in seconds.

To document your work you must write a lab report that fully describes the experiment used to discover the big-Oh time complexity of the given method. The lab report must discuss the experimental procedure

(what you did), data collection (all the data you gathered), data analysis (what you did with the data), and interpretation of the data (what conclusions you drew from the data). The lab report must be well written,

it must exhibit a high degree of professionalism, and it must be structured like the provided sample. Your experiment must be described in enough detail that it could be reproduced by someone else.

Part B: Experimental Identification of Sorting Algorithms

You must develop and perform a repeatable experimental procedure that will allow you to empirically discover the sorting algorithms implemented by the five methods of the SortingLab class—sort1, sort2, sort3, sort4, sort5. The five sorting algorithms implemented are merge sort, randomized quicksort, non-randomized quicksort, selection sort, and insertion sort.

Insertion sort, selection sort, and merge sort are implemented exactly as described in lecture and the associated note set. Quicksort has two implementations, randomized and non-randomized. Both implementations

choose the pivot in the same way: the pivot is always the left-most element (i.e., a[left]) of the current partition. Both algorithms also use the same algorithm for the partitioning operation (although it is slightly different than the one presented in lecture). The two implementations are different, however in the following way. The randomized quicksort implementation makes the worst case probabilistically

unlikely by randomly permuting the the array elements before the quicksort algorithm begins.

The non-randomized quicksort exposes the algorithm’s worst case by never shuffling the array elements.

To generate timing data you are required to use System.nanoTime(). Running time values must be expressed in seconds. Although timing data will be an important part of your experimental procedure, it will (likely) not be sufficient to distinguish between merge sort and randomized quicksort. To do this you

should think about stability.

The constructor SortingLab(int key) will create a SortingLab object whose sort method implementations are tailored specifically to the given key value. You will use your group number in Canvas as the key required by the constructor. For example, group 42 would invoke the constructor SortingLab(42)

and group 23 would invoke the constructor SortingLab(23), creating two distinct objects whose sort methods would have different implementations.

To document your work you must write a lab report that fully describes the experiments used to discover the the sorting algorithm associated with each of the five sort methods. The lab report must discuss the experimental procedure (what you did), data collection (all the data you gathered), data analysis (what

you did with the data), and interpretation of the data (what conclusions you drew from the data). The lab report must be well written, it must exhibit a high degree of professionalism, and it must be structured like the provided sample. Your experiment must be described in enough detail that it could be reproduced

by someone else.

You'll get a 693.8KB .ZIP file.