Big-O
Definition
Basic rules
Examples
Sorting techniques
Selection sort
Insertion sort
Big-O notation is a technique for quickly approximating the speed of a procedure. Roughly speaking, it identifies whether the running time grows exponentially with the input size, or linearly, or quadratically, or at some other rate.
We can define the time to run a procedure as T(n), a function T of the input size n. Big-O notation would write something like T(n) = O(g(n)), for some simple function g.

Formally, the notation T(n) = O(g(n)) means the following: There is some constant c and some input size N, so that for any input size n beyond N,

    T(n) <= c * g(n).
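As a concrete instance (a made-up T, not taken from any program in these notes): T(n) = 3n² + 10n is O(n²), witnessed by c = 4 and N = 10, since 10n ≤ n² whenever n ≥ 10. A quick brute-force check of the inequality:

```java
public class BigOWitness {
    // T(n) = 3n^2 + 10n; claim: T(n) <= 4 * n^2 for all n >= 10
    static boolean holds(long n) {
        return 3 * n * n + 10 * n <= 4 * n * n;
    }

    public static void main(String[] args) {
        for (long n = 10; n <= 1_000_000; n++) {
            if (!holds(n)) throw new AssertionError("fails at n = " + n);
        }
        System.out.println("T(n) <= 4 n^2 holds for n = 10 .. 1000000");
    }
}
```

Of course, checking finitely many n proves nothing by itself; the algebra above is what proves the bound for all n ≥ 10.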
For example, consider the following code that determines whether a number n is prime.

    int i;
    for(i = 2; i * i <= n; i++) {
        if(n % i == 0) {
            IO.println("not prime");
            break;
        }
    }
    if(i * i > n) IO.println("is prime");

This procedure turns out to take O(sqrt(n)) time: The total number of iterations can be bounded by sqrt(n), and the amount of time for each iteration is bounded by a constant.
There are three helpful rules that you might keep in mind when you're trying to evaluate the speed of a program.

1. A block of simple statements, with no loops or function calls, takes O(1) time.
2. The time for a loop is the number of iterations times a bound on the time for a single iteration.
3. The time for a sequence of blocks is dominated by the largest bound among them.
In our primality example, we would reason like the following.
First we replace each block that involves no loops or function calls with ``O(1) time'':

    O(1) time
    for(...) {
        O(1) time
    }
    O(1) time

The loop iterates at most sqrt(n) times, and each iteration takes O(1) time, so the loop as a whole takes O(sqrt(n)) time:

    O(1) time
    O(sqrt(n)) time
    O(1) time

The largest of these three bounds dominates, so the total is O(sqrt(n)) time.
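To see the sqrt(n) bound concretely, here is a sketch (my own, not from the notes) that counts how many times the primality loop's body runs and checks that the count never exceeds sqrt(n) + 1:

```java
public class TrialDivisionCount {
    // returns how many times the loop body of the primality test runs for n
    static int iterations(int n) {
        int count = 0;
        for (int i = 2; i * i <= n; i++) {
            count++;
            if (n % i == 0) break;
        }
        return count;
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 100_000; n++) {
            if (iterations(n) > Math.sqrt(n) + 1)
                throw new AssertionError("bound fails at n = " + n);
        }
        System.out.println("iteration count is at most sqrt(n) + 1 for all tested n");
    }
}
```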
Here are a few more examples to try reasoning through.

    // exponentiation, method 1: finds a to the bth power
    double ret = 1.0;
    for(int i = b; i > 0; i--)
        ret *= a;
    IO.println(ret);

    // exponentiation, method 2: finds a to the bth power
    double ret = 1.0;
    int i = b;
    while(i > 0) {
        if(i % 2 == 0) {
            i = i / 2;
        } else {
            ret *= a;
            i = (i - 1) / 2;
        }
        a = a * a;
    }
    IO.println(ret);

    // count lattice points: determines how many integer-coordinate
    // points lie in a circle of radius r
    int count = 0;
    int y = 0;
    int x = r;
    while(x > 0) {
        while(x*x + y*y < r*r) ++y;
        count += y;
        --x;
    }
    IO.println(4 * count + 1);
This last example illustrates a case where the ``basic rules'' outlined above give a poor bound: The rules would suggest O(r) iterations for the outer loop times up to O(r) for the inner loop, or O(r²) overall. But y only ever increases, and it can increase at most r times over the whole run, so the procedure actually takes only O(r) time. You're better off reasoning through the problem more carefully.
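One way to convince yourself that the O(r) lattice-point count is correct is to compare it against a direct O(r²) scan of every candidate point. Here is a self-contained sketch of that comparison; the method names are mine, not from the notes:

```java
public class LatticeCheck {
    // the O(r) two-pointer count: y only ever increases as x decreases
    static long fast(int r) {
        long count = 0;
        int y = 0, x = r;
        while (x > 0) {
            while (x * x + y * y < r * r) ++y;
            count += y;
            --x;
        }
        return 4 * count + 1;
    }

    // the O(r^2) scan: test every point in the bounding square
    static long brute(int r) {
        long total = 0;
        for (int x = -r; x <= r; x++)
            for (int y = -r; y <= r; y++)
                if (x * x + y * y < r * r) total++;
        return total;
    }

    public static void main(String[] args) {
        for (int r = 1; r <= 200; r++) {
            if (fast(r) != brute(r))
                throw new AssertionError("mismatch at r = " + r);
        }
        System.out.println("fast and brute-force counts agree for r = 1 .. 200");
    }
}
```

The fast version counts the points with x >= 1 and y >= 0; rotating that region by 90 degrees four times covers every point except the origin, hence the final 4 * count + 1.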
As you'll quickly learn, sorting an array is one of computer science's favorite topics. It's highly useful, simple, and nonetheless interesting. No fewer than four of the required classes for a major address the topic (150, 160, 200, 338).
There are lots of techniques. Personally, my favorite technique - just because it's the simplest to understand and implement - is selection sort. The selection sort algorithm involves finding the smallest element of the array, and swapping it with the first element. Then finding the smallest element after the first element, and swapping it with the second element. Then finding the smallest element after the second element and swapping it with the third element, etc. Once you're done, the array is sorted.
    public class SelectionSort {
        public static void sort(int[] elts) {
            for(int i = 0; i < elts.length; i++) {
                int min = i; // min will be index of minimum starting at i
                for(int j = i + 1; j < elts.length; j++) {
                    if(elts[j] < elts[min]) min = j;
                }
                if(i != min) { // swap elts[i] and elts[min]
                    int temp = elts[i];
                    elts[i] = elts[min];
                    elts[min] = temp;
                }
            }
        }
    }
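To watch the passes happen, here is a small driver of my own (not part of the notes) that runs the same algorithm and prints the array after each pass:

```java
import java.util.Arrays;

public class SelectionSortTrace {
    // same algorithm as SelectionSort.sort, printing after each pass
    static int[] traceSort(int[] elts) {
        for (int i = 0; i < elts.length; i++) {
            int min = i;
            for (int j = i + 1; j < elts.length; j++) {
                if (elts[j] < elts[min]) min = j;
            }
            int temp = elts[i];
            elts[i] = elts[min];
            elts[min] = temp;
            System.out.println("after pass " + i + ": " + Arrays.toString(elts));
        }
        return elts;
    }

    public static void main(String[] args) {
        traceSort(new int[] {29, 10, 14, 37, 13});
        // final pass leaves [10, 13, 14, 29, 37]
    }
}
```

Notice that after pass i, the first i + 1 slots hold the smallest i + 1 elements in sorted order; that is the invariant that makes the algorithm correct.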
Insertion sort is a different algorithm, which you may remember from a CSCI 150 lab: In insertion sort, you begin with the first element, which by itself is already sorted. But then you insert the second element into the list (possibly swapping the first and second elements). Then you insert the third element into its proper place among the first two, and so on.
    public class InsertionSort {
        public static void sort(int[] elts) {
            for(int i = 1; i < elts.length; i++) {
                int k = elts[i]; // store element i in k
                int j;
                for(j = i - 1; j >= 0 && elts[j] > k; j--) {
                    elts[j + 1] = elts[j]; // shift elts[j] up one slot
                }
                elts[j + 1] = k; // place k into vacated slot
            }
        }
    }
(Notice the use of the short-circuit property of && in the condition for the for loop: We test whether j >= 0 first, and then we test whether elts[j] > k. This order is important. If we tested whether elts[j] > k first, then it might happen that j was -1, in which case this would throw an ArrayIndexOutOfBoundsException.)
Although insertion sort and selection sort both consume O(n²) time, insertion sort is considered to be faster, since it at least sometimes does better. For example, if somehow we knew the array was already sorted, insertion sort would only take O(n) time. Of course, if we knew it were already sorted, why would we call it? But insertion sort also does well if the list is almost sorted.
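One way to make the ``sometimes does better'' claim concrete is to count how many times insertion sort's inner loop shifts an element. On an already-sorted array the inner loop never runs, so the total work is linear; on a reverse-sorted array it shifts n(n-1)/2 times. The counting instrumentation here is my own, not from the notes:

```java
public class ShiftCount {
    // insertion sort as above, but returns the number of inner-loop shifts
    static long sortCountingShifts(int[] elts) {
        long shifts = 0;
        for (int i = 1; i < elts.length; i++) {
            int k = elts[i];
            int j;
            for (j = i - 1; j >= 0 && elts[j] > k; j--) {
                elts[j + 1] = elts[j];
                shifts++;
            }
            elts[j + 1] = k;
        }
        return shifts;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] sorted = new int[n], reversed = new int[n];
        for (int i = 0; i < n; i++) {
            sorted[i] = i;
            reversed[i] = n - i;
        }
        System.out.println("shifts on sorted input:   " + sortCountingShifts(sorted));   // 0
        System.out.println("shifts on reversed input: " + sortCountingShifts(reversed)); // n(n-1)/2 = 499500
    }
}
```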
Given these two sorting techniques, you might be inclined to give up: Is there any hope for doing better than O(n²)? It turns out there is, but that's a topic for later.