CS 240: Lecture 5 - Insertion Sort
Dear students:
Last time we looked at selection sort. We implemented it and we analyzed it. Today we do the same for insertion sort. But we also need to formalize our simplification of cost functions down to complexity classes.
Insertion Sort
Just like selection sort, insertion sort orders a collection. But its mechanic is different. Selection sort focuses on a single cell and finds the value that belongs in it. As soon as a cell is filled, that cell is never touched again. Insertion sort takes this approach:
shove element 1 leftward to order elements 0-1
shove element 2 leftward to order elements 0-2
shove element 3 leftward to order elements 0-3
...
After we place an element, it may not be in the right spot. Later elements may displace it back to the right. That seems wasteful, right? Surely insertion sort has no redeeming qualities over selection sort.
Here's one way it might be implemented:
public static void insertionSort(int[] items) {
  // Each pass shoves items[fromIndex] leftward until it is in order
  // with respect to the elements before it.
  for (int fromIndex = 1; fromIndex < items.length; ++fromIndex) {
    for (int toIndex = fromIndex; toIndex > 0 && items[toIndex - 1] > items[toIndex]; --toIndex) {
      // Swap the element with its left-neighbor.
      int tmp = items[toIndex - 1];
      items[toIndex - 1] = items[toIndex];
      items[toIndex] = tmp;
    }
  }
}
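If you want to run it yourself, here's a small driver. The class name and sample values are mine, not the lecture's, and the sort method is copied from the listing above so the example compiles on its own.

import java.util.Arrays;

public class SortDriver {
  public static void main(String[] args) {
    int[] items = {9, 3, 7, 1, 5};
    insertionSort(items);
    System.out.println(Arrays.toString(items));  // prints [1, 3, 5, 7, 9]
  }

  // Copied verbatim from the listing above.
  public static void insertionSort(int[] items) {
    for (int fromIndex = 1; fromIndex < items.length; ++fromIndex) {
      for (int toIndex = fromIndex; toIndex > 0 && items[toIndex - 1] > items[toIndex]; --toIndex) {
        int tmp = items[toIndex - 1];
        items[toIndex - 1] = items[toIndex];
        items[toIndex] = tmp;
      }
    }
  }
}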
To make an informed statement about the superiority of selection sort over insertion sort, we must examine insertion sort's cost function. Suppose we try sorting a five-element array that starts out in reverse order, which turns out to be insertion sort's worst case. The first iteration touches only two cells, shoving element 1 to the left of element 0. The second iteration touches three cells, the third touches four, and the last touches all five. The stack of touched cells adds up just as it did for selection sort. That means we have this cost function: \(T(n) = 2 + 3 + \cdots + n = \frac{n(n+1)}{2} - 1\). This simplifies to \(\frac{n^2 + n - 2}{2}\), which is a quadratic polynomial.
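As a quick sanity check on that formula, here's the arithmetic for the five-element example. The check itself is mine, not the lecture's.

\[
2 + 3 + 4 + 5 = 14 \qquad \text{and} \qquad \frac{5^2 + 5 - 2}{2} = \frac{28}{2} = 14.
\]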
It would seem that insertion sort is no better than selection sort. However, what if we sort an array that is already in order? On the first iteration, we touch only elements 0 and 1. On the remaining iterations, we likewise touch only the fromIndex cell and its immediate left-neighbor; no shoving is needed. Under these circumstances, the cost function is \(T(n) = 2(n - 1) = 2n - 2\).
By this reasoning, insertion sort is a linear algorithm.
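If you'd like to see those two cost functions fall out of the code, here's a small instrumented sketch. The touch-counting scheme and class name are my own, added for illustration; they are not part of the lecture's implementation.

public class TouchCounter {
  // Sorts items with insertion sort while counting how many distinct
  // cells each pass touches (a bookkeeping choice made for this sketch).
  public static int countTouches(int[] items) {
    int touches = 0;
    for (int fromIndex = 1; fromIndex < items.length; ++fromIndex) {
      int toIndex = fromIndex;
      while (toIndex > 0 && items[toIndex - 1] > items[toIndex]) {
        int tmp = items[toIndex - 1];
        items[toIndex - 1] = items[toIndex];
        items[toIndex] = tmp;
        --toIndex;
      }
      // The pass examined every cell from fromIndex down to the cell just
      // left of where the shove stopped (or cell 0 if it went all the way).
      int lowestTouched = Math.max(toIndex - 1, 0);
      touches += fromIndex - lowestTouched + 1;
    }
    return touches;
  }

  public static void main(String[] args) {
    System.out.println(countTouches(new int[] {5, 4, 3, 2, 1})); // 14 = 2 + 3 + 4 + 5
    System.out.println(countTouches(new int[] {1, 2, 3, 4, 5})); // 8 = 2(n - 1)
  }
}

For a reverse-sorted array the count grows like \(\frac{n^2 + n - 2}{2}\); for an already-sorted array it grows like \(2n - 2\), matching the two cost functions above.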
So, which is it? Is insertion sort linear or quadratic? It is both. In the worst case, it runs in quadratic time. In the best case, it runs in linear time. To paint a complete picture of the algorithm, you must present its best, worst, and average cases. Computing average case performance is a lot of work, and we're not going to do it this semester.
If you want to mislead people into buying your algorithm, you share your best case but hide the fact that the best case is rare. If you want to mislead people into rejecting your opponent's algorithm, you share their worst case but hide the fact that the worst case is rare.
As it happens, the best case of insertion sort is rare, so you are unlikely to experience that linear-time performance. Selection sort, on the other hand, always touches the same number of cells. Its best, worst, and therefore average cases are all quadratic.
Complexity Classes
Simplifying lands us on just a small set of different functions. Here are some of the most common ones: \(1\), \(\log n\), \(n\), \(n \log n\), \(n^2\), \(n^3\), and \(2^n\).
Each of these functions forms a set of all the cost functions that simplify to it. We express that a cost function belongs to one of these sets with Big Theta notation, as in \(T(n) \in \Theta(n^2)\).
We might say that an algorithm is Big Theta of \(n^2\), or simply that it is quadratic.
Each of these functions also forms a set containing all the cost functions that simplify to it or to something faster. This membership is expressed using Big O notation, as in \(T(n) \in O(n^2)\).
These different sets are called complexity classes.
Big O
Theoretical computer scientists have defined more formal criteria for deciding how to lump these cost functions together in complexity classes. Here's the textbook definition for saying that a function is Big O of another:
For cost function \(T(n)\), we say that \(T(n) \in O(f(n))\) if there exist positive constants \(c\) and \(n_0\) such that \(T(n) \le c \cdot f(n)\) for all \(n > n_0\).
What this means is that function \(f\), scaled by some constant, eventually bounds the cost function \(T\) from above.
You can think of Big O as imposing a less-than-or-equal-to relationship on functions. Function \(T\) grows no faster than function \(f\).
This reveals a weakness in Big O. We can say that a linear cost function is \(O(n)\), but we can just as truthfully say that it is \(O(n^2)\) or even \(O(2^n)\). Big O promises only an upper bound, not necessarily a tight one.
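For instance, here's how the best-case cost function from above fits the definition. The constants are my own choice; many other choices work just as well.

\[
2n - 2 \le 2n \quad \text{for all } n > 1, \text{ so } 2n - 2 \in O(n) \text{ with } c = 2 \text{ and } n_0 = 1.
\]
\[
2n - 2 \le 2n^2 \quad \text{for all } n > 1 \text{ as well, so the same cost function is also in } O(n^2), \text{ even though that bound is loose.}
\]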
Exercise
Suppose you have developed an algorithm for routing
Big Theta
To have meaningful measures of an algorithm's time efficiency, we want to sandwich its cost function both above and below. Sandwiching is provided by the Big Theta classification, which has this definition:
For cost function \(T(n)\), we say that \(T(n) \in \Theta(f(n))\) if there exist positive constants \(c_1\), \(c_2\), and \(n_0\) such that \(c_1 \cdot f(n) \le T(n) \le c_2 \cdot f(n)\) for all \(n > n_0\).
This means that one scaling of \(f\) bounds the cost function from below while another scaling bounds it from above. The cost function is sandwiched between the two.
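For instance, the worst-case cost function from earlier is sandwiched by two scalings of \(n^2\). The constants here are again my own choice.

\[
\tfrac{1}{4} n^2 \;\le\; \frac{n^2 + n - 2}{2} \;\le\; n^2 \quad \text{for all } n > 1,
\]

so \(\frac{n^2 + n - 2}{2} \in \Theta(n^2)\) with \(c_1 = \tfrac{1}{4}\), \(c_2 = 1\), and \(n_0 = 1\).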
Proving Membership
Finding constants to prove that a cost function is in one of these sets can be tedious. There's a shortcut: examine the limit of the ratio \(\lim_{n \to \infty} \frac{T(n)}{f(n)}\).
If the limit converges, then the value tells us about membership. A positive constant means \(T(n) \in \Theta(f(n))\); zero means \(T(n)\) is in \(O(f(n))\) but not in \(\Theta(f(n))\).
We're not going to prove this. Sorry.
Let's apply this strategy to the two cost functions we derived for insertion sort and see which complexity classes they land in. We'll work through them below.
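Here's a sketch of that limit test applied to the worst-case and best-case cost functions from above; the algebra is mine.

\[
\lim_{n \to \infty} \frac{(n^2 + n - 2)/2}{n^2} = \lim_{n \to \infty} \left( \frac{1}{2} + \frac{1}{2n} - \frac{1}{n^2} \right) = \frac{1}{2},
\]

so the worst-case cost function is in \(\Theta(n^2)\).

\[
\lim_{n \to \infty} \frac{2n - 2}{n} = \lim_{n \to \infty} \left( 2 - \frac{2}{n} \right) = 2,
\]

so the best-case cost function is in \(\Theta(n)\).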
TODO
You've got a few things to do before we meet again:
See you next time!
Sincerely,
P.S. It's time for a haiku!
while
Maybe even two