CS 240: Lecture 6 - Merge Sort
Dear students:
Last week we examined two sort algorithms. They contained little magic, but they were also slow. This week we look at two faster sorts: merge sort and quick sort. Both use recursion.
Concrete
Insertion sort and selection sort can be written in just a few lines of code, but the cost of each of those lines is quite high. Both algorithms are
Since there are
When
Merge Sort
Merge sort sorts an array by splitting it in two and sorting each half independently. The recursive calls keep splitting the array. Eventually we reach a subarray that contains one element or none. The elements in that very small subarray are sorted. Once the left and right subarrays are both sorted, they are merged together into one big sorted array.
The splitting operation might look something like this:
private static void mergeSort(int[] items,
int[] tmp,
int left,
int right) {
if (left < right) {
int mid = (left + right) / 2;
mergeSort(items, tmp, left, mid);
mergeSort(items, tmp, mid + 1, right);
merge(items, tmp, left, mid, right);
}
}
private static void mergeSort(int[] items, int[] tmp, int left, int right) { if (left < right) { int mid = (left + right) / 2; mergeSort(items, tmp, left, mid); mergeSort(items, tmp, mid + 1, right); merge(items, tmp, left, mid, right); } }
The tmp
variable provides scratch space that eases the merging process. The int
parameters delimit the window of the array that is being sorted. We shouldn't expect the caller to provide anything but the array to be sorted, so we make this helper method private. The public method primes the parameters to their initial values:
public static void mergeSort(int[] items) {
mergeSort(items, new int[items.length], 0, items.length - 1);
}
public static void mergeSort(int[] items) { mergeSort(items, new int[items.length], 0, items.length - 1); }
When I look at this algorithm, I find it hard to believe that it actually does anything. Most of its lines are devoted to the splitting. The real sorting magic is done by the merge
method.
Tentative Analysis
Finding the cost function for merge sort requires a new strategy. For the selection and insertion sorts, the costs of the inner loop increased linearly over
Drawing a picture for merge sort is possible. But first we use a recurrence. A cost function expressed as a recurrence is recursive. There's a base case for some small value of
In a general case,
For merge sort, the cost of the current level is the cost of the merge operation. We don't know what that is yet. Let's call it
But what is
Merge
The merge method receives an array that has two sorted subarrays in it. Its job is to reorganize those two subarrays into one sorted subarray. But it has to do this in an efficient way. It does this by considering only the head elements of the two subarrays at a time. It picks off the smaller of the two as the next element in the merged array.
When a head element is pulled from its subarray, the head index is nudged rightward. Eventually one of the subarrays will be consumed. At that point, we stop comparing the head elements and just draw from the rest of the unconsumed list.
We might implement merge
in this way:
private static void merge(int[] items, int[] tmp, int left, int mid, int right) {
// Copy items to temporary array since we're about to
// chaotically reorganize the original.
for (int i = left; i <= right; ++i) {
tmp[i] = items[i];
}
int i = left;
int leftIndex = left;
int rightIndex = mid + 1;
// Compare heads until one subarray is empty.
for (; leftIndex <= mid && rightIndex <= right; ++i) {
if (tmp[leftIndex] <= tmp[rightIndex]) {
items[i] = tmp[leftIndex];
++leftIndex;
} else {
items[i] = tmp[rightIndex];
++rightIndex;
}
}
// Copy over remainder of left subarray.
for (; leftIndex <= mid; ++leftIndex, ++i) {
items[i] = tmp[leftIndex];
}
// Copy over remainder of right subarray.
for (; rightIndex <= right; ++rightIndex, ++i) {
items[i] = tmp[rightIndex];
}
}
private static void merge(int[] items, int[] tmp, int left, int mid, int right) { // Copy items to temporary array since we're about to // chaotically reorganize the original. for (int i = left; i <= right; ++i) { tmp[i] = items[i]; } int i = left; int leftIndex = left; int rightIndex = mid + 1; // Compare heads until one subarray is empty. for (; leftIndex <= mid && rightIndex <= right; ++i) { if (tmp[leftIndex] <= tmp[rightIndex]) { items[i] = tmp[leftIndex]; ++leftIndex; } else { items[i] = tmp[rightIndex]; ++rightIndex; } } // Copy over remainder of left subarray. for (; leftIndex <= mid; ++leftIndex, ++i) { items[i] = tmp[leftIndex]; } // Copy over remainder of right subarray. for (; rightIndex <= right; ++rightIndex, ++i) { items[i] = tmp[rightIndex]; } }
Only one of the bottom two for loops will have any iterations. Note that there are other ways to write this method.
In the worst case, how many item comparisons will we need to perform to merge our two subarrays? The best case happens when one array is entirely smaller than the other. It will get copied over by the first for loop, and then the other array will be drained. In the worst case, the two halves will be evenly matched. Neither array will get drained before the other until the very last element. That means the first for loop will execute
We now completely describe the cost function of merge sort as a recurrence:
How does this compare to
However, I don't want to leave you in suspense. We can reason about the approximate cost using a picture. At the first level of the recursion, we merge together all of the
This complexity class is the product of a linear function and a logarithmic function. We say it's linearithmic. We'll plot a linearithmic function against a quadratic to see how merge sort compares to the selection and insertion sorts.
TODO
You've got a few things to do before we meet again:
See you next time!
Sincerely,
P.S. It's time for a haiku!
m
steps
But then they halved the problem
Now it takes just n