CS 240: Lecture 20 - Heaps Continued
Dear students:
Today we continue our discussion of heaps. First we'll examine how they can be used to implement a very pleasant sorting algorithm that uses little time and little space. Then we'll discuss the complexity of the various priority queue and heap operations.
Heap Sort
When we last left sorts, we had two reasonable, linearithmic algorithms: merge sort and quick sort. Neither is perfect: merge sort requires extra space, and quick sort can degenerate to quadratic time. With heaps, we can achieve a very fine sort that requires no extra space and has a linearithmic worst-case complexity.
Heap sort starts by taking the unsorted array and imposing a max-heap structure on it. That means we need to get larger items above smaller items. We have two choices for how to implement this. We could perform this task:
    for each node from top to bottom
        percolate node up
Or this one:
    for each node from bottom to top
        percolate node down
Which approach is better? The leaves at the bottom of the tree already satisfy the heap property on their own and never need to percolate down, and they make up roughly half of the nodes. The second approach is therefore less work: we can skip the entire bottom row.
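Here is a minimal sketch of that bottom-up construction in Java, assuming the usual array layout in which node $i$'s children live at indices $2i + 1$ and $2i + 2$; the names `HeapSort`, `buildMaxHeap`, and `percolateDown` are mine, not from our course code.

```java
public class HeapSort {
  // Impose a max-heap on the whole array, in place. Indices at or beyond
  // arr.length / 2 are leaves, so we start at the last non-leaf and work
  // backward toward the root, percolating each node down.
  public static void buildMaxHeap(int[] arr) {
    for (int i = arr.length / 2 - 1; i >= 0; --i) {
      percolateDown(arr, i, arr.length);
    }
  }

  // Sink the item at index i until neither child within arr[0..size) is
  // bigger than it.
  static void percolateDown(int[] arr, int i, int size) {
    while (2 * i + 1 < size) {
      int child = 2 * i + 1;                          // left child
      if (child + 1 < size && arr[child + 1] > arr[child]) {
        child++;                                      // right child is bigger
      }
      if (arr[i] >= arr[child]) {
        break;                                        // heap property holds
      }
      int tmp = arr[i];                               // swap parent and child
      arr[i] = arr[child];
      arr[child] = tmp;
      i = child;
    }
  }
}
```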
Once the heap is formed, we repeatedly remove the maximum. Instead of throwing it into a new sorted collection, which would give heap sort the same blemish as merge sort, we place it at the end of the original array. After all items have been removed, the array contains the sorted list.
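Continuing that sketch, the extraction phase might look like this; it reuses the hypothetical `percolateDown` helper above and belongs in the same class. Each pass swaps the maximum into the last slot of the shrinking heap region, so no extra array is needed.

```java
  // Sort arr in place: heapify it, then repeatedly move the maximum to the
  // end of the still-unsorted prefix and shrink the heap by one.
  public static void heapSort(int[] arr) {
    buildMaxHeap(arr);
    for (int end = arr.length - 1; end > 0; --end) {
      int tmp = arr[0];                 // the current maximum...
      arr[0] = arr[end];
      arr[end] = tmp;                   // ...goes to its final position
      percolateDown(arr, 0, end);       // restore the heap on arr[0..end)
    }
  }
```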
Time Complexity
Suppose we want to implement the priority queue ADT. We have many data structures we could use. Let's consider our choices and their computational complexities by filling out this table:
data structure | insert | removeMin | peekMin |
---|---|---|---|
sorted array | ? | ? | ? |
unbalanced BST | ? | ? | ? |
AVL tree | ? | ? | ? |
heap | ? | ? | ? |
The `insert` operation takes in a priority-item pair. The `removeMin` operation always removes and returns the item with the lowest priority value. The `peekMin` operation returns it but does not remove it.
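In code, the ADT could be captured by an interface along these lines. This is only a sketch; the name `MinPriorityQueue`, the use of `int` priorities, and the exact method signatures are assumptions of mine rather than anything standard.

```java
// A min-oriented priority queue: items with smaller priority values are
// removed first.
public interface MinPriorityQueue<T> {
  // Add an item tagged with the given priority value.
  void insert(int priority, T item);

  // Remove and return the item whose priority value is lowest.
  T removeMin();

  // Return, but do not remove, the item whose priority value is lowest.
  T peekMin();

  // Report whether the queue holds no items.
  boolean isEmpty();
}
```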
Heapify
Heaps have an advantage over balanced BSTs that is not reflected in the table above. Suppose you have an array filled with values in an arbitrary order, and you want to turn it into one of these collections. How many operations will it take? Sorting the array will take $\Theta(n \log n)$ operations, and inserting the $n$ items one at a time into a balanced BST costs $\Theta(n \log n)$ as well.
What about populating a heap? A first analysis might observe that there are $n$ items to put in place and that each percolation costs $O(\log n)$ swaps, for a total of $O(n \log n)$. That bound is correct, but it is not tight.
Imagine the un-heapified array drawn in tree form. The most costly tree to turn into a heap is a complete one, so imagine a complete tree. How many items are in it? This table matches the height of the tree to the number of nodes:
height | nodes |
---|---|
0 | 1 |
1 | 3 |
2 | 7 |
3 | 15 |
4 | 31 |
… | … |
Now let's consider the cost of getting each of these items into place so that we have a heap. To keep things concrete, let's examine a tree of height 5, which will have $2^6 - 1 = 63$ nodes. A node's worst-case number of swaps when percolating down is its height above the leaves:
level | nodes | swaps per node | total swaps |
---|---|---|---|
0 (root) | 1 | 5 | 5 |
1 | 2 | 4 | 8 |
2 | 4 | 3 | 12 |
3 | 8 | 2 | 16 |
4 | 16 | 1 | 16 |
5 (leaves) | 32 | 0 | 0 |
All told, we'll need to perform 57 swaps to get the array into a heap. There are only 63 nodes. This suggests that building a heap costs less than one swap per node, which hints that heapify might be linear rather than linearithmic. In general, a tree of height $h$ has $2^{h-k}$ nodes at height $k$ above the leaves, and each of them needs at most $k$ swaps, so the total number of swaps is at most

$$\sum_{k=0}^{h} k \cdot 2^{h-k} = 2^h \sum_{k=0}^{h} \frac{k}{2^k}$$
The summation term is interesting. As $h$ grows, it expands to this series:

$$\frac{1}{2} + \frac{2}{4} + \frac{3}{8} + \frac{4}{16} + \cdots$$
This is nearly a geometric series, and we can reduce it down to a simpler expression with some sleight of hand. Call the sum $S$ and consider the sum, the half sum, and their difference:

$$S = \frac{1}{2} + \frac{2}{4} + \frac{3}{8} + \frac{4}{16} + \cdots$$

$$\frac{S}{2} = \frac{1}{4} + \frac{2}{8} + \frac{3}{16} + \cdots$$

$$S - \frac{S}{2} = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots$$
The sequence represented by $S - \frac{S}{2}$ is a geometric series whose sum is 1. But $S - \frac{S}{2}$ is just $\frac{S}{2}$, so $\frac{S}{2} = 1$ and therefore $S = 2$.
We now rewrite our total number of swaps:

$$2^h \sum_{k=0}^{h} \frac{k}{2^k} \le 2^h \cdot S = 2^h \cdot 2 = 2^{h+1} = n + 1$$
Earlier we said that sorting the array or inserting its items into a balanced BST would take $\Theta(n \log n)$ operations, and a first analysis suggested the same for heapify. In fact, heapify never needs more than $n + 1$ swaps, so building a heap from an arbitrary array takes only $\Theta(n)$ time.
We achieve this performance because we do the least work per node in the levels where the most nodes live. If we had implemented the heapify operation in reverse, starting at the root and percolating upward, we would be doing the least work per node in the levels where the fewest nodes live. It's mildly disturbing that a subtly different approach to the task can lead to a slower complexity class.
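If you'd like to see that gap concretely, here's a hypothetical experiment, not from lecture, that counts swaps for both construction orders on an already-ascending array, which is a worst case for the top-down, percolate-up approach because every node climbs all the way to the root.

```java
// A hypothetical experiment: count swaps while imposing a max-heap on an
// ascending array, once bottom-up with percolate down and once top-down
// with percolate up.
public class HeapifySwapCounter {
  public static void main(String[] args) {
    int n = (1 << 20) - 1;                 // a complete tree's worth of nodes
    int[] ascending = new int[n];
    for (int i = 0; i < n; ++i) {
      ascending[i] = i;
    }
    System.out.println("bottom-up swaps: " + buildBottomUp(ascending.clone()));
    System.out.println("top-down swaps:  " + buildTopDown(ascending.clone()));
  }

  // Percolate each non-leaf down, starting from the last non-leaf.
  static long buildBottomUp(int[] arr) {
    long swaps = 0;
    for (int i = arr.length / 2 - 1; i >= 0; --i) {
      int j = i;
      while (2 * j + 1 < arr.length) {
        int child = 2 * j + 1;
        if (child + 1 < arr.length && arr[child + 1] > arr[child]) {
          child++;
        }
        if (arr[j] >= arr[child]) {
          break;
        }
        swap(arr, j, child);
        swaps++;
        j = child;
      }
    }
    return swaps;
  }

  // Percolate each node up toward the root, visiting nodes top to bottom.
  static long buildTopDown(int[] arr) {
    long swaps = 0;
    for (int i = 1; i < arr.length; ++i) {
      int j = i;
      while (j > 0 && arr[(j - 1) / 2] < arr[j]) {
        swap(arr, (j - 1) / 2, j);
        swaps++;
        j = (j - 1) / 2;
      }
    }
    return swaps;
  }

  static void swap(int[] arr, int i, int j) {
    int tmp = arr[i];
    arr[i] = arr[j];
    arr[j] = tmp;
  }
}
```

On this input, the bottom-up count stays below the number of nodes, while the top-down count grows on the order of $n \log n$, mirroring the analysis above.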
Space Complexity
Let's consider also how much space these data structures consume.
data structure | incidental space |
---|---|
sorted array | ? |
unbalanced BST | ? |
AVL tree | ? |
heap | ? |
TODO
You've got a few things to do before we meet again:
See you next time!
Sincerely,
P.S. It's time for a haiku: