teaching machines

CS 240: Lecture 27 – AVL Trees

October 27, 2021 by . Filed under cs3, fall-2021, lectures.

Dear students:

Last time we introduced a special case of binary trees in which the left descendents are all less than their root and all the right descendents are greater. Such a tree holds the binary search in its bones.

The binary search is fast, right? We like that. To get that speed, we need to sort our collection and keep it sorted as we add new things to it. Sorting and searching are issues we’ve already examined with arrays, so why do we bother coming up with a new structure for performing a binary search? Consider this table comparing the worst case performance of the find, insert, and remove operations for sorted arrays and binary search trees:

data structure find insert remove
sorted array $\Theta(\log_2 n)$ $\Theta(n)$ $\Theta(n)$
balanced BST $\Theta(\log_2 n)$ $\Theta(\log_2 n)$ $\Theta(\log_2 n)$
unbalanced BST $\Theta(n)$ $\Theta(n)$ $\Theta(n)$

Binary search trees beat out arrays when it comes to inserting and removing because they are linked. Arrays need to shuffle their elements around. However, binary search trees have the advantages only when they are balanced, when they have as much mass on the left as on the right. When they aren’t balanced, the performance degenerates to linear time for all three operations.

This means care must be taken to keep a tree balanced. Several strategies exist for keeping them balanced. The strategy we examine today is called an AVL tree. This was the first so-called self-balancing tree that was invented all the way back in the 1960s.

In an AVL tree, each node is assigned a balance factor. A node’s balance factor is the difference between its children’s heights. If the left side is taller than the right, then the balance factor is positive. If the right side is taller, then the balance factor is negative. If they are equal, the balance factor is 0. The balance factor is calculated as left.height - right.height. If a child doesn’t exist, its height is considered to be -1.

Insertion

After a new node is inserted in an AVL tree, one must walk up from the new node, possibly all the way to the root. If any node on that path has a balance factor greater than 1 or less than -1, then it must be rebalanced by rotating the node. Rotating a node left means that its right child takes its place as the parent, and the node becomes its left child. Rotating a node right means that its left child takes its place as the parent, and the node becomes its right child.

The new parent might already have a child in the place where the rotated node is supposed to be installed. The child will have to be moved, but where? The rotated node has lost one of its children, so it’s a good candidate. In a left rotation, the new parent’s left child moves to become the right child of the rotated node. In a right rotation, the new parent’s right child moves to become the left child of the rotated node.

A single rotation will not always balance out the tree. This situation can be detected by comparing the signs of the balance factors of the node you want to rotate and the node that will take its place. If their signs are different, then the new balance factors won’t improve. In such a situation, we first rotate the node’s child.

Altogether, here is our strategy for deciding how to rotate a node:

if node's balance factor is > 1 (left is heavy)
  if signs of node's and left child's factors don't match
    rotate left child left
  rotate node right
else if node's balance factor is < 1 (right is heavy)
  if signs of node's and right child's factors don't match
    rotate right child right
  rotate node left

Exercises

For the remainder of our time together, you’ll work through some exercises on binary search trees with your neighbors.

TODO

You have some work to do before the next class:

See you next time.

Sincerely,

P.S. It’s time for a haiku!

Our wish when we’re young
To be our parents’ parent
But not when they’re old