
CS 240: Lecture 4 – Asymptotic Complexity

September 1, 2021. Filed under cs3, fall-2021, lectures.

Dear students:

If I asked you how fast you can run, you probably wouldn’t give me a number. Rather, you’d ask me how far. To get a complete picture of your speed, I might ask how fast you run a mile, a 5K, a 10K, and a marathon. Similarly, when I ask an algorithm how many steps it takes to complete, I don’t get a single number. Instead, I get a function. I tell that function how big the input is, and the function reports how many steps the algorithm takes. We need the whole function in order to make a sound judgement about the algorithm. Some algorithms are sprinters, and some are long-distance runners.

Today we discuss how these functions are dropped into a handful of categories.

Simplifying

When we examine an algorithm’s cost function, we ignore all terms but the one that grows fastest. For example:
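$$n^2 + 3n + 5 \rightarrow n^2 \qquad\qquad 2^n + n^3 \rightarrow 2^n$$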

Additionally, we ignore all constant terms and coefficients. For example:
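$$50n \rightarrow n \qquad\qquad \frac{n^2}{100} \rightarrow n^2 \qquad\qquad 2n \log n + 7 \rightarrow n \log n$$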

We do not automatically throw out constants when they are the base of an exponent $n$ or the exponent of a base $n$. For example, $3^n$ does not simplify to $2^n$, and $n^3$ does not simplify to $n^2$.

This discarding of information needs some justification. How do we know that throwing out these lower-order terms and coefficients won’t throw off our ranking of algorithms?

Let’s start with a visceral argument. Suppose we have the simplified functions $n^2$ and $n$. When we plot them, we clearly see that the algorithm with cost function $n$ is faster.

However, what if we hadn’t simplified? What if the quadratic function was actually $\frac{n^2}{100}$? With that scale factor, the quadratic function dips below the linear function and therefore looks like the better algorithm. But if you look at larger values of $n$, you see that the win is only temporary. In the long run, the linear algorithm wins. Whatever constant coefficient you pick, at some point the linear function will drop below the quadratic. Constant coefficients do not change which function ultimately wins in this case.
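In fact, for $\frac{n^2}{100}$ versus $n$, the crossover happens at $n = 100$. Here’s a minimal sketch in Python, assuming nothing but the two cost functions themselves, that tabulates the costs and reports which algorithm is cheaper at each input size:

```python
# Compare the scaled quadratic cost n^2 / 100 against the linear cost n.
# The quadratic looks cheaper at first, but linear wins past n = 100.

def quadratic_cost(n):
    return n * n / 100

def linear_cost(n):
    return n

for n in [10, 50, 100, 150, 1000]:
    q, l = quadratic_cost(n), linear_cost(n)
    cheaper = "tie" if q == l else ("quadratic" if q < l else "linear")
    print(f"n = {n:4d}: n^2/100 = {q:8.1f}, n = {l:4d}, cheaper: {cheaper}")
```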

If the functions are of the same order, surely the coefficients are important. For example, when choosing between $50n$ and $5n$, I will always pick the latter. But if I simplify them both down to $n$, then I won’t know which is better. This is true. The problem is that getting accurate coefficients is really hard. They will be impacted by the programming language, compiler, and hardware.

We must also justify discarding the lower-order terms. Suppose you are comparing the cost functions $n^2 + 3n$ and $n^2$. Both of these simplify to $n^2$, which means we’re claiming the two algorithms are equivalent. But that’s not what a plot of the two functions shows.

Clearly the $n^2$ algorithm is the better one. But when you zoom out, the gap between them becomes insignificant. The gap of $3n$ will never entirely disappear, but it will be dwarfed by the $n^2$ term.
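We can make “dwarfed” precise with a quick limit: the ratio of the two cost functions approaches 1 as $n$ grows, so the $3n$ term contributes essentially nothing in the long run.

$$\lim_{n \to \infty} \frac{n^2 + 3n}{n^2} = \lim_{n \to \infty} \left(1 + \frac{3}{n}\right) = 1$$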

Complexity Classes

Simplifying lands us on just a small set of different functions. Here are some of the most common ones, ordered from slowest-growing to fastest-growing:

$$1 \qquad \log n \qquad n \qquad n \log n \qquad n^2 \qquad n^3 \qquad 2^n \qquad n!$$
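To get a feel for just how differently these grow, here’s a little Python sketch, purely illustrative, that evaluates a few of them at doubling input sizes:

```python
import math

# Evaluate some common cost functions at doubling input sizes
# to watch how quickly they pull apart from one another.
functions = [
    ("log n",   lambda n: math.log2(n)),
    ("n",       lambda n: n),
    ("n log n", lambda n: n * math.log2(n)),
    ("n^2",     lambda n: n ** 2),
    ("2^n",     lambda n: 2 ** n),
]

for name, f in functions:
    row = "  ".join(f"{f(n):>26,.0f}" for n in (8, 16, 32, 64))
    print(f"{name:>8}: {row}")
```

By $n = 64$, the exponential function has already reached roughly $1.8 \times 10^{19}$ steps, while $n \log n$ sits at a mere 384.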

Each of these functions forms a set of all the cost functions that simplify to it. We express that a cost function belongs to one of these sets with Big Theta notation:
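$$n^2 + 3n \in \Theta(n^2) \qquad\qquad 50n \in \Theta(n)$$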

We might say that such an algorithm is Big Theta of $n^2$, or that it runs on the order of $n^2$.

Each of these functions also forms a set containing all the cost functions that simplify to it or to something faster. This membership is expressed using Big O notation:
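$$50n \in O(n) \qquad\qquad 50n \in O(n^2) \qquad\qquad \log n \in O(n)$$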

These different sets are called complexity classes.

Big O

Theoretical computer scientists have defined more formal criteria for lumping these cost functions together into complexity classes. Here’s the definition given by your textbook for saying that one function is Big O of another:

For $T(n)$, a non-negative function, $T(n) \in O(f(n))$ if and only if there exist positive constants $c$ and $n_0$ such that
$$T(n) \leq cf(n) \text{ for all } n > n_0$$

What this means is that function $T$ can be made to fall below function $f$, though $f$ might need to be scaled up by a constant, and we might have to restrict our attention to large input sizes.

Based on this definition, we can formally show how cost functions fit into certain complexity classes. For example, $50n \in O(n)$ because $50n \leq cn$ when $c = 50$ and $n_0 = 1$.

This example shows that throwing away coefficients is an acceptable practice. We can also show that throwing away lower-order terms fits the definition of Big O, but the reasoning requires a little creativity. We suspect, for example, that $n^2 + 40n \in O(n^2)$. To prove it, we need to find a constant $c$ that makes this inequality hold:

$$n^2 + 40n \leq cn^2$$

Finding such a $c$ directly seems hard. However, as long as $n \geq 1$, we do know this:

$$n^2 + 40n \leq n^2 + 40n^2$$

If we combine terms, then we end up finding a value for $c$ that makes the first inequality true:

$$\begin{align}
n^2 + 40n &\leq n^2 + 40n^2 \\
n^2 + 40n &\leq 41n^2
\end{align}$$

This works for $c = 41$ and $n_0 = 1$. That means we have $n^2 + 40n \in O(n^2)$.
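The witnesses $c = 41$ and $n_0 = 1$ can also be spot-checked numerically. This loop is a sanity check rather than a proof, since it can only ever test finitely many values of $n$:

```python
# Spot-check that n^2 + 40n <= 41 * n^2 holds for every tested n >= n_0 = 1.
c, n0 = 41, 1
holds = all(n**2 + 40*n <= c * n**2 for n in range(n0, 100_000))
print(holds)  # prints True
```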

You can think of Big O as imposing a less-than-or-equal-to relationship on functions. Function $f$ is an upper bound on all the functions in its set.

This reveals a weakness in Big O. We can say that $n \in O(n^2)$ because $n \leq cn^2$, for $c = 1$ and $n > 1$. But this isn’t saying much. A very fast algorithm will be Big O of many functions.

Big Omega

Because Big O only tells us vaguely that an algorithm is no worse than another, theorists have also defined Big Omega, which states that an algorithm is no better than another. It has this definition:

For $T(n)$, a non-negative function, $T(n) \in \Omega(f(n))$ if and only if there exist positive constants $c$ and $n_0$ such that
$$T(n) \geq cf(n) \text{ for all } n > n_0$$

This is very much like Big O, but it states that a scaled version of function $f$ can be made to dip below function $T$ and stay there. The following memberships can be shown:
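$$n^2 \in \Omega(n) \qquad\qquad n^2 + 40n \in \Omega(n^2) \qquad\qquad 2^n \in \Omega(n^3)$$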

You can think of Big Omega as imposing a greater-than-or-equal-to relation on functions. But this too is not precise. Saying that an algorithm with a factorial cost function is $\Omega(n)$ is true but unhelpful.

Peer Instruction Exercise

Suppose you have developed an algorithm for routing $n$ pizza deliveries in the most fuel efficient way. The algorithm has cost function $2^{n+1}$. You are considering selling your algorithm and want to advertise it in the most favorable but truthful way. In which of the following sets do you advertise its membership?

Big Theta

To have a meaningful measure of an algorithm’s time efficiency, we want to sandwich its cost function both from above and from below. This sandwiching is provided by the Big Theta classification, which has this definition:

For $T(n)$, a non-negative function, $T(n) \in \Theta(f(n))$ if and only if $T(n) \in O(f(n))$ and $T(n) \in \Omega(f(n))$.

This means that one scaling of $f$ can be made to stay on or above function $T$, and another scaling can be made to stay on or below it. For example, we showed earlier that $n^2 + 40n \in O(n^2)$ and that $n^2 + 40n \in \Omega(n^2)$, so $n^2 + 40n \in \Theta(n^2)$. You can think of Big Theta as imposing an equality relationship on functions. Given this intuition, which of the following statements is true?

TODO

See you next time.

Sincerely,

P.S. It’s time for a haiku!

My car is a boat
But no slower than walking
It’s a Big O car