CS 330 Lecture 31 – Call-by-name and Call-by-need
Dear students,
Last time we introduced a different way of passing parameters. Instead of eagerly evaluating them, we delayed evaluation until they were referenced inside the function. You are probably thinking, “Call-by-name. Big deal. How many new control structures am I going to need in my life?” Probably not many, but I think the bigger deal is that just as Stroustrup made our new types indistinguishable from the builtin types, call-by-name lets our code get the same special treatment that builtin structures get. Call-by-name is just another step in the path to a language designer granting you full citizenship in the world centered on that language.
Let’s extend our table. Suppose a call-by-name function never references its lazily evaluated parameter. How many times will it have been evaluated? None. That’s a win compared to eager evaluation, especially if evaluating it was expensive. What if it’s referenced once? It will be evaluated once. What if it’s referenced many times? It will be evaluated many times.
scheme | 0 references | 1 reference | n references |
---|---|---|---|
call-by-value | 1 | 1 | 1 |
call-by-name | 0 | 1 | n |
Note that we don’t explicitly pass functions as parameters to a call-by-name function. The language’s tooling wraps up our actual parameters in functions for us. Since most languages don’t support call-by-name, our only recourse to achieve something like it is to explicitly pass functions or lambdas. In Ruby we can use blocks for this. Let’s revisit repeat
with blocks:
def repeat n i = 0 while i < n yield i += 1 end end repeat 5 do puts "Foobag!" end
How about a loop-and-a-half control structure? Sadly, Ruby doesn’t let us pass more than one block to a function. We’ll use a lambda
instead:
def loophalf n, outer, inner if n > 0 outer.call n.times do inner.call outer.call end end end loophalf 3, lambda { print 'x' }, lambda { print 'o' }
Even if call-by-name doesn’t exist in most of our languages, we can get close!
Okay, that’s call-by-name/pass-by-closure. There’s another scheme for passing parameters that we should look at. It too is lazy, but instead of reevaluating the parameter on every reference, we only evaluate it the first time and record the result so future references are really fast. This is termed call-by-need, and it too needs a better name. Like pass-by-jit. JIT stands for just-in-time, hinting at the lazyness of this scheme. Let’s round out our table:
scheme | 0 | 1 | n |
---|---|---|---|
call-by-value | 1 | 1 | 1 |
call-by-name | 0 | 1 | n |
call-by-need | 0 | 1 | 1 |
Call-by-need semantics are all over Haskell. What are they good for, those lazy sods? Well, let’s see how they work first, and then I think we’ll be better equipped to answer that question. We will demonstrate the ideas of call-by-need in Ruby, even though it doesn’t technically support it. Let’s create a Lazy
class that encapsulates the big ideas of call-by-need:
- The programmer expresses a computation, but it is not executed until later
- When the programmer does need the real value, the computation is executed and the result is stored for later quick retrieval
How do we expression a computation but not execute it? We do it all the time. Push it into a function. Or a lambda. Or a block. Someone will therefore use our class like this:
image = Lazy.new do open('http://www.twodee.org/tests/slowimage/slowimage.php').read end puts "The lazy image has been constructed, but not executed." if some condition puts "Ugh! This is gonna be slow." puts image.eval.length puts "But not these!" puts image.eval.length puts image.eval.length puts image.eval.length puts image.eval.length puts image.eval.length puts image.eval.length else puts "we never paid the cost of evaluating the image!" end
Retrieving that image is a costly operation. We can avoid that cost if we don’t need it. But if we need it a lot, we’ll only take the hit once.
Alright, so how do this? Lazy
‘s constructor will need to receive and hang on to a block. This is often called a thunk:
class Lazy def initialize &block @thunk = block end end
We will need to evaluate that thunk when the user really needs the value:
class Lazy def initialize &block @thunk = block end def evaluate @thunk.call end end
But there’s no performance benefit yet. We also need to hold on to (memoize) the value after the first execution. Subsequent calls will not reevaluate. They will simply return the previously cached result:
class Lazy def initialize &block @thunk = block @cache = nil end def evaluate if @cache.nil? @cache = @thunk.call end @cache end end
So, there’s an obvious performance benefit from call-by-name semantics. In fact, this laziness can let us create infinite data structures: lists that go on forever, trees that have no leaf nodes. Let’s make an infinite list. It’s always good to start off designing a class by looking at it from the client’s perspective. We want to start it off at a particular number and have it jump by some delta between elements. We can iterate through like this for as many elements as we wish:
l = InfiniteList.new 10, 5 40.times do puts l.head l = l.tail end
We can start to write the constructor. We know the list will have the given number as its head:
class InfiniteList def initialize start, delta @head = start end end
What about its tail? That’ll be another InfiniteList
whose head is start + delta
. But that InfiniteList
will itself have a tail that’s an InfiniteList
, and things will get out of hand quickly. We need to be lazy about the tail:
def initialize start, delta @head = start @tail = Lazy.new InfiniteList.new start + delta, delta end end
Now, we need some methods for accesing the list’s head and tail. The head is pretty easy:
def head @head end
But remember that we need delayed evaluation of the tail until such a time as it was needed. That time is now. We force evaluation of the tail:
def tail @tail.eval end
Will you use call-by-name a lot in life? Well, you wouldn’t if you didn’t know it was possible. But the idea of postponing an evaluation until later and caching its result is definitely not crazy. Call-by-name semantics actually build that behavior into a data structure so that the delaying and caching is transparent to the developer. You are probably using software that uses the call-by-name idea without realizing it. I can think of a few applications:
- Consider an SQL query that yields thousands or millions of records. Not all of them are probably given to you when you issue your
SELECT
statement. Perhaps you are given the first 100, and the remainder are lazily waiting around to be needed. - Consider a game with a large 3D world to explore. A smart game engine won’t immediately load in the 3D models and textures for parts of the world that you aren’t currently in. Eagerness and graphics cards don’t get along.
- Consider visualizing a social graph. If you want see who is friends with someone else, you don’t need to load the entire graph into memory at once. You can load the graph lazily. Only when a node is selected do you retrieve its connected nodes.
A data structure that abstracts this laziness improves performance without burdening the clients.
Here’s your TODO list:
- Read Pattern Matching for Java. On a quarter sheet, write down 2-3 questions or observations. Would you tweak the authors’ suggestions in any way? What situations would benefit from pattern matching in Java?
See you next time, when we discuss some how to tame recursion that gets out of hand.
P.S. It’s Haiku Friday!
I don’t have clean bowls
It’s not because I’m lazy
I just wash by need