CS 330: Lecture 28 – Filter, Map, and Fold
Dear students:
Last time we talked about lambdas and higher-order functions. We wrote this count method:
count :: (a -> Bool) -> [a] -> Int
count _ [] = 0
count predicate (first : rest)
| predicate first = 1 + count predicate rest
| otherwise = count predicate rest
Let’s start with a Program This that uses this function:
Partially applyHere’s my solution.count
to define functionnUpper
. It accepts a list of Cartesian coordinate pairs. It returns the number of pairs whose x-coordinate is less than its y-coordinate. If you cut space along the x == y diagonal, such points are in the upper triangle.If you finish that one, write another to count the number of short
String
s in a list ofString
s.
nUpper :: count (\(a, b) -> a < b)
shorties :: count (\s -> length s < 5)
nUpper [(1, 2), (3, 3), (4, 5), (11, 10)]
shorties ["I", "think", "it's", "impossible", "to", "really", "understand", "somebody", "what", "they", "want", "what", "they", "believe", "and", "not", "love", "them", "the", "way", "they", "love", "themselves"]
Lambdas are pretty great, but be careful. What’s wrong with these?
count (\x -> x > 0) [-3..3]
count (\c -> isUpper c) "You Get What You Need"
count (\x -> x % 2 == 0) [-4..4]
All three of these are cases of UFW—unnecessary function wrapping. We have existing functions that do the same thing, and there’s no need to define a new one:
count (> 0) [-3..3]
count isUpper "You Get What You Need"
count even [-4..4]
Now, we’ve seen how easy it is to define functions on the fly and hand them off to other functions. Think about what this means for those tired old patterns we have been writing in our code. How many loops have you written that have the exact same structure, with the only difference being how you act on an element. Or how you combine it with the others. We can pull out those repeated parts into a function but push the parts that change into a parameter. That parameter has to be an action—a function.
We’ll look at two of those patterns today and implement their abstractions in Haskell. Eventually we will look at them in other languages.
Let’s start with filter
. In SQL, this is the WHERE
clause. We want to reduce a collection to some subset matching some criteria. That criteria is the hole that we must leave blank, but the looping structure is the same everytime:
keepers = [] for each item in list if keeper meets criteria append item to keepers
Does this algorithm look familiar to you? It should. It’s a lot like count
. In Haskell, we can write it like so:
filter' :: (a -> Bool) -> [a] -> [a]
filter' _ [] = []
filter' predicate (first : rest)
| predicate first = first : filter' predicate rest
| otherwise = filter' predicate rest
Now let’s create a collection on which to test this. First we’ll make a new Planet
datatype:
-- A planet is a 3-tuple of name, distance from sun in AU, and planet-Earth size ratio
data Planet = Planet String Double Double deriving Show
Then we can make a list:
planets = [
Planet "Mercury" 0.4 0.055,
Planet "Venus" 0.7 0.815,
Planet "Earth" 1.0 1.0,
Planet "Mars" 1.5 0.107,
Planet "Jupiter" 5.2 318.0,
Planet "Saturn" 9.5 95.0,
Planet "Uranus" 19.6 14.0,
Planet "Neptune" 30.0 17.0
]
We can grab a list of all the planets with more mass than Earth:
massierThanEarth (Planet name, au, mass) = mass > 1.0
massiers = filter' biggerThanEarth planets
-- or using a lambda
massiers' = filter' (\(Planet _ _ mass) -> mass > 1.0) planets
We can grab a list of all the planets inside the asteroid belt, whose inner radius is about 2.2 AUs:
prebelters = filter' (\(Planet _ au _) -> au < 2.2) planets
We can grab a list of all the planets that have a U in their name:
us = filter' (\(Planet name _ _) -> elem 'u' name) planets
The key idea here is that we wrote filter'
once. We wrote the iteration once. We never have to write it again to filter a list. We need only jam a lambda into the hole it leaves open in the algorithm: the predicate.
The next pattern we’ll visit is map. It abstracts this general algorithm:
news = array same size as olds for each item in olds news[i] = transform olds[i] return news
What does this look like in Haskell?
map' :: (a -> b) -> [a] -> [b]
map' _ [] = []
map' xform (first : rest) = xform first : map' xform rest
We can turn a list of planets into a list of planet names:
map' (\(Planet name _ _) -> name) planets
We can append , Lord of Catan
to a bunch of names:
map' (++ ", Lord of Catan") ["Alice", "Bob", "Carol", "Diphthong"]
We can generate a never-ending sequence of even numbers:
map' (*2) [1..]
It’s time for the third big pattern. However, before we examine an abstraction, we should see concrete realizations of the thing we’re abstracting—some say three. Let’s write the following functions, all of which accept a list. They can all be written in the two-line recursive breakdown we’ve seen for processing lists.
sum
, which yields the sum of all the elements in the listproduct
, which yields the product of all the elements in the listjoin
, which concatenates together all the elements of the listany
, which yields true if any of the boolean elements of the list is true, and false otherwiseall
, which yields true if all of the boolean elements of the list are true, and false otherwise
What do these solutions all have in common? Well, they all turn a list into some summary value. How are they different?
- In their “default” or “zero” value that is invoked in the base case.
- In their mixing or combining function, which stirs one of the elements into the summary value.
(In these cases, that summary value is a scalar, but it could have been another collection. Consider copy (first : rest) = first : copy rest
.)
The abstraction of this algorithm is called fold
or reduce or inject, and the zero value and the mixing function will be the holes that we leave open. Its interface will be something like this:
fold mix zero list
Let’s define it, postponing the type signature for a moment:
fold mix zero [] = zero
fold mix zero (first : rest) = mix first (fold mix zero rest)
We can apply a few syntactic niceties:
fold _ zero [] = zero
fold mix zero (first : rest) = mix first $ fold mix zero rest
Now, let’s rewrite all of our functions from above using fold
:
sum = fold (+) 0
product = fold (*) 1
join = fold (++) ""
any = fold (||) False
all = fold (&&) True
With these three functional patterns abstracted away, we will almost never have to write code again, I’m pretty sure. They give us the power to be brief.
Here’s your TODO list:
- Read Higher Order Functions in Learn You a Haskell. On a quarter sheet, write a function that uses two of
map
,filter
, andfold
to compute something interesting. Use lambdas and partial evaluation where appropriate.
See you next time!
P.S. It’s time for a haiku!
Filter, map, and fold
They solve most of our problems
‘Cept for Syria
P.P.S. Here’s the code we wrote together:
lecture.hs
filter' :: (a -> Bool) -> [a] -> [a]
filter' pred [] = []
filter' pred (first : rest)
| pred first = first : filter' pred rest
| otherwise = filter' pred rest
data Planet = Planet String Double Double deriving Show
planets = [
Planet "Mercury" 0.4 0.055,
Planet "Venus" 0.7 0.815,
Planet "Earth" 1.0 1.0,
Planet "Mars" 1.5 0.107,
Planet "Jupiter" 5.2 318.0,
Planet "Saturn" 9.5 95.0,
Planet "Uranus" 19.6 14.0,
Planet "Neptune" 30.0 17.0
]
massierThanEarth (Planet _ _ mass) = mass > 1
massiers = filter' massierThanEarth planets
prebelters = filter' (\(Planet _ au _) -> au < 2.2) planets
us = filter' (\(Planet name _ _) -> elem 'u' name || elem 'U' name) planets
map' :: (a -> b) -> [a] -> [b]
map' xform [] = []
map' xform (first : rest) = xform first : map' xform rest
fold mix zero [] = zero
fold mix zero (first : rest) = mix first (fold mix zero rest)
sum' = fold (+) 0
product' = fold (*) 1
all' = fold (&&) True