CS 330 Lecture 28 – Haskell IO
Dear students,
There are a few ideas we didn’t get to last time. Let’s touch upon those before diving into a discussion of Haskell I/O.
First, what about the type signature for fold
? Let’s reason it out. We know it will have four parts: three parameters and a return value.
fold :: _ -> _ -> _ -> _
What’s the type of the zero value? It’s generic, so we’ll call it a
:
fold :: _ -> a -> _ -> _
We have enough information to complete another of the blanks. In the base case, the zero value is going to be returned. That means the return type better match the zero value’s type:
fold :: _ -> a -> _ -> a
How about the list? It too is generic and doesn’t have to be the same as the zero value, as demonstrated by length
. We’ll call it b
:
fold :: _ -> a -> [b] -> a
Time for the last blank. It’s a function. It takes in an element from the list (a b
) and the result of a recursive call (an a
), and it’s result becomes fold
‘s result (an a
). That means we’ve got this for a type signature:
fold :: (b -> a -> a) -> a -> [b] -> a
Second, sometimes we want to summarize a collection and use one of the elements as the summary value. Think biggest
and smallest
:
biggest [] = error "not possible" biggest (first:[]) = first biggest (first:rest) = max first $ biggest rest smallest [] = error "not possible" smallest (first:[]) = first smallest (first:rest) = min first $ smallest rest
We can define these with fold
too!
biggest [] = error "not possible" biggest (first:rest) = fold max first rest smallest [] = error "not possible" smallest (first:rest) = fold min first rest
We could also abstract out this special case of folding that expects a list of at least one element:
fold1 [] = error "not possible" fold1 mix (first:rest) = fold mix first rest biggest = fold1 max smallest = fold1 mi
Third, many of you didn’t think much of closures. I want to share an example that should demonstrate to you that they are actually quite nice and that you’ve probably been using them all along.
Consider the problem of creating a compound slider/textbox widget in HTML and Javascript:
<!DOCTYPE html> <html> <head> <title>...</title> <style> form { background-color: cornflowerblue; border-radius: 20px; width: 40%; text-align: center; padding: 20px; } </style> </head> <body> <form> <input id="slider" type="range"><br> <input id="box" type="text" size="3"> </form> <script> var slider = document.getElementById('slider'); var box = document.getElementById('box'); slider.oninput = function() { box.value = slider.value; } box.oninput = function() { slider.value = box.value; } </script> </body> </html>
These callback functions are closures. They rely on slider
and box
being accessible. There’s nothing crazy here. This code is not suprising, and it is pleasant to write. Let’s consider our alternatives. We have a few choices about how to handle scope when we define functions:
- We could disallow free variables, forcing everything come in through parameters. The thing is, we aren’t always the ones calling these functions, and passing parameters to them is not an option.
- We could bind the free variables in scope at the time of the function’s definition. That’s a closure.
- We could bind the free variables in scope at the time of the function’s invocation. That’s called dynamic scoping, and it is too wild.
Alright, I don’t think we can postpone discussing Haskell IO any longer. Up till this point we haven’t written any standalone programs. Nothing that gets input from the user, nothing that generates output. We have looked at the purely functional side of Haskell, where time does not exist. Functions always produce the same value given the same inputs. There’s no notion of a counter that increments. Who on Earth would want a language like this?
- People who want to mathematically prove their programs are correct. Proving correctness is very hard to do if you allow RAM to get twiddled with through assignment statements—especially if pointers are involved.
- People who want their programs to run very quickly. Far fewer copies of data need to be made if you know it will never change. Functional programming promotes an unparallelled ethic of sharing. There’s no need to insulate one environment from another, because no damage can be done. Imagine if you could never catch a disease. You wouldn’t mind hanging out with lepers.
- People who want their programs to easily be distributed across multiple machines. Imagine we have three computers processing a dataset. If one of them updates a variable that the others rely on, we have a hazard on our hands. How do we make sure the other machines get the update? We have to use some sort of synchronization to ensure consistency. But if we don’t allow updates, each machine can do its work without concern for the others.
That said, at some point our feet need to touch earth. Time is a thing, and state does change. Haskell does recognize this, but it becomes a different language. It becomes the language of actions. In Haskell, an action produces a side effect. The most obvious side effect is console output. We can use putStrLn
to achieve this:
main = putStrLn "Goodbye, Pluto!"
If we want to issue several side effects, we can sequence them together in a do
block:
main = do putStrLn "Thanks, Obama!" putStrLn "You're welcome!"
This should look more like what you’re used to, right?
Other kinds of side effects exist. Like opening a web browser with openBrowser
, which is provided in module Web.Browser
by the open-browser
package:
import Web.Browser main = do putStrLn "Thanks, Obama!" openBrowser "https://www.youtube.com/watch?v=79DijItQXMM" putStrLn "You're welcome!"
The documentation shows what kind of thing openBrowser
returns:
openBrowser :: String -> IO Bool
Not a Bool
, but an IO Bool
. Haskell is serious about about keeping separate the world of side effects and the world of truth and identity. If you go get some data from the side effect world, it is tainted. Any result that a side effecting function gives back is packaged up in a capsule—an IO
—that isolates the contagion.
Time is another side effect. Can you think of any functions from your imperative languages that alter time? Haskell’s got those too. Check out threadDelay
:
threadDelay :: Int -> IO ()
That set of parentheses is the empty tuple. Sometimes it’s called unit. You can think of it as void. This function is used purely for its side effects, giving back no result. But notice that even that non-result is tainted. Also note that it takes milliseconds, so that number has to be pretty big to be perceptable:
main = do putStrLn "Thanks, Obama!" threadDelay 3000000
Dealing with the running process’ environment is also done in the world of side effects. Suppose we want to get the user’s username. We can use functions from the System.Posix.User
module. Its documentation tells us of the getLoginName
function:
getLoginName :: IO String
It gives us back a tainted string. Hmm… Let’s try using it:
main = do putStrLn "Thanks, Obama!" threadDelay 3000000 putStrLn $ getLoginName
This doesn’t work. putStrLn
doesn’t want your tainted junk. It wants a pure string. What do we do? We must open the sealed capsule. The bind operator <-
does that for us:
main = do putStrLn "Thanks, Obama!" threadDelay 3000000 username <- getLoginName putStrLn $ "You're welcome, " ++ username ++ "!"
The bind operator effectively unwraps the data from its IO
seal and adds a new variable into our execution environment. But here’s the thing: we can only bind like this inside another IO action. We can’t open up one of these capsules inside the world of truth and identity. An IO action represents state, and Haskell makes it impossible to introduce state into the cleanroom that is pure functional programming.
What’s another side effect? How about input? There’s getLine
. Can you guess what its signature is?
getLine :: IO String
To echo out the typed line, we do this:
main = do line <- getLine putStrLine line
What if we wanted an Int
from the user? Well, we can write our own little helper that uses getLine
and then parses it:
getInt :: IO Int getInt = do line <- getLine ???
But then what do we do with it? Remember the show
function? It turns values of many different types into a string representation. Here we have the string and we want to go in reverse. The opposite of show
is read
. However, since the view of the relationship in this direction is one-to-many, we must give read
a clue about what type we want:
getInt :: IO Int getInt = do line <- getLine let int = read line :: Int ???
Also weird here is our use of let
and =
. Variable line
isn’t locked up in an IO action here, so can use regular assignments. Except regular assignments inside IO actions must be preceded by the let
keyword. Go figure.
Okay, now what about our return value? We have a beautiful and pure Int
, but it was born in the world of taint. So, we must encapsulate it. That’s what return
is for:
getInt :: IO Int getInt = do line <- getLine let int = read line :: Int return (int)
return
has this type signature:
return :: a -> IO a
It’s not really like a return in imperative languages, as it doesn’t do anything with control flow. It just generates a tainted capsule. If that capsule is the last expression of the do
block, it will get sent back to the caller, just as with pure functions.
Let’s deal with command-line parameters. This, like getting the username, involves communicating with the process, which is a stateful thing. We enter the world of side effects. The getArgs
function in System.Environment
has this signature:
getArgs :: IO [String]
Let’s extract out the first argument and turn into an Int
:
main = do args <- getArgs -- unwrap from the IO let arg0 = head arg -- pull out just the first let n = read arg0 :: Int -- parse the int
Erm… We should do something interesting here. Let’s print out all the factors of n
. How can we do this in Haskell? Through filtering!
factors n = filter (\x -> mod n x == 0) [1..n] main = do args <- getArgs -- unwrap from the IO let arg0 = head arg -- pull out just the first let n = read arg0 :: Int -- parse the int putStrLn $ show $ factors n
We should remind ourselves why factors are important. Wasn’t it the Babylonians who chose to use 360 for the number of degrees in a circle because 360 had so many ways to divide it evenly? They made it far easier to split a pizza amongst groups of various sizes.
Let’s do one last example, this time with a loop. Let’s print the first command-line parameter vertically. For, like, acrostics and crossword puzzles and stuff. We still don’t have loops, but we can use recursion. Here’s our function that “loops”:
downer :: String -> IO () downer [] = return () downer (first:rest) = do putChar first putStrLn "" downer rest
And our main
:
main = do args <- getArgs let arg0 = head args downer arg0
One of your homework problems requires a loop until the user gets a right answer. We can’t use pattern matching for that, so let’s rewrite this using a more flexible construct:
downer list = do if list == [] then return () else putChar first putStrLn "" downer rest
See you next time, when we discuss laziness and tail recursion!
P.S. It’s Haiku Friday!
Programs come in parts
Input, output, and a third
The fun between them