teaching machines

CS 330 Lecture 28 – Haskell IO

April 14, 2017 by . Filed under cs330, lectures, spring 2017.

Dear students,

There are a few ideas we didn’t get to last time. Let’s touch upon those before diving into a discussion of Haskell I/O.

First, what about the type signature for fold? Let’s reason it out. We know it will have four parts: three parameters and a return value.

fold :: _ -> _ -> _ -> _

What’s the type of the zero value? It’s generic, so we’ll call it a:

fold :: _ -> a -> _ -> _

We have enough information to complete another of the blanks. In the base case, the zero value is going to be returned. That means the return type better match the zero value’s type:

fold :: _ -> a -> _ -> a

How about the list? It too is generic and doesn’t have to be the same as the zero value, as demonstrated by length. We’ll call it b:

fold :: _ -> a -> [b] -> a

Time for the last blank. It’s a function. It takes in an element from the list (a b) and the result of a recursive call (an a), and it’s result becomes fold‘s result (an a). That means we’ve got this for a type signature:

fold :: (b -> a -> a) -> a -> [b] -> a

Second, sometimes we want to summarize a collection and use one of the elements as the summary value. Think biggest and smallest:

biggest [] = error "not possible"
biggest (first:[]) = first
biggest (first:rest) = max first $ biggest rest

smallest [] = error "not possible"
smallest (first:[]) = first
smallest (first:rest) = min first $ smallest rest

We can define these with fold too!

biggest [] = error "not possible"
biggest (first:rest) = fold max first rest

smallest [] = error "not possible"
smallest (first:rest) = fold min first rest

We could also abstract out this special case of folding that expects a list of at least one element:

fold1 [] = error "not possible"
fold1 mix (first:rest) = fold mix first rest

biggest = fold1 max
smallest = fold1 mi

Third, many of you didn’t think much of closures. I want to share an example that should demonstrate to you that they are actually quite nice and that you’ve probably been using them all along.

Consider the problem of creating a compound slider/textbox widget in HTML and Javascript:

<!DOCTYPE html>
<html>
<head>
  <title>...</title>
  <style>

form {
  background-color: cornflowerblue;
  border-radius: 20px;
  width: 40%;
  text-align: center;
  padding: 20px;
}

  </style>

</head>
<body>

  <form>
    <input id="slider" type="range"><br>
    <input id="box" type="text" size="3">
  </form> 
  
  <script>

var slider = document.getElementById('slider');
var box = document.getElementById('box');

slider.oninput = function() {
  box.value = slider.value;
}

box.oninput = function() {
  slider.value = box.value;
}

  </script>
</body>
</html>

These callback functions are closures. They rely on slider and box being accessible. There’s nothing crazy here. This code is not suprising, and it is pleasant to write. Let’s consider our alternatives. We have a few choices about how to handle scope when we define functions:

  1. We could disallow free variables, forcing everything come in through parameters. The thing is, we aren’t always the ones calling these functions, and passing parameters to them is not an option.
  2. We could bind the free variables in scope at the time of the function’s definition. That’s a closure.
  3. We could bind the free variables in scope at the time of the function’s invocation. That’s called dynamic scoping, and it is too wild.

Alright, I don’t think we can postpone discussing Haskell IO any longer. Up till this point we haven’t written any standalone programs. Nothing that gets input from the user, nothing that generates output. We have looked at the purely functional side of Haskell, where time does not exist. Functions always produce the same value given the same inputs. There’s no notion of a counter that increments. Who on Earth would want a language like this?

  1. People who want to mathematically prove their programs are correct. Proving correctness is very hard to do if you allow RAM to get twiddled with through assignment statements—especially if pointers are involved.
  2. People who want their programs to run very quickly. Far fewer copies of data need to be made if you know it will never change. Functional programming promotes an unparallelled ethic of sharing. There’s no need to insulate one environment from another, because no damage can be done. Imagine if you could never catch a disease. You wouldn’t mind hanging out with lepers.
  3. People who want their programs to easily be distributed across multiple machines. Imagine we have three computers processing a dataset. If one of them updates a variable that the others rely on, we have a hazard on our hands. How do we make sure the other machines get the update? We have to use some sort of synchronization to ensure consistency. But if we don’t allow updates, each machine can do its work without concern for the others.

That said, at some point our feet need to touch earth. Time is a thing, and state does change. Haskell does recognize this, but it becomes a different language. It becomes the language of actions. In Haskell, an action produces a side effect. The most obvious side effect is console output. We can use putStrLn to achieve this:

main = putStrLn "Goodbye, Pluto!"

If we want to issue several side effects, we can sequence them together in a do block:

main = do
  putStrLn "Thanks, Obama!"
  putStrLn "You're welcome!"

This should look more like what you’re used to, right?

Other kinds of side effects exist. Like opening a web browser with openBrowser, which is provided in module Web.Browser by the open-browser package:

import Web.Browser

main = do
  putStrLn "Thanks, Obama!"
  openBrowser "https://www.youtube.com/watch?v=79DijItQXMM"
  putStrLn "You're welcome!"

The documentation shows what kind of thing openBrowser returns:

openBrowser :: String -> IO Bool

Not a Bool, but an IO Bool. Haskell is serious about about keeping separate the world of side effects and the world of truth and identity. If you go get some data from the side effect world, it is tainted. Any result that a side effecting function gives back is packaged up in a capsule—an IO—that isolates the contagion.

Time is another side effect. Can you think of any functions from your imperative languages that alter time? Haskell’s got those too. Check out threadDelay:

threadDelay :: Int -> IO ()

That set of parentheses is the empty tuple. Sometimes it’s called unit. You can think of it as void. This function is used purely for its side effects, giving back no result. But notice that even that non-result is tainted. Also note that it takes milliseconds, so that number has to be pretty big to be perceptable:

main = do
  putStrLn "Thanks, Obama!"
  threadDelay 3000000

Dealing with the running process’ environment is also done in the world of side effects. Suppose we want to get the user’s username. We can use functions from the System.Posix.User module. Its documentation tells us of the getLoginName function:

getLoginName :: IO String

It gives us back a tainted string. Hmm… Let’s try using it:

main = do
  putStrLn "Thanks, Obama!"
  threadDelay 3000000
  putStrLn $ getLoginName

This doesn’t work. putStrLn doesn’t want your tainted junk. It wants a pure string. What do we do? We must open the sealed capsule. The bind operator <- does that for us:

main = do
  putStrLn "Thanks, Obama!"
  threadDelay 3000000
  username <- getLoginName
  putStrLn $ "You're welcome, " ++ username ++ "!"

The bind operator effectively unwraps the data from its IO seal and adds a new variable into our execution environment. But here’s the thing: we can only bind like this inside another IO action. We can’t open up one of these capsules inside the world of truth and identity. An IO action represents state, and Haskell makes it impossible to introduce state into the cleanroom that is pure functional programming.

What’s another side effect? How about input? There’s getLine. Can you guess what its signature is?

getLine :: IO String

To echo out the typed line, we do this:

main = do
  line <- getLine
  putStrLine line

What if we wanted an Int from the user? Well, we can write our own little helper that uses getLine and then parses it:

getInt :: IO Int
getInt = do
  line <- getLine
  ???

But then what do we do with it? Remember the show function? It turns values of many different types into a string representation. Here we have the string and we want to go in reverse. The opposite of show is read. However, since the view of the relationship in this direction is one-to-many, we must give read a clue about what type we want:

getInt :: IO Int
getInt = do
  line <- getLine
  let int = read line :: Int
  ???

Also weird here is our use of let and =. Variable line isn’t locked up in an IO action here, so can use regular assignments. Except regular assignments inside IO actions must be preceded by the let keyword. Go figure.

Okay, now what about our return value? We have a beautiful and pure Int, but it was born in the world of taint. So, we must encapsulate it. That’s what return is for:

getInt :: IO Int
getInt = do
  line <- getLine
  let int = read line :: Int
  return (int)

return has this type signature:

return :: a -> IO a

It’s not really like a return in imperative languages, as it doesn’t do anything with control flow. It just generates a tainted capsule. If that capsule is the last expression of the do block, it will get sent back to the caller, just as with pure functions.

Let’s deal with command-line parameters. This, like getting the username, involves communicating with the process, which is a stateful thing. We enter the world of side effects. The getArgs function in System.Environment has this signature:

getArgs :: IO [String]

Let’s extract out the first argument and turn into an Int:

main = do
  args <- getArgs              -- unwrap from the IO
  let arg0 = head arg          -- pull out just the first
  let n = read arg0 :: Int     -- parse the int

Erm… We should do something interesting here. Let’s print out all the factors of n. How can we do this in Haskell? Through filtering!

factors n = filter (\x -> mod n x == 0) [1..n]

main = do
  args <- getArgs              -- unwrap from the IO
  let arg0 = head arg          -- pull out just the first
  let n = read arg0 :: Int     -- parse the int
  putStrLn $ show $ factors n

We should remind ourselves why factors are important. Wasn’t it the Babylonians who chose to use 360 for the number of degrees in a circle because 360 had so many ways to divide it evenly? They made it far easier to split a pizza amongst groups of various sizes.

Let’s do one last example, this time with a loop. Let’s print the first command-line parameter vertically. For, like, acrostics and crossword puzzles and stuff. We still don’t have loops, but we can use recursion. Here’s our function that “loops”:

downer :: String -> IO ()
downer [] = return ()
downer (first:rest) = do
  putChar first
  putStrLn ""
  downer rest

And our main:

main = do
  args <- getArgs
  let arg0 = head args
  downer arg0

One of your homework problems requires a loop until the user gets a right answer. We can’t use pattern matching for that, so let’s rewrite this using a more flexible construct:

downer list = do
  if list == [] then
    return ()
  else
    putChar first
    putStrLn ""
    downer rest

See you next time, when we discuss laziness and tail recursion!

Sincerely,

P.S. It’s Haiku Friday!

Programs come in parts
Input, output, and a third
The fun between them