Toward Blocks-Text Parity

July 31, 2017 by Chris Johnson. Filed under public, talks.

At COMPSAC 2017 in Torino, Italy, I gave a talk on a paper that a student and I wrote. Our work was a position paper responding to the folks who say that blocks programming languages are going to take over the world. These people do exist. The following is a rough manuscript of my talk.

In recent years, block-based programming environments have gotten significant attention. By block-based programming environments, I am referring to platforms like Scratch from MIT:

And Snap! from Berkeley:

And Pencil Code:

And Alice from Carnegie Mellon:

And the hundreds of others, including my own for a language I’m writing for programmatically generating 3D models:

The vast majority of these tools are marketed to young children learning how to program. For someone new to programming, blocks provide some significant advantages:

Blocks interfaces are based on simply recognizing commands listed in a toolbox, instead of recalling them from memory.
Syntax concerns virtually disappear. Programmers simply snap program structures together with no concern for semicolons, matching braces, or indentation.
Typing is minimized. One of the more painful experiences I’ve endured as a father is trying to teach my kids to code before they are familiar with the keyboard.
Blocks languages are usually tightly coupled to media manipulation and integrated into a development environment that provides immediate feedback. Many of us think these media-rich domains are more interesting and accessible to children than more abstract concepts, like internet routing protocols.

I think it’s noble that we’re building tools to help young people learn to program. But when I attend workshops on blocks programming, I hear something more. People say that soon professional developers will be using block-based languages. That text languages will die.

Now you—like my reviewers and like me—might think the last of these claims is a little absurd. But my country of the United States has shown me that what seems like an absurdity can become reality, even without the popular vote. I’ve been in these conversations pitting blocks against text, and I felt like I needed to sit down and think clearly about why I think that text languages will not be overtaken by blocks. That thinking led me to this paper. So here’s the overarching question:

What would it take for a developer to choose a blocks language over a text language?

This question seems to be too sensitive to a developer’s opinions, so allow me to scale it back a little to be more about languages themselves. I’m fairly confident that a developer would not use a blocks language that wasn’t as powerful as the text languages they currently use, so I revise my question to this:

What would it take for a blocks language to be just as powerful as a text language? And vice versa? To achieve blocks-text parity?

Right now a lot of the debates in computer science education about blocks vs. text are really debates between Scratch or Snap! and Java. These two classes of languages are very different, and their differences threaten the validity of any claims made about blocks languages. I think we need more languages that can be expressed in both blocks and text forms—which is only possible in languages that have blocks-text parity—for several reasons:

Because then we can truly control for the differences between blocks and text and discern their respective advantages.
Because then we can provide smoother transitions for learners going from blocks to text. Instead of jumping across languages, they can simply jump across interfaces.
Because then we can see how blocks as an interface scales up to general purpose languages.

Even if we don’t develop languages that support both blocks and text, I have a backup question:

How can we have rational debates about blocks and text?

Let’s walk through a catalog of four goals that we need to meet to achieve blocks-text parity—or at least have rational debates about the two.

Syntax

Altadmri and Brown collected error data from 37 million compilations in BlueJ, an IDE for learning object-oriented programming—Java, in particular. Syntax errors occurred 1.5 times more often than semantic errors—like reaching across scopes or referencing uninitialized variables. And both syntax errors and semantic errors occurred considerably more often than type errors.

Blocks evangelists will tell you that in a blocks language, this most prominent type of error mostly disappears. Blocks simply don’t let you forget a semi-colon. However, I have found that these evangelists are often implicitly comparing blocks to syntactically-intricate languages like C and Java, which are rich in punctuation unfamiliar to new programmers. Indeed, Altadmri and Brown identify the most common syntax error in their dataset as unbalanced parentheses. If you remove the mandatory parentheses and semi-colons from a language, one wonders how many syntax errors are left.

Here’s the entire table of errors from their paper:

Error C is unbalanced parentheses. It’s about twice as prevalent as error I, which is providing the wrong type or number of parameters in a method call. Error O is omitting a return statement. Error A is confusing the assignment operator = with the relational operator ==. This confusion can be prevented by using more distinct operators. I conclude that a text language can be designed so that syntax errors are not the primary concern. Even in Java, this table suggests that syntax errors are some of the most quickly fixed.
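As a small illustration of that design point, Python’s grammar rejects a bare = inside a condition, so the mix-up behind error A is caught before the program ever runs. The sketch below checks this with the built-in compile function. The helper name compiles and the two sample snippets are mine, made up for illustration.

```python
def compiles(source):
    """Return True if Python accepts the source text, False on a syntax error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Comparison with == is accepted by the grammar.
comparison = "if x == 5:\n    pass\n"

# Assignment with = inside a condition is rejected by the grammar.
confused = "if x = 5:\n    pass\n"

if __name__ == "__main__":
    print(compiles(comparison))
    print(compiles(confused))
```

This only shows that one text language made one such choice. It says nothing about how often the remaining errors in the table would occur.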

Goal #1

This leads us to parity goal #1: design text languages with less syntactic burden. And to debate goal #1: compare blocks to text languages with lighter syntax, like Python, and make sure syntax issues are not your only selling point of blocks. Discuss how blocks communicate the API, how they reduce typing, and how they lead the programmer along in the construction of a program.

Abstractions

Nearly all blocks languages are domain-specific. A domain-specific language has a small standard library. Scratch, for example, provides just over 100 commands, most of them dealing with manipulating sprites and sounds. Let’s see some examples of the abstractions we see in blocks languages.

Loops

Scratch’s domain centers on animations. To support the continued application of some behavior across all frames, it supports a forever block:

Most of our general purpose languages consider infinite loops to be poor design and don’t provide an abstraction for them. To do something similar in Python, we’d have to write:

while True:
  # repeated step

To a 5-year-old, the difference between forever and while True is not insignificant.

Events

Our blocks platforms also tend to provide very convenient hooks that let programmers write very short and very specific code to be run on certain events. For example, here are some event blocks from an exercise on Code.org:

Providing such direct abstractions is possible because the number of events is small. Our general purpose text languages don’t have the luxury of being so tidy. In Java, we’d have to hook up a key listener:

public class GameWindow {
  public void keyPressed(KeyEvent e) {
    if (e.getKeyCode() == KeyEvent.VK_UP) {
      ...
    }
  }
}

Oh, but because we’re in Java, we need to fall into a polymorphic hierarchy:

public class GameWindow implements KeyListener {
  public GameWindow() {
    addKeyListener(this);
  }

  public void keyPressed(KeyEvent e) {
    if (e.getKeyCode() == KeyEvent.VK_UP) {
      ...
    }
  }
}

Oh, but because we’re in Java, we need to provide implementations of all the methods of the interface, so really we need this:

public class GameWindow implements KeyListener {
  public GameWindow() {
    addKeyListener(this);
  }

  public void keyPressed(KeyEvent e) {
    if (e.getKeyCode() == KeyEvent.VK_UP) {
      ...
    }
  }

  public void keyReleased(KeyEvent e) {}

  public void keyTyped(KeyEvent e) {}
}
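For contrast, here is a rough sketch of what event hooks might look like in a small text language or library tuned to the same narrow domain. Everything in it is hypothetical: when_key, press, and the position dictionary are names I invented for illustration, not part of any real framework. The point is only that the directness of the Code.org blocks comes from the small set of events, which text can mirror too.

```python
# Hypothetical mini-API: none of these names come from a real library.
_handlers = {}


def when_key(key_name):
    """Decorator that registers a function to run when key_name is pressed."""
    def register(handler):
        _handlers.setdefault(key_name, []).append(handler)
        return handler
    return register


def press(key_name):
    """Simulate a key press by running every handler registered for it."""
    for handler in _handlers.get(key_name, []):
        handler()


position = {"x": 0, "y": 0}


@when_key("up")
def move_up():
    position["y"] += 1


@when_key("left")
def move_left():
    position["x"] -= 1


if __name__ == "__main__":
    press("up")
    press("up")
    press("left")
    print(position)
```

A real environment would call press from its own input loop. Here it is called by hand so the sketch stays self-contained.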

Goal #2

One of the more serious concerns I have about blocks evangelism is that what we think are advantages of the blocks interface are really just advantages of small languages that are carefully tuned to their domain. Just last week I had a conversation at a grading conference for a national exam in the United States with several teachers who couldn’t wait for a blocks language to knock out a mainstream text language. I fear that teachers are becoming anti-text simply because the easiest distinction to make between Scratch and Java is that one uses blocks and the other uses text. There are other differences, and the size of the domain to which they are currently applied is probably the most significant.

That leads us to parity goal #2: design small text languages for manipulating media. A library of utility classes and functions is not sufficient. The core language must itself be small if we are going to be able to compare blocks and text. Or design a blocks language to be general purpose and see how it fares among users. Debate goal #2: consider language size and a code editor’s interface as separate user-experience concerns.

Imperative

Many of our blocks platforms force a very imperative style of programming. In particular, they have much narrower semantics with regard to statements and expressions than we find in mainstream text languages. In Pencil Code, here I am trying to create a square function using the blocks interface:

But it’s not working. The function block is expecting a statement block. Unfortunately, multiplication is an expression block, and there’s no way to place an expression block inside the function block. There’s no special return block, and expression blocks can’t be turned into statement blocks. Interestingly, if I write the function in text first, it translates to blocks exactly as I’d hope:

But I can’t write what I want without resorting to text.

In Scratch, the semantics are narrower still. Scratch supports procedures, but not functions:

There’s no way to write a subroutine that returns a value. Even procedures have not always been supported in Scratch.

Goal #3

Our current blocks platforms teach a very imperative style of programming. That leads us to parity goal #3: allow blocks to be amphibious. That is, let them operate as both statements and expressions. Consider a stack data structure. The pop method is sometimes used just to throw away the top element:

You see how it’s got the previous and next connectors to embed it inside a sequence of statements.

But sometimes we need to also process the top element:

Here pop has a value connector so that it may be treated as an expression.

To support both roles, we either need two forms of each block, or we need a way to dynamically alter a block’s connectors.
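For comparison, most text languages already let a call play both roles without any extra ceremony. A minimal Python sketch, with an ordinary list standing in for the stack and variable names of my own choosing:

```python
stack = [1, 2, 3]

# Statement role: the popped value is discarded.
stack.pop()

# Expression role: the popped value feeds a larger expression.
doubled = stack.pop() * 2
```

The same pop appears in both places. Nothing about it had to be declared as a statement or as an expression ahead of time.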

If we aren’t going to change our languages, then let’s at least achieve debate goal #3: acknowledge that blocks frameworks tend to focus on the programming of side effects, a much narrower task than what we expect of text languages.

Advanced Features

For our last goal, let’s consider a few more advanced semantic constructs that we rarely see in blocks languages.

Lvalues

All of the blocks languages I have used have a very narrow interpretation of what programming language designers call lvalues. Lvalues are the values that live in memory somewhere. Lvalues are in contrast to rvalues, the transient values that are the result of some intermediate computation. Roughly, lvalues are the things that we can assign values to, the things that can appear on the left-hand side of an assignment statement.

In most blocks languages, the only thing that can serve as an lvalue is a variable identifier. Consider these blocks from Tynker:

Where you see i, I could only have entered a variable name. But our mainstream text languages support other kinds of lvalues. Like these four in C++:

*p = 17;             // pointers
freq[5]++;           // subscripting
prefs["width"] = 80; // references
position.x = 0;      // field access

Of these, only subscripting tends to be supported in blocks languages, but interestingly, these blocks languages provide two different setter blocks: one for assigning variables and one for assigning array elements. But from a programmer’s standpoint, an assignment is an assignment.
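The same uniformity shows up in other text languages. Here is a rough Python analog of the C++ lines, assuming a made-up Position class and sample data. Python has no pointer dereference, so that case is replaced by a plain variable.

```python
from dataclasses import dataclass


@dataclass
class Position:
    # Hypothetical record type, used only to demonstrate field assignment.
    x: int = 5
    y: int = 5


freq = [0] * 10
prefs = {}
position = Position()

# One family of assignment operators works on every kind of target.
count = 17             # plain variable
freq[5] += 1           # subscripting a list
prefs["width"] = 80    # keyed lookup in a dictionary
position.x = 0         # field access
```

A blocks language aiming for parity would need a single setter block whose left-hand slot accepts any of these targets.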

Closures

I’m only aware of two blocks languages that support closures: Pencil Code and Snap!. A closure is a function that can access variables from the surrounding scope. But many blocks languages don’t allow functions to be defined in line at all, meaning there is no surrounding scope possible. Notice the absence of previous and next connectors on procedure blocks in Scratch:

Procedures in Scratch are lonely islands of code. Any variables they access must be parameters or globals. But we use closures quite often in text languages. One of the most common situations in which we use closures is in registering callbacks, as we do here in JavaScript:

function sayAfter(message, delay) {
  setTimeout(function() {
    alert(message);
  }, delay);
}

This function schedules an alert to pop up after a few seconds. It only works because our anonymous function can close up around message and access it after the delay. This kind of programming simply isn’t possible in Scratch. We could make message global, but then it would be vulnerable to changes from other code.
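To make the vulnerability of the global workaround concrete, here is a small Python sketch that contrasts the two approaches. It is a simplification under my own assumptions: a plain list named pending stands in for the timer queue, and every function name is invented for illustration.

```python
pending = []  # stand-in for a timer queue; a real program would use a scheduler


def say_after(message, log):
    """Schedule a callback that closes over message."""
    def callback():
        log.append(message)
    pending.append(callback)


def run_pending():
    """Run and clear every scheduled callback, in order."""
    while pending:
        pending.pop(0)()


# Global-variable alternative: the callback reads whatever the global holds
# at the time it runs, so other code can change what gets said.
shared_message = "hello"


def set_shared_message(text):
    """Represents unrelated code that overwrites the global."""
    global shared_message
    shared_message = text


def say_global_after(log):
    """Schedule a callback that reads the global when it eventually runs."""
    pending.append(lambda: log.append(shared_message))


if __name__ == "__main__":
    log = []
    say_after("hello", log)
    say_global_after(log)
    set_shared_message("overwritten")
    run_pending()
    print(log)
```

The closure keeps its own message no matter what happens in between. The global version reports whatever value the variable holds when the callback finally fires.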

Goal #4

This leads us to parity goal #4: design blocks languages that have the same advanced features as the text languages to which we hope to compare them. Debate goal #4: acknowledge that a blocks language that supports all the features of our text languages would not be very fun to use and might compromise the advantages of blocks, but that those features are important to a lot of developers.

Conclusion

That’s four. There are a handful of other issues that frustrate our comparison of blocks and text languages, like the tight coupling between a blocks language and an IDE, the minimal type system that novice programmers encounter in blocks languages, and the relative difficulty of copying and pasting sequences of blocks. But I’ll leave it at these four for the time being.

Will blocks be the future of programming, even for expert programmers? Blocks work well for an imperative style of programming in a small domain. But will they scale to support the full semantics of our text languages? Our intuition probably wants us to give an answer, but I believe that our current understanding is too narrow to say. But at least now I have a list of things with which I can experiment to better compare the two.