Madeup Status Report #2
Madeup saw a major behind-the-scenes shift this fortnight: I switched to a handwritten lexer and parser! I can honestly say my life has improved significantly since the change. Parser generators are great, but I’m not sure decoupling a grammar specification from the code that recognizes utterances conforming to that grammar is really what we want to do. To generate meaningful error messages and make on-the-fly parsing decisions using semantic predicates, we need to be fully informed about the parse. We don’t get much context information in a grammar specification without tightly coupling it to the parser code.
We also see a major visible update: Madeup has initial support for a block-based code editor. Google’s Blockly made defining custom blocks and generating Madeup programs pretty straightforward. Both text and block editors will be actively supported, but I don’t anticipate supporting translation back and forth between the two modes. Frameworks like Droplet can help with this, but I have a list of higher priority issues I need to address first.
Below is a breakdown of the week’s progress and the model of the week.
- Enables grid and axis drawing in web IDE.
- Adds precedence to Blockly blocks for better parenthesizing.
- Persists block program between sessions.
- Makes block colorings consistent across statements and expression types.
- Fixes block-to-text mode switching.
- Integrates Blockly and grammar into Madeup interface.
- Examines Droplet and Blockly as possible providers for a block interface. Droplet supports text-to-block two-way conversion, which would be nice, but requires another lexer/parser. I can’t write two in one week. Settles on Blockly.
- Implements a Blockly blocks grammar.
- Migrates all constructor and method definitions from headers to implementation files. I had kept declarations and definitions together while I figured out the interfaces of my classes. Since these interfaces have more or less stabilized, I decided it was time to get the definitions out of the headers for faster compilation.
- Switches the style of reference declarations to type &id. I’ve vacillated on this for years, but I’ve adopted putting the & next to the identifier because in a line with multiple declarations, & and * bind to the identifier, not to the type:

      int *i = NULL, j = atoi("56"); // i is a pointer to int, j is an int
      int &k = j, h = j;             // k is a reference to j, h is a copy
      ++k;                           // increments j through the reference
      ++h;                           // increments only the copy
      // prints "0x0 57 57 57" (how a null pointer prints varies by platform)
      std::cout << i << " " << j << " " << k << " " << h << std::endl;

  Of course, I never declare multiple variables in a single line, but that the language doesn’t associate & with the type is sufficient reason to put it next to the identifier.
- Incorporates regression testing for lexer, parser, and interpreter output.
- Fixes global environment setup for builtin functions.
- Migrates to a new style: local_identifier. This is a departure from the wxWidgets style that I was born using.
- Renames a bunch of methods.
- Fixes parameterless call to be a call-with-named-parameters so that it can pick up implicit parameters from environment. Previously, it was a call-with-positional-parameters and implicit parameters were ignored. Only the presence of at least one name (moveto x:2) rendered it a call-with-named-parameters.
- Fixes a bug in call-with-named-parameters that evaluated the closure under the dynamic environment instead of the closure’s.
- Fixes missing end source locations on call expressions without parameters.
- Adds source to parser so it can generate better error messages.
- Adds informative error messages to all illegal recursive descent branches.
- Adds source locations to AST.
- Cleans up error messages from failing dynamic checks.
- Recognizes both GNU and Clang in CMake configuration.
- Introduces new C++-based interpreter.
- Supports installation of a web client on Mac and Linux.
- Switches to static linking to avoid rpath mess on Mac OS.
- Introduces new handwritten parser.
- Introduces new handwritten lexer. After I found my interpreter hanging on a student’s *.mup source file, I discovered that the problem was not in the interpreter itself. The parser was taking far too long to dissect a 20-term expression. I had been using ANTLR to generate the lexer and parser, and newer versions have introduced infinite lookahead, which can slow parsing down. I was loath to abandon this tool, but switching to a handwritten lexer and parser provides several benefits: 1) I can remove the Java and JNI layers that I propped up to deliver the parsing results to my existing C++ library, which will make debugging far easier, 2) I get to learn about writing my own lexer and parser, which is invaluable to me as an educator, 3) parsing will be faster, and 4) embedding meaningful error messages will be far easier since the grammar specification will not be decoupled from the parsing code.
- Migrates codebase over to GitHub repository.
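The item about adding precedence to blocks for better parenthesizing can be illustrated with a small sketch. This is not Blockly's API or Madeup's generator; the `Expr` struct, the numeric precedence values, and the `generate` function are all hypothetical names chosen for illustration. The idea is that a code generator wraps a child expression in parentheses only when the child binds more loosely than its parent requires.

```cpp
#include <memory>
#include <string>

// Hypothetical expression node: a leaf (number or identifier) or a binary operation.
struct Expr {
  std::string text;            // leaf text, or the operator symbol for binary nodes
  int precedence;              // higher binds tighter; leaves get the highest value
  std::shared_ptr<Expr> left;  // null for leaves
  std::shared_ptr<Expr> right; // null for leaves
};

const int LEAF_PRECEDENCE = 100; // assumed value, larger than any operator's

std::shared_ptr<Expr> leaf(const std::string &text) {
  return std::make_shared<Expr>(Expr{text, LEAF_PRECEDENCE, nullptr, nullptr});
}

std::shared_ptr<Expr> binary(const std::string &op, int precedence,
                             std::shared_ptr<Expr> l, std::shared_ptr<Expr> r) {
  return std::make_shared<Expr>(Expr{op, precedence, l, r});
}

// Emit source text, parenthesizing a node only when its precedence is lower
// than the minimum its parent can accept in that position.
std::string generate(const Expr &e, int min_precedence = 0) {
  std::string out;
  if (!e.left) {
    out = e.text;
  } else {
    // Operators are treated as left-associative, so the right operand must
    // bind strictly tighter than the operator itself.
    out = generate(*e.left, e.precedence) + " " + e.text + " " +
          generate(*e.right, e.precedence + 1);
  }
  return e.precedence < min_precedence ? "(" + out + ")" : out;
}
```

With `+` and `-` at precedence 1 and `*` at precedence 2, a product whose left operand is a sum comes out as `(1 + 2) * 3`, while a sum whose right operand is a product comes out as `1 + 2 * 3` with no extra parentheses.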
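The fix that lets a call pick up implicit parameters from the environment might look roughly like the following. This is a sketch under assumptions, not the interpreter's actual code: `Bindings` and `bind_parameters` are hypothetical names, values are simplified to `double`, and the second formal `y` is invented for the example (only `moveto x:2` appears in the report).

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using Bindings = std::map<std::string, double>; // hypothetical name-to-value table

// Resolve each formal parameter of a call-with-named-parameters. An explicitly
// named argument wins; otherwise fall back to a variable of the same name in
// the calling environment (the "implicit parameter").
Bindings bind_parameters(const std::vector<std::string> &formals,
                         const Bindings &named_args,
                         const Bindings &environment) {
  Bindings bound;
  for (const auto &formal : formals) {
    auto arg = named_args.find(formal);
    if (arg != named_args.end()) {
      bound[formal] = arg->second;
      continue;
    }
    auto implicit = environment.find(formal);
    if (implicit != environment.end()) {
      bound[formal] = implicit->second;
      continue;
    }
    throw std::runtime_error("no value for parameter '" + formal + "'");
  }
  return bound;
}
```

Under this scheme a bare `moveto` with `x` and `y` defined in the environment binds both implicitly, and `moveto x:2` overrides only `x`. Treating the parameterless call as a named call simply means it goes through this same path with an empty `named_args`.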
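The closure bug mentioned above (evaluating under the dynamic environment instead of the closure's) is easy to reproduce in miniature. The types below are hypothetical stand-ins, with integer values and a `std::function` body in place of a real AST; they only show the difference between the two choices of parent environment.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical environment: a chain of name-to-value maps.
struct Environment {
  std::map<std::string, int> bindings;
  std::shared_ptr<Environment> parent;

  int lookup(const std::string &name) const {
    auto it = bindings.find(name);
    if (it != bindings.end()) return it->second;
    if (parent) return parent->lookup(name);
    throw std::runtime_error("undefined: " + name);
  }
};

// A closure pairs a function body with the environment in which it was defined.
struct Closure {
  std::function<int(const Environment &)> body;
  std::shared_ptr<Environment> defining_env;
};

// The buggy behavior: the body's frame chains to the caller's environment,
// so free variables resolve to whatever the caller happens to have in scope.
int call_dynamic(const Closure &c, const std::shared_ptr<Environment> &caller_env) {
  Environment frame{{}, caller_env};
  return c.body(frame);
}

// The fixed behavior: the body's frame chains to the closure's own
// environment, so free variables resolve as they did at definition time.
int call_lexical(const Closure &c) {
  Environment frame{{}, c.defining_env};
  return c.body(frame);
}
```

If a closure that computes `n + 1` is defined where `n` is 1 and then called from a scope where `n` is 99, `call_lexical` yields 2 while `call_dynamic` yields 100.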
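To make the parser items concrete (keeping the source in the parser, reporting locations, and attaching a message to every illegal recursive descent branch), here is a minimal sketch. It is not Madeup's grammar or code: the `Parser` class, its tiny arithmetic grammar, and the `error_for` helper are assumptions for illustration, and it fuses lexing, parsing, and evaluation into one pass where a real implementation would build an AST.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hand-written recursive-descent evaluator for + and * over integers.
class Parser {
 public:
  explicit Parser(const std::string &source) : source_(source), i_(0) {}

  int parse() {
    int value = expression();
    skip_spaces();
    if (i_ < source_.size()) fail("expected end of input");
    return value;
  }

 private:
  // expression := term ('+' term)*
  int expression() {
    int value = term();
    while (peek() == '+') { ++i_; value += term(); }
    return value;
  }

  // term := factor ('*' factor)*
  int term() {
    int value = factor();
    while (peek() == '*') { ++i_; value *= factor(); }
    return value;
  }

  // factor := INTEGER | '(' expression ')'
  int factor() {
    char c = peek();
    if (std::isdigit(static_cast<unsigned char>(c))) {
      int value = 0;
      while (i_ < source_.size() &&
             std::isdigit(static_cast<unsigned char>(source_[i_])))
        value = value * 10 + (source_[i_++] - '0');
      return value;
    }
    if (c == '(') {
      std::size_t open = i_++; // remember where the group started
      int value = expression();
      if (peek() != ')')
        fail("expected ')' to close '(' at column " + std::to_string(open + 1));
      ++i_;
      return value;
    }
    // Every dead-end branch names what was expected and what was seen.
    fail(c == '\0' ? "expected a number or '(' but reached end of input"
                   : std::string("expected a number or '(' but found '") + c + "'");
  }

  // Skip whitespace, then return the next character ('\0' at end of input).
  char peek() {
    skip_spaces();
    return i_ < source_.size() ? source_[i_] : '\0';
  }

  void skip_spaces() {
    while (i_ < source_.size() &&
           std::isspace(static_cast<unsigned char>(source_[i_])))
      ++i_;
  }

  // Because the parser owns the source and its position, errors carry a column.
  [[noreturn]] void fail(const std::string &message) const {
    throw std::runtime_error("column " + std::to_string(i_ + 1) + ": " + message);
  }

  std::string source_;
  std::size_t i_;
};

// Returns the error message for a source string, or "" if it parses cleanly.
std::string error_for(const std::string &source) {
  try {
    Parser(source).parse();
    return "";
  } catch (const std::runtime_error &e) {
    return e.what();
  }
}
```

For example, `error_for("(1 + 2")` gives `column 7: expected ')' to close '(' at column 1`, and `error_for("2 * * 3")` gives `column 5: expected a number or '(' but found '*'`. Because each grammar rule is an ordinary function, the message can use whatever context that function has on hand, which is the kind of coupling a separate grammar specification makes awkward.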
Model of the Week
My sons are enamored with DragonBox Elements, a geometry game for iOS and Android. Its players build up an army of triangles and quadrilaterals by tracing them out and proving properties about their sides and angles. I asked my oldest, “What shape has three sides and three right angles?” He said it wasn’t possible. We then grabbed a racquetball and I asked him to trace out this algorithm and shape:
It’s got three corners and three sides, so it’s a triangle, right? He wasn’t sure.