Parsing Expressions
In the last post we defined the shape of a syntax tree — Binary, Unary, Literal, Grouping — but built every tree by hand. In this post we focus on the parser: the thing that reads tokens and builds the tree automatically.
The problem
1 + 2 * 3 can be read two ways:
(1 + 2) * 3 = 9 Incorrect
1 + (2 * 3) = 7 CorrectNothing in the token list says which is correct. You need a rule for precedence.
The trick: who calls who first
Two functions: DoPlus() handles +, DoTimes() handles *. The entire precedence system comes down to one decision:
DoPlus()callsDoTimes()first, before doing anything else.
private Expr DoPlus()
{
Expr left = DoTimes(); // forced to ask DoTimes first
while (Match(TokenType.PLUS))
{
Token op = Previous();
Expr right = DoTimes(); // forced again
left = new Expr.Binary(left, op, right);
}
return left;
}
private Expr DoTimes()
{
Expr left = Primary();
while (Match(TokenType.STAR))
{
Token op = Previous();
Expr right = Primary();
left = new Expr.Binary(left, op, right);
}
return left;
}Tracing 1 + 2 * 3:
DoPlusasksDoTimesfor its left sideDoTimeslooks at1, sees no*after it, returns1aloneDoPlusconsumes the+DoPlusasksDoTimesfor the right side- This time
DoTimessees2 * 3— it glues those together into oneBinarynode before handing anything back DoPlusends up combining1with the already-sealedBinary(2, *, 3)
+
/ \
1 *
/ \
2 3Multiplication didn't "win" by checking priority. It won because it got first access to the tokens and locked them together before addition ever got a turn.
That's the whole mechanism. The real grammar just stacks more layers the same way — equality calls comparison, which calls term (this is DoPlus), which calls factor (this is DoTimes), which calls unary, which calls primary. Each layer hands control down before doing its own work, so by the time a low-priority operator combines anything, every higher-priority operator nearby has already sealed its own piece.
This post is based on Chapter 6 of Crafting Interpreters by Robert Nystrom, one of the best programming books I've read. Highly recommended if you've ever wanted to understand how languages actually work.