Expression Parser

A past section discussed a tiny compiler, but the treatment of the expression analyser (usually called the expression parser) was definitely wanting. This page is a little better.

The main requirement for an expression parser is defining operator priority. For example, multiplication has higher precedence than addition, the same precedence as division, but lower precedence than parentheses. Then there are three different cases to handle: Being a recursive descent parser, this version is certainly more flexible than the 'finite state machine' parser described in the original section.

Note that the more fundamental concept of LR1 grammar has been skipped, though some will argue it is common sense, really: It implies expressions are scanned from left to right, and (at most) one operator must be read ahead before deciding whether to deal with the present operation now or later.

Associativity is a rather secondary notion (for expression parsers): Most operators are left associative, meaning that:

a + b + c

is evaluated as:

(a + b) + c

If you are after an operator which is right associative, consider that some programming languages allow

a = b = c

to stand for

b = c

a = b

and clearly the assignment to b must take place first.

Of course, there are tools, such as the Unix yacc utility, which have been available for decades, and automatically generate code for an LR1 grammar. In fact, there is a fairly oldish book by Kernighan and Pike, which describes a fully programmable calculator in great detail. The title is probably The Unix programming Environment, but the copy at the disposal of the site was not in the English language.

Finally, the site has tidied up some code and presents it, though it is not close to the versatility provided by the example in the book by any means. Here, the version shown accepts real numbers, possibly in scientific format, but not variables, and addition, subtraction, multiplication, division, power (^), bitwise And (&), Or (|) and Xor (x) or any combination thereof, but no parentheses. (Is it easy to add more operators.)

Unary minus is handled by a zero prefix. This implies that logical operations will not work when the leftmost operand is negative, since subtraction has higher precedence than them. (Note that the code is much shorter than the final output from yacc.) As this was not the main point of the exercise, the routine which converts strings to real numbers has not been built from scratch.