[ Jocelyn Ireson-Paine's Home Page | Publications | Dobbs Code Talk Index | Dobbs Blog Version ]

# Stack Machines, Expression Evaluation, and the Magic of Reverse Polish

To teach a friend how compilers work, I've written a compiler for a small subset of Pascal. It lexically analyses and parses a program, code-generates the syntax tree into code for a stack machine not unlike the Ocode virtual machine once used by C's ancestral language BCPL, and interprets the code. I'm going to write a few essays about the compiler and its target machine; in this first one, I'll demonstrate how a stack machine evaluates expressions.

A good place to start is reverse Polish. As many readers will know, this is a notation for expressions whereby all operators and functions follow their arguments. It's important because once we have the reverse Polish form of an expression, we can trivially transcribe it into a program that evaluates the expression.

Here's a very simple example. The reverse Polish form of the expression

```11 * 22
```
is
```11 22 *
```

This works with any number of operators. So, for example, the reverse Polish form of the expression

```11 * 22 + 33
```
is
```11 22 * 33 +
```

Notice that this is the reverse Polish for `E + 33`, which is `E 33 +`, with `E` replaced by the reverse Polish for `11 * 22`. Reverse Polish has this lovely recursive structure.

What's magic about this is that we can transcribe any reverse-Polish expression into a machine-code program that evaluates the expression. Providing, that is, that our machine has a stack.

So, assume a stack. Let there be a `loadconst` instruction, which pushes a constant onto the stack. And let there be an `add` instruction, which pops the top two elements off the stack, adds them, and pushes the result back. Let `mult` operate similarly. And let us transcribe the reverse Polish above, replacing constants by `loadconst`, and renaming operators the corresponding instruction.

Doing this,

```11 22 * 33 +
```
becomes
```loadconst 11 loadconst 22 mult loadconst 33 add
```

I fed the instruction sequence above to my stack-machine interpreter, and this is what it displayed. The stack, which starts off empty, is shown as a list, with the topmost element on the left:

```Stack: []

Stack: 

Stack: [22, 11]

Stack: 

Stack: [33, 242]

Stack: 
```

Here's a bigger example. I'm going to evaluate this expression:

```((11 * 22 + 33 * 44) / 121 = 14)
and
(10 = 20 / 2)
```

And here is the same expression rewritten in reverse Polish. The line breaks have no significance, they just stop me overflowing the blog's right-hand margin:

```11 22 * 33 44 * +
121 /
14 =
10 20 2 /
=
and
```

This is the reverse Polish transcribed into machine instructions:

```loadconst 11
mult
mult
div
eq
div
eq
logand
```

I've used a greater variety of instructions here. The `div` instruction divides the element next to the top of the stack by the element on the top, popping them and pushing back the integer quotient. The `eq` instruction compares the top two elements, pops them, and pushes back 1 (signifying TRUE) if they are equal, 0 (signifying FALSE) otherwise. And the `logand` instruction does a logical "and", popping the top two elements and pushing back 1 if they are both 1, otherwise 0.

Now here is my interpreter interpreting those instructions:

```Stack: []

Stack: 

Stack: [22, 11]

Stack: 

Stack: [33, 242]

Stack: [44, 33, 242]

Stack: [1452, 242]

Stack: 

Stack: [121, 1694]

Stack: 

Stack: [14, 14]

Stack: 

Stack: [10, 1]

Stack: [20, 10, 1]

Stack: [2, 20, 10, 1]