back

Visual Compiler

After taking this online compilers class, I thought it would be fun to build compiler for a simple language that displays output of some intermediate compiler phases.

A Simple Calculator Language

The language I chose to implement is a simple one that only supports functions, if-then-else expressions, addition, subtraction, and integer constants. The body of each function is a single expression.

Parsing

The first step of the compiler is lexing and parsing which I implemented with Jison. Jison and similar tools like Bison and Yacc require you to specify the grammar of your language.

A specification for the calculator language would look something like:

  • A program is made of one or more functions
  • Each function may have parameters, and has a single expression for the body
  • An expression can be addition, subtraction, if-then-else, function invocation, parameter ID, or a constant.
  • An addition expression has left-hand and right-hand side expressions.
  • And so on...

Jison takes this specification and returns a parse tree.

Static Analysis

Static analysis is run over the parse tree to ensure that the program follows some rules of the language.

The rules are:

  • There should be one function named Main that takes no parameters
  • Function names should be unique
  • Parameter names for each function should be unique
  • Functions should be invoked with the correct number of arguments
  • All variables in a function should refer to one of the function's parameters

You can click on the example "Static Analysis Errors" below to see a program that breaks some of these rules.

Code Generation

Code generation is the final step of this compiler and runs via a recursive descent of the parse tree.

Code generation targets the simple instruction set of MIPS. The following registers are used:

  • $a0: the accumulator
  • $t1: for temporary results
  • $sp: the stack pointer
    • Address of next unallocated location on stack kept here
  • $fp: the frame pointer
    • Pointer to currently active stack frame
  • $ra: the return address

Evaluating an expression will always leave the stack unchanged and the result of the expression in $a0, the accumulator register.

All code can be found here.

Simple Example

Sum To

Static Analysis Errors

Parse Tree:

program

function

Function name: Main

Function body:

Addition

LHS:

Literal

Value: 1

RHS:

Literal

Value: 2

Generated Machine Code

label: Main
mv $fp $sp
sw $ra 0($sp)
addiu $sp $sp -4
li $a0 1
sw $a0 0($sp)
addiu $sp $sp -4
li $a0 2
lw $t1 4($sp)
add $a0 $a0 $t1
addiu $sp $sp 4
lw $ra 4($sp)
addiu $sp $sp 8
lw $fp 0($sp)
jr $ra