Visual Compiler
After taking this online compilers class, I thought it would be fun to build compiler for a simple language that displays output of some intermediate compiler phases.
A Simple Calculator Language
The language I chose to implement is a simple one that only supports functions, if-then-else expressions, addition, subtraction, and integer constants. The body of each function is a single expression.
Parsing
The first step of the compiler is lexing and parsing which I implemented with Jison. Jison and similar tools like Bison and Yacc require you to specify the grammar of your language.
A specification for the calculator language would look something like:
- A program is made of one or more functions
- Each function may have parameters, and has a single expression for the body
- An expression can be addition, subtraction, if-then-else, function invocation, parameter ID, or a constant.
- An addition expression has left-hand and right-hand side expressions.
- And so on...
Jison takes this specification and returns a parse tree.
Static Analysis
Static analysis is run over the parse tree to ensure that the program follows some rules of the language.
The rules are:
- There should be one function named
Main
that takes no parameters - Function names should be unique
- Parameter names for each function should be unique
- Functions should be invoked with the correct number of arguments
- All variables in a function should refer to one of the function's parameters
You can click on the example "Static Analysis Errors" below to see a program that breaks some of these rules.
Code Generation
Code generation is the final step of this compiler and runs via a recursive descent of the parse tree.
Code generation targets the simple instruction set of MIPS. The following registers are used:
$a0
: the accumulator$t1
: for temporary results$sp
: the stack pointer- Address of next unallocated location on stack kept here
$fp
: the frame pointer- Pointer to currently active stack frame
$ra
: the return address
Evaluating an expression will always leave the stack unchanged and the result of the expression in
$a0
, the accumulator register.
All code can be found here.
Simple Example
Sum To
Static Analysis Errors
Parse Tree:
program
function
Function name: Main
Function body:
Addition
LHS:
Literal
Value: 1
RHS:
Literal
Value: 2
Generated Machine Code
label: Main mv $fp $sp sw $ra 0($sp) addiu $sp $sp -4 li $a0 1 sw $a0 0($sp) addiu $sp $sp -4 li $a0 2 lw $t1 4($sp) add $a0 $a0 $t1 addiu $sp $sp 4 lw $ra 4($sp) addiu $sp $sp 8 lw $fp 0($sp) jr $ra