Writing Your Own Toy Compiler Using Flex, Bison and LLVM

By Loren Segal on September 18th, 2009 at 1:08 PM

Building Our Toy Language

We have the code, but how to we build it? LLVM can be complicated to link, fortunately if you installed LLVM you also got an llvm-config binary which returns all of the compiler/linker flags you need.

We also need to update our main.cpp file from earlier to actually compile and run our code, this time:

Listing of main.cpp:

#include <iostream>
#include "codegen.h"
#include "node.h"

using namespace std;

extern int yyparse();
extern NBlock* programBlock;

int main(int argc, char **argv)
{
    yyparse();
    std::cout << programBlock << std::endl;

    CodeGenContext context;
    context.generateCode(*programBlock);
    context.runCode();

    return 0;
}

Now all we need to do is compile:

$ g++ -o parser `llvm-config --libs core jit native --cxxflags --ldflags` *.cpp

You can also grab this Makefile to make things easier:

Listing of Makefile:

all: parser

clean:
	rm parser.cpp parser.hpp parser tokens.cpp

parser.cpp: parser.y
	bison -d -o $@ $^

parser.hpp: parser.cpp

tokens.cpp: tokens.l parser.hpp
	lex -o $@ $^

parser: parser.cpp codegen.cpp main.cpp tokens.cpp
	g++ -o $@ `llvm-config --libs core jit native --cxxflags --ldflags` *.cpp

Running Our Toy Language

Hopefully everything compiled properly. At this point you should have your parser binary that spits out code. Play with it. Remember our canonical example? Let’s see what our program does.

$ echo 'int do_math(int a) { int x = a * 5 + 3 } do_math(10)' | ./parser
0x100a00f10
Generating code...
Generating code for 20NFunctionDeclaration
Creating variable declaration int a
Generating code for 20NVariableDeclaration
Creating variable declaration int x
Creating assignment for x
Creating binary operation 276
Creating binary operation 274
Creating integer: 3
Creating integer: 5
Creating identifier reference: a
Creating block
Creating function: do_math
Generating code for 20NExpressionStatement
Generating code for 11NMethodCall
Creating integer: 10
Creating method call: do_math
Creating block
Code is generated.
; ModuleID = 'main'

define internal void @main() {
entry:
	%0 = call i64 @do_math(i64 10)		;  [#uses=0]
	ret void
}

define internal i64 @do_math(i64) {
entry:
	%a = alloca i64		;  [#uses=1]
	%x = alloca i64		;  [#uses=1]
	%1 = add i64 5, 3	;  [#uses=1]
	%2 = load i64* %a	;  [#uses=1]
	%3 = mul i64 %2, %1	;  [#uses=1]
	store i64 %3, i64* %x
	ret void
}
Running code...
Code was run.

Questions? Comments? Follow me on Twitter (@lsegal) or email me.