Writing Your Own Toy Compiler Using Flex, Bison and LLVM
Generating Our Code
So we have our “tokens.l” file for Flex and our “parser.y” file for Bison. To generate our C++ source files from these definition files we need to pass them through the tools. Note that Bison will also create a “parser.hpp” header file for Flex to include; the -d switch asks for this, separating our token declarations from the source so we can include and use those tokens elsewhere. The following commands should create our parser.cpp, parser.hpp and tokens.cpp source files.
$ bison -d -o parser.cpp parser.y
$ lex -o tokens.cpp tokens.l
If everything went well we should now have two of the three parts of our compiler. If you want to test this, create a short main function in a main.cpp file:
Listing of main.cpp:
#include <iostream>
#include "node.h"

extern NBlock* programBlock;
extern int yyparse();

int main(int argc, char **argv)
{
    yyparse();
    std::cout << programBlock << std::endl;
    return 0;
}
You can then compile your source files:
$ g++ -o parser parser.cpp tokens.cpp main.cpp
You will need to have installed LLVM by now for the llvm::Value reference in "node.h". If you don’t want to go that far just yet, you can comment out the codeGen() methods in node.h to test just your lexer/parser combo.
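If you would rather not delete or comment out code by hand, one alternative is a preprocessor guard around the LLVM-dependent declarations. The following is only a sketch of that idea, not the actual node.h: the macro name TOY_WITH_LLVM and the trimmed-down Node class are assumptions made for illustration. It relies on the fact that a pointer return type only needs a forward declaration of llvm::Value.

```cpp
// Sketch only: TOY_WITH_LLVM is a hypothetical macro, and this Node class
// is a simplified stand-in for the real one in node.h.
// Build with -DTOY_WITH_LLVM once LLVM is installed; leave it undefined
// to test just the lexer/parser.

#ifdef TOY_WITH_LLVM
namespace llvm { class Value; }   // forward declaration is enough for a pointer
class CodeGenContext;             // defined elsewhere in the real project
#endif

class Node {
public:
    virtual ~Node() {}
#ifdef TOY_WITH_LLVM
    // Only declared when LLVM support is switched on.
    virtual llvm::Value* codeGen(CodeGenContext& context) { return nullptr; }
#endif
};
```

With the macro left undefined, this compiles without any LLVM headers, so the parser test above can be built on a machine that does not have LLVM yet.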
If all goes well you should now have a “parser” binary that reads source from stdin and prints out a hopefully nonzero address: the location in memory of the root node of our AST.
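The main function above ignores the value returned by yyparse(), so a syntax error and a successful parse can be hard to tell apart. Bison's yyparse() returns 0 on success and a nonzero status otherwise. A minimal sketch of a more talkative check follows; describeParse is a hypothetical helper, and the empty NBlock struct is a stand-in for the real class in node.h so that the snippet compiles on its own.

```cpp
#include <sstream>
#include <string>

// Stand-in for the real NBlock declared in node.h (assumption for this sketch).
struct NBlock {};

// Hypothetical helper: turns yyparse()'s return status and the AST root
// pointer into a one-line report.
std::string describeParse(int status, const NBlock* root) {
    std::ostringstream out;
    if (status != 0) {
        // Nonzero means the parse did not complete successfully.
        out << "parse failed (yyparse returned " << status << ")";
    } else if (root == nullptr) {
        // The parse finished but no grammar action stored a root node.
        out << "parse succeeded but programBlock is null";
    } else {
        out << "parse ok, AST root at " << static_cast<const void*>(root);
    }
    return out.str();
}
```

In main.cpp you would drop the stand-in struct, keep the include of "node.h", and replace the two statements in main with something like std::cout << describeParse(yyparse(), programBlock) << std::endl;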
I'm Loren Segal, a programmer, Rubyist, author of