CS445 Introduction to Compilers
Assignment 3
(Symbol Table and Type Checking)
300 points
DUE: Thu Oct 29 at 5 PM PST

In this assignment we will do semantic analysis and semantic error generation. We will also add some more command line options.

Do not be fooled. This is a nontrivial homework. Do not put off this assignment. It is very complicated. Note: I will be gone from Oct 22-24!

Two New Compiler Options

This time we add two new options. The first option is -p which prints the syntax tree. That is, it prints the syntax tree we did for assignment 2 with the line numbers attached to every node. See the section Tracking Line Numbers below.

The second option is -s and turns on symbol table tracing. See the section on the symbol table below.

Original option -d continues to turn on the yydebug option as specified in assignment 2.

Your compiler should still accept a single input file, either from a filename given on the command line or redirected as standard input.

We will continue adding options throughout the semester. Note: the options need not come before the file name. (HINT: Getopt code is in this getopt.cpp file and is free for your use. Check out a getopt example, which is a silly example of how to do a variety of things with getopt. This will make your life easier and let you focus on the compiler stuff. I also have the getopt man page available.)
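As a rough sketch of the option handling described above, here is one way to scan for -d, -p and -s with the standard POSIX getopt() from <unistd.h>. This is not the getopt.cpp mentioned in the hint. The Options struct, its field names and parseOptions are made up for illustration. The outer loop is there so that flags are still found when they follow the file name, whether or not the local getopt reorders arguments.

```cpp
#include <string>
#include <unistd.h>   // getopt, optind (POSIX)

// Hypothetical container for the flags described in this assignment.
struct Options {
    bool debugParser = false;   // -d: the caller would set yydebug from this
    bool printTree = false;     // -p: print the syntax tree
    bool traceSymtab = false;   // -s: turn on symbol table tracing
    bool badOption = false;     // an unrecognized flag was seen
    std::string inputFile;      // empty means read standard input
};

Options parseOptions(int argc, char **argv) {
    Options opts;
    optind = 1;                 // start scanning at the first argument
    while (optind < argc) {
        int c;
        // Consume every option getopt can find from the current position.
        while ((c = getopt(argc, argv, "dps")) != -1) {
            switch (c) {
            case 'd': opts.debugParser = true; break;
            case 'p': opts.printTree = true; break;
            case 's': opts.traceSymtab = true; break;
            default:  opts.badOption = true; break;   // getopt returned '?'
            }
        }
        // getopt stopped: anything left at optind is a non-option word,
        // which this sketch treats as the input file name.
        if (optind < argc) opts.inputFile = argv[optind++];
    }
    return opts;
}
```

If more than one non-option word is given, this sketch simply keeps the last one. A real driver would probably report that as a usage error.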

Semantic Errors

We want to generate errors that are tagged with useful line numbers. So we will need to tag each node in the abstract syntax tree with a useful line number. To do this effectively we need to grab the line number as soon as possible (in flex) and associate it with the token. This can be done nicely (portably) by passing back a struct for each token (as you have probably already done) in the yylval which has all the information about the token such as its line number, lexeme, constant value, even token class. (A struct allows you to return more than a single value.) You should avoid using global variables for token information when possible.
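A minimal sketch of such a token record follows. The struct name, its fields and the makeToken helper are all assumptions made for illustration, so your own layout may differ. The idea is that a flex rule builds one record per token the moment it is matched, stores the pointer in yylval, and the parser later copies the line number into the tree node it creates.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical token record; the field names are illustrative only.
struct TokenData {
    int tokenClass;       // token code returned to the parser
    int lineNum;          // line on which the token was seen
    std::string lexeme;   // private copy of the matched text
    int numValue;         // value of a numeric constant, otherwise 0
};

// Build a record as soon as the scanner matches a token. In a flex rule this
// might be called with yytext and whatever line counter the scanner keeps.
TokenData *makeToken(int tokenClass, int lineNum, const char *text, bool isNumber) {
    TokenData *tok = new TokenData;
    tok->tokenClass = tokenClass;
    tok->lineNum = lineNum;
    tok->lexeme = text;   // copy now: the scanner reuses its text buffer
    tok->numValue = isNumber ? std::atoi(text) : 0;
    return tok;
}
```

Because the record is reached through a pointer, the union declared for yylval only needs a TokenData * member, and nothing about the token has to live in a global variable.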

Scope and Type Checking

After checking if you should print the abstract syntax tree, you will now traverse the tree looking for typing and program structure errors. So your main() might look something like mine:
    // compile the code
    int numErrors;

    yyparse();

    if (printSyntaxTree) printTree(syntaxTree);

    numErrors = scopeAndType(syntaxTree);

    // report the number of errors and warnings
    printf("Number of warnings: %d\n", numWarnings);
    printf("Number of errors: %d\n", numErrors);
Your main may look quite different. The routine scopeAndType will process the tree by calling a treeTraverse routine that starts at the root node of the tree and recursively calls itself for children and siblings until it gets to the leaves. Declarations will make entries in the symbol table (see below). Your job in writing the treeTraverse routine is to catch a variety of warnings and errors and duplicate my output for any input given. You should keep count of the number of warnings and errors in a pair of global variables and report them at the end of a run. Here are the errors right out of my version. Here is the list of error messages sorted by type of error message.

Here are some details by node type but this list is not exhaustive. You are in control of the design as long as it duplicates my output.
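To make the shape of the traversal concrete, here is a small sketch that only checks one rule: both operands of an addition must be int. The node layout, the kinds ConstK and AddK, the ExpType values and the wording of the message are all invented for this example. The required messages are the ones in the error list, not this one. Children are visited before the parent so that operand types are known by the time the parent is checked.

```cpp
#include <cstdio>

const int MAXCHILDREN = 3;

enum NodeKind { ConstK, AddK };              // hypothetical node kinds
enum ExpType { UndefinedT, IntT, BoolT };    // hypothetical expression types

struct TreeNode {
    NodeKind kind;
    ExpType type;                   // set by the traversal for AddK nodes
    int lineno;                     // line number captured by the scanner
    TreeNode *child[MAXCHILDREN];
    TreeNode *sibling;
};

int numErrors = 0;                  // global counts reported at the end of a run
int numWarnings = 0;

void treeTraverse(TreeNode *node) {
    if (node == nullptr) return;

    // Children first, so their types are filled in before we look at them.
    for (int i = 0; i < MAXCHILDREN; i++) treeTraverse(node->child[i]);

    if (node->kind == AddK) {
        for (int i = 0; i < 2; i++) {
            TreeNode *operand = node->child[i];
            if (operand != nullptr && operand->type != IntT) {
                // Invented wording, tagged with the node's line number.
                printf("ERROR(%d): operand %d of '+' is not an int\n",
                       node->lineno, i + 1);
                numErrors++;
            }
        }
        node->type = IntT;          // the result of an addition is an int
    }

    treeTraverse(node->sibling);    // then continue along the sibling list
}

int scopeAndType(TreeNode *root) {
    treeTraverse(root);
    return numErrors;
}
```

A full version would switch on every node kind and would also enter and leave scopes around compound statements and function bodies.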

Symbol Table

Here is a useful C++ symbol table object you can use:

symtab.cpp
symtab.h

Here is a brute-force translation to C of the above files (for a single symbol table).

symtab.c
symtab.h

and a tar file of all symtab files:

symtab tests
symtab expected output
tar of all symtab stuff

It provides a symbol table object with insert and lookup methods for symbols and a pointer (you can use the pointer to point to a TreeNode). It also has enter and leave methods for managing the scope stack. Read symtab.h for more information on how to use it. You might want to just play with it to see how it works before you put it into your compiler (see the test routines).

One feature of the symbol table is the debug method and the two DEBUG flags. At construction time the SymTab object is in nondebugging mode. But by setting the flags with the debug method you can get the object to spew out info. Use the -s flag to set the debug flags to DEBUG_TABLE. This will announce entry into every scope and print the symbol table, along with the scope names, on exit from a scope.
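If it helps to see the idea in miniature before reading symtab.h, here is a toy scope stack with the same four kinds of operation (enter, leave, insert, lookup) plus a trace switch. It is a stand-in written for this explanation and is NOT the interface of the provided SymTab. The class name, method signatures and trace output are all assumptions, so check symtab.h for the real ones.

```cpp
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for illustration only; not the provided symtab.h interface.
class ScopeStack {
public:
    explicit ScopeStack(bool trace = false) : trace_(trace) { enter("globals"); }

    // Push a new, empty scope.
    void enter(const std::string &name) {
        if (trace_) printf("enter scope: %s\n", name.c_str());
        scopes_.push_back(Scope{name, {}});
    }

    // Pop the innermost scope; the global scope is never popped.
    void leave() {
        if (scopes_.size() <= 1) return;
        if (trace_) {
            printf("leave scope: %s\n", scopes_.back().name.c_str());
            for (const auto &entry : scopes_.back().table)
                printf("  %s\n", entry.first.c_str());
        }
        scopes_.pop_back();
    }

    // Returns false if the symbol is already declared in the innermost scope.
    bool insert(const std::string &sym, void *ptr) {
        return scopes_.back().table.insert({sym, ptr}).second;
    }

    // Search from the innermost scope outward; nullptr means "not declared".
    void *lookup(const std::string &sym) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            auto found = it->table.find(sym);
            if (found != it->table.end()) return found->second;
        }
        return nullptr;
    }

private:
    struct Scope {
        std::string name;
        std::map<std::string, void *> table;
    };
    bool trace_;
    std::vector<Scope> scopes_;
};
```

The two return values are where semantic errors come from: a failed insert signals a duplicate declaration in the same scope, and a null lookup signals use of an undeclared symbol.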

Finally, the constructor takes a print routine of the type void blah(void *). So if you define something to print a node given a TreeNode *, you can supply that name to the constructor to print out your symbol table stack. That way I don't have to know what your TreeNode looks like internally. For instance, in my code:

 
    symtab = new SymTab(nodePrint);
creates the symbol table and nodePrint is defined as in this prototype:
    void nodePrint(void *p);
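A body for such a print routine might look like the sketch below. The TreeNode fields and the describeNode helper are hypothetical. The only part taken from the text above is the void nodePrint(void *p) signature. The routine casts the untyped pointer back to the node type it was stored as and prints a one-line description.

```cpp
#include <cstdio>
#include <string>

// Hypothetical minimal node; a real TreeNode will carry much more.
struct TreeNode {
    std::string name;
    int lineno;
};

// Helper that formats a node; kept separate so it is easy to test.
std::string describeNode(const TreeNode *node) {
    if (node == nullptr) return "(null)";
    return node->name + " declared on line " + std::to_string(node->lineno);
}

// Matches the void (*)(void *) shape the symbol table constructor expects.
void nodePrint(void *p) {
    const TreeNode *node = static_cast<TreeNode *>(p);
    printf("%s", describeNode(node).c_str());
}
```

The cast is only safe if every pointer inserted into the symbol table really is a TreeNode *, so it is worth keeping to that one type throughout.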

Example Runs

Here is basicAll.c- and basicAll.out. Here is a tar of a bunch of tests.

Hints

[will go here]

Submission

Homework will be submitted as an uncompressed tar file to the homework submission page. You can submit as many times as you like. The LAST file you submit BEFORE the deadline will be the one graded. For all submissions you will receive email showing how your file performed on the pre-grade tests. The grading program will use more extensive tests and those results will be mailed to you when they are run.

If you have tests you really think are important or just cool please send them to me and I will consider adding them to the test suite.


Robert Heckendorn
Last updated: Mar 5, 2007 23:23