Assignments for CS 241 | ||
---|---|---|
← Assignment 5 | Assignment 6 | Assignment 7 → |
Friday, Nov 15th at 5:00 pm | ||
P1 • P2 • P3 • P4 • P5 |
In this assignment, you will implement the context-sensitive analysis phase (also known as semantic analysis) of the WLP4 compiler.
The main goal of this phase is to catch all remaning compile-time errors that were not caught by the scanner or parser. A secondary goal is to compute type information for expressions, which will be needed in the code generation phase.
In this assignment, like the assembler assignment, you are gradually building up one large program across multiple problems. You can submit the same program to all problems if you want.
For all problems, you must submit one of the following two files:
You can submit additional files if you wish to divide your program into multiple files, but a file with one of the above names must be present in your submission.
The input for all problems will be a
WLP4 Intermediate (.wlp4i) file
representing a parse tree, that is, the output of the parsing phase.
If you did not complete your own WLP4 parser
in the previous assignment, you can use the
provided
wlp4parse
tool to generate inputs. Note that wlp4parse
expects input in the format produced by the scanner
wlp4scan
,
so you will need to run your WLP4 source code through both commands
to produce a .wlp4i file:
$ wlp4scan < program.wlp4 | wlp4parse > program.wlp4i
If the input program has an error in it, you simply need to produce
an error message containing ERROR
on standard error,
as usual.
If the program is correct, should produce a WLP4 Typed Intermediate (.wlp4ti) file on standard output. This is a representation of the parse tree that has been annotated with type information. You can produce this output by traversing the tree and assigning a type to all expression nodes, then doing a second (preorder) traversal where you print out the tree along with the assigned types. An expression node is defined as follows:
NUM
, NULL
,
or ID
token.
expr
, term
, factor
,
or lvalue
.
The wlp4type tool can be used to convert a .wlp4i file to a .wlp4ti file. It performs context sensitive analysis and produces the type-annotated parse tree. You can use this to check the correctness of your output on the assignment.
In this problem, you will perform context-sensitive analysis on a heavily restricted class of WLP4 programs:
More formally, only the following grammar rules will appear in the parse tree of the program:
start → BOF procedures EOF procedures → main main → INT WAIN LPAREN dcl COMMA dcl RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE type → INT type → INT STAR dcl → type ID dcls → statements → expr → term term → factor factor → NUM factor → NULL factor → LPAREN expr RPAREN
Your main task is to read in a .wlp4i file
(from standard input)
which is a representation of the parse tree that the parser constructed,
and rebuild the tree.
(Having to construct the tree in the parser
and then immediately reconstruct the tree here
may seem silly, but the assignment is
structured this way so that people who did not finish the WLP4 parser
can still do this assignment.)
After building the tree, you will need traverse it to check for semantic errors in the WLP4 program. There are only a few possible semantic errors in these restricted WLP4 programs:
If the program contains a semantic error, you should print
ERROR
to standard error in all caps.
If there are no semantic errors, you are to print a .wlp4ti file representing the parse tree of the WLP4 program with type annotations to standard output.
To produce this output, you will need to assign a type to each node that represents an expression. For these restricted WLP4 programs, this should not be too difficult. The only nodes representing expressions are the ID nodes in the dcl subtrees, and the nodes in the return expression subtree. For the dcl subtrees, the type is stored right beside the ID, and for the return expression subtree, the type of each node just depends on whether there is NUM or NULL at the leaf (and having NULL at the leaf is a semantic error).
For the following WLP4 program:
int wain(int a, int b) { return 241; }
The input would not be this program, but rather the
corresponding .wlp4i file
produced by passing the program through wlp4scan
and wlp4parse
:
start BOF procedures EOF BOF BOF procedures main main INT WAIN LPAREN dcl COMMA dcl RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE INT int WAIN wain LPAREN ( dcl type ID type INT INT int ID a COMMA , dcl type ID type INT INT int ID b RPAREN ) LBRACE { dcls .EMPTY statements .EMPTY RETURN return expr term term factor factor NUM NUM 241 SEMI ; RBRACE } EOF EOF
And the output would be the corresponding type-annotated .wlp4ti file:
start BOF procedures EOF BOF BOF procedures main main INT WAIN LPAREN dcl COMMA dcl RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE INT int WAIN wain LPAREN ( dcl type ID type INT INT int ID a : int COMMA , dcl type ID type INT INT int ID b : int RPAREN ) LBRACE { dcls .EMPTY statements .EMPTY RETURN return expr term : int term factor : int factor NUM : int NUM 241 : int SEMI ; RBRACE } EOF EOF
Notice that the expr
, term
,
factor
, ID
, and NUM
lines have been annotated with the corresponding type.
The wlp4type tool can be used to generate expected outputs. The input is a .wlp4i file (not WLP4 source code).
Extend your context-sensitive analysis to allow for programs with declarations in the body of wain, and programs where the return expression of wain is a variable.
This means the following additional rules can appear in the parse tree:
dcls → dcls dcl BECOMES NUM SEMI dcls → dcls dcl BECOMES NULL SEMI factor → ID
The main change to your program in this problem is that you will need to create a symbol table to keep track of which variables have been declared and their types.
Your program now must detect the following semantic errors:
If the program contains a semantic error, you should print
ERROR
to standard error in all caps.
If there are no semantic errors, you
are to print a .wlp4ti file
representing the parse tree of the
WLP4 program with type annotations
to standard output.
Extend your context-sensitive analysis to allow for binary arithmetic
operations (+ - * / %
) and unary pointer operations
(* & new
) in expressions.
This means you should handle the following additional grammar rules:
expr → expr PLUS term expr → expr MINUS term term → term STAR factor term → term SLASH factor term → term PCT factor factor → AMP lvalue factor → STAR factor factor → NEW INT LBRACK expr RBRACK lvalue → ID lvalue → STAR factor lvalue → LPAREN lvalue RPAREN
The main change to your program in this problem will be in your code for assigning types to the nodes in the return expression subtree of wain. This will become significantly more complicated.
These new rules introduce many new possible semantic errors. You will need to handle all type errors in expressions that do not involve procedure calls.
If the program contains a semantic error, you should print
ERROR
to standard error in all caps.
If there are no semantic errors, you
are to print a .wlp4ti file
representing the parse tree of the
WLP4 program with type annotations
to standard output.
factor → ID
rule
in the case where the ID is a parameter of wain, so you will
need to at least keep track of those two parameter types.Extend your context-sensitive analysis to allow for procedures other than wain in the program, and to allow for procedure calls in expressions. Procedures are still not allowed to contain statements.
This means you should handle the following additional grammar rules:
procedures → procedure procedures procedure → INT ID LPAREN params RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE params → params → paramlist paramlist → dcl paramlist → dcl COMMA paramlist factor → ID LPAREN RPAREN factor → ID LPAREN arglist RPAREN factor → GETCHAR LPAREN RPAREN arglist → expr arglist → expr COMMA arglist
Your program will need several major changes:
variable()
or 1 + procedure
.
If the program contains a semantic error, you should print
ERROR
to standard error in all caps.
If there are no semantic errors, you
are to print a .wlp4ti file
representing the parse tree of the
WLP4 program with type annotations
to standard output.
Extend your context-sensitive analysis to allow for statements. You should now handle arbitrary WLP4 programs; all rules from the WLP4 grammar can appear in the parse tree. The following grammar rules are new for this problem:
statements → statements statement statement → lvalue BECOMES expr SEMI statement → IF LPAREN test RPAREN LBRACE statements RBRACE ELSE LBRACE statements RBRACE statement → WHILE LPAREN test RPAREN LBRACE statements RBRACE statement → PRINTLN LPAREN expr RPAREN SEMI statement → PUTCHAR LPAREN expr RPAREN SEMI statement → DELETE LBRACK RBRACK expr SEMI test → expr EQ expr test → expr NE expr test → expr LT expr test → expr LE expr test → expr GE expr test → expr GT expr
You will need to extend your analysis for expressions to process all expressions that appear in statements. You will also need to check for type errors in statements. Determining the full list of new semantic errors to check for is left as an exercise.
If the program contains a semantic error, you should print
ERROR
to standard error in all caps.
If there are no semantic errors, you
are to print a .wlp4ti file
representing the parse tree of the
WLP4 program with type annotations
to standard output.