CS 241 — Winter 2021 — Assignment 7

Assignments for CS 241
← Assignment 6 Assignment 7 Assignment 8 →
Friday, March 19th at 5:00 pm Friday, March 26th at 5:00 pm Friday, April 2nd at 5:00 pm
P1P2P3P4P5

In Assignment 7, you will implement the semantic analysis phase of the WLP4 compiler for which you wrote a scanner in Assignment 5 and a parser in Assignment 6. In particular, it is the job of this assignment to catch all remaining compile-time errors in the source program; a program that makes it through this phase of compilation is guaranteed to be free of compile-time errors.

A Note About Future Assignments

In Assignments 8 and 9, you will build on your solution for this assignment. In order to proceed to those assignments, you must, at minimum, complete the part of this assignment about building the symbol table for wain (that is, you must at least do Problem 1). This will be sufficient to complete Assignment 8. However, in order to fully finish Assignment 9 (which deals with pointer operations and procedures), you will need to do Problems 1 through 4. Problem 5 is not required for Assignments 8 and 9, since on those assignments you will always be given valid WLP4 programs as input, and the only purpose of Problem 5 is to reject certain kinds of invalid programs.

We will not provide a sample solution for Assignment 7. You must at least partially finish Assignment 7 on your own to be able to do Assignments 8 and 9. Since all test programs on Assignment 8 and 9 will be valid, you do not necessarily need to get perfect marks on Assignment 7, as long as you handle valid programs correctly.

Resources

The specific semantic rules that must be enforced are those outlined in the "Context-sensitive Syntax" section of the WLP4 Specification. A summary of the semantic rules of WLP4 related to types is available for download. The summary contains formal descriptions of type rules for WLP4; although these should match the English descriptions on the WLP4 specification page, the WLP4 specification should be considered the definitive resource. Please report any discrepancies between the two documents.

Testing

To test your semantic analyzer, you will need .wlp4i files (WLP4I format) as generated by the parser specified in Assignment 6. In case you do not wish to use your own parser, you can use one that we have provided. Invoke it using the command wlp4parse.

Starting from a .wlp4 source file, the complete sequence of commands to test your semantic analyzer on it is:

wlp4scan < foo.wlp4 > foo.scanned
wlp4parse < foo.scanned > foo.wlp4i
racket wlp4gen.rkt < foo.wlp4i

or

wlp4scan < foo.wlp4 > foo.scanned
wlp4parse < foo.scanned > foo.wlp4i
./wlp4gen < foo.wlp4i

This can be abbreviated to

cat foo.wlp4 | wlp4scan | wlp4parse | racket wlp4gen.rkt

or

cat foo.wlp4 | wlp4scan | wlp4parse | ./wlp4gen

Marmoset Notes

For each question, Marmoset has public and release tests, and often has secret tests as well. When you release test a submission, Marmoset will only report on public and release tests. No information about secret tests will be visible until after the assignment deadline.

Although the inputs for this assignment are WLP4I files, Marmoset will not show you WLP4I files in the test results because they are large and hard to read. Marmoset will show you WLP4 source code. You will have to run the WLP4 programs Marmoset shows you through wlp4scan and wlp4parse to produce WLP4I files that you can test with, as explained above.

While the problems have different output requirements, the Marmoset tests are designed so that the same code can be submitted to all problems, without needing to remove or comment out certain parts. For example, if you produce symbol tables for all procedures (as required by A7P3) in your solution to A7P2, which only requires signatures for procedures, the extra information will just be ignored.

For all problems where you have to output symbol tables, Marmoset will ignore extra whitespace at the end of lines (for example if you have extra spaces before the end of a line). However, it will not ignore all extra whitespace; for example, extra newlines that are not supposed to be part of the output might cause you to fail tests. Every line should be terminated with a newline character, but there should be no additional newlines.

For this assignment, secret tests are worth 20% of the Marmoset test marks.

Hints

The WLP4I files produced by the parser contain a representation of a parse tree for a WLP4 program. You should approach this assignment by traversing the parse tree to collect information, compute types, and check for errors.

Recall that in Assignment 4 Problem 10 you had to read a preorder traversal of a tree. You may find your solution to A4P10 helpful for starting this assignment. The WLP4I format is essentially a preorder traversal.

The leaf nodes of the parse tree are terminal tokens. In the WLP4 grammar, terminals are always written in ALL CAPS and nonterminals are written in lowercase. This gives an easy way to check when you have reached a leaf. Alternatively, here is an explicit list of terminals you can check against:
"BOF", "BECOMES", "COMMA", "ELSE", "EOF", "EQ", "GE", "GT", "ID", "IF", "INT", "LBRACE", "LE", "LPAREN", "LT", "MINUS", "NE", "NUM", "PCT", "PLUS", "PRINTLN", "RBRACE", "RETURN", "RPAREN", "SEMI", "SLASH", "STAR", "WAIN", "WHILE", "AMP", "LBRACK", "RBRACK", "NEW", "DELETE", "NULL"

It may be useful to store the following items at each node in the tree:

You might want to extend your tree class with other useful information or helper methods as you progress through the assignment.

You may assume for all problems that there is never an identifier with ERROR in the name.

Problem 1 — 15 marks of 125 (filename: wlp4gen.rkt or wlp4gen.cc)

For Problem 1, assume that the WLP4 program contains only the procedure wain. In other words, assume that the production procedures → procedure procedures is not used, so that the non-terminal procedures is guaranteed to derive wain directly.

Read in a WLP4I file (WLP4I format) and construct a parse tree. Traverse the parse tree to gather information about the declared variables and corresponding types. If a variable is declared more than once, or used without being declared, output a string containing ERROR to standard error. You are encouraged, but not required, to output an informative error message; Marmoset will just check that your message contains ERROR somewhere.

If no errors occur, output a symbol table for wain to standard error. A symbol table for a procedure consists of a line describing the signature of the procedure, followed by one line for each local variable (including parameters) in the procedure.

The signature line consists of the procedure's name followed by a colon, followed by the sequence of parameter types (int or int*) in order. The return type is not part of the signature, because WLP4 procedures always return int. The local variable lines consist of the variable's name, followed by the variable's type. Single spaces are used to separate information. The signature line must appear first, followed by the local variable lines, but the order of the local variable lines does not matter. Try to avoid extraneous whitespace; Marmoset will be a little lenient about this (see the Marmoset Notes section) but it is still possible for unneeded whitespace to cause test failures.

For example, if the input program (shown below in WLP4 format, not WLP4I) is this:
int wain(int *a, int b) {
  int c = 0;
  return *a + b + c;
}
then a possible correct output on standard error is:
wain: int* int
a int*
b int
c int
Because the order of local variable lines doesn't matter, the following would also be valid:
wain: int* int
b int
c int
a int*

Problem 2 — 15 marks of 125 (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your solution for Problem 1 to include signatures for all procedures, but still only show the variables local to procedure wain. You do not need to process procedures other than wain beyond computing their signatures; you may assume for this problem that procedures other than wain do not contain variable declaration or use errors.

For this problem, you must produce an error (by writing a string containing ERROR to standard error) if a procedure is declared more than once. However, you do not need to check whether procedure calls that occur in expressions are valid. That means you do not need to check whether a procedure that is called has been previously declared, and you do not need to check that the number and type of arguments matches the procedure's signature.

For example, for this WLP4 program:

int id(int x) { return x; }
int wain(int *a, int b) {
  int c = 0;
  return *a + id(b) + c;
}
a possible correct output on standard error is:
id: int
wain: int* int
a int*
b int
c int
The order in which you print the procedure signatures does not matter. Additionally, as in Problem 1, the order of the local variables in wain's symbol table does not matter. For example, the following output is also valid:
wain: int* int
c int
a int*
b int
id: int

Problem 3 — 25 marks of 125, 5 secret marks (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your solution for Problem 2 to include type information for all variables in all procedures. You must produce an error (by writing a string containing ERROR to standard error) for each of the following scenarios: By the end of this problem, all errors related to declarations and scope should be caught.

It is not necessary to check that when a procedure is called, the number and type of arguments matches the procedure's signature. That is done in the next problem.

For example, for this WLP4 program:

int nothing() { return 241; }
int id(int x) { int y = 0; return x; }
int wain(int *a, int b) {
  int c = 0;
  return *a + id(b) + c;
}
a possible correct output on standard error is:
nothing:
id: int
x int
y int
wain: int* int
a int*
b int
c int
The order in which you print the procedure symbol tables does not matter. Additionally, within the symbol tables, the order of local variables does not matter. For example, the following output is also valid:
wain: int* int
b int
a int*
c int
nothing:
id: int
y int
x int

Problem 4 — 50 marks of 125, 10 secret marks (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your solution for Problem 3 to check for type errors in expressions. In particular, for every subtree with an expr or lvalue node at its root, compute the type associated with that subtree, or output a string containing ERROR to standard error if the expression cannot be validly typed.

If every expression in the program has a valid type, and there are no declaration or scope errors, we recommend that your program produces no output. However, to allow for the same program to be submitted to all questions, Marmoset will accept essentially any kind of non-ERROR output on standard error (as long as it doesn't exceed Marmoset's file size limit). For example, it is fine to produce symbol table information as in A7P3 for this problem.

For example, in a node corresponding to the rule

expr → expr PLUS term
if the expr and term on the right-hand side both have type int*, then your program should output ERROR to standard error. Otherwise, the expr on the left-hand side is well-typed, and its type is as given on the semantic rules handout.

Problem 5 — 20 marks of 125, 10 secret marks (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your solution for Problem 4 to check for all remaining type errors that can occur in WLP4. This means you should check for errors in program elements that do not themselves have types, but demand that some of their subelements be given particular types. For example, println can only be used with type int, and delete [] can only be used with type int*.

By the end of this problem, your program should check and enforce every type rule given on the semantic rules handout. If a source program contains a semantic error, you must output a string containing ERROR to standard error. If the source program is free of semantic errors, we recommend that your program produces no output, but Marmoset will accept essentially any kind of non-ERROR output (as in A7P4).

Click here to go back to the top of the page.