CS241 — Winter 2026

Assignment 6

Out of 15 marks

Assignments for CS241

← A5
Assignment 6

Due Friday, March 20th, 5:00 PM

P1 P2 P3 P4 P5

In this assignment, you will implement the context-sensitive analysis phase (also known as semantic analysis) of the WLP4 compiler.

The main goal of this phase is to catch all remaning compile-time errors that were not caught by the scanner or parser. A secondary goal is to compute type information for expressions, which will be needed in the code generation phase.

Filenames

In this assignment, like the assembler assignment, you are gradually building up one large program across multiple problems. You can submit the same program to all problems if you want.

For all problems, you must submit one of the following two files:

You can submit additional files if you wish to divide your program into multiple files, but a file with one of the above names must be present in your submission.

Input and Output Format

The input for all problems will be a WLP4 Intermediate (.wlp4i) file representing a parse tree, that is, the output of the parsing phase. If you did not complete your own WLP4 parser in the previous assignment, you can use the provided cs241.wlp4parse tool to generate inputs. Note that cs241.wlp4parse expects input in the format produced by the scanner cs241.wlp4scan, so you will need to run your WLP4 source code through both commands to produce a .wlp4i file:

$ cs241.wlp4scan < program.wlp4 | cs241.wlp4parse > program.wlp4i

If the input program has an error in it, you simply need to produce an error message containing ERROR on standard error, as usual.

If the input program parse tree is correct, it should produce a WLP4 Typed Intermediate (.wlp4ti) file on standard output. This is a representation of the parse tree that has been annotated with type information. You can produce this output by traversing the tree and assigning a type to all expression nodes, then doing a second (preorder) traversal where you print out the tree along with the assigned types. An expression node is defined as either of the following:

The cs241.wlp4type tool can be used to convert a .wlp4i file to a .wlp4ti file. It performs context sensitive analysis and produces the type-annotated parse tree. You can use this to check the correctness of your output on the assignment.

Problem 1: Simple Programs [2 marks, 1 release+secret]

In this problem, you will perform context-sensitive analysis on a heavily restricted class of WLP4 programs:

More formally, only the following grammar rules will appear in the parse tree of the program:

    start → BOF procedures EOF
    procedures → main
    main → LONG WAIN LPAREN dcl COMMA dcl RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE 
    type → LONG
    type → LONG STAR
    dcl → type ID
    dcls →
    statements →
    expr → term 
    term → factor 
    factor → NUM  
    factor → NULL
    factor → LPAREN expr RPAREN  
    

Your main task is to read in a .wlp4i file from standard input, which is a representation of the parse tree that the parser constructed, and rebuild the tree.
(Having to construct the tree in the parser and then immediately reconstruct the tree here may seem silly, but the assignment is structured this way so that people who did not finish the WLP4 parser can still do this assignment.)

After building the tree, you will need traverse it to check for semantic errors in the WLP4 program. There are only a few possible semantic errors in these restricted WLP4 programs:

If the program contains a semantic error, you should print ERROR to standard error in all caps.

If there are no semantic errors, you are to print a .wlp4ti file representing the parse tree of the WLP4 program with type annotations to standard output.

To produce this output, you will need to assign a type to each node that represents an expression. For these restricted WLP4 programs, this should not be too difficult. The only nodes representing expressions are the ID nodes in the dcl subtrees, and the nodes in the return expression subtree. For the dcl subtrees, the type is stored right beside the ID, and for the return expression subtree, the type of each node just depends on whether there is NUM or NULL at the leaf (and having NULL at the leaf is a semantic error).

Example

For the following WLP4 program:

      long wain(long a, long b) { return 241; }
    

The input would not be this program, but rather the corresponding .wlp4i file produced by passing the program through wlp4scan and wlp4parse:

      start BOF procedures EOF
      BOF BOF
      procedures main
      main LONG WAIN LPAREN dcl COMMA dcl RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE
      LONG long
      WAIN wain
      LPAREN (
      dcl type ID
      type LONG
      LONG long
      ID a
      COMMA ,
      dcl type ID
      type LONG
      LONG long
      ID b
      RPAREN )
      LBRACE {
      dcls .EMPTY
      statements .EMPTY
      RETURN return
      expr term
      term factor
      factor NUM
      NUM 241
      SEMI ;
      RBRACE }
      EOF EOF
    

And the output would be the corresponding type-annotated .wlp4ti file:

      start BOF procedures EOF
      BOF BOF
      procedures main
      main LONG WAIN LPAREN dcl COMMA dcl RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE
      LONG long
      WAIN wain
      LPAREN (
      dcl type ID
      type LONG
      LONG long
      ID a : long
      COMMA ,
      dcl type ID
      type LONG
      LONG long
      ID b : long
      RPAREN )
      LBRACE {
      dcls .EMPTY
      statements .EMPTY
      RETURN return
      expr term : long
      term factor : long
      factor NUM : long
      NUM 241 : long
      SEMI ;
      RBRACE }
      EOF EOF
    

Notice that the expr, term, factor, ID, and NUM lines have been annotated with the corresponding type.

The wlp4type tool can be used to generate expected outputs. The input is a .wlp4i file (not WLP4 source code).

Problem 2: Building a Symbol Table [2 marks, 1 release+secret]

Extend your context-sensitive analysis to allow for programs with declarations in the body of wain, and programs where the return expression of wain is a variable.

This means the following additional rules can appear in the parse tree:

      dcls → dcls dcl BECOMES NUM SEMI
      dcls → dcls dcl BECOMES NULL SEMI
      factor → ID
    

The main change to your program in this problem is that you will need to create a symbol table to keep track of which variables have been declared and their types.

Your program now must detect the following semantic errors:

If the program contains a semantic error, you should print ERROR to standard error in all caps.
If there are no semantic errors, you are to print a .wlp4ti file representing the parse tree of the WLP4 program with type annotations to standard output.

Problem 3: Type Checking for Operations [3 marks, 2 release+secret]

Extend your context-sensitive analysis to allow for binary arithmetic operations (+ - * / %) and unary pointer operations (* & new) in expressions.

This means you should handle the following additional grammar rules:

      expr → expr PLUS term
      expr → expr MINUS term
      term → term STAR factor
      term → term SLASH factor
      term → term PCT factor
      factor → AMP lvalue
      factor → STAR factor
      factor → NEW LONG LBRACK expr RBRACK
      lvalue → ID
      lvalue → STAR factor
      lvalue → LPAREN lvalue RPAREN
    

The main change to your program in this problem will be in your code for assigning types to the nodes in the return expression subtree of wain. This will become significantly more complicated.

These new rules introduce many new possible semantic errors. You will need to handle all type errors in expressions that do not involve procedure calls.

If the program contains a semantic error, you should print ERROR to standard error in all caps.
If there are no semantic errors, you are to print a .wlp4ti file representing the parse tree of the WLP4 program with type annotations to standard output.

Dependencies and Part Marks

Problem 4: Procedures [4 marks, 2 release+secret, 1 secret]

Extend your context-sensitive analysis to allow for procedures other than wain in the program, and to allow for procedure calls in expressions. Procedures are still not allowed to contain statements.

This means you should handle the following additional grammar rules:

      procedures → procedure procedures
      procedure → LONG ID LPAREN params RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE
      params →
      params → paramlist
      paramlist → dcl
      paramlist → dcl COMMA paramlist
      factor → ID LPAREN RPAREN
      factor → ID LPAREN arglist RPAREN
      factor → GETCHAR LPAREN RPAREN
      arglist → expr
      arglist → expr COMMA arglist
    

Your program will need several major changes:

If the program contains a semantic error, you should print ERROR to standard error in all caps.
If there are no semantic errors, you are to print a .wlp4ti file representing the parse tree of the WLP4 program with type annotations to standard output.

Dependencies and Part Marks

Problem 5: Statements [4 marks, 2 release+secret, 1 secret]

Extend your context-sensitive analysis to allow for statements. You should now handle arbitrary WLP4 programs; all rules from the WLP4 grammar can appear in the parse tree. The following grammar rules are new for this problem:

      statements → statements statement
      statement → lvalue BECOMES expr SEMI
      statement → IF LPAREN test RPAREN LBRACE statements RBRACE ELSE LBRACE statements RBRACE
      statement → WHILE LPAREN test RPAREN LBRACE statements RBRACE
      statement → PRINTLN LPAREN expr RPAREN SEMI
      statement → PUTCHAR LPAREN expr RPAREN SEMI
      statement → DELETE LBRACK RBRACK expr SEMI
      test → expr EQ expr
      test → expr NE expr
      test → expr LT expr
      test → expr LE expr
      test → expr GE expr
      test → expr GT expr
    

You will need to extend your analysis for expressions to process all expressions that appear in statements. You will also need to check for type errors in statements. Determining the full list of new semantic errors to check for is left as an exercise.

If the program contains a semantic error, you should print ERROR to standard error in all caps.
If there are no semantic errors, you are to print a .wlp4ti file representing the parse tree of the WLP4 program with type annotations to standard output.

Dependencies and Part Marks