CS 241 — Spring 2022 — Assignment 7 & 8

Assignment 7: Due Friday, July 15, 5:00 pm / Out of 10 marks

Assignment 8: Due Friday, July 22, 5:00 pm / Out of 10 marks

In Assignment 7 and 8, you will write a code generator for WLP4, and finally complete the process of translating WLP4 source code to MIPS assembly!

In Assignment 7, you implement the basic features of WLP4: the wain procedure, integer variables and constants, declarations, assignment, arithmetic, control flow, and printing.

In Assignment 8, you build on these features and add support for pointers, other procedures, and dynamic memory allocation.

Click here to skip the preamble and go to the problem statements.

Filenames

In both Assignment 7 and Assignment 8, you are gradually building up one large program across multiple problems. You can submit the same program to all problems (on both assignments!) if you want.

For all problems, you must submit one of the following two files:

You can submit additional files if you wish to divide your program into multiple files, but a file with one of the above names must be present in your submission.

Input, Output, and Testing

The input format on this assignment is the WLP4 Typed Intermediate (.wlp4ti) format that you produced as the output for Assignment 6. This is a preorder traversal of a WLP4 parse tree that includes type information for expression nodes. You will not need the type information in Assignment 7, but it will be needed in Assignment 8.

Given some WLP4 source code, you can produce the .wlp4ti input file using the following command:

wlp4scan < input.wlp4 | wlp4parse | wlp4type > input.wlp4ti

The output format is MIPS assembly code. The goal of this assignment is to produce MIPS assembly code that implements the behaviour specified by the WLP4 program!

After running the .wlp4ti file through your code generator, you can test your program by running it through cs241.binasm* and then using either mips.twoints or mips.array, depending on whether the WLP4 program expects two integers or an array as input. Here are the commands for running it with mips.twoints, assuming your code generator executable is named wlp4gen.

./wlp4gen < input.wlp4ti > output.asm
cs241.binasm < output.asm > output.mips
mips.twoints output.mips

* In Problem 5 of each assignment, you will need to use a different assembler. More details are given in the problem.

In these assignments, there is no "correct executable" to compare your output to, because that concept does not make sense for a code generator. Instead, test your code generator by comparing the behaviour of the compiled program to the expected behaviour. If you are unsure what the expected behaviour is, you can use wlp4c to compile the program, and compare "your compiled program" with "wlp4c's compiled program".

wlp4c < input.wlp4 > expected.mips

Note that wlp4c takes WLP4 source code as input, not a .wlp4ti file like your program, and produces MIPS machine code as output, not MIPS assembly code like your program.

You can then run both MIPS programs and check if the return value in $3 matches, as well as standard output if the program includes printing.

Dependencies

Ultimately, in Assignments 7 and 8 you are writing one large program, a code generator. You cannot get full marks on Assignment 8 without completing Assignment 7. But we have tried to set up the problems so that if you partially complete Assignment 7, then you will be able to partially complete Assignment 8.

In general, this is how the dependencies between problems work:

For example, suppose you only complete Assignment 7 up to Problem 3. You should be able to complete Assignment 8 up to Problem 3 as well.

The following diagram shows the dependencies graphically.

It may also be possible to earn part marks on some problems even if you haven't completed one of the problems it "depends" on. More details about opportunities for part marks will be added to the assignment page once Marmoset is set up.

Problems

To reflect the dependency structure described above, we have grouped the problems for each assignment together. That is, we only list five "problems" below, but each problem has an "Assignment 7 version" and an "Assignment 8 version". The "Assignment 8 version" builds on the "Assignment 7 version" by asking you to implement additional WLP4 features related to pointers and procedures.

There is also an optional "Bonus Problem" at the very end, which is to be completed after finishing both Assignment 7 and Assignment 8.

Efficiency Warning

There are no particular efficiency requirements for these assignments, and most of the test programs are small. However, students have a tendency to accidentally write tree-building code that uses an exponential amount of memory in the tree size (particularly if they're trying to avoid using pointers, and don't understand C++ move/copy semantics). Exponential time or memory usage is not acceptable even on an assignment with relaxed efficiency requirements.

If you make this mistake, then you will likely exceed Marmoset's time or memory limit on tests with larger programs. The largest programs are reserved for the secret tests, so this could lead you to fail secret tests despite passing all the other tests.

A7P1 on Marmoset includes a "Warning" test which is not for marks and is just intended to warn you about potential efficiency issues. Passing this test does not guarantee your compiler is efficient, though.

Problem 1: Simple Programs

Assignment 7 [1 mark]

Write a code generator for WLP4 programs whose parse tree contains only the following rules:

start → BOF procedures EOF
procedures → main
main → INT WAIN LPAREN dcl COMMA dcl RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE 
type → INT
dcl → type ID
dcls →
statements →
expr → term 
term → factor 
factor → NUM  
factor → ID
factor → LPAREN expr RPAREN
These programs are very restricted: All these programs can do is return a value, which is either a numeric constant, or one of the parameters of wain.

Example Program

int wain(int a, int b) { return a; }

When run with mips.twoints, the return value in $3 should be the first parameter of wain (the value given for $1).

Assignment 8 [1 mark]

In addition to the rules required by A7P1, add support for the following additional rules:

procedures → procedure procedures
procedure → INT ID LPAREN params RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE
params →
params → paramlist
paramlist → dcl
paramlist → dcl COMMA paramlist
factor → ID LPAREN RPAREN
factor → ID LPAREN arglist RPAREN
arglist → expr
arglist → expr COMMA arglist

In other words, you must support non-wain procedures. Note that under this restricted set of grammar rules, procedures (including) wain can only return a numeric constant, return one of their parameters, or return the result of calling another procedure.

You do not need to support pointers (int* variables) in this problem.

Example Program

int p() { return 241;}
int wain(int a, int b) { return p(); }

When run with mips.twoints, the return value in $3 should be 241.

A8P1: Future Problems and Part Marks

On future Assignment 8 problems, you can always earn at least 1 mark even if you have not solved Problem 1. That is, there will always be 1 mark for tests that do not involve non-wain procedures.

Problem 2: Declarations, Assignment, and Unary Operations

Assignment 7 [2 marks]

Extend your code generator to support programs whose parse tree contains the following additional rules:

dcls → dcls dcl BECOMES NUM SEMI
statements → statements statement
statement → lvalue BECOMES expr SEMI
lvalue → ID
lvalue → LPAREN lvalue RPAREN

In other words, now you have to support declarations in the body of wain, and assignment statements.

Example Program

int wain(int a, int b) {
  int c = 241;
  b = c;
  a = b;
  return a;
}

When run with mips.twoints, the return value in $3 should be 241.

A7P2: Part Marks

The public test cases only contain declarations (no assignment statements), so you can earn 1 mark on this problem without implementing assignment statements.

Assignment 8 [3 marks]

Add support for pointer variables (int*) and unary pointer operations (dereference and address-of), which means that in addition to the rules required by A7P2 and A8P1, you must support the following additional rules:

type → INT STAR
dcls → dcls dcl BECOMES NULL SEMI
factor → NULL
factor → AMP lvalue
factor → STAR factor
lvalue → STAR factor

If NULL is dereferenced, the expected behaviour is that the WLP4 program crashes, as opposed to carrying on without issue. There will be a test that checks for this behaviour.

Additionally, because of the rules introduced in A7P2, all procedures must now support declarations and assignment statements in their bodies.

Example Program

int wain(int* a, int b) {
  *a = b;
  return *a;
}

When run with mips.array and the input array [241,241,241], the return value in $3 should be 3 (the size of the array, which is the second parameter of wain).

A8P2: Dependencies and Part Marks

Only 1 mark on this problem is associated with test cases that contain non-wain procedures. In other words, even if you did not complete A8P1, you can earn up to 2 marks on this problem.

Problem 3: Binary Operations

Assignment 7 [2 marks]

Extend your code generator to support programs whose parse tree contains the following additional rules:

expr → expr PLUS term
expr → expr MINUS term
term → term STAR factor
term → term SLASH factor
term → term PCT factor

In other words, you now have to support expressions with addition, subtraction, multiplication, division, and modulo.

Example Program

int wain(int a, int b) {
  return (240 + 1) - (2410 / 10);
}

When run with mips.twoints, the return value in $3 should be 0.

A7P3: Dependencies

The test cases in this problem do not contain declarations or assignment statements (the features implemented in A7P2), so you can earn full marks on this problem even if you have not solved A7P2.

Assignment 8 [1 mark]

Add support for pointer arithmetic. This does not involve supporting any new grammar rules other than the ones required by A7P3, but you will have to consider pointers when generating code for addition and subtraction.

Example Program

int wain(int* a, int b) {
  return *(a+2);
}

When run with mips.array and the input array [0,1,2,3,4], the return value in $3 should be 2.

A8P3: Dependencies

The tests for this problem will be programs that only contain the wain procedure. You can earn full marks on this problem even if you do not support other procedures (i.e., if you have not solved A8P1).

Problem 4: Control Structures

Assignment 7 [3 marks, 1 secret]

Extend your code generator to support programs whose parse tree contains the following additional rules:

statement → IF LPAREN test RPAREN LBRACE statements RBRACE ELSE LBRACE statements RBRACE
statement → WHILE LPAREN test RPAREN LBRACE statements RBRACE
test → expr EQ expr
test → expr NE expr
test → expr LT expr
test → expr LE expr
test → expr GE expr
test → expr GT expr

In other words, you now have to support if-else statements and while loops, as well as Boolean comparisons.

Example Program

int wain(int a, int b) {
  if(a < b) {
    while(a < b) {
      a = a + 1;
    }
  } else { 
    a = 241;
  }
  return a;
}

When run with mips.twoints, the return value in $3 depends on the value of the first parameter (given in $1) and the value of the second parameter (given in $2). If $1 < $2, the return value should match the value of $2. Otherwise, the return value should be 241.

A7P4: Dependencies and Part Marks

The public test cases do not contain if-else statements, so you can earn 1 mark on this problem if you just implement while loops.

The public test cases do contain assignment statements and arithmetic, so you need to have solved A7P2 and A7P3.

Assignment 8 [1 mark]

Add support for pointer comparisons. This does not involve supporting any new grammar rules other than the ones required by A7P4, but you will have to consider pointers when generating code for comparison tests.

To properly test pointer comparisons, Marmoset needs to test your code generator with comparisons that involve very large addresses. Technically, expressions like array+536870912 are considered undefined behaviour in C/C++ if the array has less than 536,870,912 elements, because the pointer ends up outside the range of the array. However, the Marmoset tests will use expressions like this because it is the easiest way to obtain large addresses in WLP4. We do not consider expressions like this undefined behaviour in WLP4 (although dereferencing an address outside the range of an array is undefined behaviour).

Example Program

int wain(int* a, int b) {
  int* last = NULL;
  last = a+b-1;
  while(a < last) {
    a = a+1;
  }
  return *a;
}

When run with mips.array and the input array [0,1,2,3,4], the return value in $3 should be 4, the last element of the array.

A8P4: Dependencies

The tests for this problem will be programs that only contain the wain procedure. You can earn full marks on this problem even if you do not support other procedures (i.e., if you have not solved A8P1).

Problem 5: External Libraries

Assignment 7 [2 marks, 1 secret]

Extend your code generator to support programs whose parse tree contains the following additional rule:

statement → PRINTLN LPAREN expr RPAREN SEMI

In other words, you now need to support the println statement.

Since this is the final problem on Assignment 7, you will be tested with programs that combine all the features you implemented on Assignment 7.

Example Program

int wain(int a, int b) {
  println(a+b);
  return 0;
}

When run with mips.twoints, the return value in $3 will be 0. However, the program should produce the value of $1 + $2 (the sum of the two parameters) on standard output.

Importing, Using, and Linking the Print Library

Supporting this statement introduces some complications, because unless you want to use your own print procedure from Assignment 2, you will need to import the print procedure from an external library, and link it with your generated code when testing.

The print library is available here: print.merl

To import the print procedure, include the following line at the start of your generated code:

.import print

You can then call "print" in the same way you would call any other MIPS procedure, as if there was a label called "print" in your code. The procedure expects the value to print in $1, and preserves all registers.

Unfortunately, our trusty assembler cs241.binasm does not know what ".import" means and will produce an error when it sees the ".import print" line. To test our code, we will need to use a different assembler that supports imports.

This assembler is called cs241.linkasm. It produces MERL object files instead of raw MIPS machine code.

Once you run the output of your code generator through cs241.linkasm, you need to use cs241.linker to link your code with the print library. That will produce a combined MERL file you can run with mips.twoints.

The sequence of commands to compile and run a WLP4 program (with mips.twoints) now looks like this:

wlp4scan < input.wlp4 | wlp4parse | wlp4type > input.wlp4ti
./wlp4gen < input.wlp4ti > output.asm
cs241.linkasm < output.asm > output.merl
cs241.linker output.merl print.merl > linked.merl
mips.twoints linked.merl

A7P5: Dependencies and Part Marks

For both the public and secret tests, you must have implemented the features from all the previous Assignment 7 problems to pass. There are no part marks for just implementing printing.

Assignment 8 [4 marks, 2 secret]

In addition to the rules required by A7P5 and A8P4, extend your code generator to support programs whose parse tree contains the following rules:

factor → NEW INT LBRACK expr RBRACK
statement → DELETE LBRACK RBRACK expr SEMI

In other words, you now need to support dynamic memory allocation with "new" and "delete".

After implementing this, your code generator should now support all WLP4 features. In addition to testing dynamic memory allocation, there will be test cases covering all the other features of WLP4 you have implemented.

Example Program

int wain(int a, int b) {
  int* array = NULL;
  array = new int[2];
  *array = a;
  *(array+1) = b;
  println(*array);
  println(*(array+1));
  delete [] array; 
  return b;
}

When run with mips.twoints, the return value in $3 will be the value of $2 (the second parameter). However, the program should also produce two lines of output to standard output containing the values of $1 and $2.

Importing, Using, and Linking the Allocation Library

The allocation library is available here: alloc.merl

This library actually contains three procedures:

Usage notes:

The assembly and linking process is similar to what we did for the print library, but there is one additional step needed for technical reasons. The "init" procedure that initializes the heap does not account for the metadata contained in MERL files. To ensure heap allocation works properly, we must use the cs241.merl tool to strip the metadata and obtain an ordinary MIPS file.

Additionally, the order of linking matters. The allocation library does not work properly unless it is linked last.

Accounting for these changes, the new sequence of commands to compile and run a WLP4 program (with mips.twoints) looks like:

wlp4scan < input.wlp4 | wlp4parse | wlp4type > input.wlp4ti
./wlp4gen < input.wlp4ti > output.asm
cs241.linkasm < output.asm > output.merl
cs241.linker output.merl print.merl alloc.merl > linked.merl
cs241.merl 0 < linked.merl > final.mips 2> /dev/null
mips.twoints final.mips

A8P5: Part Marks

You can earn 1 mark (the public test mark) on this problem if your code generator does not support non-wain procedures.

The 2 secret test marks require implementing non-wain procedures. However, 1 of the secret test marks only requires implementing non-wain procedures that have no parameters, so you have a chance of getting part marks even if you do not fully finish the implementation of non-wain procedures.

Bonus Problem: Code Size Optimization [+1% on course grade]

In this problem, Marmoset will run your compiler against a large test suite of complex programs, and measure the size in bytes of the generated MIPS code (after it is assembled).

You earn the bonus mark if:

This 180,000 byte threshold is difficult to meet and will require you to implement a number of optimization techniques to reduce the amount of code you generate.

There is an anonymous scoreboard showing the size in bytes achieved by each student who has passed all the test cases on the bonus problem. It is updated automatically every 5 minutes.