In Assignment 7 and 8, you will write a code generator for WLP4, and finally complete the process of translating WLP4 source code to MIPS assembly!
In Assignment 7, you implement the basic features of WLP4: the wain procedure, integer variables and constants, declarations, assignment, arithmetic, control flow, and printing.
In Assignment 8, you build on these features and add support for pointers, other procedures, and dynamic memory allocation.
In both Assignment 7 and Assignment 8, you are gradually building up one large program across multiple problems. You can submit the same program to all problems (on both assignments!) if you want.
For all problems, you must submit one of the following two files:
You can submit additional files if you wish to divide your program into multiple files, but a file with one of the above names must be present in your submission.
The input format on this assignment is the WLP4 Typed Intermediate (.wlp4ti) format that you produced as the output for Assignment 6. This is a preorder traversal of a WLP4 parse tree that includes type information for expression nodes. You will not need the type information in Assignment 7, but it will be needed in Assignment 8.
Given some WLP4 source code, you can produce the .wlp4ti input file using the following command:
wlp4scan < input.wlp4 | wlp4parse | wlp4type > input.wlp4ti
The output format is MIPS assembly code. The goal of this assignment is to produce MIPS assembly code that implements the behaviour specified by the WLP4 program!
After running the .wlp4ti file through your code generator,
you can test your program by running it through cs241.binasm
*
and then using either mips.twoints
or mips.array
,
depending on whether the WLP4 program expects two integers or an array as input.
Here are the commands for running it
with mips.twoints
, assuming your
code generator executable is named wlp4gen
.
./wlp4gen < input.wlp4ti > output.asm cs241.binasm < output.asm > output.mips mips.twoints output.mips
* In Problem 5 of each assignment, you will need to use a different assembler. More details are given in the problem.
In these assignments, there is no "correct executable" to compare your
output to, because that concept does not make sense for a code generator.
Instead, test your code generator by comparing the behaviour of the
compiled program to the expected behaviour. If you are unsure
what the expected behaviour is, you can use
wlp4c
to compile the program, and compare "your compiled program" with
"wlp4c
's compiled program".
wlp4c < input.wlp4 > expected.mips
Note that wlp4c
takes WLP4 source code as input, not a .wlp4ti file like your program,
and produces MIPS machine code as output, not MIPS assembly code like
your program.
You can then run both MIPS programs and check if the return value in $3 matches, as well as standard output if the program includes printing.
Ultimately, in Assignments 7 and 8 you are writing one large program, a code generator. You cannot get full marks on Assignment 8 without completing Assignment 7. But we have tried to set up the problems so that if you partially complete Assignment 7, then you will be able to partially complete Assignment 8.
In general, this is how the dependencies between problems work:
For example, suppose you only complete Assignment 7 up to Problem 3. You should be able to complete Assignment 8 up to Problem 3 as well.
The following diagram shows the dependencies graphically.
It may also be possible to earn part marks on some problems even if you haven't completed one of the problems it "depends" on. More details about opportunities for part marks will be added to the assignment page once Marmoset is set up.
To reflect the dependency structure described above, we have grouped the problems for each assignment together. That is, we only list five "problems" below, but each problem has an "Assignment 7 version" and an "Assignment 8 version". The "Assignment 8 version" builds on the "Assignment 7 version" by asking you to implement additional WLP4 features related to pointers and procedures.
There is also an optional "Bonus Problem" at the very end, which is to be completed after finishing both Assignment 7 and Assignment 8.
There are no particular efficiency requirements for these assignments, and most of the test programs are small. However, students have a tendency to accidentally write tree-building code that uses an exponential amount of memory in the tree size (particularly if they're trying to avoid using pointers, and don't understand C++ move/copy semantics). Exponential time or memory usage is not acceptable even on an assignment with relaxed efficiency requirements.
If you make this mistake, then you will likely exceed Marmoset's time or memory limit on tests with larger programs. The largest programs are reserved for the secret tests, so this could lead you to fail secret tests despite passing all the other tests.A7P1 on Marmoset includes a "Warning" test which is not for marks and is just intended to warn you about potential efficiency issues. Passing this test does not guarantee your compiler is efficient, though.
Write a code generator for WLP4 programs whose parse tree contains only the following rules:
start → BOF procedures EOF procedures → main main → INT WAIN LPAREN dcl COMMA dcl RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE type → INT dcl → type ID dcls → statements → expr → term term → factor factor → NUM factor → ID factor → LPAREN expr RPARENThese programs are very restricted:
int wain(int a, int b) { return a; }
When run with mips.twoints
, the return value in $3
should be the first parameter of wain (the value given for $1).
In addition to the rules required by A7P1, add support for the following additional rules:
procedures → procedure procedures procedure → INT ID LPAREN params RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE params → params → paramlist paramlist → dcl paramlist → dcl COMMA paramlist factor → ID LPAREN RPAREN factor → ID LPAREN arglist RPAREN arglist → expr arglist → expr COMMA arglist
In other words, you must support non-wain procedures. Note that under this restricted set of grammar rules, procedures (including) wain can only return a numeric constant, return one of their parameters, or return the result of calling another procedure.
You do not need to support pointers (int* variables) in this problem.
int p() { return 241;} int wain(int a, int b) { return p(); }
When run with mips.twoints
, the return value in $3
should be 241.
On future Assignment 8 problems, you can always earn at least 1 mark even if you have not solved Problem 1. That is, there will always be 1 mark for tests that do not involve non-wain procedures.
Extend your code generator to support programs whose parse tree contains the following additional rules:
dcls → dcls dcl BECOMES NUM SEMI statements → statements statement statement → lvalue BECOMES expr SEMI lvalue → ID lvalue → LPAREN lvalue RPAREN
In other words, now you have to support declarations in the body of wain, and assignment statements.
int wain(int a, int b) { int c = 241; b = c; a = b; return a; }
When run with mips.twoints
, the return value in $3
should be 241.
The public test cases only contain declarations (no assignment statements), so you can earn 1 mark on this problem without implementing assignment statements.
Add support for pointer variables (int*) and unary pointer operations (dereference and address-of), which means that in addition to the rules required by A7P2 and A8P1, you must support the following additional rules:
type → INT STAR dcls → dcls dcl BECOMES NULL SEMI factor → NULL factor → AMP lvalue factor → STAR factor lvalue → STAR factor
If NULL is dereferenced, the expected behaviour is that the WLP4 program crashes, as opposed to carrying on without issue. There will be a test that checks for this behaviour.
Additionally, because of the rules introduced in A7P2, all procedures must now support declarations and assignment statements in their bodies.
int wain(int* a, int b) { *a = b; return *a; }
When run with mips.array
and the input array [241,241,241], the return value in $3
should be 3 (the size of the array, which is the second
parameter of wain).
Only 1 mark on this problem is associated with test cases that contain non-wain procedures. In other words, even if you did not complete A8P1, you can earn up to 2 marks on this problem.
Extend your code generator to support programs whose parse tree contains the following additional rules:
expr → expr PLUS term expr → expr MINUS term term → term STAR factor term → term SLASH factor term → term PCT factor
In other words, you now have to support expressions with addition, subtraction, multiplication, division, and modulo.
int wain(int a, int b) { return (240 + 1) - (2410 / 10); }
When run with mips.twoints
, the return value in $3
should be 0.
The test cases in this problem do not contain declarations or assignment statements (the features implemented in A7P2), so you can earn full marks on this problem even if you have not solved A7P2.
Add support for pointer arithmetic. This does not involve supporting any new grammar rules other than the ones required by A7P3, but you will have to consider pointers when generating code for addition and subtraction.
int wain(int* a, int b) { return *(a+2); }
When run with mips.array
and the input array [0,1,2,3,4],
the return value in $3
should be 2.
The tests for this problem will be programs that only contain the wain procedure. You can earn full marks on this problem even if you do not support other procedures (i.e., if you have not solved A8P1).
Extend your code generator to support programs whose parse tree contains the following additional rules:
statement → IF LPAREN test RPAREN LBRACE statements RBRACE ELSE LBRACE statements RBRACE statement → WHILE LPAREN test RPAREN LBRACE statements RBRACE test → expr EQ expr test → expr NE expr test → expr LT expr test → expr LE expr test → expr GE expr test → expr GT expr
In other words, you now have to support if-else statements and while loops, as well as Boolean comparisons.
int wain(int a, int b) { if(a < b) { while(a < b) { a = a + 1; } } else { a = 241; } return a; }
When run with mips.twoints
, the return value in $3
depends on the value of the first parameter (given in $1)
and the value of the second parameter (given in $2).
If $1 < $2, the return value should match the value of $2.
Otherwise, the return value should be 241.
The public test cases do not contain if-else statements, so you can earn 1 mark on this problem if you just implement while loops.
The public test cases do contain assignment statements and arithmetic, so you need to have solved A7P2 and A7P3.
Add support for pointer comparisons. This does not involve supporting any new grammar rules other than the ones required by A7P4, but you will have to consider pointers when generating code for comparison tests.
To properly test pointer comparisons, Marmoset needs to test
your code generator with comparisons that involve very large
addresses. Technically, expressions like array+536870912
are considered undefined behaviour in C/C++ if the array has less than
536,870,912 elements, because the pointer ends up outside
the range of the array. However, the Marmoset tests will use
expressions like this because it is the easiest way to obtain
large addresses in WLP4. We do not consider expressions like
this undefined behaviour in WLP4
(although dereferencing an address outside the range of
an array is undefined behaviour).
int wain(int* a, int b) { int* last = NULL; last = a+b-1; while(a < last) { a = a+1; } return *a; }
When run with mips.array
and the input array [0,1,2,3,4],
the return value in $3
should be 4, the last element of the array.
The tests for this problem will be programs that only contain the wain procedure. You can earn full marks on this problem even if you do not support other procedures (i.e., if you have not solved A8P1).
Extend your code generator to support programs whose parse tree contains the following additional rule:
statement → PRINTLN LPAREN expr RPAREN SEMI
In other words, you now need to support the println statement.
Since this is the final problem on Assignment 7, you will be tested with programs that combine all the features you implemented on Assignment 7.
int wain(int a, int b) { println(a+b); return 0; }
When run with mips.twoints
, the return value in $3
will be 0. However, the program should produce the value of
$1 + $2 (the sum of the two parameters) on standard output.
Supporting this statement introduces some complications, because unless you want to use your own print procedure from Assignment 2, you will need to import the print procedure from an external library, and link it with your generated code when testing.
The print library is available here: print.merl
To import the print procedure, include the following line at the start of your generated code:
.import print
You can then call "print" in the same way you would call any other MIPS procedure, as if there was a label called "print" in your code. The procedure expects the value to print in $1, and preserves all registers.
Unfortunately, our trusty assembler cs241.binasm
does not
know what ".import" means and will produce an error when it sees the
".import print" line.
To test our code, we will need to use a different assembler that
supports imports.
This assembler is called
cs241.linkasm
.
It produces MERL object files instead of raw MIPS machine code.
Once you run the output of your code generator through
cs241.linkasm
, you need to use
cs241.linker
to link your code with the print library. That will produce
a combined MERL file you can run with mips.twoints
.
The sequence of commands to compile and run a WLP4 program
(with mips.twoints
)
now looks like this:
wlp4scan < input.wlp4 | wlp4parse | wlp4type > input.wlp4ti ./wlp4gen < input.wlp4ti > output.asm cs241.linkasm < output.asm > output.merl cs241.linker output.merl print.merl > linked.merl mips.twoints linked.merl
For both the public and secret tests, you must have implemented the features from all the previous Assignment 7 problems to pass. There are no part marks for just implementing printing.
In addition to the rules required by A7P5 and A8P4, extend your code generator to support programs whose parse tree contains the following rules:
factor → NEW INT LBRACK expr RBRACK statement → DELETE LBRACK RBRACK expr SEMI
In other words, you now need to support dynamic memory allocation with "new" and "delete".
After implementing this, your code generator should now support all WLP4 features. In addition to testing dynamic memory allocation, there will be test cases covering all the other features of WLP4 you have implemented.
int wain(int a, int b) { int* array = NULL; array = new int[2]; *array = a; *(array+1) = b; println(*array); println(*(array+1)); delete [] array; return b; }
When run with mips.twoints
, the return value in $3
will be the value of $2 (the second parameter).
However, the program should also produce two lines of output
to standard output
containing the values of $1 and $2.
The allocation library is available here: alloc.merl
This library actually contains three procedures:
init:
Initializes the heap. This should only be called once.
The call should occur before calls to new or delete. Takes a parameter in $2, which
is the size of the array that was given to the WLP4 program as input. If the WLP4
program does not expect an array as input (i.e., the parameters of wain are both
int
type), then pass 0 in $2. The procedure preserves all registers.
new:
Takes a parameter n
in $1 and attempts to allocate
a block of n
words. If the allocation succeeds, the address of the block
is returned in $3. If it fails, 0 is returned in $3. The procedure preserves all
registers except $3.
delete:
Takes a parameter in $1, which must be the address of a currently
allocated block in the heap, and frees the block, allowing the memory to be reused
by future allocations. The procedure preserves all registers.
Usage notes:
The assembly and linking process is similar to what we did for the print library, but there
is one additional step needed for technical reasons.
The "init" procedure that initializes the heap does not account for
the metadata contained in MERL files. To ensure heap allocation works properly,
we must use the
cs241.merl
tool to strip the metadata and obtain an ordinary MIPS file.
Additionally, the order of linking matters. The allocation library does not work properly unless it is linked last.
Accounting for these changes, the new sequence of commands to compile and run
a WLP4 program
(with mips.twoints
)
looks like:
wlp4scan < input.wlp4 | wlp4parse | wlp4type > input.wlp4ti ./wlp4gen < input.wlp4ti > output.asm cs241.linkasm < output.asm > output.merl cs241.linker output.merl print.merl alloc.merl > linked.merl cs241.merl 0 < linked.merl > final.mips 2> /dev/null mips.twoints final.mips
You can earn 1 mark (the public test mark) on this problem if your code generator does not support non-wain procedures.
The 2 secret test marks require implementing non-wain procedures. However, 1 of the secret test marks only requires implementing non-wain procedures that have no parameters, so you have a chance of getting part marks even if you do not fully finish the implementation of non-wain procedures.
In this problem, Marmoset will run your compiler against a large test suite of complex programs, and measure the size in bytes of the generated MIPS code (after it is assembled).
You earn the bonus mark if:
This 180,000 byte threshold is difficult to meet and will require you to implement a number of optimization techniques to reduce the amount of code you generate.
There is an anonymous scoreboard showing the size in bytes achieved by each student who has passed all the test cases on the bonus problem. It is updated automatically every 5 minutes.