CS 241 — Spring 2025 — Assignment 8

Assignment 8: Due Friday, July 25th, 5:00 pm / Out of 23 marks

In Assignment 8, you will finish writing a code generator for WLP4, completing the process of translating WLP4 source code to MIPS assembly. In Assignment 7, you implemented the basic features of WLP4 and in Assignment 8 you will build on these features and add support for pointers, other procedures, and dynamic memory allocation.

The sections on Filenames, Input, Output, and Testing, Dependencies and Efficiency Warnings in Assignment 7 also apply to this assignment. In particular, carefully consider the dependences between the two assignments, e.g. A8P1 depends on A7P1 and will build on that question, so it is worth your time to review A7P1 before starting A8P1. The same applies to P2 through to P5.

Problem 1: Simple Programs

Assignment 8 [2 marks, 1 release+secret]

In addition to the rules required by A7P1, add support for the following additional rules:

procedures → procedure procedures
procedure → INT ID LPAREN params RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE
params →
params → paramlist
paramlist → dcl
paramlist → dcl COMMA paramlist
factor → ID LPAREN RPAREN
factor → ID LPAREN arglist RPAREN
arglist → expr
arglist → expr COMMA arglist

In other words, you must support non-wain procedures. Note that under this restricted set of grammar rules, procedures (including) wain can only return a numeric constant, return one of their parameters, or return the result of calling another procedure.

You do not need to support pointers (int* variables) in this problem.

Example Program

int p() { return 241;}
int wain(int a, int b) { return p(); }

When run with mips.twoints, the return value in $3 should be 241.

A8P1: Future Problems and Part Marks

On future Assignment 8 problems, you can always earn at least 1 mark even if you have not solved Problem 1. That is, there will always be 1 mark for tests that do not involve non-wain procedures.

Problem 2: Declarations, Assignment, and Unary Operations

Assignment 8 [4 marks, 3 release+secret]

Add support for pointer variables (int*) and unary pointer operations (dereference and address-of), which means that in addition to the rules required by A7P2 and A8P1, you must support the following additional rules:

type → INT STAR
dcls → dcls dcl BECOMES NULL SEMI
factor → NULL
factor → AMP lvalue
factor → STAR factor
lvalue → STAR factor

If NULL is dereferenced, the expected behaviour is that the WLP4 program crashes, as opposed to carrying on without issue. There will be a test that checks for this behaviour.

Additionally, because of the rules introduced in A7P2, all procedures must now support declarations and assignment statements in their bodies.

Example Program

int wain(int* a, int b) {
  *a = b;
  return *a;
}

When run with mips.array and the input array [241,241,241], the return value in $3 should be 3 (the size of the array, which is the second parameter of wain).

A8P2: Dependencies and Part Marks

Only 1 mark on this problem is associated with test cases that contain non-wain procedures. In other words, even if you did not complete A8P1, you can earn up to 3 marks on this problem.

Problem 3: Binary Operations

Assignment 8 [2 marks, 1 release+secret]

Add support for pointer arithmetic. This does not involve supporting any new grammar rules other than the ones required by A7P3, but you will have to consider pointers when generating code for addition and subtraction.

Example Program

int wain(int* a, int b) {
  return *(a+2);
}

When run with mips.array and the input array [0,1,2,3,4], the return value in $3 should be 2.

A8P3: Dependencies

The tests for this problem will be programs that only contain the wain procedure. You can earn full marks on this problem even if you do not support other procedures (i.e., if you have not solved A8P1).

Problem 4: Control Structures

Assignment 8 [2 marks, 1 release+secret]

Add support for pointer comparisons. This does not involve supporting any new grammar rules other than the ones required by A7P4, but you will have to consider pointers when generating code for comparison tests.

To properly test pointer comparisons, Marmoset needs to test your code generator with comparisons that involve very large addresses. Technically, expressions like array+536870912 are considered undefined behaviour in C/C++ if the array has less than 536,870,912 elements, because the pointer ends up outside the range of the array. However, the Marmoset tests will use expressions like this because it is the easiest way to obtain large addresses in WLP4. We do not consider expressions like this undefined behaviour in WLP4 (although dereferencing an address outside the range of an array is undefined behaviour).

Example Program

int wain(int* a, int b) {
  int* last = NULL;
  last = a+b-1;
  while(a < last) {
    a = a+1;
  }
  return *a;
}

When run with mips.array and the input array [0,1,2,3,4], the return value in $3 should be 4, the last element of the array.

A8P4: Dependencies

Problem 5: I/O and External Libraries

Assignment 8 [5 marks, 2 release+secret, 2 secret]

In addition to the rules required by A7P5 and A8P4, extend your code generator to support programs whose parse tree contains the following rules:

factor → NEW INT LBRACK expr RBRACK
statement → DELETE LBRACK RBRACK expr SEMI

In other words, you now need to support dynamic memory allocation with "new" and "delete".

After implementing this, your code generator should now support all WLP4 features. In addition to testing dynamic memory allocation, there will be test cases covering all the other features of WLP4 you have implemented.

Example Program

int wain(int a, int b) {
  int *array = NULL;
  array = new int[2];
  *array = a;
  *(array+1) = b;
  println(*array);
  println(*(array+1));
  delete [] array; 
  return b;
}

When run with mips.twoints, the return value in $3 will be the value of $2 (the second parameter). However, the program should also produce two lines of output to standard output containing the values of $1 and $2.

Importing, Using, and Linking the Allocation Library

The allocation library is available here: alloc.merl

This library actually contains three procedures:

init: Initializes the heap. This should only be called once. The call should occur before calls to new or delete. Takes a parameter in $2, which is the size of the array that was given to the WLP4 program as input. If the WLP4 program does not expect an array as input (i.e., the parameters of wain are both int type), then pass 0 in $2. The procedure preserves all registers.
new: Takes a parameter n in $1 and attempts to allocate a block of n words. If the allocation succeeds, the address of the block is returned in $3. If it fails, 0 is returned in $3. The procedure preserves all registers except $3.
delete: Takes a parameter in $1, which must be the address of a currently allocated block in the heap, and frees the block, allowing the memory to be reused by future allocations. The procedure preserves all registers.

Usage notes:

You will need two different cases for calling "init", depending on whether wain is set up to take an array as input, or just two integers.
In WLP4, a failed allocation with "new" should return NULL, and by convention we treat NULL as 1 in WLP4. Note that the "new" procedure in the library returns 0 on a failed allocation, not 1.
In WLP4, the expected behaviour when "delete" is called on a NULL pointer is to do nothing. Deleting a NULL pointer should not cause a crash. However, the "delete" procedure in the library will likely crash if you pass it the value 1.

If you are not using the web tools, then one additional step is needed after linking for technical reasons. The "init" procedure that initializes the heap does not account for the metadata contained in MERL files. To ensure heap allocation works properly, we must use the cs241.merl tool to strip the metadata and obtain an ordinary MIPS file. The web version of cs241.linker strips MERL metadata itself, so does not require this additional step.

Additionally, the order of linking matters. The allocation library does not work properly unless it is linked last.

On the Student Linux server, accounting for these changes, the new sequence of commands to compile and run a WLP4 program (with mips.twoints) looks like:

wlp4scan < input.wlp4 | wlp4parse | wlp4type > input.wlp4ti
./wlp4gen < input.wlp4ti > output.asm
cs241.linkasm < output.asm > output.merl
cs241.linker output.merl print.merl alloc.merl > linked.merl
cs241.merl 0 < linked.merl > final.mips 2> /dev/null
mips.twoints final.mips

In the web version, the order of operations is unchanged; just make sure you include alloc.merl at the end.

A8P5: Part Marks

You can earn 1 mark (the public test mark) on this problem if your code generator does not support non-wain procedures.

The 2 secret test marks require implementing non-wain procedures. However, 1 of the secret test marks only requires implementing non-wain procedures that have no parameters, so you have a chance of getting part marks even if you do not fully finish the implementation of non-wain procedures.

Problem 6: Print, Execute, Print (filename: load.asm) [4 marks, 2 release+secret, 1 secret]

In Problems 6 and 7, you are gradually building the loader you started creating in A7P7. You can submit the same program to problems A7P7, A8P6, and A8P7 if you want. You may want to review Assignment 7 P6 before starting this question.

Write a MIPS program that reads input (from standard input) consisting of a 32-bit memory address α followed by a MERL file.

Your program should load the MIPS code segment of the MERL file into memory at address α, and print each word of the MIPS code segment as it gets loaded.
Then, it should jump to address α and execute the loaded MIPS code.
After the loaded MIPS code finishes running, print the loaded MIPS code (as it exists in memory at address α) then return.

All words should be printed using the provided printHex procedure.

Implementation Notes

The MERL file will have an empty footer (no REL, ESR, or ESD entries). You do not need to perform relocation.
The address α will always be between 0x10000 and 0x100000.
The MIPS code is a program, not a procedure, so it is not guaranteed to preserve any registers.
The loaded MIPS program may modify memory in the range between α (inclusive) and $30 (exclusive) when it is executed.
The loaded MIPS program may modify its own code in memory! When you print the loaded MIPS code after execution, these changes must be reflected in your output.
The loaded MIPS program won't necessarily preserve $30 (i.e., it might use the stack without properly restoring the stack pointer). However, it will not modify memory at addresses greater than or equal to the value of $30 that it is given.
Your loader must provide at least 0x10000 bytes of space to the loaded program. This means $30 must be greater than α + 0x10000 when you jump to the loaded MIPS program.
The MIPS code you are provided will be a valid program. It will not crash or get stuck in an infinite loop, as long as you follow the restrictions above.

Hint: If you don't understand all the implications of the notes above, just write code that jumps to the loaded MIPS program and see what happens. You may be able to earn part marks with a simple implementation that works most of the time and only fails on "weird" programs.

Example

Suppose that input.asm now contains the following self-modifying code, and the load address α is 0x10000.

beq $0, $0, skip
.word 0
skip: lis $3
.word 241
lis $4
.word 0x10004
sw $3, 0($4)
jr $31

Your program's output should be:

Problem 7: Print, Relocate, Execute, Print (filename: load.asm) [4 marks, 2 release+secret, 1 secret]

Write a MIPS program that reads input (from standard input) consisting of a 32-bit memory address α followed by a MERL file.

Your program should load the MIPS code segment of the MERL file into memory at address α, and print each word of the MIPS code segment as it gets loaded.
Next, it should read the footer of the MERL file and perform relocation on the loaded MIPS code.
Then, it should jump to address α and execute the loaded MIPS code.
After the loaded MIPS code finishes running, print the loaded MIPS code (as it exists in memory at address α) then return.

All words should be printed using the provided printHex procedure.

Implementation Notes

This time, the MERL file can have a non-empty footer. However, the footer of the MERL file will only contain REL entries. You will not encounter ESR and ESD entries.

All other implementation notes from Problem 6 still apply. The main difference is that in this problem, you will be tested with programs that do not work when loaded at nonzero addresses unless relocation is performed correctly.

Example

Suppose that input.asm now contains the following self-modifying code, and the load address α is 0x10000.

beq $0, $0, skip
changeMe: .word 0
skip:
lis $3
.word 241
lis $4
.word changeMe
sw $3, 0($4)
jr $31

Your program's output should be:

Notice that in the first print, the 6th line is 0x10. In the second print, it has been relocated to 0x10004.

Also, in the second print, the 2nd line has changed to 0xF1 due to the self-modifying code. The self-modification should work regardless of the load address α.

Bonus Problem: Code Size Optimization [+1% on course grade]

In this problem, Marmoset will run your compiler against a large test suite of complex programs, and measure the size in bytes of the generated MIPS code (after it is assembled).

You earn the bonus mark if:

Your compiler generates correct code for each program in the test suite and never crashes or exceeds Marmoset's resource limits.
The total combined size of all your generated code for the test suite is less than 180,000 bytes.

This 180,000 byte threshold is difficult to meet and will require you to implement a number of optimization techniques to reduce the amount of code you generate.

There is an anonymous scoreboard showing the size in bytes achieved by each student who has passed all the test cases on the bonus problem. It is updated automatically every 5 minutes.