CS 241 — Winter 2021 — Assignment 8 (followed by 9 below)

Assignments for CS 241
← Assignment 7 Assignment 8 Assignment 9 ↓
Friday, March 26th at 5:00 pm Friday, April 2nd at 5:00 pm WEDNESDAY, April 14th at 11:59 pm
P1P2P3P4P5P6P7P8

In Assignments 8 and 9 you will complete the WLP4 compiler for which you wrote a scanner in Assignment 5, a parser in Assignment 6, and a context-sensitive analyzer in Assignment 7. For each problem, you will write a MIPS code generator for an increasingly larger subset of WLP4, culminating with a full WLP4-to-MIPS compiler. Each code generator will be a Racket or C++ program that takes as input a .wlp4i file and, if it conforms to the context-sensitive syntax of WLP4, produces as output a MIPS .asm file that may be assembled with cs241.binasm (or cs241.linkasm for Assignment 8 Problem 4 onwards) and then executed with mips.twoints or mips.array.

For Assignments 8 and 9, you should start with your context-sensitive analyzer from Assignment 7. However, all of the test inputs for Assignments 8 and 9 will be valid WLP4 programs. Therefore, if you did not complete Assignment 7 to reject all invalid WLP4 programs, you can still use your partial solution as a starting point for Assignments 8 and 9. In particular, if your program passes all tests for Assignment 7 Problems 1 to 4, you should have no issues doing Assignments 8 and 9. Even if you do not pass all tests, you may still be fine provided you pass all secret tests, and the non-secret tests you fail only involve invalid programs.

Additionally, for Assignment 8 and Assignment 9 Problems 1 and 2, the only part of Assignment 7 you need is the symbol table (Assignment 7 Problem 1). However, to complete the rest of Assignment 9, you will need a working solution for Assignment 7 Problems 1 through 4.

Testing

You must test your code generator yourself in Linux. To test your code generator, you will need .wlp4i files as generated by the parser specified in A6P5. In case you do not wish to use your own parser, you can use one that we have provided. Invoke it using the command wlp4parse.

Starting from a .wlp4 source file, the complete sequence of commands to generate MIPS machine language and run it is:

wlp4scan < foo.wlp4 > foo.scanned
wlp4parse < foo.scanned > foo.wlp4i
racket wlp4gen.rkt < foo.wlp4i > foo.asm
cs241.binasm < foo.asm > foo.mips
mips.twoints foo.mips OR mips.array foo.mips

OR

wlp4scan < foo.wlp4 > foo.scanned
wlp4parse < foo.scanned > foo.wlp4i
./wlp4gen < foo.wlp4i > foo.asm
cs241.binasm < foo.asm > foo.mips
mips.twoints foo.mips OR mips.array foo.mips

This can be abbreviated to

cat foo.wlp4 | wlp4scan | wlp4parse | racket wlp4gen.rkt | cs241.binasm > foo.mips
mips.twoints foo.mips

OR

cat foo.wlp4 | wlp4scan | wlp4parse | ./wlp4gen | cs241.binasm > foo.mips
mips.twoints foo.mips

The above can be made much easier to do with a shell script.

Marmoset Notes

For each question, Marmoset has public and release tests, and often has secret tests as well. When you release test a submission, Marmoset will only report on public and release tests. No information about secret tests will be visible until after the assignment deadline.

Although the inputs for these assignments are WLP4I files, Marmoset will not show you WLP4I files in the test results because they are large and hard to read. Marmoset will show you WLP4 source code. You will have to run the WLP4 programs Marmoset shows you through wlp4scan and wlp4parse to produce WLP4I files that you can test with, as explained above.

Since each problem builds upon the previous problems, you can submit the same program to every problem, provided it implements the required functionality for every problem.

If your program outputs the string ERROR to standard error for a test case, Marmoset will assume your compiler rejected the input program and you will fail the test. Since all inputs are valid for these assignments, there is never a reason for your compiler to output ERROR. If you are not confident in your solution to Assignment 7, you may wish to remove the error checking functionality so that you do not incorrectly reject programs.

For Assignment 8, secret tests are worth 30% of the Marmoset test marks.

For Assignment 9, secret tests are also worth 30% of the Marmoset test marks.

Problem 1 — 8 marks of 100 (filename: wlp4gen.rkt or wlp4gen.cc)

Write a code generator for correct WLP4 programs that conform to the syntax rules:

     start → BOF procedures EOF
     procedures → main
     main → INT WAIN LPAREN dcl COMMA dcl RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE
     type → INT
     dcls →
     dcl → type ID
     statements →
     expr → term
     term → factor
     factor → ID 

You may assume that the test input is a valid WLP4 program that satisfies the context-sensitive syntax of WLP4, and whose derivation contains only the above production rules.

Note: The only thing a WLP4 program conforming to this syntax can do is return the value of one of its parameters. Your MIPS output should assume that the parameter values are in $1 and $2 respectively, and return the result in $3.

Problem 2 — 8 marks of 100 (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your solution for Problem 1 to handle inputs whose context-free syntax conforms to the rules for Problem 1, and in addition:

     factor → LPAREN expr RPAREN

Problem 3 — 14 marks of 100, 4 secret marks (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your WLP4 compiler to handle WLP4 programs whose syntax is described by the production rules for Problems 1 and 2, and in addition:

     expr → expr PLUS term
     expr → expr MINUS term
     term → term STAR factor
     term → term SLASH factor
     term → term PCT factor
     factor → NUM

That is, you are to implement expressions with integer constants and the operators {+, -, *, /, %}

You do not need to worry about integer overflow for this problem. In particular, for multiplication, you should use mflo to get the result and ignore the value in the hi register.

Problem 4 — 14 marks of 100, 4 secret marks (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your WLP4 compiler to handle WLP4 programs whose syntax is described by the production rules for Problems 1 to 3, and in addition:

     statements → statements statement
     statement → PRINTLN LPAREN expr RPAREN SEMI

That is, you are to implement the println statement.

You may use the provided print.merl module to implement the println statement. You must download this file from the link in the previous sentence and transfer it to your Linux environment account – it is not accessed using source /u/cs241/setup. Using wget will not work, because the file is restricted and you need to log in to your student account to access it. (If you are unable to access this file and you are enrolled in the course, make a private Piazza post about the issue.)

The output from your compiler should now and henceforth be an assembly source file containing the line:

.import print
The .import directive lets you use a procedure from an external file via linking.

You can assemble such a file with cs241.linkasm instead of cs241.binasm. This will generate a MERL file. You should link this file with the library print.merl that we provide, using the command cs241.linker. This library contains a procedure labelled print, which outputs a decimal representation of the number passed to it in register $1. For example, you could assemble, link, and run the output of your code generator using the following commands:

cs241.linkasm < yourCompilersAssemblyLanguageOutput.asm > output.merl
cs241.linker output.merl print.merl > linked.merl
cs241.merl 0 < linked.merl > final.mips
mips.twoints final.mips
The cs241.merl command converts the linked MERL file to plain MIPS code, stripping out the relocation and linking metadata and relocating the program to run at address 0. This is not really necessary right now, but the memory allocation code you will use in Assignment 9 does not work correctly if MERL metadata is present.

Problem 5 — 12 marks of 100, 4 secret marks (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your WLP4 compiler to handle variable declarations and assignment statements. Your solution should handle WLP4 programs whose syntax conforms to the rules given in Problems 1 to 4, and in addition:

     dcls → dcls dcl BECOMES NUM SEMI
     statement → lvalue BECOMES expr SEMI
     lvalue → ID
     lvalue → LPAREN lvalue RPAREN

Problem 6 — 14 marks of 100, 4 secret marks (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your WLP4 compiler to handle WLP4 programs whose syntax is described by the production rules in Problems 1 to 5 and in addition:

     test → expr LT expr
     statement → WHILE LPAREN test RPAREN LBRACE statements RBRACE

That is, you are to implement while loops whose continuation condition is expressed with "<".

Problem 7 — 10 marks of 100, 4 secret marks (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your WLP4 compiler to handle WLP4 programs whose syntax is described by the production rules in Problems 1 to 6 and in addition:

     test → expr EQ expr
     test → expr NE expr
     test → expr LE expr
     test → expr GE expr
     test → expr GT expr

That is, you are to implement the other five comparison relations.

Problem 8 — 20 marks of 100, 10 secret marks (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your WLP4 compiler to handle WLP4 programs whose syntax is described by the production rules in Problems 1 to 7 and in addition:

     statement → IF LPAREN test RPAREN LBRACE statements RBRACE ELSE LBRACE statements RBRACE

This includes all WLP4 programs that do not involve pointers or procedures.

Click here to go back to the top of Assignment 8.


CS 241 — Winter 2021 — Assignment 9

Assignments for CS 241
↑ Assignment 8 Assignment 9 → Assignment 10
Friday, April 2nd at 5:00 pm WEDNESDAY, April 14th at 11:59 pm Friday, April 23rd at 11:59 pm
P1P2P3P4P5P6Bonus

Problem 1 — 10 marks of 100, 1 secret mark (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your WLP4 compiler to handle WLP4 programs whose syntax is described by the production rules in Assignment 8 and in addition:

     type → INT STAR
     dcls → dcls dcl BECOMES NULL SEMI
     factor → NULL
     factor → AMP lvalue
     factor → STAR factor
     lvalue → STAR factor
Additionally, pointer-type expressions should be allowed to appear in the following rules:
 test → expr EQ expr
 test → expr NE expr

For this problem and the following two problems, we will not be doing any pointer comparisons other than equality (==) or inequality (!=). Additionally, we will not be doing any pointer arithmetic; you may assume all expressions involving PLUS and MINUS only have integer operands.

This problem includes WLP4 programs that involve pointers, but do not do dynamic memory allocation, pointer arithmetic, or pointer comparisons other than equality or inequality.

Note: Although the C++ standard leaves the result of dereferencing a null pointer undefined, we require that dereferencing NULL must crash the MIPS machine. You may fail some Marmoset tests if dereferencing NULL does not cause a crash.

Problem 2 — 8 marks of 100 (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your WLP4 compiler to handle WLP4 programs whose syntax is described by the production rules in Assignment 8, Assignment 9 Problem 1, and in addition:

 factor → NEW INT LBRACK expr RBRACK
 statement → DELETE LBRACK RBRACK expr SEMI

The restrictions on pointer arithmetic and comparisons from Problem 1 still apply. That is, this problem includes WLP4 programs that do dynamic memory allocation, but do not do pointer arithmetic or pointer comparisons other than equality or inequality.

We provide a module alloc.merl that implements a memory allocator, which you should use to implement the new and delete statements. You must download this file from the link in the previous sentence and transfer it to your Linux environment account – it is not accessed using source /u/cs241/setup. Using wget will not work, because the file is restricted and you need to log in to your student account to access it. (If you are unable to access this file and you are enrolled in the course, make a private post on Piazza about the issue.)

Note that alloc.merl must be linked last.

The memory allocator contains three MIPS procedures: init, new, and delete. You must import each procedure individually (by writing .import init, .import new, and .import delete on three separate lines, rather than simply .import alloc).

The init procedure must be called once at the beginning of your generated program before any calls to new or delete.

The new procedure produces in register $3 the address of the beginning of a contiguous sequence of n words of memory, where n is the value passed in register $1 when the procedure was called. If new fails in the allocation, it will return 0. Your generated code should return NULL if new fails, rather than 0, since 0 is a valid memory address.

The delete procedure frees for reuse the block of memory whose address is passed in register $1 when the procedure is called. The address passed to delete must be an address that was earlier returned from new, and has not since already been passed to delete. You do not need to enforce this at compile time; it is the programmer's job to make sure that they use delete properly. However, note that passing NULL to delete is valid in C++ and the result is that nothing happens (no call to delete is made); your generated code should mimic this behaviour.

Marmoset uses a modified version of this memory allocator that prints debugging information to verify that you are calling init, new and delete correctly. Thus even if your program produces the correct output, it may not pass the Marmoset tests if you are not calling init, new and delete at the correct points with the correct argument values.

From now on, remember to use the cs241.merl command to remove MERL metadata from your program before executing it. Otherwise, heap allocation may overwrite part of the input array when used with mips.array. This is how you should assemble and link your compiler output from now on:

cs241.linkasm < yourCompilersAssemblyLanguageOutput.asm > output.merl
cs241.linker output.merl print.merl alloc.merl > linked.merl
cs241.merl 0 < linked.merl > final.mips

Problem 3 — 12 marks of 100 (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your WLP4 compiler to handle WLP4 programs whose syntax is described by the production rules in Assignment 8, Assignment 9 Problems 1 and 2, and in addition, allow for pointer-type expressions to appear in the following rules:

 expr → expr PLUS term
 expr → expr MINUS term

The restrictions on pointer comparisons from Problem 1 still apply. That is, this problem includes WLP4 programs that do pointer arithmetic but do not do pointer comparisons other than equality or inequality.

Note: To properly test the behaviour of pointer arithmetic and comparison, some of the test programs will use pointer arithmetic expressions that would be undefined in the context of C++. Here is how pointer arithmetic is defined in WLP4:

Note that the reduction modulo 232 if the address is out of range is already handled by the MIPS "add" and "sub" instructions, and by "mflo" after multiplication. You don't have to do anything special to implement this. In particular, the modulo/remainder operation is not needed to implement this behaviour.

While you might be able to pass the tests on Problem 3 if your compiler does not conform to this behaviour, you could encounter issues on the later problems (particularly Problem 4 and the Bonus Problem) if pointer arithmetic is not implemented in this way. The implementation described in the course notes conforms to the rules above.

You may assume the following behaviour will not appear in the Marmoset tests (for this and all other problems):

Problem 4 — 20 marks of 100, 9 secret marks (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your WLP4 compiler to handle WLP4 programs whose syntax is described by the production rules in Assignment 8, Assignment 9 Problems 1-3, and in addition, allow for pointer-type expressions to appear in the following rules:

 test → expr LT expr
 test → expr LE expr
 test → expr GE expr
 test → expr GT expr
This problem includes all WLP4 programs consisting of only the main procedure wain.

Note: To pass all the tests for pointer comparisons, make sure your implementation of pointer arithmetic follows the "Note" in Problem 3.

Problem 5 — 20 marks of 100, 10 secret marks (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your WLP4 compiler to handle WLP4 programs whose syntax is described by the production rules in Assignment 8, Assignment 9 Problems 1-4, and in addition:

procedures → procedure procedures
procedure → INT ID LPAREN params RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE
params →
factor → ID LPAREN RPAREN
This problem includes all WLP4 programs in which all procedures other than wain take no arguments.

Problem 6 — 30 marks of 100, 10 secret marks (filename: wlp4gen.rkt or wlp4gen.cc)

Extend your WLP4 compiler to add support for procedures with a non-zero number of parameters:

params → paramlist
paramlist → dcl
paramlist → dcl COMMA paramlist
factor → ID LPAREN arglist RPAREN
arglist → expr
arglist → expr COMMA arglist
Your compiler should now handle the full WLP4 language.

Bonus Problem — 6 bonus marks (filename: wlp4gen.rkt or wlp4gen.cc)

Modify your WLP4 compiler to reduce the size of the MIPS program that it generates for a particular WLP4 program (that is, the size in bytes of the binary MIPS program after it is assembled, not the number of lines or number of bytes in the assembly code that your compiler outputs). Although there exists a minimal MIPS program for a given WLP4 program, determining such a program is in general an undecidable problem.

Submissions that generate a MIPS program which is less than 180 000 bytes will get a small amount of bonus marks on top of their Assignment 9 grade.

A scoreboard of everyone's best submission for the bonus is available.

The list is anonymized but if you choose to enter, you can see how you are doing compared to others in the class.

Click here to go back to the top of Assignment 9.