Assignments for CS241
Due Monday, April 6th, 5:00 PM
In Assignment 8, you will finish writing a code generator for WLP4, completing the process of translating WLP4 source code to ARM64 assembly. In Assignment 7, you implemented the basic features of WLP4 and in Assignment 8 you will build on these features and add support for pointers, other procedures, and dynamic memory allocation.
The sections on Filenames, Input, Output, and Testing, Dependencies and Efficiency Warnings in Assignment 7 also apply to this assignment. In particular, carefully consider the dependences between the two assignments, e.g. A8P1 depends on A7P1 and will build on that question, so it is worth your time to review A7P1 before starting A8P1. The same applies to P2 through to P5.
In addition to the rules required by A7P1, add support for the following additional rules:
procedures → procedure procedures procedure → LONG ID LPAREN params RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE params → params → paramlist paramlist → dcl paramlist → dcl COMMA paramlist factor → ID LPAREN RPAREN factor → ID LPAREN arglist RPAREN arglist → expr arglist → expr COMMA arglist
In other words, you must support non-wain procedures. Note that
under this restricted set of grammar rules, procedures
(including wain) can only return a numeric constant, return one of their
parameters, or return the result of calling another procedure.
You do not need to support pointers (long* variables) in this problem.
long p() { return 241; }
long wain(long a, long b) { return p(); }
When run with cs241.arm64emu, the return value in x0
should be 241.
On future Assignment 8 problems, you can always earn at least 1 mark even if you have not solved Problem 1. That is, there will always be 1 mark for tests that do not involve non-wain procedures.
Add support for pointer variables (long*) and
unary pointer operations (dereference and address-of). This means that in addition to the rules required by A7P2 and A8P1,
you must support the following additional rules:
type → LONG STAR dcls → dcls dcl BECOMES NULL SEMI factor → NULL factor → AMP lvalue factor → STAR factor lvalue → STAR factor
If NULL is dereferenced, the expected behaviour is that the WLP4 program crashes, as opposed to carrying on without issue. There will be a test that checks for this behaviour.
Additionally, because of the rules introduced in A7P2, all procedures must now support declarations and assignment statements in their bodies.
long wain(long* a, long b) {
*a = b;
return *a;
}
When run with cs241.arm64emu
and the input array [241,241,241], the return value in x0
should be 3 (the size of the array, which is the second
parameter of wain).
Only 1 mark on this problem is associated with test cases that contain non-wain procedures. In other words, even if you did not complete A8P1, you can earn up to 3 marks on this problem.
Add support for pointer arithmetic. This does not involve supporting any new grammar rules other than the ones required by A7P3, but you will have to consider pointers when generating code for addition and subtraction.
long wain(long* a, long b) {
return *(a+2);
}
When run with cs241.arm64emu
and the input array [0,1,2,3,4],
the return value in x0
should be 2.
The tests for this problem will be programs that only contain the wain procedure. You can earn full marks on this problem even if you do not support other procedures (i.e., if you have not solved A8P1).
Add support for pointer comparisons. This does not involve supporting any new grammar rules other than the ones required by A7P4, but you will have to consider pointers when generating code for comparison tests.
To properly test pointer comparisons, Marmoset needs to test
your code generator with comparisons that involve very large
addresses. Technically, expressions like array+536870912
are considered undefined behaviour in C/C++ if the array has less than
536,870,912 elements, because the pointer ends up outside
the range of the array. However, the Marmoset tests will use
expressions like this because it is the easiest way to obtain
large addresses in WLP4. We do not consider expressions like
this to be undefined behaviour in WLP4
(although dereferencing an address outside the range of
an array is undefined behaviour).
long wain(long* a, long b) {
long* last = NULL;
last = a+b-1;
while(a < last) {
a = a+1;
}
return *a;
}
When run with cs241.arm64emu
and the input array [0,1,2,3,4],
the return value in x0
should be 4, the last element of the array.
The tests for this problem will be programs that only contain the wain procedure. You can earn full marks on this problem even if you do not support other procedures (i.e., if you have not solved A8P1).
In addition to the rules required by A7P5 and A8P4, extend your code generator to support programs whose parse tree contains the following rules:
factor → NEW LONG LBRACK expr RBRACK statement → DELETE LBRACK RBRACK expr SEMI
In other words, you now need to support dynamic memory allocation with "new" and "delete".
After implementing this, your code generator should now support all WLP4 features. In addition to testing dynamic memory allocation, there will be test cases covering all the other features of WLP4 you have implemented.
long wain(long a, long b) {
long *array = NULL;
array = new long[2];
*array = a;
*(array+1) = b;
println(*array);
println(*(array+1));
delete [] array;
return b;
}
When run with cs241.arm64emu, the return value in x0
will be the value of b (the second parameter).
However, the program should also produce two lines of output
to standard output
containing the values of a and b.
The allocation library is available here: alloc.com
This library actually contains three procedures:
init: Initializes the heap. This should only be called once.
The call should occur before calls to new or delete. Takes a parameter in x1, which
is the size of the array that was given to the WLP4 program as input. If the WLP4
program does not expect an array as input (i.e., the parameters of wain are both
long type), then pass 0 in x1. The procedure preserves all registers.
new: Takes a parameter n in x0 and attempts to allocate
a block of n words (each word is 8 bytes). If the allocation succeeds, the address of the block
is returned in x0. If it fails, 0 is returned in x0. The procedure preserves all
registers except x0.
delete: Takes a parameter in x0, which must be the address of a currently
allocated block in the heap, and frees the block, allowing the memory to be reused
by future allocations. The procedure preserves all registers.
Usage notes:
If you are not using the web tools, one additional step is required for proper compilation.
The "init" procedure that initializes the heap does not account for
the metadata contained in the ARMCOM file footer. To ensure heap allocation works properly,
we must link our executable with the allocation library using cs241.linker-striparmcom instead of cs241.linker. cs241.linker-striparmcom effectively runs cs241.striparmcom on the ARMCOM files after linking.
The web version of cs241.linker strips ARMCOM metadata automatically, and so does not require this step.
Additionally, the order of linking matters. The allocation library does not work properly unless it is linked last.
On the Student Linux server, accounting for these changes, the new sequence of commands to compile and run a WLP4 program looks like this, assuming your executable is called wlp4gen and your source file is called input.wlp4:
cs241.wlp4scan < input.wlp4 | cs241.wlp4parse | cs241.wlp4type | ./wlp4gen | cs241.linkasm > output.com cs241.linker-striparmcom output.com print.com alloc.com > linked.com cs241.arm64emu final.bin
In the web version, the order of operations is unchanged; just make sure you include alloc.com at the end.
You can earn 1 mark (the public test mark) on this problem if your code generator does not support non-wain procedures.
The 2 secret test marks require implementing non-wain procedures. However, 1 of the secret test marks only requires implementing non-wain procedures that have no parameters, so you have a chance of getting part marks even if you do not fully finish the implementation of non-wain procedures.
In Problems 6 and 7, you are gradually building the loader you started creating in A7P7. You can submit the same program to problems A7P7, A8P6, and A8P7 if you want. You may want to review Assignment 7 P6 before starting this question.
Write an ARM64 program that reads input (from standard input) consisting of a 64-bit memory address α followed by an ARMCOM file.
All words should be printed using the printHex procedure provided in A7.
sp (exclusive) when it is executed.sp (i.e., it might use the stack without properly restoring the stack pointer). However, it will not modify memory at addresses greater than or equal to the value of sp that it is given.sp must be greater than or equal to α + 0x10000 when you jump to the loaded ARM64 program.Hint: If you don't understand all the implications of the notes above, just write code that jumps to the loaded ARM64 program and see what happens. You may be able to earn part marks with a simple implementation that works most of the time and only fails on "weird" programs.
Suppose that input.asm contains the following self-modifying code, and the
load address α is 0x10000.
b skip .8byte 0 skip: ldr x2, 8 b 12 .8byte 241 ldr x3, 8 b 12 .8byte 0x10004 stur x2, [x3, 0] br x30
Your program's output should be like below. The coloured text highlights the program changing its .8byte field to 241.
$ cs241.arm64emu linked.com < testalpha.in 03 00 00 14 00 00 00 00 00 00 00 00 42 00 00 58 03 00 00 14 f1 00 00 00 00 00 00 00 43 00 00 58 03 00 00 14 04 00 01 00 00 00 00 00 62 00 00 f8 c0 03 1f d6 00 00 00 00 03 00 00 14 f1 00 00 00 00 00 00 00 42 00 00 58 03 00 00 14 f1 00 00 00 00 00 00 00 43 00 00 58 03 00 00 14 04 00 01 00 00 00 00 00 62 00 00 f8 c0 03 1f d6 00 00 00 00 ... Program exited normally.
Write an ARM64 program that reads input (from standard input) consisting of a 64-bit memory address α followed by an ARMCOM file.
All words should be printed using the provided printHex procedure.
This time, the ARMCOM file can have a non-empty footer. However, the footer of the ARMCOM file will only contain REL entries. You will not encounter ESR and ESD entries.
All other implementation notes from Problem 6 still apply. The main difference is that in this problem, you will be tested with programs that do not work when loaded at nonzero addresses unless relocation is performed correctly.
Suppose that input.asm
now contains the following self-modifying code,
and the load address α is 0x10000.
b skip changeMe: .8byte 0 skip: ldr x1, 8 b 12 .8byte 241 ldr x4, 8 b 12 .8byte changeMe stur x1, [x4, 0] // store 241 at changeMe br x30
Your program's output should be like below. The red text highlights the changeMe label being relocated to point to 0x10004, and the green text highlights the code still correctly changing the value in changeMe to 241.
$ cs241.arm64emu linked.com < testalpha.in 03 00 00 14 00 00 00 00 00 00 00 00 41 00 00 58 03 00 00 14 f1 00 00 00 00 00 00 00 44 00 00 58 03 00 00 14 18 00 00 00 00 00 00 00 81 00 00 f8 c0 03 1f d6 00 00 00 00 03 00 00 14 f1 00 00 00 00 00 00 00 41 00 00 58 03 00 00 14 f1 00 00 00 00 00 00 00 44 00 00 58 03 00 00 14 04 00 01 00 00 00 00 00 81 00 00 f8 c0 03 1f d6 00 00 00 00 ... Program exited normally.
In this problem, Marmoset will run your compiler against a large test suite of complex WLP4 programs, and measure the total number of instructions executed by your final generated programs on specific inputs.
You earn the bonus mark if your program is in the Top 5 in terms of lowest total instruction count across all the test cases. You will also earn the mark if your submission's final instruction tally is within 20% of the highest scoring submission.
You can view the anonymous scoreboard here. Submit the compiler (filename is wlp4gen.(cc/cpp/rkt), same as you did for A8P5) to the problem labelled a8bonus on Marmoset.
If you wish to give yourself a custom name on the scoreboard, you can submit a file called name.txt alongside your submission, and put your desired name in that file. The name can be up to 50 characters long, and should only contain alphanumeric characters, spaces, and hyphens.
© University of Waterloo. | Design by TEMPLATED.