Assignments for CS241
Due Friday, March 27th, 5:00 PM
In Assignments 7 and 8, you will write a code generator for WLP4, and finally complete the process of translating WLP4 source code to ARM64 assembly!
In Assignment 7, you implement the basic features of WLP4:
the wain procedure, integer variables and constants, declarations,
assignment, arithmetic, control flow, and printing.
In Assignment 8, you build on these features and add support for pointers, other procedures, and dynamic memory allocation.
In both Assignment 7 and Assignment 8, you are gradually building up one large program across multiple problems. You can submit the same program to problems P1 to P5 (on both assignments!) if you want.
For problems 1-5 across both assignments, you must submit one of the following files:
You can submit additional files if you wish to divide your program into multiple files, but a file with one of the above names must be present in your submission.
The input format on this assignment
is the
WLP4 Typed Intermediate (.wlp4ti)
format that you produced as the output for Assignment 6.
This is a preorder traversal of a WLP4 parse tree that includes
type information for expression nodes.
You will not need the type information in Assignment 7,
but it will be needed in Assignment 8.
Given some WLP4 source code, you can produce the corresponding .wlp4ti input file using the web tool in one
step, or using the following command on the Student Linux server:
cs241.wlp4scan < input.wlp4 | cs241.wlp4parse | cs241.wlp4type > input.wlp4ti
The output format is ARM64 assembly code (note: not compiled ARM64 machine code). The goal of this assignment is to produce ARM64 assembly code that implements the behaviour specified by the WLP4 program!
After running the .wlp4ti file through your code generator,
you can test your program by running it through cs241.binasm*
and then using cs241.arm64emu to run the resulting ARM64 machine code.
Here are the commands for running your generated code on the Student Linux Server, assuming your
code generator executable is named wlp4gen.
./wlp4gen < input.wlp4ti | cs241.binasm > output.bin
cs241.arm64emu output.bin
When your code is submitted, Marmoset will automatically link your generated code to the alloc.com and print.com libraries to enable dynamic memory allocation and printing functionality. This isn't relevant for problems 1 through 4 for both A7 and A8, but it allows you to submit the same executable to all 10 problems across both assignments and still have it pass. If you submit any auxiliary ARMCOM files alongside your code generation program, Marmoset will also automatically link your executable to those.
In these assignments, there is no "correct executable" to compare your
output to, because that concept does not make sense for a code generator.
Instead, test your code generator by comparing the behaviour of the
compiled program to the expected behaviour. If you are unsure
what the expected behaviour is, you can use
wlp4c
to compile the program, and compare "your compiled program" with
"cs241.wlp4c's compiled program".
cs241.wlp4c < input.wlp4 > expected.asm
Note that cs241.wlp4c
takes WLP4 source code as input, not a .wlp4ti file like your program,
and produces ARM64 machine code as output, not ARM64 assembly code like
your program.
You can then run both ARM64 programs and check if the return value in
x0 matches, as well as standard output if the program includes printing.
There are no particular efficiency requirements for these assignments, and most of the test programs are small. However, students have a tendency to accidentally write tree-building code that uses an exponential amount of memory in the tree size (particularly if they're trying to avoid using pointers, and don't understand C++ move/copy semantics). Exponential time or memory usage is not acceptable even on an assignment with relaxed efficiency requirements.
If you make this mistake, then you will likely exceed Marmoset's time or memory limit on tests with larger programs. The largest programs are reserved for the secret tests, so this could lead you to fail secret tests despite passing all the other tests.A7P1 on Marmoset includes a "Warning" test which is not for marks and is just intended to warn you about potential efficiency issues. Passing this test does not guarantee your compiler is efficient, though.
Write a code generator for WLP4 programs whose parse tree contains only the following rules:
start → BOF procedures EOF
procedures → main
main → LONG WAIN LPAREN dcl COMMA dcl RPAREN LBRACE dcls statements RETURN expr SEMI RBRACE
type → LONG
dcl → type ID
dcls →
statements →
expr → term
term → factor
factor → NUM
factor → ID
factor → LPAREN expr RPAREN
These programs are very restricted:
wain procedure
wain
long variables (no long*)
long wain(long a, long b) { return a; }
When run with cs241.arm64emu, the return value in x0 should be the first parameter of wain (the value given for x0).
Extend your code generator to support programs whose parse tree contains the following additional rules:
dcls → dcls dcl BECOMES NUM SEMI
statements → statements statement
statement → lvalue BECOMES expr SEMI
lvalue → ID
lvalue → LPAREN lvalue RPAREN
In other words, you now have to support declarations in the body
of wain, and assignment statements.
long wain(long a, long b) {
long c = 241;
b = c;
a = b;
return a;
}
When run with cs241.arm64emu, the return value in x0 should be 241.
The public test cases only contain declarations (no assignment statements), so you can earn 1 mark on this problem without implementing assignment statements.
Extend your code generator to support programs whose parse tree contains the following additional rules:
expr → expr PLUS term
expr → expr MINUS term
term → term STAR factor
term → term SLASH factor
term → term PCT factor
In other words, you now have to support expressions with addition, subtraction, multiplication, division, and remainder-taking.
long wain(long a, long b) {
return (240 + 1) - (2410 / 10);
}
When run with cs241.arm64emu, the return value in x0 should be 0.
The test cases in this problem do not contain declarations or assignment statements (the features implemented in A7P2), so you can earn full marks on this problem even if you have not solved A7P2.
Extend your code generator to support programs whose parse tree contains the following additional rules:
statement → IF LPAREN test RPAREN LBRACE statements RBRACE ELSE LBRACE statements RBRACE
statement → WHILE LPAREN test RPAREN LBRACE statements RBRACE
test → expr EQ expr
test → expr NE expr
test → expr LT expr
test → expr LE expr
test → expr GE expr
test → expr GT expr
In other words, you now have to support if-else statements and while loops, as well as boolean comparisons.
long wain(long a, long b) {
if(a < b) {
while(a < b) {
a = a + 1;
}
} else {
a = 241;
}
return a;
}
When run with cs241.arm64emu, the return value in x0 depends on the value of the first parameter (given in x0)
and the value of the second parameter (given in x1).
If x0 < x1, the return value should match the value of x1.
Otherwise, the return value should be 241.
The public test cases do not contain if-else statements, so you can earn 1 mark on this problem if you just implement while loops.
The public test cases do contain assignment statements and arithmetic, so you need to have solved A7P2 and A7P3.
Extend your code generator to support programs whose parse trees contain the following additional rules:
statement → PUTCHAR LPAREN expr RPAREN SEMI
statement → PRINTLN LPAREN expr RPAREN SEMI
factor → GETCHAR LPAREN RPAREN
In other words, you now need to support I/O, through the putchar and println statements, and the getchar expression.
Since this is the final main-line problem on Assignment 7, you will be tested with programs that combine all the features you implemented on Assignment 7.
long wain(long a, long b) {
println(a+b);
return 0;
}
When run with cs241.arm64emu, the return value in x0
will be 0. However, the program should produce the value of
x1 + x2 (the sum of the two parameters) on standard output.
long wain(long a, long b) {
putchar(getchar());
return 0;
}
When run with cs241.arm64emu, the return
value in x0 will be 0. However, the program should echo to standard output
the first character read from standard input.
Supporting the println statement introduces some complications, because
unless you want to use your own print procedure from Assignment 3, you will
need to import the print procedure from an external library, and link it
with your generated code when testing.
The print library is available here: print.com
To import the print procedure, include the following line at the start of your generated code:
.import print
You can then call "print" in the same way you would call any other
ARM64 procedure, as if there was a label called "print" in your code.
The procedure expects the value to print in x0, and preserves all registers.
Unfortunately, our beloved assembler cs241.binasm does not
know what .import means and will produce an error when it sees the
.import print line.
To test our code, we will need to use a different assembler that
supports imports.
That assembler is
cs241.linkasm.
It produces linkable ARMCOM object files instead of raw ARM64 machine code.
Once you run the output of your code generator through
cs241.linkasm, you need to use
cs241.linker
to link your code with the print library. That will produce
a combined ARMCOM file that you can run directly with cs241.arm64emu.
The sequence of commands to compile and run a WLP4 program on the Student Linux server
(with cs241.arm64emu)
now looks like this, assuming your code generator is named wlp4gen:
cs241.wlp4scan < input.wlp4 | cs241.wlp4parse | cs241.wlp4type | ./wlp4gen | cs241.linkasm > output.com
cs241.linker output.com print.com > linked.com
cs241.arm64emu linked.com
For both the public and secret tests, you must have implemented the features from all the previous Assignment 7 problems to pass. There are no part marks for just implementing printing.
In problems 6 and 7, we return to the glory days of writing ARM64 assembly programs. You are to write a loader that can load a ARMCOM file at an arbitrary location in memory, performing the necessary relocation to ensure the file will still run correctly.
We have provided some starter code in the form of three ARM64 procedures:
x0. If standard input has
fewer than 8 bytes remaining, the remainder is padded with zeroes.x0. If standard input has
fewer than 4 bytes remaining, the remainder is padded with zeroes.x0 to standard output, followed by a newline. It does not return a value.All procedures preserve all registers, excpet for those used to return values.
The starter code is provided as both assembly source code and an ARMCOM library. It is up to you whether you simply copy the source code into your submissions, or use the ".import" directive to import the procedures from the ARMCOM file. Marmoset will accept both options - just make sure that the ARMCOM files you submit are zipped together with your submission.
p6.asm)Write an ARM64 program that reads an ARMCOM file from standard input. Your program should print the contents of the ARM64 Program component of the ARMCOM file word-by-word, then return.
The full words which make up the ARM64 Program component should be printed using the provided printHex procedure. If the ARM64 Program component contains an odd number of halfwords, then the last halfword should be combined with 0 to form the final word.
Here is a simple ARM64 program:
ldr x2, 8
b 12
.8byte 241
br x30
You can assemble it into an ARMCOM file (we'll call it input.com) with
cs241.linkasm:
$ cs241.linkasm < input.asm > input.com
You can assemble your own ARM64 program with either
cs241.binasm (if you are not using
.import directives and just copied the starter code
into your program)
or with cs241.linkasm
and cs241.linker (if you are using .import).
On the Student Linux server:
# If not using .import $ cs241.binasm < p6.asm > p6.com # If using .import $ cs241.linkasm < p6.asm > p6.com $ cs241.linker p6.com starter.com > linked.com
If you pass the input.com file to your program, it should print the
the hexadecimal encoding of the inner ARM64 program to standard output, word by word, separated by newlines:
$ cs241.arm64emu [your program] < input.com 42 00 00 58 03 00 00 14 f1 00 00 00 00 00 00 00 c0 03 1f d6 00 00 00 00 ... Program exited normally.
It is not practical to run the loader using the web tools; this you must run on the Student Linux server.
Write an ARM64 program that reads a 64-bit binary integer value α from standard input, followed by an ARMCOM file. Your program should print the contents of the ARM64 Program segment of the ARMCOM file twice, then return.
The words in the ARM64 Program segment should be printed using the provided printHex procedure.
To print the ARM64 Program segment twice, you should write a program with two loops.
The ARMCOM file will have an empty footer (no REL, ESR, or ESD entries). You do not need to perform relocation.
When we say α is given as a 64-bit integer value, we mean that the value of α is encoded as a 64-bit unsigned integer in the first 8 bytes of standard input. We don't mean that the number is given as an ASCII string that you need to parse. Because of this, you should probably read it using readWord.
The address α will always be between 0x10000 and 0x100000. This means as long as your loader code has 8,192 instructions or less, you will not overwrite your own code when loading the provided ARM64 code.
To create test inputs for this problem (and future problems), you need to prepend the input ARMCOM file with the binary encoding of the load address α. For example, if the load address was 0x10000, you could do this as follows:
echo '.8byte 0x10000' | cs241.binasm > address.bin cs241.linkasm < input.asm > input.com cat address.bin input.com > input.in
Assuming input.com is created using the same example
program as in Problem 1, the output would be:
$ cs241.arm64emu [your program] < input.in 42 00 00 58 03 00 00 14 f1 00 00 00 00 00 00 00 c0 03 1f d6 00 00 00 00 42 00 00 58 03 00 00 14 f1 00 00 00 00 00 00 00 c0 03 1f d6 00 00 00 00 ... Program exited normally.
© University of Waterloo. | Design by TEMPLATED.