Assignment 6: Code Generation (Part B)
In this assignment, you will build on your Assignment 5 compiler and implement code generation (i386 assembly, Intel dialect) for the complete Joos 1W language, including its object-oriented features.
You will also implement the following optimizations in your compiler:
- Register allocation using a live variable analysis and with move coalescing.
- Constant propagation and folding (and optionally, dead code removal) at the IR level.
By default, your
joosc compiler should perform these optimizations.
It should also provide three command-line options,
that control which optimizations are performed.
You will evaluate the performance improvements enabled by these optimizations
on your own benchmarks.
You are encouraged—but not required—to implement more optimizations, such as method inlining and common subexpression elimination. Groups whose compilers are the most correct and generate the fastest code may be rewarded.
Submit to Marmoset a
report.pdf accompanying your code submissions
for Assignments 5 and 6.
Your report should follow the guidelines.
The report should not exceed eight pages.
Your report should include the results of your performance evaluation.
Submit to Marmoset a
.zip archive. It should include everything required
to build and run your project. In particular, the
.zip file must
contain a file called
Makefile. Marmoset will run
make on this
Makefile to compile your compiler. The
Makefile must generate an
executable (binary or shell script) called
executable must accept multiple filenames as arguments. All of the files
listed on the
joosc command line, and only those files, are
considered part of the program being compiled.
javac , your
compiler should not look for classes in
.class files on the
CLASSPATH; it should read only the Joos 1W source files
listed on the command line. This means that all classes, including
classes such as
java.lang.Object, must be available in
source form and must be specified on the
joosc command line.
javac, Joos does not care what directory a source
file is in (i.e. it does not require the directory structure
of the source code to match the package structure).
However, the class declared in a file must still have the same name
as the filename.
For example, Java would require that the class
be declared in the file
Object.java in the directory
java/lang, whereas Joos only requires the file to
Object.java, but otherwise allows it to
be in any directory.
For the purposes of this course, a minimalist version of the
Java standard library is provided. This library can be found
in the directory
/u/cs444/pub/stdlib/6.0 in the
environment. Marmoset will include all
files in this library on the
joosc command line for
every test, in addition to other source file(s) specific to that
test. The following versioning scheme is used to make it possible
to correct errors and/or to extend the library for future assignments
(although we aim to minimize the number of changes that will be required).
The 6 in the directory name refers to Assignment 6, and the 0
is the first version of the library. Any corrections to the Assignment 6
version of the library will appear in the directories
As in previous assignments,
joosc should process the Joos 1W files given on the command line,
produce appropriate diagnostic messages on standard error,
and exit with one of the following Unix return codes:
- 0: the input file is valid Joos 1W
- 42: the input file is not valid Joos 1W
- any other value: your compiler crashed
If the input program is valid Joos 1W, your compiler should output,
into the subdirectory
output of the current working directory,
one or more files with the extension
.s containing the assembly
language code implementing the program. It is recommended that the
code implementing each class be generated into a separate
file. You may assume that the
output directory exists
before your compiler runs, and that the directory is empty.
After your compiler runs, each of the
.s files in the
directory will be assembled with the command:
/u/cs444/bin/nasm -O1 -f elf -g -F dwarf filename.s
After all of the files are successfully assembled, the
runtime.s from the standard library (see below for description)
will also be assembled and placed in the
Then, all of the
.o files generated by
nasm in the
output directory will be linked using the command:
ld -melf_i386 -o main output/*.o
Finally, the generated executable
main will be executed.
joosc compiler, the assembler and linker, and your final
main executable will be run on one of the Linux
CPU servers (e.g.,
One of the generated
.s files must define the global symbol
global _start _start:
When your program is run, execution will start from this point.
Unlike in Java, the first method that begins executing is not
static void main(String), but
static int test().
All of the test inputs will have such a method. The class containing
test method will be listed first on the
before any other compilation units.
The code that you generate
_start should initialize all static fields,
then call this method.
When the method returns with return value x,
your program should exit with exit code x using the
sys_exit system call. To execute this system call,
load the value 1 (indicating
sys_exit) into register
eax, load the exit code into register ebx, then execute the
Java specifies a very precise but complicated order in which static fields must be initialized (JLS 12.4). For Joos, the order is specified by the following rules:
- All static fields must be initialized before the startup
code calls the
static int test()method.
- Static fields within the same class must be initialized in the order in which they appear in the class.
- Static fields in different classes can be initialized in any order.
Note that Java and Joos require that any field without an explicit
initializer be initialized to the value
null, depending on its declared type.
Another simplification is that arrays in a Joos program inherit
clone() method from
In Java, arrays are required to implement a different clone
method as specified in Section 10.7 of the Java Language Specification.
runtime.s file included with the standard library
contains several utilities that are likely to be useful.
In particular, assembly code generated by your compiler in this assignment
will call the following functions:
- The function
__mallocallocates a number of bytes of memory. The number of bytes to be allocated must be in the register eax before executing the instruction
call __malloc. The address of the beginning of the allocated memory can be found on register eax after the call. There is no provision for freeing allocated memory; you should not need it for the simple programs that we will be testing with.
- The function
__exceptionends the program with exit code 13. You should call this function in any situation in which the equivalent Java code would throw an exception, such as a failed null check, array bounds check, or cast check.
- The function
NATIVEjava.io.OutputStream.nativeWriteis an implementation of the native method
nativeWritefound in the
java.io.OutputStreamclass of the standard library. The method writes the low-order byte of its parameter to the standard output. This method is used as the basis of more sophisticated output methods in the standard library such as
System.out.println(). In general, you should translate the call of any native method into a call to the symbol that begins with the string
NATIVE, followed by the canonical name of the class containing the native method, followed by a dot (
.), followed by the name of the native method. Although no native methods other than
nativeWritewill appear in the tests, you can use this mechanism in your own tests to implement additional interaction between your Joos code and the operating system. Recall that in Joos 1W, all native methods must be static, must take a single argument of type
int, and must return a value of type
int. The parameter passing conventions for native methods are the following. The argument must be placed in the register eax prior to calling the native method, and the return value can be found in the register eax after the native method returns.
Important note: Although most of the Marmoset tests
do not use or test standard output, all of the secret tests, which
are worth many marks, do. To get a non-zero mark on the secret tests, it
is essential to pass the
J1_Hello test for Assignment 6, which prints
Hello, World! to standard output.
Marmoset tests the code generated by your compiler on a server
running Linux. If you are using Windows or a Mac, the recommended
way to test the generated code is either to copy it to
and run it there, or to run it using Linux in a virtual machine.
Marmoset will test your compiler on the
servers with the
runtime.s file that is included with the
Joos standard library.
Before starting to implement these assignments, it is strongly recommended that you meet with your group to design and agree on conventions for
- parameter passing,
- local variable storage,
- object layout, class dispatch vector layout, interface dispatch table layout, and
- naming of labels for method implementations and data.
It is recommended that you document these conventions at this stage, and include this documentation in the report that you hand in. It is suggested that you modularize the implementation of these conventions in dedicated modules in your compiler, to ensure consistency between the different parts of your compiler that rely on the conventions.
Benchmarking your compiler optimizations
For each optimization (
-opt-reg-only), you should
create three benchmark programs that demonstrate the
effectiveness of the optimization.
It is recommended that for each of your benchmarks, the unoptimized version of the generated
code takes ~1–3 seconds to run natively, so that
the measurements are reliable but not too time-consuming.
A good way is to create benchmarks that perform optimizable computations
The benchmarks should be placed in the subdirectories
benchmarks/reg in your code submission.
Makefile should have a target
bench that compiles the benchmarks (with
the right command-line options), runs the generated code, and outputs a
benchmarks/results.csv file containing the measured running times.
For each benchmark, the csv file should contain a row in the following format:
<opt>,<benchmark name>,<time in ms w/ opt>,<time in ms w/o opt>,<speedup>
You will include your own measurements in your report.
The course staff will run
make bench to reproduce your evaluation results in
The archive should include a file named
a6.log showing the commit history
of your Git repository.
The archive should not include any extraneous non-source files. It should not include any files that ought to be automatically generated by building or running your compiler.
We reserve the right to deduct points if your submission does not meet the requirements above.
The Marmoset tests for the assignments take several minutes to run. Do not submit more than one submission at a time to Marmoset. If Marmoset reports that your previous submission has not been tested yet, do not submit another one. Denial-of-service attacks on Marmoset will result in disciplinary action.