CS 444/644 - Compiler Construction (Winter 2021) - Assignment 2

For the second assignment, you will implement abstract syntax tree building, environment building, type name resolution, and hierarchy checking.

It is recommended but not required that your design follow the above four stages.

As in Assignment 1, you must hand in to Marmoset a .zip archive containing your source code. The .zip file must contain a file called Makefile. Marmoset will run make on this Makefile to compile your compiler. The Makefile must generate an executable (binary or shell script) called joosc.

Unlike in Assignment 1, joosc in this and future assignments must accept multiple filenames as arguments. All of the files listed on the joosc command line, and only those files, are considered part of the program being compiled.

Unlike javac and unlike the dOvs version of Joos, your joosc compiler should not look for classes in .class files on the CLASSPATH; it should read only the Joos 1W source files listed on the command line. This means that all classes, including classes such as java.lang.Object, must be available in source form and must be specified on the joosc command line. Unlike javac, Joos does not care what directory a source file is in (i.e. it does not require the directory structure of the source code to match the package structure). However, the class declared in a file must still have the same name as the filename. For example, Java would require that the class java.lang.Object be declared in the file Object.java in the directory java.lang, whereas Joos only requires the file to be named Object.java, but otherwise allows it to be in any directory.

For the purposes of this course, a minimalist version of the Java standard library is provided. This library can be found in the linux.student.cs environment in the directory /u/cs444/pub/stdlib/2.0. Marmoset will include all files in this library on the joosc command line for every test, in addition to other source file(s) specific to that test. The following versioning scheme is used to make it possible to correct errors and/or to extend the library for future assignments (although we aim to minimize the number of changes that will be required). The 2 in the directory name refers to Assignment 2, and the 0 is the first version of the library. Any corrections to the Assignment 2 version of the library will appear in the directories 2.1, 2.2, etc., and the version of the library for Assignment 3 will appear in the directory 3.0.

As in Assignment 1, joosc should process the Joos 1W files given on the command line, produce appropriate diagnostic messages on standard error, and exit with one of the following Unix return codes:

The Marmoset tests for this assignment take several minutes to run. Do not submit more than one submission at a time to Marmoset. If Marmoset reports that your previous submission has not been tested yet, do not submit another one. Denial-of-service attacks on Marmoset will result in disciplinary action.

You are not required to submit a design document for this assignment. However, Assignment 4 will require a design document covering your design decisions for Assignments 2, 3, and 4, so it is recommended that you start writing such a document. As for Assignment 1, the document should be organized to enable someone unfamiliar with your code to understand the structure of your compiler. In the document, discuss challenges that you encountered and how you tried to overcome them in your design and implementation. Also explain the testing that you did before submitting to Marmoset.

Abstract Syntax Tree Building

A parse tree built using an unambiguous grammar typically contains many redundant nodes. By recursively traversing the parse tree, it is possible to build an abstract syntax tree with the analogous structure but fewer nodes. The grammar of the abstract syntax tree can be ambiguous and therefore simpler, since the structure is obtained from the parse tree. An abstract syntax tree simplifies later phases of the compiler by reducing the number of nodes and the number of different kinds of nodes that each phase needs to handle. When a compiler is implemented in a statically typed language, distinct types are typically defined for each possible kind of tree node so that later compiler phases can use the type to document the tree nodes that they handle.

Environment Building

The environment building stage creates environments (containing classes, interfaces, fields, methods, local variables, and formal parameters) for each scope. Given a name of one of these entities, the environment should be able to locate the correct declaration of the entity.

After constructing all of the environments, the following restrictions of the Joos 1W language must be checked:

Type Linking

The type linking stage connects each use of a named type (class or interface) to the declaration of the type. At this stage, only names that can be syntactically (according to JLS 6.5.1) determined to be names of types need to be linked. Some names are syntactically ambiguous, in the sense that type checking must be done before it can be determined whether they are names of type or of other entities (see JLS 6.5). These ambiguous names will be linked in a later assignment.

When linking type names, the following restrictions of the Joos 1W language must be checked:

Hierarchy Checking

The third stage computes the inheritance relationships for classes, interfaces, methods, and fields, and checks that they conform to the various rules given in Chapters 8 and 9 of the Java Language Specification. Specifically, this stage should check that:

Hint/Warning

A common error in this assignment is that the compiler works correctly only when the source files are listed on the command line in a particular order, for example with all superclasses before their subclasses. When designing your compiler, be sure to consider that the source files could be listed in an arbitrary order. When testing your compiler, make sure that it still works correctly when the source files are reordered.