CS 444/644 - Compiler Construction (Winter 2026) - Assignment 1

Scanning, Parsing, Weeding

The first assignment is to implement lexical and syntactic analysis. The scanner splits the input into tokens and catches lexical errors (inputs that cannot form valid tokens). The parser checks that the resulting list of tokens conforms to a syntax specified by a context-free grammar. The weeding stage detects simple errors that could in principle be checked by a context-free grammar, but only at the cost of making the grammar very complicated. It is recommended that your design follow these three stages; this is not required, as long as you implement the lexical and syntactic specification of Joos 1W.
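To make the scanner stage concrete, here is a minimal sketch in Python. It is not a Joos 1W scanner: the token classes, the names TOKEN_SPEC, LexicalError, and scan, and the tiny operator set are all hypothetical choices for illustration, and keywords, comments, and character and string literals are left out. It relies on ordered regular-expression alternation (longer operators listed first) rather than a hand-built DFA with maximal munch.

```python
import re

# Hypothetical token classes for a tiny subset of the language.
# Order matters: Python tries alternatives left to right, so longer
# operators such as "<=" must be listed before "<".
TOKEN_SPEC = [
    ("WHITESPACE", r"[ \t\r\n]+"),
    ("INT_LITERAL", r"0|[1-9][0-9]*"),
    ("IDENTIFIER", r"[A-Za-z_$][A-Za-z0-9_$]*"),
    ("OPERATOR", r"==|<=|>=|!=|&&|\|\||[-+*/%=<>!]"),
    ("SEPARATOR", r"[(){}\[\];,.]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


class LexicalError(Exception):
    """Raised when the input at some offset cannot start any token."""


def scan(source):
    """Return a list of (token_class, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise LexicalError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "WHITESPACE":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For example, scan("x = 42;") yields an IDENTIFIER, an OPERATOR, an INT_LITERAL, and a SEPARATOR token, while scan("a # b") raises LexicalError because no token class in this sketch can start with "#".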

The following restrictions of the Joos 1W language must be checked (either in the parser or in the weeder):

This phase of the compiler should also determine the values of all literals in the program. For integer literals, it must check that the literal is within the range of a Java int. For character and string literals, it must expand any escape sequences in the literal.
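The two literal checks can be sketched as follows. This is one possible approach, not a required design. The function names int_literal_value and expand_escapes and the negated parameter are hypothetical. The sketch assumes Joos 1W uses Java's decimal int range and Java's escape set (simple escapes plus octal escapes); confirm both against the Joos 1W specification. It also assumes the caller knows whether a literal is the direct operand of unary minus, since in Java 2147483648 is legal only in that position.

```python
INT_MAX = 2**31 - 1  # largest value of a Java int

# Java's single-character escapes (assumed to apply to Joos 1W as well).
SIMPLE_ESCAPES = {
    "b": "\b", "t": "\t", "n": "\n", "f": "\f", "r": "\r",
    '"': '"', "'": "'", "\\": "\\",
}
OCTAL_DIGITS = "01234567"


def int_literal_value(digits, negated=False):
    """Return the value of a decimal literal, or raise ValueError if out of range.

    `negated` is True when the literal is the direct operand of unary minus.
    """
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a decimal literal: {digits!r}")
    magnitude = int(digits)
    limit = INT_MAX + 1 if negated else INT_MAX
    if magnitude > limit:
        raise ValueError(f"integer literal out of range: {digits}")
    return -magnitude if negated else magnitude


def expand_escapes(body):
    """Expand escapes in the text between the quotes of a char or string literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        i += 1  # skip the backslash
        if i >= len(body):
            raise ValueError("backslash at end of literal")
        ch = body[i]
        if ch in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[ch])
            i += 1
        elif ch in OCTAL_DIGITS:
            # Up to three octal digits if the first is 0-3, otherwise up to two,
            # so the value never exceeds octal 377.
            max_len = 3 if ch in "0123" else 2
            j = i
            while j < len(body) and j - i < max_len and body[j] in OCTAL_DIGITS:
                j += 1
            out.append(chr(int(body[i:j], 8)))
            i = j
        else:
            raise ValueError(f"invalid escape sequence \\{ch}")
    return "".join(out)
```

With these definitions, int_literal_value("2147483648") raises ValueError while int_literal_value("2147483648", negated=True) returns -2147483648, and expand_escapes applied to the four characters \101 returns "A".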

Code Submission

Submit to Marmoset a .zip archive containing everything required to build and run your project. It should also include all the test cases and test code that you used to test your compiler. In your document, be sure to state where these files are located in the archive. The archive must also include a file named git.log showing the commit history of your Git repository.

In addition to the above, the .zip file should contain a file named Makefile in the root directory. Marmoset will run make on this Makefile to build your compiler. The Makefile must generate an executable (binary or shell script) called joosc.

When run with a single argument, a filename, joosc should process the given Joos 1W file, produce appropriate diagnostic messages on standard error, and exit with one of the following Unix return codes:

Neither your build process nor the execution of joosc may send data to or receive data from the internet in any way.

The Marmoset tests for this assignment take several minutes to run. Do not submit more than one submission at a time to Marmoset. If Marmoset reports that your previous submission has not been tested yet, do not submit another one. Denial-of-service attacks on Marmoset will result in disciplinary action.

It is recommended that you automate the creation of the .zip file (e.g. using a shell script or some kind of build tool), since you will be creating the .zip file repeatedly throughout the course.
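One way to automate this is a short Python script; the following is a sketch under stated assumptions, not a prescribed tool. The function names build_submission_zip and missing_required_files are hypothetical, as is the default choice to skip the .git directory. The script assumes git.log has already been written to the project root (for example with `git log > git.log`). Only the two file names Makefile and git.log come from the requirements above.

```python
import os
import zipfile

# Files the assignment requires in the root of the archive.
REQUIRED_FILES = ("Makefile", "git.log")


def build_submission_zip(project_root, zip_path, excluded_dirs=(".git",)):
    """Zip everything under project_root with paths relative to it.

    Keeping paths relative means Makefile lands in the archive root.
    `excluded_dirs` is an assumed convenience; adjust it for your project.
    """
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for dirpath, dirnames, filenames in os.walk(project_root):
            # Prune excluded directories in place so os.walk skips them.
            dirnames[:] = sorted(d for d in dirnames if d not in excluded_dirs)
            for name in sorted(filenames):
                full_path = os.path.join(dirpath, name)
                if os.path.abspath(full_path) == zip_abs:
                    continue  # never add the archive to itself
                archive.write(full_path, os.path.relpath(full_path, project_root))
    return zip_path


def missing_required_files(zip_path):
    """Return the required root-level files that are absent from the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        names = set(archive.namelist())
    return [name for name in REQUIRED_FILES if name not in names]
```

Calling missing_required_files on the result before each upload gives a cheap check that the archive is not missing Makefile or git.log.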

Document Submission

Submit to Marmoset a PDF document explaining the design of this phase of your compiler. The document will be evaluated based on both writing and technical content. Good writing often reflects good thinking. Be clear, thorough, and concise.

The document should be no more than five letter-sized pages long, should use a 10-12 point font, and should use reasonable margins and line spacing. It should have a title and should list the names and WatIAM (Quest) userids of the group members.

Documents submitted after the assignment deadline will not be marked and will receive a mark of zero. If you cannot finish the implementation by the deadline, document what you have by the deadline, and explain any unfinished parts in your document.

The document should typically cover the following topics:

Design and Implementation

This should be the main part of your document. Things to consider:

Your goal is to enable someone unfamiliar with your compiler to understand it (without looking at the source code) and to convince the course staff that you have thought carefully about the construction of your compiler.

Testing and Known Issues

Describe your test cases and the results of testing. Detail any known issues with your compiler. Your compiler will be evaluated on secret Marmoset tests at the end of the term, so your own tests should aim for good coverage of both the input space and your code.

Contributions

Describe what each group member did.

Thoughts

The document will be hand-marked, with 50% of the marks for organization, clarity, and style, and 50% of the marks for technical content.