Various tools are used in CS241. Historically, they were all available only on the student Linux server. They are now slowly being duplicated as browser/web-based versions, to avoid overloading the servers. The web-based versions of these tools are available here.
All of these tools are available on the student Linux servers (linux.student.cs.uwaterloo.ca), by running source /u/cs241/setup.
After you have run the setup command, simply type the name of the tool to use it. The tools do not get copied to your user folder or anything like that; the setup script simply does some configuration that lets you run the tools from anywhere.
The setup command only lasts for one login session. If you do not want to manually enter it every time you log in, put the command in the hidden .bash_profile file in your home directory, or in .profile if .bash_profile does not exist.
| Tool | Primarily Used In | Purpose |
|---|---|---|
| marmoset_submit | All Assignments | Command line submission to Marmoset |
| cs241.binview | Assignment 1, 2 & 3 | Binary file viewer |
| cs241.wordasm | Assignment 1 | ARM64 assembler (limited to .8byte directives) |
| cs241.binasm | Assignment 1, 2, 3, 7 & 8 | ARM64 assembler |
| cs241.arm64emu | Assignment 1, 2, 7 & 8 | ARM64 emulator |
| cs241.dfa | Assignment 3 | DFA checker and recognizer |
| cs241.smm | Assignment 4 | Run Simplified Maximal Munch using a DFA |
| cs241.wlp4c | Assignment 4, 6, 7 & 8 | WLP4 semantic analyzer and compiler |
| cs241.wlp4scan | Assignment 4, 5, 6, 7 & 8 | WLP4 scanner |
| cs241.cfgcheck | Assignment 5 | CFG syntax checker |
| cs241.slr | Assignment 5 | SLR(1) DFA generator |
| cs241.wlp4parse | Assignment 5, 6, 7 & 8 | WLP4 parser |
| cs241.wlp4type | Assignment 6, 7 & 8 | WLP4 semantic analyzer |
| cs241.linkasm | Assignment 7 & 8 | ARM64 assembler (produces linkable ARMCOM files) |
| cs241.linker | Assignment 7 & 8 | ARMCOM linker |
| cs241.striparmcom | Assignment 7 & 8 | Strips ARMCOM metadata |
Usage: marmoset_submit COURSE PROJECT FILE1 [FILE2] ...
This tool lets you submit to Marmoset from the command line. For this course the COURSE value should always be CS241. The PROJECT field should be filled with the name of the problem you are submitting to, like a4p2 or a9bonus. Problem names can be viewed on the Marmoset web interface. When using this tool the problem names must exactly match the name shown on the Marmoset web interface, respecting capitalization. After PROJECT, simply list out all files you wish to submit for the chosen problem. A sample is shown below:
$ marmoset_submit CS241 a1p5 self-modifying.hex
For many problems you will just be submitting a single file, but you can provide multiple files on the command line. For example, if your solution for a question is split into several modules, you can submit those modules together as long as your main required executable is still present. In this course, even if the assignment only specifies one particular file name, you are allowed to submit other files alongside the one with that name. For C++ programs, Marmoset will simply pass all .cc files you submit to the compiler; a Makefile is not needed, and in fact any Makefile you submit will be ignored. (This is specific to the way CS241's Marmoset scripts are set up and may not apply to other courses.)
Note that this is just a wrapper script for the command /u/cs_build/bin/marmoset_submit. You can use the /u/cs_build/bin/marmoset_submit command directly without running the CS241 setup command. You can also use marmoset_submit (whether through our wrapper script or directly using the cs_build command) in other courses that use Marmoset by just replacing CS241 with that course name.
Usage: cs241.binview [OPTION]... FILE
This tool lets you view the binary data stored in files in a human-readable form.
If you view a file with cat or open it in a text editor like vim, your terminal will attempt to display the binary data as ASCII (or maybe Unicode) characters.
For a text file, this is exactly what you want, but for something like ARM64 machine code it will probably look like gibberish.
The cs241.binview tool formats and prints the binary data as a sequence of 0 and 1 characters. It also has options for printing in decimal, hexadecimal, and ASCII.
Suppose the file program.bin contains the machine code version of the following ARM64 program (translated using an assembler like cs241.binasm):
add x1, x2, x3
br x31
You can use cs241.binview with the --all option to view the machine code in binary, decimal, hexadecimal and ASCII form.
$ cs241.binview --all program.bin
#65 #96 #35 #139
0x41 0x60 0x23 0x8B
A ` # \x8B
01000001 01100000 00100011 10001011
#224 #3 #31 #214
0xE0 0x03 0x1F 0xD6
\xE0 ^ETX ^US \xD6
11100000 00000011 00011111 11010110
If --all or other similar options are not used, the default is to produce binary output only.
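To make the output formats concrete, here is a minimal Python sketch that formats the first four bytes of the example above in the decimal, hexadecimal, and binary styles. It is only an illustration, not the tool's implementation: the function name `format_bytes` is made up, and the ASCII row is left out because the tool's naming of non-printable characters is not described here.

```python
def format_bytes(data: bytes) -> dict:
    """Render raw bytes as binview-style decimal, hex, and binary rows."""
    return {
        "decimal": " ".join(f"#{b}" for b in data),
        "hex": " ".join(f"0x{b:02X}" for b in data),
        "binary": " ".join(f"{b:08b}" for b in data),
    }


# First four bytes of program.bin from the example above.
rows = format_bytes(bytes([0x41, 0x60, 0x23, 0x8B]))
for name, row in rows.items():
    print(f"{name:8}{row}")
```

Each byte is formatted independently, which is why the rows line up column by column in the tool's output.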
Notes:
- Run cs241.binview without arguments to get a help message showing all the options.
- To pipe data into cs241.binview, you must pass the argument - to tell cs241.binview to read from standard input, like this: cs241.binasm < input.asm | cs241.binview -
cat.
- cs241.binview is similar to xxd. It is basically a version of xxd where the output format and options are more tailored to the needs of CS 241.
Usage: cs241.wordasm < PROGRAM.hex
This is a very restricted ARM64 assembler that only supports the .8byte directive. No other instructions or features like labels are available.
The tool reads ASCII text from standard input. The text should consist of a series of lines, one line for each 64-bit word of binary output.
Each line consists of the string ".8byte" followed by a string giving either the hexadecimal (prefixed with "0x") or decimal (no prefix) representation of the 64-bit word.
Two slashes (//) can be used to start a single-line comment, as in the examples in this section.
Suppose the file cs241.hex contains the following text:
.8byte 0x6f72203134325343 // "CS241 ro"
.8byte 0x000000000a736b63 // "cks(newline) (4 null characters)"
This is the encoding of the string "CS241 rocks" (with a newline at the end).
Redirecting this file into cs241.wordasm gives the following output on the terminal:
$ cs241.wordasm < cs241.hex
CS241 rocks
Because of the little-endianness of ARM64, the characters in each number are byte-encoded in reverse order. Hence, when converting from the 8 byte value to the corresponding string, the characters must be read in reverse order, as shown below.
IN  : 0x6f72203134325343
-----------------------------
Byte: 6f 72 20 31 34 32 53 43
Char: o  r  SP 1  4  2  S  C   (SP = space)
-----------------------------
OUT : CS241 ro
You will mostly use cs241.wordasm to write ARM64 machine code programs, but as this example shows, it can be used to create any kind of binary data (as long as the length in bytes is a multiple of 8). In this example, ASCII text was produced as output. When producing machine code output, you will likely want to redirect the output to a file, like this:
cs241.wordasm < input.hex > output.bin
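The input format and the little-endian byte order described above can be sketched in a few lines of Python. This is a rough model of the behaviour under stated assumptions, not the actual tool: the function name `assemble_words` is hypothetical, comments are assumed to start with `//` as in the examples, and the error handling is invented for illustration.

```python
def assemble_words(source: str) -> bytes:
    """Turn lines of the form '.8byte VALUE' into little-endian 8-byte words."""
    out = bytearray()
    for line_no, line in enumerate(source.splitlines(), start=1):
        code = line.split("//", 1)[0].strip()  # drop any trailing comment
        if not code:
            continue  # blank or comment-only line
        parts = code.split()
        if len(parts) != 2 or parts[0] != ".8byte":
            raise ValueError(f"line {line_no}: expected '.8byte VALUE'")
        # Base 0 accepts "0x..." hexadecimal as well as plain decimal.
        value = int(parts[1], 0)
        # Least significant byte first; raises OverflowError if the value
        # does not fit in an unsigned 64-bit word.
        out += value.to_bytes(8, "little")
    return bytes(out)


text = """.8byte 0x6f72203134325343 // first eight characters
.8byte 0x000000000a736b63 // rest of the string, newline, 4 null bytes
"""
data = assemble_words(text)
print(data)
```

The call to `to_bytes(8, "little")` is where the byte reversal happens: the rightmost two hex digits of each value become the first byte of output.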
Usage: cs241.binasm < PROGRAM.asm > OUTPUT.bin
This is an assembler that takes an ARM64 assembly language program as input (from
standard input) and produces an ARM64 machine language program in binary as output (on
standard output). Unlike cs241.wordasm, all the instructions and features of the CS241 ARM64 dialect are supported.
Aside from having a much more flexible input format, the usage is identical to cs241.wordasm. Suppose input.asm contains the following text:
add x1, x2, x3
br x30
You can assemble this program directly as follows. It is not necessary to convert the program to .8byte directives like with cs241.wordasm.
cs241.binasm < input.asm > output.bin
Usage: cs241.arm64emu [-a] [-i] PROGRAM.bin [x0 value] [x1 value] ...
This is an emulator for running compiled ARM64 machine language programs. Once you have produced a machine language program using one of our assemblers, you can run it with this tool.
To use the tool, give it the filename of the compiled machine language program you want to run as the first command line argument. By default, it will load the program at address 0x00. You can load numerical inputs into the emulator by passing in additional arguments, which will be loaded into the ARM64 registers before the program runs. For example, running
$ cs241.arm64emu program.bin 10 3 5
will run the program program.bin with:
- 10 in register x0
- 3 in register x1
- 5 in register x2
For longer inputs, the -a option stores the input arguments as an array in memory, with the address of the start of the array in x0 and the length of the array in x1.
The emulator supports reading from standard input. The -i option enables interactive mode, which allows you to step through your code's execution and debug any issues.
Here we have prepared a simple ARM64 program in the format expected by cs241.wordasm. The source code is stored in the file addvalues.hex, and it adds the values in x0 and x1 and stores the result in x0. We run it through cs241.wordasm to produce the binary machine language version. We then run the program with cs241.arm64emu:
$ cat addvalues.hex
.8byte 0xD61F03C08B216000 // add x0, x0, x1
// br x30
$ cs241.wordasm < addvalues.hex > prog.bin
$ cs241.arm64emu prog.bin 4 5
x0: 0x9 x16: 0x0
x1: 0x5 x17: 0x0
x2: 0x0 x18: 0x0
x3: 0x0 x19: 0x0
x4: 0x0 x20: 0x0
x5: 0x0 x21: 0x0
x6: 0x0 x22: 0x0
x7: 0x0 x23: 0x0
x8: 0x0 x24: 0x0
x9: 0x0 x25: 0x0
x10: 0x0 x26: 0x0
x11: 0x0 x27: 0x0
x12: 0x0 x28: 0x0
x13: 0x0 x29: 0x1000000
x14: 0x0 x30: 0xfffffe4
x15: 0x0 sp: 0x1000000
pc: 0xfffffe4 instr: hlt
flags: vczn
Program exited normally.
The emulator prints a list of register values to standard error when the program stops (whether it stops normally or crashes). Notice the following about the register values:
- The program was run with 4 in x0 and 5 in x1.
- The sum of 4 and 5 is 9, which is 0x9 in hexadecimal.
- x0 contains 0x9, as desired.

Here are brief explanations of some of the error messages you might see from this tool. This is not necessarily a comprehensive list.
ldur, stur, ldr), but this can also happen if you branch (b, b.cond, br, blr) to an address that isn't a multiple of 4.
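In the addvalues.hex example earlier in this section, a single .8byte word holds two 32-bit ARM64 instructions. The following Python sketch, which is only an illustration, splits that word the way it is laid out in memory and shows why the add instruction is listed first even though its encoding is the right-hand half of the hexadecimal value.

```python
import struct

# The single word from addvalues.hex.
word = 0xD61F03C08B216000

# Store the word least significant byte first, then read it back
# as two consecutive little-endian 32-bit instruction words.
first, second = struct.unpack("<II", word.to_bytes(8, "little"))

print(hex(first))   # instruction at the lower address (add x0, x0, x1)
print(hex(second))  # instruction at the next address (br x30)
```

Because the low half of the 64-bit value is stored at the lower address, it is the first instruction to be executed.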
Usage: cs241.dfa < FILE.dfa
This tool reads a file in the DFA file format from standard input, and checks the file for errors. If no errors are found, the strings in the input section of the file are processed with the DFA. The tool outputs one line per input string, containing the string followed by "true" if the string was accepted and "false" otherwise.
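The recognition step can be sketched in Python as follows. This is a minimal illustration rather than the tool itself: the DFA is given as a Python dictionary instead of the DFA file format, and the state names and the example language (strings over a and b with an even number of a's) are made up.

```python
def accepts(transitions, start, accepting, string):
    """Return True if the DFA ends in an accepting state after reading string."""
    state = start
    for ch in string:
        if (state, ch) not in transitions:
            return False  # no transition defined: reject
        state = transitions[(state, ch)]
    return state in accepting


# Hypothetical DFA: accepts strings containing an even number of 'a'.
transitions = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}

results = {}
for s in ["", "ab", "aab", "baba"]:
    results[s] = accepts(transitions, "even", {"even"}, s)
    # One line per input string, followed by true or false.
    print(s, "true" if results[s] else "false")
```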
Usage: cs241.smm FILE.smm
This tool reads a file in the SMM file format as its command line argument. As with cs241.dfa, it checks the file for errors.
If no errors are found, then characters are tokenized according to Simplified Maximal Munch using the provided DFA file. Characters are first read from the input section (which is optional) and then from standard input until EOF. Each time a token is accepted, its lexeme is printed to standard output, followed by a newline. Spaces and newlines are printed using the representation used in the SMM DFA format.
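As a rough illustration of Simplified Maximal Munch, the Python sketch below runs a DFA for as long as it can, emits a token if it is stuck in an accepting state, and then restarts from the start state. The dictionary representation of the DFA, the state names, and the example token classes (lowercase words and digit sequences) are assumptions made for this sketch; the real tool reads the SMM file format instead.

```python
import string


def simplified_maximal_munch(transitions, start, accepting, text):
    """Split text into lexemes using Simplified Maximal Munch."""
    tokens = []
    pos = 0
    while pos < len(text):
        state, begin = start, pos
        # Consume characters for as long as a transition exists.
        while pos < len(text) and (state, text[pos]) in transitions:
            state = transitions[(state, text[pos])]
            pos += 1
        # Stuck: fail unless we consumed something and ended in an accepting state.
        if pos == begin or state not in accepting:
            raise ValueError(f"cannot tokenize input at position {begin}")
        tokens.append(text[begin:pos])
    return tokens


# Hypothetical DFA: "id" accepts lowercase words, "num" accepts digit strings.
transitions = {}
for ch in string.ascii_lowercase:
    transitions[("start", ch)] = "id"
    transitions[("id", ch)] = "id"
for ch in string.digits:
    transitions[("start", ch)] = "num"
    transitions[("num", ch)] = "num"

tokens = simplified_maximal_munch(transitions, "start", {"id", "num"}, "abc123x")
for lexeme in tokens:
    print(lexeme)
```

Unlike full Maximal Munch, this version never backtracks to an earlier accepting state; it simply fails if it gets stuck in a non-accepting state.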
Usage: cs241.wlp4c < PROGRAM.wlp4 > OUTPUT.bin
This tool reads a WLP4 program from standard input and produces on standard output a compiled ARM64 machine language program. The compiled program can be run directly using cs241.arm64emu.
The input should be WLP4 source code like below, not the result of the scanning or parsing phase. The cs241.wlp4c tool will do the scanning and parsing itself.
int main(int a, int b) {
    return a;
}
Note that cs241.wlp4c performs semantic analysis and will reject programs that do not follow the name and type rules. Thus you can use cs241.wlp4c to check the semantic validity of a WLP4 program.
Usage: cs241.wlp4scan < PROGRAM.wlp4 > TOKENS.scan
This tool reads a WLP4 program from standard input, like cs241.wlp4c, but only performs the scanning phase of compilation. The result is a list of lines representing WLP4 tokens, with each line containing a "kind" (the type of token) followed by a "lexeme" (the string the token corresponds to).
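If you want to process scanner output in your own test scripts, the "kind followed by lexeme" layout is easy to read back. The Python sketch below assumes the two fields are separated by whitespace; the function name `parse_tokens` and the sample lines are made up for illustration and are not actual output of the tool.

```python
def parse_tokens(scan_output: str):
    """Parse lines of the form 'KIND LEXEME' into (kind, lexeme) pairs."""
    tokens = []
    for line in scan_output.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        # Split once so the lexeme is kept whole; a line with only one
        # field raises ValueError here.
        kind, lexeme = line.split(maxsplit=1)
        tokens.append((kind, lexeme))
    return tokens


# Hypothetical sample in the kind/lexeme layout.
sample = "ID a\nNUM 42\n"
tokens = parse_tokens(sample)
print(tokens)
```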
Usage: cs241.cfgcheck < FILE.cfg
This tool reads (from standard input) a CFG component followed by a sequence of zero or more DERIVATION components. If the CFG and derivations are valid, it prints information about the CFG, and prints the rules used in each derivation. It then prints the terminal string arrived at by the derivation. If the CFG is malformed, or the derivation is invalid, an error message is printed, and the tool quits.
Usage: cs241.slr < FILE.cfg > OUTPUT.slr1
This tool reads a CFG component representing a non-augmented grammar from standard input, augments the CFG, and produces a file representing the augmented CFG and its SLR(1) DFA, suitable for use as a test input for the SLR(1) parser you will write on Assignment 5.
The output contains, in order:
This tool uses the SLR(1) DFA construction, so the tool will not work for all grammars. It will only work if the SLR(1) construction happens to produce a conflict-free DFA for the grammar. In particular, the tool will not work for ambiguous grammars.
Usage: cs241.wlp4parse < TOKENS.scan > PROGRAM.wlp4i
This tool takes a scanned WLP4 program (that is, a list of tokens produced by cs241.wlp4scan) on standard input, and produces on standard output a WLP4I file representing the parse tree for the program. A WLP4I file is essentially just a preorder traversal of the parse tree, so it encodes all the same information as the tree itself.
Typically you wouldn't call cs241.wlp4parse on its own, and you would instead pipe it the output from cs241.wlp4scan, since cs241.wlp4parse requires the input to be scanned first. For example, if program.wlp4 contains WLP4 source code (not scanned yet) you could run the following command:
$ cs241.wlp4scan < program.wlp4 | cs241.wlp4parse > program.wlp4i
This avoids the need to create a temporary file to hold the scanned program.
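To see why a preorder traversal encodes the same information as the tree, consider the Python sketch below. It uses a toy grammar and a simplified line layout chosen for this example (a rule line lists the left-hand side followed by its right-hand side symbols, and a terminal line lists an upper-case kind followed by its lexeme). These conventions are assumptions for illustration and should not be read as the WLP4I specification.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def preorder(node):
    """Yield the node's label, then the labels of each subtree, left to right."""
    yield node.label
    for child in node.children:
        yield from preorder(child)


def rebuild(lines):
    """Reconstruct the tree from its preorder listing."""
    it = iter(lines)

    def build():
        label = next(it)
        symbol, *rest = label.split()
        if symbol.isupper():  # assumed convention: terminals are upper case
            return Node(label)
        # A rule line has one child per right-hand side symbol.
        return Node(label, [build() for _ in rest])

    return build()


# Toy parse tree for "a + b".
tree = Node("expr expr PLUS term", [
    Node("expr term", [Node("term ID", [Node("ID a")])]),
    Node("PLUS +"),
    Node("term ID", [Node("ID b")]),
])

lines = list(preorder(tree))
print("\n".join(lines))
```

Because each rule line says how many children follow, `rebuild(lines)` returns a tree equal to the original.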
Usage: cs241.wlp4type < PROGRAM.wlp4i > PROGRAM.wlp4ti
This tool can be used to convert a WLP4 Intermediate (.wlp4i) file
to a WLP4 Typed Intermediate (.wlp4ti) file.
The tool checks the program represented by the .wlp4i file for semantic errors, and if there are no errors,
it outputs a .wlp4ti file representing the type-annotated parse tree.
To use cs241.wlp4type with WLP4 program source code, pass
the source code through cs241.wlp4scan and cs241.wlp4parse
first:
cs241.wlp4scan < program.wlp4 | cs241.wlp4parse | cs241.wlp4type > program.wlp4ti
Usage: cs241.linkasm < PROGRAM.asm > PROGRAM.com
This assembler has all the features of cs241.binasm and supports two additional directives: .import and .export. These can be used to import label definitions from assembled ARMCOM programs, or to export label definitions which allow other programs to import them. This allows the creation of ARM64 "libraries" which export useful procedures that other code can use.
This assembler produces ARMCOM files rather than plain machine code. If your program imports labels from other ARMCOM files, you will need to use cs241.linker to combine the ARMCOM files into a single usable program.
Suppose we have the following assembly files that use the .import and .export directives:
$ cat import.asm
.import label
ldr x0, 8
b 12
.8byte label
br x0
$ cat export.asm
.export label
label:
br x30
These programs don't do anything interesting, and are just to give an example of the syntax. We can assemble these programs as follows:
$ cs241.linkasm < import.asm > import.com
$ cs241.linkasm < export.asm > export.com
Using cs241.binasm to assemble these would produce an error since it does not support .import or .export.
Note that here we have simply assembled two separate programs that don't know anything about each other. If we tried to execute import.com it would get stuck in an infinite loop, since the imported label has not been resolved, and unresolved labels default to the value 0. To resolve the import we would have to use cs241.linker.
Usage: cs241.linker PROGRAM.com [PROGRAM2.com] ... > LINKED.com
This tool links two or more ARMCOM files together into a single ARMCOM file. The files are provided as command line arguments, not using standard input. For example, we could link the ARMCOM files produced in the cs241.linkasm example as follows:
cs241.linker import.com export.com > linked.com
Note that the order of linkage matters; the first file passed to the linker will be placed first in the linked program, and as such will be the main entrypoint.
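The effect of link order can be illustrated with a toy model in Python. This is not the ARMCOM format or the real linker's algorithm: the dictionary layout, the function name `link`, and the assumption that modules are placed back to back with no header or padding are all simplifications made for this sketch. The sizes come from the example programs (import.asm is 20 bytes of code and data, export.asm is 4 bytes).

```python
def link(modules):
    """Place modules one after another and resolve imported labels."""
    bases, symbols, address = [], {}, 0
    # First pass: assign each module a base address and record its exports.
    for module in modules:
        bases.append(address)
        for name, offset in module["exports"].items():
            symbols[name] = address + offset
        address += module["size"]
    # Second pass: work out which words must be patched with label addresses.
    patches = {}
    for base, module in zip(bases, modules):
        for offset, name in module["imports"].items():
            if name not in symbols:
                raise KeyError(f"unresolved import: {name}")
            patches[base + offset] = symbols[name]
    return symbols, patches


# Toy descriptions of import.asm and export.asm from the example above.
importer = {"size": 20, "exports": {}, "imports": {8: "label"}}
exporter = {"size": 4, "exports": {"label": 0}, "imports": {}}

symbols, patches = link([importer, exporter])
print(symbols, patches)
```

In this model, linking in the order shown places label at address 20 and patches the word at address 8 with that value. Reversing the order would instead place label at address 0, which is why the first file on the command line matters.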
(There is no web version of this tool. Instead, its functionality in the web version is included in cs241.linker, above.)
Usage: cs241.striparmcom ADDRESS < PROGRAM.com > PROGRAM.bin
This tool reads an ARMCOM file from standard input and produces plain ARM64 code on standard output. It strips out the relocation and linking metadata, and relocates the program to run at the address specified by the command line argument. The ADDRESS in the above usage example should be replaced by a number; it does not mean the literal string "address".
Normally address 0 is used, since the ARM64 machine loads programs at address 0 by default, so typical usage would look like: cs241.striparmcom 0 < input.com > output.bin
The tool will produce information about the ARMCOM metadata on standard error. You can use this to programmatically compare the metadata for two ARMCOM files: sort the standard error results (since the ARMCOM files may have the same metadata entries but different ordering) and run diff on the sorted files. If you do not want to see this metadata information, you can redirect standard error to /dev/null as follows: cs241.striparmcom 0 < input.com > output.bin 2> /dev/null
Note that ARMCOM files can usually be directly executed without needing to strip the metadata. However, for technical reasons, the memory allocation ARMCOM module you will use in some assignments requires the ARMCOM metadata to be stripped to work correctly. That is the main reason to use this tool, aside from the aforementioned trick of using it to compare the metadata of two ARMCOM files.
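The metadata comparison trick described above (sort, then diff) can also be done in a few lines of Python once the standard error output of each run has been saved as text. The function name `metadata_diff` and the sample lines are hypothetical; the sketch makes no assumption about what the metadata lines actually look like.

```python
import difflib


def metadata_diff(stderr_a: str, stderr_b: str):
    """Compare two metadata dumps while ignoring the order of the lines."""
    a = sorted(stderr_a.splitlines())
    b = sorted(stderr_b.splitlines())
    # An empty result means both dumps contain the same entries.
    return list(difflib.unified_diff(a, b, "first", "second", lineterm=""))


# Placeholder lines standing in for saved standard error output.
same = metadata_diff("entry one\nentry two\n", "entry two\nentry one\n")
different = metadata_diff("entry one\n", "entry three\n")
print(same)
print("\n".join(different))
```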
© University of Waterloo.