CS241: Tools

CS241 Tool Summary

Various tools are used in CS241. Historically, they were all available only on the student Linux server. They are now slowly being duplicated as browser/web-based versions, to avoid overloading the servers. The web-based versions of these tools are available here.

All of these tools are available on the student Linux servers (linux.student.cs.uwaterloo.ca) after running source /u/cs241/setup.

After you have run the setup command, simply type the name of the tool to use it. The tools are not copied to your user folder; the setup script only does some configuration that lets you run the tools from anywhere.

The setup command only lasts for one login session. If you do not want to manually enter it every time you log in, put the command in the hidden .bash_profile file in your home directory, or in .profile if .bash_profile does not exist.

List of CS241 Tools

Tool	Primarily Used In	Purpose
marmoset_submit	All Assignments	Command line submission to Marmoset
cs241.binview	Assignment 1, 2 & 3	Binary file viewer
cs241.wordasm	Assignment 1	MIPS assembler (limited to .word directives)
mips.twoints	Assignment 1, 2, 7 & 8	MIPS emulator (input is two integers)
mips.stepper_twoints	Assignment 1, 2, 7 & 8	Stepper for mips.twoints
mips.stdin	Assignment 1 & 2	MIPS emulator (input is from standard input)
cs241.binasm	Assignment 1, 2, 3, 7 & 8	MIPS assembler
mips.array	Assignment 2, 7 & 8	MIPS emulator (input is an array)
cs241.DFA	Assignment 3	DFA checker and recognizer
cs241.SMM	Assignment 4	Run Simplified Maximal Munch using a DFA
wlp4c	Assignment 4, 6, 7 & 8	WLP4 semantic analyzer and compiler
wlp4scan	Assignment 4, 5, 6, 7 & 8	WLP4 scanner
cs241.cfgcheck	Assignment 5	CFG syntax checker
cs241.slr	Assignment 5	SLR(1) DFA generator
wlp4parse	Assignment 5, 6, 7 & 8	WLP4 parser
wlp4type	Assignment 6, 7 & 8	WLP4 semantic analyzer
cs241.linkasm	Assignment 7 & 8	MIPS assembler (produces linkable MERL files)
cs241.linker	Assignment 7 & 8	MERL linker
cs241.merl	Assignment 7 & 8	Strips MERL metadata

Tool Information

marmoset_submit

This tool lets you submit to Marmoset from the command line. Use it as follows:

marmoset_submit CS241 AssignmentProblem file1 file2 file3 ...

The AssignmentProblem part should be replaced with the name of the problem you are submitting to, like A4P2 or A9Bonus. This must exactly match the name shown on the Marmoset web interface and it is case sensitive. After the AssignmentProblem part, list all the files you want to submit.

For many problems you will just be submitting a single file, but you can provide multiple files on the command line, for example if your solution for a question is split into several modules. In this course, even if the assignment only specifies one particular file name, you are allowed to submit other files alongside the one with that name. For C++ programs, Marmoset will simply pass all the .cc files you submit to the compiler; a Makefile is not needed, and in fact any Makefile you submit will be ignored. (This is specific to the way CS241's Marmoset scripts are set up and may not apply to other courses.)

Note that this is just a wrapper script for the command /u/cs_build/bin/marmoset_submit. You can use the /u/cs_build/bin/marmoset_submit command directly without running the CS241 setup command. You can also use marmoset_submit (whether through our wrapper script or directly using the cs_build command) in other courses that use Marmoset by just replacing CS241 with that course name.

cs241.binview

This tool lets you view the binary data stored in files in a human-readable form. If you view a file with cat or open it in a text editor like vim, your terminal will attempt to display the binary data as ASCII (or maybe Unicode) characters. For a text file, this is exactly what you want, but for something like MIPS machine code it will probably look like gibberish.

The cs241.binview tool formats and prints the binary data as a sequence of 0 and 1 characters. It also has options for printing in decimal, hexadecimal, and ASCII.

Example

Suppose the file program.mips contains the machine code version of the following MIPS program (translated using an assembler like cs241.binasm):

add $1, $2, $3
jr $31

You can use cs241.binview with the --all option to view the machine code in binary, decimal, hexadecimal and ASCII form.

$ cs241.binview --all program.mips
#0       #67      #8       #32
0x00     0x43     0x08     0x20
^NUL     C        ^BS      (SP)
00000000 01000011 00001000 00100000

#3       #224     #0       #8
0x03     0xE0     0x00     0x08
^ETX     \xE0     ^NUL     ^BS
00000011 11100000 00000000 00001000

If --all or other similar options are not used, the default is to produce binary output only.

Notes:

Run cs241.binview without arguments to get a help message showing all the options.
When piping data to cs241.binview, you must pass the argument - to tell cs241.binview to read from standard input, like this: cs241.binasm < input.asm | cs241.binview -
Aside from viewing things like machine code, this tool can also be used to check if the output of a program you write has unnecessary characters (like whitespace or "null" bytes) that normally wouldn't show up with cat.
This program has similar functionality to the standard Unix tool xxd. It is basically a version of xxd where the output format and options are more tailored to the needs of CS 241.
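
To make the default output format concrete, here is a minimal Python sketch that prints each byte as eight 0 and 1 characters, four bytes per line. The function name binary_view and the four-bytes-per-line grouping are assumptions taken from the example output above; this is not the tool's actual implementation.

```python
def binary_view(data: bytes, per_line: int = 4) -> str:
    """Render bytes as groups of eight 0/1 characters, per_line bytes per row."""
    rows = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        rows.append(" ".join(format(byte, "08b") for byte in chunk))
    return "\n".join(rows)


if __name__ == "__main__":
    # Machine code for "add $1, $2, $3" followed by "jr $31".
    program = bytes.fromhex("0043082003e00008")
    print(binary_view(program))
```

Running the sketch on those eight bytes reproduces the two binary rows shown in the example.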

cs241.wordasm

This is a very restricted MIPS assembler that only supports the .word directive. No other instructions or features like labels are available.

The tool reads ASCII text from standard input. The text should consist of a series of lines, one line for each 32-bit word of binary output. Each line consists of the string ".word" followed by a string giving either the hexadecimal (prefixed with "0x") or decimal (no prefix) representation of the 32-bit word. A semicolon can be used to start a single-line comment.

Example

Suppose the file cs241.hex contains the following text:

.word 0x43533234 ; C(43) S    (53) 2(32) 4      (34)
.word 0x3120726f ; 1(31) space(20) r(72) o      (6f)
.word 0x636b730a ; c(63) k    (6b) s(73) newline(0a)

This is the hexadecimal encoding of the string "CS241 rocks" (with a newline at the end). The comments indicate how the hexadecimal bytes correspond to the ASCII characters.

Redirecting this file into cs241.wordasm gives the following output on the terminal:

$ cs241.wordasm < cs241.hex
CS241 rocks

You will mostly use cs241.wordasm to write MIPS machine code programs, but as this example shows, it can be used to create any kind of binary data (as long as the length is a multiple of 4). In this example, ASCII text was produced as output. When producing MIPS machine code output, you will likely want to redirect standard output to a file, like this:

cs241.wordasm < input.hex > output.mips
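
The translation described above can be sketched in a few lines of Python. This is only an illustration of the input format, not the real assembler: the function name assemble_words is hypothetical, and masking values to 32 bits (rather than rejecting out-of-range input) is a simplifying assumption.

```python
import struct


def assemble_words(source: str) -> bytes:
    """Translate lines of '.word <value>' into big-endian 32-bit words."""
    output = bytearray()
    for line in source.splitlines():
        code = line.split(";", 1)[0].strip()  # drop any ';' comment
        if not code:
            continue
        directive, operand = code.split()
        if directive != ".word":
            raise ValueError(f"only .word is supported, got {directive!r}")
        # Hexadecimal values carry a 0x prefix; everything else is decimal.
        base = 16 if operand.lower().startswith("0x") else 10
        value = int(operand, base)
        # Assumption: wrap to 32 bits instead of reporting a range error.
        output += struct.pack(">I", value & 0xFFFFFFFF)
    return bytes(output)


if __name__ == "__main__":
    text = """
    .word 0x43533234 ; CS24
    .word 0x3120726f ; 1 ro
    .word 0x636b730a ; cks and a newline
    """
    print(assemble_words(text).decode("ascii"), end="")
```

Each word is written most significant byte first, which is why 0x43533234 comes out as the characters C, S, 2, 4 in that order.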

mips.twoints

This is an emulator for running MIPS machine language programs. Once you have produced a machine language program using one of our assemblers, you can run it with this tool.

Obtaining input from standard input in MIPS is rather awkward, so for convenience, this tool allows you to provide two integers as input. The integers will be stored in registers 1 and 2.

To use the tool, give it the filename of the MIPS program you want to run as the first command line argument. By default, it will load the program at address 0x00. A second (optional) command line argument can be provided to load the program at a different address, although this is essentially never necessary in the course.

Example

Here we have prepared a MIPS program in the format expected by cs241.wordasm. The source code is stored in the file Add11And13.hex. We run it through cs241.wordasm to produce the binary machine language version. We then run the program with mips.twoints.

$ cat Add11And13.hex
.word 0x00004014 ; lis $8
.word 0x0000000B ; .word 11
.word 0x00004814 ; lis $9
.word 0x0000000D ; .word 13
.word 0x01091820 ; add $3, $8, $9
.word 0x03E00008 ; jr $31
$ cs241.wordasm < Add11And13.hex > output.mips
$ mips.twoints output.mips
Enter value for register 1: 1
Enter value for register 2: 42
Running MIPS program.
MIPS program completed normally.
$01 = 0x00000001 $02 = 0x0000002a $03 = 0x00000018 $04 = 0x00000000
$05 = 0x00000000 $06 = 0x00000000 $07 = 0x00000000 $08 = 0x0000000b
$09 = 0x0000000d $10 = 0x00000000 $11 = 0x00000000 $12 = 0x00000000
$13 = 0x00000000 $14 = 0x00000000 $15 = 0x00000000 $16 = 0x00000000
$17 = 0x00000000 $18 = 0x00000000 $19 = 0x00000000 $20 = 0x00000000
$21 = 0x00000000 $22 = 0x00000000 $23 = 0x00000000 $24 = 0x00000000
$25 = 0x00000000 $26 = 0x00000000 $27 = 0x00000000 $28 = 0x00000000
$29 = 0x00000000 $30 = 0x01000000 $31 = 0x8123456c

The mips.twoints emulator prints a list of register values to standard error when the program stops (whether it stops normally or crashes). Notice the following about the register values:

When we ran mips.twoints it requested input for register 1 and 2. We gave it the integer values 1 and 42, respectively.
$1 contains 1 and $2 contains 0x2a which is 42 in hexadecimal.
$8 contains 11 and $9 contains 13 (since the program loaded these).
$3 contains 0x18 which is 24. This is the result of adding 11 and 13.
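
To see how the hexadecimal words in Add11And13.hex relate to the instructions in the comments, the following Python sketch splits a word into the fields of the MIPS register instruction format. The function name decode_r_type and the field names s, t and d are illustrative choices, not part of any CS241 tool.

```python
def decode_r_type(word: int) -> dict:
    """Split a 32-bit MIPS word into its register-format fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # top 6 bits
        "s": (word >> 21) & 0x1F,       # first source register
        "t": (word >> 16) & 0x1F,       # second source register
        "d": (word >> 11) & 0x1F,       # destination register
        "funct": word & 0x3F,           # bottom 6 bits select the operation
    }


if __name__ == "__main__":
    fields = decode_r_type(0x01091820)  # add $3, $8, $9
    print(fields)
```

For 0x01091820 the fields come out as s = 8, t = 9, d = 3 and funct = 0x20, matching add $3, $8, $9.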

Error Messages

Here are brief explanations of some of the error messages you might see from this tool. This is not necessarily a comprehensive list.

Invalid opcode, Invalid function, Invalid trap: These basically all mean the same thing, that the MIPS emulator tried to execute something that is not a valid instruction. Whether it says opcode, function or trap depends on exactly what the 32 bits are that it tried to execute. A very common reason for this is that you forgot to assemble your program, and the emulator is attempting to execute the ASCII text source code instead of machine code. However, it can also happen in actual programs if you jump or branch to a location containing non-instruction data, or if you accidentally overwrite part of your code with non-instruction data.
Unaligned access: This means the emulator tried to access a memory address that is not word-aligned (that is, not a multiple of 4). Usually the "access" is a Store Word (sw) or Load Word (lw) instruction, but this error can also occur if you try to jump to a non-aligned address with Jump Register (jr) or Jump and Link Register (jalr).
Out of bounds: The MIPS emulator only allows your program to access a certain area of memory, from 0x00000000 to 0x00ffffff (inclusive). Accessing memory address 0x01000000 or larger gives this error. Because the stack pointer ($30) starts at 0x01000000, this error is sometimes (but not always) related to incorrect stack management. As with "unaligned access", the "memory access" that causes the error could be a store or load, but it could also be a jump.
Division by zero: You tried to divide by zero in your program.
Bad numeric input: This error does not happen when running the MIPS program, but rather when entering the numbers for $1 and $2 before the program starts. It means your input could not be parsed as a number, or was out of range.
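
The alignment and bounds rules above can be expressed as a small Python check. This is a sketch of the rules as described here, not the emulator's code: the name check_access, the order in which the two conditions are tested, and the absence of any special-case addresses are all assumptions.

```python
MEMORY_LIMIT = 0x01000000  # first address past the usable range 0x00000000-0x00ffffff


def check_access(address: int) -> str:
    """Classify a memory address the way the error descriptions suggest."""
    if address % 4 != 0:
        return "Unaligned access"  # not a multiple of 4
    if not 0 <= address < MEMORY_LIMIT:
        return "Out of bounds"     # 0x01000000 or larger
    return "ok"


if __name__ == "__main__":
    for addr in (0x00FFFFFC, 0x01000000, 6):
        print(hex(addr), check_access(addr))
```

Note that the initial stack pointer value 0x01000000 is itself out of bounds, which is why a store through $30 without first decrementing it triggers the error.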

mips.stepper_twoints

This is a version of mips.twoints that allows you to interactively step backwards and forwards through the assembly language program. You may need to increase the size of your terminal window to run it. Further usage instructions are given in the program itself. The tool is maintained by Nomair Naeem and bugs can be reported to nanaeem@uwaterloo.ca. Unfortunately there is no equivalent stepper for mips.array currently.

mips.stdin

Web version

This is a MIPS emulator that works almost exactly like mips.twoints but does not ask the user to input two integers. It is useful when testing programs that primarily work with standard input rather than taking input from registers.

cs241.binasm

Web version

This is an assembler that takes a MIPS assembly language program as input (from standard input) and produces a MIPS machine language program in binary as output (on standard output). Unlike cs241.wordasm, all the instructions and features of the CS241 MIPS dialect are supported.

Example

Aside from having a much more flexible input format, the usage is identical to cs241.wordasm. Suppose input.asm contains the following text:

add $1, $2, $3
jr $31

You can assemble this program directly as follows. It is not necessary to convert the program to .word directives like with cs241.wordasm.

cs241.binasm < input.asm > output.mips

mips.array

Web version

Usage: mips.array program.mips

This is another MIPS emulator that is almost identical in functionality to mips.twoints, except that it allows you to provide an array as input instead of just two integers. When you run the program, it will ask for an integer representing the size of the array, and then ask you to enter each element of the array. The array will be loaded into memory somewhere shortly after the end of the program code. The emulator will assign $1 to be the starting address of the array, and $2 to be the size of the array, then run the program.

For general usage and error message information, see the section on mips.twoints.

cs241.DFA

Web version

Usage: cs241.DFA < input.dfa

This tool reads a file in the DFA file format from standard input, and checks the file for errors. If no errors are found, the strings in the input section of the file are processed with the DFA. The tool outputs one line per input string, containing the string followed by "true" if the string was accepted and "false" otherwise.
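
The recognition step can be sketched in Python as follows. The DFA here is held in a plain dictionary rather than read from the DFA file format, and the names accepts and report, as well as the example automaton (strings over a and b with an even number of a's), are hypothetical.

```python
def accepts(transitions, start, accepting, string):
    """Run the DFA on string; a missing transition means rejection."""
    state = start
    for symbol in string:
        key = (state, symbol)
        if key not in transitions:
            return False
        state = transitions[key]
    return state in accepting


def report(transitions, start, accepting, strings):
    """Produce one 'string true' or 'string false' line per input string."""
    lines = []
    for s in strings:
        verdict = "true" if accepts(transitions, start, accepting, s) else "false"
        lines.append(f"{s} {verdict}")
    return lines


# Example DFA (an assumption for illustration): even number of a's.
EVEN_A = {
    ("even", "a"): "odd", ("even", "b"): "even",
    ("odd", "a"): "even", ("odd", "b"): "odd",
}

if __name__ == "__main__":
    print("\n".join(report(EVEN_A, "even", {"even"}, ["abab", "ab", "bb"])))
```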

cs241.SMM

Web version

Usage: cs241.SMM input.dfa

This tool takes a file in the SMM file format as its command line argument. As with cs241.DFA, it checks the file for errors. If no errors are found, then characters are tokenized according to Simplified Maximal Munch using the provided DFA file. Characters are first read from the input section (which is optional) and then from standard input until EOF. Each time a token is accepted, its lexeme is printed to standard output, followed by a newline. Spaces and newlines are printed using the representation used in the SMM DFA format.
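
As a rough illustration of Simplified Maximal Munch, the Python sketch below runs a DFA until it gets stuck, emits a token if the current state is accepting, and restarts from the start state without backtracking. The dictionary-based DFA, the function name simplified_maximal_munch and the example automaton are assumptions; the real tool reads its DFA from a file.

```python
def simplified_maximal_munch(transitions, start, accepting, text):
    """Split text into lexemes; raise ValueError where tokenization fails."""
    tokens = []
    state, lexeme = start, ""
    for symbol in text:
        if (state, symbol) not in transitions:
            # Stuck: the lexeme so far must be a complete, accepted token.
            if not lexeme or state not in accepting:
                raise ValueError(f"cannot tokenize at {symbol!r} after {lexeme!r}")
            tokens.append(lexeme)
            state, lexeme = start, ""
            if (state, symbol) not in transitions:
                raise ValueError(f"no token can start with {symbol!r}")
        state = transitions[(state, symbol)]
        lexeme += symbol
    if lexeme:  # flush the final token at end of input
        if state not in accepting:
            raise ValueError(f"input ends inside an incomplete token {lexeme!r}")
        tokens.append(lexeme)
    return tokens


# Example DFA (illustrative): identifiers over a/b, numbers over 0/1.
TRANSITIONS = {}
for ch in "ab":
    TRANSITIONS[("start", ch)] = "id"
    TRANSITIONS[("id", ch)] = "id"
for ch in "01":
    TRANSITIONS[("start", ch)] = "num"
    TRANSITIONS[("num", ch)] = "num"
    TRANSITIONS[("id", ch)] = "id"
ACCEPTING = {"id", "num"}

if __name__ == "__main__":
    for lexeme in simplified_maximal_munch(TRANSITIONS, "start", ACCEPTING, "10ab0"):
        print(lexeme)
```

With this example DFA the input 10ab0 splits into the lexemes 10 and ab0, because the number state has no transition on a letter while the identifier state keeps consuming digits.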

wlp4c

Web version

Usage: wlp4c < program.wlp4 > output.mips

This tool reads a WLP4 program from standard input and produces on standard output a compiled MIPS machine language program. The compiled program should be run with either mips.twoints if the parameters of wain are (int, int), or with mips.array if the parameters of wain are (int*, int). The values for $1 and $2 are used as the first and second parameters. The return value of the program is placed in $3.

The input should be WLP4 source code like below, not the result of the scanning or parsing phase. The wlp4c tool will do the scanning and parsing itself.

int wain(int a, int b) {
  return a;
}

Note that wlp4c performs semantic analysis and will reject programs that do not follow the name and type rules. Thus you can use wlp4c to check the semantic validity of a WLP4 program.

wlp4scan

Web version

Usage: wlp4scan < program.wlp4 > tokens.scanned

This tool reads a WLP4 program from standard input similar to wlp4c, but only performs the scanning phase of compilation. The result is a list of lines representing WLP4 tokens, with each line containing a "kind" (the type of token) followed by a "lexeme" (the string the token corresponds to).

cs241.cfgcheck

Web version

Command-line usage: cs241.cfgcheck < input.cfg

This tool reads a CFG component followed by a sequence of zero or more DERIVATION components. If the CFG and derivations are valid, it prints information about the CFG, and prints the rules used in each derivation. It then prints the terminal string arrived at by the derivation. If the CFG is malformed, or the derivation is invalid, an error message is printed, and the tool quits.

cs241.slr

Web version

Command-line usage: cs241.slr < input.cfg > output.slr1

This tool takes a CFG component representing a non-augmented grammar on standard input, augments the CFG, and produces a file representing the augmented CFG and its SLR(1) DFA, suitable for use as a test input for the SLR(1) parser you will write on Assignment 5.

The output contains, in order:

A CFG component representing the augmented version of the input CFG.
An INPUT component representing the augmented empty string "BOF EOF".
A TRANSITIONS component representing the transitions of the SLR(1) DFA.
A REDUCTIONS component representing the reducible items of the SLR(1) DFA.

This tool uses the SLR(1) DFA construction, so the tool will not work for all grammars. It will only work if the SLR(1) construction happens to produce a conflict-free DFA for the grammar. In particular, the tool will not work for ambiguous grammars.

wlp4parse

Web version

Usage: wlp4parse < tokens.scanned > program.wlp4i

This tool takes a scanned WLP4 program (that is, a list of tokens produced by wlp4scan) on standard input, and produces on standard output a WLP4I file representing the parse tree for the program. A WLP4I file is essentially just a preorder traversal of the parse tree, so it encodes all the same information as the tree itself.

Typically you wouldn't call wlp4parse on its own, and you would instead pipe it the output from wlp4scan, since wlp4parse requires the input to be scanned first. For example, if program.wlp4 contains WLP4 source code (not scanned yet) you could run the following command:

wlp4scan < program.wlp4 | wlp4parse > program.wlp4i

This avoids the need to create a temporary file to hold the scanned program.
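
The idea that a preorder traversal encodes the whole parse tree can be illustrated with a short Python sketch. The tree below uses a made-up toy grammar and made-up line contents, not real WLP4I output, and the name preorder_lines is hypothetical.

```python
def preorder_lines(node):
    """Yield the node's own label first, then its children left to right."""
    label, children = node
    yield label
    for child in children:
        yield from preorder_lines(child)


# Toy parse tree for "a + b" (hypothetical grammar, for illustration only).
# Each node is a (label, children) pair; leaves have no children.
TREE = (
    "expr expr PLUS term",
    [
        ("expr term", [("term ID", [("ID a", [])])]),
        ("PLUS +", []),
        ("term ID", [("ID b", [])]),
    ],
)

if __name__ == "__main__":
    print("\n".join(preorder_lines(TREE)))
```

Because each rule line states how many children its node has, a reader of the preorder listing can rebuild the tree by consuming lines recursively in the same order.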

wlp4type

Web version

This tool can be used to convert a WLP4 Intermediate (.wlp4i) file to a WLP4 Typed Intermediate (.wlp4ti) file. The tool checks the program represented by the .wlp4i file for semantic errors, and if there are no errors, it outputs a .wlp4ti file representing the type-annotated parse tree.

wlp4type < program.wlp4i > program.wlp4ti

To use wlp4type with WLP4 program source code, pass the source code through wlp4scan and wlp4parse first:

wlp4scan < program.wlp4 | wlp4parse | wlp4type > program.wlp4ti

cs241.linkasm

Web version

This assembler has all the features of cs241.binasm and supports two additional directives: .import and .export. These can be used to import label definitions from other assembled MIPS programs, or to export label definitions so that other programs can import them. This allows the creation of MIPS "libraries" that export useful procedures for other code to use.

This assembler produces MERL files rather than plain machine code. If your program imports labels from other MERL files, you will need to use cs241.linker to combine the MERL files.

Example

Suppose we have the following assembly files that use the .import and .export directives:

$ cat import.asm
.import label
lis $5
.word label
jr $5
$ cat export.asm
.export label
label:
jr $31

These programs don't do anything interesting, and are just to give an example of the syntax. We can assemble these programs as follows:

$ cs241.linkasm < import.asm > import.merl
$ cs241.linkasm < export.asm > export.merl

Using cs241.binasm to assemble these would produce an error since it does not support .import or .export.

Note that here we have simply assembled two separate programs that don't know anything about each other. If we tried to execute import.merl it would get stuck in an infinite loop: the imported label has not been resolved, and unresolved labels default to the value 0, so jr $5 jumps back to address 0 and the program runs again from the start. To resolve the import we would have to use cs241.linker.

cs241.linker

Web version

Usage: cs241.linker file1.merl file2.merl file3.merl ... > linked.merl

This tool links two or more MERL files together into a single MERL file. The files are provided as command line arguments, not using standard input. For example, we could link the MERL files produced in the cs241.linkasm example as follows:

cs241.linker import.merl export.merl > linked.merl

cs241.merl

(There is no web version of this tool. Instead, its functionality is included in the web version of cs241.linker, above.)

Usage: cs241.merl address < input.merl > output.mips

This tool reads a MERL file from standard input and produces plain MIPS code on standard output. It strips out the relocation and linking metadata, and relocates the program to run at the address specified by the command line argument. The address in the above usage example should be replaced by a number; it does not mean the literal string "address".

Normally address 0 is used, since the MIPS machine loads programs at address 0 by default, so typical usage would look like: cs241.merl 0 < input.merl > output.mips

The tool will produce information about the MERL metadata on standard error. You can use this to programmatically compare the metadata for two MERL files: sort the standard error results (since the MERL files may have the same metadata entries but different ordering) and run diff on the sorted files. If you do not want to see this metadata information, you can redirect standard error to /dev/null as follows: cs241.merl 0 < input.merl > output.mips 2> /dev/null
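
The sort-then-diff comparison can also be done in Python once the standard error output of two runs has been captured as text. This is a minimal sketch: the function name metadata_diff is hypothetical, and the sample strings are placeholders rather than real cs241.merl output.

```python
import difflib


def metadata_diff(stderr_a: str, stderr_b: str) -> list:
    """Diff two metadata listings after sorting, so entry order is ignored."""
    lines_a = sorted(stderr_a.splitlines())
    lines_b = sorted(stderr_b.splitlines())
    return list(difflib.unified_diff(lines_a, lines_b, lineterm=""))


if __name__ == "__main__":
    # Placeholder text standing in for captured standard error.
    first = "entry one\nentry two\n"
    second = "entry two\nentry one\n"
    print("identical" if not metadata_diff(first, second) else "different")
```

An empty result means the two files have the same metadata entries, possibly in a different order.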

Note that MERL files can usually be directly executed without needing to strip the metadata. However, for technical reasons, the memory allocation MERL module you will use in some assignments requires the MERL metadata to be stripped to work correctly. That is the main reason to use this tool, aside from the aforementioned trick of using it to compare the metadata of two MERL files.

CS241: Tools — University of Waterloo
