CS 241 - WLP4 Programming Language Specification

The WLP4 programming language contains a strict subset of the features of C++. A WLP4 source file contains a WLP4 program, which is a sequence of procedure definitions, ending with the main procedure wain.

Lexical Syntax

A WLP4 program is a sequence of tokens optionally separated by white space consisting of spaces, newlines, or comments. Every valid token is one of the following:

Tokens that contain letters are case-sensitive; for example, int is an INT token, while Int is not.

White space consists of any sequence of the following:

WLP4 programs are constructed by tokenizing (also called scanning or lexing) an ASCII string. To ensure a unique sequence of tokens is produced, the sequence of tokens is constructed by repeatedly choosing the longest prefix of the input that is either a token or white space. If the prefix is a token, it is added to the end of the WLP4 program token sequence. Then the prefix is discarded, and this process repeats with the remainder of the ASCII string input. This continues until either the end of the input is reached, or no prefix of the remaining input is a token or white space. In the latter case, the ASCII string is lexically invalid and does not represent a WLP4 program.

Context-free Syntax

A context-free grammar for a valid WLP4 program is:

Context-sensitive Syntax

A procedure is any string derived from procedure or main. The name of the procedure is the ID in the grammar rule whose left-hand side is procedure. The name of the procedure derived from main is wain. A procedure is said to be declared from the first occurrence of its name in the string that makes up that procedure (i.e., once the name has been encountered in the procedure's header). A procedure cannot be called until it has been declared (formally, the ID in factor → ID LPAREN RPAREN or factor → ID LPAREN arglist RPAREN must be the name of a procedure that has been declared). Thus, a procedure may call itself recursively, and a procedure may call procedures declared before itself, but a procedure may not call procedures declared after itself. Consequently, there is no mutual recursion in WLP4. Two procedures may not have the same name. The procedure wain may not call itself recursively.

Any ID in a sequence derived from dcl within a procedure p is said to be declared in p. Any ID derived from factor or lvalue within p is said to be used in p. Any particular string x that is an ID may be declared at most once within a given procedure. The same ID may be declared in different procedures. A string x which is an ID may be used in any number of places, but only if the same string x is declared. String comparisons are case sensitive; for example, "FOO" and "foo" are distinct.

An ID may have the same name as a procedure. If an ID x is declared in a procedure p, all occurrences of x within p refer to the ID x, even if a procedure named x has been declared. The same is true in the special case that p = x: a declared ID may have the same name as the procedure that contains it; in this case, all occurrences of ID refer to the variable, not the procedure.

Every procedure has a signature, which is a list of strings, each of which is either int or int*. The signature of a procedure is the sequence of strings int or int* that is derived from params. Note that this sequence may be empty.

An ID whose name occurs in a sequence derived from dcl has a type, which is either int or int*:

Other IDs do not have types and are said to be untyped.

Instances of the tokens NUM, NULL and the non-terminals factor, term, expr, and lvalue also have a type, which is either int or int*. Types must satisfy the following rules:

Semantics

Any WLP4 program that obeys the lexical, context-free, and context-sensitive syntax rules above is a also a valid C++ program fragment. The meaning of the WLP4 program is defined to be identical to that of the C++ program formed by inserting the WLP4 program at the indicated location in one of the following C++ program shells: