CS 241 - CFG-R (reversed rightmost) File Format (.cfg-r)
A .cfg-r file is a text file representing a context-free grammar followed by an abbreviated reversed rightmost derivation. The format
differs from the CFG file format in that it uses an
unindented reversed rightmost derivation format, and there must be exactly one derivation (as opposed to zero or more).
Context-free Grammar Representation
The context-free grammar representation is the same as a CFG file. It has four components, in order:
- t, a positive integer giving the number of terminal symbols in the grammar, followed
by t lines, each containing the name of a distinct terminal symbol. Each terminal symbol may
be any string of one or more printable ascii characters, excluding white space.
- n, a positive integer giving the number of nonterminal symbols in the grammar, followed
by n lines, each containing the name of a distinct nonterminal symbol. Each nonterminal
symbol may be any string of one or more printable ascii characters, and must not be the same as any terminal symbol.
- A single line giving the name of the start symbol, which must be one of the nonterminal symbols
- r, a positive integer giving the number of production rules in the grammar, followed by r lines, each denoting a rule consisting of:
- The left-hand-side (LHS) of the rule; a nonterminal symbol
- The right-hand-side (RHS) of the rule; zero or more terminal
or nonterminal symbols separated from each other, and from the LHS, by a
single space
Unindented Reversed Rightmost Derivation
Exactly one derivation immediately follows the context-free grammar (unlike CFG files which allow for zero or multiple derivations).
The derivation is a representation of the parse tree, with
each node in the tree represented by a line in the file containing a production rule.
(In contrast to the derivations in CFG files, these lines do not begin with
any spaces.)
The order and indentation of the lines in the derivation are defined by the following
recursive rules:
- The last line of the file is a production rule whose LHS is the start symbol, indented 0
spaces. This line represents the root of the parse tree.
- Preceding each production rule
that has n nonterminal symbols in its RHS, are n sets of
lines, each set representing the subtree corresponding to one of the nonterminal symbols,
in the order they appear in the rule.
-
The subtree for each nonterminal ends with a rule whose LHS is the
corresponding nonterminal. The preceding lines in the set are determined by applying
this rule recursively.
cs241.cfgrl tool
The student.cs environment includes a tool "CFGRL" that converts a .cfg-r file to a .cfg file. To use the tool:
cs241.cfgrl < sample.cfg-r
The tool's error checking is fairly limited. The best way to check for errors in a .cfg-r file is to pass the output of cs241.cfgrl
into cs241.cfgcheck
:
cs241.cfgrl < sample.cfg-r | cs241.cfgcheck
6
)
(
EOF
BOF
id
-
3
term
S
expr
S
5
term id
term ( expr )
S BOF expr EOF
expr expr - term
expr term
term id
expr term
term id
expr term
term ( expr )
expr expr - term
term id
expr expr - term
S BOF expr EOF