A .dfa file represents a deterministic finite automaton (DFA) together with some optional input data.
You can use the
cs241.DFA
tool
to check the syntax of these files and to process the
strings in the input data with the DFA:
cs241.DFA < file.dfa
Here is an example of a DFA file for
a DFA that recognizes strings of lowercase alphabet letters (a-z),
spaces, and newlines
that contain the substring cs
.
The text in italics
is not part of the file,
and is just there to describe how the file works.
.STATES start seen_c seen_cs! .TRANSITIONS start a b d-z \s \n start start c seen_c seen_c a-r t-z \s \n start seen_c s seen_cs seen_cs a-z \s \n seen_cs |
first state listed is the initial state ! marks a state as accepting loop at 'start' on a, b, d-z, space (\s), and newline (\n) transition from 'start' to 'seen_c' on c transition from 'seen_c' to 'start' on a-r, t-z, space, and newline transition from 'seen_c' to 'seen_cs' on s loop at 'seen_cs' on a-z, space, newline |
![]() |
After the description of the DFA, you can optionally provide any number of input strings for the DFA to process. Here are some examples:
.INPUT hello cs students .INPUT hello cs students .INPUT tactics\nand\slyrics .INPUT cstwofourone\s .INPUT .INPUT bstwofournone |
each input string starts with a .INPUT header whitespace at the end of the final line is ignored; this input does not end with a newline this input DOES end with a newline, but only one, not two escape sequences are interpreted; this input contains one newline and one space this input ends with a space, escape sequence whitespace is not trimmed this input (blank line) is treated as the empty string |
Suppose the above data (DFA description and input) is combined and stored in a file
called cs.dfa
.
The output of cs241.DFA
for this file is:
cs241.DFA < cs.dfa hello cs students: true hello cs students : true tactics and lyrics: true cstwofourone : true : false bstwofournone: false
The output indicates which input strings were accepted by the DFA. Note that the escape sequences \s
and \n
for space and newline have been replaced in the output.
The DFA is encoded in two sections (states and transitions). After this, any number of optional input sections can appear.
.STATE
, followed by a sequence of
whitespace-separated strings representing state names.
.TRANSITIONS
, followed by a sequence of
lines representing transition collections.
A transition collection has three parts, each separated by
(non-newline) whitespace:
\s
representing a space (ASCII 0x20)\n
representing a line feed (ASCII 0x0a)\r
representing a carriage return (ASCII 0x0d)\t
representing a horizontal tab (ASCII 0x09)\xHH
representing the character with ASCII code 0xHH, where H stands for a hex digit (0-9 or a-f or A-F); codes 0x80 to 0xFF (outside the ASCII range) are not allowed-
(ASCII 0x2D), and represents the
range of ASCII characters whose codes are between
the first character and second character (inclusive).state1 a state2
and also have state1 a state3
.
cs241.DFA
tool allows leaving transitions undefined, and will reject an
input string if it would cause an undefined transition to be taken.
.INPUT
, followed an input string.
cs241.DFA
tool
will interpret the contents of each input section
as a string to run
through the DFA using the DFA recognition algorithm.true
(if the string was accepted) or false
(if it was not accepted).
Trailing whitespace is normally trimmed from the last line of the input
string, and then the terminating newline is removed.
You can override this by using escape sequences like \s
or \n
, which are not trimmed, to specify whitespace.
bang!!
would
correspond to a state named bang!
that is accepting.
.TRANSITIONS
cannot be used as a state name, since
it marks the end of the states section..INPUT
section header must appear on its own line,
or else it may be interpreted as a part of the previous section.