CS 480: Translators
Winter Term, 2004, Professor Budd
Homework Assignment 1
Due Monday, January 12
Finite State Automata
- Write a finite state automata that will recognize any string consisting
of a and b characters where the number of a's is even (or zero) and
the number of b's is odd.
- Suppose comments are defined by a two character sequence, a parenthesis
followed by a star, as in the Pascal convention of (* whatever *).
Comments can contain any text, including the star symbol and parenthesis,
just not the two character sequence that ends the comment. So (***) is
a legal comment, but (*) is not. Write a determanistic finite
automata that will recognize legal comments. You need not include error
states, or error transitions, or recognize anything other than comments.
(That is, all transitions should potentially lead to a final state).
- Suppose a floating point number is defined as a series of one or more
digits, and either a decimal or an exponent part, or a decimal part followed
by an exponent part. A decimal part is a decimal point, followed by zero
or more digits. An exponent part is the letter e, and optional sign,
and a sequence of one or more digits. Write a DFA that will recognize
floating point values. Label your final states. Again you need not include
error states or error transitions, only transitions that can led to a
final state.
-
Show the four separate Regular Expressions for the keywords
int, if, inner and for identifiers defined
as a letter followed by string of zero or more letters or digits.
Then, by merging the start states, combine these into one
DFA with four different final states corresponding
to the different tokens.
- Explain why the lexical analyzer must have the ability to peek at the
input.
(Or, alternatively, the ability to push back an already read character).
Given an example (a token description and an example input) that
shows why the lexer would want to peek at a character, but not read it.
(Or, alternatively, push back a character after having read it).
- Explain what actions the program generated by Lex takes when it is
asked to recognize the next token.