CS 480:
Translators
Winter Term, 2007, Professor Budd
Homework Assignment 1
Due Wednesday, January 17
Finite State Automata
- Write a finite state automata that will recognize any string
consisting
of a and b characters where the number of a's is even (or zero) and the
number of b's is odd. (Hint: This can be done in four states. It helps
to label each state with a short description of what it represents).
- Suppose comments are defined by a two character sequence, a
parenthesis
followed by a star, as in the Pascal convention of (* whatever *).
(The language C uses a slash star, but this exercise uses parenthesis
star).
Comments can contain any text, including the star symbol and
parenthesis,
just not the two character sequence that ends the comment. So (***) is
a legal comment, but (*) is not. Write a determanistic finite
automata that will recognize legal comments. You need not include error
states, or error transitions, or recognize anything other than
comments.
(That is, all transitions should potentially lead to a final state).
- Suppose a floating point number is defined as a series of one or
more
digits, and either a decimal or an exponent part, or a decimal part
followed
by an exponent part. A decimal part is a decimal point, followed by
zero
or more digits. An exponent part is the letter e, and optional sign,
and a sequence of one or more digits. Write a DFA that will recognize
floating point values. Label your final states. Again you need not
include
error states or error transitions, only transitions that can led to a
final state.
- Show the four separate Regular Expressions for the keywords int,
if, inner and for identifiers defined
as a letter followed by string of zero or more letters or digits.
Then, by merging the start states, combine these into one DFA
with four different final states corresponding
to the different tokens.
Note that when the NFA is converted into a DFA a state may be split.
This can happen with a final state. So you may end up with more
than four final states. Again, show only states that lead to a final
state (i.e., no error
states).
- Explain why the lexical analyzer must have the ability to peek at
the
input.
(Or, alternatively, the ability to push back an already read
character).
Given an example (a token description and an example input) that
shows why the lexer would want to peek at a character, but not read it.
(Or, alternatively, push back a character after having read it).
- Explain what actions the function generated by Lex takes when it
is
asked to recognize the next token. Include in your description calls on
the push back buffer.