Assignment I

Due date Feb. 14.

1. Write a regular grammar and draw the FSM that describes comments in a PASCAL-like format, but where strings within comments are significant: comments start with the sequence (* and are terminated with the sequence *). They contain arbitrary characters but not any intervening *) unless it appears within quotes " ... ".

2. In words, describe the languages denoted by the following regular expressions:
a) 0*10*10*10*
b) (0 | 1)* 0 (0 | 1) (0 | 1)

3. Write the regular definition for the language that includes all strings of 0's and 1's with an even number of 0's and an odd number of 1's.

4. Numeric literals in Ada can be given in a base different from 10. The following are the relevant productions:
based_literal  ::=   base # based_numeral#
base           ::=   numeral
numeral        ::=   digit | digit numeral
based_numeral  ::=   extended_digit | extended_digit based_numeral
extended_digit ::=   digit | A | B | C | D | E | F

This allows us to write octal numbers   8#7717#, hexadecimal numbers like
16#F00A#, and numbers in less common bases like 5#123123# or 12#AAB9#.
The base itself is always interpreted in base 10.

The rules above do not forbid strings that are meaningless because they include digits that are larger than the base, for example 8#88#, or 3#ABC#. How many rules would we need if we wanted a grammar that only specified legal numerals? Do not list them all, a rough estimate is sufficient.