Programming Languages

G22.2110 Summer 1998


Class #2



Homework # 2: Sethi Exercises 2.8, 2.10, 2.16, 2.19, 2.20

Read Sethi’s Chapters 2, 13.2




C2.0. Class Syllabus

C2.1. Historical Overview of Programming Languages (continued)

C2.2. Programming Language Categories

C2.3. Syntax and Grammars

C2.4. Comments on Sethi's Chapter 2


C2.0. Class Syllabus


Follow link on course home page


C2.1. Historical Overview of Programming Languages (continued)


Revisions to Sethis' Figure 1.4


C2.2. Programming Language Categories


Imperative Languages




Heavy reliance on Von Neumman architecture (not a prerequisite of software development)

Requires programmer to move data values around in storage locations in order to compute.


e.g., FORTRAN, ALGOL 60, Pascal, CPL, BPCL, C


OO (derived from Simula 67):


1. Pure OO

2. Hybrid OO


Applicative Languages




Based on mathematical functions.

Mathematical functions map members of one set (domain set) to another set (range set).

Design basis for one of the most important non imperative style of language.

Evaluation of mathematical expressions is controlled by recursion and conditional expressions.

(v.s. sequencing and iterative repetition typical of imperative PLs)

Mathematical functions do not produce side effect (same value yielded for a given argument set).



(COMMON) LISP (pure)

ML (strongly typed, conventional syntax)

Miranda (pure, partially based on ML)

Scheme (static-scoped)




"Non-procedural" languages, also known as "declarative"

Do not force programmers to convey a step by step description of how to compute a result.

Simply require programmers to specify what properties the desired result should possess.

Programs are expressed in the form of symbolic logic

Apply automatic deduction (logical inferencing process) to produce results.


e.g., Prolog


C2.3. Syntax and Grammars


C2.3.1. Introduction


Formal syntax of a language consists of two layers: lexical & grammar.


Lexical syntax:

Corresponds to the spelling of words (¹ v.s. <>).

Regular expressions provide a convenient notation for lexical syntax.

e.g., Pascal operators <>, <=, and < are treated as units with different meaning.



Key formalism for describing syntax (English used to described semantics).

BNF is a notation for writing grammars (came out of Algol 60's specification process).

EBNF (Extended BNF) is a variant of BNF.

Syntax charts are a graphical notation.


Usefulness of formal languages and grammars notions:

Familiarity with terminology/some aspects of formal languages and grammar helpful to study PLs

Study of formal language introduces the dichotomies: meta-language/language, name/denotation.

Help illustrate important techniques: structural induction, inductive proof, inductive definition.

Introduction of formalisms more powerful than context-free grammars (attribute grammars, and Post systems).


C2.3.2. The Study of Language


PL syntax, semantics, pragmatics:


Syntax encompasses form, format, well-formedness, compositional structure.


Semantics concerns meaning and interpretation of the language.


Syntax and semantics are often difficult to separate as the meaning of words in a sentence is closely intertwined with the syntactic structure of the sentence.


Time flies like an arrow.

Fruit flies like a banana.


Pragmatics concerns everything else such as the origins, uses, and effects of the language.


influence of APL was limited by its need for a non-standard keyboard


Computer science is different from linguistics: linguistics is descriptive while the study of PL is prescriptive or normative.


PL design principles:



Desirable to have constructs that factor out recurring patterns.





Automate mechanical, tedious, or error-prone activities.


Defense in Depth

Have a series of defenses so that if an error is not caught by one, it will probably be caught by another.


Information Hiding:

The language should permit modules to be designed so that the user has all the information needed to use the module correctly and nothing more, and the implementor has all the information needed to implement the module correctly and nothing more.



Avoid arbitrary sequences more than a few items long; do not require the user to know the absolute position of an item in a list. Instead, associate a meaningful label with each item and allow the items to occur in any order.


Localized cost:

Users should only pay for what they use; avoid distributed costs.


Manifest Interface:

All interfaces should be apparent (manifest) in the syntax.



Basic features should be separately understandable and free from unexpected interaction.


Pascal val and var parameters combine the type of use of parameters (input/output)

with the parameter-passing mechanism (call-by-value/call-by-reference).



Avoid features or facilities that are dependent on a particular machine or a small class of machines.


Preservation of Information:

The language should allow the representation of information that the user might know and that the compiler might need.



The fewer concepts to understand in a language the better.



The fewer there are exceptions to the rules the better.


FORTRAN's limitation to three dimensional arrays.



No program that violates the definition of the language, or its own intended structure, should escape detection.



The static structure of the program should correspong in a simple way with the dynamic structure of the corresponding computations.


Syntactic Consistency*

Similar constructs should look similar, different constructs should appear differently.


Array subscripting A(I) with syntax of function call F(I) in FORTRAN/Ada

Makes more sense to use A[I]



Language translator must run quickly and produce efficient object code.



The only reasonable numbers are zero, one, and infinity.


C2.3.3. Grammars




Language syntax has profound effect on ease of use


* = v.s. == in C

* end keyword in PL/I closing any number of begin constructs.

* insignificance of spaces in FORTRAN:

DO 10 I = 1.5

A(I) = X + B(I)


interpreted as DO10I = 1.5 (assignment to undeclared variable).


* break statement in a C program (applies to switch but not to if)

changing while to if statement could cause an error

* use of punctuation as a terminator as opposed to a separator

e.g., optional use of semicolon as a separator in Turing/ML

* PL/I not setting aside reserved words for keywords:



* languages with many reserved keywords such as Ada:

Error messages from compiler may not identify true cause of problem

* comments:


Three mechanisms for describing formal languages are important to the design of PLs:


Definition of language


Formal language:

Set of finite strings of atomic symbols

Set of symbols is called the alphabet

Strings of symbols are written one symbol after another

Empty string is denoted by ""

Some strings belong to a particular language, the others (if any) do not

The strings that belong to the language are sometimes called "sentences" or "words"



languages over the alphabet {a, b}:

L1 = { a, b, ab}

L2 = {aa, aba, abba, abbba, …}


Alphabet symbols may be composed of more than one sign



"begin" as a single element of an alphabet

L3 = {begin x end, begin x | end, begin x || end, … }


L1 is finite, L2 and L3 are infinite.


Examples illustrate the need for more precise methods for defining languages (rather than "…").


Definition of regular expressions


RE invented by mathematician Stephen Kleene, first appeared in a Rand Corp. report about 1950


Regular expression over A denotes a language with alphabet A defined by the following set of rules:

  1. Empty. Symbol Æ is a regular expression denoting the language consisting of no strings {}
  2. Atom. Any single symbol of a Î A is a regular expression denoting the language consisting of the single string {a}.
  3. Alternation. If r1 is a regular expression and r2 is a regular expression, then (r1+r2) is a regular expression. The language it denotes has all the strings from the language denoted by r1 and all the strings from the language denoted by r2.
  4. Concatenation. If r1 and r2 are regular expressions, then (r1.r2) is a regular expression. The language it denotes is the set of all strings formed by concatenating a string from the set denoted by r1 to the end of a string in the set denoted by r2.
  5. Closure. If r is a regular expression, then r* is a regular expression. The language it denotes the set of all strings formed by concatenating zero or more strings in the language denoted by r.


"+", ".", "*", "Æ ", and "( …)" are part of the regular expressions notation, not part of the languages being defined.

The definition is recursive

Common rules of precedence are used in practice to avoid parenthesizing regular expressions fully.



Alphabet A is {a, b}

Then (a+b)*, and (b*, Æ ) are regular expressions


The definition defines a structure, the syntax of regular expressions, and an associated set of strings. Formally, we can make the function subsumed by the definition explicit, say "D" for "denotes". The function maps regular expressions to languages.


D[Æ ] = {}

D[a1] = {a1}


D[an] = {an}

D[(RE1+RE2)] = D[RE1] È D[RE2]


To describe the function D for concatenation, we introduce the concatenation operator "." (a new string is formed by taking the characters of the second string and placing them immediately after the characters of the first string).


D[(RE1.RE2)] = {x.y | x Î D[RE1] & y E D[RE2]}


Note "." as concatenation operator on the left, and "." as concatenation operator for strings on the right.


To describe the function D for closure, we introduce an operation to concatenate a set with itself any number of times. The set S concatenated with itself i times is denoted Si and is defined recursively as follows:

S0 = {" "}

Si+1 = {x.y | x Î S & y Î Si}


D[RE*] = È i (D[RE])i



(a+b)* stands for all strings of a’s and b’s.

(b*.Æ *) stands for all strings of b’s.

If l stands for the regular expression denoting any low/upper-case letter

And d stand for the regular expression denoting any decimal digit

The table below defines the syntax of identifiers in 3 languages:

Modula-3: l.(l+d+_)*

Identifiers begin by a letter and continue with any

number of digits

ML: l.(l+d+_+’)*

In ML, "’" is a legal constuent of an identifier

Ada: l.(l+d)*.(_.(l+d).(l+d)*)*

In Ada, the underscore cannot appear twice in a row


It is easy to decide if a given string belongs to a language defined by a regular expression.

Some programs take a regular expression as input and produce a program to recognize strings of the language (e.g., Flex)



Definition of regular expressions for indefinite length identifiers.

Regular expressions for white space and letters:


LETTER [a-zA-Z_]

WHITE [ \\ n\\t



Regular expression for identifiers:


LETTER ([0-9] | LETTER)* : { /* return identifier token */ }

WHITE : /* do nothing ! */


Regular expression -> Flex -> Lexical analyzer.

Lexical analyzer: input stream -> token stream (token returned for identifiers, white space is ignored.

Lexical analyzer does not handle the situation where the input characters do not match either case.


Why are regular expressions not used to describe the syntax of PLs ?

-> bracketing is not expressible and indispensable


arithmetic expressions require "(" and ")")

sequences of statements are often bracketed by begin/end pairs




Notation invented to describe the syntax of ALGOL 60

BNF <-> Backus-Naur Form (John Backus & Peter Naur)

Idea is to define sequences of tokens that make up the language

Tokens are identifiers, numbers, keywords, punctuation, etc.

Tokens are members of the alphabet of the formal language being defined

Tokens may have a microsyntax as identifiers and numbers do


A BNF description is used to specify that the sequence of tokens, say,


Begin identifier := identifier end


is legal in some language, but the sequence of tokens:


end identifier begin := 2.3


is not


A BNF definition consists of a number of rules

Each rule has a left and right-hand side

Left-hand sides are syntactic categories (name for a set of token sequences)

Right-hand sides are sequences of tokens and syntactic categories


<name> ::= sequence of tokens and syntactic categories


There may be many rules with the same left-hand side

A token sequence belongs to a syntactic category if it can be derived by taking the right-hand sides of rules for the category and replace the syntactic categories occuring in right-hand side with any token sequence belonging to that category


BNF notation:

::= means "is defined to be"

"< >" delimits syntactic categories

"|" means "or"


"or" is used to combine multiple right-hand sides for the same syntactical category (not necessary but practical)



<nested> ::= Î | begin <nested> end

Î is used to to "show" that no sequence of tokens is on the right-hand side

Among the sequences of tokens belonging to the syntactic category "nested" are:

Î (empty sequence)

begin end

begin begin end end

begin begin begin end end end


The definition of <nested> is recursive


Examples of BNF:


  1. Example using BNF to describe the syntax of regular expressions over the alphabet {a, b}:

    <RE> ::= Æ | a | b | (<RE> + <RE>) | (<RE>.<RE>) | <RE>*

    The syntactic category of regular expressions is either the symbol Æ , or the symbol a, or

    the symbol b, or an opening parenthesis followed by a regular expression, followed by a plus sign followed by another regular expression, followed by a closing parenthesis, and so on.

    The set of tokens used in this definition is: {Æ , a, b, (, ), +, ., *}

    (a+b)*, and (b*.Æ ) are regular expressions, they both belong to the language defined by the BNF description for <RE>.


  3. Description of ALGOL 60’s "for" construct:


<for statement> ::= <for clause> <statement> | <label> : <for statement>

<for clause> ::= for <variable> := <for list> do

<for list> ::= <for list element> | <for list>, <for list element>

<for list element> ::= <arith expr>

| <arith expr> step <arith expr> until <arith expr>

| <arith expr> while <boolean expr>


sample for statements in ALGOL 60:


for i:= j step 1 until n do <statement>

A: B: for k := 1 step –1 until n, i+1, while j > 1 do <statement>


These for statements are syntactically legal according to the BNF definition, but they may not be meaningful (ALGOL 60 was not clear about when loop variables were evaluated, tested, and incremented)


Variations on BNF:


Intended to make BNF definitions more readable

Popular extensions: square brackets for optional syntax, curly braces for zero, one, or more instances

Formalism using such constructs is called "extended BNF"

Extensions do not add to the expressive power of the formalism, just to the convenience




<statement> derives the usual variety of assignment and control statements

<list of statements> ::= Î | <statement> {; <statement>}

| <statement> {;<statement>};

Since sections grouped in curly braces can be omitted or repeated any number of times,

The semicolon is either a terminating symbol or a separating symbol


The extended BNF is used in the Ada reference manual.

It also uses a different convention to distinguish syntactic categories from terminals.

Syntactic categories are denoted by simple identifiers possibly containing underscores (without angle bracket)

Keywords and punctuation are in bold face.



block ::= [block_identifier :]

[declare {declaration}]

begin statement {statement}

[exception handler {handler}]

end [block_identifier] ;


The beginning and ending identifiers must be the same.

Since this cannot be easily expressed using BNF, the Ada reference manual resorts to English

In general it is not easy to express constraints like the following in BNF: a function must be called with the same number of arguments with which it was defined.


BNF is a slightly different form of context-free grammars (developed independently by Chomsky).


Definition of grammar


A grammar is a 4-tuple <T, N, P, S>

T is the set of terminal symbols

N is the set of nonterminal symbols T Ç N = Æ

S, a non terminal, is the start symbol

P are the productions of the grammar

A production has the form a -> b where a and b are strings of terminals and noterminals (a ¹ Î ).


We say g a d => g b d (derives in one step) whenever a -> b is a production


We use a =>* b as the reflexive and transitive closure of the => relation


A string w of terminals is in the language defined by grammar G if S =>* w

In this case, the string w is called a sentence of G

If S =>* a where a may contain nonterminals, we say a is a sentential form of the grammar G


Chomsky’s hierarchy relates the power of the different types of grammars.

A completely arbitrary grammar defines the largest possible class of languages called type 0 languages. It takes the full power of a Turing machine to determine is a string belongs to such a language.

Attribute grammars and Post systems formalisms are equally powerful.


Type Name Machine Grammar


0 unrestricted Turing machines, Post systems

1 context-sensitive Linear-bounded automata ê b ê ³ ê a ê

  1. context-free Pushdown automata, BNF a Î N
  2. regular Finite automata, regular expressions A -> w B, A -> w


By restricting the form that productions a -> b can take, four distinct classes of formal languages emerge.

Regular languages are those that can be defined using grammars with productions limited to at most one left-hand side nonterminal and right-hand sides consisting of nonterminals.

Context-free grammars are restricted to productions with left-hand sides that are nonterminals.

Context-sensitive grammars require that the left-hand side be longer than the right-hand side.




Parsing problem: how to efficiently recognize when a string is a sentence of a grammar ?


Parsing of natural languages is difficult die to the ambiguity of language


e.g., 2 different interpretations possible depending on how the words are interpreted

He saw that gasoline can explode.

The chickens are ready to eat.


Similar ambiguity in PLs with the so-called dangling else problem:

ALGOL 60’s "if-then" and "if-then-else" statements lead to an ambiguous grammar


S -> if C then S | if C then S else S | S’


A sequence of tokens of the form:


if C1 then S1 else if C2 then S2 else S3


Has only one interpretation. In contrast the following string has two reasonable interpretations:


if C1 then if C2 then S1 else S2

if C1 then (if C2 then S1 else S2) or if C1 then (if C2 then S1) else S2


A formal definition of a parse tree for a context-free grammar is possible.

Each production used in the derivation of a string appears as a subtree in the diagram.

The left-hand side nonterminal appears as a node, and all the grammar symbols in the right-hand side of the production appear as children of this node.


ALGOL 60 prohibited the nested "if" statement, as it could always be avoided using the begin/end statement.

PL/I and Pascal adopted the solution of matching the dangling else to the nearest unmatched if statement. This solution can be formalized by the following unambiguous grammar:


S -> MS | UMS

MS -> if C then MS else MS | S’

UMS -> if C then S | if C then MS else UMS


S is either a matched statement MS or an unmatched statement UMS.

A S’ is any statement other than an if statement.

The first parse tree for the example of nested if statement is the only one possible using this unambiguous grammar.


ALGOL 68 introduced the keyword fi, which also solves the dangling else problem. This same problem is solved using end if in Ada. The terminating keyword solution appears to be generally favored over the nearest unmatched solution in more recent programming languages.


C.2.3.4. Attribute Grammars




A context-free grammar is a description mechanism for formal languages that is easy to comprehend. This accounts for its popularity as a descriptive tool (v.s. BNF).


However, a number of apparently simple languages are not context-free, although close.

Such languages are often best described by giving a context-free grammar and noting the exceptions and restrictions.



Syntax of the Ada block statement (simplified)


<block> ::= <block identifier> :

begin <statement> {<statement>}

end <block identifier> ;


The Ada reference manual adds the restriction that the second block identifier must be equal to the first block identifier. Since this description is not difficult to express, it would be convenient if it were easily formalizable without resorting to context-sensitive grammars.

One approach is to augment the context-free production above with an attribute OK that will be set to true if the production is used correctly.



Constraint expression


<block>.OK := <block identifier>1.value = <block identifier>2.value


Formalizing the approach of augmenting a context-free grammar with attributes:


Approach attaches additional information to each node of a parse tree.

Two types of information is allowed:

Information synthesized from the subtree below the node, and information inherited from the subtree above the node.


An attribute grammar definition will associate some functions to compute the value of the attributes with each production of the grammar.

If these functions are carefully organized, it is possible to use them to decorate any parse tree of the grammar (these local definitions associated with each production of the grammar define the values of the attributes for all parse trees).

Given the definitions and a parse tree, algorithms exists to compute the attributes of all the nodes in the tree.


Starting with the underlying context-free grammar G = <N, T, P, S>.

For every production p in P, the number of terminal and nonterminal symbols in string a is denoted n(p).

If a is the empty string, then n(p) = 0.

Sometimes, we will find it necessary to consider each symbol of a production individually.

For all productions, p Î P, we write A -> a or


p0 -> p1 p2 … pn(p)


p0 is the left-hand side nonterminal of the production p, and p1 is the first terminal or nonterminal symbol on the right-hand side (if there are any), and so on.


An attribute grammar is a context-free grammar augmented by attributes and semantic rules.

The set of attributes will be denoted At.

For each attribute a Î At, we associate a set of values Domain(a).

An attribute is just a name for a set of values.

The set of attributes is divided into two disjoint classes, the inherited attributes In, and the synthesized attributes Syn (At = In È Syn, and In Ç Syn = Æ )..

To every grammar symbol x Î N È T, we associate a set of attributes At(x) Ì At.

We can think of At(x) as additional information about the symbol x.

And we set

In(x) = {a Î At(x) | a Î In}

Syn(x) = {a Î At(x) | a Î Syn}


We require that In(S) = Æ for the start symbol S (since it cannot inherit any information).

For all t Î T, we must have Syn(t) = Æ (no structure beneath a terminal symbol from which to synthesize information, note that synthesized attributes for terminals may be allowed if we assume that the attribute grammar system sits on top of a lexical analyzer that provides the attribute information for terminals, e.g., the terminal "real" may have its value as an attribute).

We could, without loss of generality, assume that In(x) and Syn(x) are singleton sets (as any set of attributes can be considered a single attribute).


Now consider productions:

It is possible for the same attribute to be associated with different symbols appearing in the same production.



Production S -> AB in some appropriate grammar

S, A, and B may all have the inherited attribute int associated to them

In(S) = In(A) = In(B) = {int}


Thus, we cannot consider the set of attributes associated with all the symbols of a production without losing track of which attributes appear more than once.


More confusing are productions that have a nonterminal symbol appearing more than once



S -> ASA


Hence, we introduce the notion of an attribute occurrence of a production p which is an ordered pair of attributes and natural numbers <a, j> representing the attribute a at position j in production p. This notion is well defined only when we have a Î At(pj) and 0 £ j £ n(p).

For a particular production p Î P an attribute occurrence at j will be written pj.a.

The set of attribute occurences for a production p is defined as follows:


AO(p) = {pj.a | a Î At(pj), 0 £ j £ n(p)}


The set of attribute occurences for a production is divided into two disjoint subsets.

The defined occurences for a production p is defined as follows:


DO(p) = {p0.s | s Î Syn(p0) } È {pj.i | i Î In(pj), 1 £ j £ n(p) }


The used occurences for a production p is defined as follows:


UO(p) = {p0.i | i Î In(p0) } È {pj.s | s Î Syn(pj), 1 £ j £ n(p) }


Considering the parse tree in which the production p has been used, the st UO(p) represents the information flowing into the node of the parse tree labeled p0 and DO(p) represents the information flowing out.



Information flow in a parse tree in which the production S->AB has been used

Used attribute occurences (information flowing in): In(S), Syn(A), Syn(B)

Defined attribute occurences (information flowing out): Syn(S), In(A), In(B)


For every attribute occurrence v Î DO(p), we must have a defining semantic function fp,v.

The semantic functions define values for attributes in DO(p) in terms of the values of the attributes in UO(p). This is a function that produces a value for the attribute a from values of the attributes of UO(p). We can make this precise by extending the notation of domains to attribute occurences.

For v = pj.a, we set Domain(v) to be Domain(a). Thus the type of the semantic function fp,v is


Fp,v : Domain(v1) x … x Domain(vk) -> Domain(v)


where v1, …, vk are attribute occurences from the set UO(p).


There is no requirement that the semantic function fp,v use all the attribute occurences of UO(p) in its computation. The set of attribute occurences it does use is called the dependency set of fp,v and is denoted Dp,v. Dp,v is a subset of UO(p). It is possible that Dp,v is empty, in which case the value of the attribute must be computed without any other additional information, so the function fp,v is a constant.


Based on the above, we are in a position to define an attribute grammar as a context-free grammar with two disjoint sets of attributes (inherited, and synthesized) and semantic functions for all defined attribute occurences.

The language defined by an attribute grammar is usually of no interest. Instead, the values in the decorated parse tree are usually of interest.


Binary digits example


Consider the following context-free grammar generating strings of binary digits.


Four productions:


p: B -> D

q: B -> DB

r: D -> 0

s: D -> 1


We now give an attribute grammar which purpose is to define the value represented by the binary numbers generated by the context-free grammar.


We begin by describing the attributes we wish to use:

Nonterminal B will have synthesized attributes pos and val

Nonterminal D will have inherited attributes pow and synthesized attribute val

The two terminal symbols 0 and 1 will have no attributes

The attribute val will be used to accumulate the value of the binary number

The attribute pow and pos will be used to keep track of the position, and hence what power of 2 in the binary number we are currently considering.


From the above information , we can compute the defined and used occurences for each production.



Defined occurences of q are the synthesized attributes of q0 = B, i.e., q0.pos, and q0.val

To these attribute occurences, we add the inherited attributes of all the grammar symbols

on the right-hand side. In this case, there is only one q1.pow

The used occurences of q are the remaining attribute occurences, namely q1.val, q2.pos,

And q2.val.


Note that we could write the attribute occurrence q0.pos as B.pos, but then q2.pos would also be written B.pos (reason why attribute occurences were introduced in the first place).

As an alternative, if we agree to add subscripts from left to right to all grammar symbols that occur more than once, we can write all attribute occurences with the grammar symbol and not the position in the production.


To complete the description of an attribute grammar based on the given context-free grammar and the above attributes, we must give function definitions for the eight defined attribute occurences. These definitions compute the value of the binary numbers in the language. The productions of the context-free grammar have been augmented by assignments that are meant to define the semantic functions.



p: B -> D

B.pos := 1

B.val := D.val

D. pow := 0

Defined: B.pos, B.val, D.pow, Used: D.val

q: B1 -> D B2

B1.pos := B2.pos + 1

B1.val := B2.val + D.val

D.pow := B2.pos

Defined: B1.pos, B1.val, D.pow, Used: B2.pos, B2.val, D.val

r: D -> 0

D.val := 0

Defined: D.val, Used: D.pow

s: D-> 1

D.val := 2D.pow

Defined: D.val, Used: D.pow


Every assignment defines one semantic function:



assignment associated with production q

B1.val := B2.val + D.val

This is a definition of the semantic function fq,v where v is the attribute occurrence

B1.val (or q0.val).

The definition of this function is necessary by virtue of the fact that q0.val Î DO(q)

since val Î Syn(B).

Notice that attribute occurences B2.val and D.val are in UO(q), since they are

synthesized attributes occuring on the right-hand side.




Evaluation of parse tree for 1010

B (pos=4, val=10)


D (pow=3, val=8) B (pos=3, val=2)

1 D (pow=2, val=0) B (pos=2, val=2)


0 D (pow=1, val=2) B (pos=1, val=0)


    1. D (pos=0, val=0)



Expression language example


Consider an example for evaluating expressions of a simple expression language.


Underlying context-free grammar:


S -> E

E -> 0 | 1 | I | let I = E in E end | (E + E)



1 + 1 is an expression in the language with value 2

let x=1 in x+x end is an expression in the language with value 2


Goal is to design an attribute grammar to give the definition of the value for all expressions in this language. Our attribute grammar to evaluate expressions will have two attributes: val (a synthesized attribute computing the value of the expression), and env (an inherited attribute containing the bindings formed by let expressions).



Synthesized: (none)

Inherited: (none)


Synthesized: val

Inherited: env


The value of an expression will be an integer, thus we take Domain(val) to be the set of integers.

An environment is not a simple data type, but a list of bindings. Each binding is a name/value pair.

We assume that we have some helpful functions that operate on environments (LookUp to determine the value belonging to a name in some environment, Update to add a binding to the environment). Finally, we need some initial environment which we may take as the empty list of bindings.


Attribute grammar to compute the value of expressions is as follows:


S -> E

E.env := InitialEnv

E -> 0

E.val := 0

E -> 1

E.val := 1

E -> I

E. val := LookUp(I, E.env)

E1 -> let I = E2 in E3 end

E2.env := E1.env

E3.env := Update(E1.env, I, E2.val)

E1.val := E3.val

E1 -> (E2 + E3)

E1.val := E2.val + E3.val

E2.env := E1.env

E3.env := E1.env


Note that environment is being passed around.

Programs that evaluate attribute grammars should avoid excessive copying of large data structures such as he environment in this example (or a symbol table in a more realistic example).


Computing of all the attributes for a particular parse tree for let x=1 in x+x:




E (env=nil, val=2)


let x = E (env=nil, val=1) in E (env=(x,1), val =2) end


1 E (env=(x,1), val=1) + E (env=(x,1), val=1)


x x



Note that we do not give techniques for computing the values of all the attributes for a given parse tree. This may be quite complez and involve many passes through the parse tree. For some attribute grammars, it is not possible to compute the values of all the attributes.


C.2.3.5. Post Systems


Formal Systems:


Grammars are useful in describing certain aspects of PLs.

Lambda-calculus is useful for models of PLs.

Deductive formal systems are useful for reasoning about PLs


Deductive systems are rarely presented as grammars, they are more often given as Post system.


Definition of Post systems


A Post system consists of a list of signs, a list of variables, and a finite set of productions.

The signs form the alphabet of the canonical system.

A term is a string of signs and variables

A word is a string of signs

A production is a figure of the form t1 t2 … tn / t, where t, t1, …, tn (n ³ 0) are all terms.

The ti terms are called the premises, and t the conclusion of the production.

A production without premises (n = 0) is called an axiom.

An instance of a production is obtained from a production by substituting strings of signs for all the variables, the same string being substituted for all occurences of one and the same variable.


Given a Post system, we shall define inductively the notion of a proof of a word:


  1. An instance of an axiom is a proof of the conclusion.
  2. If Pr1, Pr2, …, Prn are proofs of a1, a2, …, an respectively, and a1 a2 … an / a is an instance of a production, then Pr1 Pr2 … Prn / a is a proof of a.


A term is provable in a Post system if we can find a proof for it.

The set of words provable from a Post system forms the language derived by the system.

Sometimes it is necessary to consider the language derived by a Post system to be the set of strings from a subset of the signs that are provable.


Simple examples of Post systems


1. Tally notation:


Signs are {N, |}

One variable x

Two productions: / N, Nx /Nx |


This Post system derives the word N, other words can be derived from N using the second production. An instance of the second production is N / N | (x replaced by empty string) which can be used in the proof of N | : / N / N|

Hence, this Post system derives strings of the form N | … |.


2. Tally notation for natural numbers previously introduced:


This Post system derives addition equations of the form x+y=z. It uses the variables x, y, z, and the signs {N, |, +, =}.


/N Nx/Nx | Ny/ +y =y x+y=z / x | +y=z |


The last tow rules reflect the recursive definition of addition in terms of the successor function (concatenating | to a string is like adding one). From this Post system, correct equations are derivable like 2 + 2 = 4:


/N/N | / N || / + || = || / | + || = ||| / || + || = |||


3. MIU system of Hofstadter


/MI xI/xIU Mx/Mxx xIIIy/xUy sUUy/xy


This Post system produces strings beginning with M and containing I an U.


In all above examples, the productions have the following two properties:

Each production has at least one premise.

A variable occurs at most once in a premise.


Relaxing these restrictions provides much more flexibility (proof trees, pattern matching).

The restrictions make the Post system easier to understand, and do not diminish the power of Post systems.



one rule with two premises


/N Nx / Nx | Nx Ny / x+y=xy


This Post system generates all correct equations of the form x+y=z (uses the fact that

Concatenation is addition for tally notation)


Connection with grammars


A Post system is another formalism for an unrestricted grammar.


Post systems live today in the formalism and terminology of formal logical systems.


Posts systems are good for pattern matching (sometimes difficult to capture easily in grammars).



sxa / xxx applies only to strings that have the patern "x" appearing twice, separated by

the terminal symbol a (not easy to simulate with grammars).


Terminology comparison:


Formal system Grammar


Alphabet Alphabet

Well-formed formula String

Axiom Start symbol

Rule of inference Production

Theorem Sentential form


A Post system for propositional logic


Propositional logic consists of a collection of propositions P, R, Q, etc. and statements about these propositional symbols using various connectives. For example Ø P, P & Q, P => Q.


Post system uses the set {P, |, N, C, F, Th} as signs, and {x, y, z} as variables.

Two productions for propositions: /P, Px / Px|


The words of the form Px provable in the Post system are concrete representations of propositions. This set of words is particularly simple, it is just the set:


{P, P |, P ||, …}


For formulas we have three productions:


Px / F Px Fx / F Nx Fx Fy / F Cxy


The words of the form Fx provable in the Post system are concrete representations of formulas in the propositional logic. Strings of the form FNx correspond to negated formulas and strings of the form Fcxy correspond to implications (prefix notation is convenient since no parentheses are required).


For theorems, we have four productions (first three correspond to three axioms of propositional logic, which are not axioms in the Post system because of the well-formedness conditions that are the premises of the productions).


Fx / Th C C N xxx Fx Fy / Th Cx C Nxy Fx Fy Fz / ThCCxyCCyzCxz


These conditions are needed to ensure that if a term has the form Thx, then Fx is a formula.


The last production corresponds to modus ponens


ThCxy Thx / Thy


The language defined by words of the form Thx cannot be described as easily as that of propositions and formulas. This is due to the fact that the collection of words of the form Thx is not just a random selection of symbols.


What does it mean for a string of the form Thx to be true ?


We require first a notion of assignment: a function s from propositions to the set { T, ^ } = Bool, or equivalently from the natural numbers to Bool. If s (I) = T, then Pi is a true proposition. The set of all possible assignments is denoted by å .


For convenience, we associate with each proposition a nonnegative integer in the following manner:


  1. N[P] = 0
  2. N[Px |] = N[Px] + 1


So we can view assignments s Î å as functions from natural numbers to boolean values.


The semantics for propositional logic is given by a function M from å to Bool:


  1. M[FPx] s = s (N[Px])
  2. M[FNx] s = T, iff M[Fx] s = ^
  3. M[Fcxy] s = T, iff M[Fx] s = ^ or M[Fy] s = T


We say that a formula Fx of the propositional logic is valid if M[Fx] s = T for all assignments s .


Theorem: The term Thx is derivable in the Post system for propositional calculus iff the formula represented by Fx is valid.


Proof omitted.


C.2.3.6. Abstract Syntax


We take a view focusing on the underlying significance of the constructs (rather than how to parse constructs or how convenient the constructs are).


Definition of abstract syntax


BNF notation and context-free grammars are used to describe a programming language in a synthetic manner (how the various kinds of programs are built up from their parts).


It is sometimes useful to describe a language in an analytic manner (abstract syntax description).


An abstract syntax is abstract in the sense that it is independent of the notation used to represent the language constructs. It only affirms that they can be built, recognized, and taken apart.


Functions used to build the constructs are sometimes called constructors.


Functions used to recognize constructs are sometimes called recognizers.


Functions used to take constructs apart are called destructors or selectors.


Abstract syntax of while programs


Concrete syntax of a simple PL given by a BNF definition:


W ::= I:= E

W ::= if B then W else W

W ::= while B do W

W ::= W ; W


We may not be interested in concrete syntax issues such as:


  1. Whether the assignment operator is := or something else.
  2. Whether the conditional statement has the terminating delimiter or not
  3. Whether semicolons are separators or terminators


These issues must be resolved to define the BNF.

Also, the grammar is ambiguous, there are two interpretations of the program


while B do W1 ; W2


This is only relevant if we are building a parser.


The idea is to represent the constructs of the while programming language with four constructors: Assignment, Block, While, If.


Assignment(I, E)

Block(W1, W2)

While(B, W)

If(B, W1, W2)


I represents identifiers, E expressions, B boolean-valued expressions, and W other while program fragments.


The above ambiguous program would be rendered as:


While(B, Block(W1, W2)


Block(While(B, W1), W2)


Which leads to an unambiguous program (unlike a parse tree).


To manipulate the abstract representation of the programs, we require selectors to access the subpieces of the constructed program: Ident, LHS, FirstOfBlock, SecondOfBlock, WhileCond, WhileBody, IfCond, ThenStmt, ElseStmt. These selectors extract the indicated parts of the program.


C2.4. Comments on Sethi's Chapter 2


Look at chapter 13.2 (attribute grammars), and chapter 13 in general for additional formalism.