Monday 7:00 - 9:00
Room 102, Warren Weaver Hall
Professor Edmond Schonberg

Instructor: Edmond Schonberg (
additional lectures: Robert Dewar (

Class list

All students should register themselves with the class list, which is used for all technical discussions concerning the course and the course project. To register, send an e-mail message to The contents of the message should be the single line:
     subscribe g22_2130_001_sp99
you will be notified in return that you are a list participant. Please send all your questions to this list (not to the instructor) so that everyone can participate.



Aho, Sethi and Ullman: Compilers, principles, Techniques and tools (Addison Wesley)
This is still the standard textbook, even though more than 10 years old. The following suggested readings touch on more recent topics, but the Dragon book still contains much information that every compiler writer should know.

Throughout the course we will be referring to the sources of the GNAT compiler, which will also be the basis of the course project. The compact overview of GNAT provides a roadmap to its main components.

Recommended readings:

Fraser and Hanson: a retargetable C compiler (Benjamin Cummings, 1996)
Fisher & LeBlanc : Crafting a Compiler (Benjamin Cummings)
Appel : modern compiler implementation in C : basic techniques
Dewar & Smosna : Microprocessors, a programmer's perspective (Mc-Graw Hill)
Muchnick : advanced Compiler Construction (Prentice Hall)
If Ada is new to you, you may want to start with this ada summary .

Course Work

Semester project: write a prototype compiler for a Pascal subset of Ada, using the front-end of the GNAT compiler.
Various problem sets, and a take-home final examination.


Assignment 1
Assignment 2
Assignment 3
Assignment 4

Lecture 1

Language translators: compilers and interpreters. The structure of a compiler: lexical analysis, parsing, semantic analysis, intermediate code generation, register allocation, global optimization. Bootstrapping a compiler.

Readings: ASU chap. 1 and 2.

Lecture 2

A modern compiler : GNAT. The architecture of GNAT: phases, intermediate representations, important data structures. How to build GNAT from sources. How to build modified versions of GNAT, tree-walking routines that use the intermediate representation of GNAT.

Readings: GNAT specifications for syntactic and entity information, the driver routines, interface to the GCC code generator.

an example tree-walking routine
syntax specification for GNAT AST
semantic information for entities in GNAT AST

Lecture 3

Lexical scanning: Token classes, fast keyword recognition, minimizing the code-per-character cost of scanning, scanning numeric literals and string literals. The interface between the scanner and the parser. Hand-written vs. automatically generated scanners.

Readings: GNAT files,, scn-nlit.adb, scn-slit.adb, ASU sections 3.1 - 3.4 (rest of chapter optional)

Lecture 4

Parsing. Abstract syntax vs. concrete syntax. Grammars and the formal specification of certain aspects of programming languages. Top-down parsing and recursive descent.

Readings: GNAT files , par.adb (the top level parsing routine), and its subunits. such as
par-ch3.adb (parsing of declarations),
par-ch4.adb (expressions),
par-ch5.adb (statements),
par-ch6.adb (subprograms),
par-ch7.adb (packages),
par-ch10.adb (compilation units),
and ASU sections 4.1-4.4

Lecture 5

Automatic parser construction. FIRST and FOLLOW functions. LL(1) parsers. LR parsers. Conflicts in LR grammars and how to resolve them.

Readings: ASU sec. 4.7-4.9

Lecture 6

Semantic analysis: attributes and their computation, tree-traversals, visibility and name resolution. Inherited attributes and symbol tables. Name resolution in block-structured languages.

Readings: the following GNAT files:
sem_ch3.adb (analysis of declarations),
sem_ch5.adb (statements),
sem_ch6.adb (subprograms),
sem_ch7.adb (packages),
sem_ch8.adb (visibility),
ASU sections 5.1 - 5.5,

Lecture 7

Type checking. Type systems, varieties of strong typing, overload resolution, polymorphism and dynamic dispatching. Type-checking and type inference, unification.

Readings: GNAT files sem_res, sem_type, sem_disp. ASU chapter 6.

Lecture 8

Run-time organization: storage allocation, non-local references, parameter passing, dynamic storage allocation. Exception handling, debugging information.

Readings: GNAT files sem_res, exp_ch4, exp_ch6, exp_ch11. ASU chapter 7.

Lecture 9

Intermediate code generation: control structures, assignments, aggregates. Lowering the semantic abstraction of high-level constructs.

Readings: GNAT files exp_ch5, exp_aggr. ASU chapter 8.

Lecture 10

Code generation: quadruples, assembly language, machine language. Simple algorithms for register allocation. Formal specification of machine characteristics.

Readings: GCC docs, ASU chapter 9

An excellent GCC tutorial can be found at

Lecture 11

Code generation for RISC machines: pipelines, delay slots, instruction scheduling

Readings : Microprocessors.

Lecture 12

Peephole optimization. Case study: the REALIA Cobol compiler.

Lecture 13

Global optimization: data flow analysis, monotonic frameworks, use-def chaining,constant folding, common subexpression elimination.

Readings: ASU chapter 10, lecture notes.