Compilers
G22.2130.01
Wednesday 5:00 - 7:00
Room 109, Warren Weaver Hall
Professor Robert Dewar
Instructor: Robert Dewar
(dewar@cs.nyu.edu)
Teaching Assistant: Shubin Zhao
(shubinz@cs.nyu.edu)
Class list
All students should register themselves with the class list, which is used for
all technical discussions concerning the course and the course project.
To subscribe to this list visit the web page:
http://www.cs.nyu.edu/mailman/listinfo/g22_2130_001_sp01
where you will find full directions.
Please send all your questions to this list (not to the instructor) so that
everyone can participate.
Textbooks
required:
Aho, Sethi and Ullman: Compilers, principles, Techniques and tools
(Addison Wesley)
This is still the standard textbook, even though more than 10 years old. The
following suggested readings touch on more recent topics, but the Dragon
book still contains much information that every compiler writer should know.
Recommended readings:
Fraser and Hanson: a retargetable C compiler (Benjamin Cummings, 1996)
Muchnick: advanced compiler design and implementation (Morgan Kaufman, 1997)
Fisher & LeBlanc : Crafting a Compiler (Benjamin Cummings)
Appel : modern compiler implementation in C : basic techniques
Dewar & Smosna : Microprocessors, a programmer's perspective
(Mc-Graw Hill)
GNAT Sources
Throughout the course we will be referring to the sources of the GNAT
compiler, which will also be the basis of the course project. The
compact overview of GNAT provides a
roadmap to its main components. You can download GNAT for your
favorite machine, and get the full sources of the compiler, from the
NYU ftp site: ftp cs.nyu.edu,
directory pub/gnat. To build the compiler, you will need the sources
for GCC 2.81.
You may also find it useful to browse
the GNAT sources here. To download the entire set, grab gnat-html.zip, unzip it on your local
machine, and point your browser to index.htm.
(Please report any broken cross references to gingell@gnat.com.)
Ada, GNAT, and tool documentation:
Ada 95 reference manual.
The GNAT user's guide.
The GNAT reference manual.
GCC documentation.
GDB documentation.
All of the above are available for downloading in multiple formats
from the NYU ftp
site.
If Ada is new to you, you may want to start with this
ada summary.
Assembly resources:
The Art of Assembly is a complete online text book covering x86
assembly programming.
For Linux / Unix information see
linuxassembly.org. Especially see the
Linux Assembly HOWTO.
Complete x86 information is available from Intel:
Developer's Manual, Volume 1: Basic Architecture
Developer's Manual, Volume 2: Instruction Set.
Developer's Manual, Volume 3: System Programming Guide.
You may also find it useful to examine assembly generated by GCC from
your own code. To do so, pass GCC the -S flag or the -save-temps
flag. It's also straight forward to step through a program in GDB and
use 'stepi' to watch the instructions being executed for each line.
Course Work
Semester project: write a prototype compiler for a Pascal subset of Ada,
using the front-end of the GNAT compiler.
Various problem sets, and a final examination.
Final Exam
May 9th in class. Open book.
Exam guidelines.
Final Project
Compilers Final Project requirements and
submission instructions.
A minimal test program.
Dead Lines
- May 13th, midnight
- Deadline for initial submission of final project.
- May 20th, midnight
- If and ONLY IF you have made an initial submission of the final
project, you can submit an updated version with improvements etc by
this deadline. The submission must clearly indicate what the changes
are.
Lecture 1
Language translators: compilers and interpreters. The structure of a compiler:
lexical analysis, parsing, semantic analysis, intermediate code generation,
register allocation, global optimization. Bootstrapping a compiler.
This is essentially an overview of
what compilers are all about, and hence of the entire course
Readings: ASU chap. 1 and 2.
Lecture 2
A modern compiler : GNAT. The architecture of GNAT: phases, intermediate
representations, important data structures.
How to build GNAT from sources.
How to build modified versions of GNAT, tree-walking routines that use the
intermediate representation of GNAT.
Readings: GNAT specifications for syntactic and entity information, the
driver routines, interface to the GCC code generator.
an example tree-walking routine
example sources
syntax specification for GNAT AST
semantic information for entities in GNAT AST
Lecture 3
Lexical scanning: Token classes, fast keyword recognition, minimizing the
code-per-character cost of scanning, scanning numeric literals and string
literals. The interface between the scanner and the parser.
Hand-written vs. automatically generated scanners. Regular expressions.
Handling of constants
Readings: GNAT files
scans.ads,
scn.ads,
scn.adb,
scn-nlit.adb,
scn-slit.adb,
ASU sections 3.1 - 3.4 (rest of chapter optional)
The flex specifications
Lecture slides (or powerpoint)
Lecture 4
Parsing. Abstract syntax vs. concrete syntax. Grammars and the formal
specification of certain aspects of programming languages. Top-down
parsing and recursive descent. Error detection and recovery.
Readings: GNAT files
sinfo.ads ,
par.adb (the top level parsing routine),
and its subunits. such as
par-ch3.adb (parsing of declarations),
par-ch4.adb (expressions),
par-ch5.adb (statements),
par-ch6.adb (subprograms),
par-ch7.adb (packages),
par-ch10.adb (compilation units),
and ASU sections 4.1-4.4
Lecture 5
Automatic parser construction. FIRST and FOLLOW functions. LL(1) parsers.
LR parsers. Conflicts in LR grammars and how to resolve them. An introduction
to BISON.
The Bison specifications.
Lecture slides (or powerpoint)
Readings: ASU sec. 4.7-4.9
GCC 2.81 uses Yacc/Bison for it's C and C++ parsers, which provide realistic
examples. Some preprocessing is performed during the GCC build,
consult those sources for further details.
Lecture 6
Code generation: quadruples, assembly language, machine language.
Simple algorithms for register allocation. Formal specification of
machine characteristics. Note: logically this would fit better later
on in the course if we were tracing the action of a compiler from
beginning to end, but we move this earlier to give you a jump
start on the final project for the course.
Readings: GCC docs, ASU chapter 9
An excellent GCC tutorial can be found at www.gnat.com/gcctut/ppframe.htm.
Lecture 7
Run-time organization: storage allocation, non-local references, parameter
passing, dynamic storage allocation. Exception handling, debugging information.
Readings: GNAT files sem_res, exp_ch4, exp_ch6, exp_ch11.
ASU chapter 7.
Lecture slides (or powerpoint)
Lecture 8
Semantic analysis: attributes and their computation, tree-traversals,
visibility and name resolution. Inherited attributes and symbol tables.
Name resolution in block-structured languages.
Readings: GNAT files sem_chX.ads/adb, X= 3 to 8
ASU sections 5.1 - 5.5,
Lecture slides
Lecture 9
Type checking. Type systems, varieties of strong typing, overload resolution,
polymorphism and dynamic dispatching.
Type-checking and type inference, unification.
Readings: GNAT files sem_res, sem_type, sem_disp.
ASU chapter 6.
Lecture 10
Intermediate code generation: control structures, assignments,
aggregates. Lowering the semantic abstraction of high-level
constructs.
Readings: GNAT files exp_ch5, exp_aggr. ASU chapter 8.
Lecture 11
Code generation for RISC machines: pipelines, delay slots,
instruction scheduling
Readings : Microprocessors, chap. ???
Lecture 12
Peephole optimization. Case study: the REALIA Cobol compiler.
Lecture slides (or StarOffice format)
Lecture 13
Last official class. April 25
Global optimization: data flow analysis, monotonic frameworks, use-def
chaining, constant folding, common subexpression elimination.
Readings: ASU chapter 10, lecture notes.
Lecture 14
First exam week date. May 2nd
Special extra class. Guest lecturer by Dr. Ben Goldberg on advance
loop optimization.
Final Exam
May 9th in class
Open book. Exam guidelines.