Start Lecture #1

Chapter 0: Administrivia

I start at 0 so that when we get to chapter 1, the numbering will agree with the text.

0.1: Contact Information

0.2: Course Web Page

There is a web site for the course. You can find it from my home page, which is http://cs.nyu.edu/~gottlieb

0.3: Textbook

The course text is Scott, "Programming Language Pragmatics", Third Edition (3e).

0.4: Computer Accounts and Mailman Mailing List

0.5: Grades

Grades are based on the labs, the midterm exam, and the final exam, with each very important. The weighting will be approximately
30%*LabAverage + 30%*MidTermExam + 40%*FinalExam (but see homeworks below).

0.6: The Upper Left Board

I use the upper left board for lab/homework assignments and announcements. I should never erase that board. Viewed as a file it is group readable (the group is those in the room), appendable by just me, and (re-)writable by no one. If you see me start to erase an announcement, let me know.

I try very hard to remember to write all announcements on the upper left board and I am normally successful. If, during class, you see that I have forgotten to record something, please let me know. HOWEVER, if I forgot and no one reminds me, the assignment has still been given.

0.7: Homeworks and Labs

I make a distinction between homeworks and labs.

Labs are

Homeworks are

0.7.1: Homework Numbering

Homeworks are numbered by the class in which they are assigned. So any homework given today is homework #1. Even if I do not give homework today, the homework assigned next class will be homework #2. Unless I explicitly state otherwise, all homework assignments can be found in the class notes. So the homework present in the notes for lecture #n is homework #n (even if I inadvertently forgot to write it on the upper left board).

0.7.2: Doing Labs on non-NYU Systems

You may solve lab assignments on any system you wish, but ...

0.7.3: Obtaining Help with the Labs

Good methods for obtaining help include

  1. Asking me during office hours (see web page for my hours).
  2. Asking the mailing list.
  3. Asking another student, but ...
    Your lab must be your own.
    That is, each student must submit a unique lab. Naturally, simply changing comments, variable names, etc. does not produce a unique lab.

0.8: A Grade of Incomplete

The new rules set by GSAS state:

  3.6.  Incomplete Grades: An unresolved grade, I, reverts to F one
        year after the beginning of the semester in which the course
        was taken unless an extension of the incomplete grade has been
        approved by the Vice Dean.
     3.6.1.  At the request of the departmental DGS and with the
             approval of the course instructor, the Vice Dean will
             review requests for an extension of an incomplete grade.
     3.6.2.  A request for an extension of incomplete must be
             submitted before the end of one year from the beginning
             of the semester in which the course was taken.
     3.6.3.  An extension of an incomplete grade may be requested for
             a period of up to, but not exceeding, one year
     3.6.4.  Only one one-year extension of an incomplete may be granted.
     3.6.5.  If a student is approved for a leave of absence (See 4.4)
             any time the student spends on that leave of absence will
             not count toward the time allowed for completion of the
             coursework.
  

0.9: Academic Integrity Policy

This email from the assistant director describes the policy.

    Dear faculty,

    The vast majority of our students comply with the
    department's academic integrity policies; see

      www.cs.nyu.edu/web/Academic/Undergrad/academic_integrity.html
      www.cs.nyu.edu/web/Academic/Graduate/academic_integrity.html

    Unfortunately, every semester we discover incidents in
    which students copy programming assignments from those of
    other students, making minor modifications so that the
    submitted programs are extremely similar but not identical.

    To help in identifying inappropriate similarities, we
    suggest that you and your TAs consider using Moss, a
    system that automatically determines similarities between
    programs in several languages, including C, C++, and Java.
    For more information about Moss, see:

      http://theory.stanford.edu/~aiken/moss/

    Feel free to tell your students in advance that you will be
    using this software or any other system.  And please emphasize,
    preferably in class, the importance of academic integrity.

    Rosemary Amico
    Assistant Director, Computer Science
    Courant Institute of Mathematical Sciences
  

Chapter 1: Introduction

Brief History

Very early systems were programmed in machine language.

Next came assembly language, where the programmer uses mnemonics for the opcodes but still writes one line for every machine instruction. Simple translators, called assemblers, translate this language into machine code. Assembly language, like machine language, is completely non-portable. Later assemblers supported macros, which reduced the tedium but did not increase portability.

Today high-level languages dominate and are translated by sophisticated software called compilers. The 3e has much material on compilers, but I will de-emphasize that portion since we give a course on the subject (G22.2130). My extensive lecture notes for 2130 can be found off my home page.

1.1: The Art of Language Design

What is a Program?

At the most primitive level, a program is a sequence of characters chosen from an alphabet. But not all sequences are legal: there are syntactic restrictions (e.g., identifiers in C cannot begin with a digit). We also ascribe meaning to programs using the semantics of the language, which further restricts the legal sequences (e.g., declarations must precede uses in Ada).

A Programming Language specifies what character sequences are legal and what these sequences mean (i.e., it specifies the syntax and semantics).
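
As a small illustration of a syntactic restriction, the rule that identifiers in C cannot begin with a digit can be sketched as a pattern check. This is only a sketch under simplifying assumptions: it considers ASCII characters only and does not exclude keywords, and the names C_IDENTIFIER and is_c_identifier are made up for this example.

```python
import re

# Simplified C identifier syntax (an assumption for illustration):
# a letter or underscore, followed by letters, digits, or underscores.
C_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_c_identifier(text):
    """Return True if text satisfies the simplified identifier syntax."""
    return C_IDENTIFIER.fullmatch(text) is not None
```

With this check, "count1" and "_tmp" are accepted, whereas "1count" is rejected because it begins with a digit.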

Why So Many Languages?

Desirable Traits

Why are some languages so much more popular/successful than others?

1.2: The Programming Language Spectrum

Imperative vs. Declarative Languages

An imperative language programmer essentially tells the computer exactly how to solve the problem at hand. In contrast a declarative language is higher-level: The programmer describes only what is to be done. For example, malloc/free (especially free) is imperative, but garbage collection is normally found in declarative languages. The Prolog example below illustrates the declarative style.
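
The contrast can be approximated within a single language. The sketch below computes the same result twice in Python: once by spelling out how (a loop that repeatedly updates a variable) and once by stating what is wanted. Python is not a declarative language, so the second version only hints at the style; both function names are made up for this example.

```python
def sum_even_squares_imperative(numbers):
    """Say how: walk the list and update an accumulator step by step."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total = total + n * n   # the program state changes here
    return total

def sum_even_squares_declarative(numbers):
    """Say what: the sum of the squares of the even elements."""
    return sum(n * n for n in numbers if n % 2 == 0)
```

The first version also shows the changing program state that characterizes the von Neumann style.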

There are broad spectrum languages like Ada that try to provide both low-level and high-level features. By necessity, such languages are large and complex.

von Neumann: Fortran, Pascal, C, Ada 83

This is the most common programming paradigm and largely subsumes the Object-Oriented class described next. It is the prototypical imperative language style.

The defining characteristic is that the state of a program (very roughly the values of the variables) changes during execution. Very often the change is the result of executing an assignment statement.

More recently, the term von Neumann has also been used to refer to serial execution of a program (i.e., no concurrency).

Object-Oriented: Simula 67, Smalltalk, Ada 95, Java, C#, C++

These are languages that emphasize information hiding and (especially) inheritance. Data structures are bundled together with the operators using them. Most are von Neumann, but pure object-oriented languages like Smalltalk have a different viewpoint of computation, which is viewed as occurring at the objects themselves.

Functional (a.k.a. Applicative): Scheme, ML, Haskell

These languages are based on the lambda calculus. They emphasize functions (without side effects) and discourage assignment statements and other forms of mutable state. Functions are first-class objects; new functions can be constructed while the program is running. This will be emphasized when we study Scheme. Here is a taste.

(define sumtwo
  (lambda (A B)
    (+ A B)))

> (sumtwo 5 8)
13

(define sumthree
  (lambda (A B C)
    (+ (+ A B) C)))

> (sumthree 5 8 2)
15

(define oosumtwo              ; "object oriented" sum
  (lambda (A B)
    ( (lambda (X) (+ X B))    ; this is the anonymous "add B" function
      A)))                    ; which we apply to A

> (oosumtwo 5 8)              ; the anonymous "add 8" temporarily existed
13
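
For readers who do not yet know Scheme, here is a rough Python analogue of sumtwo and oosumtwo. It is a sketch rather than a faithful translation, and the helper make_adder is an assumption introduced only to make the run-time construction of the "add B" function explicit.

```python
def sumtwo(a, b):
    return a + b

def make_adder(b):
    """Construct and return a new "add b" function while the program runs."""
    return lambda x: x + b

def oosumtwo(a, b):
    # Build the anonymous "add b" function, then apply it to a.
    return make_adder(b)(a)
```

Here oosumtwo(5, 8) builds an "add 8" function, applies it to 5, and then discards it, mirroring the Scheme version.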
  

Logic (Declarative, Constraint Based): Prolog

A program is a set of constraints and rules. The following example is a fixed version of Figure 1.14.

    gcd(A,B,G) :- A=B, G=A.
    gcd(A,B,G) :- A>B, C is A-B, gcd(C,B,G).
    gcd(A,B,G) :- B>A, C is B-A, gcd(C,A,G).
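
For comparison, the same subtraction-based gcd can be written imperatively. The Python sketch below assumes positive integer arguments (as the Prolog rules implicitly do) and says exactly how to compute the answer, whereas the Prolog rules only state the relationships that must hold.

```python
def gcd(a, b):
    """Subtraction-based gcd, assuming a and b are positive integers."""
    if a <= 0 or b <= 0:
        raise ValueError("gcd sketch assumes positive integers")
    while a != b:          # first rule: when A = B the answer is A
        if a > b:
            a = a - b      # second rule: replace A by A-B
        else:
            b = b - a      # third rule: replace B by B-A
    return a
```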
  

Homework: 1.4. Unless otherwise stated, numbered homework problems are from the Exercises section of the current chapter. For example, this problem can be found in section 1.8.

Scripting (Shell, Perl, Python)

Often used as glue to connect other programs together.

Mixtures

Many languages contain aspects of several of these classes. As mentioned, C++, Java, and C# are both von Neumann and object-oriented. The language OHaskell is object-oriented and functional.

Concurrency

Concurrency is not normally considered a separate class. It is usually obtained via extensions to languages in the above classes. These extensions can be in the form of libraries. A few languages have built-in concurrency features, although the bulk of the language is devoted to serial execution. Examples include threads in Java and rendezvous in Ada.
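
As an example of concurrency obtained through a library, the following Python sketch uses the standard threading module. The function name and the decision to give each thread its own slot in a results list are assumptions chosen to keep the example free of locks.

```python
import threading

def concurrent_sum_of_squares(chunks):
    """Sum the squares of all numbers, using one thread per chunk."""
    results = [0] * len(chunks)

    def worker(index, chunk):
        # Each thread writes only to its own slot, so no lock is needed.
        results[index] = sum(n * n for n in chunk)

    threads = [threading.Thread(target=worker, args=(i, chunk))
               for i, chunk in enumerate(chunks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()               # wait for every worker to finish
    return sum(results)
```

The rest of the program remains ordinary serial code; only the calls into the library introduce concurrency.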

(High-Level vs Low-Level) High-Level Languages

In this course we are considering only high-level languages, i.e., we exclude machine and assembly languages. However, the class of high-level languages is quite broad and some are considered higher level than others.

Thus when comparing languages, we often call C and Fortran low-level since the programmer has more control and must expend more effort. In contrast, languages like Scheme and Haskell are considered high-level. They require less effort for a given task but give the programmer less control and the run-time performance is not as easy to predict. Perhaps we should call C a low-level, high-level language, but that suggestion is too awkward to take seriously.

There are wide spectrum languages like Ada and C++ that provide low-level control when desired in addition to high-level features such as garbage collection and array manipulation. The cost of this is a large, complex language.

Homework: Page 16 CYU (check your understanding) #2.

Characteristics of Modern Languages

Modern general-purpose languages such as Ada, C++, and Java share many characteristics.

1.3: Why Study Programming Languages

Become Familiar with Various Idioms

Learning several languages exposes you to multiple techniques for problem solving, some of which are idioms in the various languages. Your exposure to the different idioms enlarges your toolbox and increases your programming power, even if you use only a few closely related languages.

Elegance, Beauty and Coolness

The automatic pattern matching and rule application of Prolog, once understood, is real neat. It is certainly not something I would have thought of had I never seen Prolog. If you encounter problems for which Prolog is well suited and for which a Prolog-based solution is permitted, the programming effort saved is substantial. A very primitive form of the Prolog unification is the automatic dependency checking, topological sorting, and evaluation performed by spreadsheets.
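
The spreadsheet behavior mentioned above (dependency checking, topological sorting, then evaluation) can be sketched with Python's standard graphlib module (Python 3.9 or later). The representation of a sheet as a dictionary from cell name to a pair (dependencies, formula) is an assumption made for this example, as are the names evaluate_sheet and sheet.

```python
from graphlib import TopologicalSorter

def evaluate_sheet(cells):
    """cells maps a cell name to (dependency names, formula function)."""
    graph = {name: set(deps) for name, (deps, _) in cells.items()}
    values = {}
    # static_order() lists each cell after all the cells it depends on;
    # it raises graphlib.CycleError if the references are circular.
    for name in TopologicalSorter(graph).static_order():
        deps, formula = cells[name]
        values[name] = formula(*(values[d] for d in deps))
    return values

sheet = {
    "A1": ((), lambda: 5),
    "A2": ((), lambda: 8),
    "A3": (("A1", "A2"), lambda x, y: x + y),
    "A4": (("A3", "A1"), lambda s, x: s * x),
}
```

The user of the sheet states only the formulas; the order of evaluation is worked out automatically, which is the declarative flavor being described.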

I have never written a serious program using continuation passing, but nonetheless appreciate the elegance and (mathematical-like) beauty the technique offers. I hope you will too when we study Scheme.
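
To give a first impression of continuation passing before we reach Scheme, here is a minimal Python sketch. In this style a function never returns its result to its caller; instead it passes the result to an extra argument k, the continuation, which represents the rest of the computation. All names are made up for this example, and Python (unlike Scheme) does not eliminate tail calls, so this is for illustration only.

```python
def add_cps(a, b, k):
    """Add a and b, then hand the result to the continuation k."""
    return k(a + b)

def mul_cps(a, b, k):
    """Multiply a and b, then hand the result to the continuation k."""
    return k(a * b)

def sum_of_squares_cps(x, y, k):
    # Square x, then square y, then add; each lambda is "what to do next".
    return mul_cps(x, x, lambda xx:
           mul_cps(y, y, lambda yy:
           add_cps(xx, yy, k)))
```

Calling sum_of_squares_cps(3, 4, lambda r: r) supplies the identity function as the final continuation and so simply yields the computed value.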

Applying Programming Language / Compiler Techniques outside the Domain

Very few of you will write a compiler or be part of a programming language design effort. Nonetheless, the techniques used in these areas can be applied to many other problems. For example, lexing and especially parsing can be used to produce a front end for many systems. Brian Kernighan refers to them as little languages; others call them domain-specific languages or application-specific languages.
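
As a sketch of how lexing and parsing give a front end for a little language, the following Python code tokenizes and evaluates integer expressions using +, -, *, and parentheses. The grammar, the function names, and the choice of a recursive descent parser are assumptions for this example, not something prescribed by the text.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*()])")

def tokenize(text):
    """Lexer: split the input into number and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def evaluate(text):
    """Recursive descent parser that evaluates as it parses."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():                     # expr := term (('+' | '-') term)*
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():                     # term := factor ('*' factor)*
        value = factor()
        while peek() == "*":
            advance()
            value *= factor()
        return value

    def factor():                   # factor := NUMBER | '(' expr ')'
        token = advance()
        if token is None:
            raise ValueError("unexpected end of input")
        if token.isdigit():
            return int(token)
        if token == "(":
            value = expr()
            if advance() != ")":
                raise ValueError("expected ')'")
            return value
        raise ValueError(f"unexpected token {token!r}")

    result = expr()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result
```

The same two-stage structure (tokens first, then a grammar-driven parser) carries over to configuration files, query languages, and other small input formats.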

1.4: Compilation and Interpretation

The standard reference for compilers is the Dragon Book, now in its third incarnation (gaining an author with each new life). My notes for our compiler course using this text can be found off my home page (click on previous courses).

A Compiler is a translator from one language, the input or source language, to another language, the output or target language.

Often, but not always, the target language is an assembly language or the machine language for a computer processor.

Note that using a compiler requires a two step process to run a program.

  1. Execute the compiler (and possibly an assembler) to translate the source program into a machine language program.
  2. Execute the resulting machine language program, supplying appropriate input.

This should be compared with an interpreter, which accepts the source language program and the appropriate input, and itself produces the program output.

Sometimes both compilation and interpretation are used. For example, consider typical Java implementations. The (Java) source code is translated (i.e., compiled) into bytecodes, the machine language for an idealized virtual machine, the Java Virtual Machine or JVM. Then an interpreter of the JVM (itself normally called a JVM) accepts the bytecodes and the appropriate input, and produces the output. This technique was quite popular in academia some time ago with the Pascal programming language and P-code.
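
The difference between the approaches can be sketched on a toy scale. In the Python code below, expressions are nested tuples such as ("+", 2, ("*", 3, 4)); compile_expr translates an expression into instructions for an idealized stack machine, run plays the role of the virtual machine that interprets those instructions, and interpret evaluates the source expression directly. The instruction set and all names are invented for this example and are far simpler than real bytecodes.

```python
def compile_expr(tree):
    """Compiler: translate an expression tree into stack-machine code."""
    if isinstance(tree, int):
        return [("PUSH", tree)]
    op, left, right = tree
    opcode = {"+": "ADD", "*": "MUL"}[op]
    # Code for the operands comes first, then the operator (postfix order).
    return compile_expr(left) + compile_expr(right) + [(opcode, None)]

def run(code):
    """Virtual machine: interpret the compiled stack-machine code."""
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if opcode == "ADD" else left * right)
    return stack.pop()

def interpret(tree):
    """Interpreter: evaluate the source tree directly, producing no code."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    l, r = interpret(left), interpret(right)
    return l + r if op == "+" else l * r
```

Using compile_expr and then run is the two-step process described above, while interpret goes from source to output in one step.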

1.5: Programming Environments

The Compilation Tool Chain

This section of the 3e is quite clear, but rather abbreviated. It is enough for this class, but if you are interested in further details you can look at the corresponding section of my notes for the compilers course (G22.2130).

Other Parts of the Programming Environment

In addition to the compilation tool chain, the programming environment includes such items as pretty printers, debuggers, and configuration managers. These aids may be standalone or integrated in an IDE or sophisticated editing system like emacs.

1.6: An Overview of Compilation

The material in 1.6 of the 3e, including the subsections below, is briefly covered in section 1.2 of my compilers notes. The later chapters of those notes naturally cover the material in great depth. However, we will not be emphasizing compilation aspects of programming languages in this course, so we will be content with the following abbreviated coverage.

Homework: 1.1 (a-d).

1.6.1: Lexical and Syntax Analysis

1.6.2: Semantic Analysis and Intermediate Code Generation

1.6.3: Target Code Generation

1.6.4: Code Improvement

1.7: Summary

Read.

Every chapter ends with these same four sections. I will not be repeating them in the notes. You should always read the summary.

1.8: Exercises

1.9: Explorations

1.10: Bibliographic Notes

Chapter 2: Programming Language Syntax

We cover this material (in much more depth) in the compilers course. For the programming language course we are content with the very brief treatment in the previous chapter.

Homework: 2.1(a), 2.3, 2.9(a,b)