Programming Languages

G22.2110 Spring 1998


Class #5`

Conventional Imperative Programming Languages - Part II

OO Imperative Programming Languages - Part I


Homework #5: Sethi Exercises 5.3, 5.7, 5.10, 6.3, 6.11

Read Sethi’s Chapters 5, 6, 7, 15.3, 15.4, handouts and references




C4.0. Midterm, Current Assignments, PL Project

C4.1. Names, Bindings, Type Checking, and Scopes

C4.2. Procedures and their Implementation

C4.3. Comments on Sethi's Chapters 5, 6


C4.0. Current Assignments, PL Project


Homework #4 due today


Midterm, 1 hour in class – March 9th (Sethi’s chapters 1-6 and class notes)


PL Project – Part II postponed until March 9th (due March 23)


C4.1. Names, Bindings, Type Checking, and Scopes




variables: abstractions in a language for the memory cells of the machine.


In some cases, the characteristics of the abstractions are very close to the characteristics of the cells (e.g., integer variable represented exactly as an individual hardware memory word).


In other cases, the abstractions are far removed from the cells (e.g., 3 dimensional array which must be mapped).


Variables have a collection of properties, or attributes, the most important of which is type.


The design of data types in a language requires that a variety of issues be considered:




Names (aka, identifiers) are associated wiuth labels, subprograms, and formal parameters.


Design Issues


What is the maximum length of a name ?

Can connector characters be used in names ?

Are names case sensitive ?

Are the special words reserved words or keywords


Name Forms


A name is a string of characters used to identify some entity in a program.

(e.g., single character names in early programming languages as unknown in mathematical equations).

FORTRAN I: 6 character names

FORTRAN 90 and C: 31 character names

Ada: no length limit

C++: no length limit (all characters may not be significant in order to keep the symbol table manageable)


Connector characters: not allowed in Pascal, Modula-2, and FORTRAN 77.


Case sensitivity: recognized in C, C++, Modula-2 (e.g., WriteInt procedure for outputting an integer in Modula-2).

Some languages restrict names to use only uppercase letters (e.g., pre 90 FORTRAN).


Special Words


Used to make program more readable by naming actions to be performed.

Used to separate syntactic entities of programs.


In most languages, special words are reserved words, but in some they are only keywords.


Keyword: word of a programming languages that is special only in certain contexts:

e.g., in FORTRAN, special words are keywords recognized according to context

REAL APPLE (declarative statement)

REAL = 3.4 (variable name REAL)


Reserved word: special word of a programming language that cannot be used as a name.


Many languages include predefined names (between reserved words and user-defined names)

e.g., Ada


not reserved words,

can be redefined by any Ada program

implicitly visible to the compiler

e.g., Pascal

predefined names are sometimes called standard identifiers

readln, writeln

implicitly visible to the compiler

e.g., Ada

Other predefined names such as GET and PUT are made visible explicitly by with

statements written by the user.

e.g., C, C++

printf, scanf defined in library stdio

access to names predefined in libraries available to the compiler through header files.


In C/C++:







Most variables have names.

Nameless variables exist as well.

Names are often referred to as identifiers.




Memory address with which a variable is associated.


In many languages it is possible for the same name to be associated with different addresses at different places in the program.

e.g., C

functions f1 and f2 both defining a variable that uses the same name, say "sum".


Similarly, some languages allow the same name to be associated with different addresses at different times during program execution:

e.g., C

recursive function can have multiple versions of each locally declared identifier,

one for each activation of the function

e.g., C

function f1 has a local variable count, and is called by two different functions

count may be associated with a different address in each of the activations of f1.


The address of a variable is sometimes called its l-value.




It is possible to have multiple identifiers (aliases) reference the same address.


Hindrance to readability and difficult for program verification.


Creation of aliases:


explicit creation with the EQUIVALENCE statement.

e.g., Pascal, and Ada

can be created using the variant record structures

e.g., C/C++

can be created using the union variables


Aliasing can be created in many languages through subprogram parameters.


In languages that have pointer variables, two pointer variables are aliases when they point to the same memory location (not meant to conserver storage, but rather a side effect of the nature of pointers).


When a C/C++ pointer is set to point at a named variable, the pointer, when dereferenced, and the variable's name are aliases.


C++ reference variables:


There is no need anymore to create aliases for the purpose of reusing storage (can be replaced by a dynamic storage management scheme).




Determines the range of values the variable can have and the set of operations that are defined for values of the type.


INTEGER type specifies a value range of -32,768 to 32,767,

and arithmetic operations for addition, substraction, multiplication, and division, along

with some library functions for operations like absolute value.




Content of the memory cell or cells associated with the variable.


Cells are abstract (large enough to hold the variable)


object: denotes the combination of a memory cell and its contents.


A variable's value is sometimes called its r-value (to access the r-value, the l-value must be determined first).


The Concept of Binding


A binding is an association, such as between an attribute and an entity or between an operation and a symbol.


Binding time:


(*) bound to the multiplication operation at language design time.

INTEGER (FORTRAN) bound to range of values at language implementation time

Variable in C or Pascal program bound to particular data type at compile time

Call to a library subprogram bound to subprogram code at link time

Variable bound to storage cell when program is loaded into memory (does not happen

until run time in some cases (Pascal subprograms, C functions) if the definition does not

include the static qualifier.

e.g., C

int count;


count = count + 5;

Set of possible types for count: bound at language design time.

Type of count: bound at compile time

Set of possible values of count: bound at compiler design time.

Value of count: bound at execution time with this statement

Set of possible meanings for the operator symbol +: bound at language definition time

Meaning of the operator symbol +: bound at compile time

Internal representation of the literal 5: bound at compiler design time.


Complete understanding of the binding times for the attributes of program entities is a prerequisite for understanding the semantics of a programming language.


Binding of Attributes to Variables


Binding is static if it occurs before run time and remains unchanged throughout program execution.


Binding is dynamic if it occurs during run time or can change in the course of program execution.


Physical binding is complex: page or segment of the address space in which the cell reside may be moved in and out of memory many times during program execution (hardware binding). However changes are invisible to the program and the user.


Type Bindings


Before a variable can be referenced in a program, it must be bound to a data type.


Important aspects of the binding:


Types can be specified statically through some form of explicit or implicit declaration.


Variable Declarations


Explicit declaration:


Implicit declaration:


Both explicit and implicit declarations create static bindings to types.


Most programming languages require explicit declarations of all variables.


FORTRAN, PL/I, and BASIC allow implicit declarations.


if identifier is not explicitly declared and starts with I, J, K, L, M, or N it is

implicitly declared to be an INTEGER type, otherwise it is implicitly declared

to be REAL type.

e.g., ML

has implicit declaration, using a type inference system


Implicit declaration can be unreliable:


Dynamic Type Binding


Type is not specified by a declaration statement.


The variable is bound to a type when it is assigned a value in an assignment statement (bound to the type of the value, variable, or expression on the right side of the assignment)




e.g., APL

LIST <- 10.2 5.1 0.0

causes LIST to become a single-dimensioned array of floating-point elements of length 3.

if it is followed by: LIST <- 47, LIST becomes a simple integer variable.



e.g., i and x are integers, and y is floating-point variable

need i := x

keying error produces i := y

with dynamic type binding: i is changed to floating-point type, and results are erroneous

with static type binding: compiler detects an error, program does not get to execution.


Type Inference


ML (supports both functional and imperative programming)


The types of most expressions can be determined without requiring the programmer to specify the types of the variables.



fun circumf (r) = 3.14159 * r * r

types inferred from the type of the constant in the expression.

fun times10 (x) = 10 * x

argument and functional value inferred to be of type integer

fun square (x) = x * x

rejected (cannot infer type of * operator)

fun square (x) : int = x * x



Storage Bindings and Lifetime






lifetime of a program variable:


Static Variables


Variables bound to memory cells before program execution begins, and remaining bound to those same memory cells until program terminates.


Very efficient due (addressing can be direct, and no runtime overhead incurred for allocation and deallocation)


Disadvantage is reduced flexibility (recursive subprograms not supported).


Stack-Dynamic Variables


Variables whose storage bindings are created when their declaration statements are elaborated, but whose types are statically bound.


Elaboration of variable declaration: refers to storage allocation and binding process indicated by the declaration which takes place when execution reaches the code to which the declaration is attached (during run-time).

e.g., Pascal

declaration and code sections

declaration section is elaborated just before execution of the code section begins, i.e.

when the procedure is called.

storage for variables in the declaration section is allocated (from the runtime stack)

at elaboration time and deallocated when the procedure returns control to its caller.


Variables declared in a subprogram are called local variables.


Stack-Dynamic variables meet the needs of recursive subprograms .


e.g., pre 77 FORTRAN: all variables were static

FORTRAN 77 & 90: stack dynamic variables for locals with a statement

SAVE list

that allows the programmer to specify that some or all of the variables (those in

the list) in the subprogram in which SAVE is placed will be static.

e.g., C/C++

local variables are by default stack-dynamic

can be made static using the static qualifier at definition time


Explicit Heap-Dynamic Variables


Nameless objects whose storage is allocated and deallocated by explicit run-time instructions specified by the programmer.


Such variables can only be referenced through pointer variables.


Heap: collection of storage cells whose organization is highly disorganized because of the unpredictable nature of its use.


Explicit heap-dynamic variable creation:


e.g., C++

allocation operator: new

creates a heap-dynamic variable of the type specified as attribute to new, and return a

pointer to it.

type can be determined at compile time.

variable is bound to storage at the time it is created at runtime.


int *intnode;


intnode = new int; /* allocates an int object */


delete intnode; /* deallocates the object to which intnode points */


Useful for dynamic structures (linked lists, trees) that need to grow/shrink during execution.




Implicit Dynamic Variables


Variables bound to heap storage only when they are assigned values (names that adapt to whatever use they are asked to serve).






Type Checking


Activity of ensuring that the operands of an operator are of compatible types.


Subprograms are operators whose operands are their parameters


The assignment symbol is a binary operator with its target variable and its expression being the operands.


Compatible type: one that is either legal for the operator or is allowed under language rules to be implicitly converted (coercion) by compiler-generated code to a legal type.


Type error: application of an operator to an operand of an inappropriate type.


If all bindings of variables to types are static in a language, then type checking can nearly always be done statically.

Dynamic type binding requires type checking at run time (dynamic type checking).


Type checking is complicated when a language allows a memory cell to store values of different types at different time during execution:

e.g., Ada & Pascal

via variant records



e.g., C/C++

via unions

In this case, type checking must be dynamic and requires the runtime system to maintain the type of the current value of the memory cells. So even though all variables are statically bound to types in languages like C and Pascal, not all type errors can be detected by static type checking.


Strong Typing


A strongly typed language is one in which each name in a program in the language has a single type associated with it, and that type is known at compile time (all types are statically bound).


The above definition ignores the possibility that, although a variable's type may be known, the storage location to which it is bound may store values of different types at different times.


Refined definition:

A strongly type language is one in which type errors are always detected (types of all operands can be determined either at compile time or at run time).


A strongly typed language has the ability to detect all misusesof variables that result in type errors.

A strongly typed language also allows the detection, at run time, of uses of the incorrect type values in variables that can store values of more than one type.



FORTRAN 77 is not strongly typed (relationship between actual and formal parameters

is not type checked. Also, the use of EQUIVALENCE between variables of different

types allows a variable of one type to refer to a value of a different type, without the

system being able to check the type of the value when one of the EQUIVALENCEd

variables is referenced or assigned.


Pascal is nearly strongly typed, but fails in its design of variant records (allows omission

of the tag that stores the current type of a variable, which provides the means of checking

for the correct value types.


Modula-2 is not a strongly typed language due to its design of variant records (like that

of Pascal). It also has a type WORD that intentionally avoids type checking.


Ada is nearly strongly typed (references to variables in variant records are dynamically

checked for correct type values). Ada allows programmers to breach the Ada type

checking rules by specifically requesting that type checking be suspended for a

particular type conversion (when using the UNCHECKED_CONVERSION library

function, which can exist for every data type and takes a variable of its type as its

parameter and returns the bit string that is the current value of that variable).


Modula-3 has a predefined procedure named LOOPHOLE that serves the same



C and C++ are not strongly typed languages (both allow functions for which parameters

are not type checked. Also, the union types of these languages are not type checked.


ML and Miranda are strongly typed in a slightly different sense than imperative

languages. ML has variables whose types are statically known (via declaration or

type inference). Miranda does not have variables, but its function parameter types

are all known at compile time.


Coercion rules weakens the value of strong typing:

e.g., Pascal

expressions are strongly typed

an arithmetic operator with one floating-point (real) operand and one integer operand is

legal. The value of the integer operand is coerced to floating-point.

This results in a loss of part of the reason for strong typing - error detection.


Languages with a great deal of coercion (e.g., FORTRAN, C, C++) are significantly less reliable than those with little coercion (e.g., Ada).


In C++, user-defined types or classes can also be defined to include coercions to other types.


APL & SNOBOL4 allow only dynamic type checking (because of dynamic type binding).


Type Compatibility


Concept defined when the issue of type checking was introduced.


The result of two variables being of compatible types is that either one can have its value assigned to the other.


Two different type compatibility methods:


Name type compatibility:

e.g., suppose Pascal uses strict name type compatibility

type indextype = 1..100; { a subrange type }


count : integer;

index : indextype;

then count and index would not be compatible

e.g., original version of Pascal

structured type passed among subprograms through parameters

such a type must be defined once globally

subprogram cannot state the type of such formal parameters in local terms


Structure compatibility:


type celsius = real;

farenheit = real;

compatible under structure type compatibility, but mixed use in expression is undesirable


ISO standard Pascal (1982):



type1 = array [1..10] of integer

type2 = array [1..10] of integer;

type3 = type2;

type1 and type2 are not compatible

type2 is compatible with type3 (declaration equivalence while not name type compatible)

e.g., Ada

uses name type compatibility

provides two type constructs, subtypes and derived types (new type based on previously

defined type which is incompatible)

derived types:

type celsius is new FLOAT;

type farenheit is new FLOAT;

identical structures but derived types incompatible

variables of neither type are compatible with any other floating-point type (however

constants are exempt from the rule)

derived types can also include range constraints on the parent type while still

inheriting all of the parent's operations.

subtypes: possibly range-constrained version of an existing type

compatible with its parent type

subtype SMALL_TYPE is INTEGER range 0..99;


Variables of type SMALL_TYPE are compatible with INTEGER variables.

Type compatibility rules are more important than those for languages that have

many coercions among types.

e.g., C

Uses structural equivalence for all types except structures (records, and unions) for

which C uses declaration equivalence (for structures within same file), and C

uses structural type equivalence for structures across files.

e.g., C++

name equivalence


Anonymous types:

e.g., Ada

A, B : array (1 .. 10) of INTEGER

A and B are incompatible anonymous types

type LIST_10 is array (1..10) of INTEGER;

C, D : LIST_10;

C and D are compatible




Range of statements in which the variable is visible.


A variable is visible in a statement if it can be referenced in that statement.


Scope rules:


Nonlocal variable of a program unit or block:


Static Scope


Method of binding names to nonlocal variables introduced in ALGOL 60.


In all common static-scoped languages (except C, C++, and FORTRAN) subprograms can be nested inside other subprograms. In these languages, all subprograms create their own scopes. The result is a hierarchy of scopes in a program.


e.g., Pascal

procedure big;

var x : integer;

procedure sub1;

begin { sub1 }

… x …

end; { sub1 }

procedure sub2;

var x : integer;

begin { sub2 }


end; { sub2 }

begin { big }


end; { big }


the reference to x in sub1 is to the x declared in big (search starts in sub1, then in its

static parent or static ancestor).


Predefined names and static scoping:


Hidden variables:




Introduced in ALGOL 60


Section of code that has its own local variables whose scope is minimized.


Such variables are typically stack-dynamic (storage allocated upon entering the section and deallocated upon section exit).


e.g., Ada block

declare TEMP : integer;








Blocks provide the origin of the phrase "block-structured-language"/


Pascal and Modula-2 are called block-structured languages but they do not include nonprocedural blocks.


C and C++ allow any compound statement to have declarations and thus define a new scope. Such compound statements are called blocks:


if (list[I] < list[j]) {

int temp;

temp = list [I];

list[I] = list[j];

list[j] = temp;


In C++ variable definitions may appear anywhere in functions, the scope of the variable is from the definition statement to the end of the function.


Scopes created by blocks are treated exactly like those of created by subprograms (searching enclosing scopes in order of increasing size).


Evaluation of Static Scoping


main { A { C { } D { } } B { E { } } }


desirable calls:


many other undesirable calls may not be detected until runtime


Also, to access some variable in the scope of D, E would need to be moved inside

the scope of D but would not longer be able to access the scope of B … !!!


Dynamic Scope


Based on the calling sequence of subprograms, not on their spatial relationship to each other. Thus, the scope can be determined only at run time.


e.g., procedure big (from before)

Assume dynamic scoping rules apply to nonlocal references

In sub1, the search of local declarations fails, so the declarations of the

dynamic parent (calling procedure) are searched. If no declaration is found

for x in the dynamic ancestors, it is a runtime error.


Evaluation of Dynamic Scoping


The correct attributes of nonlocal variables visible to a program statement cannot be determined statically, and such variables are not always the same.


Programming problems:




Scope and Lifetime


Apparent relationship between scope and lifetime:


Apparent relationship does not hold:

Referencing Environments


Referencing environment of a statement:


e.g, Pascal

program example;

var a, b : integer;


procedure sub1;

var x, y : integer;

begin { sub1 }

… (1)

end; { sub1 }

procedure sub2;

var x : integer;


procedure sub3;

var x : integer;

begin { sub3 }

… (2)

end; { sub3 }

begin { sub2 }

… (3)

end; { sub2 }

begin { example }

… (4)

end. { example }


Point Referencing Environment

  1. x and y of sub1, a and b of example
  2. x of sub3, (x of sub2 is hidden), a and b of example
  3. x of sub2, a and b of example
  4. a and b of example


A subprogram is active if its execution has begun but has not yet terminated.


The referencing environment of a statement in a dynamically scoped language is the locally declared variables, plus the variables of all other subprograms that are currently active.


Note that recent subprogram activations can have declarations for variables that hide variables with the same names in previous subprogram activations.


e.g., C

void sub1 (void) {

int a, b;

… (1)

} /* end of sub1 */

void sub2 (void) {

int b, c;

… (2)

sub1 ( );

} /* end of sub2 */

void main ( ) {

int c, d;

… (3)

sub2 ( );

} /* end of main */


Point Referencing Environment

  1. d of main, c of sub2, a and b of sub1 (c of main nd b of sub2 are hidden)
  2. d of main, b and c of sub2 (c of main is hidden)
  3. c and d of main


Named Constants


Variable bound to a value only at the time it is bound to storage; its value cannot be changed by assignment or by an input statement.




Pascal named constant declarations require simple values on the RHS of the = operator.


Modula-2, FORTRAN 90 allow constant expressions to be used


Named constants in languages that use static binding of values are sometimes called manifest constants (e.g., Pascal)


Ada allows dynamic typing of values to named constants.


MAC : constant integer := 2 * WIDTH + 1;


Original C did not include named constants. Similar capability provided by the C preprocessor (macro processor).

e.g., #define LISTLEN 100

LISTLEN is called a named literal.


ANSI C/C++ have named constants, these cannot be used in ANSI C to define the length of arrays.


Variable Initialization


Binding of a variable to a value at the time it is bound to storage.


If the variable is statically bound to storage, binding and initialization occur before runtime.

If the storage binding is dynamic, initialization is also dynamic.


e.g., FORTRAN (compile time initialization)



DATA SUM /0/, PI /3.14159/


e.g., Ada


LIMIT : constant INTEGER := OLD_LIMIT + 1;



e.g., ALGOL 68

int first := 10; (variable initialization)

int second = 10; (named constant initialization)


e.g., Pascal, Modula-2

no way to initialize variables other than at run time with assignment statements


Initialization occurs only once for static variables, but occurs with every allocation for dynamically allocated variables (e.g., local variables in an Ada procedure).


C4.2. Procedures and their Implementation




Scope-defining unit of the language

Sequence of executable statements that are treated as a unit.




Delimit the scope of variables local to the module

Provide a mechanism for exporting and importing a set of variables and for separate compilation




Scope-defining unit of the language

Same as a program/subprogram


Implementation of Block-Structured Languages


Procedure invocation data structure or "activation record":


Can be reduced to:


Compiler generates code to manipulate these activation records at procedure invocation and exit (runtime system or runtime support of the language).


The runtime system organizes the activation recors (stack: call a procedure, push the activation record, exit a procedure, pop the activation record off the stack).


The trick is to link the statically enclosing blocks on the stack according to their nesting in the program (dynamic link is evident on the stack). To do the runtime system must keep track of an array of static links which allows the fetching of local and nonlocal variables in constant time based on their static distance (difference in static nesting between a variable use and its declaration, where the static nesting corresponds to how deeply nested a declaration or variable use is).


Procedural Parameters


Illegal to pass procedures as parameters in Ada.


Not uncommon in ALGOL-like languages to pass a procedure to a subprocedure as an argument, but returning procedures as the values of functions is illegal.


e.g., Modula-3



TYPE F = PROCEDURE (x: INTEGER); (* procedure type *)


PROCEDURE Q (fp: F) = BEGIN fp(5) END Q;




Q(P); Q(T);



R( ); (* call subprocedure R *)

END Main.


What get passed to Q to represent P or T ?

What code must be emitted to implement the call fp(5) ?


For any procedure call, we must:

The last 2 steps require an address and a pointer to the statically enclosing block of the procedure being called (the two combined form a "closure").

From the closure passed as an argument to the subprogram, the correct adjustments can be made to the runtime stack to complete the call to a formal procedure argument.

Functions that pass functions back as result values break the stack model (could return a procedure local to the function, and then point to a part of the stack that is no longer in use …).


Calling Procedures


Actual parameters: arguments passed to a procedure.


Formal parameters: place holderd used in the procedure or function definition.


Parameter Passing






Pass l-value of each actual parameter to the subroutine.


Aliasing is a potential issue.




Initializing a variable local to the procedure with the r-value of the actual parameter.


Note that C uses call-by-value exclusively but can pass l-values by values since it is possible to compute the l-value of identifiers (using the & operator).


Cannot affect the calling program through the parameters.


Expensive initialization if argument is a large structured object.




Variable local to the subprocedure is created, and its value is copied back to the corresponding actual parameter.




Combination of call by value and copy-out.


Text Substitution


e.g., #define directive in the preprocessing step of the C language.

(effect of procedure invocation).



A procedure can be replaced by its body with the actuals substituted for the formals.




Evaluation of the actual parameter is deferred until the value of the formal parameter is first needed.


Exception Handling


Exception Handling in Ada


Ada provides a mechanism of raising and catching exceptions.


Exception Handling and Recursive Procedures


Exception propagation follows the dynamic chain of execution, and therefore exceptions can propagate to places in the program where the exception name is not visible.


More About Exceptions


CLU, Modula-3, and ML permit exceptions to have parameters.

e.g., RAISE IO_error ("File not found");




Aggregation of program components into larger units (modules).


An abstract type is a type bundled together with a fixed number of procedures which provide the only way to manipulate elements of the abstract type.


A module may provide one or more types and unrelated functions.


A type which structure is hidden is called an opaque type.


A type whose structure is not hidden is called a transparent type.


A module that provides an abstract type is a server, and a program that uses the abstract type is a client.


Packages in Ada


Ada implements abstract types using packages which have both a specification and an implementation.


Modules in Modula-3


Modules in ML


Classes in C++


Generalization of the C struct construct.


Class declaration: interface of fields and member functions like the module construct in other languages.


Fields and member functions are divided into private, public, and protected parts.


Access granting mechanism: friend. Function that is not a member of the class, but is nevertheless permitted to use its private and protected fields and functions.


Class declaration and implementation are usually provided in different files.


Classes are designed to be cloned into new, similar data structures called derived classes. These classes are said to inherit methods from the base classes when they implicitly offer their clients the same methods as the base classes.


C4.3. Comments on Sethi's Chapters 5, 6