Programming Languages (Honors)
G22.3110 Fall 2005
ML Programming Project
Due Monday, November 28
Your task is to implement a polymorphic type inferencer for the
language whose syntax is represented by the exp datatype in the file
unify.sml. You are free to use all the code on unification provided in
the file or from any other source, but are not allowed to use code
from any other source besides this document for the type inference
component.
Generic vs. nongeneric type variables

Consider the following ML definition:
fun foo x y = let
fun bar z = if length z < 3 then x else y
in
(bar [1], bar[true,false])
end
The type of foo is 'a > 'a > 'a * 'a where 'a can
be any type. What is the type of bar? It is 'b list > 'a,
where 'b can be any type, but 'a is constrained to be
the same type as the outer x and y. This means that for
the two calls to bar, the type of each occurrence of bar
is an instance of 'b > 'a where 'b can be replaced by
another type but 'a cannot. We say that 'b is a
generic type variable and 'a is a nongeneric type variable in
the type of bar. Recall from the earlier handout that formal
parameters cannot be used polymorphically, therefore during type
inference, formal parameters are assigned nongeneric type variables.
Static Semantics of the language

Individual definitions of letbound variables can be recursive.
In a given let block, however, the definitions are elaborated
from top to bottom (thus, no mutually recursive functions).
For example,
let val fac = fn x => if x = 0 then 1 else x * fac (x  1)
val a = fac 4
val b = a + 2
in
a + b
end
is a valid program (returning 50). Notice this is slightly different
from ML (don't need fun or "val rec" for recursive functions).
Otherwise, the type rules of ML hold. Specifically, only letbound
variables can be used polymorphically, but must be used monomorphically
in the right hand side of their own definitions. Lambdabound variables
can only be used monomorphically.
How I wrote mine

Here I will describe some aspects of my type inferencer. You can follow
what I did or not  it's up to you. I wrote mine off the top of my
head and it isn't necessarily the best way (in fact, I know there
are some inefficiencies).
The procedure type_check takes the following parameters
 e: an expression
 theta: a substitution binding type variables to type expressions
 env: an environment binding program variables to type expressions
 non_gens: a list of type variables that must be nongeneric in
the type of e.
type_check returns a type expression representing the type of e and a
new substitution theta' resulting from updating theta to incorporating
all the new type variable bindings resulting from type checking e.
Here is a skeleton of my type inferencer. You can use whatever you
like.
fun type_check e theta env non_gens =
case e of
constant_int _ => (integer,theta)

constant_bool _ => (boolean,theta)

variable v => let
(* lookup the type t of v in the environment.
Then create a new instance of t such that
all the type variables in t that are not in non_gens
are replaced by brand new type variables. *)
in
(apply_sub theta t, theta)
end

alist [] => (* The type of [] is 'a list, where 'a is a brand new
type variable *)

alist (e1::rest) => (* Call type_check on the rest, returning listtype(t1)
and a theta'. Then call type_check on e1 using the
theta', returning t2 and theta''. Apply theta'' to t1
giving t1'. Finally, unify t2 and t1' giving a theta'''.
Return listtype(apply_sub theta''' t2) and theta'''. *)

cond(test,conseq,alt) => (* Call type_check on the test returning t1 and theta'.
Call type_check on conseq with theta', returning
t2 and theta''. Call type_check on alt with theta'',
returning t3 and theta'''. Apply theta''' to t1
and t2, returning t1' and t2'. Then, unify t1'
with boolean and t2' with t3, using theta''',
returning theta''''. Return the unified type
along with theta''''.

lambda(formal,body) => (* Type_check the body using an environment
binding the formal to a new type variable v
and include v in the non_gens actual parameter
to type_check. Type_check will return a
body type and a new subsitution theta'.
Return the type of the lambda (what is it?) and
the new_theta *)

letdef ([],body) => type_check body theta env non_gens

letdef ((name, def)::rest, body) =>
(* call type_check on def in an environment in which name is
bound to a new type variable v. Since v must be nongeneric,
include v in the actual parameter for nongens. Then call
type_check on rest. Since name can be used polymorphically
in rest and body, do not include v in the nongens parameter
when type checking rest and body. *)

application (func,arg) =>
let val (funtype,theta1) = type_check func theta env non_gens
val (argtype,theta2) = type_check arg theta1 env non_gens
val new_funtype = apply_sub theta2 funtype
val result_type = typevar (new_var())
val new_theta = unify [(new_funtype, argtype > result_type)] theta2
in
(apply_sub new_theta result_type, new_theta)
end
The function new_var returns a new type variable name each time.
It is defined as follows:
val new_var = let val count = ref 0
in fn () => let val value = "T" ^ Int.toString (!count)
in (count := ! count + 1;
value)
end
end