Programming Languages (Honors) G22.3110 Spring 2009 ML Programming Project Due Friday, April 17 Your task is to implement a polymorphic type inferencer for the language whose syntax is represented by the exp datatype in the file unify.sml. You are free to use all the code on unification provided in the file or from any other source, but are not allowed to use code from any other source besides this document for the type inference component. Generic vs. non-generic type variables -------------------------------------- Consider the following ML definition: fun foo x y = let fun bar z = if length z < 3 then x else y in (bar [1], bar[true,false]) end The type of foo is 'a -> 'a -> 'a * 'a where 'a can be any type. What is the type of bar? It is 'b list -> 'a, where 'b can be any type, but 'a is constrained to be the same type as the outer x and y. This means that for the two calls to bar, the type of each occurrence of bar is an instance of 'b -> 'a where 'b can be replaced by another type but 'a cannot. We say that 'b is a generic type variable and 'a is a non-generic type variable in the type of bar. Recall from the earlier handout that formal parameters cannot be used polymorphically, therefore during type inference, formal parameters are assigned non-generic type variables. Static Semantics of the language -------------------------------- Individual definitions of let-bound variables can be recursive. In a given let block, however, the definitions are elaborated from top to bottom (thus, no mutually recursive functions). For example, let val fac = fn x => if x = 0 then 1 else x * fac (x - 1) val a = fac 4 val b = a + 2 in a + b end is a valid program (returning 50). Notice this is slightly different from ML (don't need fun or "val rec" for recursive functions). Otherwise, the type rules of ML hold. Specifically, only let-bound variables can be used polymorphically, but must be used monomorphically in the right hand side of their own definitions. Lambda-bound variables can only be used monomorphically. How I wrote mine ---------------- Here I will describe some aspects of my type inferencer. You can follow what I did or not - it's up to you. I wrote mine off the top of my head and it isn't necessarily the best way (in fact, I know there are some inefficiencies). The procedure type_check takes the following parameters - e: an expression - theta: a substitution binding type variables to type expressions - env: an environment binding program variables to type expressions - non_gens: a list of type variables that must be non-generic in the type of e. type_check returns a type expression representing the type of e and a new substitution theta' resulting from updating theta to incorporating all the new type variable bindings resulting from type checking e. Here is a skeleton of my type inferencer. You can use whatever you like. fun type_check e theta env non_gens = case e of constant_int _ => (integer,theta) | constant_bool _ => (boolean,theta) | variable v => let (* lookup the type t of v in the environment. Then create a new instance of t such that all the type variables in t that are not in non_gens are replaced by brand new type variables. *) in (apply_sub theta t, theta) end | alist [] => (* The type of [] is 'a list, where 'a is a brand new type variable *) | alist (e1::rest) => (* Call type_check on the rest, returning listtype(t1) and a theta'. Then call type_check on e1 using the theta', returning t2 and theta''. Apply theta'' to t1 giving t1'. Finally, unify t2 and t1' giving a theta'''. Return listtype(apply_sub theta''' t2) and theta'''. *) | cond(test,conseq,alt) => (* Call type_check on the test returning t1 and theta'. Call type_check on conseq with theta', returning t2 and theta''. Call type_check on alt with theta'', returning t3 and theta'''. Apply theta''' to t1 and t2, returning t1' and t2'. Then, unify t1' with boolean and t2' with t3, using theta''', returning theta''''. Return the unified type along with theta''''. | lambda(formal,body) => (* Type_check the body using an environment binding the formal to a new type variable v and include v in the non_gens actual parameter to type_check. Type_check will return a body type and a new subsitution theta'. Return the type of the lambda (what is it?) and the new_theta *) | letdef ([],body) => type_check body theta env non_gens | letdef ((name, def)::rest, body) => (* call type_check on def in an environment in which name is bound to a new type variable v. Since v must be non-generic, include v in the actual parameter for non-gens. After type_check on def has returned a type t and a theta', unify v with t using theta'. This unification will return theta''. Then call type_check on rest using theta''. Since name can be used polymorphically in rest and body, do not include v in the non-gens parameter when type checking rest and body. *) | application (func,arg) => let val (funtype,theta1) = type_check func theta env non_gens val (argtype,theta2) = type_check arg theta1 env non_gens val new_funtype = apply_sub theta2 funtype val result_type = typevar (new_var()) val new_theta = unify [(new_funtype, argtype --> result_type)] theta2 in (apply_sub new_theta result_type, new_theta) end The function new_var returns a new type variable name each time. It is defined as follows: val new_var = let val count = ref 0 in fn () => let val value = "T" ^ Int.toString (!count) in (count := ! count + 1; value) end end