ML

Main features
  Expression-oriented
  List-oriented, garbage-collected heap-based
  Functional
    Functions are first-class values
    Largely side-effect free
  Strongly, statically typed
    Polymorphic type system
    Automatic type inference
  Pattern matching
  Exceptions
  Modules
  Highly regular and expressive

History
  Designed as a Meta Language for an automatic theorem proving system in the mid 70’s by Milner et al.
  Standard ML: 1986
  SML’97: 1997
  Caml: a French version of ML, mid 80’s
  O’Caml: an object-oriented extension of Caml, late 90’s

Interpreter interface
  Read-eval-print loop
    Read input expression
      Reading ends with semicolon (not needed in files)
      = prompt indicates continuing expression on next line
    Evaluate expression
      it (re)bound to result, in case you want to use it again
    Print result
    repeat
- 3 + 4;
val it = 7 : int
- it + 5;
val it = 12 : int
- it + 5;
val it = 17 : int

Basic ML data types and operations
  ML is organized around types
    each type defines some set of values of that type
    each type defines a set of operations on values of that type
  int
    ~, +, -, *, div, mod; =, <>, <, >, <=, >=; real, chr
  real
    ~, +, -, *, /; <, >, <=, >= (no equality); floor, ceil, trunc, round
  bool: different from int
    true, false; =, <>; orelse, andalso
  string
    e.g. "I said \"hi\"\tin dir C:\\stuff\\dir\n"
    =, <>, ^
  char
    e.g. #"a", #"\n"
    =, <>; ord, str

Variables and binding
  Variables declared and initialized with a val binding
- val x:int = 6;
val x = 6 : int
- val y:int = x * x;
val y = 36 : int
  Variable bindings cannot be changed!
  Variables can be bound again, but this shadows the previous definition
- val y:int = y + 1;
val y = 37 : int (* a new, different y *)
  Variable types can be omitted
    they will be inferred by ML based on the type of the r.h.s.
- val z = x * y + 5;
val z = 227 : int

Strong, static typing
  ML is statically typed: it will check for type errors statically
    when programs are entered, not when they’re run
  ML is strongly typed: it will catch all type errors (a.k.a. it's type-safe)
    But which errors are type errors?
  Can have weakly, statically typed languages, and strongly, dynamically typed languages

Type errors
  Type errors can look weird, given ML’s fancy type system
- asd;
Error: unbound variable or constructor: asd
- 3 + 4.5;
Error: operator and operand don't agree
  operator domain: int * int
  operand:         int * real
  in expression:
    3 + 4.5
- 3 / 4;
Error: overloaded variable not defined at type
  symbol: /
  type: int

Records
  ML records are like C structs
    allow heterogeneous element types, but fixed # of elements
  A record type: {name:string, age:int}
    field order doesn’t matter
  A record value: {name="Bob Smith", age=20}
  Can construct record values from expressions for field values
    as with any value, can bind record values to variables
- val bob = {name="Bob " ^ "Smith",
=            age=18+num_years_in_college};
val bob = {age=20,name="Bob Smith"}
        : {age:int,name:string}

Accessing parts of records
  Can extract record fields using #fieldname function
    like C’s -> operator, but a regular function
- val bob' = {name = #name(bob),
=             age = #age(bob)+1};
val bob' = {age=21,name="Bob Smith"}
         : {...}
  Cannot assign/change a record’s fields
    an immutable data structure
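A minimal sketch of the record operations above, written as file-level declarations rather than an interpreter session. The names older and same are hypothetical, chosen only for illustration.

```sml
(* A record value; the field names follow the slides. *)
val bob = {name = "Bob Smith", age = 20}

(* #name and #age are selector functions. Building a new record from them
   leaves bob itself untouched, since records are immutable. *)
val older = {name = #name bob, age = #age bob + 1}

(* Field order is not part of the type, so this annotation is accepted
   even though it lists the fields in a different order. *)
val same : {age : int, name : string} = older
```

After these declarations bob still has age 20, while older and same both have age 21.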

Tuples
  Like records, but fields ordered by position, not label
    Useful for pairs, triples, etc.
  A tuple type: string * int
    order does matter
  A tuple value: ("Joe Stevens", 45)
  Can construct tuple values from expressions for elements
    as with any value, can bind tuple values to variables
- val joe = ("Joe "^"Stevens", 25+num_jobs*10);
val joe = ("Joe Stevens",45) : string * int

Accessing parts of tuples
  Can extract tuple fields using #n function
- val joe' = (#1(joe), #2(joe)+1);
val joe' = ("Joe Stevens",46)
         : string * int
  Cannot assign/change a tuple’s components
    another immutable data structure

Lists
  ML lists are built-in, singly-linked lists
    homogeneous element types, but variable # of elements
  A list type: int list
    in general: T list, for any type T
  A list value: [3, 4, 5]
  Empty list: [] or nil
    null(lst): tests if lst is nil
  Can create a list value using the […] notation
    elements are expressions
- val lst = [1+2, 8 div 2, #age(bob)-15];
val lst = [3,4,5] : int list

Basic operations on lists
  Add to front of list, non-destructively: ::  (an infix operator)
- val lst1 = 3::(4::(5::nil));
val lst1 = [3,4,5] : int list
- val lst2 = 2::lst1;
val lst2 = [2,3,4,5] : int list
[Diagram: linked cells 2, 3, 4, 5 ending in nil; lst2 points at the 2 cell, lst1 at the 3 cell]

Basic operations on lists
  Adding to the front allocates a new link; the original list is unchanged and still available
- lst1;
val it = [3,4,5] : int list
- lst2;
val it = [2,3,4,5] : int list
[Diagram: linked cells 2, 3, 4, 5 ending in nil; lst2 points at the 2 cell, lst1 at the 3 cell]

More on lists
  Lists can be nested:
- (3 :: nil) :: (4 :: 5 :: nil) :: nil;
val it = [[3],[4,5]] : int list list
  Lists must be homogeneous:
- [3, "hi there"];
Error: operator and operand don't agree
  operator domain: int * int list
  operand:         int * string list
  in expression:
    (3 : int) :: "hi there" :: nil

Manipulating lists
  Look up the first (“head”) element: hd
- hd(lst1) + hd(lst2);
val it = 5 : int
  Extract the rest (“tail”) of the list: tl
- val lst3 = tl(lst1);
val lst3 = [4,5] : int list
- val lst4 = tl(tl(lst3));
val lst4 = [] : int list
- tl(lst4);  (* or hd(lst4) *)
uncaught exception Empty
  Cannot assign/change a list’s elements
    another immutable data structure

First-class values
  All of ML’s data values are first-class
    there are no restrictions on how they can be created, used, passed around, bound to names, stored in other data structures, ….
  One consequence: can nest records, tuples, lists arbitrarily
    an example of orthogonal design
{foo=(3, 5.6, "seattle"), bar=[[3,4], [5,6,7,8], [], [1,2]]}
  : {bar:int list list, foo:int*real*string}
  Another consequence: can create initialized, anonymous values directly, as expressions
    instead of using a sequence of statements to first declare (allocate named space) and then assign to initialize

Reference data model
  A variable refers to a value (of whatever type), uniformly
  A record, tuple, or list refers to its element values, uniformly
    all values are implicitly referred to by pointer
  A variable binding makes the l.h.s. variable refer to its r.h.s. value
  No implicit copying upon binding, parameter passing, returning from a function, storing in a data structure
    like Java, Scheme, Smalltalk, … (all high-level languages)
    unlike C, where non-pointer values are copied
      C arrays?
  Reference-oriented values are heap-allocated (logically)
    scalar values like ints, reals, chars, bools, nil optimized

Garbage collection
  ML provides several ways to allocate & initialize new values
    (…), {…}, […], ::
  But it provides no way to deallocate/free values that are no longer being used
  Instead, it provides automatic garbage collection
    when there are no more references to a value (either from variables or from other objects), it is deemed garbage, and the system will automatically deallocate the value
      dangling pointers impossible (could not guarantee type safety without this!)
      storage leaks impossible
      simpler programming
      can be more efficient!
      less ability to carefully manage memory use & reuse
  GCs exist even for C & C++, as free libraries

Functions
  Some function definitions:
- fun square(x:int):int = x * x;
val square = fn : int -> int
- fun swap(a:int, b:string):string*int = (b,a);
val swap = fn : int * string -> string * int
  Functions are values with types of the form Targ -> Tresult
    use tuple type for multiple arguments
    use tuple type for multiple results (orthogonality!)
    * binds tighter than ->
  Some function calls:
- square(3);   (* parens not needed! *)
val it = 9 : int
- swap(3 * 4, "billy" ^ "bob"); (* parens needed *)
val it = ("billybob",12) : string * int

Expression-orientation
  Function body is a single expression
    fun square(x:int):int = x * x
    not a statement list
    no return keyword
  Like equality in math
    a call to a function is equivalent to its body, after substituting the actuals in the call for its formals
    square(3)  ⇔  (x*x)[x ↦ 3]  ⇔  3*3
  There are no statements in ML, only expressions
    simplicity, regularity, and orthogonality in action
  What would be statements in other languages are recast as expressions in ML

If expression
  General form: if test then e1 else e2
    return value of either e1 or e2, based on whether test is true or false
    cannot omit else part
- fun max(x:int, y:int):int =
=     if x >= y then x else y;
val max = fn : int * int -> int
  Like ?: operator in C
    don’t need a distinct if statement

Static typechecking of if expression
  What are the rules for typechecking an if expression?
  What’s the type of the result of if?
  Some basic principles of typechecking:
    values are members of types
    the type of an expression must include all the values that might possibly result from evaluating that expression at run-time
  Requirements on each if expression:
    the type of the test expression must be bool
    the type of the result of the if must include whatever values might be returned from the if
    the if might return the result of either e1 or e2
  A solution: e1 and e2 must have the same type, and that type is the type of the result of the if expression

Let expression
  let: an expression that introduces a new nested scope with local variable declarations
    unlike { … } statements in C, which don’t compute results
    like a gcc extension?
  General form:
    let val id1:type1 = e1
        ...
        val idn:typen = en
    in ebody end
    typei are optional; they’ll be inferred from the ei
  Evaluates each ei and binds it to idi, in turn
    each ei can refer to the previous id1..idi-1 bindings
  Evaluates ebody and returns it as the result of the let expression
    ebody can refer to all the id1..idn bindings
  The idi bindings disappear after ebody is evaluated
    they’re in a nested, local scope

Example scopes
- val x = 3;
val x = 3 : int
- fun f(y:int):int =
=   let val z = x + y
=       val x = 4
=   in (let val y = z + x
=       in x + y + z end)
=      + x + y + z
=   end;
val f = fn : int -> int
- val x = 5;
val x = 5 : int
- f(x);
???
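The scoping rules can be traced by hand on this example. The sketch below repeats the definitions as a file-level program, with comments marking which binding each name refers to; the name result is hypothetical.

```sml
val x = 3

fun f (y : int) : int =
  let
    val z = x + y            (* x is still the outer x = 3 here *)
    val x = 4                (* shadows the outer x for the rest of the let *)
  in
    (let val y = z + x       (* inner y shadows the parameter y *)
     in x + y + z end)
    + x + y + z              (* y is the parameter again *)
  end

val x = 5                    (* a new x; f keeps seeing the x = 3 it was defined under *)

(* f 5: z = 3 + 5 = 8, x = 4, inner y = 8 + 4 = 12,
   inner sum = 4 + 12 + 8 = 24, outer sum = 24 + 4 + 5 + 8 = 41 *)
val result = f x
```

Because ML is statically scoped, rebinding x to 5 afterwards changes the argument passed to f but not the x that f's body refers to.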

“Statements”
  For expressions that have no useful result, return empty tuple, of type unit:
- print("hi\n");
hi
val it = () : unit
  Expression sequence operator:  ;  (an infix operator, like C's comma operator)
    evaluates both “arguments”, returns second one
- val z = (print("hi "); print("there\n"); 3);
hi there
val z = 3 : int

Type inference for functions
  Declaration of function result type can be omitted
    infer function result type from body expression result type
- fun max(x:int, y:int) =
=     if x >= y then x else y;
val max = fn : int * int -> int
  Can even omit formal argument type declarations
    infer all types based on how arguments are used in body
    constraint-based algorithm to do type inference
- fun max(x, y) =
=     if x >= y then x else y;
val max = fn : int * int -> int

Functions with many possible types
  Some functions could be used on arguments of different types
  Some examples:
    null: can test an int list, or a string list, or …; in general, work on a list of any type T
      null: T list -> bool
    hd: similarly works on a list of any type T, and returns an element of that type:
      hd: T list -> T
    swap: takes a pair of an A and a B, returns a pair of a B and an A:
      swap: A * B -> B * A
  How to define such functions in a statically-typed language?
    in C: can’t (or have to use casts)
    in C++: can use templates (but can’t check separately)
    in ML: allow functions to have polymorphic types

Polymorphic types
  A polymorphic type contains one or more type variables
    an identifier starting with a quote
      'a list
      'a * 'b * 'a * 'c
      {x:'a, y:'b} list * 'a -> 'b
  A polymorphic type describes a set of possible types, where each type variable is replaced with some type
    each occurrence of a type variable must be replaced with the same type
      ('a * 'b * 'a * 'c)['a ↦ int, 'b ↦ string, 'c ↦ real->real]  ⇒  (int * string * int * (real->real))

Polymorphic functions
  Functions can have polymorphic types:
    null   : 'a list -> bool
    hd     : 'a list -> 'a
    tl     : 'a list -> 'a list
    (op ::): 'a * 'a list -> 'a list
    swap   : 'a * 'b -> 'b * 'a

Calling polymorphic functions
  When calling a polymorphic function, need to find the instantiation of the polymorphic type into a regular type that's appropriate for the actual arguments
    caller knows types of actual arguments
    can compute how to replace type variables so that the replaced function type matches the argument types
    derive type of result of call
  Example: hd([3,4,5])
    type of argument: int list
    type of function: 'a list -> 'a
    replace 'a with int to make a match
    instantiated type of hd for this call: int list -> int
    type of result of this call: int
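As a small illustration of the instantiation steps above, the same polymorphic function can be used at several types within one program. The names first_int, first_str, and swapped are hypothetical.

```sml
(* hd : 'a list -> 'a is instantiated separately at each call site. *)
val first_int = hd [3, 4, 5]          (* 'a replaced with int *)
val first_str = hd ["a", "b"]         (* 'a replaced with string *)

(* swap : 'a * 'b -> 'b * 'a, as defined in the slides. *)
fun swap (a, b) = (b, a)
val swapped = swap (1, "one")         (* 'a replaced with int, 'b with string *)
```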

Polymorphic values
  Regular values can be polymorphic, too
    nil: 'a list
  Each reference to nil finds the right instantiation for that use, separately from other references
    (3 :: 4 :: nil) :: (5 :: nil) :: nil

Polymorphism versus overloading
  Polymorphic function: same function usable for many different types
- fun swap(i,j) = (j,i);
val swap = fn : 'a * 'b -> 'b * 'a
  Overloaded function: several different functions, but with same name
    the name + is overloaded
      a function of type int*int->int
      a function of type real*real->real
  Resolve overloading to particular function based on:
    static argument types (in ML)
    dynamic argument classes (in object-oriented languages)

Example of overload resolution

- 3 + 4;

val it = 7 : int

- 3.0 + 4.5;

val it = 7.5 : real

- (op +); (* which? default to int *)

val it = fn : int*int -> int

- (op +):real*real->real;

val it = fn : real*real -> real

Equality types
  Built-in = is polymorphic over all types that “admit equality”
    i.e., any type except those containing reals or functions
  Use ''a, ''b, etc. to stand for these equality types
- fun is_same(x, y) = if x = y then "yes" else "no";
val is_same = fn : ''a * ''a -> string
- is_same(3, 4);
val it = "no" : string
- is_same({l=[3,4,5],h=("a","b"),w=nil},
=         {l=[3,4,5],h=("a","b"),w=nil});
val it = "yes" : string
- is_same(3.4, 3.4);
Error: operator and operand don't agree [equality type required]
  operator domain: ''Z * ''Z
  operand:         real * real
  in expression:
    is_same (3.4,3.4)
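A short sketch of equality types in use. Since reals do not admit equality, one common workaround (an assumption here, not something the slides prescribe) is to compare within a tolerance; close_enough and its 1.0E~9 threshold are hypothetical.

```sml
fun is_same (x, y) = if x = y then "yes" else "no"

(* Lists and tuples of equality types are themselves equality types. *)
val a = is_same ([1, 2], [1, 2])
val b = is_same (("x", 3), ("x", 4))

(* is_same (3.4, 3.4) would be rejected by the typechecker.
   A tolerance test uses < on reals, which is allowed. *)
fun close_enough (x : real, y : real) : bool = Real.abs (x - y) < 1.0E~9
val c = close_enough (3.4, 3.4)
```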

Loops, using recursion
  ML has no looping statement or expression
  Instead, use recursion to compute a result
fun append(l1, l2) =
  if null(l1) then l2
  else hd(l1) :: append(tl(l1), l2)
val lst1 = [3, 4]
val lst2 = [5, 6, 7]
val lst3 = append(lst1, lst2)

Tail recursion
  Tail recursion: recursive call is last operation before returning
    can be implemented just as efficiently as iteration, in both time and space, since tail-caller isn’t needed after callee returns
  Some tail-recursive functions:
fun last(lst) =
  let val tail = tl(lst)
  in if null(tail) then hd(lst) else last(tail) end
fun includes(lst, x) =
  if null(lst) then false
  else if hd(lst) = x then true
  else includes(tl(lst), x)
  append?

Converting to tail-recursive form
  Can often rewrite a recursive function into a tail-recursive one
    introduce a helper function (usually nested)
    the helper function has an extra accumulator argument
    the accumulator holds the partial result computed so far
    accumulator returned as full result when base case reached
  This isn’t tail-recursive:
fun fact(n) =
  if n <= 1 then 1
  else fact(n-1) * n
  This is:
fun fact(n0) =
  let fun fact_helper(n, res) =
        if n <= 1 then res
        else fact_helper(n-1, res*n)
  in fact_helper(n0, 1) end
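The same accumulator technique applies to append, which as written earlier does its cons after the recursive call returns and so is not tail-recursive. The sketch below is one possible rewrite, not the slides' own answer; rev_onto and append_tr are hypothetical names, and it uses clause-style pattern matching on lists.

```sml
(* rev_onto (lst, acc) pushes the elements of lst onto acc one at a time,
   so they end up in reverse order in front of acc.
   The recursive call is the last operation, so it is tail-recursive. *)
fun rev_onto (nil, acc) = acc
  | rev_onto (x :: xs, acc) = rev_onto (xs, x :: acc)

(* Reverse l1 first, then push the reversed elements onto l2;
   the two reversals cancel, leaving l1's elements in order before l2. *)
fun append_tr (l1, l2) = rev_onto (rev_onto (l1, nil), l2)

val lst3 = append_tr ([3, 4], [5, 6, 7])
```

The price of tail recursion here is a second pass over l1 and a temporary reversed copy of it.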

Pattern matching
  Pattern-matching: a convenient syntax for extracting components of compound values (tuple, record, or list)
  A pattern looks like an expression to build a compound value, but with variable names to be bound in some places
    cannot use the same variable name more than once
  Use pattern in place of variable on l.h.s. of val binding
    anywhere val can appear: either at top-level or in let (orthogonality & regularity)
- val x = (false, 17);
val x = (false,17) : bool*int
- val (a, b) = x;
val a = false : bool
val b = 17 : int
- val (root1, root2) = quad_roots(3.0, 4.0, ~5.0);
val root1 = 0.786299647847 : real
val root2 = ~2.11963298118 : real

More patterns
  List patterns:
- val [x,y] = 3::4::nil;
val x = 3 : int
val y = 4 : int
- val (x::y::zs) = [3,4,5,6,7];
val x = 3 : int
val y = 4 : int
val zs = [5,6,7] : int list
  Constants (ints, bools, strings, chars, nil) can be patterns:
- val (x, true, 3, "x", z) = (5.5, true, 3, "x", [3,4]);
val x = 5.5 : real
val z = [3,4] : int list
  If don’t care about some component, can use a wildcard: _
- val (_::_::zs) = [3,4,5,6,7];
val zs = [5,6,7] : int list
  Patterns can be nested, too
    orthogonality

Function argument patterns
  Formal parameter of a fun declaration can be a pattern
- fun swap (i, j) = (j, i);
val swap = fn : 'a * 'b -> 'b * 'a
- fun swap2 p = (#2 p, #1 p);
val swap2 = fn : 'a * 'b -> 'b * 'a
- fun swap3 p = let val (a,b) = p in (b,a) end;
val swap3 = fn : 'a * 'b -> 'b * 'a
- fun best_friend {student={name=n,age=_}, grades=_,
=                  best_friends={name=f,age=_}::_} =
=   n ^ "'s best friend is " ^ f;
val best_friend = fn
  : {best_friends:{age:'a, name:string} list,
     grades:'b,
     student:{age:'c, name:string}} -> string
  In general, patterns allowed wherever binding occurs

Multiple cases
  Often a function’s implementation can be broken down into several different cases, based on the argument value
  ML allows a single function to be declared via several cases
  Each case identified using pattern-matching
    cases checked in order, until first matching case
- fun fib 0 = 0
    | fib 1 = 1
    | fib n = fib(n-1) + fib(n-2);
val fib = fn : int -> int
- fun null nil    = true
    | null (_::_) = false;
val null = fn : 'a list -> bool
- fun append(nil,  lst) = lst
    | append(x::xs,lst) = x :: append(xs,lst);
val append = fn : 'a list * 'a list -> 'a list
  The function has a single type
    all cases must have same argument and result types

Missing cases
  What if we don’t provide enough cases?
    ML gives a warning message “match nonexhaustive” when function is declared (statically)
    ML raises an exception “nonexhaustive match failure” if invoked and no existing case applies (dynamically)
- fun first_elem (x::xs) = x;
Warning: match nonexhaustive
  x :: xs => ...
val first_elem = fn : 'a list -> 'a
- first_elem [3,4,5];
val it = 3 : int
- first_elem [];
uncaught exception nonexhaustive match failure
  How would you provide an implementation of this missing case for nil?
- fun first_elem (x::xs) = x
=   | first_elem nil = ???

Exceptions
  If get in a situation where you can’t produce a normal value of the right type, then can raise an exception
    aborts out of normal execution
    can be handled by some caller
    reported as a top-level “uncaught exception” if not handled
  Step 1: declare an exception that can be raised
- exception EmptyList;
exception EmptyList
  Step 2: use the raise expression where desired
- fun first_elem (x::xs) = x
    | first_elem nil = raise EmptyList;
val first_elem = fn : 'a list -> 'a (* no warning! *)
- first_elem [3,4,5];
val it = 3 : int
- first_elem [];
uncaught exception EmptyList

Handling exceptions
  Add handler clause to expressions to handle (some) exceptions raised in that expression
    expr handle exn_name1 => expr1
              | exn_name2 => expr2
                ...
              | exn_namen => exprn
    if expr raises exn_namei, then evaluate and return expri instead
- fun second_elem l = first_elem (tl l);
val second_elem = fn : 'a list -> 'a
- (second_elem [3] handle EmptyList => ~1) + 5;
val it = 4 : int

Exceptions with arguments
  Can have exceptions with arguments
- exception IOError of int;
exception IOError of int
- (... raise IOError(~3) ...)
    handle IOError(code) => ... code ...
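A runnable sketch of an exception carrying an argument. The functions read_block and describe, and the error code ~3, are hypothetical and only stand in for the elided parts of the slide.

```sml
exception IOError of int

(* Pretend I/O: negative block numbers fail with an error code.
   Note that negative literals are written with ~ in ML, not -. *)
fun read_block (n : int) : string =
  if n < 0 then raise IOError (~3)
  else "block " ^ Int.toString n

(* The handler's pattern binds the exception's argument to code. *)
fun describe (n : int) : string =
  read_block n
  handle IOError code => "failed with code " ^ Int.toString code

val ok = describe 2
val bad = describe (~1)
```

Both branches of the handle expression must have the same type (string here), for the same reason the two branches of an if must agree.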

Type synonyms
  Can give a name to a type, for convenience
    name and type are equivalent, interchangeable
- type person = {name:string, age:int};
type person = {age:int, name:string}
- val p:person = {name="Bob", age=18};
val p = {age=18,name="Bob"} : person
- val p2 = p;
val p2 = {age=18,name="Bob"} : person
- val p3:{name:string, age:int} = p;
val p3 = {age=18,name="Bob"}
       : {age:int, name:string}

Polymorphic type synonyms
  Can define polymorphic synonyms
- type 'a stack = 'a list;
type 'a stack = 'a list
- val emptyStack:'a stack = nil;
val emptyStack = [] : 'a stack
  Synonyms can have multiple type parameters
- type (''key, 'value) assoc_list =
=   (''key * 'value) list;
type ('a,'b) assoc_list = ('a * 'b) list
- val grades:(string,int) assoc_list =
=   [("Joe", 84), ("Sue", 98), ("Dude", 44)];
val grades = [("Joe",84),("Sue",98),("Dude",44)]
  : (string,int) assoc_list

Datatypes
  Users can define their own (polymorphic) data structures
    a new type, unlike type synonyms
  Simple example: ML’s version of enumerated types
- datatype sign = Positive | Zero | Negative;
datatype sign = Negative | Positive | Zero
    declares a type (sign) and a set of alternative constructor values of that type (Positive etc.)
    order of constructors doesn’t matter
  Another example: bool
- datatype bool = true | false
datatype bool = false | true

Using datatypes
  Can use constructor values as regular values
  Their type is a regular type
- fun signum(x) =
=   if x > 0 then Positive
=   else if x = 0 then Zero
=   else Negative;
val signum = fn : int -> sign

Datatypes and pattern-matching
  Constructor values can be used in patterns, too
- fun signum(Positive) = 1
=   | signum(Zero)     = 0
=   | signum(Negative) = ~1;
val signum = fn : sign -> int

Datatypes with data
  Each constructor can have data of particular type stored with it
    constructors with data are functions that allocate & initialize new values with that “tag”
- datatype LiteralExpr =
=   Nil |
=   Integer of int |
=   String of string;
datatype LiteralExpr =
  Integer of int | Nil | String of string
- Nil;
val it = Nil : LiteralExpr
- Integer(3);
val it = Integer 3 : LiteralExpr
- String("xyz");
val it = String "xyz" : LiteralExpr

Pattern-matching on datatypes
  The only way to access components of a value of a datatype is via pattern-matching
  Constructor “calls” can be used in patterns to test for and take apart values with that “tag”
- fun toString(Nil) = "nil"
=   | toString(Integer(i)) = Int.toString(i)
=   | toString(String(s)) = "\"" ^ s ^ "\"";
val toString = fn : LiteralExpr -> string

Recursive datatypes
  Many datatypes are recursive: one or more constructors are defined in terms of the datatype itself
- datatype Expr =
=   Nil |
=   Integer of int |
=   String of string |
=   Variable of string |
=   Tuple of Expr list |
=   BinOpExpr of {arg1:Expr, operator:string, arg2:Expr} |
=   FnCall of {function:string, arg:Expr};
datatype Expr = ...
- val e1 = Tuple [Integer(3), String("hi")]; (* (3,"hi") *)
val e1 = Tuple [Integer 3,String "hi"] : Expr
  (Nil, Integer, and String of LiteralExpr are shadowed)

Another example Expr value

(*  f(3+x, "hi")  *)

- val e2 =

= FnCall {

= function="f",

= arg=Tuple [

= BinOpExpr {arg1=Integer(3),

= operator="+",

= arg2=Variable("x")},

= String("hi")]};

val e2 = … : Expr

Recursive functions over recursive datatypes
  Often manipulate recursive datatypes with recursive functions
    pattern of recursion in function matches pattern of recursion in datatype
- fun toString(Nil) = "nil"
=   | toString(Integer(i)) = Int.toString(i)
=   | toString(String(s)) = "\"" ^ s ^ "\""
=   | toString(Variable(name)) = name
=   | toString(Tuple(elems)) =
=         "(" ^ listToString(elems) ^ ")"
=   | toString(BinOpExpr{arg1,operator,arg2}) =
=         toString(arg1) ^ " " ^ operator ^ " " ^
=         toString(arg2)
=   | toString(FnCall{function,arg}) =
=         function ^ "(" ^ toString(arg) ^ ")"
= …;
val toString = fn : Expr -> string

Mutually recursive functions and datatypes
  If two or more functions are defined in terms of each other, recursively, then must be declared together, and linked with and
fun toString(...) = ... listToString ...
and listToString([]) = ""
  | listToString([elem]) = toString(elem)
  | listToString(e::es) =
        toString(e) ^ "," ^ listToString(es);
  If two or more mutually recursive datatypes, then declare them together, linked by and
datatype Stmt = ... Expr ...
and Expr = ... Stmt ...

A convenience: record pattern syntactic sugar
  Instead of writing {a=a, b=b, c=c} as a pattern, can write {a,b,c}
  E.g.
    ... BinOpExpr{arg1,operator,arg2} ...
  is short-hand for
    ... BinOpExpr{arg1=arg1,
                  operator=operator,
                  arg2=arg2} ...

Polymorphic datatypes
  Datatypes can be polymorphic
- datatype 'a List = Nil
=                  | Cons of 'a * 'a List;
datatype 'a List = Cons of 'a * 'a List | Nil
- val lst = Cons(3, Cons(4, Nil));
val lst = Cons (3, Cons (4, Nil)) : int List
- fun Null(Nil) = true
=   | Null(Cons(_,_)) = false;
val Null = fn : 'a List -> bool
- fun Hd(Nil) = raise Empty
=   | Hd(Cons(h,_)) = h;
val Hd = fn : 'a List -> 'a
- fun Sum(Nil) = 0
=   | Sum(Cons(x,xs)) = x + Sum(xs);
val Sum = fn : int List -> int

Modules for name-space management
  A file full of types and functions can be cumbersome to manage
  Would like some hierarchical organization to names
  Modules allow grouping declarations to achieve a hierarchical name-space
  ML structure declarations create modules
- structure Assoc_List = struct
=   type (''k,'v) assoc_list = (''k*'v) list
=   val empty = nil
=   fun store(alist, key, value) = ...
=   fun fetch(alist, key) = ...
= end;
structure Assoc_List : sig
  type ('a,'b) assoc_list = ('a*'b) list
  val empty : 'a list
  val store : (''a*'b) list * ''a * 'b -> (''a*'b) list
  val fetch : (''a*'b) list * ''a -> 'b
end

Using structures
  To access declarations in a structure, can use dot notation
- val league = Assoc_List.empty;
val league = [] : 'a list
- val league = Assoc_List.store(league, "Mariners", {..});
val league = [("Mariners", {..})] : (string * {..}) list
- ...
- Assoc_List.fetch(league, "Mariners");
val it = {losses=4,wins=78} : {..}
  Other definitions of empty, store, fetch, etc. don’t clash
    Common names can be reused by different structures

The open declaration
  To avoid typing a lot of structure names, can use the open struct_name declaration to introduce local synonyms for all the declarations in a structure
    usually in a let, local, or within some other structure
fun add_first_team(name) =
  let
    open Assoc_List  (* imports assoc_list, empty, store, fetch *)
    val init = {wins=0,losses=0}
  in
    store(empty,name,init)
    (* Assoc_List.store(Assoc_List.empty, name, init) *)
  end

Modules for encapsulation
  Want to hide details of data structure implementations from clients, i.e., data abstraction
    simplify interface to clients
    allow implementation to change without affecting clients
  In C++ and Java, use public/private annotations
  In ML:
    define a signature that specifies the desired interface
    specify the signature with the structure declaration
  E.g. a signature that hides the implementation of assoc_list:
- signature ASSOC_LIST = sig
=   type (''k,'v) assoc_list (* no rhs! *)
=   val empty : (''k,'v) assoc_list
=   val store : (''k,'v) assoc_list * ''k * 'v ->
=               (''k,'v) assoc_list
=   val fetch : (''k,'v) assoc_list * ''k -> 'v
= end;
signature ASSOC_LIST = sig ... end

Specifying the signatures of structures
  Specify desired signature of structure when declaring it:
- structure Assoc_List :> ASSOC_LIST = struct
=   type (''k,'v) assoc_list = (''k*'v) list
=   val empty = nil
=   fun store(alist, key, value) = ...
=   fun fetch(alist, key) = ...
=   fun helper(...) = ...
= end;
structure Assoc_List : ASSOC_LIST
  The structure’s interface is the given one, not the default interface that exposes everything

Hidden implementation
  Now clients can’t see implementation, nor guess it
- val teams = Assoc_List.empty;
val teams = - : (''a,'b) Assoc_List.assoc_list
- val teams' = "Mariners"::"Yankees"::teams;
Error: operator and operand don't agree
  operator: string * string list
  operand:  string * (''Z,'Y) Assoc_List.assoc_list
- Assoc_List.helper(…);
Error: unbound variable helper in path Assoc_List.helper
- type Records = (string,…) Assoc_List.assoc_list;
type Records = (string,…) Assoc_List.assoc_list
- fun sortStandings(nil:Records):Records = nil
=   | sortStandings(pivot::rest) = ...;
Error: pattern and constraint don't agree
  pattern:    'Z list
  constraint: Records
  in pattern: nil : Records
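To make the encapsulation concrete, here is one possible complete version of the signature and structure. The bodies of store and fetch are elided in the slides, so the ones below are assumptions: a simple front-insertion and linear search, with fetch raising the Basis exception Fail when the key is missing. The names league and wins are hypothetical.

```sml
signature ASSOC_LIST = sig
  type (''k, 'v) assoc_list            (* abstract: no right-hand side *)
  val empty : (''k, 'v) assoc_list
  val store : (''k, 'v) assoc_list * ''k * 'v -> (''k, 'v) assoc_list
  val fetch : (''k, 'v) assoc_list * ''k -> 'v
end

structure Assoc_List :> ASSOC_LIST = struct
  type (''k, 'v) assoc_list = (''k * 'v) list
  val empty = nil
  (* Assumed implementation: newer bindings go in front and win on lookup. *)
  fun store (alist, key, value) = (key, value) :: alist
  fun fetch (nil, _) = raise Fail "key not found"
    | fetch ((k, v) :: rest, key) =
        if k = key then v else fetch (rest, key)
end

(* Clients can only go through empty, store, and fetch. *)
val league = Assoc_List.store (Assoc_List.empty, "Mariners", 78)
val league = Assoc_List.store (league, "Yankees", 61)
val wins = Assoc_List.fetch (league, "Mariners")
```

Because of the opaque ascription (:>), the representation could later be swapped for a tree or hash table without breaking any client code.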

An extended example: binary trees
  Stores elements in sorted order
    enables faster membership testing, printing out in sorted order
datatype 'a BTree =
    EmptyBTree
  | BTNode of 'a * 'a BTree * 'a BTree

Some functions on binary trees
fun insert(x, EmptyBTree) =
      BTNode(x, EmptyBTree, EmptyBTree)
  | insert(x, n as BTNode(y,t1,t2)) =
      if x = y then n
      else if x < y then
        BTNode(y, insert(x, t1), t2)
      else BTNode(y, t1, insert(x, t2))
fun member(x, EmptyBTree) = false
  | member(x, BTNode(y,t1,t2)) =
      if x = y then true
      else if x < y then member(x, t1)
      else member(x, t2)
  What are the types of these functions?
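One way to explore that question is to ask the typechecker. The sketch below assumes the usual defaulting rule, under which an unconstrained use of the overloaded < resolves to int; with that rule insert is not polymorphic at all. The names insert_int and t are hypothetical.

```sml
datatype 'a BTree =
    EmptyBTree
  | BTNode of 'a * 'a BTree * 'a BTree

fun insert (x, EmptyBTree) = BTNode (x, EmptyBTree, EmptyBTree)
  | insert (x, n as BTNode (y, t1, t2)) =
      if x = y then n
      else if x < y then BTNode (y, insert (x, t1), t2)
      else BTNode (y, t1, insert (x, t2))

(* Under int defaulting, the inferred type is int * int BTree -> int BTree;
   this annotation just restates it. *)
val insert_int : int * int BTree -> int BTree = insert

val t = insert (2, insert (7, insert (5, EmptyBTree)))
```

A call such as insert ("a", EmptyBTree) would then be rejected, which is the limitation that parameterizing by comparison functions addresses.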

First-class functions
  Can make code more reusable by parameterizing it by functions as well as values and types
  Simple technique: treat functions as first-class values
    function values can be created, used, passed around, bound to names, stored in other data structures, etc., just like all other ML values
- fun int_lt(x:int, y:int) = x < y;
val int_lt = fn : int * int -> bool
- int_lt(3,4);
val it = true : bool
- val f = int_lt;
val f = fn : int * int -> bool
- f(3,4);
val it = true : bool

Passing functions to functions
  A function can often be made more flexible if it takes another function as an argument
  Example:
    parameterize binary tree insert & member functions by the = and < comparisons to use
    parameterize the quicksort algorithm by the < comparison to use
    parameterize a list search function by the pattern being searched for
(* find(test_fn:'a -> bool, lst:'a list):'a *)
- exception NotFound;
- fun find(test_fn, nil) = raise NotFound
    | find(test_fn, elem::elems) =
        if test_fn(elem) then elem
        else find(test_fn, elems);
val find = fn : ('a -> bool) * 'a list -> 'a
- fun is_good_grade(g) = g >= 90;
val is_good_grade = fn : int -> bool
- find(is_good_grade, [85,72,92,98,84]);
val it = 92 : int

Binary tree functions, revisited
- fun insert(x, EmptyBTree, eq, lt) =
      BTNode(x, EmptyBTree, EmptyBTree)
    | insert(x, n as BTNode(y,t1,t2), eq, lt) =
      if eq(x,y) then n
      else if lt(x,y) then
        BTNode(y, insert(x, t1, eq, lt), t2)
      else BTNode(y, t1, insert(x, t2, eq, lt));
val insert = fn : 'a * 'a BTree *
                  ('a * 'a -> bool) *
                  ('a * 'a -> bool) -> 'a BTree
- fun member(x, EmptyBTree, eq, lt) = false
    | member(x, BTNode(y,t1,t2), eq, lt) =
      if eq(x,y) then true
      else if lt(x,y) then member(x, t1, eq, lt)
      else member(x, t2, eq, lt);
val member = fn : 'a * 'a BTree *
                  ('a * 'a -> bool) *
                  ('a * 'a -> bool) -> bool

Calling binary tree functions
- val t = insert(5, EmptyBTree, op=, op<);
val t = BTNode (5,EmptyBTree,EmptyBTree)
      : int BTree
- val t = insert(2, t, op=, op<);
- val t = insert(3, t, op=, op<);
- val t = insert(7, t, op=, op<);
- member(2, t, op=, op<);
val it = true : bool
- member(4, t, op=, op<);
val it = false : bool
- ... definitions of person type, person_eq and person_lt functions, and p1 value
- val pt = insert(p1, EmptyBTree,
=                 person_eq, person_lt);
val pt = ... : person BTree

Storing functions in data structures
  It’s a pain to keep passing around the eq and lt functions to all calls of insert and member
  It’s unreliable to depend on clients to pass in the right functions
  Idea: store the functions in the tree itself
local
  datatype 'a BT = EmptyBT | BTNode of 'a * 'a BT * 'a BT
  fun ins(x, tree, eq, lt) = ... previous insert ...
  fun mbr(x, tree, eq, lt) = ... previous member ...
in
  datatype 'a BTree = BTree of {tree:'a BT,
                                eq:'a * 'a -> bool,
                                lt:'a * 'a -> bool}
  fun emptyBTree(eq,lt) =
    BTree{tree=EmptyBT, eq=eq, lt=lt}
  fun insert(x, BTree{tree, eq, lt}) =
    BTree{tree=ins(x, tree, eq, lt), eq=eq, lt=lt}
  fun member(x, BTree{tree, eq, lt}) =
    mbr(x, tree, eq, lt)
end
  Records containing functions are ML’s version of objects!
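A filled-in sketch of this idea, with the elided ins and mbr bodies written out from the earlier parameterized insert and member. The string tree at the end, and the names has_apple and has_kiwi, are hypothetical usage added for illustration.

```sml
local
  datatype 'a BT = EmptyBT | BTNode of 'a * 'a BT * 'a BT

  fun ins (x, EmptyBT, eq, lt) = BTNode (x, EmptyBT, EmptyBT)
    | ins (x, n as BTNode (y, t1, t2), eq, lt) =
        if eq (x, y) then n
        else if lt (x, y) then BTNode (y, ins (x, t1, eq, lt), t2)
        else BTNode (y, t1, ins (x, t2, eq, lt))

  fun mbr (x, EmptyBT, eq, lt) = false
    | mbr (x, BTNode (y, t1, t2), eq, lt) =
        if eq (x, y) then true
        else if lt (x, y) then mbr (x, t1, eq, lt)
        else mbr (x, t2, eq, lt)
in
  (* The tree carries its own comparison functions. *)
  datatype 'a BTree =
    BTree of {tree : 'a BT, eq : 'a * 'a -> bool, lt : 'a * 'a -> bool}

  fun emptyBTree (eq, lt) = BTree {tree = EmptyBT, eq = eq, lt = lt}
  fun insert (x, BTree {tree, eq, lt}) =
    BTree {tree = ins (x, tree, eq, lt), eq = eq, lt = lt}
  fun member (x, BTree {tree, eq, lt}) = mbr (x, tree, eq, lt)
end

(* The comparisons are supplied once, when the tree is created. *)
val t : string BTree = emptyBTree (op =, fn (a : string, b) => a < b)
val t = insert ("pear", insert ("apple", t))
val has_apple = member ("apple", t)
val has_kiwi = member ("kiwi", t)
```

Callers of insert and member no longer mention eq or lt, so they cannot accidentally pass comparisons that disagree with the ones the tree was built with.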

A common pattern: map

Pattern: take a list and produce a new list, where each element of the output is calculated from the corresponding element of the input

map captures this pattern
map: ('a -> 'b) * 'a list -> 'b list
[not quite the type of ML’s predefined map; stay tuned]

Example: have a list of Fahrenheit temperatures for Seattle days; want to give a list of temps to a friend in England

- fun f2c(f_temp) = (f_temp - 32.0) * 5.0/9.0;
val f2c = fn : real -> real
- val f_temps = [56.4, 72.2, 68.4, 78.4, 45.0];
val f_temps = [56.4,72.2,68.4,78.4,45.0] : real list
- val c_temps = map(f2c, f_temps);
val c_temps = [13.556,22.333,20.222,25.778,7.222] : real list
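For reference, one way the tupled map used on this slide could be written. This is a sketch, not a definition from the slides, and it shadows the curried map from the Basis Library. The value lengths is an illustrative addition.

```sml
(* tupled map: takes the function and the list as one pair *)
fun map (f, nil) = nil
  | map (f, x :: xs) = f x :: map (f, xs)

fun f2c (f_temp) = (f_temp - 32.0) * 5.0 / 9.0

val c_temps = map (f2c, [32.0, 212.0])          (* [0.0, 100.0] *)
val lengths = map (String.size, ["ML", "Caml"]) (* [2, 4] *)
```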

Another common pattern: filter

Pattern: take a list and produce a new list of all the elements of the first list that pass some test (a predicate)

filter captures this pattern
filter: ('a -> bool) * 'a list -> 'a list
[not quite the type of ML’s predefined filter; stay tuned]

Example: have a list of day temps; want a list of nice days

- fun is_nice_day(temp) = temp >= 70.0;
val is_nice_day = fn : real -> bool
- val nice_days = filter(is_nice_day, f_temps);
val nice_days = [72.2,78.4] : real list
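A possible definition of the tupled filter, again only a sketch under the assumption that it mirrors the tupled map. The value evens is an illustrative addition.

```sml
(* tupled filter: keep exactly the elements for which pred returns true *)
fun filter (pred, nil) = nil
  | filter (pred, x :: xs) =
      if pred x then x :: filter (pred, xs) else filter (pred, xs)

val f_temps = [56.4, 72.2, 68.4, 78.4, 45.0]
val nice_days = filter (fn temp => temp >= 70.0, f_temps)
val evens = filter (fn n => n mod 2 = 0, [1, 2, 3, 4, 5, 6])   (* [2, 4, 6] *)
```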

Another common pattern: find

Pattern: take a list and return the first element that passes some test, raising an exception if no element passes the test

find captures this pattern
find: ('a -> bool) * 'a list -> 'a
exception NotFound
[not quite the type of ML’s predefined find; stay tuned]

Example: find first nice day

- val a_nice_day = find(is_nice_day, f_temps);
val a_nice_day = 72.2 : real
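A sketch of the tupled find with its exception, plus one way a caller might recover when nothing matches. The fallback value ~1 is an arbitrary choice made for this illustration.

```sml
exception NotFound

(* tupled find: return the first element that passes, or raise NotFound *)
fun find (pred, nil) = raise NotFound
  | find (pred, x :: xs) = if pred x then x else find (pred, xs)

val first_good = find (fn g => g >= 90, [85, 72, 92, 98, 84])           (* 92 *)
val fallback = find (fn g => g >= 100, [85, 72]) handle NotFound => ~1  (* ~1 *)
```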

Anonymous functions

Map functions and predicate functions often pretty simple, only used as argument to map, etc.; don’t merit their own name

Can directly write anonymous function expressions:
fn pattern_formal => expr_body

Examples:
- fn(x)=> x + 1;
val it = fn : int -> int
- (fn(x)=> x + 1)(8);
val it = 9 : int

- map(fn(f)=> (f - 32.0) * 5.0/9.0, f_temps);
val it = [13.556,...] : real list

- filter(fn(t)=> t < 60.0, f_temps);
val it = [56.4,45.0] : real list

Fun vs. fn

fn expressions are a primitive notion
val declarations are a primitive notion
fun declarations are just a convenient syntax for val + fn

fun f arg = expr
is syntactic sugar for
val rec f = (fn arg => expr)

fun succ(x) = x + 1
is syntactic sugar for
val rec succ = (fn(x) => x + 1)

Explains why the type of a fun declaration prints like a val declaration with a fn value
val succ = fn : int -> int

Attributes of good design:
  orthogonality of primitives
  syntactic sugar for common combinations
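The desugaring matters most for recursive functions, where rec is what lets the body mention the name being defined. A small sketch; fact and fact' are illustrative names, not from the slides.

```sml
(* fun form *)
fun fact n = if n = 0 then 1 else n * fact (n - 1)

(* the same function written as val rec + fn *)
val rec fact' = fn n => if n = 0 then 1 else n * fact' (n - 1)

val same = (fact 5 = fact' 5)   (* true: both give 120 *)
```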

Nested functions

An example:

- fun good_days(good_temp:real, temps:real list):real list =
    filter(fn(temp)=> temp >= good_temp, temps);
val good_days = fn : real * real list -> real list

(* good days in Seattle: *)
- good_days(70.0, f_temps);
val it = [72.2,78.4] : real list

(* good days in Fairbanks: *)
- good_days(32.0, f_temps);
val it = [56.4,72.2,68.4,78.4,45.0] : real list

What’s interesting about the anonymous function expression
fn(temp)=> temp >= good_temp ?

Nested functions and scoping

If functions can be written nested within other functions (whether named in a let expression, or anonymous), then they can reference local variables in the enclosing function scope
  Variables that a function uses but that are declared outside it are called free variables

Makes nested functions a lot more useful in practice
  More than just hiding helper functions

Beyond what can be done with function pointers in C/C++
  C functions only have globals as free variables

Akin to inner classes in Java
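The same effect with a named nested function in a let expression, as a minimal sketch. count_above and above are hypothetical names chosen for this example.

```sml
fun count_above (limit : int, xs : int list) : int =
  let
    (* above is nested; limit is free in its body and bound by count_above *)
    fun above x = x > limit
  in
    length (List.filter above xs)
  end

val n = count_above (70, [56, 72, 68, 78, 45])   (* 2: only 72 and 78 exceed 70 *)
```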

Returning functions from functions

If functions are first-class, then should be able to create and return them
Example: function composition

- fun compose(f,g) = (fn(x) => f(g(x)));
val compose = fn : ('b -> 'c) * ('a -> 'b) -> ('a -> 'c)

- fun square(x) = x*x;
val square = fn : int -> int
- fun double(y) = y+y;
val double = fn : int -> int

- val double_square = compose(double, square);
val double_square = fn : int -> int
- double_square(3);
val it = 18 : int
- (compose(square,double))(3);
val it = 36 : int

The infix o operator is ML’s predefined compose:
- map(square o double, [3,4,5]);
val it = [36,64,100] : int list

Currying

A curried function takes some arguments and then computes & returns a function which takes additional arguments

The result function can be applied to many different arguments, without having to pass in the first arguments again

Example: a curried version of map:
- fun map(f) =
    (fn(nil) => nil
      |(x::xs) => f(x)::map(f)(xs));
val map = fn : ('a->'b) -> 'a list -> 'b list

- map(square)([3,4,5]);
val it = [9,16,25] : int list

- val squares = map(square); (* "partial application" *)
val squares = fn : int list -> int list
- squares([3,4,5]);
val it = [9,16,25] : int list
- squares([9,10]);
val it = [81,100] : int list

Clean syntactic sugar for currying

Multiple formal argument patterns in a fun declaration define a curried function
Application ("function calling") written without parentheses
  juxtaposition associates left-to-right; higher precedence than infix operators
Function type (->) associates right-to-left; lower precedence than e.g. *, list

- fun map f  nil    = nil
    | map f (x::xs) = f x :: map f xs;          (* parenthesization? *)
val map = fn : ('a->'b) -> 'a list -> 'b list   (* parenthesization? *)

- fun filter pred  nil    = nil
    | filter pred (x::xs) =
        let val rest = filter pred xs in
          if pred x then x::rest else rest
        end;
val filter = fn : ('a->bool) -> 'a list -> 'a list

- fun find pred  nil    = raise NotFound
    | find pred (x::xs) = if pred x then x else find pred xs;
val find = fn : ('a->bool) -> 'a list -> 'a

Curried is the normal way to define ML functions
  syntactically cleaner
  semantically more flexible

ML’s predefined map and filter are defined like this; the Basis Library’s List.find is also curried, but it returns an 'a option (NONE when no element passes) instead of raising an exception
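A short sketch of partial application with the predefined curried functions. The names squares, positives and first_big are illustrative; the type annotations on x are there only to keep every binding monomorphic.

```sml
val squares   = map (fn (x : int) => x * x)          (* int list -> int list *)
val positives = List.filter (fn (x : int) => x > 0)  (* int list -> int list *)
val first_big = List.find (fn (x : int) => x > 90)   (* int list -> int option *)

val a = squares [3, 4, 5]            (* [9, 16, 25] *)
val b = positives [~1, 2, ~3, 4]     (* [2, 4] *)
val c = first_big [85, 72, 92, 98]   (* SOME 92 *)
val d = first_big [1, 2, 3]          (* NONE *)
```

Note that List.find signals failure with NONE, so no exception handler is needed at the call site.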

First-class functions and scoping

Lexical scoping is interesting if returning a function with free variables
  how to remember bindings of free variables?

- fun compose(f,g) = (fn(x) => f(g(x)));
val compose = fn : ('b -> 'c) * ('a -> 'b) -> 'a -> 'c

- val double_square = compose(double, square);
- val square_double = compose(square, double);

- double_square(3);
val it = 18 : int
- square_double(3);
val it = 36 : int

How are these two calls distinguished?
Where do bindings for f and g come from?

All curried functions have free variables like this
Many anonymous fn args (to map et al.) have free variables

Closures

To support lexically nested procedures which can be returned out of their enclosing scope, must represent as a closure: a pair of code address and an environment
  environment records bindings of free variables
  closure no longer dependent on enclosing scope
  pair and environment must be heap-allocated
  e.g. ML, Scheme, Haskell, Smalltalk, Cecil
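To make the "pair of code and environment" concrete, here is a hand-built closure for compose, restricted to int functions. This is only a model written in ML itself, not how any particular compiler lays closures out; env, closure, compose_code and apply are hypothetical names.

```sml
(* environment: the bindings of the formerly free variables f and g *)
type env = {f : int -> int, g : int -> int}

(* closure: code that takes its environment explicitly, paired with that environment *)
datatype closure = Closure of (env * int -> int) * env

(* the "code" of  fn x => f (g x) : f and g are fetched from the environment *)
fun compose_code ({f, g} : env, x : int) : int = f (g x)

fun compose (f, g) = Closure (compose_code, {f = f, g = g})

(* calling a closure hands its own environment to its code *)
fun apply (Closure (code, env), x) = code (env, x)

fun square x = x * x
fun double y = y + y

val double_square = compose (double, square)
val square_double = compose (square, double)
val r1 = apply (double_square, 3)   (* double (square 3) = 18 *)
val r2 = apply (square_double, 3)   (* square (double 3) = 36 *)
```

Both closures share the same code, compose_code; only their environments differ, which is how the two calls are told apart.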

Restricted versions

If only allow to pass nested procedures down, not return them, then can implement more cheaply
  environment can be stack-allocated, not heap-allocated
  e.g. Pascal, Modula-3

If allow nested procedures but not first-class procedures, then cheaper still
  do not need pair, just extra implicit environment argument
  e.g. Ada

If allow first-class procedures but no nesting, then can implement with just a code address
  e.g. C, C++

A general pattern: fold

The general pattern over lists simply abstracts the standard pattern of recursion

Recursion pattern:
fun f(…, nil, …) = …                          (* base case *)
  | f(…, x::xs, …) = … x … f(…, xs, …) …      (* inductive case *)

Parameters of this pattern, for a list argument of type 'a list:
  what to return as the base case result ('b)
  how to compute the inductive result from the head and the recursive call ('a * 'b -> 'b)

fold captures this pattern
foldl, foldr: ('a * 'b -> 'b) -> 'b -> 'a list -> 'b
  3 curried arguments
  iterate over elements left-to-right: foldl
  iterate over elements right-to-left: foldr
  for combining operators that are both associative and commutative, order doesn’t matter
  [which is the recursive pattern above?]
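Both folds written out by hand, as a sketch. my_foldr and my_foldl are hypothetical names chosen to avoid shadowing the Basis versions, whose types they are meant to match. String concatenation is associative but not commutative, so it makes the difference in order visible.

```sml
(* combines the head with the result of folding the tail *)
fun my_foldr f base nil = base
  | my_foldr f base (x :: xs) = f (x, my_foldr f base xs)

(* threads an accumulator through the list from the left *)
fun my_foldl f acc nil = acc
  | my_foldl f acc (x :: xs) = my_foldl f (f (x, acc)) xs

val r = my_foldr (op ^) "" ["a", "b", "c"]   (* "abc" *)
val l = my_foldl (op ^) "" ["a", "b", "c"]   (* "cba" *)
```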

Examples using fold

foldl, foldr: ('a * 'b -> 'b) -> 'b -> 'a list -> 'b

Summing all the elements of a list
- val rainfall = [0.0, 1.2, 0.0, 0.4, 1.3, 1.1];
val rainfall = […] : real list
- val total_rainfall =
    foldl (fn(rain,subtotal) => rain+subtotal) 0.0 rainfall;
val total_rainfall = 4.0 : real

Reusable sum function?

What do these do?
- foldl (fn(x,ls)=>x::ls) nil [3,4,5];
- foldr (fn(x,ls)=>x::ls) nil [3,4,5];
- foldr (fn(x,ls)=>x::ls) [1,2,3] [4,5,6];
```
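One possible answer to the questions above, as a sketch: a reusable sum falls out of partially applying foldl, and the three list folds turn out to be reverse, copy and append. All value names here are illustrative.

```sml
(* reusable sum: foldl applied to the operator and the base only *)
val sum_reals = foldl (op +) 0.0        (* real list -> real *)
val total = sum_reals [0.5, 1.5, 2.0]   (* 4.0 *)

val reversed = foldl (fn (x, ls) => x :: ls) nil [3, 4, 5]         (* [5, 4, 3] *)
val copied   = foldr (fn (x, ls) => x :: ls) nil [3, 4, 5]         (* [3, 4, 5] *)
val appended = foldr (fn (x, ls) => x :: ls) [1, 2, 3] [4, 5, 6]   (* [4, 5, 6, 1, 2, 3] *)
```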

Polymorphic type inference

ML infers types of expressions automatically, as follows:
  assign each declared variable & subexpression a fresh type variable
    result of function is another type variable
    share argument and result type variables across function cases
  for each subexpression, generate constraints on types of its operands
    constraint: one type expression must equal another
    before applying a polymorphic function, replace quantified type variables with fresh ones for that application
  solve constraints by unifying type expressions
    can partially refine types, e.g. 'a refined to 'b list, then 'b refined to ''c
    fail for cyclic constraints, e.g. 'a = 'a list

If overloaded operator is unresolved after constraint solving, default to int version

Overconstrained (unsatisfiable constraints): type error
Underconstrained (still some type variables): a polymorphic result

Example #1

fun sum lst =
  if null lst then 0
  else hd lst + sum (tl lst)

Example #2

fun map f nil = nil
  | map f (x::xs) = f x :: map f xs
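The two examples again with the inferred types written in as annotations, as a sketch of where inference ends up. The comments summarize the constraints informally and are not a full derivation; s, m and n are illustrative values.

```sml
(* Example #1: the literal 0 and the int default for + pin everything down *)
fun sum (lst : int list) : int =
  if null lst then 0              (* 0 : int, so the result type is int *)
  else hd lst + sum (tl lst)      (* + at int forces hd lst : int, so lst : int list *)

(* Example #2: nothing constrains 'a or 'b further, so both stay polymorphic *)
fun map (f : 'a -> 'b) (nil : 'a list) : 'b list = nil
  | map f (x :: xs) = f x :: map f xs

val s = sum [1, 2, 3]             (* 6 *)
val m = map Int.toString [1, 2]   (* ["1", "2"] *)
val n = map size ["a", "bc"]      (* [1, 2] *)
```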

Let-bound polymorphism

ML type inference supports only let-bound polymorphism
  only val-/fun-declared names can be polymorphic, not names of formals
  implicit quantifiers of polymorphic variables are at outer level (“prenex form”)

- fun id(x) = x;
val id = fn : 'a -> 'a
(* with explicit quantifier: val id = fn : ∀'a.'a->'a *)
- fun g(f) = (f 3, f "hi");
(* type error in ML; f cannot be given a polymorphic type *)
(* this (legal) ML type wouldn’t allow the two different f calls:
   val g = fn : ∀'a.(('a->'a) -> int*string) *)

What if ML allowed explicitly quantified polymorphic types for formals?

- fun g(f:∀'a.'a->'a) = (f 3, f "hi");
val g = fn : (∀'a.'a->'a) -> int*string
- g(id);
val it = (3, "hi") : int * string

Type inference precludes first-class polymorphic values
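The contrast in runnable form, as a small sketch: the let-bound id can be used at two types, while the version with f as a formal is kept in a comment because it does not type-check. The name pair is illustrative.

```sml
(* let-bound: id is generalized at its declaration, so each use gets a fresh instance *)
val pair =
  let
    fun id x = x
  in
    (id 3, id "hi")
  end

(* formal parameter: f gets a single monomorphic type, so this is rejected:
   fun g f = (f 3, f "hi")
*)
```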

Polymorphic vs. monomorphic recursion

When analyzing the body of a polymorphic function, what do we do when we encounter a recursive call?

fun f(x) = ... f(hd(x)) ... f(tl(x)) ...

If allow polymorphic recursion, then f is considered polymorphic in body, and each recursive call uses a fresh instantiation (like any call to a polymorphic function)

If only monomorphic recursion, then force recursive call to pass same argument types as formals (don’t make a fresh instantiation)

Type inference under polymorphic recursion is undecidable
  but only in obscure cases

ML uses monomorphic recursion

Nested polymorphic functions

After doing type inference for a function, if any type variables remain in its type, then make the function polymorphic over them

But what about a nested function?

fun f(x) =
  let fun g(u, v) = ([x,u], [v,v])
  in ... g(x, 5) ...         (* does this work? *)
     ... g([x], true) ...    (* does this? *)
  end

Type of f: 'a -> '...
Type of g: 'a * 'b -> 'a list * 'b list
  but 'a and 'b act differently…
  'a is a non-generalizable type variable
    don’t replace with a fresh type variable when g called

Handles monomorphic recursion restriction, too
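A runnable variant of the example, as a sketch: the two calls that do type-check are paired up, and the one that does not is left in a comment. The value result is an illustrative addition.

```sml
fun f x =
  let
    (* g : 'a * 'b -> 'a list * 'b list, where 'a is the type of f's argument
       (fixed while checking f) and 'b is generalized *)
    fun g (u, v) = ([x, u], [v, v])
  in
    (g (x, 5), g (x, true))
    (* g ([x], true) is rejected: it would need u : 'a list, but u must have type 'a *)
  end

val result = f "a"
```

Here v is used at both int and bool, while u always has the type of x.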

Properties of ML type inference

Hindley-Milner type inference
  allows let-bound polymorphism only
  universal parametric polymorphism, no constrained polymorphism (other than equality types)

Type inference yields principal type for expression
  single most general type that can be inferred

Worst-case complexity of type inference: exponential time
Average case complexity: linear time

References

Support side-effects (mutation) through explicit reference values:
  ref     : 'a -> 'a ref
  !       : 'a ref -> 'a
  (op :=) : 'a ref * 'a -> unit

- val v = ref 0;
val v = ref 0 : int ref
- v := !v + 1;
val it = () : unit
- !v;
val it = 1 : int

Arrays: indexable mutable locations

Must say which things are mutable
Mutation is compartmentalized
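A sketch of how a ref combines with a closure to give private mutable state, followed by a one-slot array update. make_counter and the value names are hypothetical; Array.array, Array.update and Array.sub are from the Basis Library.

```sml
(* a counter whose state lives in a ref captured by the returned function *)
fun make_counter () =
  let
    val count = ref 0
  in
    fn () => (count := !count + 1; !count)
  end

val next = make_counter ()
val first = next ()    (* 1 *)
val second = next ()   (* 2 *)

(* arrays are mutable too: update overwrites one slot in place *)
val arr = Array.array (3, 0)
val () = Array.update (arr, 1, 42)
val middle = Array.sub (arr, 1)   (* 42 *)
```

Only count and arr are mutable; everything else in the example is an ordinary immutable binding.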

References to polymorphic values?

Try this:

- fun id(x) = x;
val id = fn : 'a -> 'a
- val fp = ref id;
(* error in real SML; pretend it’s not *)
val fp = ref fn : ('a -> 'a) ref
- (!fp true, !fp 5);
val it = (true, 5) : bool * int
- fp := not;
hmmmm...
- !fp 5;
CRASH!!!

The "value restriction"

Cannot allow references to polymorphic values
  exception arguments similarly cannot be polymorphic

In general, only syntactic values (e.g. literals, variables, fn expressions) can be given polymorphic types in val/fun bindings, not arbitrary expressions such as function calls
  get “non-generalizable type variable” error otherwise

SML'90 had “weakly polymorphic types” instead
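How the restriction shows up in practice, as a sketch. The remark about SML/NJ's behavior without the annotation describes that one implementation; other compilers may report an error instead. The names ok, cell, singletons and both are illustrative.

```sml
(* a fn expression is a syntactic value, so this binding is generalized *)
val id = fn x => x
val ok = (id 3, id "hi")

(* ref [] is a function call, not a syntactic value; without the annotation the
   element type could not be generalized (SML/NJ warns and picks a dummy type) *)
val cell : int list ref = ref []
val () = cell := [1, 2]

(* eta-expansion: writing the list argument explicitly makes this a fun
   declaration, which is generalized again *)
fun singletons xs = map (fn x => [x]) xs
val both = (singletons [1, 2], singletons ["a"])
```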

Functors

Can parameterize structures by other structures

functor AListUser(AL:ASSOC_LIST) =
struct
  ... AL.store ... AL.fetch ...
end

only know aspects of AL that are defined by ASSOC_LIST

Instantiate functors to build regular structures:

- structure ALU1 = AListUser(Assoc_List);
- structure ALU2 = AListUser(Hash_Assoc_List);

Functors for bounded quantification

Define a signature representing the operations needed

signature ORDERED =
sig
  type T
  val eq: T * T -> bool
  val lt: T * T -> bool
end

Define quantified algorithms as elements of functors parameterized by required signature

functor Sort(O:ORDERED) =
struct
  fun min(x,y) = if O.lt(x,y) then x else y
  fun sort(lst) = ... O.lt(x, y) ...
end

An instantiation of Sort

Create specialized sorter by instantiating functor with appropriate operations

- structure IntOrder:ORDERED =
    struct
      type T = int
      val lt = (op <)
      val eq = (op =)
    end;
structure IntOrder : ORDERED = …

- structure IntSort = Sort(IntOrder);
structure IntSort = … val sort:IntOrder.T list -> IntOrder.T list …

- IntSort.sort([3,5,~2]);
val it = [~2,3,5] : IntOrder.T list

Use IntOrder:ORDERED, not IntOrder:>ORDERED
  Using : instead of :> allows type binding (T=int) to bleed through to users of IntOrder
  IntOrder is a view/extension of an existing type, int; it isn’t creating a new ADT w/ only 2 operations
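A complete version that can be loaded as is, under one stated assumption: the slides leave the body of sort elided, so an insertion sort is swapped in here purely for illustration. The helper insert and the values sorted and smallest are not from the slides.

```sml
signature ORDERED =
sig
  type T
  val eq : T * T -> bool
  val lt : T * T -> bool
end

functor Sort (O : ORDERED) =
struct
  fun min (x, y) = if O.lt (x, y) then x else y

  (* assumed algorithm: insertion sort driven only by O.lt *)
  fun insert (x, nil) = [x]
    | insert (x, y :: ys) =
        if O.lt (x, y) then x :: y :: ys else y :: insert (x, ys)

  fun sort lst = foldl insert nil lst
end

(* transparent ascription (:) keeps T = int visible to clients *)
structure IntOrder : ORDERED =
struct
  type T = int
  val lt : int * int -> bool = (op <)
  val eq : int * int -> bool = (op =)
end

structure IntSort = Sort (IntOrder)
val sorted = IntSort.sort [3, 5, ~2]   (* [~2, 3, 5] *)
val smallest = IntSort.min (4, 1)      (* 1 *)
```

Because the ascription is transparent, the int list literal is accepted directly; with :> the argument would have to be built from values of the abstract type IntOrder.T.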

Another instantiation of Sort

Can create nested, multiply parameterized functors:

functor PairOrder(structure First:ORDERED;
                  structure Second:ORDERED):ORDERED =
struct
  type T = First.T * Second.T
  fun lt((x1,x2),(y1,y2)) = First.lt(x1,y1) andalso Second.lt(x2,y2);
  fun eq((x1,x2),(y1,y2)) = ...;
end

(* to sort (int*string) lists: *)
structure IntStringSort =
  Sort(PairOrder(structure First = IntOrder;
                 structure Second = StringOrder))
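The componentwise lt above leaves some pairs unordered in both directions, for example (1,"b") and (2,"a"). If a total order is wanted for sorting, a lexicographic variant is one option. The sketch below is that swapped-in alternative, not the slides' definition; LexPairOrder, StringOrder and IntStringOrder are assumed names and definitions.

```sml
signature ORDERED =
sig
  type T
  val eq : T * T -> bool
  val lt : T * T -> bool
end

structure IntOrder : ORDERED =
struct
  type T = int
  val eq : int * int -> bool = (op =)
  val lt : int * int -> bool = (op <)
end

(* StringOrder is not shown in the slides; this definition is an assumption *)
structure StringOrder : ORDERED =
struct
  type T = string
  val eq : string * string -> bool = (op =)
  val lt : string * string -> bool = (op <)
end

(* lexicographic order: compare first components, break ties on the second *)
functor LexPairOrder (structure First : ORDERED
                      structure Second : ORDERED) : ORDERED =
struct
  type T = First.T * Second.T
  fun eq ((x1, x2), (y1, y2)) = First.eq (x1, y1) andalso Second.eq (x2, y2)
  fun lt ((x1, x2), (y1, y2)) =
    First.lt (x1, y1) orelse (First.eq (x1, y1) andalso Second.lt (x2, y2))
end

structure IntStringOrder =
  LexPairOrder (structure First = IntOrder
                structure Second = StringOrder)

val a = IntStringOrder.lt ((1, "b"), (2, "a"))   (* true: 1 < 2 decides *)
val b = IntStringOrder.lt ((1, "a"), (1, "b"))   (* true: tie on 1, "a" < "b" *)
val c = IntStringOrder.lt ((2, "a"), (1, "b"))   (* false *)
```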

Signature “subtyping”

Signature specifies a particular interface
Any structure that satisfies that interface can be used where that interface is expected
  e.g. in functor application

Structure can have
  more operations
  more polymorphic operations
  more details of implementation of types
than required by signature

Some limitations of ML modules

Structures are not first-class values
  must be named or be argument to functor application
  must be declared at top-level or nested inside another structure or signature

Cannot instantiate functors at run-time to create “objects”
  cannot simulate classes and object-oriented programming

No type inference for functor arguments

These constraints are to enable type inference of core and static typechecking (at all) of structures that contain types

Modules vs. classes

Classes (abstract data types) implicitly define a single type, with associated constructors, observers, and mutators

Modules can define 0, 1, or many types in same module, with associated operations over several types
  no new types if adding operations to existing type(s)
    e.g. a library of integer or array functions
    hard to do in C++
  multiple types can share private data & operations
    requires friend declarations in C++
  one new type requires a name for the type (e.g. T)
    class name is also type name in C++, conveniently

Functors similar to parameterized classes

C++’s public/private is simpler than ML’s separate signatures, but C++ doesn’t have a simple way of describing just an interface

See Moby: modules + classes, cleanly