topics/_week5_system_F.mdwn

   1 # System F and recursive types
   2
   3 In the simply-typed lambda calculus, we write types like <code>&sigma;
   4 -> &tau;</code>.  This looks like logical implication.  We'll take
   5 that resemblance seriously when we discuss the Curry-Howard
   6 correspondence.  In the meantime, note that types respect modus
   7 ponens:
   8
   9 <pre>
  10 Expression    Type      Implication
  11 -----------------------------------
  12 fn            &alpha; -> &beta;    &alpha; &sup; &beta;
  13 arg           &alpha;         &alpha;
  14 ------        ------    --------
  15 (fn arg)      &beta;         &beta;
  16 </pre>
  17
  18 The implication in the right-hand column is modus ponens, of course.
  19
  20 System F was discovered by Girard (the same guy who invented Linear
  21 Logic), but it was independently proposed around the same time by
  22 Reynolds, who called his version the *polymorphic lambda calculus*.
  23 (Reynolds was also an early player in the development of
  24 continuations.)
  25
  26 System F enhances the simply-typed lambda calculus with abstraction
  27 over types.  In order to state System F, we'll need to adopt the
  28 notational convention that "<code>x:&alpha;</code>" represents an
  29 expression `x` whose type is <code>&alpha;</code>.
  30
  31 Then System F can be specified as follows (choosing notation that will
  32 match up with usage in O'Caml, whose type system is based on System F):
  33
  34         System F:
  35         ---------
  36         types τ ::= c | 'a | τ1 -> τ2 | ∀'a. τ
  37         expressions e ::= x | λx:τ. e | e1 e2 | Λ'a. e | e [τ]
  38
  39 In the definition of the types, "`c`" is a type constant (e.g., `e` or
  40 `t`, or in arithmetic contexts, `N` or `Int`).  "`'a`" is a type
  41 variable (the tick mark just indicates that the variable ranges over
  42 types rather than over values).  "`τ1 -> τ2`" is the type of a
  43 function from expressions of type `τ1` to expressions of type `τ2`.
  44 And "`∀'a. τ`" is called a universal type, since it universally
  45 quantifies over the type variable `'a`.  (You can expect that in
  46 `∀'a. τ`, the type `τ` will usually have at least one free occurrence
  47 of `'a` somewhere inside of it.)
  48
  49 In the definition of the expressions, we have variables "`x`" as usual.
  50 Abstracts "`λx:τ. e`" are similar to abstracts in the simply-typed lambda
  51 calculus, except that they have their shrug variable annotated with a
  52 type.  Applications "`e1 e2`" are just like in the simply-typed lambda calculus.
  53
  54 In addition to variables, abstracts, and applications, we have two
  55 additional ways of forming expressions: "`Λ'a. e`" is called a *type
  56 abstraction*, and "`e [τ]`" is called a *type application*.  The idea
  57 is that <code>&Lambda;</code> is a capital <code>&lambda;</code>: just
  58 like the lower-case <code>&lambda;</code>, <code>&Lambda;</code> binds
  59 variables in its body, except that unlike <code>&lambda;</code>,
  60 <code>&Lambda;</code> binds type variables instead of expression
  61 variables.  So in the expression
  62
  63 <code>&Lambda; 'a (&lambda; x:'a . x)</code>
  64
  65 the <code>&Lambda;</code> binds the type variable `'a` that occurs in
  66 the <code>&lambda;</code> abstract.  This expression is a polymorphic
  67 version of the identity function.  It defines one general identity
  68 function that can be adapted for use with expressions of any type. In order
  69 to get it ready to apply to, say, a variable of type boolean, just do
  70 this:
  71
  72 <code>(&Lambda; 'a (&lambda; x:'a . x)) [t]</code>
  73
  74 This type application (where `t` is a type constant for Boolean truth
  75 values) specifies the value of the type variable &alpha;, which is
  76 the type of the variable bound in the &lambda; expression.  Not
  77 surprisingly, the type of this type application is a function from
  78 Booleans to Booleans:
  79
  80 <code>((&Lambda; 'a (&lambda; x:'a . x)) [t]): (b -> b)</code>
  81
  82 Likewise, if we had instantiated the type variable as an entity (base
  83 type `e`), the resulting identity function would have been a function
  84 of type `e -> e`:
  85
  86 <code>((&Lambda; 'a (&lambda; x:'a . x)) [e]): (e -> e)</code>
  87
  88 Clearly, for any choice of a type `'a`, the identity function can be
  89 instantiated as a function from expresions of type `'a` to expressions
  90 of type `'a`.  In general, then, the type of the unapplied
  91 (polymorphic) identity function is
  92
  93 <code>(&Lambda; 'a (&lambda; x:'a . x)): (&forall; 'a . 'a -> 'a)</code>
  94
  95 Pred in System F
  96 ----------------
  97
  98 We saw that the predecessor function couldn't be expressed in the
  99 simply-typed lambda calculus.  It can be expressed in System F,
 100 however.  Here is one way, coded in
 101 [[Benjamin Pierce's type-checker and evaluator for
 102 System F|http://www.cis.upenn.edu/~bcpierce/tapl/index.html]] (the
 103 part you want is called "fullpoly"):
 104
 105     N = All X . (X->X)->X->X;
 106     Pair = All X . (N -> N -> X) -> X;
 107     let zero = lambda X . lambda s:X->X . lambda z:X. z in
 108     let snd = lambda x:N . lambda y:N . y in
 109     let pair = lambda x:N . lambda y:N . lambda X . lambda z:N->N->X . z x y in
 110     let suc = lambda n:N . lambda X . lambda s:X->X . lambda z:X . s (n [X] s z) in
 111     let shift = lambda p:Pair . p [Pair] (lambda a:N . lambda b:N . pair (suc a) a) in
 112     let pre = lambda n:N . n [Pair] shift (pair zero zero) [N] snd in
 113
 114     pre (suc (suc (suc zero)));
 115
 116 We've truncated the names of "suc(c)" and "pre(d)", since those are
 117 reserved words in Pierce's system.  Note that in this code, there is
 118 no typographic distinction between ordinary lambda and type-level
 119 lambda, though the difference is encoded in whether the variables are
 120 lower case (for ordinary lambda) or upper case (for type-level
 121 lambda).
 122
 123 The key to the extra flexibility provided by System F is that we can
 124 instantiate the `pair` function to return a number, as in the
 125 definition of `pre`, or we can instantiate it to return an ordered
 126 pair, as in the definition of the `shift` function.  Because we don't
 127 have to choose a single type for all uses of the pair-building
 128 function, we aren't forced into a infinite regress of types.
 129
 130 [See Benjamin C. Pierce. 2002. *Types and Programming Languages*, MIT
 131 Press, pp. 350--353, for `tail` for lists in System F.]
 132
 133 Typing &omega;
 134 --------------
 135
 136 In fact, it is even possible to give a type for &omeage; in System F.
 137
 138     omega = lambda x:(All X. X->X) . x [All X . X->X] x in
 139     omega;
 140
 141 Each time the internal application is performed, the type of the head
 142 is chosen anew.  And each time, we choose the same type as before, the
 143 type of a function that takes an argument of any type and returns a
 144 result of the same type...
 145
 146
 147 Types in OCaml
 148 --------------
 149
 150 OCaml has type inference: the system can often infer what the type of
 151 an expression must be, based on the type of other known expressions.
 152
 153 For instance, if we type
 154
 155     # let f x = x + 3;;
 156
 157 The system replies with
 158
 159     val f : int -> int = <fun>
 160
 161 Since `+` is only defined on integers, it has type
 162
 163      # (+);;
 164      - : int -> int -> int = <fun>
 165
 166 The parentheses are there to turn off the trick that allows the two
 167 arguments of `+` to surround it in infix (for linguists, SOV) argument
 168 order. That is,
 169
 170     # 3 + 4 = (+) 3 4;;
 171     - : bool = true
 172
 173 In general, tuples with one element are identical to their one
 174 element:
 175
 176     # (3) = 3;;
 177     - : bool = true
 178
 179 though OCaml, like many systems, refuses to try to prove whether two
 180 functional objects may be identical:
 181
 182     # (f) = f;;
 183     Exception: Invalid_argument "equal: functional value".
 184
 185 Oh well.
 186
 187 [Note: There is a limited way you can compare functions, using the
 188 `==` operator instead of the `=` operator. Later when we discuss mutation,
 189 we'll discuss the difference between these two equality operations.
 190 Scheme has a similar pair, which they name `eq?` and `equal?`. In Python,
 191 these are `is` and `==` respectively. It's unfortunate that OCaml uses `==` for the opposite operation that Python and many other languages use it for. In any case, OCaml will accept `(f) == f` even though it doesn't accept
 192 `(f) = f`. However, don't expect it to figure out in general when two functions
 193 are equivalent. (That question is not Turing computable.)
 194
 195         # (f) == (fun x -> x + 3);;
 196         - : bool = false
 197
 198 Here OCaml says (correctly) that the two functions don't stand in the `==` relation, which basically means they're not represented in the same chunk of memory. However as the programmer can see, the functions are extensionally equivalent. The meaning of `==` is rather weird.]
 199
 200
 201
 202 Booleans in OCaml, and simple pattern matching
 203 ----------------------------------------------
 204
 205 Where we would write `true 1 2` in our pure lambda calculus and expect
 206 it to evaluate to `1`, in OCaml boolean types are not functions
 207 (equivalently, they're functions that take zero arguments). Instead, selection is
 208 accomplished as follows:
 209
 210     # if true then 1 else 2;;
 211     - : int = 1
 212
 213 The types of the `then` clause and of the `else` clause must be the
 214 same.
 215
 216 The `if` construction can be re-expressed by means of the following
 217 pattern-matching expression:
 218
 219     match <bool expression> with true -> <expression1> | false -> <expression2>
 220
 221 That is,
 222
 223     # match true with true -> 1 | false -> 2;;
 224     - : int = 1
 225
 226 Compare with
 227
 228     # match 3 with 1 -> 1 | 2 -> 4 | 3 -> 9;;
 229     - : int = 9
 230
 231 Unit and thunks
 232 ---------------
 233
 234 All functions in OCaml take exactly one argument.  Even this one:
 235
 236     # let f x y = x + y;;
 237     # f 2 3;;
 238     - : int = 5
 239
 240 Here's how to tell that `f` has been curry'd:
 241
 242     # f 2;;
 243     - : int -> int = <fun>
 244
 245 After we've given our `f` one argument, it returns a function that is
 246 still waiting for another argument.
 247
 248 There is a special type in OCaml called `unit`.  There is exactly one
 249 object in this type, written `()`.  So
 250
 251     # ();;
 252     - : unit = ()
 253
 254 Just as you can define functions that take constants for arguments
 255
 256     # let f 2 = 3;;
 257     # f 2;;
 258     - : int = 3;;
 259
 260 you can also define functions that take the unit as its argument, thus
 261
 262     # let f () = 3;;
 263     val f : unit -> int = <fun>
 264
 265 Then the only argument you can possibly apply `f` to that is of the
 266 correct type is the unit:
 267
 268     # f ();;
 269     - : int = 3
 270
 271 Now why would that be useful?
 272
 273 Let's have some fun: think of `rec` as our `Y` combinator.  Then
 274
 275     # let rec f n = if (0 = n) then 1 else (n * (f (n - 1)));;
 276     val f : int -> int = <fun>
 277     # f 5;;
 278     - : int = 120
 279
 280 We can't define a function that is exactly analogous to our &omega;.
 281 We could try `let rec omega x = x x;;` what happens?
 282
 283 [Note: if you want to learn more OCaml, you might come back here someday and try:
 284
 285         # let id x = x;;
 286         val id : 'a -> 'a = <fun>
 287         # let unwrap (`Wrap a) = a;;
 288         val unwrap : [< `Wrap of 'a ] -> 'a = <fun>
 289         # let omega ((`Wrap x) as y) = x y;;
 290         val omega : [< `Wrap of [> `Wrap of 'a ] -> 'b as 'a ] -> 'b = <fun>
 291         # unwrap (omega (`Wrap id)) == id;;
 292         - : bool = true
 293         # unwrap (omega (`Wrap omega));;
 294     <Infinite loop, need to control-c to interrupt>
 295
 296 But we won't try to explain this now.]
 297
 298
 299 Even if we can't (easily) express omega in OCaml, we can do this:
 300
 301     # let rec blackhole x = blackhole x;;
 302
 303 By the way, what's the type of this function?
 304
 305 If you then apply this `blackhole` function to an argument,
 306
 307     # blackhole 3;;
 308
 309 the interpreter goes into an infinite loop, and you have to type control-c
 310 to break the loop.
 311
 312 Oh, one more thing: lambda expressions look like this:
 313
 314     # (fun x -> x);;
 315     - : 'a -> 'a = <fun>
 316     # (fun x -> x) true;;
 317     - : bool = true
 318
 319 (But `(fun x -> x x)` still won't work.)
 320
 321 You may also see this:
 322
 323         # (function x -> x);;
 324         - : 'a -> 'a = <fun>
 325
 326 This works the same as `fun` in simple cases like this, and slightly differently in more complex cases. If you learn more OCaml, you'll read about the difference between them.
 327
 328 We can try our usual tricks:
 329
 330     # (fun x -> true) blackhole;;
 331     - : bool = true
 332
 333 OCaml declined to try to fully reduce the argument before applying the
 334 lambda function. Question: Why is that? Didn't we say that OCaml is a call-by-value/eager language?
 335
 336 Remember that `blackhole` is a function too, so we can
 337 reverse the order of the arguments:
 338
 339     # blackhole (fun x -> true);;
 340
 341 Infinite loop.
 342
 343 Now consider the following variations in behavior:
 344
 345     # let test = blackhole blackhole;;
 346     <Infinite loop, need to control-c to interrupt>
 347
 348     # let test () = blackhole blackhole;;
 349     val test : unit -> 'a = <fun>
 350
 351     # test;;
 352     - : unit -> 'a = <fun>
 353
 354     # test ();;
 355     <Infinite loop, need to control-c to interrupt>
 356
 357 We can use functions that take arguments of type `unit` to control
 358 execution.  In Scheme parlance, functions on the `unit` type are called
 359 *thunks* (which I've always assumed was a blend of "think" and "chunk").
 360
 361 Question: why do thunks work? We know that `blackhole ()` doesn't terminate, so why do expressions like:
 362
 363         let f = fun () -> blackhole ()
 364         in true
 365
 366 terminate?
 367
 368 Bottom type, divergence
 369 -----------------------
 370
 371 Expressions that don't terminate all belong to the **bottom type**. This is a subtype of every other type. That is, anything of bottom type belongs to every other type as well. More advanced type systems have more examples of subtyping: for example, they might make `int` a subtype of `real`. But the core type system of OCaml doesn't have any general subtyping relations. (Neither does System F.) Just this one: that expressions of the bottom type also belong to every other type. It's as if every type definition in OCaml, even the built in ones, had an implicit extra clause:
 372
 373         type 'a option = None | Some of 'a;;
 374         type 'a option = None | Some of 'a | bottom;;
 375
 376 Here are some exercises that may help better understand this. Figure out what is the type of each of the following:
 377
 378         fun x y -> y;;
 379
 380         fun x (y:int) -> y;;
 381
 382         fun x y : int -> y;;
 383
 384         let rec blackhole x = blackhole x in blackhole;;
 385
 386         let rec blackhole x = blackhole x in blackhole 1;;
 387
 388         let rec blackhole x = blackhole x in fun (y:int) -> blackhole y y y;;
 389
 390         let rec blackhole x = blackhole x in (blackhole 1) + 2;;
 391
 392         let rec blackhole x = blackhole x in (blackhole 1) || false;;
 393
 394         let rec blackhole x = blackhole x in 2 :: (blackhole 1);;
 395
 396 By the way, what's the type of this:
 397
 398         let rec blackhole (x:'a) : 'a = blackhole x in blackhole
 399
 400
 401 Back to thunks: the reason you'd want to control evaluation with
 402 thunks is to manipulate when "effects" happen. In a strongly
 403 normalizing system, like the simply-typed lambda calculus or System F,
 404 there are no "effects." In Scheme and OCaml, on the other hand, we can
 405 write programs that have effects. One sort of effect is printing.
 406 Another sort of effect is mutation, which we'll be looking at soon.
 407 Continuations are yet another sort of effect. None of these are yet on
 408 the table though. The only sort of effect we've got so far is
 409 *divergence* or non-termination. So the only thing thunks are useful
 410 for yet is controlling whether an expression that would diverge if we
 411 tried to fully evaluate it does diverge. As we consider richer
 412 languages, thunks will become more useful.