week6.mdwn

[[!toc]]

Types, OCaml
------------

OCaml has type inference: the system can often infer what the type of
an expression must be, based on the types of other known expressions.

For instance, if we type

    # let f x = x + 3;;

the system replies with

    val f : int -> int = <fun>

This is because `+` is only defined on integers. It has type

    # (+);;
    - : int -> int -> int = <fun>

The parentheses are there to turn off the trick that allows the two
arguments of `+` to surround it in infix (for linguists, SVO) argument
order. That is,

    # 3 + 4 = (+) 3 4;;
    - : bool = true
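
Because a parenthesized operator is an ordinary curried function, it can be partially applied like any other function. Here is a small sketch of that; the names `add3` and `g` are made up for illustration.

```ocaml
(* (+) has type int -> int -> int, so supplying just one
   argument yields a function of type int -> int. *)
let add3 = (+) 3

(* Floats have their own operators, written with a trailing dot,
   so OCaml infers float -> float here rather than int -> int. *)
let g x = x +. 3.0

let () =
  assert (add3 4 = 7);
  assert (3 + 4 = (+) 3 4);
  assert (g 1.5 = 4.5)
```

The inference runs in the same direction as before: the operator's type is known, and the type of the surrounding definition follows from it.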

In general, a "tuple" with one element is identical to its one
element, since the parentheses merely group:

    # (3) = 3;;
    - : bool = true

though OCaml, like many systems, refuses to try to prove whether two
functional objects may be identical:

    # (f) = f;;
    Exception: Invalid_argument "equal: functional value".

Oh well.

[Note: There is a limited way you can compare functions, using the
`==` operator instead of the `=` operator. Later when we discuss mutation,
we'll discuss the difference between these two equality operations.
Scheme has a similar pair, which they name `eq?` and `equal?`. In Python,
these are `is` and `==` respectively. It's unfortunate that OCaml uses `==`
for the opposite operation from the one Python and many other languages use
it for. In any case, OCaml will understand `(f) == f` even though it doesn't understand
`(f) = f`. However, don't expect it to figure out in general when two functions
are identical. (That question is not Turing computable.)

        # (f) == (fun x -> x + 3);;
        - : bool = false

Here OCaml says (correctly) that the two functions don't stand in the `==` relation, which basically means they're not represented in the same chunk of memory. However, as the programmer can see, the functions are extensionally equivalent. The meaning of `==` is rather weird.]
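
The contrast between the two equality operators is easiest to see on mutable values, where the behavior of `==` is fully specified. The sketch below assumes OCaml's standard `ref` cells, which these notes have not introduced yet. The names `r1`, `r2` and `raises_on_structural_equality` are illustrative only.

```ocaml
(* Two separate mutable cells that happen to hold the same contents. *)
let r1 = ref 0
let r2 = ref 0

let f x = x + 3

(* Structural equality on functions raises Invalid_argument;
   we catch the exception so the example can record that it happened. *)
let raises_on_structural_equality =
  try ignore (f = f); false
  with Invalid_argument _ -> true

let () =
  assert (r1 = r2);         (* same contents: structurally equal *)
  assert (not (r1 == r2));  (* but two distinct cells in memory *)
  assert (r1 == r1);        (* a cell is physically equal to itself *)
  assert raises_on_structural_equality
```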

Booleans in OCaml, and simple pattern matching
----------------------------------------------

Where we would write `true 1 2` in our pure lambda calculus and expect
it to evaluate to `1`, in OCaml boolean values are not functions
(equivalently, they're functions that take zero arguments). Instead, selection is
accomplished as follows:

    # if true then 1 else 2;;
    - : int = 1

The types of the `then` clause and of the `else` clause must be the
same.
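
For comparison, the lambda-calculus booleans can themselves be written in OCaml as ordinary two-argument functions. This is only a sketch of the encoding, using the made-up names `tru`, `fls` and `to_bool`; it is not how OCaml's built-in `bool` works.

```ocaml
(* Church-style booleans: each one selects one of its two arguments. *)
let tru x _y = x
let fls _x y = y

(* Convert an encoded boolean to a native bool by letting it choose. *)
let to_bool b = b true false

let () =
  assert (tru 1 2 = 1);   (* like `true 1 2` in the lambda calculus *)
  assert (fls 1 2 = 2);
  assert ((if true then 1 else 2) = tru 1 2);
  assert (to_bool tru && not (to_bool fls))
```

One difference is worth keeping in mind: because OCaml is eager, `tru a b` evaluates both `a` and `b` before selecting one, whereas `if` evaluates only the chosen branch.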

The `if` construction can be re-expressed by means of the following
pattern-matching expression:

    match <bool expression> with true -> <expression1> | false -> <expression2>

That is,

    # match true with true -> 1 | false -> 2;;
    - : int = 1

Compare with

    # match 3 with 1 -> 1 | 2 -> 4 | 3 -> 9;;
    - : int = 9
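
Unlike the boolean match, the integer match above does not cover every possible `int`, so OCaml accepts it only with a warning that the pattern matching is not exhaustive. Here is a sketch of how a catch-all `_` pattern closes the gap; the function name `small_square` and its fallback clause are invented for illustration.

```ocaml
let small_square n =
  match n with
  | 1 -> 1
  | 2 -> 4
  | 3 -> 9
  | _ -> n * n   (* wildcard: matches any other integer *)

let () =
  assert (small_square 3 = 9);
  assert (small_square 7 = 49)
```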

Unit and thunks
---------------

All functions in OCaml take exactly one argument.  Even this one:

    # let f x y = x + y;;
    # f 2 3;;
    - : int = 5

Here's how to tell that `f` has been curried:

    # f 2;;
    - : int -> int = <fun>

After we've given our `f` one argument, it returns a function that is
still waiting for another argument.
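
Spelled out, the two-argument definition is shorthand for nested one-argument functions. The sketch below shows the equivalence; the names `f_explicit` and `add2` are ours.

```ocaml
(* The two-argument definition from above. *)
let f x y = x + y

(* The same function with the currying made explicit:
   a function of x that returns a function of y. *)
let f_explicit = fun x -> (fun y -> x + y)

(* Partial application: supply only the first argument. *)
let add2 = f 2

let () =
  assert (f 2 3 = 5);
  assert (f_explicit 2 3 = 5);
  assert (add2 3 = 5);
  assert ((f 2) 3 = f 2 3)
```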

There is a special type in OCaml called `unit`.  There is exactly one
object in this type, written `()`.  So

    # ();;
    - : unit = ()

Just as you can define functions that take constants for arguments

    # let f 2 = 3;;
    # f 2;;
    - : int = 3

you can also define functions that take the unit as their argument, thus

    # let f () = 3;;
    val f : unit -> int = <fun>

Then the only argument you can possibly apply `f` to that is of the
correct type is the unit:

    # f ();;
    - : int = 3

Now why would that be useful?

Let's have some fun: think of `rec` as our `Y` combinator.  Then

    # let rec f n = if (0 = n) then 1 else (n * (f (n - 1)));;
    val f : int -> int = <fun>
    # f 5;;
    - : int = 120
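
To make the analogy a little more concrete, here is a sketch of a fixed-point combinator written in OCaml. It leans on `rec` itself, so it illustrates the idea rather than replacing `rec`. The names `fix` and `fact` are not from these notes.

```ocaml
(* A fixed-point combinator, itself defined with `rec`.
   The extra argument x delays the recursive unfolding, which
   keeps eager evaluation from looping forever. *)
let rec fix g x = g (fix g) x

(* Factorial written without `rec`: the recursion comes in
   through the `self` parameter that `fix` supplies. *)
let fact = fix (fun self n -> if n = 0 then 1 else n * self (n - 1))

let () = assert (fact 5 = 120)
```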

We can't define a function that is exactly analogous to our &omega;.
We could try `let rec omega x = x x;;`. What happens?

[Note: if you want to learn more OCaml, you might come back here someday and try:

        # let id x = x;;
        val id : 'a -> 'a = <fun>
        # let unwrap (`Wrap a) = a;;
        val unwrap : [< `Wrap of 'a ] -> 'a = <fun>
        # let omega ((`Wrap x) as y) = x y;;
        val omega : [< `Wrap of [> `Wrap of 'a ] -> 'b as 'a ] -> 'b = <fun>
        # unwrap (omega (`Wrap id)) == id;;
        - : bool = true
        # unwrap (omega (`Wrap omega));;
        <Infinite loop, need to control-c to interrupt>

But we won't try to explain this now.]

Even if we can't (easily) express omega in OCaml, we can do this:

    # let rec blackhole x = blackhole x;;

By the way, what's the type of this function?

If you then apply this `blackhole` function to an argument,

    # blackhole 3;;

the interpreter goes into an infinite loop, and you have to type control-c
to break the loop.

Oh, one more thing: lambda expressions look like this:

    # (fun x -> x);;
    - : 'a -> 'a = <fun>
    # (fun x -> x) true;;
    - : bool = true

(But `(fun x -> x x)` still won't work.)

You may also see this:

        # (function x -> x);;
        - : 'a -> 'a = <fun>

This works the same as `fun` in simple cases like this, and slightly differently in more complex cases. If you learn more OCaml, you'll read about the difference between them.
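
For the curious, here is a brief sketch of the main difference: `fun` can bind several parameters at once, while `function` takes exactly one argument and can pattern-match on it directly. The names `add`, `is_zero` and `is_zero_fun` are ours.

```ocaml
(* `fun` can bind several parameters in one go. *)
let add = fun x y -> x + y

(* `function` takes exactly one argument and can list
   several pattern-matching alternatives for it. *)
let is_zero = function
  | 0 -> true
  | _ -> false

(* The same thing written with `fun` needs an explicit `match`. *)
let is_zero_fun = fun n -> match n with 0 -> true | _ -> false

let () =
  assert (add 2 3 = 5);
  assert (is_zero 0 && not (is_zero 5));
  assert (is_zero_fun 0 = is_zero 0)
```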

We can try our usual tricks:

    # (fun x -> true) blackhole;;
    - : bool = true

OCaml declined to try to fully reduce the argument before applying the
lambda function. Question: Why is that? Didn't we say that OCaml is a call-by-value/eager language?

Remember that `blackhole` is a function too, so we can
reverse the roles of function and argument:

    # blackhole (fun x -> true);;

Infinite loop.

Now consider the following variations in behavior:

    # let test = blackhole blackhole;;
    <Infinite loop, need to control-c to interrupt>

    # let test () = blackhole blackhole;;
    val test : unit -> 'a = <fun>

    # test;;
    - : unit -> 'a = <fun>

    # test ();;
    <Infinite loop, need to control-c to interrupt>

We can use functions that take arguments of type unit to control
execution.  In Scheme parlance, functions on the unit type are called
*thunks* (which I've always assumed was a blend of "think" and "chunk").
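
As one illustration of controlling execution, here is a sketch of a home-made conditional whose branches are passed in as thunks. The name `lazy_if` and its parameter names are invented for this example.

```ocaml
let rec blackhole x = blackhole x

(* Both branches arrive as thunks; only the selected
   one is ever applied to (). *)
let lazy_if b then_thunk else_thunk =
  if b then then_thunk () else else_thunk ()

let () =
  (* The else-branch would diverge if it were run,
     but this call still returns 1. *)
  assert (lazy_if true (fun () -> 1) (fun () -> blackhole ()) = 1)
```

Had the branches been passed as plain values rather than thunks, the call would have to evaluate `blackhole ()` before `lazy_if` could even start.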

Question: why do thunks work? We know that `blackhole ()` doesn't terminate, so why do expressions like:

        let f = fun () -> blackhole ()
        in true

terminate?

Bottom type, divergence
-----------------------

Expressions that don't terminate all belong to the **bottom type**. This is a subtype of every other type. That is, anything of bottom type belongs to every other type as well. More advanced type systems have more examples of subtyping: for example, they might make `int` a subtype of `real`. But the core type system of OCaml doesn't have any general subtyping relations. (Neither does System F.) Just this one: that expressions of the bottom type also belong to every other type. It's as if every type definition in OCaml, even the built-in ones, had an implicit extra clause, so that the first definition below behaved like the second (which is not valid OCaml as written):

        type 'a option = None | Some of 'a;;
        type 'a option = None | Some of 'a | bottom;;

Here are some exercises that may help you better understand this. Figure out the type of each of the following:

        fun x y -> y;;

        fun x (y:int) -> y;;

        fun x y : int -> y;;

        let rec blackhole x = blackhole x in blackhole;;

        let rec blackhole x = blackhole x in blackhole 1;;

        let rec blackhole x = blackhole x in fun (y:int) -> blackhole y y y;;

        let rec blackhole x = blackhole x in (blackhole 1) + 2;;

        let rec blackhole x = blackhole x in (blackhole 1) || false;;

        let rec blackhole x = blackhole x in 2 :: (blackhole 1);;

        let rec blackhole (x:'a) : 'a = blackhole x in blackhole;;

Back to thunks: the reason you'd want to control evaluation with thunks is to
manipulate when "effects" happen. In a strongly normalizing system, like the
simply-typed lambda calculus or System F, there are no "effects." In Scheme and
OCaml, on the other hand, we can write programs that have effects. One sort of
effect is printing (think of the [[damn]] example at the start of term).
Another sort of effect is mutation, which we'll be looking at soon.
Continuations are yet another sort of effect. None of these are yet on the
table though. The only sort of effect we've got so far is *divergence* or
non-termination. So the only thing thunks are useful for so far is controlling
whether an expression that would diverge, if we tried to fully evaluate it,
actually does diverge. As we consider richer languages, thunks will become more useful.

Dividing by zero: Towards Monads
--------------------------------

The integer division operation presupposes that its second argument
(the divisor) is not zero, upon pain of presupposition failure.
Here's what my OCaml interpreter says:

    # 12/0;;
    Exception: Division_by_zero.

So we want to explicitly allow for the possibility that
division will return something other than a number.
We'll use OCaml's option type, which works like this:

    # type 'a option = None | Some of 'a;;
    # None;;
    - : 'a option = None
    # Some 3;;
    - : int option = Some 3

So if a division is normal, we return some number, but if the divisor is
zero, we return `None`. As a mnemonic aid, we'll append a `'` to the end of the name of our new divide function.

<pre>
let div' (x:int) (y:int) =
  match y with 0 -> None |
               _ -> Some (x / y);;

(*
val div' : int -> int -> int option = <fun>
# div' 12 3;;
- : int option = Some 4
# div' 12 0;;
- : int option = None
# div' (div' 12 3) 2;;
Characters 4-14:
  div' (div' 12 3) 2;;
      ^^^^^^^^^^
Error: This expression has type int option
       but an expression was expected of type int
*)
</pre>

This starts off well: dividing 12 by 3, no problem; dividing 12 by 0,
just the behavior we were hoping for.  But we want to be able to use
the output of the safe-division function as input for further division
operations.  So we have to jack up the types of the inputs:

<pre>
let div' (x:int option) (y:int option) =
  match y with None -> None |
               Some 0 -> None |
               Some n -> (match x with None -> None |
                                       Some m -> Some (m / n));;

(*
val div' : int option -> int option -> int option = <fun>
# div' (Some 12) (Some 4);;
- : int option = Some 3
# div' (Some 12) (Some 0);;
- : int option = None
# div' (div' (Some 12) (Some 0)) (Some 4);;
- : int option = None
*)
</pre>

Beautiful, just what we need: now we can try to divide by anything we
want, without fear that we're going to trigger any system errors.

I prefer to line up the `match` alternatives by using OCaml's
built-in tuple type:

<pre>
let div' (x:int option) (y:int option) =
  match (x, y) with (None, _) -> None |
                    (_, None) -> None |
                    (_, Some 0) -> None |
                    (Some m, Some n) -> Some (m / n);;
</pre>

So far so good.  But what if we want to combine division with
other arithmetic operations?  We need to make those other operations
aware of the possibility that one of their arguments will trigger a
presupposition failure:

<pre>
let add' (x:int option) (y:int option) =
  match (x, y) with (None, _) -> None |
                    (_, None) -> None |
                    (Some m, Some n) -> Some (m + n);;

(*
val add' : int option -> int option -> int option = <fun>
# add' (Some 12) (Some 4);;
- : int option = Some 16
# add' (div' (Some 12) (Some 0)) (Some 4);;
- : int option = None
*)
</pre>

This works, but is somewhat disappointing: the `add'` operation
doesn't trigger any presupposition of its own, so it is a shame that
it needs to be adjusted because someone else might make trouble.

But we can automate the adjustment.  The standard way in OCaml,
Haskell, etc., is to define a `bind` operator (the name `bind` is not
well chosen to resonate with linguists, but what can you do). To continue our mnemonic association, we'll put a `'` after the name "bind" as well.

<pre>
let bind' (x: int option) (f: int -> (int option)) =
  match x with None -> None |
               Some n -> f n;;

let add' (x: int option) (y: int option)  =
  bind' x (fun x -> bind' y (fun y -> Some (x + y)));;

let div' (x: int option) (y: int option) =
  bind' x (fun x -> bind' y (fun y -> if (0 = y) then None else Some (x / y)));;

(*
#  div' (div' (Some 12) (Some 2)) (Some 4);;
- : int option = Some 1
#  div' (div' (Some 12) (Some 0)) (Some 4);;
- : int option = None
# add' (div' (Some 12) (Some 0)) (Some 4);;
- : int option = None
*)
</pre>

Compare the new definitions of `add'` and `div'` closely: the definition
for `add'` shows what it looks like to equip an ordinary operation to
survive in a dangerous presupposition-filled world.  Note that the new
definition of `add'` does not need to test whether its arguments are
`None` objects or genuine numbers; those details are hidden inside of the
`bind'` function.
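
Since the body of `add'` never mentions anything specific to addition except `+`, the same adjustment can be packaged once and reused. What follows is a sketch under that observation. The helper `lift2'` and the example `mul'` are our own names, not part of these notes, and `bind'` is repeated (in a different layout) so that the block runs on its own.

```ocaml
let bind' (x : int option) (f : int -> int option) =
  match x with
  | None -> None
  | Some n -> f n

(* Lift any total binary operation on ints to int options:
   if either argument is None, so is the result. *)
let lift2' (op : int -> int -> int) (x : int option) (y : int option) =
  bind' x (fun m -> bind' y (fun n -> Some (op m n)))

let add' = lift2' ( + )
let mul' = lift2' ( * )

let () =
  assert (add' (Some 12) (Some 4) = Some 16);
  assert (mul' (Some 3) (Some 4) = Some 12);
  assert (mul' None (Some 4) = None)
```

This helper only suits operations with no presupposition of their own. Division cannot be defined this way, because it has to be able to return `None` even when both of its arguments are numbers.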

The definition of `div'` shows exactly what extra needs to be said in
order to trigger the no-division-by-zero presupposition.
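
The same recipe carries over to any other operation with a presupposition of its own. As a sketch, here is a remainder operation, which also requires a nonzero divisor. The name `rem'` is invented for this example, and `bind'` and `add'` are repeated so that the block is self-contained.

```ocaml
let bind' (x : int option) (f : int -> int option) =
  match x with
  | None -> None
  | Some n -> f n

let add' x y = bind' x (fun x -> bind' y (fun y -> Some (x + y)))

(* Remainder has the same presupposition as division:
   the divisor must not be zero. *)
let rem' x y =
  bind' x (fun x -> bind' y (fun y -> if y = 0 then None else Some (x mod y)))

let () =
  assert (rem' (Some 12) (Some 5) = Some 2);
  assert (rem' (Some 12) (Some 0) = None);
  (* The failure propagates through operations that never fail themselves. *)
  assert (add' (rem' (Some 12) (Some 0)) (Some 4) = None)
```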

For linguists: this is a complete theory of a particularly simple form
of presupposition projection (every predicate is a hole).