week4.mdwn

   1 [[!toc]]
   2
   3 #Q: How do you know that every term in the untyped lambda calculus has a fixed point?#
   4
   5 A: That's easy: let `T` be an arbitrary term in the lambda calculus.  If
   6 `T` has a fixed point, then there exists some `X` such that `X <~~>
   7 TX` (that's what it means to *have* a fixed point).
   8
   9 <pre><code>let L = \x. T (x x) in
  10 let X = L L in
  11 X &equiv; L L &equiv; (\x. T (x x)) L ~~> T (L L) &equiv; T X
  12 </code></pre>
  13
  14 Please slow down and make sure that you understand what justified each
  15 of the equalities in the last line.
  16
  17 #Q: How do you know that for any term `T`, `Y T` is a fixed point of `T`?#
  18
  19 A: Note that in the proof given in the previous answer, we chose `T`
  20 and then set `X = L L = (\x. T (x x)) (\x. T (x x))`.  If we abstract over
  21 `T`, we get the Y combinator, `\T. (\x. T (x x)) (\x. T (x x))`.  No matter
  22 what argument `T` we feed `Y`, it returns some `X` that is a fixed point
  23 of `T`, by the reasoning in the previous answer.
  24
  25 #Q: So if every term has a fixed point, even `Y` has fixed point.#
  26
  27 A: Right:
  28
  29 <pre><code>let Y = \T. (\x. T (x x)) (\x. T (x x)) in
  30 Y Y &equiv; \T. (\x. T (x x)) (\x. T (x x)) Y
  31 ~~> (\x. Y (x x)) (\x. Y (x x))
  32 ~~> Y ((\x. Y (x x)) (\x. Y (x x)))
  33 ~~> Y (Y ((\x. Y (x x)) (\x. Y (x x))))
  34 ~~> Y (Y (Y (...(Y (Y Y))...)))</code></pre>
  35
  36
  37 #Q: Ouch!  Stop hurting my brain.#
  38
  39 A: Is that a question?
  40
  41 Let's come at it from the direction of arithmetic.  Recall that we
  42 claimed that even `succ`---the function that added one to any
  43 number---had a fixed point.  How could there be an X such that X = X+1?
  44 That would imply that
  45
  46     X <~~> succ X <~~> succ (succ X) <~~> succ (succ (succ (X))) <~~> succ (... (succ X)...)
  47
  48 In other words, the fixed point of `succ` is a term that is its own
  49 successor.  Let's just check that `X = succ X`:
  50
  51 <pre><code>let succ = \n s z. s (n s z) in
  52 let X = (\x. succ (x x)) (\x. succ (x x)) in
  53 succ X
  54 &equiv; succ ( (\x. succ (x x)) (\x. succ (x x)) )
  55 ~~> succ (succ ( (\x. succ (x x)) (\x. succ (x x))))
  56 &equiv; succ (succ X)
  57 </code></pre>
  58
  59 You should see the close similarity with `Y Y` here.
  60
  61
  62 #Q. So `Y` applied to `succ` returns a number that is not finite!#
  63
  64 A. Yes!  Let's see why it makes sense to think of `Y succ` as a Church
  65 numeral:
  66
  67         [same definitions]
  68         succ X
  69         = (\n s z. s (n s z)) X
  70         = \s z. s (X s z)
  71         = succ (\s z. s (X s z)) ; using fixed-point reasoning
  72         = \s z. s ([succ (\s z. s (X s z))] s z)
  73         = \s z. s ([\s z. s ([succ (\s z. s (X s z))] s z)] s z)
  74         = \s z. s (s (succ (\s z. s (X s z))))
  75
  76 So `succ X` looks like a numeral: it takes two arguments, `s` and `z`,
  77 and returns a sequence of nested applications of `s`...
  78
  79 You should be able to prove that `add 2 (Y succ) <~~> Y succ`,
  80 likewise for `mult`, `minus`, `pow`.  What happens if we try `minus (Y
  81 succ)(Y succ)`?  What would you expect infinity minus infinity to be?
  82 (Hint: choose your evaluation strategy so that you add two `s`s to the
  83 first number for every `s` that you add to the second number.)
  84
  85 This is amazing, by the way: we're proving things about a term that
  86 represents arithmetic infinity.
  87
  88 It's important to bear in mind the simplest term in question is not
  89 infinite:
  90
  91         Y succ = (\f. (\x. f (x x)) (\x. f (x x))) (\n s z. s (n s z))
  92
  93 The way that infinity enters into the picture is that this term has
  94 no normal form: no matter how many times we perform beta reduction,
  95 there will always be an opportunity for more beta reduction.  (Lather,
  96 rinse, repeat!)
  97
  98
  99 #Q. That reminds me, what about [[evaluation order]]?#
 100
 101 A. For a recursive function that has a well-behaved base case, such as
 102 the factorial function, evaluation order is crucial.  In the following
 103 computation, we will arrive at a normal form.  Watch for the moment at
 104 which we have to make a choice about which beta reduction to perform
 105 next: one choice leads to a normal form, the other choice leads to
 106 endless reduction:
 107
 108         let prefac = \f n. iszero n 1 (mult n (f (pred n))) in
 109         let fac = Y prefac in
 110         fac 2
 111            = [(\f.(\x.f(xx))(\x.f(xx))) prefac] 2
 112            = [(\x.prefac(xx))(\x.prefac(xx))] 2
 113            = [prefac((\x.prefac(xx))(\x.prefac(xx)))] 2
 114            = [prefac(prefac((\x.prefac(xx))(\x.prefac(xx))))] 2
 115            = [(\f n. iszero n 1 (mult n (f (pred n))))
 116                   (prefac((\x.prefac(xx))(\x.prefac(xx))))] 2
 117            = [\n. iszero n 1 (mult n ([prefac((\x.prefac(xx))(\x.prefac(xx)))] (pred n)))] 2
 118            = iszero 2 1 (mult 2 ([prefac((\x.prefac(xx))(\x.prefac(xx)))] (pred 2)))
 119            = mult 2 ([prefac((\x.prefac(xx))(\x.prefac(xx)))] 1)
 120            ...
 121            = mult 2 (mult 1 ([prefac((\x.prefac(xx))(\x.prefac(xx)))] 0))
 122            = mult 2 (mult 1 (iszero 0 1 ([prefac((\x.prefac(xx))(\x.prefac(xx)))] (pred 0))))
 123            = mult 2 (mult 1 1)
 124            = mult 2 1
 125            = 2
 126
 127 The crucial step is the third from the last.  We have our choice of
 128 either evaluating the test `iszero 0 1 ...`, which evaluates to `1`,
 129 no matter what the ... contains;
 130 or we can evaluate the `Y` pump, `(\x.prefac(xx))(\x.prefac(xx))`, to
 131 produce another copy of `prefac`.  If we postpone evaluting the
 132 `iszero` test, we'll pump out copy after copy of `prefac`, and never
 133 realize that we've bottomed out in the recursion.  But if we adopt a
 134 leftmost/call-by-name/normal-order evaluation strategy, we'll always
 135 start with the `iszero` predicate, and only produce a fresh copy of
 136 `prefac` if we are forced to.
 137