From: Chris Barker
Date: Sat, 2 Oct 2010 19:20:32 +0000 (-0400)
Subject: added notes for week4
X-Git-Url: http://lambda.jimpryor.net/git/gitweb.cgi?p=lambda.git;a=commitdiff_plain;h=76452415c955f62828e6c81370c9e39eb6d5b28f

added notes for week4
---
diff --git a/week4.mdwn b/week4.mdwn
new file mode 100644
index 00000000..10ab8b99
--- /dev/null
+++ b/week4.mdwn
@@ -0,0 +1,157 @@

[[!toc]]

Q: How do you know that every term in the untyped lambda calculus has
a fixed point?

A: That's easy: let `T` be an arbitrary term in the lambda calculus.
For `T` to *have* a fixed point means that there is some `X` such that
`X <~~> TX`. Here is how to construct such an `X`:

    let W = \x.T(xx) in
    let X = WW in
    X = WW = (\x.T(xx))W = T(WW) = TX

Q: How do you know that for any term `T`, `Y T` is a fixed point of `T`?

A: Note that in the proof given in the previous answer, we chose `T`
and then set `X = WW = (\x.T(xx))(\x.T(xx))`. If we abstract over
`T`, we get the Y combinator, `\T.(\x.T(xx))(\x.T(xx))`. No matter
what argument `T` we feed `Y`, it returns some `X` that is a fixed
point of `T`, by the reasoning in the previous answer.

Q: So if every term has a fixed point, even `Y` has a fixed point.

A: Right:

    let Y = \T.(\x.T(xx))(\x.T(xx)) in
    Y Y = [\T.(\x.T(xx))(\x.T(xx))] Y
        = (\x.Y(xx))(\x.Y(xx))
        = Y((\x.Y(xx))(\x.Y(xx)))
        = Y(Y((\x.Y(xx))(\x.Y(xx))))
        = Y(Y(Y(...(Y(YY))...)))

Q: Ouch! Stop hurting my brain.

A: Let's come at it from the direction of arithmetic. Recall that we
claimed that even `succ`---the function that added one to any
number---had a fixed point. How could there be an `X` such that
`X = X+1`? Suppose there were such an `X`; then

    X = succ X = succ (succ X) = succ (succ (succ X)) = succ (... (succ X) ...)

In other words, the fixed point of `succ` is a term that is its own
successor. Let's just check that `X = succ X`:

    let succ = \n s z. s (n s z) in
    let X = (\x.succ(xx))(\x.succ(xx)) in
    X = (\x.succ(xx))(\x.succ(xx))
      = succ ((\x.succ(xx))(\x.succ(xx)))
      = succ X

You should see the close similarity with `Y Y` here.

Q: So `Y` applied to `succ` returns infinity!

A: Yes! Let's see why it makes sense to think of `Y succ` as a Church
numeral:

    [same definitions]
    succ X
    = (\n s z. s (n s z)) X
    = \s z. s (X s z)
    = succ (\s z. s (X s z)) ; using fixed-point reasoning
    = \s z. s ([succ (\s z. s (X s z))] s z)
    = \s z. s ([\s z. s ([succ (\s z. s (X s z))] s z)] s z)
    = \s z. s (s ([succ (\s z. s (X s z))] s z))

So `succ X` looks like a numeral: it takes two arguments, `s` and `z`,
and returns a sequence of nested applications of `s`...

You should be able to prove that `add 2 (Y succ) <~~> Y succ`, and
likewise for `mult`, `minus`, and `pow`. What happens if we try
`minus (Y succ) (Y succ)`? What would you expect infinity minus
infinity to be? (Hint: choose your evaluation strategy so that you
add two `s`s to the first number for every `s` that you add to the
second number.)

This is amazing, by the way: we're proving things about a term that
represents arithmetic infinity. It's important to bear in mind that
the term in question is itself not infinite:

    Y succ = (\f.(\x.f(xx))(\x.f(xx)))(\n s z. s (n s z))

The way that infinity enters into the picture is that this term has
no normal form: no matter how many times we perform beta reduction,
there will always be an opportunity for more beta reduction. (Lather,
rinse, repeat!)
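If you want to experiment with these claims, here is a sketch of the
Church arithmetic above in Python, whose lambdas are untyped like
ours. The spellings of `zero`, `add`, `mult`, and the decoding helper
`to_int` are the standard Church encodings, supplied here for
illustration; they are not definitions from these notes. (Don't try
to build `Y succ` this way: Python evaluates arguments eagerly, so
the self-application would loop forever. That is exactly the
evaluation-order issue taken up next.)

    # Church numerals in Python: the numeral n applies its first argument n times.
    zero = lambda s: lambda z: z
    succ = lambda n: lambda s: lambda z: s(n(s)(z))   # \n s z. s (n s z)
    add  = lambda m: lambda n: lambda s: lambda z: m(s)(n(s)(z))
    mult = lambda m: lambda n: lambda s: m(n(s))

    # Decoding helper (not in the notes): read a Church numeral back as an int.
    to_int = lambda n: n(lambda x: x + 1)(0)

    two   = succ(succ(zero))
    three = succ(two)
    assert to_int(succ(two)) == 3          # succ 2 <~~> 3
    assert to_int(add(two)(three)) == 5    # add 2 3 <~~> 5
    assert to_int(mult(two)(three)) == 6   # mult 2 3 <~~> 6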
Q: That reminds me, what about [[evaluation order]]?

A: For a recursive function that has a well-behaved base case, such as
the factorial function, evaluation order is crucial. In the following
computation, we will arrive at a normal form. Watch for the moment at
which we have to make a choice about which beta reduction to perform
next: one choice leads to a normal form, the other choice leads to
endless reduction:

    let prefac = \f n. isZero n 1 (mult n (f (pred n))) in
    let fac = Y prefac in
    fac 2
    = [(\f.(\x.f(xx))(\x.f(xx))) prefac] 2
    = [(\x.prefac(xx))(\x.prefac(xx))] 2
    = [prefac((\x.prefac(xx))(\x.prefac(xx)))] 2
    = [prefac(prefac((\x.prefac(xx))(\x.prefac(xx))))] 2
    = [(\f n. isZero n 1 (mult n (f (pred n))))
       (prefac((\x.prefac(xx))(\x.prefac(xx))))] 2
    = [\n. isZero n 1 (mult n ([prefac((\x.prefac(xx))(\x.prefac(xx)))] (pred n)))] 2
    = isZero 2 1 (mult 2 ([prefac((\x.prefac(xx))(\x.prefac(xx)))] (pred 2)))
    = mult 2 ([prefac((\x.prefac(xx))(\x.prefac(xx)))] 1)
    ...
    = mult 2 (mult 1 ([prefac((\x.prefac(xx))(\x.prefac(xx)))] 0))
    = mult 2 (mult 1 (isZero 0 1 ([prefac((\x.prefac(xx))(\x.prefac(xx)))] (pred 0))))
    = mult 2 (mult 1 1)
    = mult 2 1
    = 2

The crucial step is the third from the last. There we have our choice
of either evaluating the test `isZero 0 1 ...`, which evaluates to
`1`, or evaluating the `Y` pump, `(\x.prefac(xx))(\x.prefac(xx))`, to
produce another copy of `prefac`. If we postpone evaluating the
`isZero` test, we'll pump out copy after copy of `prefac`, and never
realize that we've bottomed out in the recursion. But if we adopt a
leftmost / call-by-name / normal-order evaluation strategy, we'll
always start with the `isZero` predicate, and only produce a fresh
copy of `prefac` if we are forced to.

Q: You claimed that the Ackermann function couldn't be implemented
using our primitive recursion techniques (such as the techniques that
allow us to define addition and multiplication). But you haven't yet
shown that it is possible to define the Ackermann function using full
recursion.

A: OK:
    A(m,n) =
        | when m == 0 -> n + 1
        | else when n == 0 -> A(m-1,1)
        | else -> A(m-1, A(m,n-1))

    let A = Y (\A m n. isZero m (succ n) (isZero n (A (pred m) 1) (A (pred m) (A m (pred n))))) in
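As a cross-check, here is a sketch of the same construction in
Python. Because Python is call-by-value, `Y` as defined above would
loop before ever calling its argument, so the sketch uses the
eta-expanded variant `Z = \f.(\x.f(\v.xxv))(\x.f(\v.xxv))`, which
delays the self-application; it also substitutes native integers for
`isZero`/`succ`/`pred`, purely for readability.

    # Z: an eta-expanded fixed-point combinator, safe under call-by-value.
    Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

    # Ackermann via Z, mirroring the lambda term above.
    A = Z(lambda a: lambda m: lambda n:
          n + 1 if m == 0
          else a(m - 1)(1) if n == 0
          else a(m - 1)(a(m)(n - 1)))

    assert A(1)(2) == 4   # matches the trace below
    assert A(2)(3) == 9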
For instance,

    A 1 2
    = A 0 (A 1 1)
    = A 0 (A 0 (A 1 0))
    = A 0 (A 0 (A 0 1))
    = A 0 (A 0 2)
    = A 0 3
    = 4

`A 1 x` is to `A 0 x` as addition is to the successor function;
`A 2 x` is to `A 1 x` as multiplication is to addition;
`A 3 x` is to `A 2 x` as exponentiation is to multiplication---so
`A 4 x` is to `A 3 x` as super-exponentiation is to exponentiation...
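To put numbers behind these analogies, we can spot-check the
well-known closed forms for the first few rows of the Ackermann
function (standard facts, not derived in these notes) against the
Python `A` sketched above:

    # Known closed forms for small m:
    #   A(1,x) = x + 2          (one step past succ: addition-like)
    #   A(2,x) = 2x + 3         (multiplication-like)
    #   A(3,x) = 2^(x+3) - 3    (exponentiation-like)
    for x in range(4):
        assert A(1)(x) == x + 2
        assert A(2)(x) == 2 * x + 3
        assert A(3)(x) == 2 ** (x + 3) - 3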