--- /dev/null
+[[!toc]]
+
+Q: How do you know that every term in the untyped lambda calculus has
+a fixed point?
+
+A: That's easy: let `T` be an arbitrary term in the lambda calculus. If
+`T` has a fixed point, then there exists some `X` such that `X <~~>
+TX` (that's what it means to *have* a fixed point).
+
+ let W = \x.T(xx) in
+ let X = WW in
+ X = WW = (\x.T(xx))W = T(WW) = TX
+
+Q: How do you know that for any term T, YT is a fixed point of T?
+
+A: Note that in the proof given in the previous answer, we chose `T`
+and then set `X = WW = (\x.T(xx))(\x.T(xx))`. If we abstract over
+`T`, we get the Y combinator, `\T.(\x.T(xx))(\x.T(xx))`. No matter
+what argument `T` we feed Y, it returns some `X` that is a fixed point
+of `T`, by the reasoning in the previous answer.
+
+Q: So if every term has a fixed point, even Y has fixed point.
+
+A: Right:
+
+ let Y = \T.(\x.T(xx))(\x.T(xx)) in
+ Y Y = \T.(\x.T(xx))(\x.T(xx)) Y
+ = (\x.Y(xx))(\x.Y(xx))
+ = Y((\x.Y(xx))(\x.Y(xx)))
+ = Y(Y((\x.Y(xx))(\x.Y(xx))))
+ = Y(Y(Y(...(Y(YY))...)))
+
+Q: Ouch! Stop hurting my brain.
+
+A: Let's come at it from the direction of arithmetic. Recall that we
+claimed that even `succ`---the function that added one to any
+number---had a fixed point. How could there be an X such that X = X+1?
+Then
+
+ X = succ X = succ (succ X) = succ (succ (succ (X))) = succ (... (succ X)...)
+
+In other words, the fixed point of `succ` is a term that is its own
+successor. Let's just check that `X = succ X`:
+
+ let succ = \n s z. s (n s z) in
+ let X = (\x.succ(xx))(\x.succ(xx)) in
+ succ X
+ = succ ((\x.succ(xx))(\x.succ(xx)))
+ = succ (succ ((\x.succ(xx))(\x.succ(xx))))
+ = succ (succ X)
+
+You should see the close similarity with YY here.
+
+Q. So `Y` applied to `succ` returns infinity!
+
+A. Yes! Let's see why it makes sense to think of `Y succ` as a Church
+numeral:
+
+ [same definitions]
+ succ X
+ = (\n s z. s (n s z)) X
+ = \s z. s (X s z)
+ = succ (\s z. s (X s z)) ; using fixed-point reasoning
+ = \s z. s ([succ (\s z. s (X s z))] s z)
+ = \s z. s ([\s z. s ([succ (\s z. s (X s z))] s z)] s z)
+ = \s z. s (s (succ (\s z. s (X s z))))
+
+So `succ X` looks like a numeral: it takes two arguments, `s` and `z`,
+and returns a sequence of nested applications of `s`...
+
+You should be able to prove that `add 2 (Y succ) <~~> Y succ`,
+likewise for `mult`, `minus`, `pow`. What happens if we try `minus (Y
+succ)(Y succ)`? What would you expect infinity minus infinity to be?
+(Hint: choose your evaluation strategy so that you add two `s`s to the
+first number for every `s` that you add to the second number.)
+
+This is amazing, by the way: we're proving things about a term that
+represents arithmetic infinity. It's important to bear in mind the
+simplest term in question is not infinite:
+
+ Y succ = (\f.(\x.f(xx))(\x.f(xx)))(\n s z. s (n s z))
+
+The way that infinity enters into the picture is that this term has
+no normal form: no matter how many times we perform beta reduction,
+there will always be an opportunity for more beta reduction. (Lather,
+rinse, repeat!)
+
+Q. That reminds me, what about [[evaluation order]]?
+
+A. For a recursive function that has a well-behaved base case, such as
+the factorial function, evaluation order is crucial. In the following
+computation, we will arrive at a normal form. Watch for the moment at
+which we have to make a choice about which beta reduction to perform
+next: one choice leads to a normal form, the other choice leads to
+endless reduction:
+
+ let prefac = \f n. isZero n 1 (mult n (f (pred n))) in
+ let fac = Y prefac in
+ fac 2
+ = [(\f.(\x.f(xx))(\x.f(xx))) prefac] 2
+ = [(\x.prefac(xx))(\x.prefac(xx))] 2
+ = [prefac((\x.prefac(xx))(\x.prefac(xx)))] 2
+ = [prefac(prefac((\x.prefac(xx))(\x.prefac(xx))))] 2
+ = [(\f n. isZero n 1 (mult n (f (pred n))))
+ (prefac((\x.prefac(xx))(\x.prefac(xx))))] 2
+ = [\n. isZero n 1 (mult n ([prefac((\x.prefac(xx))(\x.prefac(xx)))] (pred n)))] 2
+ = isZero 2 1 (mult 2 ([prefac((\x.prefac(xx))(\x.prefac(xx)))] (pred 2)))
+ = mult 2 ([prefac((\x.prefac(xx))(\x.prefac(xx)))] 1)
+ ...
+ = mult 2 (mult 1 ([prefac((\x.prefac(xx))(\x.prefac(xx)))] 0))
+ = mult 2 (mult 1 (isZero 0 1 ([prefac((\x.prefac(xx))(\x.prefac(xx)))] (pred 0))))
+ = mult 2 (mult 1 1)
+ = mult 2 1
+ = 2
+
+The crucial step is the third from the last. We have our choice of
+either evaluating the test `isZero 0 1 ...`, which evaluates to `1`,
+or we can evaluate the `Y` pump, `(\x.prefac(xx))(\x.prefac(xx))`, to
+produce another copy of `prefac`. If we postpone evaluting the
+`isZero` test, we'll pump out copy after copy of `prefac`, and never
+realize that we've bottomed out in the recursion. But if we adopt a
+leftmost/call by name/normal order evaluation strategy, we'll always
+start with the isZero predicate, and only produce a fresh copy of
+`prefac` if we are forced to.
+
+Q. You claimed that the Ackerman function couldn't be implemented
+using our primitive recursion techniques (such as the techniques that
+allow us to define addition and multiplication). But you haven't
+shown that it is possible to define the Ackerman function using full
+recursion.
+
+A. OK:
+
+<pre>
+A(m,n) =
+ | when m == 0 -> n + 1
+ | else when n == 0 -> A(m-1,1)
+ | else -> A(m-1, A(m,n-1))
+
+let A = Y (\A m n. isZero m (succ n) (isZero n (A (pred m) 1) (A (pred m) (A m (pred n))))) in
+</pre>
+
+For instance,
+
+ A 1 2
+ = A 0 (A 1 1)
+ = A 0 (A 0 (A 1 0))
+ = A 0 (A 0 (A 0 1))
+ = A 0 (A 0 2)
+ = A 0 3
+ = 4
+
+A 1 x is to A 0 x as addition is to the successor function;
+A 2 x is to A 1 x as multiplication is to addition;
+A 3 x is to A 2 x as exponentiation is to multiplication---
+so A 4 x is to A 3 x as super-exponentiation is to exponentiation...
+