or `\m y n. m n y`, which is the `C` combinator. So in such a
context, (2) might denote
- Y C
- (\h. (\u. h (u u)) (\u. h (u u))) C
- (\u. C (u u)) (\u. C (u u)))
- C ((\u. C (u u)) (\u. C (u u)))
- C (C ((\u. C (u u)) (\u. C (u u))))
- C (C (C ((\u. C (u u)) (\u. C (u u)))))
+ Y C ≡
+ (\h. (\u. h (u u)) (\u. h (u u))) C ~~>
+ (\u. C (u u)) (\u. C (u u)) ~~>
+ C ((\u. C (u u)) (\u. C (u u))) ~~>
+ C (C ((\u. C (u u)) (\u. C (u u)))) ~~>
+ C (C (C ((\u. C (u u)) (\u. C (u u))))) ~~>
...
-And infinite sequence of `C`s, each one negating the remainder of the
+An infinite sequence of `C`s, each one negating the remainder of the
sequence. Yep, that feels like a reasonable representation of the
liar paradox.
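+
+To see why it's fair to read `C` as negation here, consider its effect on
+the booleans (assuming the usual encodings `true ≡ \y n. y` and
+`false ≡ \y n. n`):
+
+ C true ≡ (\m y n. m n y) true ~~> \y n. true n y ~~> \y n. n ≡ false
+ C false ≡ (\m y n. m n y) false ~~> \y n. false n y ~~> \y n. y ≡ true
+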
A: Right:
- let Y = \N. (\u. N (u u)) (\u. N (u u)) in
- Y Y
- ≡ \N. (\u. N (u u)) (\u. N (u u)) Y
- ~~> (\u. Y (u u)) (\u. Y (u u))
- ~~> Y ((\u. Y (u u)) (\u. Y (u u)))
- ~~> Y ( Y ((\u. Y (u u)) (\u. Y (u u))))
- ~~> Y (Y (Y (...(Y (Y Y))...)))
+ let Y = \h. (\u. h (u u)) (\u. h (u u)) in
+ Y Y ≡
+ (\h. (\u. h (u u)) (\u. h (u u))) Y ~~>
+ (\u. Y (u u)) (\u. Y (u u)) ~~>
+ Y ((\u. Y (u u)) (\u. Y (u u))) ~~>
+ Y (Y ((\u. Y (u u)) (\u. Y (u u)))) <~~>
+ Y (Y (Y (...(Y (Y Y))...)))
A: Is that a question?
Let's come at it from the direction of arithmetic. Recall that we
-claimed that even `succ`---the function that added one to any
-number---had a fixed point. How could there be an `ξ` such that `ξ <~~> succ ξ`?
+claimed that even `succ` --- the function that added one to any
+number --- had a fixed point. How could there be a `ξ` such that `ξ <~~> succ ξ`?
That would imply that
ξ <~~> succ ξ <~~> succ (succ ξ) <~~> succ (succ (succ ξ)) <~~> succ (...(succ ξ)...)
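+
+In fact, `Y` supplies such a term. Here is a sketch, taking `ξ ≡ Y succ`,
+with `Y` as defined above:
+
+ Y succ ≡
+ (\h. (\u. h (u u)) (\u. h (u u))) succ ~~>
+ (\u. succ (u u)) (\u. succ (u u)) ~~>
+ succ ((\u. succ (u u)) (\u. succ (u u))) <~~>
+ succ (Y succ)
+
+So `Y succ <~~> succ (Y succ)`, which is exactly the behavior demanded of `ξ`.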
`A 1 x` is to `A 0 x` as addition is to the successor function;
`A 2 x` is to `A 1 x` as multiplication is to addition;
-`A 3 x` is to `A 2 x` as exponentiation is to multiplication---
+`A 3 x` is to `A 2 x` as exponentiation is to multiplication ---
so `A 4 x` is to `A 3 x` as hyper-exponentiation is to exponentiation...
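+
+For concreteness, here is one standard presentation of such an `A` (the
+Ackermann-Péter function); we're assuming this is the shape of the `A`
+in question:
+
+ A 0 x = x + 1
+ A (n+1) 0 = A n 1
+ A (n+1) (x+1) = A n (A (n+1) x)
+
+Unwinding the equations gives `A 1 x = x + 2`, `A 2 x = 2x + 3`, and
+`A 3 x = 2^(x+3) - 3`: each level iterates the level below it, just as
+described.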
## Q: What other questions should I be asking? ##
-* What is it about the variant fixed-point combinators that makes
- them compatible with a call-by-value evaluation strategy?
+* What is it about the "primed" fixed-point combinators `Θ′` and `Y′` that
+ makes them compatible with a call-by-value evaluation strategy?
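+  (One standard definition of `Y′` is sketched just after this list.)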
+
+* What *exactly* is primitive recursion?
* How do you know that the Ackermann function can't be computed
using primitive recursion techniques?
-* What *exactly* is primitive recursion?
-
* I hear that `Y` delivers the/a *least* fixed point. Least
according to what ordering? How do you know it's least?
Is leastness important?
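+
+Here is that sketch of `Y′`, for reference while thinking about the first
+question (we're assuming the usual eta-expanded variant; `Θ′` is modified
+in the same way):
+
+ Y′ ≡ \h. (\u. h (\v. u u v)) (\u. h (\v. u u v))
+
+The subterm `\v. u u v` is an eta-expansion of `u u`; since it's an
+abstraction, a call-by-value evaluator treats it as a value instead of
+trying to reduce `u u` first.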
+
+## Q: I still don't fully understand the Y combinator. Can you explain it in a different way? ##
+
+Sure! Here is another way to derive `Y`. We'll start by choosing a
+specific goal, and at each decision point, we'll make a reasonable
+guess. The guesses will all turn out to be lucky, and we'll arrive at
+a fixed-point combinator.
+
+Given an arbitrary term `h`, we want to find a fixed point `X` such that:
+
+ X <~~> h X
+
+Our strategy will be to seek an `X` such that `X ~~> h X` (this is just a guess). Because `X` and
+`h X` are syntactically different, the only way that `X` can reduce to `h X` is if `X`
+contains at least one redex. The simplest way to satisfy this
+constraint would be for the fixed point to itself be a redex (again, just a guess):
+
+ X ≡ ((\u. M) N) ~~> h X
+
+The result of beta reduction on this redex will be `M` with some
+substitutions. We know that after these substitutions, `M` will have
+the form `h X`, since that is what the reduction arrow tells us. So we
+can refine the picture as follows:
+
+ X ≡ ((\u. h (___)) N) ~~> h X
+
+Here, the `___` has to be something that reduces to the fixed point `X`.
+It's natural to assume that there will be at least one occurrence of
+`u` in the body of the head abstract:
+
+ X ≡ ((\u. h (__u__)) N) ~~> h X
+
+After reduction of the redex, we're going to have
+
+ X ~~> h (__N__) ~~> h X
+
+Apparently, `__N__` will have to reduce to `X`. Therefore we should
+choose a skeleton for `N` that is consistent with what we have decided
+so far about the internal structure of `X`. We might like for `N` to
+syntactically match the whole of `X`, but this would require `N` to contain itself as
+a subpart. So we'll settle for the more modest assumption (or guess) that `N`
+matches the head of `X`:
+
+ X ≡ ((\u. h (__u__)) (\u. h (__u__))) ~~> h X
+
+At this point, we've derived a skeleton for `X` in which it contains two
+halves that are so far identical. We'll guess that the halves are exactly
+identical. Note that at the point at which we perform the first
+reduction, `u` will get bound to `N`, which now corresponds to a term
+representing one of the halves of `X`. So in order to produce a full `X`,
+we simply make a second copy of `u`:
+
+ X ≡ ((\u. h (u u)) (\u. h (u u)))
+ ~~> h ((\u. h (u u)) (\u. h (u u)))
+ ≡ h X
+
+Success.
+
+So the function `\h. (\u. h (u u)) (\u. h (u u))` maps an arbitrary term
+`h` to a fixed point for `h`.
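+
+As a sanity check, abbreviate that function as `Y` and apply it to an
+arbitrary `h`:
+
+ Y h ≡ (\h. (\u. h (u u)) (\u. h (u u))) h ~~>
+ (\u. h (u u)) (\u. h (u u)) ~~>
+ h ((\u. h (u u)) (\u. h (u u)))
+
+Since `Y h` itself reduces to `(\u. h (u u)) (\u. h (u u))`, the last line
+is convertible with `h (Y h)`; that is, `Y h <~~> h (Y h)`, as promised.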