From: jim
Date: Fri, 20 Feb 2015 13:03:56 +0000 (0500)
Subject: clean up, move liar/truthteller to separate document
XGitUrl: http://lambda.jimpryor.net/git/gitweb.cgi?p=lambda.git;a=commitdiff_plain;h=a29958c3e11fa8ae383aacff54ce7f405def0705
clean up, move liar/truthteller to separate document

diff git a/topics/week4_fixed_point_combinators.mdwn b/topics/week4_fixed_point_combinators.mdwn
index fecc6532..601ad813 100644
 a/topics/week4_fixed_point_combinators.mdwn
+++ b/topics/week4_fixed_point_combinators.mdwn
@@ 1,6 +1,4 @@
[[!toc levels=3]]

#Recursion: fixed points in the Lambda Calculus#
+[[!toc levels=2]]
Sometimes when you type in a web search, Google will suggest
alternatives. For instance, if you type in "Lingusitics", it will ask
@@ 9,7 +7,7 @@ added some playfulness to the system. For instance, if you search for
"anagram", Google asks you "Did you mean: nag a ram?" And if you
[search for "recursion"](http://www.google.com/search?q=recursion), Google asks: "Did you mean: recursion?"
##What is the "rec" part of "letrec" doing?##
+## What is the "rec" part of "letrec" doing? ##
How could we compute the length of a list? Without worrying yet about what Lambda Calculus encoding we're using for the list, the basic idea is to define this recursively:
@@ 154,7 +152,7 @@ As we've seen, it does take some ingenuity to define functions like `tail` or `p
With sufficient ingenuity, a great many functions can be defined in the same way. For example, the factorial function is straightforward. The function which returns the *n*th term in the Fibonacci series is a bit more difficult, but also achievable.
##Some functions require fullfledged recursive definitions##
+## Some functions require fullfledged recursive definitions ##
However, some computable functions are just not definable in this
way. We can't, for example, define a function that tells us, for
@@ 180,9 +178,9 @@ Many simpler functions always *could* be defined using the resources we've so fa
But functions like the Ackermann function require us to develop a more general technique for doing recursion  and having developed it, it will often be easier to use it even in the cases where, in principle, we didn't have to.
##Using fixedpoint combinators to define recursive functions##
+## Using fixedpoint combinators to define recursive functions ##
###Fixed points###
+### Fixed points ###
In mathematics, a **fixed point** of a function `f` is any value `Î¾`
such that `f Î¾` is equivalent to `Î¾`. For example,
@@ 237,7 +235,7 @@ the successor function will have a fixed point. Isn't that weird? There's some `
Think about how it might be true. We'll return to this point below.)
###How fixed points help define recursive functions###
+### How fixed points help define recursive functions ###
Recall our initial, abortive attempt above to define the `length` function in the Lambda Calculus. We said:
@@ 252,6 +250,9 @@ length function. Then we have
\xs. (empty? xs) 0 (succ (LENGTH (tail xs)))
+(More generally, we might have some lambda term `Î¦[...SELF...]` where we
+want the contained `SELF` to refer to that very lambda term `Î¦[...SELF...]`.)
+
At this point, we have a definition of the length function, though
it's not complete, since we don't know what value to use for the
symbol `LENGTH`. Technically, it has the status of an unbound
@@ 262,6 +263,9 @@ term `h`:
h â¡ \length \xs. (empty? xs) 0 (succ (length (tail xs)))
+(More generally, convert `Î¦[...SELF...]` to `\body. Î¦[...body...]`, where
+the variable `body` wants to be bound to the very lambda term that is that abstract's body.)
+
Now we have no unbound variables, and we have complete nonrecursive
definitions of each of the other symbols (`empty?`, `0`, `succ`, and `tail`).
@@ 289,7 +293,7 @@ The strategy we will present will turn out to be a general way of
finding a fixed point for any lambda term.
##Deriving Y, a fixed point combinator##
+## Deriving Y, a fixed point combinator ##
How shall we begin? Well, we need to find an argument to supply to
`h`. The argument has to be a function that computes the length of a
@@ 310,7 +314,9 @@ to discuss generalizations of this strategy.)
h â¡ \length \xs. (empty? xs) 0 (succ (length (tail xs)))
H â¡ \u \xs. (empty? xs) 0 (succ ((u u) (tail xs)))
This is the key creative step. Instead of applying `u` to a list, as happened
+(We'll discuss the general case, when you're starting from `\body. Î¦[...body...]` rather than this specific `h`, below.)
+
+Shifting to `H` is the key creative step. Instead of applying `u` to a list, as happened
when we selfapplied `h`, `H` applies its argument `u` first to *itself*: `u u`.
After `u` gets an argument, the *result* is ready to apply to a list, so we've solved the problem noted above with `h (tail list)`.
We're not done yet, of course; we don't yet know what argument `u` to give
@@ 323,16 +329,31 @@ as an argument, and that returns a function that takes a list as an
argument. `H` itself fits the bill:
H H <~~> (\u \xs. (empty? xs) 0 (succ ((u u) (tail xs)))) H
+
<~~> \xs. (empty? xs) 0 (succ ((H H) (tail xs)))
+
<~~> \xs. (empty? xs) 0 (succ ((
\xs. (empty? xs) 0 (succ ((H H) (tail xs)))
) (tail xs)))
<~~> \xs. (empty? xs) 0 (succ (
 (empty? (tail xs)) 0 (succ ((H H) (tail (tail xs)))) ))
+ (empty? (tail xs)) 0 (succ ((H H) (tail (tail xs))))
+ ))
+ <~~> \xs. (empty? xs) 0 (succ (
+ (empty? (tail xs)) 0 (succ (
+ \xs. (empty? xs) 0 (succ ((H H) (tail xs)))
+ ) (tail (tail xs))))
+ ))
+ <~~> \xs. (empty? xs) 0 (succ (
+ (empty? (tail xs)) 0 (succ (
+ (empty? (tail (tail xs))) 0 (succ ((H H) (tail (tail (tail xs)))))
+ ))
+ ))
+ <~~> ...
We're in business!
How does the recursion work?
+### How does the recursion work? ###
+
We've defined `H` in such a way that `H H` turns out to be the length function.
That is, `H H` is the `LENGTH` we were looking for.
In order to evaluate `H H`, we substitute `H` into the body of the
@@ 352,13 +373,13 @@ We've starting with a particular recursive definition, and arrived at
a fixed point for that definition.
What's the general recipe?
1. Start with a formula `h` that takes the recursive function you're seeking as an argument: `h â¡ \length. ... length ...`
+1. Start with a formula `h` that takes the recursive function you're seeking as an argument: `h â¡ \length. ...length...` (This is what we also called `\body. Î¦[...body...]`.)
2. Next, define `H â¡ \u. h (u u)`
3. Then compute `H H â¡ ((\u. h (u u)) (\u. h (u u)))`
4. That's the fixed point of `h`, the recursive function you're seeking.
Expressed in terms of a single formula, here is this method for taking an arbitrary `h`style term and returning
the recursive function that term expects as an argument, which as we've seen will be the `h`term's fixed point:
+that term's fixed point, which will be the recursive function that term expects as an argument:
Y â¡ \h. (\u. h (u u)) (\u. h (u u))
@@ 368,52 +389,50 @@ Let's test that `Y h` will indeed be `h`'s fixed point:
~~> (\u. h (u u)) (\u. h (u u))
~~> h ((\u. h (u u)) (\u. h (u u)))
But the argument of `h` in the last line is just the same as the second line, which <\~~> `Y h`. So the last line <\~~> `h (Y h)`. In other words, `Y h <~~> h (Y h)`. So by definition, `Y h` is a fixed point for `h`.
+But the argument of `h` in the last line is just the same as the second line, which `<~~> Y h`. So the last line `<~~> h (Y h)`. In other words, `Y h <~~> h (Y h)`. So by definition, `Y h` is a fixed point for `h`.
Works!
###Coming at it another way###

TODO


##A fixed point for K?##
+## A fixed point for K? ##
Let's do one more example to illustrate. We'll do `K`, since we
+Let's do one more example to illustrate. We'll do `K` (boolean true), since we
wondered above whether it had a fixed point.
Before we begin, we can reason a bit about what the fixed point must
be like. We're looking for a fixed point for `K`, i.e., `\x y. x`. The term `K`
ignores its second argument. That means that no matter what we give
`K` as its first argument, the result will ignore the next argument
(that is, `KX` ignores its first argument, no matter what `X` is). So
if `KX <~~> X`, `X` had also better ignore its first argument. But we
also have `KX â¡ (\x y. x) X ~~> \y. X`. This means that if `X` ignores
its first argument, then `\y.X` will ignore its first two arguments.
So once again, if `KX <~~> X`, `X` also had better ignore (at least) its
first two arguments. Repeating this reasoning, we realize that `X`
+(that is, `K Î¾` ignores its first argument, no matter what `Î¾` is). So
+if `K Î¾ <~~> Î¾`, `Î¾` had also better ignore its first argument. But we
+also have `K Î¾ â¡ (\x y. x) Î¾ ~~> \y. Î¾`. This means that if `Î¾` ignores
+its first argument, then `K Î¾ <~~> \y. Î¾` will ignore its first two arguments.
+So once again, if `K Î¾ <~~> Î¾`, `Î¾` also had better ignore (at least) its
+first two arguments. Repeating this reasoning, we realize that `Î¾` here
must be a function that ignores as many arguments as you give it.
Our expectation, then, is that our recipe for finding fixed points
will build us a term that somehow manages to ignore arbitrarily many arguments.
 h â¡ \xy.x
 H â¡ \u.h(uu) â¡ \u.(\xy.x)(uu) ~~> \uy.uu
 H H â¡ (\uy.uu)(\uy.uu) ~~> \y.(\uy.uu)(\uy.uu)
+ h â¡ \x y. x
+ H â¡ \u. h (u u)
+ â¡ \u. (\x y. x) (u u)
+ ~~> \u. \y. u u
+ H H ~~> (\u y. u u) (\u y. u u)
+ ~~> \y. (\u y. u u) (\u y. u u)
Let's check that it is in fact a fixed point for `K`:
 K(H H) â¡ (\xy.x)((\uy.uu)(\uy.uu))
 ~~> \y.(\uy.uu)(\uy.uu)
+ K (H H) ~~> (\x y. x) ((\u y. u u) (\u y. u u))
+ ~~> \y. (\u y. u u) (\u y. u u)
Yep, `H H` and `K(H H)` both reduce to the same term.
+Yep, `H H` and `K (H H)` both reduce to the same term.
To see what this fixed point does, let's reduce it a bit more:
 H H â¡ (\uy.uu)(\uy.uu)
 ~~> \y.(\uy.uu)(\uy.uu)
 ~~> \yy.(\uy.uu)(\uy.uu)
 ~~> \yyy.(\uy.uu)(\uy.uu)
+ H H ~~> (\u y. u u) (\u y. u u)
+ ~~> \y. (\u y. u u) (\u y. u u)
+ ~~> \y. \y. (\u y. u u) (\u y. u u)
+ ~~> \y. \y. \y. (\u y. u u) (\u y. u u)
Sure enough, this fixed point ignores an endless, arbitrarilylong series of
arguments. It's a writeonly memory, a black hole.
@@ 429,7 +448,7 @@ Continuing in this way, you can now find an infinite number of fixed
points, all of which have the crucial property of ignoring an infinite
series of arguments.
##What is a fixed point for the successor function?##
+## A fixed point for succ? ##
As we've seen, the recipe just given for finding a fixed point worked
great for our `h`, which we wrote as a definition for the length
@@ 461,9 +480,7 @@ now out of our reach. It would also let us define list
functions on [[the encodings we discussed last weekweek3_lists#otherlists]], where it
wasn't always clear how to force the computation to "keep going."
###Varieties of fixedpoint combinators###

OK, so how do we make use of this?
+## Varieties of fixedpoint combinators ##
Many fixedpoint combinators have been discovered. (And as we've seen, some
fixedpoint combinators give us models for building infinitely many
@@ 474,7 +491,7 @@ Two of the simplest:
Îâ² â¡ (\u h. h (\n. u u h n)) (\u h. h (\n. u u h n))
Yâ² â¡ \h. (\u. h (\n. u u n)) (\u. h (\n. u u n))
Applying either of these to a term `h` gives a fixed point `Î¾` for `h`, meaning that `h Î¾` <~~> `Î¾`. `Îâ²` has the advantage that `h (Îâ² h)` really *reduces to* `Îâ² h`. Whereas `h (Yâ² h)` is only *convertible with* `Yâ² h`; that is, there's a common formula they both reduce to. For most purposes, though, either will do.
+Applying either of these to a term `h` gives a fixed point `Î¾` for `h`, meaning that `h Î¾` <~~> `Î¾`. The combinator `Îâ²` has the advantage that `h (Îâ² h)` really *reduces to* `Îâ² h`. Whereas `h (Yâ² h)` is only *convertible with* `Yâ² h`; that is, there's a common formula they both reduce to. For most purposes, though, either will do.
You may notice that both of these formulas have etaredexes inside them: why can't we simplify the two `\n. u u h n` inside `Îâ²` to just `u u h`? And similarly for `Yâ²`?
@@ 483,22 +500,22 @@ Indeed you can, getting the simpler:
Î â¡ (\u h. h (u u h)) (\u h. h (u u h))
Y â¡ \h. (\u. h (u u)) (\u. h (u u))
We stated the more complex formulas for the following reason: in a language whose evaluation order is *callbyvalue*, the evaluation of `Î (\body. BODY)` and `Y (\body. BODY)` will in general not terminate. But evaluation of the etaunreduced primed versions will.
+We stated the more complex formulas for the following reason: in a language whose evaluation order is *callbyvalue*, the evaluation of `Î (\body. BODY)` and `Y (\body. BODY)` won't terminate. But evaluation of the etaunreduced primed versions may.
Of course, if you define your `\body. BODY` stupidly, your formula will never terminate. For example, it doesn't matter what fixed point combinator you use for `Î¨` in:
+Of course, if you define your `\body. BODY` stupidly, your formula won't terminate, no matter what fixed point combinator you use. For example, let `Î¨` be any fixed point combinator in:
Î¨ (\body. \n. body n)
When you try to evaluate the application of that to some argument `M`, it's going to try to give you back:
 (\n. body n) M
+ (\n. BODY n) M
where `body` is equivalent to the very formula `\n. body n` that contains it. So the evaluation will proceed:
+where `BODY` is equivalent to the very formula `\n. BODY n` that contains it. So the evaluation will proceed:
 (\n. body n) M ~~>
 body M <~~>
 (\n. body n) M ~~>
 body M <~~>
+ (\n. BODY n) M ~~>
+ BODY M <~~>
+ (\n. BODY n) M ~~>
+ BODY M <~~>
...
You've written an infinite loop!
@@ 507,13 +524,14 @@ However, when we evaluate the application of our:
Î¨ (\body. (\xs. (empty? xs) 0 (succ (body (tail xs))) ))
to some list, we're not going to go into an infinite evaluation loop of that sort. At each cycle, we're going to be evaluating the application of:
+to some list, we're *not* going to go into an infinite evaluation loop of that sort. At each cycle, we're going to be evaluating the application of:
\xs. (empty? xs) 0 (succ (body (tail xs)))
to *the tail* of the list we were evaluating its application to at the previous stage. Assuming our lists are finite (and the encodings we've been using so far don't permit otherwise), at some point one will get a list whose tail is empty, and then the evaluation of that formula to that tail will return `0`. So the recursion eventually bottoms out in a base value.
##Fixedpoint Combinators Are a Bit Intoxicating##
+
+## Fixedpoint Combinators Are a Bit Intoxicating ##
[[tatto/images/ycombinatorfixed.jpg]]
@@ 530,21 +548,22 @@ then this is a fixedpoint combinator:
L L L L L L L L L L L L L L L L L L L L L L L L L L
##Watching Y in action##
+## Sink: watching Y in action ##
For those of you who like to watch ultra slowmo movies of bullets
piercing apples, here's a stepwise computation of the application of a
recursive function. We'll use a function `sink`, which takes one
argument. If the argument is boolean true (i.e., `\x y.x`), it
+argument. If the argument is boolean true (i.e., `\x y. x`), it
returns itself (a copy of `sink`); if the argument is boolean false
(`\x y. y`), it returns `I`. That is, we want the following behavior:
 sink false ~~> I
 sink true false ~~> I
 sink true true false ~~> I
 sink true true true false ~~> I
+ sink false <~~> I
+ sink true false <~~> I
+ sink true true false <~~> I
+ sink true true true false <~~> I
So we make `sink = Y (\s b. b s I)`:
+Evidently, then, `sink true <~~> sink`. So we want `sink` to be the fixed point
+of `\sink. \b. b sink I`. That is, `sink â¡ Y (\sb.bsI)`:
1. sink false
2. Y (\sb.bsI) false
@@ 588,15 +607,12 @@ will be discarded, and the recursion will stop.
That's about as simple as recursion gets.

##Application to the truth teller/liar paradoxes##

###Base cases, and their lack###
+## Base cases, and their lack ##
As any functional programmer quickly learns, writing a recursive
function divides into two tasks: figuring out how to handle the
recursive case, and remembering to insert a base case. The
+recursive case, and *remembering to insert a base case*. The
interesting and enjoyable part is figuring out the recursive pattern,
but the base case cannot be ignored, since leaving out the base case
creates a program that runs forever. For instance, consider computing
@@ 612,128 +628,3 @@ recursive rule does not apply. In our terms,
fact â¡ Y (\fact n. (zero? n) 1 (fact (pred n)))
If `n` is `0`, `fact` reduces to `1`, without computing the recursive case.

Curry originally called `Y` the "paradoxical" combinator, and discussed
it in connection with certain wellknown paradoxes from the philosophy
literature. The truthteller paradox has the flavor of a recursive
function without a base case:

(1) This sentence is true.

If we assume that the complex demonstrative "this sentence" can refer
to (1), then the proposition expressed by (1) will be true just in
case the thing referred to by *this sentence* is true. Thus (1) will
be true just in case (1) is true, and (1) is true just in case (1) is
true, and so on. If (1) is true, then (1) is true; but if (1) is not
true, then (1) is not true.

Without pretending to give a serious analysis of the paradox, let's
assume that sentences can have for their meaning boolean functions
like the ones we have been working with here. Then the sentence *John
is John* might denote the function `\x y. x`, our `true`.



Then (1) denotes a function from whatever the referent of *this
sentence* is to a boolean. So (1) denotes `\f. f true false`, where
the argument `f` is the referent of *this sentence*. Of course, if
`f` is a boolean, `f true false <~~> f`, so for our purposes, we can
assume that (1) denotes the identity function `I`.

If we use (1) in a context in which *this sentence* refers to the
sentence in which the demonstrative occurs, then we must find a
meaning `m` such that `I m = I`. But since in this context `m` is the
same as the meaning `I`, so we have `m = I m`. In other words, `m` is
a fixed point for the denotation of the sentence (when used in the
appropriate context).

That means that in a context in which *this sentence* refers to the
sentence in which it occurs, the sentence denotes a fixed point for
the identity function. Here's a fixed point for the identity
function:

 Y I â¡
 (\h. (\u. h (u u)) (\u. h (u u))) I ~~>
 (\u. I (u u)) (\u. I (u u))) ~~>
 (\u. (u u)) (\u. (u u))) â¡
 Ï Ï
 Î©

Oh. Well! That feels right. The meaning of *This sentence is true*
in a context in which *this sentence* refers to the sentence in which
it occurs is `Î©`, our prototypical infinite loop...

What about the liar paradox?

(2) This sentence is false.

Used in a context in which *this sentence* refers to the utterance of
(2) in which that noun phrase occurs, (2) will denote a fixed point for `\f. neg f`,
or `\f y n. f n y`, which is the `C` combinator. So in such a
context, (2) might denote

 Y C
 (\h. (\u. h (u u)) (\u. h (u u))) C
 (\u. C (u u)) (\u. C (u u)))
 C ((\u. C (u u)) (\u. C (u u)))
 C (C ((\u. C (u u)) (\u. C (u u))))
 C (C (C ((\u. C (u u)) (\u. C (u u)))))
 ...

And infinite sequence of `C`s, each one negating the remainder of the
sequence. Yep, that feels like a reasonable representation of the
liar paradox.

See Barwise and Etchemendy's 1987 OUP book, [The Liar: an essay on
truth and circularity](http://tinyurl.com/2db62bk) for an approach
that is similar, but expressed in terms of nonwellfounded sets
rather than recursive functions.

##However...##

You should be cautious about feeling too comfortable with
these results. Thinking again of the truthteller paradox, yes,
`Î©` is *a* fixed point for `I`, and perhaps it has
some privileged status among all the fixed points for `I`, being the
one delivered by `Y` and all (though it is not obvious why `Y` should have
any special status, versus other fixed point combinators).

But one could ask: look, literally every formula is a fixed point for
`I`, since

 X <~~> I X

for any choice of `X` whatsoever.

So the `Y` combinator is only guaranteed to give us one fixed point out
of infinitely many  and not always the intuitively most useful
one. (For instance, the squaring function `\x. mul x x` has `0` as a fixed point,
since `square 0 <~~> 0`, and `1` as a fixed point, since `square 1 <~~> 1`, but `Y
(\x. mul x x)` doesn't give us `0` or `1`.) So with respect to the
truthteller paradox, why in the reasoning we've
just gone through should we be reaching for just this fixed point at
just this juncture?

One obstacle to thinking this through is the fact that a sentence
normally has only two truth values. We might consider instead a noun
phrase such as

(3) the entity that this noun phrase refers to

The reference of (3) depends on the reference of the embedded noun
phrase *this noun phrase*. It's easy to see that any object is a
fixed point for this referential function: if this pen cap is the
referent of *this noun phrase*, then it is the referent of (3), and so
for any object.

The chameleon nature of (3), by the way (a description that is equally
good at describing any object), makes it particularly well suited as a
gloss on pronouns such as *it*. In the system of
[Jacobson 1999](http://www.springerlink.com/content/j706674r4w217jj5/),
pronouns denote (you guessed it!) identity functions...



Ultimately, in the context of this course, these paradoxes are more
useful as a way of gaining leverage on the concepts of fixed points
and recursion, rather than the other way around.