tweak calc improvements

[lambda.git] / week3.mdwn
diff --git a/week3.mdwn b/week3.mdwn

index 892a055..39e472b 100644 (file)
--- a/week3.mdwn
+++ b/week3.mdwn
@@ -1,3 +1,12 @@
+[[!toc]]
+
+##More on evaluation strategies##
+
+Here are notes on [[evaluation order]] that make the choice of which
+lambda to reduce next the selection of a route through a network of
+links.
+
+
  ##Computing the length of a list##
  
  How could we compute the length of a list? Without worrying yet about what lambda-calculus implementation we're using for the list, the basic idea would be to define this recursively:
@@ -211,7 +220,8 @@ Instead of writing out a long formula twice, we could write:
  
  and the initial `(\x. x x)` is just what we earlier called the <code>&omega;</code> combinator (lower-case omega, not the non-terminating <code>&Omega;</code>). So the self-application of `H` can be written:
  
-<pre><code>&omega; (\h \lst. (isempty lst) zero (add one ((h h) (extract-tail lst))))</code></pre>
+<pre><code>&omega; (\h \lst. (isempty lst) zero (add one ((h h) (extract-tail lst))))
+</code></pre>
  
  and this will indeed implement the recursive function we couldn't earlier figure out how to define.
  
@@ -268,6 +278,8 @@ In the lambda calculus, we say a fixed point of an expression `f` is any formula
  
  What is a fixed point of the identity combinator I?
  
+What is a fixed point of the false combinator, KI?
+
  It's a theorem of the lambda calculus that every formula has a fixed point. In fact, it will have infinitely many, non-equivalent fixed points. And we don't just know that they exist: for any given formula, we can name many of them.
  
  Yes, even the formula that you're using the define the successor function will have a fixed point. Isn't that weird? Think about how it might be true.
@@ -412,7 +424,7 @@ to *the tail* of the list we were evaluating its application to at the previous
  
  ##Fixed-point Combinators Are a Bit Intoxicating##
  
-![tatoo](/y-combinator.jpg)
+![tatoo](/y-combinator-fixed.jpg)
  
  There's a tendency for people to say "Y-combinator" to refer to fixed-point combinators generally. We'll probably fall into that usage ourselves. Speaking correctly, though, the Y-combinator is only one of many fixed-point combinators.
  
@@ -427,3 +439,202 @@ then this is a fixed-point combinator:
         L L L L L L L L L L L L L L L L L L L L L L L L L L
  
  
+##Watching Y in action##
+
+For those of you who like to watch ultra slow-mo movies of bullets
+piercing apples, here's a stepwise computation of the application of a
+recursive function.  We'll use a function `sink`, which takes one
+argument.  If the argument is boolean true (i.e., `\x y.x`), it
+returns itself (a copy of `sink`); if the argument is boolean false
+(`\x y. y`), it returns `I`.  That is, we want the following behavior:
+
+    sink false ~~> I
+    sink true false ~~> I
+    sink true true false ~~> I
+    sink true true true false ~~> I
+
+So we make `sink = Y (\f b. b f I)`: 
+
+    1. sink false 
+    2. Y (\fb.bfI) false
+    3. (\f. (\h. f (h h)) (\h. f (h h))) (\fb.bfI) false
+    4. (\h. [\fb.bfI] (h h)) (\h. [\fb.bfI] (h h)) false
+    5. [\fb.bfI] ((\h. [\fb.bsI] (h h))(\h. [\fb.bsI] (h h))) false
+    6. (\b.b[(\h. [\fb.bsI] (h h))(\h. [\fb.bsI] (h h))]I)  false
+    7. false [(\h. [\fb.bsI] (h h))(\h. [\fb.bsI] (h h))] I
+             --------------------------------------------
+    8. I
+
+So far so good.  The crucial thing to note is that as long as we
+always reduce the outermost redex first, we never have to get around
+to computing the underlined redex: because `false` ignores its first
+argument, we can throw it away unreduced.
+
+Now we try the next most complex example:
+
+    1. sink true false 
+    2. Y (\fb.bfI) true false
+    3. (\f. (\h. f (h h)) (\h. f (h h))) (\fb.bfI) true false
+    4. (\h. [\fb.bfI] (h h)) (\h. [\fb.bfI] (h h)) true false
+    5. [\fb.bfI] ((\h. [\fb.bsI] (h h))(\h. [\fb.bsI] (h h))) true false
+    6. (\b.b[(\h. [\fb.bsI] (h h))(\h. [\fb.bsI] (h h))]I)  true false
+    7. true [(\h. [\fb.bsI] (h h))(\h. [\fb.bsI] (h h))] I false
+    8. [(\h. [\fb.bsI] (h h))(\h. [\fb.bsI] (h h))] false
+
+We've now arrived at line (4) of the first computation, so the result
+is again I.
+
+You should be able to see that `sink` will consume as many `true`s as
+we throw at it, then turn into the identity function after it
+encounters the first `false`. 
+
+The key to the recursion is that, thanks to Y, the definition of
+`sink` contains within it the ability to fully regenerate itself as
+many times as is necessary.  The key to *ending* the recursion is that
+the behavior of `sink` is sensitive to the nature of the input: if the
+input is the magic function `false`, the self-regeneration machinery
+will be discarded, and the recursion will stop.
+
+That's about as simple as recursion gets.
+
+##Base cases, and their lack##
+
+As any functional programmer quickly learns, writing a recursive
+function divides into two tasks: figuring out how to handle the
+recursive case, and remembering to insert a base case.  The
+interesting and enjoyable part is figuring out the recursive pattern,
+but the base case cannot be ignored, since leaving out the base case
+creates a program that runs forever.  For instance, consider computing
+a factorial: `n!` is `n * (n-1) * (n-2) * ... * 1`.  The recursive
+case says that the factorial of a number `n` is `n` times the
+factorial of `n-1`.  But if we leave out the base case, we get
+
+    3! = 3 * 2! = 3 * 2 * 1! = 3 * 2 * 1 * 0! = 3 * 2 * 1 * 0 * -1! ...
+
+That's why it's crucial to declare that 0! = 1, in which case the
+recursive rule does not apply.  In our terms,
+
+    fac = Y (\fac n. iszero n 1 (fac (predecessor n)))
+
+If `n` is 0, `fac` reduces to 1, without computing the recursive case.
+
+There is a well-known problem in philosophy and natural language
+semantics that has the flavor of a recursive function without a base
+case: the truth-teller paradox (and related paradoxes).
+
+(1)    This sentence is true.
+
+If we assume that the complex demonstrative "this sentence" can refer
+to (1), then the proposition expressed by (1) will be true just in
+case the thing referred to by *this sentence* is true.  Thus (1) will
+be true just in case (1) is true, and (1) is true just in case (1) is
+true, and so on.  If (1) is true, then (1) is true; but if (1) is not
+true, then (1) is not true.
+
+Without pretending to give a serious analysis of the paradox, let's
+assume that sentences can have for their meaning boolean functions
+like the ones we have been working with here.  Then the sentence *John
+is John* might denote the function `\x y. x`, our `true`.  
+
+Then (1) denotes a function from whatever the referent of *this
+sentence* is to a boolean.  So (1) denotes `\f. f true false`, where
+the argument `f` is the referent of *this sentence*.  Of course, if
+`f` is a boolean, `f true false <~~> f`, so for our purposes, we can
+assume that (1) denotes the identity function `I`.
+
+If we use (1) in a context in which *this sentence* refers to the
+sentence in which the demonstrative occurs, then we must find a
+meaning `m` such that `I m = I`.  But since in this context `m` is the
+same as the meaning `I`, so we have `m = I m`.  In other words, `m` is
+a fixed point for the denotation of the sentence (when used in the
+appropriate context).
+
+That means that in a context in which *this sentence* refers to the
+sentence in which it occurs, the sentence denotes a fixed point for
+the identity function.  Here's a fixed point for the identity
+function:
+
+<pre><code>Y I
+(\f. (\h. f (h h)) (\h. f (h h))) I
+(\h. I (h h)) (\h. I (h h)))
+(\h. (h h)) (\h. (h h)))
+&omega; &omega;
+&Omega
+</code></pre>
+
+Oh.  Well!  That feels right.  The meaning of *This sentence is true*
+in a context in which *this sentence* refers to the sentence in which
+it occurs is <code>&Omega;</code>, our prototypical infinite loop...
+
+What about the liar paradox?
+
+(2)  This sentence is false.
+
+Used in a context in which *this sentence* refers to the utterance of
+(2) in which it occurs, (2) will denote a fixed point for `\f.neg f`,
+or `\f l r. f r l`, which is the `C` combinator.  So in such a
+context, (2) might denote
+
+     Y C
+     (\f. (\h. f (h h)) (\h. f (h h))) I
+     (\h. C (h h)) (\h. C (h h))) 
+     C ((\h. C (h h)) (\h. C (h h)))
+     C (C ((\h. C (h h))(\h. C (h h))))
+     C (C (C ((\h. C (h h))(\h. C (h h)))))
+     ...
+
+And infinite sequence of `C`s, each one negating the remainder of the
+sequence.  Yep, that feels like a reasonable representation of the
+liar paradox.
+
+See Barwise and Etchemendy's 1987 OUP book, [The Liar: an essay on
+truth and circularity](http://tinyurl.com/2db62bk) for an approach
+that is similar, but expressed in terms of non-well-founded sets
+rather than recursive functions.
+
+##However...##
+
+You should be cautious about feeling too comfortable with
+these results.  Thinking again of the truth-teller paradox, yes,
+<code>&Omega;</code> is *a* fixed point for `I`, and perhaps it has
+some a privileged status among all the fixed points for `I`, being the
+one delivered by Y and all (though it is not obvious why Y should have
+any special status).
+
+But one could ask: look, literally every formula is a fixed point for
+`I`, since
+
+    X <~~> I X
+
+for any choice of X whatsoever.
+
+So the Y combinator is only guaranteed to give us one fixed point out
+of infinitely many---and not always the intuitively most useful
+one. (For instance, the squaring function has zero as a fixed point,
+since 0 * 0 = 0, and 1 as a fixed point, since 1 * 1 = 1, but `Y
+(\x. mul x x)` doesn't give us 0 or 1.) So with respect to the
+truth-teller paradox, why in the reasoning we've
+just gone through should we be reaching for just this fixed point at
+just this juncture?
+
+One obstacle to thinking this through is the fact that a sentence
+normally has only two truth values.  We might consider instead a noun
+phrase such as 
+
+(3)  the entity that this noun phrase refers to
+
+The reference of (3) depends on the reference of the embedded noun
+phrase *this noun phrase*.  It's easy to see that any object is a
+fixed point for this referential function: if this pen cap is the
+referent of *this noun phrase*, then it is the referent of (3), and so
+for any object.
+
+The chameleon nature of (3), by the way (a description that is equally
+good at describing any object), makes it particularly well suited as a
+gloss on pronouns such as *it*.  In the system of 
+[Jacobson 1999](http://www.springerlink.com/content/j706674r4w217jj5/),
+pronouns denote (you guessed it!) identity functions...
+
+Ultimately, in the context of this course, these paradoxes are more
+useful as a way of gaining leverage on the concepts of fixed points
+and recursion, rather than the other way around.