edits

diff --git a/week3.mdwn b/week3.mdwn

index 40bcb99..892a055 100644
--- a/week3.mdwn
+++ b/week3.mdwn
@@ -99,6 +99,34 @@ where this very same formula occupies the `...` position:
  
  but as you can see, we'd still have to plug the formula back into itself again, and again, and again... No dice.
  
+[At this point, some of you will recall the discussion in the first
+class concerning the conception of functions as sets of ordered pairs.
+The problem, as you will recall, was that in the untyped lambda
+calculus, we wanted a function to be capable of taking itself as an
+argument.  For instance, we wanted to be able to apply the identity
+function to itself.  And since the identity function always returns
+its argument unchanged, the value it should return in that case is
+itself:
+
+    (\x.x)(\x.x) ~~> (\x.x)
+
+If we conceive of a function as a set of ordered pairs, we would start
+off like this:
+
+    1 -> 1
+    2 -> 2
+    3 -> 3
+    ...
+    [1 -> 1, 2 -> 2, 3 -> 3, ..., [1 -> 1, 2 -> 2, 3 -> 3, ..., 
+
+Eventually, we would get to the point where we want to say what the
+identity function itself gets mapped to.  But in order to say that, we
+need to write down the identity function in the argument position as a
+set of ordered pairs.  The need to insert a copy of the entire
+function definition inside of a copy of the entire function definition
+inside of... is the same problem as the need to insert a complete
+graph of the identity function inside of the graph for the identity function.]
+
  So how could we do it? And how do OCaml and Scheme manage to do it, with their `let rec` and `letrec`?
  
  1.     OCaml and Scheme do it using a trick. Well, not a trick. Actually an impressive, conceptually deep technique, which we haven't yet developed. Since we want to build up all the techniques we're using by hand, then, we shouldn't permit ourselves to rely on `let rec` or `letrec` until we thoroughly understand what's going on under the hood.
@@ -126,7 +154,10 @@ With sufficient ingenuity, a great many functions can be defined in the same way
  
  ##However...##
  
-Some computable functions are just not definable in this way. The simplest function that *simply cannot* be defined using the resources we've so far developed is the Ackermann function:
+Some computable functions are just not definable in this way. We can't, for example, define a function that tells us, for whatever function `f` we supply it, what is the smallest integer `x` where `f x` is `true`.
+
+Neither do the resources we've so far developed suffice to define the 
+[[!wikipedia Ackermann function]]:
  
         A(m,n) =
                 | when m == 0 -> n + 1
@@ -134,9 +165,9 @@ Some computable functions are just not definable in this way. The simplest funct
                 | else -> A(m-1, A(m,n-1))
  
         A(0,y) = y+1
-       A(1,y) = y+2
-       A(2,y) = 2y + 3
-       A(3,y) = 2^(y+3) -3
+       A(1,y) = 2+(y+3) - 3
+       A(2,y) = 2(y+3) - 3
+       A(3,y) = 2^(y+3) - 3
         A(4,y) = 2^(2^(2^...2)) [where there are y+3 2s] - 3
         ...
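
As a rough sanity check on the table above, here is a minimal Python sketch (Python is not the document's language, and the name `ackermann` is only illustrative). It assumes the middle clause of the definition, which this diff view elides, is the standard one, `A(m,0) = A(m-1,1)`. Python's `def` supplies recursion by name, so this only checks the arithmetic; it does not show how to get recursion in the lambda calculus.

```python
def ackermann(m, n):
    """Ackermann function; the n == 0 clause is assumed to be the standard one."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))

# Closed forms for rows 0 through 3, copied from the table above.
closed_forms = {
    0: lambda y: y + 1,
    1: lambda y: 2 + (y + 3) - 3,
    2: lambda y: 2 * (y + 3) - 3,
    3: lambda y: 2 ** (y + 3) - 3,
}

# Compare the recursive definition with each closed form on small arguments only,
# since the recursion depth grows quickly with m and y.
table_agrees = all(
    ackermann(m, y) == form(y)
    for m, form in closed_forms.items()
    for y in range(4)
)
```

Row 4 is left out on purpose: its values outgrow anything this naive recursion can reach.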
  
@@ -146,11 +177,90 @@ But functions like the Ackermann function require us to develop a more general t
  
  ##How to do recursion with lower-case omega##
  
-[TODO]
+Recall our initial, abortive attempt above to define the `get_length` function in the lambda calculus. We said "What we really want to do is something like this:
+
+       \lst. (isempty lst) zero (add one (... (extract-tail lst)))
+
+where this very same formula occupies the `...` position."
+
+We are not going to do exactly that, at least not yet. But we are going to do something close to it.
+
+Consider a formula of the following form (don't worry yet about exactly how we'll fill the `...`s):
+
+       \h \lst. (isempty lst) zero (add one (... (extract-tail lst)))
+
+Call that formula `H`. Now what would happen if we applied `H` to itself? Then we'd get back:
+
+       \lst. (isempty lst) zero (add one (... (extract-tail lst)))
+
+where any occurrences of `h` inside the `...` were substituted with `H`. Call this `F`. `F` looks pretty close to what we're after: a function that takes a list and returns zero if it's empty, and so on. And `F` is the result of applying `H` to itself. But now inside `F`, the occurrences of `h` are substituted with the very formula `H` we started with. So if we want to get `F` again, all we have to do is apply `h` to itself---since as we said, the self-application of `H` is how we created `F` in the first place.
+
+So, the way `F` should be completed is:
+
+       \lst. (isempty lst) zero (add one ((h h) (extract-tail lst)))
+
+and our original `H` is:
+
+       \h \lst. (isempty lst) zero (add one ((h h) (extract-tail lst)))
+
+The self-application of `H` will give us `F` with `H` substituted in for its free variable `h`.
+
+Instead of writing out a long formula twice, we could write:
+
+       (\x. x x) LONG-FORMULA
+
+and the initial `(\x. x x)` is just what we earlier called the <code>&omega;</code> combinator (lower-case omega, not the non-terminating <code>&Omega;</code>). So the self-application of `H` can be written:
+
+<pre><code>&omega; (\h \lst. (isempty lst) zero (add one ((h h) (extract-tail lst))))</code></pre>
+
+and this will indeed implement the recursive function we couldn't earlier figure out how to define.
+
+In broad brush-strokes, `H` is half of the `get_length` function we're seeking, and `H` has the form:
+
+       \h other-arguments. ... (h h) ...
+
+We get the whole `get_length` function by applying `H` to itself. Then `h` is replaced by the half `H`, and when we later apply `h` to itself, we re-create the whole `get_length` again.
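
The construction above can be sketched in Python, whose `lambda` is close enough to the lambda calculus for this purpose. This is only an illustration under stated assumptions: native Python lists and integers stand in for the encoded lists and numerals, and the names `omega`, `H` and `get_length` are chosen here to mirror the text.

```python
# omega is \x. x x: it applies its argument to itself.
omega = lambda x: x(x)

# H is the "half" of get_length:
#   \h \lst. (isempty lst) zero (add one ((h h) (extract-tail lst)))
# `not lst` plays the role of isempty, and lst[1:] the role of extract-tail.
H = lambda h: lambda lst: 0 if not lst else 1 + h(h)(lst[1:])

# Self-applying H yields the whole function; no name refers to itself anywhere.
get_length = omega(H)
```

Calling `get_length(["a", "b", "c"])` re-creates the whole function at each step through `h(h)`, just as described above for `(h h)`.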
+
+##Neat! Can I make it easier to use?##
+
+Suppose you wanted to wrap this up in a pretty interface, so that the programmer didn't need to write `(h h)` but could just write `g` for some function `g`. How could you do it?
+
+Now the `F`-like expression we'd be aiming for---call it `F*`---would look like this:
+
+       \lst. (isempty lst) zero (add one (g (extract-tail lst)))
+
+or, abbreviating:
+
+       \lst. ...g...
+
+Here we have just a single `g` instead of `(h h)`. We'd want `F*` to be the result of self-applying some `H*`, and then binding to `g` that very self-application of `H*`. We'd get that if `H*` had the form:
+
+       \h. (\g lst. ...g...) (h h)
+
+The self-application of `H*` would be:
+
+       (\h. (\g lst. ...g...) (h h)) (\h. (\g lst. ...g...) (h h))
+
+or:
+
+       (\f. (\h. f (h h)) (\h. f (h h))) (\g lst. ...g...)
+
+The function on the left of this application, `\f. (\h. f (h h)) (\h. f (h h))`, is known as **the Y-combinator**, and so this could be written more compactly as:
+
+       Y (\g lst. ...g...)
+
+or, replacing the abbreviated bits:
+
+       Y (\g lst. (isempty lst) zero (add one (g (extract-tail lst))))
+
+So this is another way to implement the recursive function we couldn't earlier figure out how to define.
+
  
  ##Generalizing##
  
-In general, a **fixed point** of a function f is a value *x* such that f<em>x</em> is equivalent to *x*. For example, what is a fixed point of the function from natural numbers to their squares? What is a fixed point of the successor function?
+Let's step back and fill in some theory to help us understand why these tricks work.
+
+In general, we call a **fixed point** of a function f any value *x* such that f <em>x</em> is equivalent to *x*. For example, what is a fixed point of the function from natural numbers to their squares? What is a fixed point of the successor function?
  
  In the lambda calculus, we say a fixed point of an expression `f` is any formula `X` such that:
  
@@ -170,17 +280,17 @@ who knows what we'd get back? Perhaps there's some non-number-representing formu
  
  Yes! That's exactly right. And which formula this is will depend on the particular way you've implemented the successor function.
  
-Moreover, the recipes that enable us to name fixed points for any given formula aren't *guaranteed* to give us *terminating* fixed points. They might give us formulas X such that neither `X` nor `f X` have normal forms. (Indeed, what they give us for the square function isn't any of the Church numerals, but is rather an expression with no normal form.) However, if we take care we can ensure that we *do* get terminating fixed points. And this gives us a principled, fully general strategy for doing recursion. It lets us define even functions like the Ackermann function, which were until now out of our reach. It would let us define arithmetic and list functions on the "version 1" and "version 2" implementations, where it wasn't always clear how to force the computation to "keep going."
+Moreover, the recipes that enable us to name fixed points for any given formula aren't *guaranteed* to give us *terminating* fixed points. They might give us formulas X such that neither `X` nor `f X` have normal forms. (Indeed, what they give us for the square function isn't any of the Church numerals, but is rather an expression with no normal form.) However, if we take care we can ensure that we *do* get terminating fixed points. And this gives us a principled, fully general strategy for doing recursion. It lets us define even functions like the Ackermann function, which were until now out of our reach. It would also let us define arithmetic and list functions on the "version 1" and "version 2" implementations, where it wasn't always clear how to force the computation to "keep going."
  
  OK, so how do we make use of this?
  
-Recall our initial, abortive attempt above to define the `get_length` function in the lambda calculus. We said "What we really want to do is something like this:
+Recall again our initial, abortive attempt above to define the `get_length` function in the lambda calculus. We said "What we really want to do is something like this:
  
         \lst. (isempty lst) zero (add one (... (extract-tail lst)))
  
  where this very same formula occupies the `...` position."
  
-Now, what if we *were* somehow able to get ahold of this formula, as an additional argument? We could take that argument and plug it into the `...` position. Something like this:
+If we could somehow get ahold of this very formula, as an additional argument, then we could take the argument and plug it into the `...` position. Something like this:
  
         \self (\lst. (isempty lst) zero (add one (self (extract-tail lst))) )
  
@@ -254,16 +364,14 @@ Isn't that cool?
  
  ##Okay, then give me a fixed-point combinator, already!##
  
-Many fixed-point combinators have been discovered. (And given a fixed-point combinators, there are ways to use it as a model to build infinitely many more, non-equivalent fixed-point combinators.)
+Many fixed-point combinators have been discovered. (And some fixed-point combinators give us models for building infinitely many more, non-equivalent fixed-point combinators.)
  
  Two of the simplest:
  
  <pre><code>&Theta;&prime; &equiv; (\u f. f (\n. u u f n)) (\u f. f (\n. u u f n))
  Y&prime; &equiv; \f. (\u. f (\n. u u n)) (\u. f (\n. u u n))</code></pre>
  
-&Theta;&prime; has the advantage that <code>f (&Theta;&prime; f)</code> really *reduces to* <code>&Theta;&prime; f</code>.
-
-<code>f (Y&prime; f)</code> is only convertible with <code>Y&prime; f</code>; that is, there's a common formula they both reduce to. For most purposes, though, either will do.
+<code>&Theta;&prime;</code> has the advantage that <code>&Theta;&prime; f</code> really *reduces to* <code>f (&Theta;&prime; f)</code>, whereas <code>Y&prime; f</code> is only *convertible with* <code>f (Y&prime; f)</code>; that is, there's a common formula they both reduce to. For most purposes, though, either will do.
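
Both combinators can be transcribed into Python as a hedged sketch. The names `Y_prime`, `Theta_prime` and `step` are illustrative, and native lists and integers again stand in for the encoded ones. Python evaluates arguments eagerly, so the `\n. ... n` wrappers matter here: they delay the self-application `u u` until the recursive call is actually made.

```python
# Y' = \f. (\u. f (\n. u u n)) (\u. f (\n. u u n))
Y_prime = lambda f: (lambda u: f(lambda n: u(u)(n)))(lambda u: f(lambda n: u(u)(n)))

# Theta' = (\u f. f (\n. u u f n)) (\u f. f (\n. u u f n))
_half = lambda u: lambda f: f(lambda n: u(u)(f)(n))
Theta_prime = _half(_half)

# The non-recursive "step" function: \g lst. (isempty lst) zero (add one (g (extract-tail lst)))
step = lambda g: lambda lst: 0 if not lst else 1 + g(lst[1:])

length_via_Y = Y_prime(step)
length_via_Theta = Theta_prime(step)
```

Either combinator turns the same `step` into a working length function, which fits the remark that for most purposes either will do.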
  
  You may notice that both of these formulas have eta-redexes inside them: why can't we simplify the two `\n. u u f n` inside <code>&Theta;&prime;</code> to just `u u f`? And similarly for <code>Y&prime;</code>?
  
@@ -319,7 +427,3 @@ then this is a fixed-point combinator:
         L L L L L L L L L L L L L L L L L L L L L L L L L L
  
  
-
-[TODO: Explain how what we've done relates to the version using lower-case &omega;.]
-
-