final note

[lambda.git] / topics / week2_encodings.mdwn
diff --git a/topics/week2_encodings.mdwn b/topics/week2_encodings.mdwn

index 46f1529..5bb69d6 100644 (file)
--- a/topics/week2_encodings.mdwn
+++ b/topics/week2_encodings.mdwn
@@ -69,7 +69,7 @@ in Kapulet syntax:
  
  or:
  
  
  or:
  
-    lambda p q. if p then q else q
+    lambda p q. if p then q else p
  
  Given that we know how to express `if ... then ...` in terms of our encoded Booleans,
  we can represent this easily in the Lambda Calculus as either:
  
  Given that we know how to express `if ... then ...` in terms of our encoded Booleans,
  we can represent this easily in the Lambda Calculus as either:
@@ -131,7 +131,7 @@ we do recommend you [get Scheme installed on your
  computer](/how_to_get_the_programming_languages_running_on_your_computer), and
  [get started learning Scheme](/learning_scheme). It will help you understand the ideas better to experiment with it.
  
  computer](/how_to_get_the_programming_languages_running_on_your_computer), and
  [get started learning Scheme](/learning_scheme). It will help you understand the ideas better to experiment with it.
  
-There's also a (slow, bare-bones, but perfectly adequate) version of Scheme available for online use at <http://tryscheme.sourceforge.net/>.
+There's also a (slow, bare-bones, but perfectly adequate) version of Scheme available for online use at <http://tryscheme.sourceforge.net/>. (Unfortunately, though, that Scheme implementation will only display the result of `((lambda-and true) false)` as `#<procedure lambda>`. You won't be able to see if it's the function assigned to `true` or the function assigned to `false`. You'd have to feed that result some more arguments to see how it behaved.)
  
  You should also be experimenting with this site's [[lambda evaluator|code/lambda evaluator]].
  
  
  You should also be experimenting with this site's [[lambda evaluator|code/lambda evaluator]].
  
@@ -204,7 +204,7 @@ That will evaluate to whatever this does:
  
      f (10, f (20, f (30, z)))
  
  
      f (10, f (20, f (30, z)))
  
-For example, if we let `f` be `(+)` and `z` be `0`, then it will be `10 + (20 + (30 + 0))` or `60`. Another example, if we let `f` be `(&)` and `z` be `[]`, then `fold_right ((+), []) [10, 20, 30]` will be `10 & (20 & (30 & []))` or `[10, 20, 30]`, the same sequence we began with.
+For example, if we let `f` be `(+)` and `z` be `0`, then it will be `10 + (20 + (30 + 0))` or `60`. Another example, if we let `f` be `(&)` and `z` be `[]`, then `fold_right ((&), []) [10, 20, 30]` will be `10 & (20 & (30 & []))` or `[10, 20, 30]`, the same sequence we began with.
  
  The other function works like this:
  
  
  The other function works like this:
  
@@ -237,14 +237,15 @@ We also saw in seminar how to define `fold_right`. We could do this in Kapulet l
      letrec
        fold_right (f, z) xs = case xs of
                                 [] then zs;
      letrec
        fold_right (f, z) xs = case xs of
                                 [] then zs;
-                              y & ys then f (y, fold_right (f, z) ys)
+                               y & ys then f (y, fold_right (f, z) ys)
+                             end
      in fold_right
  
      in fold_right
  
-(In some presentations you may see this with `f` as a curried function of the form `lambda x z. ...` rather than the uncurried form `lambda (x, z). ...` I have here.)
+(In some presentations you may see this expecting `f` to be a curried function `lambda x z. ...` rather than the uncurried `lambda (x, z). ...` that I'm expecting here. We will be shifting to the curried form ourselves momentarily.)
  
  We suggested in class that these functions were very powerful, and could be deployed to do most everything you might want to do with a list. Given how our strategy for encoding booleans turned out, this ought to suggest to you the idea that *folding is what we fundamentally do* with lists, just as *conditional branching is what we fundamentally do* with booleans. So we can try to encode lists in terms of lambda expressions that will let us perform folds on them. We could try to do this with either right-folds or left-folds. Either is viable. Some things are more natural if you use right-folds, though, so let's do that.
  
  
  We suggested in class that these functions were very powerful, and could be deployed to do most everything you might want to do with a list. Given how our strategy for encoding booleans turned out, this ought to suggest to you the idea that *folding is what we fundamentally do* with lists, just as *conditional branching is what we fundamentally do* with booleans. So we can try to encode lists in terms of lambda expressions that will let us perform folds on them. We could try to do this with either right-folds or left-folds. Either is viable. Some things are more natural if you use right-folds, though, so let's do that.
  
-But what should the encoding look like? We don't know *what* function and *what* starting value someone might want to fold over our list!
+What should the encoding look like? We don't know *what* function and *what* starting value someone might want to fold over our list!
  
  Wait, does that remind you of anything?
  
  
  Wait, does that remind you of anything?
  
@@ -254,15 +255,15 @@ Indeed, we'll make our encoded lists consist of higher-order *functions* that ta
  
      \f z. SOMETHING
  
  
      \f z. SOMETHING
  
-but what should the `SOMETHING` be? Well, when we supply an `f` and a `z` we should get the right-fold of those over `[a, b, c]` back, so the answer should evidently be:
+Now, what should the `SOMETHING` be? Well, when we supply an `f` and a `z` we should get back the right-fold of those over `[a, b, c]`, so the answer should evidently be:
  
      \f z. f a (f b (f c z))
  
  
      \f z. f a (f b (f c z))
  
-Here we work with curried functions, because that's how the Lambda Calculus does things. You wouldn't want to build up a tuple using the mechanisms described above, and then supply f as an argument to that tuple, and so on. That would be a lot of red tape for no benefit. In the Lambda Calculus, it's simpler to just work with curried functions as our natural idiom.
+Here we work with curried functions, because that's how the Lambda Calculus does things. You wouldn't want to build up a tuple using the mechanisms described above, and then supply `f` as an argument to that tuple, and so on. That would be a lot of red tape for no benefit. In the Lambda Calculus, it's simpler to just work with curried functions as our natural idiom.
  
  So if `[a, b, c]` should be the displayed higher-order function above, what should `[c]` be? Evidently:
  
  
  So if `[a, b, c]` should be the displayed higher-order function above, what should `[c]` be? Evidently:
  
-   \f z. f c z
+    \f z. f c z
  
  Now what should the empty list `[]` be? Think about it...
  
  
  Now what should the empty list `[]` be? Think about it...
  
@@ -272,10 +273,10 @@ I hope it was this one: `\f z. z`. Because when we fold a function `f` and a sta
  
  We saw before what a `make-triple` function would look like. What about a `make-list` function, or as we've been calling it, "cons" (`&` in Kapulet)? Well, we've already seen what the representation of `[]` and `[c]` are. In that simplest case, the `cons` function should take us from the encoding of `[]`, namely `\f z. z` to the encoding of `[c]`, namely `\f z. f c z`, as a result. Here is how we can define this:
  
  
  We saw before what a `make-triple` function would look like. What about a `make-list` function, or as we've been calling it, "cons" (`&` in Kapulet)? Well, we've already seen what the representation of `[]` and `[c]` are. In that simplest case, the `cons` function should take us from the encoding of `[]`, namely `\f z. z` to the encoding of `[c]`, namely `\f z. f c z`, as a result. Here is how we can define this:
  
-    let cons = \c cs. \f z. f c (cs f z)
+    let cons = \d ds. \f z. f d (ds f z)
      in ...
  
      in ...
  
-Let's walk through this. We supply this function with our `c` and `[]` as arguments. So its `c` gets bound to our `c`, and its `cs` gets bound to our `[]`, namely `\f z z`. Then our result is a higher order function of the form `\f z. f c SOMETHING`. That looks good, what we're after is just this, except the `SOMETHING` should be `z`. If we just simply said `z` here, then our `cons` function would always be giving us back a singleton list of the form `\f z. f x z` for some `x`. That's what we want in this case, but not in the general case. What we'd like is to use the second argument we fed to `cons`, here `[]` or `\f z. z`, and have it reduce to `z`, with the hope that the same strategy wuld make more complex second arguments reduce to other appropriate values. Well, `\f z. z` expects an `f` and a `z` as arguments, and hey we happen to have been given an `f` and a `z`, supplied by the consumer or user of the sequence `[c]` that we're building. So let's just give them to `\f z. z` as its arguments, and it gives us `z`. In other words, we make `SOMETHING` be `cs f z`.
+Let's walk through this. We supply this function with our `c` and `[]` as arguments. So its `d` gets bound to our `c`, and its `ds` gets bound to our `[]`, namely `\f z. z`. Then our result is a higher order function of the form `\f z. f c SOMETHING`. That looks good, what we're after is just this, except the `SOMETHING` should be `z`. If we just simply said `z` here, then our `cons` function would always be giving us back a singleton list of the form `\f z. f x z` for some `x`. That's what we want in this case, but not in the general case. What we'd like is to use the second argument we fed to `cons`, here `[]` or `\f z. z`, and have it reduce to `z`, with the hope that the same strategy would make more complex second arguments reduce to other appropriate values. Well, `\f z. z` expects an `f` and a `z` as arguments, and hey we happen to have an `f` and a `z` handy, since we're inside something of the form `\f z. f c SOMETHING`. The `f` and `z` will be supplied by the consumer or user of the sequence `[c]` that we're building. So let's just give *those* `f` and `z` to `\f z. z` as *its* arguments, and we end up with simply `z`. In other words, where `ds` is the second argument we fed to `cons`, we make `SOMETHING` be `ds f z`.
  
  Formally:
  
  
  Formally:
  
@@ -283,8 +284,8 @@ Formally:
  
  is:
  
  
  is:
  
-    (\c cs. \f z. f c (cs f z)) c (\f z. z) ~~>
-    (\cs. \f z. f c (cs f z)) (\f z. z) ~~>
+    (\d ds. \f z. f d (ds f z)) c (\f z. z) ~~>
+    (\ds. \f z. f c (ds f z)) (\f z. z) ~~>
      \f z. f c ((\f z. z) f z) ~~>
      \f z. f c ((\z. z) z) ~~>
      \f z. f c z
      \f z. f c ((\f z. z) f z) ~~>
      \f z. f c ((\z. z) z) ~~>
      \f z. f c z
@@ -297,13 +298,13 @@ Will this work when more complex values are supplied as the second argument to `
  
  is:
  
  
  is:
  
-    (\c cs. \f z. f c (cs f z)) b (\f z. f c z) ~~>
-    (\cs. \f z. f b (cs f z)) (\f z. f c z) ~~>
+    (\d ds. \f z. f d (ds f z)) b (\f z. f c z) ~~>
+    (\ds. \f z. f b (ds f z)) (\f z. f c z) ~~>
      \f z. f b ((\f z. f c z) f z) ~~>
      \f z. f b ((\z. f c z) z) ~~>
      \f z. f b (f c z)
  
      \f z. f b ((\f z. f c z) f z) ~~>
      \f z. f b ((\z. f c z) z) ~~>
      \f z. f b (f c z)
  
-which looks right. Persuade yourself that the `cons` we've defined here, together with the representation `\f z. z` for the empty list, always give us the correct encoding for complex lists, in terms of a function that takes an `f` and a `z` as arguments, and then returns the result of right-folding those values over the list elements in the appropriate order.
+which looks right. Persuade yourself that the `cons` we've defined here, together with the representation `\f z. z` for the empty list, always give us the correct encoding for complex lists, in terms of a function that takes an `f` and a `z` as arguments, and then returns the result of right-folding those arguments over the list elements in the appropriate order.
  
  You may notice that the encoding we're proposing for `[]` is the same encoding we proposed above for `false`. There's nothing deep to this. If we had adopted other conventions, we could easily have made it instead be `true` and `[]` that had the same encoding.
  
  
  You may notice that the encoding we're proposing for `[]` is the same encoding we proposed above for `false`. There's nothing deep to this. If we had adopted other conventions, we could easily have made it instead be `true` and `[]` that had the same encoding.
  
@@ -318,13 +319,13 @@ In our Lambda Calculus encoding, `fold_right (f, z) xs` gets translated to `xs f
  in Kapulet syntax, turns into this in our lambda evaluator:
  
      let empty = \f z. z in
  in Kapulet syntax, turns into this in our lambda evaluator:
  
      let empty = \f z. z in
-    let cons = \c cs. \f z. f c (cs f z) in
+    let cons = \d ds. \f z. f d (ds f z) in
      xs (\x zs. cons (g x) zs) empty
  
  Try out this:
  
      let empty = \f z. z in
      xs (\x zs. cons (g x) zs) empty
  
  Try out this:
  
      let empty = \f z. z in
-    let cons = \c cs. \f z. f c (cs f z) in
+    let cons = \d ds. \f z. f d (ds f z) in
      let map = \g xs. xs (\x zs. cons (g x) zs) empty in
      let abc = \f z. f a (f b (f c z)) in
      map g abc
      let map = \g xs. xs (\x zs. cons (g x) zs) empty in
      let abc = \f z. f a (f b (f c z)) in
      map g abc
@@ -333,11 +334,90 @@ That will evaluate to:
  
      \f z. f (g a) (f (g b) (f (g c) z))
  
  
      \f z. f (g a) (f (g b) (f (g c) z))
  
-which looks like what we want, a higher-order function that will take an `f` and a `z` as arguments and then return the right fold of those values over `[g a, g b, g c]`, which is `map g [a, b, c]`.
+which looks like what we want, a higher-order function that will take an `f` and a `z` as arguments and then return the right fold of those arguments over `[g a, g b, g c]`, which is `map g [a, b, c]`.
  
  
  ## Numbers ##
  
  
  
  ## Numbers ##
  
-It's noteworthy that when he was developing the lambda calculus, Church had not predicted it would be possible to encode the numbers. It came as a welcome surprise.
+Armed with the encoding we developed for lists above, Church's method for encoding numbers in the Lambda Calculus is very natural. But this is not the order that these ideas were historically developed.
+
+It's noteworthy that when he was developing the Lambda Calculus, Church had not predicted it would be possible to encode the numbers. It came as a welcome surprise.
+
+The idea here is that a number is something like a list, only without any slot for an interesting, possibly varying head element at each level. Whereas in a list like:
+
+    [a, b, c]
+
+we have:
+
+    head=a, sublist=(head=b, sublist=(head=c, sublist=[]))
+
+with the number 3 we instead only have:
+
+    head=(), subnumber=(head=(), subnumber=(head=(), subnumber=zero))
+
+where `()` is the zero-length tuple, the one and only member of its type. That is, the head here is a type of thing that cannot interestingly vary. Or we could just eliminate the head, and have:
+
+    subnumber=(subnumber=(subnumber=zero))
+
+probably more familiar to you as:
+
+    succ (succ (succ zero))
+
+In other words, where a list is inductively built up out of some interesting new datum and its "sublist" or tail, a number is inductively built up out of no interesting new datum and its "subnumber" or predecessor. In this light, a number can be seen as a kind of stunted or undernourished list.
+
+Now, we had a strategy for encoding the list `[a, b, c]` as:
+
+    \f z. f a (f b (f c z))
+
+So perhaps we can apply the same kind of idea to the number 3, and encode it as:
+
+    \f z. f   (f   (f   z))
+
+In fact, there's a way of looking at this that makes it look incredibly natural. You may have encountered the idea of a composition of two functions `f` and `g`, written as <code>f &cir; g</code>, or in Kapulet as `f comp g`. This is defined to be:
+
+    \x. f (g x)
+
+The composition of a function `f` with itself, namely:
+
+    \x. f (f x)
+
+is sometimes abbreviated as <code>f<sup>2</sup></code>. Similarly, <code>f &cir; f &cir; f</code> or `\x. f (f (f x))` is sometimes abbreviated as <code>f<sup>3</sup></code>. <code>f<sup>1</sup></code> is understood to be just `f` itself, and <code>f<sup>0</sup></code> is understood to be the identity function.
+
+So in proposing to encode the number 3 as:
+
+    \f z. f   (f   (f   z))
+
+we are proposing to encode it as:
+
+<code>\f z. f<sup>3</sup> z</code>
+
+(The <code>f<sup><em>n</em></sup></code> isn't some bit of new Lambda Calculus syntax. It's just our metalanguage abbreviation for the same expressions we had before. It does help focus our attention on what we're essentially doing here.)
+
+And indeed this is the Church encoding of the numbers:
+
+<code>0 &equiv; \f z. I z &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;;  or \f z. f<sup>0</sup> z</code>  
+<code>1 &equiv; \f z. f z &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;;  or \f z. f<sup>1</sup> z</code>  
+<code>2 &equiv; \f z. f (f z) &nbsp;&nbsp;&nbsp;&nbsp;;  or \f z. f<sup>2</sup> z</code>  
+<code>3 &equiv; \f z. f (f (f z)) ;  or \f z. f<sup>3</sup> z</code>  
+<code>...</code>
+
+The encoding for `0` can also be written as `\f z. z`, which we've also proposed as the encoding for `[]` and for `false`. Don't read too much into this.
+
+Given the above, can you figure out how to define the `succ` function? We already worked through the definition of `cons`, and this is just a simplification of that, so you should be able to do it. We'll make it a homework.
+
+We saw that using the `cons` operation and `[]` as arguments to `fold_right` over a list gave us back that very same list. In the Lambda Calculus, that would come down to:
+
+    [a, b, c] cons []
+
+with the appropriate lambda terms substituted throughout, would just reduce to the same lambda term we used to encode `[a, b, c]`. What do you think this would reduce to:
+
+    3 succ 0
+
+with the appropriate lambda terms substituted throughout? Since 3 is `\f z. f (f (f z))`, it should reduce to whatever:
+
+    succ (succ (succ 0))
+
+does. And if we define `succ` properly, we should expect that to give us the same lambda term that we used to encode `3` in the first place.
+
  
  
-*More coming...*
+We'll look at other encodings of lists and numbers later.