week1: functional ocaml turing complete after all

[lambda.git] / week2.mdwn
diff --git a/week2.mdwn b/week2.mdwn

index 4391dc6..b43576f 100644
--- a/week2.mdwn
+++ b/week2.mdwn
@@ -35,9 +35,10 @@ Another way to think of it is to identify expressions not with particular alphab
  
  A third way to think is to identify the lambda formula not with classes of alphabetic sequences, but rather with abstract structures that we might draw like this:
  
-<pre><code>&lambda; ... `___` ...
-^      |
-|`______`|
+<pre><code>
+       &lambda; ... ___ ...
+       ^      |
+       |______|
  </code></pre>
  
  Here there are no bound variables, but there are *bound positions*. We can regard formulas like (a) and (b) as just helpfully readable ways to designate these abstract structures.
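One way to make the "bound positions" idea concrete is sketched below. It swaps in de Bruijn indices, a standard technique that is not the notation drawn above: each bound occurrence is replaced by a number saying how many lambdas lie between it and its binder. The tuple encoding of terms and the function names are assumptions made for this illustration only.

```python
# Hypothetical term encoding for this sketch:
#   ("var", name) | ("lam", name, body) | ("app", function, argument)

def to_de_bruijn(term, bound=()):
    """Erase bound variable names, keeping only the position of each binder."""
    kind = term[0]
    if kind == "var":
        name = term[1]
        if name in bound:
            # index 0 is the innermost enclosing lambda
            return ("bound", bound.index(name))
        return ("free", name)
    if kind == "lam":
        return ("lam", to_de_bruijn(term[2], (term[1],) + bound))
    return ("app", to_de_bruijn(term[1], bound), to_de_bruijn(term[2], bound))

def alpha_equivalent(s, t):
    """Two terms are alpha-equivalent iff their nameless forms coincide."""
    return to_de_bruijn(s) == to_de_bruijn(t)

a = ("lam", "x", ("app", ("var", "x"), ("var", "y")))   # \x. x y
b = ("lam", "z", ("app", ("var", "z"), ("var", "y")))   # \z. z y
c = ("lam", "y", ("app", ("var", "y"), ("var", "y")))   # \y. y y

print(alpha_equivalent(a, b))  # True: same structure, different bound name
print(alpha_equivalent(a, c))  # False: the free y has been captured
```

In the nameless form the choice of bound variable has disappeared, so terms that differ only alphabetically become literally identical.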
@@ -62,8 +63,12 @@ Define T to be `(\x. x y) z`. Then T and `(\x. x y) z` are syntactically equal,
  equivalent to `(\z. z y) z` is that when a lambda binds a set of
  occurrences, it doesn't matter which variable serves to carry out the
  binding.  Either way, the function does the same thing and means the
-same thing.  Look in the standard treatments for discussions of alpha
-equivalence for more detail.]
+same thing.  
+Linguistic trivia: some linguistic discussions (notably Ivan Sag's dissertation) suppose that
+alphabetic variance has important linguistic consequences.
+Look in the standard treatments for discussions of alpha
+equivalence for more detail.  Also, as mentioned below, one of the intriguing 
+properties of Combinatory Logic is that alpha equivalence is not an issue.]
  
  This:
  
@@ -183,11 +188,17 @@ The second rule says that the way to translate an application is to translate th
  first element and the second element separately.
  The third rule should be obvious.
  The fourth rule should also be fairly self-evident: since what a lambda term such as `\x.y` does is throw away its first argument and return `y`, that's exactly what the combinatory logic translation should do.  And indeed, `Ky` is a function that throws away its argument and returns `y`.
-The fifth rule deals with an abstract whose body is an application: the S combinator takes its next argument (which will fill the role of the original variable a) and copies it, feeding one copy to the translation of \a.M, and the other copy to the translation of \a.N.  Finally, the last rule says that if the body of an abstract is itself an abstract, translate the inner abstract first, and then do the outermost.  (Since the translation of [\b.M] will not have any lambdas in it, we can be sure that we won't end up applying rule 6 again in an infinite loop.)
+The fifth rule deals with an abstract whose body is an application: the S combinator takes its next argument (which will fill the role of the original variable a) and copies it, feeding one copy to the translation of \a.M, and the other copy to the translation of \a.N.  This ensures that any free occurrences of a inside M or N will end up taking on the appropriate value.  Finally, the last rule says that if the body of an abstract is itself an abstract, translate the inner abstract first, and then do the outermost.  (Since the translation of [\b.M] will not have any lambdas in it, we can be sure that we won't end up applying rule 6 again in an infinite loop.)
+
+[Fussy notes: if the original lambda term has free variables in it, so will the combinatory logic translation.  Feel free to worry about this, though you should be confident that it makes sense.  You should also convince yourself that if the original lambda term contains no free variables---i.e., is a combinator---then the translation will consist only of S, K, and I (plus parentheses).  One other detail: this translation algorithm builds expressions that combine lambdas with combinators.  For instance, the translation of our boolean false `\x.\y.y` is `[\x[\y.y]] = [\x.I] = KI`.  In the intermediate stage, we have `\x.I`, which mixes combinators in the body of a lambda abstract.  It's possible to avoid this if you want to,  but it takes some careful thought.  See, e.g., Barendregt 1984, page 156.]  
+
+Let's check that the translation of the false boolean behaves as expected by feeding it two arbitrary arguments:
+
+    KIXY ~~> IY ~~> Y
  
-[Fussy notes: if the original lambda term has free variables in it, so will the combinatory logic translation.  Feel free to worry about this, though you should be confident that it makes sense.  You should also convince yourself that if the original lambda term contains no free variables---i.e., is a combinator---then the translation will consist only of S, K, and I (plus parentheses).  One other detail: this translation algorithm builds expressions that combine lambdas with combinators.  For instance, the translation of `\x.\y.y` is `[\x[\y.y]] = [\x.I] = KI`.  In that intermediate stage, we have `\x.I`.  It's possible to avoid this, but it takes some careful thought.  See, e.g., Barendregt 1984, page 156.]
+Throws away the first argument, returns the second argument---yep, it works.
  
-Here's an example of the translation:
+Here's a more elaborate example of the translation.  The goal is to establish that combinators can reverse order, so we use the T combinator, where `T = \x\y.yx`:
  
      [\x\y.yx] = [\x[\y.yx]] = [\x.S[\y.y][\y.x]] = [\x.(SI)(Kx)] = S[\x.SI][\x.Kx] = S(K(SI))(S[\x.K][\x.x]) = S(K(SI))(S(KK)I)
  
@@ -310,16 +321,29 @@ This question highlights that there are different choices to make about how eval
  
  With regard to Q3, it should be intuitively clear that `\x. M x` and `M` will behave the same with respect to any arguments they are given. It can also be proven that no other functions can behave differently with respect to them. However, the logical system you get when eta-reduction is added to the proof theory is importantly different from the one where only beta-reduction is permitted.
  
-MORE on extensionality
+If we answer Q2 by permitting reduction inside abstracts, and we also permit eta-reduction, then where none of <code>y<sub>1</sub>, ..., y<sub>n</sub></code> occur free in M, this:
  
-If we answer Q2 by permitting reduction inside abstracts, and we also permit eta-reduction, then where neither `y` nor `z` occur in M, this:
+<pre><code>\x y<sub>1</sub>... y<sub>n</sub>. M y<sub>1</sub>... y<sub>n</sub></code></pre>
  
-       \x y z. M y z
-
-will eta-reduce by two steps to:
+will eta-reduce by n steps to:
  
         \x. M
  
+The logical system you get when eta-reduction is added to the proof system has the following property:
+
+>      if `M`, `N` are normal forms with no free variables, then <code>M &equiv; N</code> iff `M` and `N` behave the same with respect to every possible sequence of arguments.
+
+That is, when `M` and `N` are (closed normal forms that are) syntactically distinct, there will always be some sequence of arguments <code>L<sub>1</sub>, ..., L<sub>n</sub></code> such that:
+
+<pre><code>M L<sub>1</sub> ... L<sub>n</sub> x y ~~> x
+N L<sub>1</sub> ... L<sub>n</sub> x y ~~> y
+</code></pre>
+
+In other words, closed normal forms that are not just beta-reduced but also fully eta-reduced will be syntactically different iff they yield different values for some arguments, that is, iff their extensions differ.
+
+So the proof theory with eta-reduction added is called "extensional," because its notion of normal form makes syntactic identity of closed normal forms coincide with extensional equivalence.
+
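The n-step eta-reduction described above can be checked mechanically. The following is a rough sketch under the same hypothetical encoding (atoms as strings, `("lam", x, body)`, `("app", f, a)`); the function names are invented for this illustration. It reduces inside abstracts and counts the eta steps it takes.

```python
def free_vars(t):
    if isinstance(t, str):
        return {t}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def eta_reduce(t):
    """Eta-reduce everywhere, innermost first; return (term, number_of_steps)."""
    if isinstance(t, str):
        return t, 0
    if t[0] == "app":
        f, n1 = eta_reduce(t[1])
        a, n2 = eta_reduce(t[2])
        return ("app", f, a), n1 + n2
    x = t[1]
    body, n = eta_reduce(t[2])
    # \x. M x  ~~>  M, provided x does not occur free in M
    if (not isinstance(body, str) and body[0] == "app"
            and body[2] == x and x not in free_vars(body[1])):
        return body[1], n + 1
    return ("lam", x, body), n

# \x y1 y2. M y1 y2, with M standing in for a term containing neither y1 nor y2
term = ("lam", "x", ("lam", "y1", ("lam", "y2",
        ("app", ("app", "M", "y1"), "y2"))))
print(eta_reduce(term))      # (('lam', 'x', 'M'), 2)

# \y. f y y is not an eta-redex, because y occurs free in (f y)
blocked = ("lam", "y", ("app", ("app", "f", "y"), "y"))
print(eta_reduce(blocked))   # unchanged, 0 steps
```

The second example shows why the side condition matters: without it, the reduction would change which occurrences the lambda binds.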
+
  The evaluation strategy which answers Q1 by saying "reduce arguments first" is known as **call-by-value**. The evaluation strategy which answers Q1 by saying "substitute arguments in unreduced" is known as **call-by-name** or **call-by-need** (the difference between these has to do with efficiency, not semantics).
  
  When one has a call-by-value strategy that also permits reduction to continue inside unapplied abstracts, that's known as "applicative order" reduction. When one has a call-by-name strategy that permits reduction inside abstracts, that's known as "normal order" reduction. Consider an expression of the form: