-there are no variables in Combiantory Logic, there is no need to worry
-about variable collision.
-
-Combinatory Logic is what you have when you choose a set of combinators and regulate their behavior with a set of reduction rules. As we said, the most common system uses S, K, and I as defined here.
-
-###The equivalence of the untyped lambda calculus and combinatory logic###
-
-We've claimed that Combinatory Logic is equivalent to the lambda
-calculus. If that's so, then S, K, and I must be enough to accomplish
-any computational task imaginable. Actually, S and K must suffice,
-since we've just seen that we can simulate I using only S and K. In
-order to get an intuition about what it takes to be Turing complete,
-recall our discussion of the lambda calculus in terms of a text editor.
-A text editor has the power to transform any arbitrary text into any other arbitrary text. The way it does this is by deleting, copying, and reordering characters. We've already seen that K deletes its second argument, so we have deletion covered. S duplicates and reorders, so we have some reason to hope that S and K are enough to define arbitrary functions.
-
-We've already established that the behavior of combinatory terms can
-be perfectly mimicked by lambda terms: just replace each combinator
-with its equivalent lambda term, i.e., replace I with `\x.x`, replace
-K with `\fxy.x`, and replace S with `\fgx.fx(gx)`. So the behavior of
-any combination of combinators in Combinatory Logic can be exactly
-reproduced by a lambda term.
-
-How about the other direction? Here is a method for converting an
-arbitrary lambda term into an equivalent Combinatory Logic term using
-only S, K, and I. Besides the intrinsic beauty of this mapping, and
-the importance of what it says about the nature of binding and
-computation, it is possible to hear an echo of computing with
-continuations in this conversion strategy (though you wouldn't be able
-to hear these echos until we've covered a considerable portion of the
-rest of the course). In addition, there is a direct linguistic
-appliction of this mapping in chapter 17 of Barker and Shan 2014,
-where it is used to establish a correpsondence between two natural
-language grammars, one of which is based on lambda-like abstraction,
-the other of which is based on Combinatory Logic like manipulations.
-
-Assume that for any lambda term T, [T] is the equivalent combinatory logic term. The we can define the [.] mapping as follows:
-
- 1. [a] a
- 2. [(M N)] ([M][N])
- 3. [\a.a] I
- 4. [\a.M] K[M] assumption: a does not occur free in M
- 5. [\a.(M N)] S[\a.M][\a.N]
- 6. [\a\b.M] [\a[\b.M]]
-
-It's easy to understand these rules based on what S, K and I do. The first rule says
-that variables are mapped to themselves.
-The second rule says that the way to translate an application is to translate the
-first element and the second element separately.
-The third rule should be obvious.
-The fourth rule should also be fairly self-evident: since what a lambda term such as `\x.y` does it throw away its first argument and return `y`, that's exactly what the combinatory logic translation should do. And indeed, `Ky` is a function that throws away its argument and returns `y`.
-The fifth rule deals with an abstract whose body is an application: the S combinator takes its next argument (which will fill the role of the original variable a) and copies it, feeding one copy to the translation of \a.M, and the other copy to the translation of \a.N. This ensures that any free occurrences of a inside M or N will end up taking on the appropriate value. Finally, the last rule says that if the body of an abstract is itself an abstract, translate the inner abstract first, and then do the outermost. (Since the translation of [\b.M] will not have any lambdas in it, we can be sure that we won't end up applying rule 6 again in an infinite loop.)
-
-[Fussy notes: if the original lambda term has free variables in it, so will the combinatory logic translation. Feel free to worry about this, though you should be confident that it makes sense. You should also convince yourself that if the original lambda term contains no free variables---i.e., is a combinator---then the translation will consist only of S, K, and I (plus parentheses). One other detail: this translation algorithm builds expressions that combine lambdas with combinators. For instance, the translation of our boolean false `\x.\y.y` is `[\x[\y.y]] = [\x.I] = KI`. In the intermediate stage, we have `\x.I`, which mixes combinators in the body of a lambda abstract. It's possible to avoid this if you want to, but it takes some careful thought. See, e.g., Barendregt 1984, page 156.]
-
-[Various, slightly differing translation schemes from combinatorial
-logic to the lambda calculus are also possible. These generate
-different metatheoretical correspondences between the two
-calculii. Consult Hindley and Seldin for details. Also, note that the
-combinatorial proof theory needs to be strengthened with axioms beyond
-anything we've here described in order to make [M] convertible with
-[N] whenever the original lambda-terms M and N are convertible. But
-then, we've been a bit cavalier about giving the full set of reduction
-rules for the lambda calculus in a similar way. For instance, one
-issue is whether reduction rules (in either the lambda calculus or
-Combinatory Logic) apply to embedded expressions. Generally, we want
-that to happen, but making it happen requires adding explicit axioms.]
-
-Let's check that the translation of the false boolean behaves as expected by feeding it two arbitrary arguments:
+there are no variables in Combinatory Logic, there is no need to worry
+about variables colliding when we substitute.
+
+Combinatory Logic is what you have when you choose a set of
+combinators and regulate their behavior with a set of reduction
+rules. As we said, the most common system uses `S`, `K`, and `I` as
+defined here.
+
+###The equivalence of the untyped Lambda Calculus and Combinatory Logic###
+
+We've claimed that Combinatory Logic is "equivalent to" the Lambda Calculus. If
+that's so, then `S`, `K`, and `I` must be enough to accomplish any computational task
+imaginable. Actually, `S` and `K` must suffice, since we've just seen that we can
+simulate `I` using only `S` and `K`. In order to get an intuition about what it
+takes to be Turing Complete, <!-- FIXME -->
+recall our discussion of the Lambda Calculus in
+terms of a text editor. A text editor has the power to transform any arbitrary
+text into any other arbitrary text.
+The way it does this is by deleting, copying, and reordering characters. We've
+already seen that `K` deletes its second argument, so we have deletion covered.
+`S` duplicates and reorders, so we have some reason to hope that `S` and `K` are
+enough to define arbitrary functions.
+
+We've already established that the behavior of combinatory terms can be
+perfectly mimicked by lambda terms: just replace each combinator with its
+equivalent lambda term, i.e., replace `I` with `\x. x`, replace `K` with `\x y. x`,
+and replace `S` with `\f g x. f x (g x)`. So the behavior of any combination of
+combinators in Combinatory Logic can be exactly reproduced by a lambda term.
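As a quick sanity check of these equivalences, here is a minimal sketch that writes the three lambda terms as curried Python functions. The use of Python and the test values are our own illustration, not part of the course materials.

```python
# Curried Python functions standing in for the lambda terms above.
I = lambda x: x                                # \x. x
K = lambda x: lambda y: x                      # \x y. x
S = lambda f: lambda g: lambda x: f(x)(g(x))   # \f g x. f x (g x)

# K keeps its first argument and deletes its second.
assert K("kept")("deleted") == "kept"

# S copies its third argument and hands one copy to each of the first two.
# With K in both slots, the result behaves like I: S K K x ~~> K x (K x) ~~> x
assert S(K)(K)("anything") == I("anything")
```

The last assertion is only a spot check on one argument, but it illustrates why `S` and `K` alone suffice to simulate `I`.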
+
+How about the other direction? Here is a method for converting an arbitrary
+lambda term into an equivalent Combinatory Logic term using only `S`, `K`, and `I`.
+Besides the intrinsic beauty of this mapping, and the importance of what it
+says about the nature of binding and computation, it is possible to hear an
+echo of computing with continuations in this conversion strategy (though you
+wouldn't be able to hear these echoes until we've covered a considerable portion
+of the rest of the course). In addition, there is a direct linguistic
+application of this mapping in chapter 17 of Barker and Shan 2014, where it is
+used to establish a correspondence between two natural language grammars, one
+of which is based on lambda-like abstraction, the other of which is based on
+Combinatory Logic-like manipulations.
+
+[WARNING: the mapping from the lambda calculus to Combinatory Logic
+has been changed since the class in which it was presented. It now
+matches the presentation in Barendregt. The revised version is
+cleaner, and more elegant. If you spent a lot of time working to
+understand the original version, there's good news and bad news. The
+bad news is that things have changed. The good news is that the new
+version describes the same mapping as before, but does it in a cleaner
+way. That is, the CL term that a given lambda term maps onto hasn't
+changed, only the details of how that CL term gets computed. Sorry if
+the changeup causes any distress!]
+
+In order to establish the correspondence, we need to get a bit more
+official about what counts as an expression in CL. We'll endow CL
+with an infinite stock of variable symbols, just like the lambda
+calculus, including `x`, `y`, and `z`. In addition, `S`, `K`, and `I`
+are expressions in CL. Finally, `(XY)` is in CL for any CL
+expressions `X` and `Y`. So examples of CL expressions include
+`x`, `(xy)`, `Sx`, `SK`, `(x(SK))`, `(K(IS))`, and so on. When we
+omit parentheses, the assumption will be left associativity, so that
+`XYZ == ((XY)Z)`.
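To make the syntax concrete, here is one possible way to represent CL expressions and print them using the left-associativity convention. It is only a sketch under our own assumptions: atoms (variables and the combinators `S`, `K`, `I`) are one-character strings, and an application `(XY)` is a Python pair `(X, Y)`. The function name `show` is hypothetical.

```python
def show(term):
    """Render a CL term, omitting parentheses that left associativity makes redundant."""
    if isinstance(term, str):
        return term                      # a variable or one of S, K, I
    function, argument = term            # the application (XY)
    argument_text = show(argument)
    if not isinstance(argument, str):
        # Only a compound argument needs its parentheses written out.
        argument_text = "(" + argument_text + ")"
    return show(function) + argument_text

print(show((("X", "Y"), "Z")))   # XYZ
print(show(("x", ("S", "K"))))   # x(SK)
print(show(("K", ("I", "S"))))   # K(IS)
```

Parentheses survive only around arguments that are themselves applications, which is exactly what the convention `XYZ == ((XY)Z)` predicts.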
+
+It may seem weird to allow variables in CL. This is necessary
+because we're trying to show that every lambda term can
+be translated into an equivalent CL term. Since some lambda terms
+contain free variables, we need to provide a translation for free
+variables. As you might expect, it will turn out that whenever the
+lambda term in question contains no free variables (i.e., is a
+combinator), its translation in CL will also contain no variables.
+
+Assume that for any lambda term T, [T] is the equivalent Combinatory
+Logic term. Then we can define the [.] mapping as follows.
+
+    1. [a]          a
+    2. [\aX]        @a[X]
+    3. [(XY)]       ([X][Y])
+
+    4. @aa          I
+    5. @aX          KX             if a is not in X
+    6. @a(XY)       S(@aX)(@aY)
+
+Think of `@aX` as a pseudo-lambda abstract.
+
+It's easy to understand these rules based on what `S`, `K` and `I` do.
+
+Rule (1) says that variables are mapped to themselves. If the original
+lambda expression had no free variables in it, then any such
+translations will only be temporary. The variable will later get
+eliminated by the application of other rules.
+
+Rule (2) says that the way to translate an abstract is to
+first translate the body (i.e., `[X]`), and then prefix a kind of
+temporary pseudo-lambda built from `@` and the original variable.
+
+Rule (3) says that the translation of an application of `X` to `Y` is
+the application of the translation of `X` to the translation of `Y`.
+
+As we'll see, the first three rules sweep through the lambda term,
+changing each lambda to an @.
+
+Rules (4) through (6) tell us how to eliminate all the `@`'s.
+
+In rule (4), if we have `@aa`, we need a CL expression that behaves
+like the lambda term `\aa`. Obviously, `I` is the right choice here.
+
+In rule (5), if we're binding into an expression that doesn't contain
+any variables that need binding, then we need a CL term that behaves
+the same as `\aX` would if `X` didn't contain `a` as a free variable.
+Well, how does `\aX` behave? When `\aX` occurs in the head position
+of a redex, then no matter what argument it occurs with, it throws
+away its argument and returns `X`. In other words, `\aX` is a
+constant function returning `X`, which is exactly the behavior
+we get by prefixing `K`.
+
+The easiest way to grasp rule (6) is to consider the following claim:
+
+ \a(XY) <~~> S(\aX)(\aY)
+
+To prove it to yourself, just substitute `(\xyz.xz(yz))` in for `S`
+and reduce.
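If you'd like a concrete spot check before doing the reduction by hand, here is a small sketch in Python. The functions `abs_X` and `abs_Y` are hypothetical stand-ins for `\aX` and `\aY`, chosen only for illustration. Agreement on a few inputs is of course not a proof.

```python
S = lambda f: lambda g: lambda x: f(x)(g(x))   # \f g x. f x (g x)

# Hypothetical stand-ins: each one builds its body from the bound variable a.
abs_X = lambda a: (lambda n: a + n)   # plays the role of \aX (X is itself a function)
abs_Y = lambda a: a * 2               # plays the role of \aY

# \a(XY): apply the body X to the body Y, both using the same value for a.
direct = lambda a: abs_X(a)(abs_Y(a))

# S(\aX)(\aY): S duplicates the argument and feeds one copy to each abstract.
via_S = S(abs_X)(abs_Y)

assert all(direct(v) == via_S(v) for v in range(5))
```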
+
+Persuade yourself that if the original lambda term contains no free
+variables --- i.e., is a combinator --- then the translation will
+consist only of `S`, `K`, and `I` (plus parentheses).
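Rules (1) through (6) can also be read as a short recursive program. What follows is a minimal sketch, not the official algorithm of any library. The term encoding is our own assumption: variables and combinators are strings, `("lam", a, body)` is a lambda abstract, and `("app", M, N)` is an application. The names `translate`, `abstract`, `occurs`, and `atoms` are hypothetical, and we assume no variable is named `S`, `K`, or `I`.

```python
def app(x, y):
    return ("app", x, y)

def occurs(a, term):
    """True if the variable a appears anywhere in the CL term."""
    if isinstance(term, str):
        return term == a
    return occurs(a, term[1]) or occurs(a, term[2])

def abstract(a, x):
    """Eliminate the pseudo-lambda @a from the CL term x (rules 4-6)."""
    if x == a:                                   # rule 4: @aa = I
        return "I"
    if not occurs(a, x):                         # rule 5: @aX = KX, a not in X
        return app("K", x)
    _, left, right = x                           # rule 6: @a(XY) = S(@aX)(@aY)
    return app(app("S", abstract(a, left)), abstract(a, right))

def translate(t):
    """Map a lambda term to a CL term (rules 1-3)."""
    if isinstance(t, str):                       # rule 1: [a] = a
        return t
    if t[0] == "lam":                            # rule 2: [\aX] = @a[X]
        _, a, body = t
        return abstract(a, translate(body))
    _, m, n = t                                  # rule 3: [(XY)] = ([X][Y])
    return app(translate(m), translate(n))

def atoms(term):
    """Collect every atom (variable or combinator) in a CL term."""
    if isinstance(term, str):
        return {term}
    return atoms(term[1]) | atoms(term[2])

false = ("lam", "t", ("lam", "f", "f"))          # \t\ff
true = ("lam", "t", ("lam", "f", "t"))           # \t\ft

assert translate(false) == app("K", "I")                         # KI
assert translate(true) == app(app("S", app("K", "K")), "I")      # S(KK)I
assert atoms(translate(true)) <= {"S", "K", "I"}
```

Note that `abstract` is always called on a body that has already been translated, so it never encounters a lambda. The final assertion checks, for one closed term, that the translation contains nothing but `S`, `K`, and `I`.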
+
+Various, slightly differing translation schemes from the Lambda Calculus to
+Combinatory Logic are also possible. These generate different metatheoretical
+correspondences between the two calculi. Consult Hindley and Seldin for
+details.
+
+Also, note that the combinatorial proof theory needs to be
+strengthened with axioms beyond anything we've described here in order to make
+[M] convertible with [N] whenever the original lambda-terms M and N are
+convertible. But then, we've been a bit cavalier about giving the full set of
+reduction rules for the Lambda Calculus in a similar way. <!-- FIXME -->
+
+For instance, one issue we mentioned in the notes on [[Reduction
+Strategies|week3_reduction_strategies]] is whether reduction rules (in
+either the Lambda Calculus or Combinatory Logic) apply to embedded
+expressions. Often, we do want that to happen, but making it happen
+requires adding explicit axioms.
+
+Let's see the translation rules in action. We'll start by translating
+the combinator we use to represent false:
+
+    [\t\ff]
+    == @t[\ff]       rule 2
+    == @t(@f[f])     rule 2
+    == @t(@ff)       rule 1
+    == @tI           rule 4
+    == KI            rule 5
+
+Let's check that the translation of the `false` boolean behaves as expected by feeding it two arbitrary arguments: