refine week2 notes part1

[lambda.git] / topics / _week2_lambda_calculus_fine_points.mdwn
diff --git a/topics/_week2_lambda_calculus_fine_points.mdwn b/topics/_week2_lambda_calculus_fine_points.mdwn

index 2176f35..1372e43 100644 (file)
--- a/topics/_week2_lambda_calculus_fine_points.mdwn
+++ b/topics/_week2_lambda_calculus_fine_points.mdwn
@@ -1,5 +1,4 @@
-Fine points concerning the lambda calculus
-==========================================
+## Fine points concerning the lambda calculus ##
  
  Hankin uses the symbol
  <code><big><big>&rarr;</big></big></code> for one-step contraction,
  
  Hankin uses the symbol
  <code><big><big>&rarr;</big></big></code> for one-step contraction,
@@ -8,17 +7,18 @@ zero-or-more step reduction. Hindley and Seldin use
  <code><big><big><big>&#8883;</big></big></big><sub>1</sub></code> and
  <code><big><big><big>&#8883;</big></big></big></code>.
  
  <code><big><big><big>&#8883;</big></big></big><sub>1</sub></code> and
  <code><big><big><big>&#8883;</big></big></big></code>.
  
-When M and N are such that there's some P that M reduces to by zero or
-more steps, and that N also reduces to by zero or more steps, then we
-say that M and N are **beta-convertible**. We'll write that like this:
+As we said in the main notes, when M and N are such that there's some P that M
+reduces to by zero or more steps, and that N also reduces to by zero or more
+steps, then we say that M and N are **beta-convertible**. We write that like
+this:
  
  
-       M <~~> N
+    M <~~> N
  
  This is what plays the role of equality in the lambda calculus. Hankin
  uses the symbol `=` for this. So too do Hindley and
  
  This is what plays the role of equality in the lambda calculus. Hankin
  uses the symbol `=` for this. So too do Hindley and
-Seldin. Personally, I keep confusing that with the relation to be
-described next, so let's use this notation instead. Note that `M <~~>
-N` doesn't mean that each of `M` and `N` are reducible to each other;
+Seldin. Personally, we keep confusing that with the relation to be
+described next, so let's use the `<~~>` notation instead. Note that
+`M <~~> N` doesn't mean that each of `M` and `N` are reducible to each other;
  that only holds when `M` and `N` are the same expression. (Or, with
  our convention of only saying "reducible" for one or more reduction
  steps, it never holds.)
  that only holds when `M` and `N` are the same expression. (Or, with
  our convention of only saying "reducible" for one or more reduction
  steps, it never holds.)
@@ -31,33 +31,147 @@ when discussing the metatheory. Instead we'll use `<~~>` as we said
  above. When we want to introduce a stipulative definition, we'll write
  it out longhand, as in:
  
  above. When we want to introduce a stipulative definition, we'll write
  it out longhand, as in:
  
->      T is defined to be `(M N)`.
+> T is defined to be `(M N)`.
+
+or:
+
+> Let T be `(M N)`.
  
  We'll regard the following two expressions:
  
  
  We'll regard the following two expressions:
  
-       (\x (x y))
+    (\x (x y))
  
  
-       (\z (z y))
+    (\z (z y))
  
  as syntactically equivalent, since they only involve a typographic
  
  as syntactically equivalent, since they only involve a typographic
-change of a bound variable. Read Hankin section 2.3 for discussion of
+change of a bound variable. Read Hankin Section 2.3 for discussion of
  different attitudes one can take about this.
  
  different attitudes one can take about this.
  
-Note that neither of those expressions are identical to:
+Note that neither of the above expressions are identical to:
  
  
-       (\x (x w))
+    (\x (x w))
  
  because here it's a free variable that's been changed. Nor are they identical to:
  
  
  because here it's a free variable that's been changed. Nor are they identical to:
  
-       (\y (y y))
+    (\y (y y))
  
  because here the second occurrence of `y` is no longer free.
  
  There is plenty of discussion of this, and the fine points of how
  
  because here the second occurrence of `y` is no longer free.
  
  There is plenty of discussion of this, and the fine points of how
-substitution works, in Hankin and in various of the tutorials we've
-linked to about the lambda calculus. We expect you have a good
+substitution works, in Hankin and in various of the tutorials we'll
+link to about the lambda calculus. We expect you have a good
  intuitive understanding of what to do already, though, even if you're
  not able to articulate it rigorously.
  
  intuitive understanding of what to do already, though, even if you're
  not able to articulate it rigorously.
  
-*      [More discussion in week 2 notes](/week2/#index1h1)
+
+## Substitution and Alpha-Conversion ##
+
+Intuitively, (a) and (b) express the application of the same function to the argument `y`:
+
+<OL type=a>
+<LI><code>(\x. \z. z x) y</code>
+<LI><code>(\x. \y. y x) y</code>
+</OL>
+
+One can't just rename variables freely. (a) and (b) are different than what's expressed by:
+
+<OL type=a start=3>
+<LI><code>(\z. (\z. z z) y</code>
+</OL>
+
+
+Substituting `y` into the body of (a) `(\x. \z. z x)` is unproblematic:
+
+    (\x. \z. z x) y ~~> \z. z y
+
+However, with (b) we have to be more careful. If we just substituted blindly,
+then we might take the result to be `\y. y y`. But this is the self-application
+function, not the function which accepts an arbitrary argument and applies that
+argument to the free variable `y`. In fact, the self-application function is
+what (c) reduces to. So if we took (b) to reduce to `\y. y y`, we'd wrongly be
+counting (b) to be equivalent to (c), instead of (a).
+
+To reduce (b), then, we need to be careful to that no free variables in what
+we're substituting in get captured by binding &lambda;s that they shouldn't be
+captured by.
+
+In practical terms, you'd just replace (b) with (a) and do the unproblematic substitution into (a).
+
+How should we think about the explanation and justification for that practical procedure?
+
+One way to think about things here is to identify expressions of the lambda
+calculus with *particular alphabetic sequences*. Then (a) and (b) would be
+distinct expressions, and we'd have to have an explicit rule permitting us to
+do the kind of variable-renaming that takes us from (a) to (b) (or vice versa).
+This kind of renaming is called "alpha-conversion." Look in the standard
+treatments of the lambda calculus for detailed discussion of this.
+
+Another way to think of it is to identify expressions not with particular
+alphabetic sequences, but rather with *classes* of alphabetic sequences, which
+stand to each other in the way that (a) and (b) do. That's the way we'll talk.
+We say that (a) and (b) are just typographically different notations for a
+*single* lambda formula. As we'll say, the lambda formula written with (a) and
+the lambda formula written with (b) are literally syntactically identical.
+
+A third way to think is to identify the lambda formula not with classes of
+alphabetic sequences, but rather with abstract structures that we might draw
+like this:
+
+<pre><code>
+    (&lambda;. &lambda;. _ _) y
+     ^  ^  | |
+     |  |__| |
+     |_______|
+</code></pre>
+
+Here there are no bound variables, but *bound positions* remain. We can
+regard formula like (a) and (b) as just helpfully readable ways to designate
+these abstract structures.
+
+A version of this last approach is known as [de Bruijn notation](http://en.wikipedia.org/wiki/De_Bruijn_index) for the lambda calculus.
+
+It doesn't seem to matter which of these approaches one takes; the logical
+properties of the systems are exactly the same. It just affects the particulars
+of how one states the rules for substitution, and so on. And whether one talks
+about expressions being literally "syntactically identical," or whether one
+instead counts them as "equivalent modulu alpha-conversion."
+
+(Linguistic trivia: some linguistic discussions do suppose that
+alphabetic variance has important linguistic consequences; see Ivan Sag's
+dissertation.)
+
+Next week, we'll discuss other systems that lack variables. Those systems will
+not just lack variables in the sense that de Bruijn notation does; they will
+furthermore lack any notion of a bound position.
+
+
+## Review: syntactic equality, reduction, convertibility ##
+
+Define N to be `(\x. x y) z`. Then N and `(\x. x y) z` are syntactically equal,
+and we're counting them as syntactically equal to `(\z. z y) z` as well. We'll express
+all these claims in our metalanguage as:
+
+<pre><code>N &equiv; (\x. x y) z &equiv; (\z. z y) z
+</code></pre>
+
+This:
+
+    N ~~> z y
+
+means that N beta-reduces to `z y`. This:
+
+    M <~~> N
+
+means that M and N are beta-convertible, that is, that there's something they both reduce to in zero or more steps.
+
+The symbols `~~>` and `<~~>` aren't part of what we're calling "the Lambda
+Calculus". In our mouths, they're just part of our metatheory for talking about it. In the uses of
+the Lambda Calculus as a formal proof theory, one or the other of these
+symbols (or some notational variant of them) is added to the object language.
+
+See Hankin Sections 2.2 and 2.4 for the proof theory using `<~~>` (which he
+writes as `=`).  He discusses the proof theory using `~~>` in his Chapter 3.
+This material is covered in slightly different ways (different organization and
+some differences in terminology and notation) in Chapters 1, 6, and 7 of the
+Hindley &amp; Seldin.