## Fine points concerning the lambda calculus ##

Hankin uses the symbol
<code><big><big>&rarr;</big></big></code> for one-step contraction,
and the symbol <code><big><big>&#8608;</big></big></code> for
zero-or-more step reduction. Hindley and Seldin use
<code><big><big><big>&#8883;</big></big></big><sub>1</sub></code> and
<code><big><big><big>&#8883;</big></big></big></code>.

As we said in the main notes, when M and N are such that there's some P that M
reduces to by zero or more steps, and that N also reduces to by zero or more
steps, then we say that M and N are **beta-convertible**. We write that like
this:

    M <~~> N

This is what plays the role of equality in the lambda calculus. Hankin
uses the symbol `=` for this. So too do Hindley and
Seldin. Personally, we keep confusing that with the relation to be
described next, so let's use the `<~~>` notation instead. Note that
`M <~~> N` doesn't mean that `M` and `N` are each reducible to the other.
For terms whose reduction paths all terminate, that holds only when `M`
and `N` are the same expression. (Or, with
our convention of only saying "reducible" for one or more reduction
steps, it never holds for them.)

In the metatheory, it's also sometimes useful to talk about formulas
that are syntactically equivalent *before any reductions take
place*. Hankin uses the symbol <code>&equiv;</code> for this. So too
do Hindley and Seldin. We'll use that too, and will avoid using `=`
when discussing the metatheory. Instead we'll use `<~~>` as we said
above. When we want to introduce a stipulative definition, we'll write
it out longhand, as in:

> T is defined to be `(M N)`.

or:

> Let T be `(M N)`.

We'll regard the following two expressions:

    (\x (x y))

    (\z (z y))

as syntactically equivalent, since they only involve a typographic
change of a bound variable. Read Hankin Section 2.3 for discussion of
different attitudes one can take about this.

Note that neither of the above expressions is identical to:

    (\x (x w))

because here it's a free variable that's been changed. Nor are they identical to:

    (\y (y y))

because here the second occurrence of `y` is no longer free.

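The comparisons above can be checked mechanically. What follows is only a minimal Python sketch, not anything taken from Hankin or from Hindley and Seldin. The tuple encoding of terms and the names `alpha_eq`, `var`, `lam` and `app` are assumptions made for illustration. It counts two terms as syntactically equivalent when they differ at most in the names of bound variables.

```python
# Hypothetical encoding (not from the text): a term is one of
#   ("var", name), ("lam", name, body), ("app", function, argument)

def var(n): return ("var", n)
def lam(n, b): return ("lam", n, b)
def app(f, a): return ("app", f, a)

def alpha_eq(s, t, env_s=(), env_t=()):
    """True if s and t differ at most in the names of bound variables."""
    if s[0] != t[0]:
        return False
    if s[0] == "var":
        # Position of the nearest enclosing binder for each name, if any.
        i = env_s.index(s[1]) if s[1] in env_s else None
        j = env_t.index(t[1]) if t[1] in env_t else None
        if i is None and j is None:
            return s[1] == t[1]   # both free: the names must match
        return i == j             # otherwise: bound by corresponding binders
    if s[0] == "lam":
        # Push each binder onto the front of its own environment.
        return alpha_eq(s[2], t[2], (s[1],) + env_s, (t[1],) + env_t)
    return (alpha_eq(s[1], t[1], env_s, env_t)
            and alpha_eq(s[2], t[2], env_s, env_t))

x_xy = lam("x", app(var("x"), var("y")))   # (\x (x y))
z_zy = lam("z", app(var("z"), var("y")))   # (\z (z y))
x_xw = lam("x", app(var("x"), var("w")))   # (\x (x w))
y_yy = lam("y", app(var("y"), var("y")))   # (\y (y y))

print(alpha_eq(x_xy, z_zy))   # True: only a bound variable was renamed
print(alpha_eq(x_xy, x_xw))   # False: a free variable was changed
print(alpha_eq(x_xy, y_yy))   # False: the second y is no longer free
```
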
There is plenty of discussion of this, and the fine points of how
substitution works, in Hankin and in various of the tutorials we'll
link to about the lambda calculus. We expect you have a good
intuitive understanding of what to do already, though, even if you're
not able to articulate it rigorously.


## Substitution and Alpha-Conversion ##

Intuitively, (a) and (b) express the application of the same function to the argument `y`:

<OL type=a>
<LI><code>(\x. \z. z x) y</code>
<LI><code>(\x. \y. y x) y</code>
</OL>

One can't just rename variables freely. (a) and (b) are different from what's expressed by:

<OL type=a start=3>
<LI><code>(\z. \z. z z) y</code>
</OL>


Substituting `y` into the body of (a) `(\x. \z. z x)` is unproblematic:

    (\x. \z. z x) y ~~> \z. z y

However, with (b) we have to be more careful. If we just substituted blindly,
then we might take the result to be `\y. y y`. But this is the self-application
function, not the function which accepts an arbitrary argument and applies that
argument to the free variable `y`. In fact, the self-application function is
what (c) reduces to. So if we took (b) to reduce to `\y. y y`, we'd wrongly be
counting (b) as equivalent to (c), instead of (a).

To reduce (b), then, we need to be careful that no free variables in what
we're substituting in get captured by binding &lambda;s that they shouldn't be
captured by.

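One common way to spell out that care is capture-avoiding substitution: before substituting under a &lambda;, rename its bound variable if it would capture a free variable of the argument. The Python below is a sketch under assumptions, not a definitive implementation. The tuple encoding of terms, the helper names `free_vars`, `fresh`, `subst` and `beta`, and the `v0`, `v1`, ... naming scheme for fresh variables are all invented for illustration.

```python
import itertools

# Hypothetical encoding (not from the text): a term is one of
#   ("var", name), ("lam", name, body), ("app", function, argument)

def free_vars(t):
    """Set of names occurring free in t."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def fresh(avoid):
    """A name not in `avoid`; the v0, v1, ... scheme is arbitrary."""
    return next(f"v{i}" for i in itertools.count() if f"v{i}" not in avoid)

def subst(t, x, s):
    """Capture-avoiding substitution of s for the free occurrences of x in t."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    _, y, body = t
    if y == x:
        return t                  # x is rebound here, so nothing below is free
    if y in free_vars(s):
        # Rename the binder first so it cannot capture a free variable of s.
        z = fresh(free_vars(s) | free_vars(body) | {x})
        body, y = subst(body, y, ("var", z)), z
    return ("lam", y, subst(body, x, s))

def beta(redex):
    """Contract a top-level redex: an application whose function is a lambda."""
    _, (_, x, body), arg = redex
    return subst(body, x, arg)

y = ("var", "y")
a = ("app", ("lam", "x", ("lam", "z", ("app", ("var", "z"), ("var", "x")))), y)
b = ("app", ("lam", "x", ("lam", "y", ("app", ("var", "y"), ("var", "x")))), y)
c = ("app", ("lam", "z", ("lam", "z", ("app", ("var", "z"), ("var", "z")))), y)

print(beta(a))   # ('lam', 'z', ('app', ('var', 'z'), ('var', 'y')))
print(beta(b))   # ('lam', 'v0', ('app', ('var', 'v0'), ('var', 'y')))
print(beta(c))   # ('lam', 'z', ('app', ('var', 'z'), ('var', 'z')))
```

In this sketch the results for (a) and (b) differ only in the name of the bound variable, while (c) yields the self-application function.
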
In practical terms, you'd just replace (b) with (a) and do the unproblematic substitution into (a).

How should we think about the explanation and justification for that practical procedure?

One way to think about things here is to identify expressions of the lambda
calculus with *particular alphabetic sequences*. Then (a) and (b) would be
distinct expressions, and we'd have to have an explicit rule permitting us to
do the kind of variable-renaming that takes us from (a) to (b) (or vice versa).
This kind of renaming is called "alpha-conversion." Look in the standard
treatments of the lambda calculus for detailed discussion of this.

Another way to think of it is to identify expressions not with particular
alphabetic sequences, but rather with *classes* of alphabetic sequences, which
stand to each other in the way that (a) and (b) do. That's the way we'll talk.
We say that (a) and (b) are just typographically different notations for a
*single* lambda formula. As we'll say, the lambda formula written with (a) and
the lambda formula written with (b) are literally syntactically identical.

A third way to think is to identify the lambda formula not with classes of
alphabetic sequences, but rather with abstract structures that we might draw
like this:

<pre><code>
    (&lambda;. &lambda;. _ _) y
     ^  ^  | |
     |  |__| |
     |_______|
</code></pre>

Here there are no bound variables, but *bound positions* remain. We can
regard formulas like (a) and (b) as just helpfully readable ways to designate
these abstract structures.

A version of this last approach is known as [de Bruijn notation](http://en.wikipedia.org/wiki/De_Bruijn_index) for the lambda calculus.

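To make the idea of bound positions concrete, here is a small Python sketch that replaces each bound variable with a number saying how many enclosing &lambda;s to skip to reach its binder. It is a simplification under stated assumptions: the tuple encoding and the name `to_de_bruijn` are invented for illustration, indices start at 0 (presentations that start at 1 also exist), and free variables simply keep their names.

```python
# Hypothetical encoding (not from the text): a term is one of
#   ("var", name), ("lam", name, body), ("app", function, argument)

def to_de_bruijn(t, binders=()):
    """Replace bound variable names with the distance to their binder."""
    if t[0] == "var":
        if t[1] in binders:
            # index() finds the nearest enclosing binder; 0 is the innermost.
            return ("bound", binders.index(t[1]))
        return ("free", t[1])
    if t[0] == "lam":
        # The binder's name is dropped; only its position is remembered.
        return ("lam", to_de_bruijn(t[2], (t[1],) + binders))
    return ("app", to_de_bruijn(t[1], binders), to_de_bruijn(t[2], binders))

y = ("var", "y")
a = ("app", ("lam", "x", ("lam", "z", ("app", ("var", "z"), ("var", "x")))), y)
b = ("app", ("lam", "x", ("lam", "y", ("app", ("var", "y"), ("var", "x")))), y)
c = ("app", ("lam", "z", ("lam", "z", ("app", ("var", "z"), ("var", "z")))), y)

print(to_de_bruijn(a) == to_de_bruijn(b))   # True: same abstract structure
print(to_de_bruijn(a) == to_de_bruijn(c))   # False: (c) binds differently
```
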
It doesn't seem to matter which of these approaches one takes; the logical
properties of the systems are exactly the same. It just affects the particulars
of how one states the rules for substitution, and so on. And whether one talks
about expressions being literally "syntactically identical," or whether one
instead counts them as "equivalent modulo alpha-conversion."

(Linguistic trivia: some linguistic discussions do suppose that
alphabetic variance has important linguistic consequences; see Ivan Sag's
dissertation.)

Next week, we'll discuss other systems that lack variables. Those systems will
not just lack variables in the sense that de Bruijn notation does; they will
furthermore lack any notion of a bound position.


## Review: syntactic equality, reduction, convertibility ##

Define N to be `(\x. x y) z`. Then N and `(\x. x y) z` are syntactically equal,
and we're counting them as syntactically equal to `(\z. z y) z` as well. We'll express
all these claims in our metalanguage as:

<pre><code>N &equiv; (\x. x y) z &equiv; (\z. z y) z
</code></pre>

This:

    N ~~> z y

means that N beta-reduces to `z y`. This:

    M <~~> N

means that M and N are beta-convertible, that is, that there's something they both reduce to in zero or more steps.

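These relations can be explored with a small reducer. The Python below is a rough sketch under assumptions, not a decision procedure: beta-convertibility is undecidable in general, so `convertible` only handles terms whose normal forms are found within a step limit, and it relies on the Church-Rosser property that two terms with normal forms are convertible exactly when those normal forms agree up to renaming of bound variables. The tuple encoding, the function names, and the `fuel` parameter are all invented for illustration.

```python
import itertools

# Hypothetical encoding (not from the text): a term is one of
#   ("var", name), ("lam", name, body), ("app", function, argument)

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def subst(t, x, s):
    """Capture-avoiding substitution of s for the free occurrences of x in t."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    _, y, body = t
    if y == x:
        return t
    if y in free_vars(s):
        # Rename the binder so it cannot capture a free variable of s.
        avoid = free_vars(s) | free_vars(body) | {x}
        z = next(f"v{i}" for i in itertools.count() if f"v{i}" not in avoid)
        body, y = subst(body, y, ("var", z)), z
    return ("lam", y, subst(body, x, s))

def step(t):
    """One leftmost-outermost beta step, or None if t is in normal form."""
    if t[0] == "app":
        f, a = t[1], t[2]
        if f[0] == "lam":
            return subst(f[2], f[1], a)
        r = step(f)
        if r is not None:
            return ("app", r, a)
        r = step(a)
        return None if r is None else ("app", f, r)
    if t[0] == "lam":
        r = step(t[2])
        return None if r is None else ("lam", t[1], r)
    return None

def normalize(t, fuel=1000):
    """Reduce until no redex remains, giving up after `fuel` steps."""
    for _ in range(fuel):
        r = step(t)
        if r is None:
            return t
        t = r
    raise RuntimeError("no normal form found within the step limit")

def canon(t, binders=()):
    """Name-free form of t, so that alpha-variants compare equal."""
    if t[0] == "var":
        return ("bound", binders.index(t[1])) if t[1] in binders else ("free", t[1])
    if t[0] == "lam":
        return ("lam", canon(t[2], (t[1],) + binders))
    return ("app", canon(t[1], binders), canon(t[2], binders))

def convertible(m, n, fuel=1000):
    return canon(normalize(m, fuel)) == canon(normalize(n, fuel))

z_y = ("app", ("var", "z"), ("var", "y"))
N = ("app", ("lam", "x", ("app", ("var", "x"), ("var", "y"))), ("var", "z"))
M = ("app", ("lam", "w", ("var", "w")), z_y)   # (\w. w) (z y)

print(normalize(N) == z_y)   # True: N ~~> z y
print(convertible(M, N))     # True: both reduce to z y
```
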
The symbols `~~>` and `<~~>` aren't part of what we're calling "the Lambda
Calculus". In our mouths, they're just part of our metatheory for talking about it. In uses of
the Lambda Calculus as a formal proof theory, one or the other of these
symbols (or some notational variant of them) is added to the object language.

See Hankin Sections 2.2 and 2.4 for the proof theory using `<~~>` (which he
writes as `=`).  He discusses the proof theory using `~~>` in his Chapter 3.
This material is covered in slightly different ways (different organization and
some differences in terminology and notation) in Chapters 1, 6, and 7 of
Hindley &amp; Seldin.