topics/_week5_simply_typed_lambda.mdwn

   1 [[!toc]]
   2
   3 ##The simply-typed lambda calculus##
   4
   5 The untyped lambda calculus is pure.  Pure in many ways: nothing but
   6 variables and lambdas, with no constants or other special symbols;
   7 also, every term is a function, without any types.  As we'll see eventually, pure
   8 also in the sense of having no side effects, no mutation, just pure
   9 computation.
  10
  11 But we live in an impure world.  It is much more common for practical
  12 programming languages to be typed, either implicitly or explicitly.
  13 Likewise, systems used to investigate philosophical or linguistic
  14 issues are almost always typed.  Types will help us reason about our
  15 computations.  They will also facilitate a connection between logic
  16 and computation.
  17
  18 From a linguistic perspective, types are generalizations of (parts of)
  19 programs.  To make this comment more concrete: types are to (e.g.,
  20 lambda) terms as syntactic categories are to expressions of natural
  21 language.  Just as it makes sense to gather a class of expressions
  22 together into a set of Nouns, or Verbs, it may also make sense to
  23 gather classes of terms into a set labelled with some computational type.
  24
  25 To develop this analogy just a bit further, syntactic categories
  26 determine which expressions can combine with which other expressions.
  27 If a word is a member of the category of prepositions, it had better
  28 not try to combine (merge) with an expression in the category of, say,
  29 an auxiliary verb, since \**under has* is not a well-formed constituent
  30 in English.  Likewise, types in formal languages will determine which
  31 expressions can be sensibly combined.
  32
  33 Now, of course it is common linguistic practice to supply an analysis
  34 of natural language both with syntactic categories and with semantic
  35 types.  And there is a large degree of overlap between these type
  36 systems.  However, there are mismatches in both directions: there are
  37 syntactic distinctions that do not correspond to any salient semantic
  38 difference (why can't adjectives behave syntactically like verb
  39 phrases, since they both denote properties with (extensional) type
  40 `<e,t>`?); and in some analyses there are semantic differences that do
  41 not correspond to any salient syntactic distinctions (as in any
  42 analysis that involves silent type-shifters, such as Herman Hendriks'
  43 theory of quantifier scope, in which expressions change their semantic
  44 type without any effect on the expressions they can combine
  45 with syntactically).  We will consider again the relationship between
  46 syntactic types and semantic types later in the course.
  47
  48 Soon we will consider polymorphic type systems.  First, however, we
  49 will consider the simply-typed lambda calculus.
  50
  51 [Pedantic on.  Why "*simply* typed"?  Well, the type system is
  52 particularly simple.  As mentioned to us by Koji Mineshima, Church
  53 tells us that "The simple theory of types was suggested as a
  54 modification of Russell's ramified theory of types by Leon Chwistek in
  55 1921 and 1922 and by F. P. Ramsey in 1926."  This footnote appears in
  56 Church's 1940 paper [A formulation of the simple theory of
  57 types](church-simple-types.pdf).  In this paper, Church writes types
  58 by simple apposition, without the ugly angle brackets and commas used
  59 by Montague.  Furthermore, he omits parentheses under the convention
  60 that types associate to the *left*---the opposite of the modern
  61 convention.  This is ok, however, because he also reverses the order,
  62 so that `te` is a function from objects of type `e` to objects of type
  63 `t`.  Cool paper!  If you ever want to see Church numerals in their
  64 native setting--but we're getting ahead of our story.  Pedantic off.]
  65
  66 There's good news and bad news: the good news is that the simply-typed
  67 lambda calculus is strongly normalizing: every reduction sequence ends in a normal form.
  68 We shall see that self-application is outlawed, so &Omega; can't even
  69 be written, let alone undergo reduction.  The bad news is that
  70 fixed-point combinators are also forbidden, so recursion is neither
  71 simple nor direct.
  72
  73 #Types#
  74
  75 We will have at least one ground type.  For the sake of linguistic
  76 familiarity, we'll use `e`, the type of individuals, and `t`, the type
  77 of truth values.
  78
  79 In addition, there will be complex types.  The full set of types,
  80 `T`, is defined recursively as the smallest set such that
  81
  82 *    ground types, including `e` and `t`, are in `T`
  83
  84 *    for any types &sigma; and &tau; in `T`, the type &sigma; ->
  85      &tau; is in `T`.
  86
  87 For instance, here are some types in `T`:
  88
  89      e
  90      e -> t
  91      e -> e -> t
  92      (e -> t) -> t
  93      (e -> t) -> e -> t
  94
  95 and so on.
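
To make the recursive definition concrete, here is a minimal sketch in Python, used here only for illustration. The names `Ground`, `Arrow`, and `show` are made up for this example. A type is either a ground type or an arrow built from two types, and the printer omits exactly the parentheses that the right-associative convention makes redundant.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ground:
    name: str            # a ground type such as "e" or "t"

@dataclass(frozen=True)
class Arrow:
    dom: object          # argument type (sigma)
    cod: object          # result type (tau)

def show(ty):
    """Render a type, parenthesizing only arrows that sit to the left of an arrow."""
    if isinstance(ty, Ground):
        return ty.name
    left = show(ty.dom)
    if isinstance(ty.dom, Arrow):
        left = "(" + left + ")"
    return left + " -> " + show(ty.cod)

e, t = Ground("e"), Ground("t")
examples = [
    e,
    Arrow(e, t),
    Arrow(e, Arrow(e, t)),
    Arrow(Arrow(e, t), t),
    Arrow(Arrow(e, t), Arrow(e, t)),
]
for ty in examples:
    print(show(ty))
```

The five values in `examples` are meant to mirror the five sample types listed above, in the same order.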
  96
  97 #Typed lambda terms#
  98
  99 Given a set of types `T`, we define the set of typed lambda terms <code>&Lambda;_T</code>,
 100 which is the smallest set such that
 101
 102 *    each type &tau; has an infinite set of distinct variables, {x^&tau;}_1,
 103      {x^&tau;}_2, {x^&tau;}_3, ..., each of which is a term of type &tau;
 104
 105 *    If a term `M` has type &sigma; -> &tau;, and a term `N` has type
 106      &sigma;, then the application `(M N)` has type &tau;.
 107
 108 *    If a variable `a` has type &sigma;, and term `M` has type &tau;,
 109      then the abstract <code>&lambda; a M</code> has type &sigma; -> &tau;.
 110
 111 The definitions of types and of typed terms should be highly familiar
 112 to semanticists, except that instead of writing &sigma; -> &tau;,
 113 linguists write &lt;&sigma;, &tau;&gt;.  We will use the arrow notation,
 114 since it is more iconic.
 115
 116 Some examples (assume that `o` is a ground type and that `x` has type `o`):
 117
 118       x            o
 119       \x.x         o -> o
 120       ((\x.x) x)   o
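
The three clauses of the definition amount to a recipe for computing the type of a term. Here is a hedged sketch in Python of what such a recipe could look like. The class names and the function `type_of` are hypothetical, and we assume that each variable carries its type with it, in the spirit of the superscript notation above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ground:
    name: str

@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object

@dataclass(frozen=True)
class Var:
    name: str
    ty: object           # the type this variable is drawn from

@dataclass(frozen=True)
class App:
    fn: object
    arg: object

@dataclass(frozen=True)
class Lam:
    var: Var
    body: object

def type_of(term):
    """Return the type of a term, or raise TypeError if no clause applies."""
    if isinstance(term, Var):
        return term.ty
    if isinstance(term, Lam):
        # abstraction: variable of type sigma, body of type tau => sigma -> tau
        return Arrow(term.var.ty, type_of(term.body))
    if isinstance(term, App):
        # application: function sigma -> tau applied to a sigma yields a tau
        fn_ty, arg_ty = type_of(term.fn), type_of(term.arg)
        if isinstance(fn_ty, Arrow) and fn_ty.dom == arg_ty:
            return fn_ty.cod
        raise TypeError(f"cannot apply {fn_ty} to {arg_ty}")
    raise TypeError("not a term")

o = Ground("o")
x = Var("x", o)
identity = Lam(x, x)

print(type_of(x))
print(type_of(identity))
print(type_of(App(identity, x)))
```

Under these assumptions, asking for `type_of(App(x, x))` raises `TypeError`, since `x` has a ground type and so cannot be applied to anything.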
 121
 122 Exercise: write down terms that have the following types:
 123
 124                    o -> o -> o
 125                    (o -> o) -> o -> o
 126                    (o -> o -> o) -> o
 127
 128 #A first glimpse of the connection between types and logic#
 129
 130 In the simply-typed lambda calculus, we write types like <code>&sigma;
 131 -> &tau;</code>.  This looks like logical implication.  We'll take
 132 that resemblance seriously when we discuss the Curry-Howard
 133 correspondence.  In the meantime, note that types respect modus
 134 ponens:
 135
 136 <pre>
 137 Expression    Type      Implication
 138 -----------------------------------
 139 fn            &alpha; -> &beta;    &alpha; &sup; &beta;
 140 arg           &alpha;         &alpha;
 141 ------        ------    --------
 142 (fn arg)      &beta;         &beta;
 143 </pre>
 144
 145 The inference in the right-hand column is modus ponens, of course.
 146
 147
 148 #Associativity of types versus terms#
 149
 150 As we have seen many times, in the lambda calculus, function
 151 application is left associative, so that `f x y z == (((f x) y) z)`.
 152 Types, *THEREFORE*, are right associative: if `x`, `y`, and `z`
 153 have types `a`, `b`, and `c`, respectively, then `f` has type
 154 `a -> b -> c -> d == (a -> (b -> (c -> d)))`, where `d` is the
 155 type of the complete term.
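
A small Python sketch may help show why the two conventions go together. The function `f` and its string arguments are invented for the example. Each application peels one arrow off the front of the type, so the remainder of the type has to group to the right.

```python
def f(x):
    """A curried function: takes x, then returns a function awaiting y, then z."""
    return lambda y: lambda z: f"{x}{y}{z}"

g = f("a")    # f x          has type  b -> (c -> d)
h = g("b")    # (f x) y      has type  c -> d
r = h("c")    # ((f x) y) z  has type  d

# Applying one argument at a time is the same as the left-nested application.
print(r == f("a")("b")("c"))
```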
 156
 157 It is a serious faux pas to associate to the left for types.  You may
 158 as well use your salad fork to stir your tea.
 159
 160 #The simply-typed lambda calculus is strongly normalizing#
 161
 162 If `M` is a term with type &tau; in &Lambda;_T, then `M` has a
 163 normal form, and every reduction sequence starting from `M` ends in
 164 one.  The proof is not particularly complex, but we will not present it here; see Barendregt or Hankin.
 165
 166 Since &Omega; does not have a normal form, it follows that &Omega;
 167 cannot have a type in &Lambda;_T.  We can easily see why:
 168
 169 <code>&Omega; = (\x.xx)(\x.xx)</code>
 170
 171 Assume &Omega; has type &tau;, and `\x.xx` has type &sigma;.  Then
 172 because `\x.xx` takes an argument of type &sigma; and returns
 173 something of type &tau;, `\x.xx` must also have type &sigma; ->
 174 &tau;.  By repeating this reasoning, `\x.xx` must also have type
 175 (&sigma; -> &tau;) -> &tau;; and so on.  Since variables have
 176 finite types, there is no way to choose a type for the variable `x`
 177 that can satisfy all of the requirements imposed on it.
 178
 179 In fact, we can't even type the parts of &Omega;, that is,
 180 <code>&omega; &equiv; \x.xx</code>.  In general, there is no way for a function to have a
 181 type that can take itself for an argument.
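
The argument above can be phrased as a failed attempt to solve a type equation. If `x` has type &sigma;, then the subterm `x x` demands that &sigma; be the same type as &sigma; -> &tau;. The following Python sketch uses unification with an occurs check, a standard technique that the text above does not spell out. The representation is an assumption made for this example: type variables are strings, and arrows are tuples of the form `("->", dom, cod)`.

```python
def resolve(ty, subst):
    """Follow bindings until we reach an unbound variable or an arrow."""
    while isinstance(ty, str) and ty in subst:
        ty = subst[ty]
    return ty

def occurs(var, ty, subst):
    """Does the type variable `var` occur anywhere inside `ty`?"""
    ty = resolve(ty, subst)
    if isinstance(ty, str):
        return ty == var
    _, dom, cod = ty
    return occurs(var, dom, subst) or occurs(var, cod, subst)

def unify(a, b, subst):
    """Extend `subst` so that a and b become equal, or raise TypeError."""
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return subst
    if isinstance(a, str):
        if occurs(a, b, subst):
            # a finite type cannot contain itself as a proper part
            raise TypeError(f"occurs check: {a} would contain itself")
        return {**subst, a: b}
    if isinstance(b, str):
        return unify(b, a, subst)
    # both are arrows: unify domains, then codomains
    return unify(a[2], b[2], unify(a[1], b[1], subst))

# x : sigma, and `x x` requires sigma to equal sigma -> tau
try:
    unify("sigma", ("->", "sigma", "tau"), {})
    typable = True
except TypeError:
    typable = False
print(typable)
```

By contrast, an unrelated variable unifies with an arrow without complaint: `unify("rho", ("->", "sigma", "tau"), {})` simply binds `rho`.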
 182
 183 It follows that there is no way to define the identity function in
 184 such a way that it can take itself as an argument.  Instead, there
 185 must be many different identity functions, one for each type.  Some of
 186 those types can be functions, and some of those functions can be
 187 (type-restricted) identity functions; but a simply-typed identity
 188 function can never apply to itself.
 189
 190 #Typing numerals#
 191
 192 The Church numerals are well behaved with respect to types.
 193 To see this, consider the first three Church numerals (starting with zero):
 194
 195     \s z . z
 196     \s z . s z
 197     \s z . s (s z)
 198
 199 Given the internal structure of the term we are using to represent
 200 zero, its type must have the form &rho; -> &sigma; -> &sigma; for
 201 some &rho; and &sigma;.  This type is consistent with the term for one,
 202 but the structure of the definition of one is more restrictive:
 203 because the first argument (`s`) must apply to the second argument
 204 (`z`), the type of the first argument must describe a function from
 205 expressions of type &sigma; to some result type.  So we can refine
 206 &rho; by replacing it with the more specific type &sigma; -> &tau;.
 207 At this point, the overall type is (&sigma; -> &tau;) -> &sigma; ->
 208 &sigma;.  Note that this refined type remains compatible with the
 209 definition of zero.  But the body of one, `s z`, has type &tau;, while
 210 the overall type promises a result of type &sigma;.  Likewise, in the
 211 definition of two, the result of applying `s` to `z` serves as the
 212 argument of `s`, so expressions of type &tau; must be suitable arguments
 213 for functions of type &sigma; -> &tau;.  Both requirements are met
 214 only if &tau; &equiv; &sigma;.  So at this point, we have the overall
 215 type of (&sigma; -> &sigma;) -> &sigma; -> &sigma;.
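
The step-by-step refinement just described can be mechanized. What follows is a sketch, not a definitive implementation, of type inference by unification in Python. All the names are invented for this example: terms are variables (strings), `("lam", x, body)`, or `("app", fn, arg)`, and types are type variables (strings) or `("->", dom, cod)`. It computes the most general type of each numeral taken on its own.

```python
import itertools

fresh = (f"t{i}" for i in itertools.count())   # endless supply of type variables

def resolve(ty, subst):
    while isinstance(ty, str) and ty in subst:
        ty = subst[ty]
    return ty

def occurs(var, ty, subst):
    ty = resolve(ty, subst)
    if isinstance(ty, str):
        return ty == var
    return occurs(var, ty[1], subst) or occurs(var, ty[2], subst)

def unify(a, b, subst):
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return subst
    if isinstance(a, str):
        if occurs(a, b, subst):
            raise TypeError("occurs check failed")
        return {**subst, a: b}
    if isinstance(b, str):
        return unify(b, a, subst)
    return unify(a[2], b[2], unify(a[1], b[1], subst))

def infer(term, env, subst):
    """Return (type, substitution) for `term`; `env` maps variable names to types."""
    if isinstance(term, str):                     # variable
        return env[term], subst
    if term[0] == "lam":                          # abstraction
        _, var, body = term
        var_ty = next(fresh)
        body_ty, subst = infer(body, {**env, var: var_ty}, subst)
        return ("->", var_ty, body_ty), subst
    _, fn, arg = term                             # application
    fn_ty, subst = infer(fn, env, subst)
    arg_ty, subst = infer(arg, env, subst)
    result_ty = next(fresh)
    return result_ty, unify(fn_ty, ("->", arg_ty, result_ty), subst)

def show(ty, subst, names):
    """Render a type, naming type variables a, b, c, ... in order of appearance."""
    ty = resolve(ty, subst)
    if isinstance(ty, str):
        if ty not in names:
            names[ty] = "abcdefgh"[len(names)]
        return names[ty]
    left = show(ty[1], subst, names)
    if not isinstance(resolve(ty[1], subst), str):
        left = f"({left})"
    return f"{left} -> {show(ty[2], subst, names)}"

def principal(term):
    ty, subst = infer(term, {}, {})
    return show(ty, subst, {})

zero = ("lam", "s", ("lam", "z", "z"))
one = ("lam", "s", ("lam", "z", ("app", "s", "z")))
two = ("lam", "s", ("lam", "z", ("app", "s", ("app", "s", "z"))))

for numeral in (zero, one, two):
    print(principal(numeral))
```

Each numeral places more constraints on its type than the one before, and only the type inferred for two is one that all three numerals can share.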
 216
 217 <!-- Make sure there is talk about unification and computation of the
 218 principle type-->
 219
 220 ## Predecessor and lists are not representable in simply typed lambda-calculus ##
 221
 222
 223 This is not because there is any difficulty typing what the functions
 224 involved do "from the outside": for instance, the predecessor function
 225 is a function from numbers to numbers, or &tau; -> &tau;, where &tau;
 226 is our type for Church numbers (i.e., (&sigma; -> &sigma;) -> &sigma;
 227 -> &sigma;).  (Though this type will only be correct if we decide that
 228 the predecessor of zero should be a number, perhaps zero.)
 229
 230 Rather, the problem is that the definition of the function requires
 231 subterms that can't be simply-typed.  We'll illustrate with our
 232 implementation of the predecessor function, based on the discussion in
 233 Pierce 2002:547:
 234
 235     let zero = \s z. z in
 236     let fst = \x y. x in
 237     let snd = \x y. y in
 238     let pair = \x y . \f . f x y in
 239     let succ = \n s z. s (n s z) in
 240     let shift = \p. pair (succ (p fst)) (p fst) in
 241     let pred = \n. n shift (pair zero zero) snd in
 242
 243 Note that `shift` takes a pair `p` as argument, but makes use of only
 244 the first element of the pair.  Why does it do that?  In order to
 245 understand what this code is doing, it is helpful to go through a
 246 sample computation, the predecessor of 3:
 247
 248     pred 3
 249     3 shift (pair zero zero) snd
 250     (\s z.s(s(s z))) shift (pair zero zero) snd
 251     shift (shift (shift (\f.f 0 0))) snd
 252     shift (shift (pair (succ ((\f.f 0 0) fst)) ((\f.f 0 0) fst))) snd
 253     shift (shift (\f.f 1 0)) snd
 254     shift (\f. f 2 1) snd
 255     (\f. f 3 2) snd
 256     snd 3 2
 257     2
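
Since Python is untyped in the relevant sense, the definitions above can be transcribed directly and run. This is only a sketch: the helper `to_int`, which converts a Church numeral to an ordinary integer so that we can inspect the result, is our own addition.

```python
zero  = lambda s: lambda z: z
fst   = lambda x: lambda y: x
snd   = lambda x: lambda y: y
pair  = lambda x: lambda y: lambda f: f(x)(y)
succ  = lambda n: lambda s: lambda z: s(n(s)(z))
shift = lambda p: pair(succ(p(fst)))(p(fst))
pred  = lambda n: n(shift)(pair(zero)(zero))(snd)

def to_int(n):
    """Convert a Church numeral to an int by counting applications of s."""
    return n(lambda k: k + 1)(0)

three = succ(succ(succ(zero)))
print(to_int(pred(three)))
print(to_int(pred(zero)))
```

Note that on this definition the predecessor of zero comes out as zero, because applying `zero` to `shift` leaves the initial pair untouched.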
 258
 259 At each stage after the first, `shift` sees an ordered pair whose first
 260 element is the successor of its second.  It can safely discard the second
 261 element without losing any information.  The reason we carry around
 262 the second element at all is that when it comes time to complete the
 263 computation---that is, when we finally apply the top-level ordered
 264 pair to `snd`---it's the second element of the pair that will serve as
 265 the final result.
 266
 267 Let's see how far we can get typing these terms.  `zero` is the Church
 268 encoding of zero.  Using `N` as the type for Church numbers (i.e.,
 269 <code>N &equiv; (&sigma; -> &sigma;) -> &sigma; -> &sigma;</code> for
 270 some &sigma;), `zero` has type `N`.  `snd` takes two numbers, and
 271 returns the second, so `snd` has type `N -> N -> N`.  Then the type of
 272 `pair` is `N -> N -> (type(snd)) -> N`, that is, `N -> N -> (N -> N ->
 273 N) -> N`.  Likewise, `succ` has type `N -> N`, and `shift` has type
 274 `pair -> pair`, where `pair` is the type of an ordered pair of
 275 numbers, namely, <code>pair &equiv; (N -> N -> N) -> N</code>.  So far
 276 so good.
 277
 278 The problem is the way in which `pred` puts these parts together.  In
 279 particular, `pred` applies its argument, the number `n`, to the
 280 `shift` function.  Since `n` is a number, its type is <code>(&sigma;
 281 -> &sigma;) -> &sigma; -> &sigma;</code>.  This means that the type of
 282 `shift` has to match <code>&sigma; -> &sigma;</code>. But we
 283 concluded above that the type of `shift` also had to be `pair ->
 284 pair`.  Putting these constraints together, it appears that
 285 <code>&sigma;</code> must be the type of a pair of numbers.  But we
 286 already decided that the type of a pair of numbers is `(N -> N -> N)
 287 -> N`.  Here's the difficulty: `N` is shorthand for a type involving
 288 <code>&sigma;</code>.  If <code>&sigma;</code> turns out to depend on
 289 `N`, and `N` depends in turn on <code>&sigma;</code>, then
 290 <code>&sigma;</code> must contain itself as a proper part, which is
 291 impossible for the finite types of the simply-typed lambda calculus.
 292
 293 The way we got here is that the `pred` function relies on the built-in
 294 right-fold structure of the Church numbers to recursively walk down
 295 the spine of its argument.  In order to do that, the argument had to
 296 apply to the `shift` operation.  And since `shift` had to be the
 297 sort of operation that manipulates numbers, the infinite regress is
 298 established.
 299
 300 Now, of course, this is only one of myriad possible implementations of
 301 the predecessor function in the lambda calculus.  Could one of them
 302 possibly be simply-typeable?  It turns out that this can't be done.
 303 See Oleg Kiselyov's discussion and works cited there for details:
 304 [[predecessor and lists can't be represented in the simply-typed
 305 lambda
 306 calculus|http://okmij.org/ftp/Computation/lambda-calc.html#predecessor]].
 307
 308 Because lists are (in effect) a generalization of the Church numbers,
 309 computing the tail of a list is likewise beyond the reach of the
 310 simply-typed lambda calculus.
 311
 312 This result is not obvious, to say the least.  It illustrates how
 313 recursion is built into the structure of the Church numbers (and
 314 lists).  Most importantly for the discussion of the simply-typed
 315 lambda calculus, it demonstrates that even fairly basic recursive
 316 computations are beyond the reach of a simply-typed system.
 317
 318
 319 ## Montague grammar is based on a simply-typed lambda calculus
 320
 321 Systems based on the simply-typed lambda calculus are the bread and
 322 butter of current linguistic semantic analysis.  One of the most
 323 influential modern semantic formalisms---Montague's PTQ
 324 fragment---included a simply-typed version of the Predicate Calculus
 325 with lambda abstraction.
 326
 327 Montague called the semantic part of his PTQ fragment *Intensional
 328 Logic*.  Without getting too fussy about details, we'll present the
 329 popular Ty2 version of the PTQ types, roughly as proposed by Gallin
 330 (1975).  [See Zimmermann, Ede. 1989. Intensional logic and two-sorted
 331 type theory.  *Journal of Symbolic Logic* ***54.1***: 65--77 for a
 332 precise characterization of the correspondence between IL and
 333 two-sorted Ty2.]
 334
 335 We'll need three base types: `e`, for individuals, `t`, for truth
 336 values, and `s` for evaluation indices (world-time pairs).  The set
 337 of types is defined recursively:
 338
 339     the base types e, t, and s are types
 340     if a and b are types, <a,b> is a type
 341
 342 So `<e,<e,t>>` and `<s,<<s,e>,t>>` are types.  As we have mentioned,
 343 Montague's paper is the source for the convention in linguistics that
 344 a type of the form `<a, b>` corresponds to a functional type that we
 345 will write here as `a -> b`.  So the type `<a, b>` is the type of a
 346 function that maps objects of type `a` onto objects of type `b`.
 347
 348 Montague gave rules for the types of various logical formulas.  Of
 349 particular interest here, he gave the following typing rules for
 350 functional application and for lambda abstracts, which match the rules
 351 for the simply-typed lambda calculus exactly:
 352
 353 * If *&alpha;* is an expression of type *&lt;a, b&gt;*, and *&beta;* is an
 354 expression of type *a*, then *&alpha;(&beta;)* has type *b*.
 355
 356 * If *&alpha;* is an expression of type *a*, and *u* is a variable of type *b*, then *&lambda;u&alpha;* has type <code>&lt;b, a&gt;</code>.
 357
 358 When we talk about monads, we will consider Montague's treatment of
 359 intensionality in some detail.  In the meantime, Montague's PTQ is
 360 responsible for making the simply-typed lambda calculus the baseline
 361 semantic analysis for linguistics.