posted link to homework

[lambda.git] / topics / _week5_simply_typed_lambda.mdwn
diff --git a/topics/_week5_simply_typed_lambda.mdwn b/topics/_week5_simply_typed_lambda.mdwn

index b1b4d93..1a0425f 100644 (file)
--- a/topics/_week5_simply_typed_lambda.mdwn
+++ b/topics/_week5_simply_typed_lambda.mdwn
@@ -26,7 +26,7 @@ To develop this analogy just a bit further, syntactic categories
  determine which expressions can combine with which other expressions.
  If a word is a member of the category of prepositions, it had better
  not try to combine (merge) with an expression in the category of, say,
  determine which expressions can combine with which other expressions.
  If a word is a member of the category of prepositions, it had better
  not try to combine (merge) with an expression in the category of, say,
-an auxilliary verb, since *under has* is not a well-formed constituent
+an auxilliary verb, since \**under has* is not a well-formed constituent
  in English.  Likewise, types in formal languages will determine which
  expressions can be sensibly combined. 
  
  in English.  Likewise, types in formal languages will determine which
  expressions can be sensibly combined. 
  
@@ -41,7 +41,7 @@ phrases, since they both denote properties with (extensional) type
  not correspond to any salient syntactic distinctions (as in any
  analysis that involves silent type-shifters, such as Herman Hendriks'
  theory of quantifier scope, in which expressions change their semantic
  not correspond to any salient syntactic distinctions (as in any
  analysis that involves silent type-shifters, such as Herman Hendriks'
  theory of quantifier scope, in which expressions change their semantic
-type without any effect on the syntactic expressions they can combine
+type without any effect on the expressions they can combine
  with syntactically).  We will consider again the relationship between
  syntactic types and semantic types later in the course.
  
  with syntactically).  We will consider again the relationship between
  syntactic types and semantic types later in the course.
  
@@ -61,7 +61,7 @@ that types associated to the *left*---the opposite of the modern
  convention.  This is ok, however, because he also reverses the order,
  so that `te` is a function from objects of type `e` to objects of type
  `t`.  Cool paper!  If you ever want to see Church numerals in their
  convention.  This is ok, however, because he also reverses the order,
  so that `te` is a function from objects of type `e` to objects of type
  `t`.  Cool paper!  If you ever want to see Church numerals in their
-native setting--but I'm getting ahead of my story.  Pedantic off.]
+native setting--but we're getting ahead of our story.  Pedantic off.]
  
  There's good news and bad news: the good news is that the simply-typed
  lambda calculus is strongly normalizing: every term has a normal form.
  
  There's good news and bad news: the good news is that the simply-typed
  lambda calculus is strongly normalizing: every term has a normal form.
@@ -81,16 +81,16 @@ types `T`, the smallest set such that
  
  *    ground types, including `e` and `t`, are in `T`
  
  
  *    ground types, including `e` and `t`, are in `T`
  
-*    for any types &sigma; and &tau; in `T`, the type &sigma; -->
+*    for any types &sigma; and &tau; in `T`, the type &sigma; ->
       &tau; is in `T`.
  
  For instance, here are some types in `T`:
  
       e
       &tau; is in `T`.
  
  For instance, here are some types in `T`:
  
       e
-     e --> t
-     e --> e --> t
-     (e --> t) --> t
-     (e --> t) --> e --> t
+     e -> t
+     e -> e -> t
+     (e -> t) -> t
+     (e -> t) -> e -> t
  
  and so on.
  
  
  and so on.
  
@@ -102,28 +102,48 @@ which is the smallest set such that
  *    each type `t` has an infinite set of distinct variables, {x^t}_1,
       {x^t}_2, {x^t}_3, ...
  
  *    each type `t` has an infinite set of distinct variables, {x^t}_1,
       {x^t}_2, {x^t}_3, ...
  
-*    If a term `M` has type &sigma; --> &tau;, and a term `N` has type
+*    If a term `M` has type &sigma; -> &tau;, and a term `N` has type
       &sigma;, then the application `(M N)` has type &tau;.
  
  *    If a variable `a` has type &sigma;, and term `M` has type &tau;, 
       &sigma;, then the application `(M N)` has type &tau;.
  
  *    If a variable `a` has type &sigma;, and term `M` has type &tau;, 
-     then the abstract <code>&lambda; a M</code> has type &sigma; --> &tau;.
+     then the abstract <code>&lambda; a M</code> has type &sigma; -> &tau;.
  
  The definitions of types and of typed terms should be highly familiar
  
  The definitions of types and of typed terms should be highly familiar
-to semanticists, except that instead of writing &sigma; --> &tau;,
+to semanticists, except that instead of writing &sigma; -> &tau;,
  linguists write <&sigma;, &tau;>.  We will use the arrow notation,
  since it is more iconic.
  
  Some examples (assume that `x` has type `o`):
  
        x            o
  linguists write <&sigma;, &tau;>.  We will use the arrow notation,
  since it is more iconic.
  
  Some examples (assume that `x` has type `o`):
  
        x            o
-      \x.x         o --> o
+      \x.x         o -> o
        ((\x.x) x)   o
  
  Excercise: write down terms that have the following types:
  
        ((\x.x) x)   o
  
  Excercise: write down terms that have the following types:
  
-                   o --> o --> o
-                   (o --> o) --> o --> o
-                   (o --> o --> o) --> o
+                   o -> o -> o
+                   (o -> o) -> o -> o
+                   (o -> o -> o) -> o
+
+#A first glipse of the connection between types and logic
+
+In the simply-typed lambda calculus, we write types like <code>&sigma;
+-> &tau;</code>.  This looks like logical implication.  We'll take
+that resemblance seriously when we discuss the Curry-Howard
+correspondence.  In the meantime, note that types respect modus
+ponens: 
+
+<pre>
+Expression    Type      Implication
+-----------------------------------
+fn            &alpha; -> &beta;    &alpha; &sup; &beta;
+arg           &alpha;         &alpha;
+------        ------    --------
+(fn arg)      &beta;         &beta;
+</pre>
+
+The implication in the right-hand column is modus ponens, of course.
+
  
  #Associativity of types versus terms#
  
  
  #Associativity of types versus terms#
  
@@ -131,7 +151,7 @@ As we have seen many times, in the lambda calculus, function
  application is left associative, so that `f x y z == (((f x) y) z)`.
  Types, *THEREFORE*, are right associative: if `x`, `y`, and `z`
  have types `a`, `b`, and `c`, respectively, then `f` has type 
  application is left associative, so that `f x y z == (((f x) y) z)`.
  Types, *THEREFORE*, are right associative: if `x`, `y`, and `z`
  have types `a`, `b`, and `c`, respectively, then `f` has type 
-`a --> b --> c --> d == (a --> (b --> (c --> d)))`, where `d` is the
+`a -> b -> c -> d == (a -> (b -> (c -> d)))`, where `d` is the
  type of the complete term.
  
  It is a serious faux pas to associate to the left for types.  You may
  type of the complete term.
  
  It is a serious faux pas to associate to the left for types.  You may
@@ -150,45 +170,192 @@ cannot have a type in &Lambda;_T.  We can easily see why:
  
  Assume &Omega; has type &tau;, and `\x.xx` has type &sigma;.  Then
  because `\x.xx` takes an argument of type &sigma; and returns
  
  Assume &Omega; has type &tau;, and `\x.xx` has type &sigma;.  Then
  because `\x.xx` takes an argument of type &sigma; and returns
-something of type &tau;, `\x.xx` must also have type &sigma; -->
+something of type &tau;, `\x.xx` must also have type &sigma; ->
  &tau;.  By repeating this reasoning, `\x.xx` must also have type
  &tau;.  By repeating this reasoning, `\x.xx` must also have type
-(&sigma; --> &tau;) --> &tau;; and so on.  Since variables have
+(&sigma; -> &tau;) -> &tau;; and so on.  Since variables have
  finite types, there is no way to choose a type for the variable `x`
  that can satisfy all of the requirements imposed on it.
  
  finite types, there is no way to choose a type for the variable `x`
  that can satisfy all of the requirements imposed on it.
  
-In general, there is no way for a function to have a type that can
-take itself for an argument.  It follows that there is no way to
-define the identity function in such a way that it can take itself as
-an argument.  Instead, there must be many different identity
-functions, one for each type.  Some of those types can be functions,
-and some of those functions can be (type-restricted) identity
-functions; but a simply-types identity function can never apply to itself.
+In fact, we can't even type the parts of &Omega;, that is, `&omega;
+\equiv \x.xx`.  In general, there is no way for a function to have a
+type that can take itself for an argument.  
+
+It follows that there is no way to define the identity function in
+such a way that it can take itself as an argument.  Instead, there
+must be many different identity functions, one for each type.  Some of
+those types can be functions, and some of those functions can be
+(type-restricted) identity functions; but a simply-types identity
+function can never apply to itself.
  
  #Typing numerals#
  
  
  #Typing numerals#
  
-The Church numerals are well behaved with respect to types.  They can
-all be given the type (&sigma; --> &sigma;) --> &sigma; --> &sigma;.
+The Church numerals are well behaved with respect to types.  
+To see this, consider the first three Church numerals (starting with zero):
+
+    \s z . z
+    \s z . s z
+    \s z . s (s z)
+
+Given the internal structure of the term we are using to represent
+zero, its type must have the form &rho; -> &sigma; -> &sigma; for
+some &rho; and &sigma;.  This type is consistent with term for one,
+but the structure of the definition of one is more restrictive:
+because the first argument (`s`) must apply to the second argument
+(`z`), the type of the first argument must describe a function from
+expressions of type &sigma; to some result type.  So we can refine
+&rho; by replacing it with the more specific type &sigma; -> &tau;.
+At this point, the overall type is (&sigma; -> &tau;) -> &sigma; ->
+&sigma;.  Note that this refined type remains compatible with the
+definition of zero.  Finally, by examinining the definition of two, we
+see that expressions of type &tau; must be suitable to serve as
+arguments to functions of type &sigma; -> &tau;, since the result of
+applying `s` to `z` serves as the argument of `s`.  The most general
+way for that to be true is if &tau; &equiv; &sigma;.  So at this
+point, we have the overall type of (&sigma; -> &sigma;) -> &sigma;
+-> &sigma;.
+
+<!-- Make sure there is talk about unification and computation of the
+principle type-->
  
  
+## Predecessor and lists are not representable in simply typed lambda-calculus ##
  
  
  
  
+This is not because there is any difficulty typing what the functions
+involved do "from the outside": for instance, the predecessor function
+is a function from numbers to numbers, or &tau; -> &tau;, where &tau;
+is our type for Church numbers (i.e., (&sigma; -> &sigma;) -> &sigma;
+-> &sigma;).  (Though this type will only be correct if we decide that
+the predecessor of zero should be a number, perhaps zero.)
+
+Rather, the problem is that the definition of the function requires
+subterms that can't be simply-typed.  We'll illustrate with our
+implementation of the predecessor function, based on the discussion in
+Pierce 2002:547:
+
+    let zero = \s z. z in
+    let fst = \x y. x in
+    let snd = \x y. y in
+    let pair = \x y . \f . f x y in
+    let succ = \n s z. s (n s z) in
+    let shift = \p. pair (succ (p fst)) (p fst) in
+    let pred = \n. n shift (pair zero zero) snd in
+
+Note that `shift` takes a pair `p` as argument, but makes use of only
+the first element of the pair.  Why does it do that?  In order to
+understand what this code is doing, it is helpful to go through a
+sample computation, the predecessor of 3:
+
+    pred 3
+    3 shift (pair zero zero) snd
+    (\s z.s(s(s z))) shift (pair zero zero) snd
+    shift (shift (shift (\f.f 0 0))) snd
+    shift (shift (pair (succ ((\f.f 0 0) fst)) ((\f.f 0 0) fst))) snd
+    shift (shift (\f.f 1 0)) snd
+    shift (\f. f 2 1) snd
+    (\f. f 3 2) snd
+    snd 3 2
+    2
+
+At each stage, `shift` sees an ordered pair that contains two numbers
+related by the successor function.  It can safely discard the second
+element without losing any information.  The reason we carry around
+the second element at all is that when it comes time to complete the
+computation---that is, when we finally apply the top-level ordered
+pair to `snd`---it's the second element of the pair that will serve as
+the final result.
+
+Let's see how far we can get typing these terms.  `zero` is the Church
+encoding of zero.  Using `N` as the type for Church numbers (i.e.,
+<code>N &equiv; (&sigma; -> &sigma;) -> &sigma; -> &sigma;</code> for
+some &sigma;, `zero` has type `N`.  `snd` takes two numbers, and
+returns the second, so `snd` has type `N -> N -> N`.  Then the type of
+`pair` is `N -> N -> (type(snd)) -> N`, that is, `N -> N -> (N -> N ->
+N) -> N`.  Likewise, `succ` has type `N -> N`, and `shift` has type
+`pair -> pair`, where `pair` is the type of an ordered pair of
+numbers, namely, <code>pair &equiv; (N -> N -> N) -> N</code>.  So far
+so good.
+
+The problem is the way in which `pred` puts these parts together.  In
+particular, `pred` applies its argument, the number `n`, to the
+`shift` function.  Since `n` is a number, its type is <code>(&sigma;
+-> &sigma;) -> &sigma; -> &sigma;</code>.  This means that the type of
+`shift` has to match <code>&sigma; -> &sigma;</code>. But we
+concluded above that the type of `shift` also had to be `pair ->
+pair`.  Putting these constraints together, it appears that
+<code>&sigma;</code> must be the type of a pair of numbers.  But we
+already decided that the type of a pair of numbers is `(N -> N -> N)
+-> N`.  Here's the difficulty: `N` is shorthand for a type involving
+<code>&sigma;</code>.  If <code>&sigma;</code> turns out to depend on
+`N`, and `N` depends in turn on <code>&sigma;</code>, then
+<code>&sigma;</code> is a proper subtype of itself, which is not
+allowed in the simply-typed lambda calculus.
+
+The way we got here is that the `pred` function relies on the built-in
+right-fold structure of the Church numbers to recursively walk down
+the spine of its argument.  In order to do that, the argument had to
+apply to the `shift` operation.  And since `shift` had to be the
+sort of operation that manipulates numbers, the infinite regress is
+established.
+
+Now, of course, this is only one of myriad possible implementations of
+the predecessor function in the lambda calculus.  Could one of them
+possibly be simply-typeable?  It turns out that this can't be done.
+See Oleg Kiselyov's discussion and works cited there for details:
+[[predecessor and lists can't be represented in the simply-typed
+lambda
+calculus|http://okmij.org/ftp/Computation/lambda-calc.html#predecessor]].
  
  
-## Predecessor and lists are not representable in simply typed lambda-calculus ##
+Because lists are (in effect) a generalization of the Church numbers,
+computing the tail of a list is likewise beyond the reach of the
+simply-typed lambda calculus.
  
  
-    The predecessor of a Church-encoded numeral, or, generally, the encoding of a list with the car and cdr operations are both impossible in the simply typed lambda-calculus. Henk Barendregt's ``The impact of the lambda-calculus in logic and computer science'' (The Bulletin of Symbolic Logic, v3, N2, June 1997) has the following phrase, on p. 186:
+This result is not obvious, to say the least.  It illustrates how
+recursion is built into the structure of the Church numbers (and
+lists).  Most importantly for the discussion of the simply-typed
+lambda calculus, it demonstrates that even fairly basic recursive
+computations are beyond the reach of a simply-typed system.
  
  
-        Even for a function as simple as the predecessor lambda definability remained an open problem for a while. From our present knowledge it is tempting to explain this as follows. Although the lambda calculus was conceived as an untyped theory, typeable terms are more intuitive. Now the functions addition and multiplication are defineable by typeable terms, while [101] and [108] have characterized the lambda-defineable functions in the (simply) typed lambda calculus and the predecessor is not among them [the story of the removal of Kleene's four wisdom teeth is skipped...]
-        Ref 108 is R.Statman: The typed lambda calculus is not elementary recursive. Theoretical Comp. Sci., vol 9 (1979), pp. 73-81.
  
  
-    Since list is a generalization of numeral -- with cons being a successor, append being the addition, tail (aka cdr) being the predecessor -- it follows then the list cannot be encoded in the simply typed lambda-calculus.
+## Montague grammar is based on a simply-typed lambda calculus
  
  
-    To encode both operations, we need either inductive (generally, recursive) types, or System F with its polymorphism. The first approach is the most common. Indeed, the familiar definition of a list
+Systems based on the simply-typed lambda calculus are the bread and
+butter of current linguistic semantic analysis.  One of the most
+influential modern semantic formalisms---Montague's PTQ
+fragment---included a simply-typed version of the Predicate Calculus
+with lambda abstraction.
  
  
-         data List a = Nil | Cons a (List a)
+Montague called the semantic part of his PTQ fragment *Intensional
+Logic*.  Without getting too fussy about details, we'll present the
+popular Ty2 version of the PTQ types, roughly as proposed by Gallin
+(1975).  [See Zimmermann, Ede. 1989. Intensional logic and two-sorted
+type theory.  *Journal of Symbolic Logic* ***54.1***: 65--77 for a
+precise characterization of the correspondence between IL and
+two-sorted Ty2.]
  
  
-    gives an (iso-) recursive data type (in Haskell. In ML, it is an inductive data type).
+We'll need three base types: `e`, for individuals, `t`, for truth
+values, and `s` for evaluation indicies (world-time pairs).  The set
+of types is defined recursively:
  
  
-    Lists can also be represented in System F. As a matter of fact, we do not need the full System F (where the type reconstruction is not decidable). We merely need the extension of the Hindley-Milner system with higher-ranked types, which requires a modicum of type annotations and yet is able to infer the types of all other terms. This extension is supported in Haskell and OCaml. With such an extension, we can represent a list by its fold, as shown in the code below. It is less known that this representation is faithful: we can implement all list operations, including tail, drop, and even zip.
+    the base types e, t, and s are types
+    if a and b are types, <a,b> is a type
  
  
-See also [[Oleg Kiselyov on the predecessor function in the lambda
-calculus|http://okmij.org/ftp/Computation/lambda-calc.html#predecessor]].
+So `<e,<e,t>>` and `<s,<<s,e>,t>>` are types.  As we have mentioned,
+Montague's paper is the source for the convention in linguistics that
+a type of the form `<a, b>` corresponds to a functional type that we
+will write here as `a -> b`.  So the type `<a, b>` is the type of a
+function that maps objects of type `a` onto objects of type `b`.
+
+Montague gave rules for the types of various logical formulas.  Of
+particular interest here, he gave the following typing rules for
+functional application and for lambda abstracts, which match the rules
+for the simply-typed lambda calculus exactly:
+
+* If *&alpha;* is an expression of type *<a, b>*, and *&beta;* is an
+expression of type b, then *&alpha;(&beta;)* has type *b*.  
+
+* If *&alpha;* is an expression of type *a*, and *u* is a variable of type *b*, then *&lambda;u&alpha;* has type <code><b, a></code>.
  
  
+When we talk about monads, we will consider Montague's treatment of
+intensionality in some detail.  In the meantime, Montague's PTQ is
+responsible for making the simply-typed lambda calculus the baseline
+semantic analysis for linguistics.