rename topics/_week5_simply_typed_lambda.mdwn to topics/week5_simply_typed.mdwn

[lambda.git] / topics / week5_simply_typed.mdwn
diff --git a/topics/week5_simply_typed.mdwn b/topics/week5_simply_typed.mdwn

new file mode 100644 (file)

index 0000000..1a0425f
--- /dev/null
+++ b/topics/week5_simply_typed.mdwn
@@ -0,0 +1,361 @@
+[[!toc]]
+
+##The simply-typed lambda calculus##
+
+The untyped lambda calculus is pure.  Pure in many ways: nothing but
+variables and lambdas, with no constants or other special symbols;
+also, all functions without any types.  As we'll see eventually, pure
+also in the sense of having no side effects, no mutation, just pure
+computation.
+
+But we live in an impure world.  It is much more common for practical
+programming languages to be typed, either implicitly or explicitly.
+Likewise, systems used to investigate philosophical or linguistic
+issues are almost always typed.  Types will help us reason about our
+computations.  They will also facilitate a connection between logic
+and computation.
+
+From a linguistic perspective, types are generalizations of (parts of)
+programs.  To make this comment more concrete: types are to (e.g.,
+lambda) terms as syntactic categories are to expressions of natural
+language.  If so, if it makes sense to gather a class of expressions
+together into a set of Nouns, or Verbs, it may also make sense to
+gather classes of terms into a set labelled with some computational type.
+
+To develop this analogy just a bit further, syntactic categories
+determine which expressions can combine with which other expressions.
+If a word is a member of the category of prepositions, it had better
+not try to combine (merge) with an expression in the category of, say,
+an auxilliary verb, since \**under has* is not a well-formed constituent
+in English.  Likewise, types in formal languages will determine which
+expressions can be sensibly combined. 
+
+Now, of course it is common linguistic practice to supply an analysis
+of natural language both with syntactic categories and with semantic
+types.  And there is a large degree of overlap between these type
+systems.  However, there are mismatches in both directions: there are
+syntactic distinctions that do not correspond to any salient semantic
+difference (why can't adjectives behave syntactically like verb
+phrases, since they both denote properties with (extensional) type
+`<e,t>`?); and in some analyses there are semantic differences that do
+not correspond to any salient syntactic distinctions (as in any
+analysis that involves silent type-shifters, such as Herman Hendriks'
+theory of quantifier scope, in which expressions change their semantic
+type without any effect on the expressions they can combine
+with syntactically).  We will consider again the relationship between
+syntactic types and semantic types later in the course.
+
+Soon we will consider polymorphic type systems.  First, however, we
+will consider the simply-typed lambda calculus.  
+
+[Pedantic on.  Why "*simply* typed"?  Well, the type system is
+particularly simple.  As mentioned to us by Koji Mineshima, Church
+tells us that "The simple theory of types was suggested as a
+modification of Russell's ramified theory of types by Leon Chwistek in
+1921 and 1922 and by F. P. Ramsey in 1926."  This footnote appears in
+Church's 1940 paper [A formulation of the simple theory of
+types](church-simple-types.pdf).  In this paper, Church writes types
+by simple apposition, without the ugly angle brackets and commas used
+by Montague.  Furthermore, he omits parentheses under the convention
+that types associated to the *left*---the opposite of the modern
+convention.  This is ok, however, because he also reverses the order,
+so that `te` is a function from objects of type `e` to objects of type
+`t`.  Cool paper!  If you ever want to see Church numerals in their
+native setting--but we're getting ahead of our story.  Pedantic off.]
+
+There's good news and bad news: the good news is that the simply-typed
+lambda calculus is strongly normalizing: every term has a normal form.
+We shall see that self-application is outlawed, so &Omega; can't even
+be written, let alone undergo reduction.  The bad news is that
+fixed-point combinators are also forbidden, so recursion is neither
+simple nor direct.
+
+#Types#
+
+We will have at least one ground type.  For the sake of linguistic
+familiarity, we'll use `e`, the type of individuals, and `t`, the type
+of truth values.
+
+In addition, there will be a recursively-defined class of complex
+types `T`, the smallest set such that
+
+*    ground types, including `e` and `t`, are in `T`
+
+*    for any types &sigma; and &tau; in `T`, the type &sigma; ->
+     &tau; is in `T`.
+
+For instance, here are some types in `T`:
+
+     e
+     e -> t
+     e -> e -> t
+     (e -> t) -> t
+     (e -> t) -> e -> t
+
+and so on.
+
+#Typed lambda terms#
+
+Given a set of types `T`, we define the set of typed lambda terms <code>&Lambda;_T</code>,
+which is the smallest set such that
+
+*    each type `t` has an infinite set of distinct variables, {x^t}_1,
+     {x^t}_2, {x^t}_3, ...
+
+*    If a term `M` has type &sigma; -> &tau;, and a term `N` has type
+     &sigma;, then the application `(M N)` has type &tau;.
+
+*    If a variable `a` has type &sigma;, and term `M` has type &tau;, 
+     then the abstract <code>&lambda; a M</code> has type &sigma; -> &tau;.
+
+The definitions of types and of typed terms should be highly familiar
+to semanticists, except that instead of writing &sigma; -> &tau;,
+linguists write <&sigma;, &tau;>.  We will use the arrow notation,
+since it is more iconic.
+
+Some examples (assume that `x` has type `o`):
+
+      x            o
+      \x.x         o -> o
+      ((\x.x) x)   o
+
+Excercise: write down terms that have the following types:
+
+                   o -> o -> o
+                   (o -> o) -> o -> o
+                   (o -> o -> o) -> o
+
+#A first glipse of the connection between types and logic
+
+In the simply-typed lambda calculus, we write types like <code>&sigma;
+-> &tau;</code>.  This looks like logical implication.  We'll take
+that resemblance seriously when we discuss the Curry-Howard
+correspondence.  In the meantime, note that types respect modus
+ponens: 
+
+<pre>
+Expression    Type      Implication
+-----------------------------------
+fn            &alpha; -> &beta;    &alpha; &sup; &beta;
+arg           &alpha;         &alpha;
+------        ------    --------
+(fn arg)      &beta;         &beta;
+</pre>
+
+The implication in the right-hand column is modus ponens, of course.
+
+
+#Associativity of types versus terms#
+
+As we have seen many times, in the lambda calculus, function
+application is left associative, so that `f x y z == (((f x) y) z)`.
+Types, *THEREFORE*, are right associative: if `x`, `y`, and `z`
+have types `a`, `b`, and `c`, respectively, then `f` has type 
+`a -> b -> c -> d == (a -> (b -> (c -> d)))`, where `d` is the
+type of the complete term.
+
+It is a serious faux pas to associate to the left for types.  You may
+as well use your salad fork to stir your tea.
+
+#The simply-typed lambda calculus is strongly normalizing#
+
+If `M` is a term with type &tau; in &Lambda;_T, then `M` has a
+normal form.  The proof is not particularly complex, but we will not
+present it here; see Berendregt or Hankin.
+
+Since &Omega; does not have a normal form, it follows that &Omega;
+cannot have a type in &Lambda;_T.  We can easily see why:
+
+<code>&Omega; = (\x.xx)(\x.xx)</code>
+
+Assume &Omega; has type &tau;, and `\x.xx` has type &sigma;.  Then
+because `\x.xx` takes an argument of type &sigma; and returns
+something of type &tau;, `\x.xx` must also have type &sigma; ->
+&tau;.  By repeating this reasoning, `\x.xx` must also have type
+(&sigma; -> &tau;) -> &tau;; and so on.  Since variables have
+finite types, there is no way to choose a type for the variable `x`
+that can satisfy all of the requirements imposed on it.
+
+In fact, we can't even type the parts of &Omega;, that is, `&omega;
+\equiv \x.xx`.  In general, there is no way for a function to have a
+type that can take itself for an argument.  
+
+It follows that there is no way to define the identity function in
+such a way that it can take itself as an argument.  Instead, there
+must be many different identity functions, one for each type.  Some of
+those types can be functions, and some of those functions can be
+(type-restricted) identity functions; but a simply-types identity
+function can never apply to itself.
+
+#Typing numerals#
+
+The Church numerals are well behaved with respect to types.  
+To see this, consider the first three Church numerals (starting with zero):
+
+    \s z . z
+    \s z . s z
+    \s z . s (s z)
+
+Given the internal structure of the term we are using to represent
+zero, its type must have the form &rho; -> &sigma; -> &sigma; for
+some &rho; and &sigma;.  This type is consistent with term for one,
+but the structure of the definition of one is more restrictive:
+because the first argument (`s`) must apply to the second argument
+(`z`), the type of the first argument must describe a function from
+expressions of type &sigma; to some result type.  So we can refine
+&rho; by replacing it with the more specific type &sigma; -> &tau;.
+At this point, the overall type is (&sigma; -> &tau;) -> &sigma; ->
+&sigma;.  Note that this refined type remains compatible with the
+definition of zero.  Finally, by examinining the definition of two, we
+see that expressions of type &tau; must be suitable to serve as
+arguments to functions of type &sigma; -> &tau;, since the result of
+applying `s` to `z` serves as the argument of `s`.  The most general
+way for that to be true is if &tau; &equiv; &sigma;.  So at this
+point, we have the overall type of (&sigma; -> &sigma;) -> &sigma;
+-> &sigma;.
+
+<!-- Make sure there is talk about unification and computation of the
+principle type-->
+
+## Predecessor and lists are not representable in simply typed lambda-calculus ##
+
+
+This is not because there is any difficulty typing what the functions
+involved do "from the outside": for instance, the predecessor function
+is a function from numbers to numbers, or &tau; -> &tau;, where &tau;
+is our type for Church numbers (i.e., (&sigma; -> &sigma;) -> &sigma;
+-> &sigma;).  (Though this type will only be correct if we decide that
+the predecessor of zero should be a number, perhaps zero.)
+
+Rather, the problem is that the definition of the function requires
+subterms that can't be simply-typed.  We'll illustrate with our
+implementation of the predecessor function, based on the discussion in
+Pierce 2002:547:
+
+    let zero = \s z. z in
+    let fst = \x y. x in
+    let snd = \x y. y in
+    let pair = \x y . \f . f x y in
+    let succ = \n s z. s (n s z) in
+    let shift = \p. pair (succ (p fst)) (p fst) in
+    let pred = \n. n shift (pair zero zero) snd in
+
+Note that `shift` takes a pair `p` as argument, but makes use of only
+the first element of the pair.  Why does it do that?  In order to
+understand what this code is doing, it is helpful to go through a
+sample computation, the predecessor of 3:
+
+    pred 3
+    3 shift (pair zero zero) snd
+    (\s z.s(s(s z))) shift (pair zero zero) snd
+    shift (shift (shift (\f.f 0 0))) snd
+    shift (shift (pair (succ ((\f.f 0 0) fst)) ((\f.f 0 0) fst))) snd
+    shift (shift (\f.f 1 0)) snd
+    shift (\f. f 2 1) snd
+    (\f. f 3 2) snd
+    snd 3 2
+    2
+
+At each stage, `shift` sees an ordered pair that contains two numbers
+related by the successor function.  It can safely discard the second
+element without losing any information.  The reason we carry around
+the second element at all is that when it comes time to complete the
+computation---that is, when we finally apply the top-level ordered
+pair to `snd`---it's the second element of the pair that will serve as
+the final result.
+
+Let's see how far we can get typing these terms.  `zero` is the Church
+encoding of zero.  Using `N` as the type for Church numbers (i.e.,
+<code>N &equiv; (&sigma; -> &sigma;) -> &sigma; -> &sigma;</code> for
+some &sigma;, `zero` has type `N`.  `snd` takes two numbers, and
+returns the second, so `snd` has type `N -> N -> N`.  Then the type of
+`pair` is `N -> N -> (type(snd)) -> N`, that is, `N -> N -> (N -> N ->
+N) -> N`.  Likewise, `succ` has type `N -> N`, and `shift` has type
+`pair -> pair`, where `pair` is the type of an ordered pair of
+numbers, namely, <code>pair &equiv; (N -> N -> N) -> N</code>.  So far
+so good.
+
+The problem is the way in which `pred` puts these parts together.  In
+particular, `pred` applies its argument, the number `n`, to the
+`shift` function.  Since `n` is a number, its type is <code>(&sigma;
+-> &sigma;) -> &sigma; -> &sigma;</code>.  This means that the type of
+`shift` has to match <code>&sigma; -> &sigma;</code>. But we
+concluded above that the type of `shift` also had to be `pair ->
+pair`.  Putting these constraints together, it appears that
+<code>&sigma;</code> must be the type of a pair of numbers.  But we
+already decided that the type of a pair of numbers is `(N -> N -> N)
+-> N`.  Here's the difficulty: `N` is shorthand for a type involving
+<code>&sigma;</code>.  If <code>&sigma;</code> turns out to depend on
+`N`, and `N` depends in turn on <code>&sigma;</code>, then
+<code>&sigma;</code> is a proper subtype of itself, which is not
+allowed in the simply-typed lambda calculus.
+
+The way we got here is that the `pred` function relies on the built-in
+right-fold structure of the Church numbers to recursively walk down
+the spine of its argument.  In order to do that, the argument had to
+apply to the `shift` operation.  And since `shift` had to be the
+sort of operation that manipulates numbers, the infinite regress is
+established.
+
+Now, of course, this is only one of myriad possible implementations of
+the predecessor function in the lambda calculus.  Could one of them
+possibly be simply-typeable?  It turns out that this can't be done.
+See Oleg Kiselyov's discussion and works cited there for details:
+[[predecessor and lists can't be represented in the simply-typed
+lambda
+calculus|http://okmij.org/ftp/Computation/lambda-calc.html#predecessor]].
+
+Because lists are (in effect) a generalization of the Church numbers,
+computing the tail of a list is likewise beyond the reach of the
+simply-typed lambda calculus.
+
+This result is not obvious, to say the least.  It illustrates how
+recursion is built into the structure of the Church numbers (and
+lists).  Most importantly for the discussion of the simply-typed
+lambda calculus, it demonstrates that even fairly basic recursive
+computations are beyond the reach of a simply-typed system.
+
+
+## Montague grammar is based on a simply-typed lambda calculus
+
+Systems based on the simply-typed lambda calculus are the bread and
+butter of current linguistic semantic analysis.  One of the most
+influential modern semantic formalisms---Montague's PTQ
+fragment---included a simply-typed version of the Predicate Calculus
+with lambda abstraction.
+
+Montague called the semantic part of his PTQ fragment *Intensional
+Logic*.  Without getting too fussy about details, we'll present the
+popular Ty2 version of the PTQ types, roughly as proposed by Gallin
+(1975).  [See Zimmermann, Ede. 1989. Intensional logic and two-sorted
+type theory.  *Journal of Symbolic Logic* ***54.1***: 65--77 for a
+precise characterization of the correspondence between IL and
+two-sorted Ty2.]
+
+We'll need three base types: `e`, for individuals, `t`, for truth
+values, and `s` for evaluation indicies (world-time pairs).  The set
+of types is defined recursively:
+
+    the base types e, t, and s are types
+    if a and b are types, <a,b> is a type
+
+So `<e,<e,t>>` and `<s,<<s,e>,t>>` are types.  As we have mentioned,
+Montague's paper is the source for the convention in linguistics that
+a type of the form `<a, b>` corresponds to a functional type that we
+will write here as `a -> b`.  So the type `<a, b>` is the type of a
+function that maps objects of type `a` onto objects of type `b`.
+
+Montague gave rules for the types of various logical formulas.  Of
+particular interest here, he gave the following typing rules for
+functional application and for lambda abstracts, which match the rules
+for the simply-typed lambda calculus exactly:
+
+* If *&alpha;* is an expression of type *<a, b>*, and *&beta;* is an
+expression of type b, then *&alpha;(&beta;)* has type *b*.  
+
+* If *&alpha;* is an expression of type *a*, and *u* is a variable of type *b*, then *&lambda;u&alpha;* has type <code><b, a></code>.
+
+When we talk about monads, we will consider Montague's treatment of
+intensionality in some detail.  In the meantime, Montague's PTQ is
+responsible for making the simply-typed lambda calculus the baseline
+semantic analysis for linguistics.