edits

[lambda.git] / topics / _week15_continuation_applications.mdwn
diff --git a/topics/_week15_continuation_applications.mdwn b/topics/_week15_continuation_applications.mdwn

index 001f8b1..328b506 100644 (file)
--- a/topics/_week15_continuation_applications.mdwn
+++ b/topics/_week15_continuation_applications.mdwn
@@ -11,17 +11,13 @@ In the past couple of weeks, we've introduced continuations, first as
  a functional programming technique, then in terms of list and tree
  zippers, then as a monad.  In this lecture, we will generalize
  continuations slightly beyond a monad, and then begin to outline some
-of the applications of monads.  In brief, the generalization can be
-summarized in terms of types: instead of using a Kleisli arrow mapping
-a type α to a continuized type (α -> ρ) -> ρ, we'll allow the result
-types to differ, i.e., we'll map α to (α -> β) -> γ.  This will be
-crucial for some natural language applications.
+of the applications of the generalized continuations.
  
  Many (though not all) of the applications are discussed in detail in
  Barker and Shan 2014, *Continuations in Natural Language*, OUP.
  
-In terms of list zippers, the continuation of a focused element in
-the list is the front part of the list.
+To review, in terms of list zippers, the continuation of a focused
+element in the list is the front part of the list.
  
      list zipper for the list [a;b;c;d;e;f] with focus on d:
  
@@ -33,28 +29,42 @@ the list is the front part of the list.
  In terms of tree zippers, the continuation is the entire context of
  the focused element--the entire rest of the tree.
  
-[drawing of a broken tree]
+[drawing of a tree zipper]
  
-Last week we had trouble computing the doubling task when there was more
-than one shifty operator after moving from a list perspective to a
-tree perspective.  That is, it remained unclear why "aScSe" was
+We explored continuations first in a list setting, then in a tree
+setting, using the doubling task as an example.
  
-    "aacaceecaacaceecee"
+    "abSd" ~~> "ababd"
+    "ab#deSfg" ~~> "abdedefg"
  
-We'll burn through that conceptual fog today.  The natural thing to
-try would have been to defunctionalize the continuation-based solution
-using a tree zipper.  But that would not have been easy, since the
-natural way to implement the doubling behavior of the shifty operator
-would have been to simply copy the context provided by the zipper.  
-This would have produced two uncoordinated copies of the other shifty
-operator, and we'd have been in the situation described in class of
-having a reduction strategy that never reduced the number of shifty
-operators below 2. (There are ways around this limitation of tree zippers, 
-but they are essentially equivalent to the technique given just below.)
+The "S" functions like a shifty operator, and "#" functions like a reset.
+
+Although the list version of the doubling task was easy to understand
+thoroughly, the tree version was significantly more challenging.  In
+particular, it remained unclear why
+
+    "aScSe" ~~> "aacaceecaacaceecee"
+
+We'll burn through that conceptual fog today by learning more about
+how to work with continuations.
+
+The natural thing to try would have been to defunctionalize the
+continuation-based solution using a tree zipper.  But that would not
+have been easy, since the natural way to implement the doubling
+behavior of the shifty operator would have been to simply copy the
+context provided by the zipper.  This would have produced two
+uncoordinated copies of the other shifty operator, and we'd have been
+in the situation described in class of having a reduction strategy
+that never reduced the number of shifty operators below 2.  The
+limitation is that zippers by themselves don't provide a natural way
+to establish a dependency between two distant elements of a data
+structure.  (There are ways around this limitation of tree zippers,
+but they are essentially equivalent to the technique given just
+below.)
  
  Instead, we'll re-interpreting what the continuation monad was doing
-in more or less defunctionalized terms by using Quantifier Raising, a technique
-from linguistics.
+in more or less defunctionalized terms by using Quantifier Raising, a
+technique from linguistics.
  
  But first, motivating quantifier scope as a linguistic application.
  
@@ -101,13 +111,19 @@ The standard technique for handling scope-taking in linguistics is
  Quantifier Raising (QR).  As you might suppose, the rule for Quantifier
  Raising closely resembles the reduction rule for shift:
  
-    Quantifier Raising: given a sentence [... [QDP] ...], build a new
-    sentence [QDP (\x.[... [x] ...])].  
+    Quantifier Raising: given a sentence of the form
+
+             [... [QDP] ...],
+
+    build a new sentence of the form
+
+    [QDP (\x.[... [x] ...])].  
  
  Here, QDP is a scope-taking quantificational DP.
  
  Just to emphasize the similarity between QR and shift, we can use QR
-to provide insight into the tree task that mystified us earlier.
+to provide insight into the tree version of the doubling task that
+mystified us earlier.  Here's the starting point:
  
  <!--
  \tree (. (a)((S)((d)((S)(e)))))
@@ -269,7 +285,8 @@ Three lessons:
  
  * Generalizing from one-sided, list-based continuation
    operators to two-sided, tree-based continuation operators is a
-  dramatic increase in power and complexity.
+  dramatic increase in power and complexity.  (De Groote's dynamic
+  montague semantics continuations are the one-sided, list-based variety.)
  
  * Operators that
    compose multiple copies of a context can be hard to understand
@@ -287,13 +304,14 @@ involving control operators such as shift and reset: using a CPS
  transform, lifting into a continuation monad, and by using QR.
  
  QR is the traditional system in linguistics, but it will not be
-adequate for us in general.  The reason has to do with order.  As
-we've discussed, especially with respect to the CPS transform,
-continuations allow fine-grained control over the order of evaluation.
-One of the main empirical claims of Barker and Shan 2014 is that
-natural language is sensitive to evaluation order.  Unlike other
-presentations of continuations, QR does not lend itself to reasoning
-about evaluation order, so we will need to use a different strategy.
+adequate for us in general.  The reason has to do with evaluation
+order.  As we've discussed, especially with respect to the CPS
+transform, continuations allow fine-grained control over the order of
+evaluation.  One of the main empirical claims of Barker and Shan 2014
+is that natural language is sensitive to evaluation order.  Unlike
+other presentations of continuations, QR does not lend itself to
+reasoning about evaluation order, so we will need to use a different
+strategy.
  
  [Note to self: it is interesting to consider what it would take to
  reproduce the analyses giving in Barker and Shan in purely QR terms.
@@ -331,9 +349,9 @@ into an at-issue (pre-monadic) computation with a layer at which
  side-effects occur.
  
  The tower notation is a precise way of articulating continuation-based
-computations into a payload and (potentially multiple) layers of side-effects.
-We won't keep the outer box, but we will keep the horizontal line
-dividing main effects from side-effects.
+computations into a payload and (potentially multiple) layers of
+side-effects.  Visually, we won't keep the outer box, but we will keep
+the horizontal line dividing main effects from side-effects.
  
  Tower convention for types:
  <pre>
@@ -342,6 +360,8 @@ Tower convention for types:
                                                  α
  </pre>
  
+Read these types counter-clockwise starting at the bottom.
+
  Tower convention for values:
  <pre>
                                             g[] 
@@ -357,7 +377,7 @@ hole in it.  For instance, we might have g[x] = \forall x.P[x].
  We'll use a simply-typed system with two atomic types, DP (the type of
  individuals) and S (the type of truth values).  
  
-## LIFT
+## LIFT (mid)
  
  Then in the spirit of monadic thinking, we'll have a way of lifting an
  arbitrary value into the tower system:
@@ -376,7 +396,7 @@ individual-denoting expression yields the generalized quantifier
  proposed by Montague as the denotation for proper names:
  
                                              []   S|S 
-    LIFT (j:DP) = \k.kx : (DP -> S) -> S == -- : ---
+    LIFT (j:DP) = \k.kj : (DP -> S) -> S == -- : ---
                                              j    DP
  
  So if the proper name *John* denotes the individual j, LIFT(j) is the
@@ -403,8 +423,13 @@ functional application (i.e, f:(α->β) (x:α) = fx:β).
  
  ## Not quite a monad
  
-To demonstrate that this is indeed the continuation monad's ¢
-operator:
+This discussion is based on Wadler's paper `Composable continuations'.
+
+The unit and the combination rule work very much like we are used to
+from the various monads we've studied.
+
+In particular, we can easily see that the ¢ operator defined just
+above is exactly the same as the ¢ operator from the continuation monad:
  
        ¢ (\k.g[kf]) (\k.h[kx])
      = (\MNk.M(\m.N(\n.k(mn)))) (\k.g[kf]) (\k.h[kx])
@@ -416,21 +441,165 @@ operator:
      == ------
           fx
  
-However, these continuations do not form an official monad.  The
-reason is that (see Wadler's paper `Composable continuations' for details).
+However, these continuations do not form an official monad.
+One way to see this is to consider the type of the bind operator.
+The type of the bind operator in a genuine monad is
+
+    mbind : Mα -> (α -> Mβ) -> Mβ
+
+In particular, the result type of the second argument (Mβ) must be
+identical to the result type of the bind operator overall.  For the
+continuation monad, this means that mbind has the following type:
+
+                     ((α -> γ) -> ρ)
+             -> α -> ((β -> δ) -> ρ)
+                  -> ((β -> δ) -> ρ)
+
+But the type of the bind operator in our generalized continuation
+system (call it "kbind") is
+
+    kbind : 
+                     ((α -> γ) -> ρ)
+             -> α -> ((β -> δ) -> γ)
+                  -> ((β -> δ) -> ρ)
+
+Note that `(β -> δ) -> γ` is not the same as `(β -> δ) -> ρ`.
+
+These more general types work out fine when plugged into the
+continuation monad's bind operator:
+
+    kbind u f = \k.   u           (\x.     f              x    k      )
+                 β->δ (α->γ)->ρ     α      α->(β->δ)->γ   α    β->δ
+                                           -----------------
+                                              (β->δ)->γ
+                                              ------------------------
+                                                      γ
+                                   --------------------
+                                        α->γ
+                       ---------------------
+                              ρ
+                ---------------
+                   (β->δ)->ρ
+
+Neverthless, it's easy to see that the generalized continuation system
+obeys the monad laws.  We haven't spent much time proving monad laws,
+so this seems like a worthy occasion on which to give some details.
+Since we're working with bind here, we'll use the version of the monad
+laws that are expressed in terms of bind:
+
+    Prove u >>= ⇧ == u:
+
+       u >>= (\ak.ka)
+    == (\ufk.u(\x.fxk)) u (\ak.ka)
+    ~~> \k.u(\x.(\ak.ka)xk)
+    ~~> \k.u(\x.kx)
+    ~~> \k.uk
+    ~~> u
+
+The last two steps are eta reductions.
+
+    Prove ⇧a >>= f == f a:
+
+       ⇧a >>= f
+    == (\ak.ka)a >>= f
+    ~~> (\k.ka) >>= f
+    == (\ufk.u(\x.fxk)) (\k.ka) f
+    ~~> \k.(\k.ka)(\x.fxk)
+    ~~> \k.fak
+    ~~> fa
+
+    Prove u >>= (\a.fa >>= g) == (u >>= f) >>= g:
+        
+       u >>= (\a.fa >>= g)
+    == u >>= (\a.(\k.fa(\x.gxk)))
+    == (\ufk.u(\x.fxk)) u (\a.(\k.fa(\x.gxk)))
+    == \k.u(\x.(\a.(\k.fa(\x.gxk)))xk)
+    ~~> \k.u(\x.fx(\y.gyk))
+
+       (u >>= f) >>= g
+    == (\ufk.u(\x.fxk)) u f >>= g
+    ~~> (\k.u(\x.fxk)) >>= g
+    == (\ufk.u(\x.fxk)) (\k.u(\x.fxk)) g
+    ~~> \k.(\k.u(\x.fxk))(\y.gyk)
+    ~~> \k.u(\x.fx(\y.gyk))
+
+The fact that the monad laws hold means that we can rely on any
+reasoning that depends on the monad laws.
+
+## Syntactic refinements: subtypes of implication
+
+Because natural language allows the functor to be on the left or on
+the right, we replace the type arrow -> with a left-leaning version \
+and a right-leaning version, as follows:
+
+    α/β   β    = α
+      β   β\α  = α
+
+This means (without adding some fancy footwork, as in Charlow 2014) we
+need two versions of ¢ too, one for each direction for the unmonadized types.
+
+Just to be explicit, here are the two versions:
+
+    g[]    γ | δ      h[]   δ | ρ    g[h[]]   γ | ρ
+    --- : -------  ¢  --- : ----- == ------ : -----
+    f       α/β        x       α        fx       β
+
+    h[]   δ | ρ      g[]    γ | δ       g[h[]]   γ | ρ
+    --- : -----  ¢   --- : -------   == ------ : -----
+     x      α         f      β\α          fx       β
+
+With respect to types, they differ only in replacing α -> β with α/β
+(top version) and with β\α (bottom version).  With respect to
+syntactic order, they differ in a parallel way with respect to whether
+the function f is on the left of its argument x (top version) or on
+the right (bottom version).
+
+Logically, separating -> into \\ and / corresponds to rejecting the
+structural rule of exchange.
+
+Note that the \\ and / only govern types at the bottom of the tower.
+That is, we currently still have arrow at the higher-order levels of
+the types, if we undo the tower notation:
+
+    γ|δ
+    --- == ((α/β) -> δ) -> γ
+    α/β
+
+We'll change these arrows into left-leaning and right-leaning versions
+too, according to the following scheme:
+
+    γ|δ
+    --- == γ//((α/β) \\ δ)
+    α/β
+
+As we'll see in a minute, each of these for implications (\\, /, \\\\,
+//) will have a syntactic interpretation:
+
+    \   argument is adjacent on the left of the functor
+    /    argument is adjacent on the right of the functor
+    \\   argument is surrounded by the functor
+    //  argument surrounds the functor
+
+## LOWER (reset)
+
+One more ingredient: one thing that shifty continuation operators
+require is an upper bound, a reset to show the extent of the remainder
+of the computation that the shift operator captures.  In the list
+version of the doubling task, we have
  
-Neverthless, obeys the monad laws.
+    "a#bdeSfg" ~~> "abdebdefg"   continuation captured: bde
+    "ab#deSfg" ~~> "abdedefg"    continuation captured:  de
  
-Oh, one more thing: because natural language allows the functor to be
-on the left or on the right, we replace the type arrow -> with a
-left-leaning version \ and a right-leaning version, as follows:
+In programming languages, resets are encoded in the computation
+explicitly.  In natural language, resets are always invisible.
+We'll deal with this in the natural language setting by allowing
+spontaneous resets, as long as the types match the following recipe:
  
-    α/β  β = α
-    β  β\α = α
+           g[] α|S
+    LOWER (---:---) == g[p]:α
+            p   S
  
-This means we need two versions of ¢ too (see Barker and Shan 2014
-chapter 1 for full details).
+This will be easiest to explain by presenting our first complete
+example from natural language:
  
-This is (almost) all we need to get some significant linguistic work
-done.