topics/_week15_continuation_applications.mdwn

   1 <!-- λ ◊ ≠ ∃ Λ ∀ ≡ α β γ ρ ω φ ψ Ω ○ μ η δ ζ ξ ⋆ ★ • ∙ ● ⚫ 𝟎 𝟏 𝟐 𝟘 𝟙 𝟚 𝟬 𝟭 𝟮 ⇧ (U+2e17) ¢ -->
   2 [[!toc]]
   3
   4 # Applications of continuations to natural language
   5
   6 We've seen a number of applications of monads to natural language,
   7 including presupposition projection, binding, intensionality, and the
   8 dynamics of the GSV fragment.
   9
  10 In the past couple of weeks, we've introduced continuations, first as
  11 a functional programming technique, then in terms of list and tree
  12 zippers, then as a monad.  In this lecture, we will generalize
  13 continuations slightly beyond a monad, and then begin to outline some
  14 of the applications of monads.  In brief, the generalization can be
  15 summarized in terms of types: instead of using a Kleisli arrow mapping
  16 a type α to a continuized type α -> ρ -> ρ, we'll allow the result
  17 types to differ, i.e., we'll map α to α -> β -> γ.  This will be
  18 crucial for some natural language applications.
  19
  20 Many (though not all) of the applications are discussed in detail in
  21 Barker and Shan 2014, *Continuations in Natural Language*, OUP.
  22
  23 In terms of list zippers, the continuation of a focussed element in
  24 the list is the front part of the list.
  25
  26     list zipper for the list [a;b;c;d;e;f] with focus on d:
  27
  28         ([c;b;a], [d;e;f])
  29          -------
  30      defunctionalized
  31      continuation
  32
  33 In terms of tree zippers, the continuation is the entire context of
  34 the focussed element--the entire rest of the tree.
  35
  36 [drawing of a broken tree]
  37
  38 Last week we had trouble computing the doubling task when there was more
  39 than one shifty operator after moving from a list perspective to a
  40 tree perspective.  That is, it remained unclear why "aScSe" was
  41
  42     "aacaceecaacaceecee"
  43
  44 We'll burn through that conceptual fog today.  The natural thing to
  45 try would have been to defunctionalize the continuation-based solution
  46 using a tree zipper.  But that would not have been easy, since the
  47 natural way to implement the doubling behavior of the shifty operator
  48 would have been to simply copy the context provided by the zipper.
  49 This would have produced two uncoordinated copies of the other shifty
  50 operator, and we'd have been in the situation described in class of
  51 having a reduction strategy that never reduced the number of shifty
  52 operators below 2.
  53
  54 Instead, we'll re-interpreting what the continuation monad was doing
  55 in defunctionalized terms by using Quantifier Raising (a technique
  56 from linguistics).
  57
  58 But first, motivating quantifier scope as a linguistic application.
  59
  60 # The primary application of continuations to natural language: scope-taking
  61
  62 We have seen that continuations allow a deeply-embedded element to
  63 take control over (a portion of) the entire computation that contains
  64 it.  In natural language semantics, this is exactly what it means for
  65 a scope-taking expression to take scope.
  66
  67     1. [Ann put a copy of [everyone]'s homeworks in her briefcase]
  68
  69     2. For every x, [Ann put a copy of x's homeworks in her briefcase]
  70
  71 The sentence in (1) can be paraphrased as in (2), in which the
  72 quantificational DP *every student* takes scope over the rest of the sentence.
  73 Even if you suspect that there could be an analysis of (2) on which
  74 "every student's term paper" could denote some kind of mereological
  75 fusion of a set of papers, it is much more difficult to be satisfied
  76 with a referential analysis when *every student* is replaced with
  77 *no student*, or *fewer than three students*, and so on---see any
  78 semantics text book for abundant discussion.
  79
  80 We can arrive at an analysis by expressing the meaning of
  81 quantificational DP such as *everyone* using continuations:
  82
  83     3. everyone = shift (\k.∀x.kx)
  84
  85 Assuming there is an implicit reset at the top of the sentence (we'll
  86 explicitly address determining where there is or isn't a reset), the
  87 reduction rules for `shift` will apply the handler function (\k.∀x.kx)
  88 to the remainder of the sentence after abstracting over the position
  89 of the shift expression:
  90
  91     [Ann put a copy of [shift (\k.∀x.kx)]'s homeworks in her briefcase]
  92     ~~> (\k.∀x.kx) (\v. Ann put a copy of v's homeworks in her briefcase)
  93     ~~> ∀x. Ann put a copy of x's homeworks in her briefcase
  94
  95 (To be a bit pedantic, this reduction sequence is more suitable for
  96 shift0 than for shift, but we're not being fussy here about subflavors
  97 of shifty operators.)
  98
  99 The standard technique for handling scope-taking in linguistics is
 100 Quantifier Raising (QR).  As you might suppose, the rule for Quantifier
 101 Raising closely resembles the reduction rule for shift:
 102
 103     Quantifier Raising: given a sentence [... [QDP] ...], build a new
 104     sentence [QDP (\x.[... [x] ...])].
 105
 106 Just to emphasize the similarity between QR and shift, we can use QR
 107 to provide insight into the tree task that mystified us earlier.
 108
 109 \tree (. (a)((S)((d)((S)(e)))))
 110
 111   .
 112 __|___
 113 |    |
 114 a  __|___
 115    |    |
 116    S  __|__
 117       |   |
 118       d  _|__
 119          |  |
 120          S  e
 121
 122 First we QR the lower shift operator
 123
 124 \tree (. (S) ((\\x) ((a)((S)((d)((x)(e)))))))
 125
 126    .
 127 ___|___
 128 |     |
 129 S  ___|___
 130    |     |
 131    \x  __|___
 132        |    |
 133        a  __|___
 134           |    |
 135           S  __|__
 136              |   |
 137              d  _|__
 138                 |  |
 139                 x  e
 140
 141 Next, we QR the upper shift operator
 142
 143 \tree (. (S) ((\\y) ((S) ((\\x) ((a)((y)((d)((x)(e)))))))))
 144
 145    .
 146 ___|___
 147 |     |
 148 S  ___|____
 149    |      |
 150    \y  ___|___
 151        |     |
 152        S  ___|___
 153           |     |
 154           \x  __|___
 155               |    |
 156               a  __|___
 157                  |    |
 158                  y  __|__
 159                     |   |
 160                     d  _|__
 161                        |  |
 162                        x  e
 163
 164 We then evaluate, using the same value for the shift operator proposed before:
 165
 166     shift = \k.k(k "")
 167
 168 It will be easiest to begin evaluating this tree with the lower shift
 169 operator (we get the same result if we start with the upper one).
 170 The relevant value for k is (\x.a(y(d(x e)))).  Then k "" is
 171 a(y(d(""(e)))), and k(k "") is a(y(d((a(y(d(""(e)))))(e)))).  In tree
 172 form:
 173
 174 \tree (. (S) ((\\y) ((a)((y)((d)(((a)((y)((d)(("")(e)))))(e)))))))
 175
 176    .
 177 ___|___
 178 |     |
 179 S  ___|____
 180    |      |
 181    \y  ___|___
 182        |     |
 183        a  ___|___
 184           |     |
 185           y  ___|___
 186              |     |
 187              d  ___|___
 188                 |     |
 189               __|___  e
 190               |    |
 191               a  __|___
 192                  |    |
 193                  y  __|___
 194                     |    |
 195                     d  __|__
 196                        |   |
 197                        ""  e
 198
 199
 200 Repeating the process for the upper shift operator replaces each
 201 occurrence of y with a copy of the whole tree.
 202
 203 \tree (. ((a)((((a)(("")((d)(((a)(("")((d)(("")(e)))))(e))))))((d)(((a)((((a)(("")((d)(((a)(("")((d)(("")(e)))))(e))))))((d)(("")(e)))))(e))))))
 204
 205       .
 206       |
 207 ______|______
 208 |           |
 209 a  _________|__________
 210    |                  |
 211    |               ___|___
 212 ___|___            |     |
 213 |     |            d  ___|____
 214 a  ___|____           |      |
 215    |      |        ___|____  e
 216    ""  ___|___     |      |
 217        |     |     a  ____|_____
 218        d  ___|___     |        |
 219           |     |     |      __|___
 220        ___|___  e  ___|___   |    |
 221        |     |     |     |   d  __|__
 222        a  ___|___  a  ___|____  |   |
 223           |     |     |      |  ""  e
 224           ""  __|___  ""  ___|___
 225               |    |      |     |
 226               d  __|__    d  ___|___
 227                  |   |       |     |
 228                  ""  e    ___|___  e
 229                           |     |
 230                           a  ___|___
 231                              |     |
 232                              ""  __|___
 233                                  |    |
 234                                  d  __|__
 235                                     |   |
 236                                     ""  e
 237
 238 The yield of this tree (the sequence of leaf nodes) is aadadeedaadadeedee.
 239
 240 Exercise: the result is different, by the way, if the QR occurs in a
 241 different order.
 242
 243 Three lessons:
 244
 245 * Generalizing from one-sided, list-based continuation
 246   operators to two-sided, tree-based continuation operators is a
 247   dramatic increase in power and complexity.
 248
 249 * Operators that
 250   compose multiple copies of a context can be hard to understand.
 251
 252 * When considering two-sided, tree-based continuation operators,
 253   quantifier raising is a good tool for visualizing (defunctionalizing)
 254   the computation.
 255
 256 ## Tower notation
 257
 258 At this point, we have three ways of representing computations
 259 involving control operators such as shift and reset: using a CPS
 260 transform, lifting into a continuation monad, and by using QR.
 261
 262 QR is the traditional system in linguistics, but it will not be
 263 adequate for us in general.  The reason has to do with order.  As
 264 we've discussed, especially with respect to the CPS transform,
 265 continuations allow fine-grained control over the order of evaluation.
 266 One of the main empirical claims of Barker and Shan 2014 is that
 267 natural language is sensitive to evaluation order.  Unlike other
 268 presentations of continuations, QR does not lend itself to reasoning
 269 about evaluation order, so we will need to use a different strategy.
 270
 271 [Note to self: it is interesting to consider what it would take to
 272 reproduce the analyses giving in Barker and Shan in purely QR terms.
 273 Simple quantificational binding using parasitic scope should be easy,
 274 but how reconstruction would work is not so clear.]
 275
 276 We'll present tower notation, then comment and motivate several of its
 277 features as we consider various applications.  For now, we'll motivate
 278 the tower notation by thinking about box types.  In the discussion of
 279 monads, we've thought of monadic types as values inside of a box.  The
 280 box will often contain information in addition to the core object.
 281 For instance, in the Reader monad, a boxed int contains an expression
 282 of type int as the payload, but also contains a function that
 283 manipulates a list of information.  It is natural to imagine
 284 separating a box into two regions, the payload and the hidden scratch
 285 space:
 286
 287     _______________               _______________           _______________
 288     | [x->2, y->3] |              | [x->2, y->3] |          | [x->2, y->3] |
 289   -------------------           ------------------         ------------------
 290     |              |     ¢        |              |    =     |              |
 291     |    +2        |              |     y        |          |     5        |
 292     |______________|              |______________|          |______________|
 293
 294
 295 (Imagine the + operation has been lifted into the Reader monad too.)
 296
 297 For people who are familiar with Discourse Representation Theory (Kamp
 298 1981, Kamp and Reyle 1993), this separation of boxes into payload and
 299 discourse scorekeeping will be familiar (although many details differ).
 300
 301 The general pattern is that monadic treatments separate computation
 302 into an at-issue (pre-monadic) computation with a layer at which
 303 side-effects occur.
 304
 305 The tower notation is a precise way of articulating continuation-based
 306 computations into a payload and (potentially multiple) layers of side-effects.
 307 We won't keep the outer box, but we will keep the horizontal line
 308 dividing main effects from side-effects.
 309
 310 Tower convention for types:
 311                                               γ | β
 312     (α -> β) -> γ can be equivalently written -----
 313                                                 α
 314
 315 Tower convention for values:
 316                                            g[]
 317     \k.g[k(x)] can be equivalently written ---
 318                                             x
 319
 320 If \k.g[k(x)] has type (α -> β) -> γ, then k has type (α -> β).
 321
 322 Here "g[ ]" is a *context*, that is, an expression with (exactly) one
 323 hole in it.  For instance, we might have g[x] = \forall x.P[x].
 324
 325 We'll use a simply-typed system with two atomic types, DP (the type of
 326 individuals) and S (the type of truth values).
 327
 328 Then in the spirit of monadic thinking, we'll have a way of lifting an
 329 arbitrary value into the tower system:
 330
 331                                            []    γ|β
 332     LIFT (x:α) = \k.kx : (α -> β) -> γ ==  --- : ---
 333                                            x      α
 334
 335 Obviously, LIFT is exactly the midentity (the unit) for the continuation monad.
 336 The name comes from Partee's 1987 theory of type-shifters for
 337 determiner phrases.  Importantly, LIFT applied to an
 338 individual-denoting expression yields the generalized quantifier
 339 proposed by Montague as the denotation for proper names:
 340
 341                                             []   S|S
 342     LIFT (j:DP) = \k.kx : (DP -> S) -> S == -- : ---
 343                                             j    DP
 344
 345 So if the proper name *John* denotes the individual j, LIFT(j) is the
 346 generalized quantifier that maps each property k of type DP -> S to true
 347 just in case kj is true.
 348
 349 Once we have expressions of type (α -> β) -> γ, we'll need to combine
 350 them.  We'll use the ¢ operator from the continuation monad:
 351
 352     g[]    γ | δ      h[]   δ | ρ    g[h[]]   γ | ρ
 353     --- : -------  ¢  --- : ----- == ------ : -----
 354     f     α -> β      x       α        fx       β
 355
 356 Note that the types below the horizontal line combine just like
 357 functional application (i.e, f:(α->β) (x:α) = fx:β).
 358
 359 To demonstrate that this is indeed the continuation monad's ¢
 360 operator:
 361
 362       ¢ (\k.g[kf]) (\k.h[kx])
 363     = (\MNk.M(\m.N(\n.k(mn)))) (\k.g[kf]) (\k.h[kx])
 364     ~~> \k.(\k.g[kf])(\m.(\k.h[kx])(\n.k(mn))
 365     ~~> \k.g[(\k.h[kx])(\n.k(fn))
 366     ~~> \k.g[h[k(fx)]]
 367
 368        g[h[]]
 369     == ------
 370          fx
 371
 372 Not a monad (Wadler); would be if the types were
 373 Neverthless, obeys the monad laws.
 374
 375 This is (almost) all we need to get some significant linguistic work
 376 done.
 377