topics/_week14_continuations.mdwn

   1 <!-- λ ◊ ≠ ∃ Λ ∀ ≡ α β γ ρ ω φ ψ Ω ○ μ η δ ζ ξ ⋆ ★ • ∙ ● ⚫ 𝟎 𝟏 𝟐 𝟘 𝟙 𝟚 𝟬 𝟭 𝟮 ⇧ (U+2e17) ¢ -->
   2
   3 [[!toc]]
   4
   5 # Continuations
   6
   7 Last week we saw how to turn a list zipper into a continuation-based
   8 list processor.  The function computed something we called "the task",
   9 which was a simplified langauge involving control operators.
  10
  11     abSdS ~~> ababdS ~~> ababdababd
  12
  13 The task is to process the list from left to right, and at each "S",
  14 double the list so far.  Here, "S" is a kind of control operator, and
  15 captures the entire previous computation.  We also considered a
  16 variant in which '#' delimited the portion of the list to be copied:
  17
  18     ab#deSfg ~~> abededfg
  19
  20 In this variant, "S" and "#" correspond to `shift` and `reset`, which
  21 provide access to delimited continuations.
  22
  23 The expository logic of starting with this simplified task is the
  24 notion that as lists are to trees, so is this task to full-blown
  25 continuations.  So to the extent that, say, list zippers are easier to
  26 grasp than tree zippers, the task is easier to grasp than full
  27 continuations.
  28
  29 We then presented CPS transforms, and demonstrated how they provide
  30 an order-independent analysis of order of evaluation.
  31
  32 In order to continue to explore continuations, we will proceed in the
  33 following fashion: we introduce the traditional continuation monad,
  34 and show how it solves the task, then generalize the task to
  35 include doubling of both the left and the right context.
  36
  37 ## The continuation monad
  38
  39 In order to build a monad, we start with a Kleisli arrow.
  40
  41     Continuation monad: types: given some ρ, Mα => (α -> ρ) -> ρ
  42                         ⇧ == \ak.ka : a -> Ma
  43                         bind == \ufk. u(\x.fxk)
  44
  45 We'll first show that this monad solves the task, then we'll consider
  46 the monad in more detail.
  47
  48 The unmonadized computation (without the shifty "S" operator) is
  49
  50     t1 = + a (+ b (+ c d)) ~~> abcd
  51
  52 where "+" is string concatenation and the symbol a is shorthand for
  53 the string "a".
  54
  55 In order to use the continuation monad to solve the list task,
  56 we choose α = ρ = [Char].  So "abcd" is a list of characters, and
  57 a boxed list has type M[Char] == ([Char] -> [Char]) -> [Char].
  58
  59 Writing ¢ in between its arguments, t1 corresponds to the following
  60 monadic computation:
  61
  62     mt1 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (⇧+ ¢ ⇧c ¢ ⇧d))
  63
  64 We have to lift each functor (+) and each object (e.g., "b") into the
  65 monad using mid (`⇧`), then combine them using monadic function
  66 application, where
  67
  68     ¢ mf mx = \k -> mf (\f -> mx (\a -> k(f x)))
  69
  70 for the continuation monad.
  71
  72 The way in which we extract a value from a continuation box is by
  73 applying it to a continuation; often, it is convenient to supply the
  74 trivial continuation, the identity function \k.k = I.  So in fact,
  75
  76    t1 = mt1 I
  77
  78 That is, the original computation is the monadic version applied to
  79 the trivial continuation.
  80
  81 We can now imagine replacing the third element ("c") with a shifty
  82 operator.  We would like to replace just the one element, and we will
  83 do just that in a moment; but in order to simulate the original task,
  84 we'll have to take a different strategy initially.  We'll start by
  85 imagining a shift operator that combined direction with the tail of
  86 the list, like this:
  87
  88     mt2 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (shift ¢ ⇧d))
  89
  90 We can now define a shift operator to perform the work of "S":
  91
  92     shift u k = u(\s.k(ks))
  93
  94 Shift takes two arguments: a string continuation u of type M[Char],
  95 and a string continuation k of type [Char] -> [Char].  Since u is the
  96 the argument to shift, it represents the tail of the list after the
  97 shift operator.  Then k is the continuation of the expression headed
  98 by `shift`.  So in order to execute the task, shift needs to invoke k
  99 twice.
 100
 101     mt2 I == "ababd"
 102
 103 just as desired.
 104
 105 Let's just make sure that we have the left-to-right evaluation we were
 106 hoping for by evaluating "abSdeSf":
 107
 108     mt3 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (shift ¢ (⇧+ ¢ ⇧d ¢ (⇧+ ¢ ⇧e ¢ (shift ⇧f)))))
 109
 110 Then
 111
 112     mt3 I = "ababdeababdef"   -- structure: (ababde)(ababde)f
 113
 114
 115 As expected.
 116
 117 For a reset operator #, we can have
 118
 119     # u k = k(u(\k.k))   -- ex.: ab#deSf ~~> abdedef
 120
 121 So the continuation monad solves the list task using continuations in
 122 a way that conforms to our by-now familiar strategy of lifting a
 123 computation into a monad, and then writing a few key functions (shift,
 124 reset) that exploit the power of the monad.
 125
 126 Now we should consider what happens when we write a shift operator
 127 that takes the place of a single letter.
 128
 129     mt2 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (shift ¢ ⇧d))
 130     mt4 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (⇧+ ¢ shift' ¢ ⇧d))
 131
 132 Instead of mt2, we have mt4.  So now the type of "c" (a boxed string,
 133 type M[Char]) is the same as the type of the new shift operator, shift'.
 134
 135     shift' = \k.k(k"")
 136
 137 This shift operator takes a continuation k of type [Char]->[Char], and
 138 invokes it twice.  Since k requires an argument of type [Char], we
 139 need to use the first invocation of k to construction a [Char]; we do
 140 this by feeding it a string.  Since the task does not replace the
 141 shift operator with any marker, we give the empty string "" as the
 142 argument.
 143
 144 But now the new shift operator captures more than just the preceeding
 145 part of the construction---it captures the entire context, including
 146 the portion of the sequence that follows it.  That is,
 147
 148     mt4 I = "ababdd"
 149
 150 We have replaced "S" in "abSd" with "ab_d", where the underbar will be
 151 replaced with the empty string supplied in the definition of shift'.
 152 Crucially, not only is the prefix "ab" duplicated, so is the suffix
 153 "d".
 154
 155 Things get interesting when we have more than one operator in the
 156 initial list.  What should we expect if we start with "aScSe"?
 157 If we assume that when we evaluate each S, all the other S's become
 158 temporarily inert, we expect a reduction path like
 159
 160     aScSe ~~> aacSecSe
 161
 162 But note that the output has just as many S's as the input--if that is
 163 what our reduction strategy delivers, then any initial string with
 164 more than one S will never reach a normal form.
 165
 166 But that's not what the continuation operator shift' delivers.
 167
 168     mt5 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ shift' ¢ (⇧+ ¢ ⇧c ¢ (⇧+ ¢ shift' ¢ "e")))
 169
 170     mt5 I = "aacaceecaacaceecee" -- structure: "aacaceecaacaceecee"
 171
 172 Huh?
 173
 174 This is considerably harder to understand than the original list task.
 175 The key is figuring out in each case what function the argument k to
 176 the shift operator gets bound to.
 177
 178 Let's go back to a simple one-shift example, "aSc".  Let's trace what
 179 the shift' operator sees as its argument k by replacing ⇧ and ¢ with
 180 their definitions:
 181
 182       ⇧+ ¢ ⇧a ¢ (⇧+ ¢ shift' ¢ ⇧c) I
 183    = \k.⇧+(\f.⇧a(\x.k(fx))) ¢ (⇧+ ¢ shift' ¢ ⇧c) I
 184    = \k.(\k.⇧+(\f.⇧a(\x.k(fx))))(\f.(⇧+ ¢ shift' ¢ ⇧c)(\x.k(fx))) I
 185    ~~> (\k.⇧+(\f.⇧a(\x.k(fx))))(\f.(⇧+ ¢ shift' ¢ ⇧c)(\x.I(fx)))
 186    ~~> (\k.⇧+(\f.⇧a(\x.k(fx))))(\f.(⇧+ ¢ shift' ¢ ⇧c)(f))
 187    ~~> ⇧+(\f.⇧a(\x.(\f.(⇧+ ¢ shift' ¢ ⇧c)(f))(fx))))
 188    ~~> ⇧+(\f.⇧a(\x.(⇧+ ¢ shift' ¢ ⇧c)(fx)))
 189    = (\k.k+)(\f.⇧a(\x.(⇧+ ¢ shift' ¢ ⇧c)(fx)))
 190    ~~> ⇧a(\x.(⇧+ ¢ shift' ¢ ⇧c)(+x))
 191    = (\k.ka)(\x.(⇧+ ¢ shift' ¢ ⇧c)(+x))
 192    ~~> (⇧+ ¢ shift' ¢ ⇧c)(+a)
 193    = (\k.⇧+(\f.shift(\x.k(fx)))) ¢ ⇧c (+a)
 194    = (\k.(\k.⇧+(\f.shift(\x.k(fx))))(\f.⇧c(\x.k(fx))))(+a)
 195    ~~> (\k.⇧+(\f.shift(\x.k(fx))))(\f'.⇧c(\x'.(+a)(f'x')))
 196    ~~> ⇧+(\f.shift(\x.(\f'.⇧c(\x'.(+a)(f'x')))(fx)))
 197    ~~> ⇧+(\f.shift(\x.⇧c(\x'.(+a)((fx)x'))))
 198    = (\k.k+)(\f.shift(\x.⇧c(\x'.(+a)((fx)x'))))
 199    ~~> shift(\x.⇧c(\x'.(+a)((+x)x'))))
 200    = shift(\x.(\k.kc)(\x'.(+a)((+x)x'))))
 201    ~~> shift(\x.(+a)((+x)c))
 202
 203 So now we see what the argument of shift will be: a function k from
 204 strings x to the string asc.  So shift k will be k(k "") = aacc.
 205
 206 Ok, this is ridiculous.  We need a way to get ahead of this deluge of
 207 lambda conversion.  We'll adapt the notational strategy developed in
 208 Barker and Shan 2014:
 209
 210 Instead of writing
 211
 212     \k.g(kf): (α -> ρ) -> ρ
 213
 214 we'll write
 215
 216     g[]    ρ
 217     --- : ---
 218      f     α
 219
 220 Then
 221              []
 222     mid(x) = --
 223              x
 224
 225 and
 226
 227     g[]    ρ      h[]    ρ    g[h[]]    ρ
 228     --- : ----  ¢ --- : --- = ------ : ---
 229      f    α->β     x     α     fx       β
 230
 231 Here's the justification:
 232
 233         (\FXk.F(\f.X(\x.k(fx)))) (\k.g(kf)) (\k.h(kx))
 234     ~~> (\Xk.(\k.g(kf))(\f.X(\x.k(fx)))) (\k.h(kx))
 235     ~~> \k.(\k.g(kf))(\f.(\k.h(kx))(\x.k(fx)))
 236     ~~> \k.g((\f.(\k.h(kx))(\x.k(fx)))f)
 237     ~~> \k.g((\k.h(kx))(\x.k(fx)))
 238     ~~> \k.g(h(\x.k(fx))x)
 239     ~~> \k.g(h(k(fx)))
 240
 241 Then
 242                           (\ks.k(ks))[]
 243     shift = \k.k(k("")) = -------------
 244                                ""
 245
 246 Let 2 == \ks.k(ks).
 247
 248 so aSc lifted into the monad is
 249
 250     []     2[]   []
 251     -- ¢ ( --- ¢ --- ) =
 252     a      ""    c
 253
 254 First, we need map2 instead of map1.  Second, the type of the shift
 255 operator will be a string continuation, rather than a function from
 256 string continuations to string continuations.
 257
 258 (\k.k(k1))(\s.(\k.k(k2))(\r.sr))
 259 (\k.k(k1))(\s.(\r.sr)((\r.sr)2))
 260 (\k.k(k1))(\s.(\r.sr)(s2))
 261 (\k.k(k1))(\s.s(s2))
 262 (\s.s(s2))((\s.s(s2))1)
 263 (\s.s(s2))(1(12))
 264 (1(12))((1(12)2))
 265
 266
 267
 268
 269
 270 But here's the interesting part: from the point of view of the shift
 271 operator, the continuation that it will be fed will be the entire rest
 272 of the computation.  This includes not only what has come before, but
 273 what will come after it as well.  That means when the continuation is
 274 doubled (self-composed), the result duplicates not only the prefix
 275 `(ab ~~> abab)`, but also the suffix `(d ~~> dd)`.  In some sense, then,
 276 continuations allow functions to see into the future!
 277
 278 What do you expect will be the result of executing "aScSe" under the
 279 second perspective?  The answer to this question is not determined by
 280 the informal specification of the task that we've been using, since
 281 under the new perspective, we're copying complete (bi-directional)
 282 contexts rather than just left contexts.
 283
 284 It might be natural to assume that what execution does is choose an S,
 285 and double its context, temporarily treating all other shift operators
 286 as dumb letters, then choosing a remaining S to execute.  If that was
 287 our interpreation of the task, then no string with more than one S
 288 would ever terminate, since the number S's after each reduction step
 289 would always be 2(n-1), where n is the number before reduction.
 290
 291 But there is another equally natural way to answer the question.
 292 Assume the leftmost S is executed first.  What will the value of its
 293 continuation k be?  It will be a function that maps a string s to the
 294 result of computing `ascSe`, which will be `ascascee`.  So `k(k "")`
 295 will be `k(acacee)`, which will result in `a(acacee)ca(acacee)cee`
 296 (the parentheses are added to show stages in the construction of the
 297 final result).
 298
 299 So the continuation monad behaves in a way that we expect a
 300 continuation to behave.  But where did the continuation monad come
 301 from?  Let's consider some ways of deriving the continuation monad.
 302
 303 ## Viewing Montague's PTQ as CPS
 304
 305 Montague's conception of determiner phrases as generalized quantifiers
 306 is a limited form of continuation-passing.  (See, e.g., chapter 4 of
 307 Barker and Shan 2014.)  Start by assuming that ordinary DPs such as
 308 proper names denote objects of type `e`.  Then verb phrases denote
 309 functions from individuals to truth values, i.e., functions of type `e
 310 -> t`.
 311
 312 The meaning of extraordinary DPs such as *every woman* or *no dog*
 313 can't be expressed as a simple individual.  As Montague argued, it
 314 works much better to view them as predicates on verb phrase meanings,
 315 i.e., as having type `(e->t)->t`.  Then *no woman left* is true just
 316 in case the property of leaving is true of no woman:
 317
 318     no woman:  \k.not \exists x . (woman x) & kx
 319     left: \x.left x
 320     (no woman) (left) = not \exists x . woman x & left x
 321
 322 Montague also proposed that all determiner phrases should have the
 323 same type.  After all, we can coordinate proper names with
 324 quantificational DPs, as in *John and no dog left*.  Then generalized
 325 quantifier corresponding to the proper name *John* is the quantifier
 326 `\k.kj`.
 327
 328 At this point, we have a type for our Kliesli arrow and a value for
 329 our mid.  Given some result type (such as `t` in the Montague application),
 330
 331     α ==> (α -> ρ) -> ρ
 332     ⇧a = \k.ka
 333
 334 It remains only to find a suitable value for bind.  Montague didn't
 335 provide one, but it's easy to find:
 336
 337     bind ::    Mα -> (α -> Mβ) -> Mβ
 338
 339 given variables of the following types
 340
 341     u :: Mα == (α -> ρ) -> ρ
 342     f :: α -> Mβ
 343     k :: β -> ρ
 344     x :: α
 345
 346 We have
 347
 348     bind u f = \k.u(\x.fxk)
 349
 350 Let's carefully make sure that this types out:
 351
 352     bind u f = \k.       u      (\x .   f       x     k)
 353                                       --------  --
 354                                       α -> Mβ   α
 355                                      ------------  ------
 356                                          Mβ        β -> ρ
 357                                   --  --------------------
 358                                   α            ρ
 359                   -------------  ------------------------
 360                   (α -> ρ) -> ρ             α -> ρ
 361                   ---------------------------------
 362                                 ρ
 363                 -----------------------
 364                     (β -> ρ) -> ρ
 365
 366 Yep!
 367
 368 Is there any other way of building a bind operator?  Well, the
 369 challenge is getting hold of the "a" that is buried inside of `u`.
 370 In the Reader monad, we could get at the a inside of the box by
 371 applying the box to an environment.  In the State monad, we could get
 372 at the a by applying to box to a state and deconstructing the
 373 resulting pair.  In the continuation case, the only way to do it is to
 374 apply the boxed a (i.e., u) to a function that takes an a as an
 375 argument.  That means that f must be invoked inside the scope of the
 376 functional argument to u.  So we've deduced the structure
 377
 378     ... u (\x. ... f x ...) ...
 379
 380 At this point, in order to provide u with an argument of the
 381 appropriate type, the argument must not only take objects of type
 382 α as an argument, it must return a result of type ρ.  That means that
 383 we must apply fx to an object of type β -> ρ.  We can hypothesize such
 384 an object, as long as we eliminate that hypothesis later (by binding
 385 it), and we have the complete bind operation.
 386
 387 The way in which the value of type α that is needed in order to unlock
 388 the function f is hidden inside of u is directly analogous to the
 389 concept of "data hiding" in object-oriented programming.  See Pierce's
 390 discussion of how to extend system F with existential type
 391 quantification by encoding the existential using continuations.
 392
 393 So the Kliesli type pretty much determines the bind operation.
 394
 395 ## What continuations are doing
 396
 397 Ok, we have a continuation monad.  We derived it from first
 398 principles, and we have seen that it behaves at least in some respects
 399 as we expect a continuation monad to behave (in that it allows us to
 400 give a good implementation of the task).
 401
 402 ## How continuations can simulate other monads
 403
 404 Because the continuation monad allows the result type ρ to be any
 405 type, we can choose ρ in clever ways that allow us to simulate other
 406 monads.
 407
 408     Reader: ρ = env -> α
 409     State: ρ = s -> (α, s)
 410     Maybe: ρ = Just α | Nothing
 411
 412 You see how this is going to go.  Let's see an example by adding an
 413 abort operator to our task language, which represents what
 414 we want to have happen if we divide by zero, where what we want to do
 415 is return Nothing.
 416
 417     abort k = Nothing
 418     mid a k = k a
 419     map2 f u v k = u(\u' -> v (\v' -> k(f u' v')))
 420     t13 = map2 (++) (mid "a")
 421                    (map2 (++) (mid "b")
 422                               (map2 (++) (mid "c")
 423                                          (mid "d")))
 424
 425     t13 (\k->Just k) == Just "abcd"
 426
 427     t14 = map2 (++) (mid "a")
 428                    (map2 (++) abort
 429                               (map2 (++) (mid "c")
 430                                          (mid "d")))
 431
 432
 433     t14 (\k->Just k) == Nothing
 434
 435 Super cool.
 436
 437
 438
 439
 440 ## Delimited versus undelimited continuations
 441
 442 ## Natural language requries delimited continuations
 443
 444
 445
 446
 447
 448
 449 "*the* continuation monad"