[[!toc]] # Continuations Last week we saw how to turn a list zipper into a continuation-based list processor. The function computed something we called "the task", which was a simplified langauge involving control operators. abSdS ~~> ababdS ~~> ababdababd The task is to process the list from left to right, and at each "S", double the list so far. Here, "S" is a kind of control operator, and captures the entire previous computation. We also considered a variant in which '#' delimited the portion of the list to be copied: ab#deSfg ~~> abededfg In this variant, "S" and "#" correspond to `shift` and `reset`, which provide access to delimited continuations. The expository logic of starting with this simplified task is the notion that as lists are to trees, so is this task to full-blown continuations. So to the extent that, say, list zippers are easier to grasp than tree zippers, the task is easier to grasp than full continuations. We then presented CPS transforms, and demonstrated how they provide an order-independent analysis of order of evaluation. In order to continue to explore continuations, we will proceed in the following fashion: we introduce the traditional continuation monad, and show how it solves the task, then generalize the task to include doubling of both the left and the right context. ## The continuation monad In order to build a monad, we start with a Kleisli arrow. Continuation monad: types: given some ρ, Mα => (α -> ρ) -> ρ ⇧ == \ak.ka : a -> Ma bind == \ufk. u(\x.fxk) We'll first show that this monad solves the task, then we'll consider the monad in more detail. The unmonadized computation (without the shifty "S" operator) is t1 = + a (+ b (+ c d)) ~~> abcd where "+" is string concatenation and the symbol a is shorthand for the string "a". In order to use the continuation monad to solve the list task, we choose α = ρ = [Char]. So "abcd" is a list of characters, and a boxed list has type M[Char] == ([Char] -> [Char]) -> [Char]. Writing ¢ in between its arguments, t1 corresponds to the following monadic computation: mt1 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (⇧+ ¢ ⇧c ¢ ⇧d)) We have to lift each functor (+) and each object (e.g., "b") into the monad using mid (`⇧`), then combine them using monadic function application, where ¢ mf mx = \k -> mf (\f -> mx (\a -> k(f x))) for the continuation monad. The way in which we extract a value from a continuation box is by applying it to a continuation; often, it is convenient to supply the trivial continuation, the identity function \k.k = I. So in fact, t1 = mt1 I That is, the original computation is the monadic version applied to the trivial continuation. We can now imagine replacing the third element ("c") with a shifty operator. We would like to replace just the one element, and we will do just that in a moment; but in order to simulate the original task, we'll have to take a different strategy initially. We'll start by imagining a shift operator that combined direction with the tail of the list, like this: mt2 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (shift ¢ ⇧d)) We can now define a shift operator to perform the work of "S": shift u k = u(\s.k(ks)) Shift takes two arguments: a string continuation u of type M[Char], and a string continuation k of type [Char] -> [Char]. Since u is the the argument to shift, it represents the tail of the list after the shift operator. Then k is the continuation of the expression headed by `shift`. So in order to execute the task, shift needs to invoke k twice. mt2 I == "ababd" just as desired. Let's just make sure that we have the left-to-right evaluation we were hoping for by evaluating "abSdeSf": mt3 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (shift ¢ (⇧+ ¢ ⇧d ¢ (⇧+ ¢ ⇧e ¢ (shift ⇧f))))) Then mt3 I = "ababdeababdef" -- structure: (ababde)(ababde)f As expected. For a reset operator #, we can have # u k = k(u(\k.k)) -- ex.: ab#deSf ~~> abdedef So the continuation monad solves the list task using continuations in a way that conforms to our by-now familiar strategy of lifting a computation into a monad, and then writing a few key functions (shift, reset) that exploit the power of the monad. Now we should consider what happens when we write a shift operator that takes the place of a single letter. mt2 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (shift ¢ ⇧d)) mt4 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (⇧+ ¢ shift' ¢ ⇧d)) Instead of mt2, we have mt4. So now the type of "c" (a boxed string, type M[Char]) is the same as the type of the new shift operator, shift'. shift' = \k.k(k"") This shift operator takes a continuation k of type [Char]->[Char], and invokes it twice. Since k requires an argument of type [Char], we need to use the first invocation of k to construction a [Char]; we do this by feeding it a string. Since the task does not replace the shift operator with any marker, we give the empty string "" as the argument. But now the new shift operator captures more than just the preceeding part of the construction---it captures the entire context, including the portion of the sequence that follows it. That is, mt4 I = "ababdd" We have replaced "S" in "abSd" with "ab_d", where the underbar will be replaced with the empty string supplied in the definition of shift'. Crucially, not only is the prefix "ab" duplicated, so is the suffix "d". Things get interesting when we have more than one operator in the initial list. What should we expect if we start with "aScSe"? If we assume that when we evaluate each S, all the other S's become temporarily inert, we expect a reduction path like aScSe ~~> aacSecSe But note that the output has just as many S's as the input--if that is what our reduction strategy delivers, then any initial string with more than one S will never reach a normal form. But that's not what the continuation operator shift' delivers. mt5 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ shift' ¢ (⇧+ ¢ ⇧c ¢ (⇧+ ¢ shift' ¢ "e"))) mt5 I = "aacaceecaacaceecee" -- structure: "aacaceecaacaceecee" Huh? This is considerably harder to understand than the original list task. The key is figuring out in each case what function the argument k to the shift operator gets bound to. Let's go back to a simple one-shift example, "aSc". Let's trace what the shift' operator sees as its argument k by replacing ⇧ and ¢ with their definitions: ⇧+ ¢ ⇧a ¢ (⇧+ ¢ shift' ¢ ⇧c) I = \k.⇧+(\f.⇧a(\x.k(fx))) ¢ (⇧+ ¢ shift' ¢ ⇧c) I = \k.(\k.⇧+(\f.⇧a(\x.k(fx))))(\f.(⇧+ ¢ shift' ¢ ⇧c)(\x.k(fx))) I ~~> (\k.⇧+(\f.⇧a(\x.k(fx))))(\f.(⇧+ ¢ shift' ¢ ⇧c)(\x.I(fx))) ~~> (\k.⇧+(\f.⇧a(\x.k(fx))))(\f.(⇧+ ¢ shift' ¢ ⇧c)(f)) ~~> ⇧+(\f.⇧a(\x.(\f.(⇧+ ¢ shift' ¢ ⇧c)(f))(fx)))) ~~> ⇧+(\f.⇧a(\x.(⇧+ ¢ shift' ¢ ⇧c)(fx))) = (\k.k+)(\f.⇧a(\x.(⇧+ ¢ shift' ¢ ⇧c)(fx))) ~~> ⇧a(\x.(⇧+ ¢ shift' ¢ ⇧c)(+x)) = (\k.ka)(\x.(⇧+ ¢ shift' ¢ ⇧c)(+x)) ~~> (⇧+ ¢ shift' ¢ ⇧c)(+a) = (\k.⇧+(\f.shift(\x.k(fx)))) ¢ ⇧c (+a) = (\k.(\k.⇧+(\f.shift(\x.k(fx))))(\f.⇧c(\x.k(fx))))(+a) ~~> (\k.⇧+(\f.shift(\x.k(fx))))(\f'.⇧c(\x'.(+a)(f'x'))) ~~> ⇧+(\f.shift(\x.(\f'.⇧c(\x'.(+a)(f'x')))(fx))) ~~> ⇧+(\f.shift(\x.⇧c(\x'.(+a)((fx)x')))) = (\k.k+)(\f.shift(\x.⇧c(\x'.(+a)((fx)x')))) ~~> shift(\x.⇧c(\x'.(+a)((+x)x')))) = shift(\x.(\k.kc)(\x'.(+a)((+x)x')))) ~~> shift(\x.(+a)((+x)c)) So now we see what the argument of shift will be: a function k from strings x to the string asc. So shift k will be k(k "") = aacc. Ok, this is ridiculous. We need a way to get ahead of this deluge of lambda conversion. We'll adapt the notational strategy developed in Barker and Shan 2014: Instead of writing \k.g(kf): (α -> ρ) -> ρ we'll write g[] ρ --- : --- f α Then [] mid(x) = -- x and g[] ρ h[] ρ g[h[]] ρ --- : ---- ¢ --- : --- = ------ : --- f α->β x α fx β Here's the justification: (\FXk.F(\f.X(\x.k(fx)))) (\k.g(kf)) (\k.h(kx)) ~~> (\Xk.(\k.g(kf))(\f.X(\x.k(fx)))) (\k.h(kx)) ~~> \k.(\k.g(kf))(\f.(\k.h(kx))(\x.k(fx))) ~~> \k.g((\f.(\k.h(kx))(\x.k(fx)))f) ~~> \k.g((\k.h(kx))(\x.k(fx))) ~~> \k.g(h(\x.k(fx))x) ~~> \k.g(h(k(fx))) Then (\ks.k(ks))[] shift = \k.k(k("")) = ------------- "" Let 2 == \ks.k(ks). so aSc lifted into the monad is [] 2[] [] -- ¢ ( --- ¢ --- ) = a "" c First, we need map2 instead of map1. Second, the type of the shift operator will be a string continuation, rather than a function from string continuations to string continuations. (\k.k(k1))(\s.(\k.k(k2))(\r.sr)) (\k.k(k1))(\s.(\r.sr)((\r.sr)2)) (\k.k(k1))(\s.(\r.sr)(s2)) (\k.k(k1))(\s.s(s2)) (\s.s(s2))((\s.s(s2))1) (\s.s(s2))(1(12)) (1(12))((1(12)2)) But here's the interesting part: from the point of view of the shift operator, the continuation that it will be fed will be the entire rest of the computation. This includes not only what has come before, but what will come after it as well. That means when the continuation is doubled (self-composed), the result duplicates not only the prefix `(ab ~~> abab)`, but also the suffix `(d ~~> dd)`. In some sense, then, continuations allow functions to see into the future! What do you expect will be the result of executing "aScSe" under the second perspective? The answer to this question is not determined by the informal specification of the task that we've been using, since under the new perspective, we're copying complete (bi-directional) contexts rather than just left contexts. It might be natural to assume that what execution does is choose an S, and double its context, temporarily treating all other shift operators as dumb letters, then choosing a remaining S to execute. If that was our interpreation of the task, then no string with more than one S would ever terminate, since the number S's after each reduction step would always be 2(n-1), where n is the number before reduction. But there is another equally natural way to answer the question. Assume the leftmost S is executed first. What will the value of its continuation k be? It will be a function that maps a string s to the result of computing `ascSe`, which will be `ascascee`. So `k(k "")` will be `k(acacee)`, which will result in `a(acacee)ca(acacee)cee` (the parentheses are added to show stages in the construction of the final result). So the continuation monad behaves in a way that we expect a continuation to behave. But where did the continuation monad come from? Let's consider some ways of deriving the continuation monad. ## Viewing Montague's PTQ as CPS Montague's conception of determiner phrases as generalized quantifiers is a limited form of continuation-passing. (See, e.g., chapter 4 of Barker and Shan 2014.) Start by assuming that ordinary DPs such as proper names denote objects of type `e`. Then verb phrases denote functions from individuals to truth values, i.e., functions of type `e -> t`. The meaning of extraordinary DPs such as *every woman* or *no dog* can't be expressed as a simple individual. As Montague argued, it works much better to view them as predicates on verb phrase meanings, i.e., as having type `(e->t)->t`. Then *no woman left* is true just in case the property of leaving is true of no woman: no woman: \k.not \exists x . (woman x) & kx left: \x.left x (no woman) (left) = not \exists x . woman x & left x Montague also proposed that all determiner phrases should have the same type. After all, we can coordinate proper names with quantificational DPs, as in *John and no dog left*. Then generalized quantifier corresponding to the proper name *John* is the quantifier `\k.kj`. At this point, we have a type for our Kliesli arrow and a value for our mid. Given some result type (such as `t` in the Montague application), α ==> (α -> ρ) -> ρ ⇧a = \k.ka It remains only to find a suitable value for bind. Montague didn't provide one, but it's easy to find: bind :: Mα -> (α -> Mβ) -> Mβ given variables of the following types u :: Mα == (α -> ρ) -> ρ f :: α -> Mβ k :: β -> ρ x :: α We have bind u f = \k.u(\x.fxk) Let's carefully make sure that this types out: bind u f = \k. u (\x . f x k) -------- -- α -> Mβ α ------------ ------ Mβ β -> ρ -- -------------------- α ρ ------------- ------------------------ (α -> ρ) -> ρ α -> ρ --------------------------------- ρ ----------------------- (β -> ρ) -> ρ Yep! Is there any other way of building a bind operator? Well, the challenge is getting hold of the "a" that is buried inside of `u`. In the Reader monad, we could get at the a inside of the box by applying the box to an environment. In the State monad, we could get at the a by applying to box to a state and deconstructing the resulting pair. In the continuation case, the only way to do it is to apply the boxed a (i.e., u) to a function that takes an a as an argument. That means that f must be invoked inside the scope of the functional argument to u. So we've deduced the structure ... u (\x. ... f x ...) ... At this point, in order to provide u with an argument of the appropriate type, the argument must not only take objects of type α as an argument, it must return a result of type ρ. That means that we must apply fx to an object of type β -> ρ. We can hypothesize such an object, as long as we eliminate that hypothesis later (by binding it), and we have the complete bind operation. The way in which the value of type α that is needed in order to unlock the function f is hidden inside of u is directly analogous to the concept of "data hiding" in object-oriented programming. See Pierce's discussion of how to extend system F with existential type quantification by encoding the existential using continuations. So the Kliesli type pretty much determines the bind operation. ## What continuations are doing Ok, we have a continuation monad. We derived it from first principles, and we have seen that it behaves at least in some respects as we expect a continuation monad to behave (in that it allows us to give a good implementation of the task). ## How continuations can simulate other monads Because the continuation monad allows the result type ρ to be any type, we can choose ρ in clever ways that allow us to simulate other monads. Reader: ρ = env -> α State: ρ = s -> (α, s) Maybe: ρ = Just α | Nothing You see how this is going to go. Let's see an example by adding an abort operator to our task language, which represents what we want to have happen if we divide by zero, where what we want to do is return Nothing. abort k = Nothing mid a k = k a map2 f u v k = u(\u' -> v (\v' -> k(f u' v'))) t13 = map2 (++) (mid "a") (map2 (++) (mid "b") (map2 (++) (mid "c") (mid "d"))) t13 (\k->Just k) == Just "abcd" t14 = map2 (++) (mid "a") (map2 (++) abort (map2 (++) (mid "c") (mid "d"))) t14 (\k->Just k) == Nothing Super cool. ## Delimited versus undelimited continuations ## Natural language requries delimited continuations "*the* continuation monad"