[[!toc]]
# Continuations
Last week we saw how to turn a list zipper into a continuation-based
list processor. The function computed something we called "the task",
which was a simplified langauge involving control operators.
abSdS ~~> ababdS ~~> ababdababd
The task is to process the list from left to right, and at each "S",
double the list so far. Here, "S" is a kind of control operator, and
captures the entire previous computation. We also considered a
variant in which '#' delimited the portion of the list to be copied:
ab#deSfg ~~> abededfg
In this variant, "S" and "#" correspond to `shift` and `reset`, which
provide access to delimited continuations.
The expository logic of starting with this simplified task is the
notion that as lists are to trees, so is this task to full-blown
continuations. So to the extent that, say, list zippers are easier to
grasp than tree zippers, the task is easier to grasp than full
continuations.
We then presented CPS transforms, and demonstrated how they provide
an order-independent analysis of order of evaluation.
In order to continue to explore continuations, we will proceed in the
following fashion: we introduce the traditional continuation monad,
and show how it solves the task, then generalize the task to
include doubling of both the left and the right context.
## The continuation monad
In order to build a monad, we start with a Kleisli arrow.
Continuation monad: types: given some ρ, Mα => (α -> ρ) -> ρ
⇧ == \ak.ka : a -> Ma
bind == \ufk. u(\x.fxk)
We'll first show that this monad solves the task, then we'll consider
the monad in more detail.
The unmonadized computation (without the shifty "S" operator) is
t1 = + a (+ b (+ c d)) ~~> abcd
where "+" is string concatenation and the symbol a is shorthand for
the string "a".
In order to use the continuation monad to solve the list task,
we choose α = ρ = [Char]. So "abcd" is a list of characters, and
a boxed list has type M[Char] == ([Char] -> [Char]) -> [Char].
Writing ¢ in between its arguments, t1 corresponds to the following
monadic computation:
mt1 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (⇧+ ¢ ⇧c ¢ ⇧d))
We have to lift each functor (+) and each object (e.g., "b") into the
monad using mid (`⇧`), then combine them using monadic function
application, where
¢ mf mx = \k -> mf (\f -> mx (\a -> k(f x)))
for the continuation monad.
The way in which we extract a value from a continuation box is by
applying it to a continuation; often, it is convenient to supply the
trivial continuation, the identity function \k.k = I. So in fact,
t1 = mt1 I
That is, the original computation is the monadic version applied to
the trivial continuation.
We can now imagine replacing the third element ("c") with a shifty
operator. We would like to replace just the one element, and we will
do just that in a moment; but in order to simulate the original task,
we'll have to take a different strategy initially. We'll start by
imagining a shift operator that combined direction with the tail of
the list, like this:
mt2 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (shift ¢ ⇧d))
We can now define a shift operator to perform the work of "S":
shift u k = u(\s.k(ks))
Shift takes two arguments: a string continuation u of type M[Char],
and a string continuation k of type [Char] -> [Char]. Since u is the
the argument to shift, it represents the tail of the list after the
shift operator. Then k is the continuation of the expression headed
by `shift`. So in order to execute the task, shift needs to invoke k
twice.
mt2 I == "ababd"
just as desired.
Let's just make sure that we have the left-to-right evaluation we were
hoping for by evaluating "abSdeSf":
mt3 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (shift ¢ (⇧+ ¢ ⇧d ¢ (⇧+ ¢ ⇧e ¢ (shift ⇧f)))))
Then
mt3 I = "ababdeababdef" -- structure: (ababde)(ababde)f
As expected.
For a reset operator #, we can have
# u k = k(u(\k.k)) -- ex.: ab#deSf ~~> abdedef
So the continuation monad solves the list task using continuations in
a way that conforms to our by-now familiar strategy of lifting a
computation into a monad, and then writing a few key functions (shift,
reset) that exploit the power of the monad.
Now we should consider what happens when we write a shift operator
that takes the place of a single letter.
mt2 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (shift ¢ ⇧d))
mt4 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (⇧+ ¢ shift' ¢ ⇧d))
Instead of mt2, we have mt4. So now the type of "c" (a boxed string,
type M[Char]) is the same as the type of the new shift operator, shift'.
shift' = \k.k(k"")
This shift operator takes a continuation k of type [Char]->[Char], and
invokes it twice. Since k requires an argument of type [Char], we
need to use the first invocation of k to construction a [Char]; we do
this by feeding it a string. Since the task does not replace the
shift operator with any marker, we give the empty string "" as the
argument.
But now the new shift operator captures more than just the preceeding
part of the construction---it captures the entire context, including
the portion of the sequence that follows it. That is,
mt4 I = "ababdd"
We have replaced "S" in "abSd" with "ab_d", where the underbar will be
replaced with the empty string supplied in the definition of shift'.
Crucially, not only is the prefix "ab" duplicated, so is the suffix
"d".
Things get interesting when we have more than one operator in the
initial list. What should we expect if we start with "aScSe"?
If we assume that when we evaluate each S, all the other S's become
temporarily inert, we expect a reduction path like
aScSe ~~> aacSecSe
But note that the output has just as many S's as the input--if that is
what our reduction strategy delivers, then any initial string with
more than one S will never reach a normal form.
But that's not what the continuation operator shift' delivers.
mt5 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ shift' ¢ (⇧+ ¢ ⇧c ¢ (⇧+ ¢ shift' ¢ "e")))
mt5 I = "aacaceecaacaceecee" -- structure: "aacaceecaacaceecee"
Huh?
This is considerably harder to understand than the original list task.
The key is figuring out in each case what function the argument k to
the shift operator gets bound to.
Let's go back to a simple one-shift example, "aSc". Let's trace what
the shift' operator sees as its argument k by replacing ⇧ and ¢ with
their definitions:
⇧+ ¢ ⇧a ¢ (⇧+ ¢ shift' ¢ ⇧c) I
= \k.⇧+(\f.⇧a(\x.k(fx))) ¢ (⇧+ ¢ shift' ¢ ⇧c) I
= \k.(\k.⇧+(\f.⇧a(\x.k(fx))))(\f.(⇧+ ¢ shift' ¢ ⇧c)(\x.k(fx))) I
~~> (\k.⇧+(\f.⇧a(\x.k(fx))))(\f.(⇧+ ¢ shift' ¢ ⇧c)(\x.I(fx)))
~~> (\k.⇧+(\f.⇧a(\x.k(fx))))(\f.(⇧+ ¢ shift' ¢ ⇧c)(f))
~~> ⇧+(\f.⇧a(\x.(\f.(⇧+ ¢ shift' ¢ ⇧c)(f))(fx))))
~~> ⇧+(\f.⇧a(\x.(⇧+ ¢ shift' ¢ ⇧c)(fx)))
= (\k.k+)(\f.⇧a(\x.(⇧+ ¢ shift' ¢ ⇧c)(fx)))
~~> ⇧a(\x.(⇧+ ¢ shift' ¢ ⇧c)(+x))
= (\k.ka)(\x.(⇧+ ¢ shift' ¢ ⇧c)(+x))
~~> (⇧+ ¢ shift' ¢ ⇧c)(+a)
= (\k.⇧+(\f.shift(\x.k(fx)))) ¢ ⇧c (+a)
= (\k.(\k.⇧+(\f.shift(\x.k(fx))))(\f.⇧c(\x.k(fx))))(+a)
~~> (\k.⇧+(\f.shift(\x.k(fx))))(\f'.⇧c(\x'.(+a)(f'x')))
~~> ⇧+(\f.shift(\x.(\f'.⇧c(\x'.(+a)(f'x')))(fx)))
~~> ⇧+(\f.shift(\x.⇧c(\x'.(+a)((fx)x'))))
= (\k.k+)(\f.shift(\x.⇧c(\x'.(+a)((fx)x'))))
~~> shift(\x.⇧c(\x'.(+a)((+x)x'))))
= shift(\x.(\k.kc)(\x'.(+a)((+x)x'))))
~~> shift(\x.(+a)((+x)c))
So now we see what the argument of shift will be: a function k from
strings x to the string asc. So shift k will be k(k "") = aacc.
Ok, this is ridiculous. We need a way to get ahead of this deluge of
lambda conversion. We'll adapt the notational strategy developed in
Barker and Shan 2014:
Instead of writing
\k.g(kf): (α -> ρ) -> ρ
we'll write
g[] ρ
--- : ---
f α
Then
[]
mid(x) = --
x
and
g[] ρ h[] ρ g[h[]] ρ
--- : ---- ¢ --- : --- = ------ : ---
f α->β x α fx β
Here's the justification:
(\FXk.F(\f.X(\x.k(fx)))) (\k.g(kf)) (\k.h(kx))
~~> (\Xk.(\k.g(kf))(\f.X(\x.k(fx)))) (\k.h(kx))
~~> \k.(\k.g(kf))(\f.(\k.h(kx))(\x.k(fx)))
~~> \k.g((\f.(\k.h(kx))(\x.k(fx)))f)
~~> \k.g((\k.h(kx))(\x.k(fx)))
~~> \k.g(h(\x.k(fx))x)
~~> \k.g(h(k(fx)))
Then
(\ks.k(ks))[]
shift = \k.k(k("")) = -------------
""
Let 2 == \ks.k(ks).
so aSc lifted into the monad is
[] 2[] []
-- ¢ ( --- ¢ --- ) =
a "" c
First, we need map2 instead of map1. Second, the type of the shift
operator will be a string continuation, rather than a function from
string continuations to string continuations.
(\k.k(k1))(\s.(\k.k(k2))(\r.sr))
(\k.k(k1))(\s.(\r.sr)((\r.sr)2))
(\k.k(k1))(\s.(\r.sr)(s2))
(\k.k(k1))(\s.s(s2))
(\s.s(s2))((\s.s(s2))1)
(\s.s(s2))(1(12))
(1(12))((1(12)2))
But here's the interesting part: from the point of view of the shift
operator, the continuation that it will be fed will be the entire rest
of the computation. This includes not only what has come before, but
what will come after it as well. That means when the continuation is
doubled (self-composed), the result duplicates not only the prefix
`(ab ~~> abab)`, but also the suffix `(d ~~> dd)`. In some sense, then,
continuations allow functions to see into the future!
What do you expect will be the result of executing "aScSe" under the
second perspective? The answer to this question is not determined by
the informal specification of the task that we've been using, since
under the new perspective, we're copying complete (bi-directional)
contexts rather than just left contexts.
It might be natural to assume that what execution does is choose an S,
and double its context, temporarily treating all other shift operators
as dumb letters, then choosing a remaining S to execute. If that was
our interpreation of the task, then no string with more than one S
would ever terminate, since the number S's after each reduction step
would always be 2(n-1), where n is the number before reduction.
But there is another equally natural way to answer the question.
Assume the leftmost S is executed first. What will the value of its
continuation k be? It will be a function that maps a string s to the
result of computing `ascSe`, which will be `ascascee`. So `k(k "")`
will be `k(acacee)`, which will result in `a(acacee)ca(acacee)cee`
(the parentheses are added to show stages in the construction of the
final result).
So the continuation monad behaves in a way that we expect a
continuation to behave. But where did the continuation monad come
from? Let's consider some ways of deriving the continuation monad.
## Viewing Montague's PTQ as CPS
Montague's conception of determiner phrases as generalized quantifiers
is a limited form of continuation-passing. (See, e.g., chapter 4 of
Barker and Shan 2014.) Start by assuming that ordinary DPs such as
proper names denote objects of type `e`. Then verb phrases denote
functions from individuals to truth values, i.e., functions of type `e
-> t`.
The meaning of extraordinary DPs such as *every woman* or *no dog*
can't be expressed as a simple individual. As Montague argued, it
works much better to view them as predicates on verb phrase meanings,
i.e., as having type `(e->t)->t`. Then *no woman left* is true just
in case the property of leaving is true of no woman:
no woman: \k.not \exists x . (woman x) & kx
left: \x.left x
(no woman) (left) = not \exists x . woman x & left x
Montague also proposed that all determiner phrases should have the
same type. After all, we can coordinate proper names with
quantificational DPs, as in *John and no dog left*. Then generalized
quantifier corresponding to the proper name *John* is the quantifier
`\k.kj`.
At this point, we have a type for our Kliesli arrow and a value for
our mid. Given some result type (such as `t` in the Montague application),
α ==> (α -> ρ) -> ρ
⇧a = \k.ka
It remains only to find a suitable value for bind. Montague didn't
provide one, but it's easy to find:
bind :: Mα -> (α -> Mβ) -> Mβ
given variables of the following types
u :: Mα == (α -> ρ) -> ρ
f :: α -> Mβ
k :: β -> ρ
x :: α
We have
bind u f = \k.u(\x.fxk)
Let's carefully make sure that this types out:
bind u f = \k. u (\x . f x k)
-------- --
α -> Mβ α
------------ ------
Mβ β -> ρ
-- --------------------
α ρ
------------- ------------------------
(α -> ρ) -> ρ α -> ρ
---------------------------------
ρ
-----------------------
(β -> ρ) -> ρ
Yep!
Is there any other way of building a bind operator? Well, the
challenge is getting hold of the "a" that is buried inside of `u`.
In the Reader monad, we could get at the a inside of the box by
applying the box to an environment. In the State monad, we could get
at the a by applying to box to a state and deconstructing the
resulting pair. In the continuation case, the only way to do it is to
apply the boxed a (i.e., u) to a function that takes an a as an
argument. That means that f must be invoked inside the scope of the
functional argument to u. So we've deduced the structure
... u (\x. ... f x ...) ...
At this point, in order to provide u with an argument of the
appropriate type, the argument must not only take objects of type
α as an argument, it must return a result of type ρ. That means that
we must apply fx to an object of type β -> ρ. We can hypothesize such
an object, as long as we eliminate that hypothesis later (by binding
it), and we have the complete bind operation.
The way in which the value of type α that is needed in order to unlock
the function f is hidden inside of u is directly analogous to the
concept of "data hiding" in object-oriented programming. See Pierce's
discussion of how to extend system F with existential type
quantification by encoding the existential using continuations.
So the Kliesli type pretty much determines the bind operation.
## What continuations are doing
Ok, we have a continuation monad. We derived it from first
principles, and we have seen that it behaves at least in some respects
as we expect a continuation monad to behave (in that it allows us to
give a good implementation of the task).
## How continuations can simulate other monads
Because the continuation monad allows the result type ρ to be any
type, we can choose ρ in clever ways that allow us to simulate other
monads.
Reader: ρ = env -> α
State: ρ = s -> (α, s)
Maybe: ρ = Just α | Nothing
You see how this is going to go. Let's see an example by adding an
abort operator to our task language, which represents what
we want to have happen if we divide by zero, where what we want to do
is return Nothing.
abort k = Nothing
mid a k = k a
map2 f u v k = u(\u' -> v (\v' -> k(f u' v')))
t13 = map2 (++) (mid "a")
(map2 (++) (mid "b")
(map2 (++) (mid "c")
(mid "d")))
t13 (\k->Just k) == Just "abcd"
t14 = map2 (++) (mid "a")
(map2 (++) abort
(map2 (++) (mid "c")
(mid "d")))
t14 (\k->Just k) == Nothing
Super cool.
## Delimited versus undelimited continuations
## Natural language requries delimited continuations
"*the* continuation monad"