Continuations

Last week we saw how to turn a list zipper into a continuation-based list processor. The function computed something we called "the task", which was a simplified language involving control operators.

abSdS ~~> ababdS ~~> ababdababd

The task is to process the list from left to right, and at each "S", double the list so far. Here, "S" is a kind of control operator, and captures the entire previous computation. We also considered a variant in which '#' delimited the portion of the list to be copied:

ab#deSfg ~~> abdedefg

In this variant, "S" and "#" correspond to shift and reset, which provide access to delimited continuations.

The expository logic of starting with this simplified task is the notion that as lists are to trees, so is this task to full-blown continuations. So to the extent that, say, list zippers are easier to grasp than tree zippers, the task is easier to grasp than full continuations.

We then presented CPS transforms, and demonstrated how they make the order of evaluation explicit in the term itself, independently of the evaluation order of the metalanguage.

In order to continue to explore continuations, we will proceed in the following fashion: we introduce the traditional continuation monad, and show how it solves the task, then generalize the task to include doubling of both the left and the right context.

The continuation monad

In order to build a monad, we start with a Kleisli arrow.

Continuation monad: types: given some ρ, Mα == (α -> ρ) -> ρ
                    ⇧ == \ak.ka : a -> Ma
                    bind == \ufk. u(\x.fxk)

We'll first show that this monad solves the task, then we'll consider the monad in more detail.

The unmonadized computation (without the shifty "S" operator) is

t1 = + a (+ b (+ c d)) ~~> abcd

where "+" is string concatenation and the symbol a is shorthand for the string "a".

In order to use the continuation monad to solve the list task, we choose α = ρ = [Char]. So "abcd" is a list of characters, and a boxed list has type M[Char] == ([Char] -> [Char]) -> [Char].

Writing ¢ in between its arguments, t1 corresponds to the following monadic computation:

mt1 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (⇧+ ¢ ⇧c ¢ ⇧d))

We have to lift each function (+) and each object (e.g., "b") into the monad using mid (⇧), then combine them using monadic function application, where

¢ M N = \k -> M (\f -> N (\x -> k(f x)))

for the continuation monad.

The way in which we extract a value from a continuation box is by applying it to a continuation; often, it is convenient to supply the trivial continuation, the identity function \k.k = I. So in fact,

t1 = mt1 I

That is, the original computation is the monadic version applied to the trivial continuation.
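
If you want to check this mechanically, here is a minimal Haskell sketch of the pieces so far. The names M, mid, and ap are our stand-ins for the M, ⇧, and ¢ above; nothing here comes from a standard library, and Haskell's id plays the role of the trivial continuation I.

type M rho a = (a -> rho) -> rho

mid :: a -> M rho a                            -- ⇧
mid a = \k -> k a

bind :: M rho a -> (a -> M rho b) -> M rho b
bind u f = \k -> u (\x -> f x k)

ap :: M rho (a -> b) -> M rho a -> M rho b     -- ¢, written prefix
ap m n = \k -> m (\f -> n (\x -> k (f x)))

mt1 :: M [Char] [Char]
mt1 = mid (++) `ap` mid "a" `ap` (mid (++) `ap` mid "b" `ap` (mid (++) `ap` mid "c" `ap` mid "d"))

-- mt1 id == "abcd"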

We can now add a shifty operator. We would like to replace just the one element, and we will do just that in a moment; but in order to simulate the original task, we'll have to take a different strategy initially. We'll start by imagining a shift operator that combines directly with the tail of the list, like this:

mt2 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (shift ⇧d))

We can now define a shift operator to perform the work of "S":

shift u k = u(\s.k(ks))

Shift takes two arguments: a boxed string u of type M[Char], and a string continuation k of type [Char] -> [Char]. Since u is the argument to shift, it represents the tail of the list after the shift operator. Then k is the continuation of the expression headed by shift. So in order to execute the task, shift needs to invoke k twice. The expression \s.k(ks) is just the composition of k with itself.

mt2 I == "ababd"

just as desired.
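
Here is the same check in the Haskell sketch (mid and ap repeated from above so the block stands alone); shift is written exactly as just defined, and shift ⇧d becomes shift (mid "d").

type M rho a = (a -> rho) -> rho

mid :: a -> M rho a
mid a = \k -> k a

ap :: M rho (a -> b) -> M rho a -> M rho b
ap m n = \k -> m (\f -> n (\x -> k (f x)))

shift :: M [Char] [Char] -> M [Char] [Char]
shift u = \k -> u (\s -> k (k s))              -- run the boxed tail u, applying k twice

mt2 :: M [Char] [Char]
mt2 = mid (++) `ap` mid "a" `ap` (mid (++) `ap` mid "b" `ap` shift (mid "d"))

-- mt2 id == "ababd"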

Let's just make sure that we have the left-to-right evaluation we were hoping for by evaluating "abSdeSf":

mt3 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (shift (⇧+ ¢ ⇧d ¢ (⇧+ ¢ ⇧e ¢ (shift ⇧f)))))

Then

mt3 I = "ababdeababdef"   -- structure: (ababde)(ababde)f

As expected.

For a reset operator #, we can have

# u k = k(u(\k.k))   -- ex.: ab#deSf ~~> abdedef

The reset operator executes the remainder of the list separately, by giving it the trivial continuation (\k.k), then feeds the result to the continuation corresponding to the position of the reset.
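
In the Haskell sketch (same definitions repeated so the block stands alone), reset is a one-liner, and the ab#deSf example comes out as advertised; reset and mtReset are our names for # and for that particular computation.

type M rho a = (a -> rho) -> rho

mid :: a -> M rho a
mid a = \k -> k a

ap :: M rho (a -> b) -> M rho a -> M rho b
ap m n = \k -> m (\f -> n (\x -> k (f x)))

shift :: M [Char] [Char] -> M [Char] [Char]
shift u = \k -> u (\s -> k (k s))

reset :: M [Char] [Char] -> M [Char] [Char]    -- the # operator
reset u = \k -> k (u id)                       -- run u with the trivial continuation, hand the result to k

mtReset :: M [Char] [Char]                     -- ab#deSf
mtReset = mid (++) `ap` mid "a" `ap` (mid (++) `ap` mid "b" `ap`
            reset (mid (++) `ap` mid "d" `ap` (mid (++) `ap` mid "e" `ap` shift (mid "f"))))

-- mtReset id == "abdedef"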

So the continuation monad solves the list task using continuations in a way that conforms to our by-now familiar strategy of lifting a computation into a monad, and then writing a few key functions (shift, reset) that exploit the power of the monad.

Generalizing to the tree doubling task

Now we should consider what happens when we write a shift operator that takes the place of a single letter.

mt2 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (shift ⇧d))
mt4 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ ⇧b ¢ (⇧+ ¢ shift' ¢ ⇧d))

Instead of mt2 (copied from above), we have mt4. So now the type of a leaf (a boxed string, type M[Char]) is the same as the type of the new shift operator, shift'.

shift' = \k.k(k"")

This shift operator takes a continuation k of type [Char]->[Char], and invokes it twice. Since k requires an argument of type [Char], we need to use the first invocation of k to construct a [Char]; we do this by feeding it a string. Since the task does not replace the shift operator with any marker, we give the empty string "" as the argument.

But now the new shift operator captures more than just the preceding part of the construction---it captures the entire context, including the portion of the sequence that follows it. That is,

mt4 I = "ababdd"

We have replaced "S" in "abSd" with "ab_d", where the underbar will be replaced with the empty string supplied in the definition of shift'. Crucially, not only is the prefix "ab" duplicated, so is the suffix "d".
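
In the Haskell sketch (mid and ap repeated once more; still just a sketch), shift' and mt4 look like this:

type M rho a = (a -> rho) -> rho

mid :: a -> M rho a
mid a = \k -> k a

ap :: M rho (a -> b) -> M rho a -> M rho b
ap m n = \k -> m (\f -> n (\x -> k (f x)))

shift' :: M [Char] [Char]                      -- same type as a boxed leaf
shift' = \k -> k (k "")

mt4 :: M [Char] [Char]
mt4 = mid (++) `ap` mid "a" `ap` (mid (++) `ap` mid "b" `ap` (mid (++) `ap` shift' `ap` mid "d"))

-- mt4 id == "ababdd": the prefix "ab" and the suffix "d" are both doubled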

Things get interesting when we have more than one operator in the initial list. What should we expect if we start with "aScSe"? If we assume that when we evaluate each S, all the other S's become temporarily inert, we expect a reduction path like

aScSe ~~> aacSecSe

But note that the output has just as many S's as the input--if that is what our reduction strategy delivers, then any initial string with more than one S will never reach a normal form.

But that's not what the continuation operator shift' delivers.

mt5 = ⇧+ ¢ ⇧a ¢ (⇧+ ¢ shift' ¢ (⇧+ ¢ ⇧c ¢ (⇧+ ¢ shift' ¢ ⇧e)))

mt5 I = "aacaceecaacaceecee" -- structure: "aacaceecaacaceecee"

Huh?

This is considerably harder to understand than the original list task. The key is figuring out in each case what function the argument k to the shift operator gets bound to.

Let's go back to a simple one-shift example, "aSc". Let's trace what the shift' operator sees as its argument k by replacing ⇧ and ¢ with their definitions:

      ⇧+ ¢ ⇧a ¢ (⇧+ ¢ shift' ¢ ⇧c) I
   = \k.⇧+(\f.⇧a(\x.k(fx))) ¢ (⇧+ ¢ shift' ¢ ⇧c) I
   = \k.(\k.⇧+(\f.⇧a(\x.k(fx))))(\f.(⇧+ ¢ shift' ¢ ⇧c)(\x.k(fx))) I
   ~~> (\k.⇧+(\f.⇧a(\x.k(fx))))(\f.(⇧+ ¢ shift' ¢ ⇧c)(\x.I(fx))) 
   ~~> (\k.⇧+(\f.⇧a(\x.k(fx))))(\f.(⇧+ ¢ shift' ¢ ⇧c)(f)) 
   ~~> ⇧+(\f.⇧a(\x.(\f.(⇧+ ¢ shift' ¢ ⇧c)(f))(fx)))
   ~~> ⇧+(\f.⇧a(\x.(⇧+ ¢ shift' ¢ ⇧c)(fx)))
   = (\k.k+)(\f.⇧a(\x.(⇧+ ¢ shift' ¢ ⇧c)(fx)))
   ~~> ⇧a(\x.(⇧+ ¢ shift' ¢ ⇧c)(+x))
   = (\k.ka)(\x.(⇧+ ¢ shift' ¢ ⇧c)(+x))
   ~~> (⇧+ ¢ shift' ¢ ⇧c)(+a)
   = (\k.⇧+(\f.shift'(\x.k(fx)))) ¢ ⇧c (+a)
   = (\k.(\k.⇧+(\f.shift'(\x.k(fx))))(\f.⇧c(\x.k(fx))))(+a)
   ~~> (\k.⇧+(\f.shift'(\x.k(fx))))(\f'.⇧c(\x'.(+a)(f'x')))
   ~~> ⇧+(\f.shift'(\x.(\f'.⇧c(\x'.(+a)(f'x')))(fx)))
   ~~> ⇧+(\f.shift'(\x.⇧c(\x'.(+a)((fx)x'))))
   = (\k.k+)(\f.shift'(\x.⇧c(\x'.(+a)((fx)x'))))
   ~~> shift'(\x.⇧c(\x'.(+a)((+x)x')))
   = shift'(\x.(\k.kc)(\x'.(+a)((+x)x')))
   ~~> shift'(\x.(+a)((+x)c))

So now we see what the argument of shift' will be: a function k that maps a string x to the string a+x+c (that is, "a" ++ x ++ "c"). So shift' k will be k(k "") = "aacc".

Ok, this is ridiculous. We need a way to get ahead of this deluge of lambda conversion. We'll see how to understand what is going on when we talk about quantifier raising in the next lecture.
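
In the meantime, the Haskell sketch at least lets us check these results mechanically. mtASC is just our label for the "aSc" computation; mt5 is the term given above.

type M rho a = (a -> rho) -> rho

mid :: a -> M rho a
mid a = \k -> k a

ap :: M rho (a -> b) -> M rho a -> M rho b
ap m n = \k -> m (\f -> n (\x -> k (f x)))

shift' :: M [Char] [Char]
shift' = \k -> k (k "")

-- "aSc": shift' is handed k = \x -> "a" ++ x ++ "c", and returns k (k "")
mtASC :: M [Char] [Char]
mtASC = mid (++) `ap` mid "a" `ap` (mid (++) `ap` shift' `ap` mid "c")
-- mtASC id == "aacc"

-- "aScSe", i.e. mt5 above
mt5 :: M [Char] [Char]
mt5 = mid (++) `ap` mid "a" `ap`
        (mid (++) `ap` shift' `ap` (mid (++) `ap` mid "c" `ap` (mid (++) `ap` shift' `ap` mid "e")))
-- mt5 id == "aacaceecaacaceecee"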

Viewing Montague's PTQ as CPS

Montague's conception of determiner phrases as generalized quantifiers is a limited form of continuation-passing. (See, e.g., chapter 4 of Barker and Shan 2014.) Start by assuming that ordinary DPs such as proper names denote objects of type e. Then verb phrases denote functions from individuals to truth values, i.e., functions of type e -> t.

The meaning of extraordinary DPs such as every woman or no dog can't be expressed as a simple individual. As Montague argued, it works much better to view them as predicates on verb phrase meanings, i.e., as having type (e->t)->t. Then no woman left is true just in case the property of leaving is true of no woman:

no woman:  \k.not \exists x . (woman x) & kx
left: \x.left x
(no woman) (left) = not \exists x . woman x & left x

Montague also proposed that all determiner phrases should have the same type. After all, we can coordinate proper names with quantificational DPs, as in John and no dog left. Then the generalized quantifier corresponding to the proper name John is the quantifier \k.kj.
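
Here is a toy Haskell sketch of that type discipline. The entity type, the three-element domain, and the particular predicates are invented for illustration; only the type (e->t)->t, the quantifier meanings, and the coordination idea come from the discussion above.

data Entity = John | Mary | Fido deriving (Eq, Show)

domain :: [Entity]
domain = [John, Mary, Fido]

woman, dog, left :: Entity -> Bool      -- VP and noun meanings: type e -> t
woman x = x == Mary
dog   x = x == Fido
left  x = x == John

type GQ = (Entity -> Bool) -> Bool      -- DP meanings: type (e -> t) -> t

john :: GQ                              -- the proper name as \k.kj
john = \k -> k John

noWoman, noDog :: GQ                    -- \k. not (exists x. woman x & k x)
noWoman k = not (any (\x -> woman x && k x) domain)
noDog   k = not (any (\x -> dog   x && k x) domain)

andGQ :: GQ -> GQ -> GQ                 -- generalized conjunction, for "John and no dog"
andGQ g1 g2 = \k -> g1 k && g2 k

-- noWoman left              == True   ("no woman left")
-- (john `andGQ` noDog) left == True   ("John and no dog left")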

How continuations can simulate other monads

Because the continuation monad allows the result type ρ to be any type, we can choose ρ in clever ways that allow us to simulate other monads.

Reader: ρ = env -> α
State: ρ = s -> (α, s)
Maybe: ρ = Maybe α

You see how this is going to go. Let's work through an example by adding an abort operator to our task language: abort represents what we want to happen if we divide by zero, namely, to return Nothing.

abort k = Nothing
mid a k = k a
map2 f u v k = u(\u' -> v (\v' -> k(f u' v'))) 
t13 = map2 (++) (mid "a")
               (map2 (++) (mid "b")
                          (map2 (++) (mid "c")
                         (mid "d")))

t13 (\k->Just k) == Just "abcd"

t14 = map2 (++) (mid "a")
               (map2 (++) abort
                          (map2 (++) (mid "c")
                         (mid "d")))


t14 (\k->Just k) == Nothing

Super cool.
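
The State row in the list above works the same way: fix the answer type ρ to be s -> (α, s), and the usual state operators fall out. Here is a sketch with the state type fixed to Int; the names (CState, getSt, putSt, runCState) are invented for the illustration and are not from the monad library used elsewhere in these notes.

type CState a = (a -> Int -> (Int, Int)) -> Int -> (Int, Int)
-- a computation takes a continuation and an input state, and yields a final (result, state) pair

midSt :: a -> CState a
midSt a = \k -> k a

bindSt :: CState a -> (a -> CState b) -> CState b
bindSt u f = \k -> u (\x -> f x k)

getSt :: CState Int
getSt = \k s -> k s s                   -- hand the current state to the continuation

putSt :: Int -> CState ()
putSt s' = \k _s -> k () s'             -- install a new state, discard the old one

runCState :: CState Int -> Int -> (Int, Int)
runCState m s0 = m (\a s -> (a, s)) s0  -- trivial continuation: pair the result with the final state

-- runCState (getSt `bindSt` \x -> putSt (x + 1) `bindSt` \() -> midSt (x * 10)) 5  ==  (50, 6)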

Continuation Passing Style Transforms

Gaining control over order of evaluation

We know that evaluation order matters. We're beginning to learn how to gain some control over order of evaluation (think of Jim's abort handler). We continue to reason about order of evaluation.

A lucid discussion of evaluation order in the context of the lambda calculus can be found here: Sestoft: Demonstrating Lambda Calculus Reduction. Sestoft also provides a lovely on-line lambda evaluator: Sestoft: Lambda calculus reduction workbench, which allows you to select multiple evaluation strategies, and to see reductions happen step by step.

Evaluation order matters

We've seen this many times. For instance, consider the following reductions. It will be convenient to use the abbreviation w = \x.xx. I'll indicate which lambda is about to be reduced with a * underneath:

(\x.y)(ww)
 *
y

Done! We have a normal form. But if we reduce using a different strategy, things go wrong:

(\x.y)(ww) =
(\x.y)((\x.xx)w) =
        *
(\x.y)(ww) =
(\x.y)((\x.xx)w) =
        *
(\x.y)(ww) 

Etc.

As a second reminder of when evaluation order matters, consider using Y = \f.(\h.f(hh))(\h.f(hh)) as a fixed point combinator to define a recursive function:

Y (\f n. blah) =
(\f.(\h.f(hh))(\h.f(hh))) (\f n. blah) 
     *
(\f.f((\h.f(hh))(\h.f(hh)))) (\f n. blah) 
       *
(\f.f(f((\h.f(hh))(\h.f(hh))))) (\f n. blah) 
         *
(\f.f(f(f((\h.f(hh))(\h.f(hh)))))) (\f n. blah) 

And we never get the recursion off the ground.

Using a Continuation Passing Style transform to control order of evaluation

We'll present a technique for controlling evaluation order by transforming a lambda term using a Continuation Passing Style transform (CPS), then we'll explore what the CPS is doing, and how.

In order for the CPS to work, we have to adopt a new restriction on beta reduction: beta reduction does not occur underneath a lambda. That is, (\x.y)z reduces to y, but \u.(\x.y)z does not reduce to \u.y, because the \u protects the redex in the body from reduction. (In this context, a "redex" is a part of a term that matches the pattern ...((\xM)N)..., i.e., something that can potentially be the target of beta reduction.)

Start with a simple form that has two different reduction paths:

reducing the leftmost lambda first: (\x.y)((\x.z)u) ~~> y

reducing the rightmost lambda first: (\x.y)((\x.z)u) ~~> (\x.y)z ~~> y

After using the following call-by-name CPS transform---and assuming that we never evaluate redexes protected by a lambda---only the first reduction path will be available: we will have gained control over the order in which beta reductions are allowed to be performed.

Here's the CPS transform defined:

[x] = x
[\xM] = \k.k(\x[M])
[MN] = \k.[M](\m.m[N]k)

Here's the result of applying the transform to our simple example:

[(\x.y)((\x.z)u)] =
\k.[\x.y](\m.m[(\x.z)u]k) =
\k.(\k.k(\x.[y]))(\m.m(\k.[\x.z](\m.m[u]k))k) =
\k.(\k.k(\x.y))(\m.m(\k.(\k.k(\x.z))(\m.muk))k)

Because the initial \k protects (i.e., takes scope over) the entire transformed term, we can't perform any reductions. In order to watch the computation unfold, we have to apply the transformed term to a trivial continuation, usually the identity function I = \x.x.

[(\x.y)((\x.z)u)] I =
(\k.[\x.y](\m.m[(\x.z)u]k)) I
 *
[\x.y](\m.m[(\x.z)u] I) =
(\k.k(\x.y))(\m.m[(\x.z)u] I)
 *           *
(\x.y)[(\x.z)u] I           --A--
 *
y I

The application to I unlocks the leftmost functor. Because that functor (\x.y) throws away its argument (consider the reduction in the line marked (A)), we never need to expand the CPS transform of the argument. This means that we never bother to reduce redexes inside the argument.

Compare with a call-by-value xform:

{x} = \k.kx
{\aM} = \k.k(\a{M})
{MN} = \k.{M}(\m.{N}(\n.mnk))

This time the reduction unfolds in a different manner:

{(\x.y)((\x.z)u)} I =
(\k.{\x.y}(\m.{(\x.z)u}(\n.mnk))) I
 *
{\x.y}(\m.{(\x.z)u}(\n.mnI)) =
(\k.k(\x.{y}))(\m.{(\x.z)u}(\n.mnI))
 *             *
{(\x.z)u}(\n.(\x.{y})nI) =
(\k.{\x.z}(\m.{u}(\n.mnk)))(\n.(\x.{y})nI)
 *
{\x.z}(\m.{u}(\n.mn(\n.(\x.{y})nI))) =
(\k.k(\x.{z}))(\m.{u}(\n.mn(\n.(\x.{y})nI)))
 *             *
{u}(\n.(\x.{z})n(\n.(\x.{y})nI)) =
(\k.ku)(\n.(\x.{z})n(\n.(\x.{y})nI))
 *      *
(\x.{z})u(\n.(\x.{y})nI)       --A--
 *
{z}(\n.(\x.{y})nI) =
(\k.kz)(\n.(\x.{y})nI)
 *      *
(\x.{y})zI
 *
{y}I =
(\k.ky)I
 *
I y

In this case, the argument does get evaluated: consider the reduction in the line marked (A).

Both xforms make the following guarantee: as long as redexes underneath a lambda are never evaluated, there will be at most one reduction available at any step in the evaluation. That is, all choice is removed from the evaluation process.

Now let's verify that the CBN CPS avoids the infinite reduction path discussed above (remember that w = \x.xx):

[(\x.y)(ww)] I =
(\k.[\x.y](\m.m[ww]k)) I
 *
[\x.y](\m.m[ww]I) =
(\k.k(\x.y))(\m.m[ww]I)
 *             *
(\x.y)[ww]I
 *
y I
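
Both transforms can also be written out as functions over lambda terms. Here is a Haskell sketch over a toy term type (a cousin of the OCaml form type in exercise 3 below, which still asks you to do the OCaml version yourself). It reuses the fixed names k, m, n for the continuation variables the transform introduces, so it assumes source terms don't themselves use those variable names.

data Form = Var Char | Abs Char Form | App Form Form
  deriving Show

-- call-by-name:  [x] = x,  [\xM] = \k.k(\x[M]),  [MN] = \k.[M](\m.m[N]k)
cbn :: Form -> Form
cbn (Var x)   = Var x
cbn (Abs x m) = Abs 'k' (App (Var 'k') (Abs x (cbn m)))
cbn (App m n) = Abs 'k' (App (cbn m)
                             (Abs 'm' (App (App (Var 'm') (cbn n)) (Var 'k'))))

-- call-by-value:  {x} = \k.kx,  {\aM} = \k.k(\a{M}),  {MN} = \k.{M}(\m.{N}(\n.mnk))
cbv :: Form -> Form
cbv (Var x)   = Abs 'k' (App (Var 'k') (Var x))
cbv (Abs x m) = Abs 'k' (App (Var 'k') (Abs x (cbv m)))
cbv (App m n) = Abs 'k' (App (cbv m)
                             (Abs 'm' (App (cbv n)
                                           (Abs 'n' (App (App (Var 'm') (Var 'n'))
                                                         (Var 'k'))))))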

Questions and exercises:

  1. Prove that {(\x.y)(ww)} does not terminate.

  2. Why is the CBN xform for variables [x] = x instead of something involving kappas (i.e., k's)?

  3. Write an Ocaml function that takes a lambda term and returns a CPS-xformed lambda term. You can use the following data declaration:

    type form = Var of char | Abs of char * form | App of form * form;;

  4. The discussion above talks about the "leftmost" redex, or the "rightmost". But these words apply accurately only in a special set of terms. Characterize the order of evaluation for CBN (likewise, for CBV) more completely and carefully.

  5. What happens (in terms of evaluation order) when the application rule for CBV CPS is changed to {MN} = \k.{N}(\n.{M}(\m.mnk))?

  6. A term and its CPS xform are different lambda terms. Yet in some sense they "do" the same thing computationally. Make this sense precise.

Thinking through the types

This discussion is based on Meyer and Wand 1985.

Let's say we're working in the simply-typed lambda calculus. Then if the original term is well-typed, the CPS xform will also be well-typed. But what will the type of the transformed term be?

The transformed terms all have the form \k.blah. The rule for the CBN xform of a variable appears to be an exception, but instead of writing [x] = x, we can write [x] = \k.xk, which is eta-equivalent. The k's are continuations: functions from something to a result. Let's use σ (written o in the tables below) as the result type. Each k in the transform will be a function of type ρ -> σ for some choice of ρ.

We'll need an ancillary function ': for any ground type a, a' = a; for functional types a->b, (a->b)' = ((a' -> σ) -> σ) -> (b' -> σ) -> σ.

Call by name transform

Terms                            Types

[x] = \k.xk                      [a] = (a'->o)->o
[\xM] = \k.k(\x[M])              [a->b] = ((a->b)'->o)->o
[MN] = \k.[M](\m.m[N]k)          [b] = (b'->o)->o

Remember that types associate to the right. Let's work through the application xform and make sure the types are consistent. We'll have the following types:

M:a->b
N:a
MN:b 
k:b'->o
[N]:(a'->o)->o
m:((a'->o)->o)->(b'->o)->o
m[N]:(b'->o)->o
m[N]k:o 
[M]:((a->b)'->o)->o = ((((a'->o)->o)->(b'->o)->o)->o)->o
[M](\m.m[N]k):o
[MN]:(b'->o)->o

Be aware that even though the transform uses the same symbol for the translation of a variable (i.e., [x] = x), in general the variable in the transformed term will have a different type than in the source term.

Exercise: what should the function ' be for the CBV xform? Hint: see the Meyer and Wand abstract linked above for the answer.

Other CPS transforms

It is easy to think that CBN and CBV are the only two CPS transforms. (We've already seen a variant on call-by-value in one of the exercises above.)

In fact, the number of distinct transforms is unbounded. For instance, here is a variant of CBV that uses the same types as CBN:

<x> = x
<\xM> = \k.k(\x<M>)
<MN> = \k.<M>(\m.<N>(\n.m(\k.kn)k))

Try reducing <(\x.x) ((\y.y) (\z.z))> I to convince yourself that this is a version of call-by-value.

Once we have two evaluation strategies that rely on the same types, we can mix and match:

[x] = x
<x> = x
[\xM] = \k.k(\x<M>)
<\xM> = \k.k(\x[M])
[MN] = \k.<M>(\m.m<N>k)
<MN> = \k.[M](\m.[N](\n.m(\k.kn)k))

This xform interleaves call-by-name and call-by-value in layers, according to the depth of embedding. (Cf. page 4 of Reynolds's 1974 paper ftp://ftp.cs.cmu.edu/user/jcr/reldircont.pdf, equation (4) and the explanation in the paragraph below it.)

CPS Transforms

We've already approached some tasks now by programming in continuation-passing style. We first did that with tuples at the start of term, then with the v5 lists in week 4, and more recently and self-consciously when discussing aborts, the "abSd" task, and the use of tree_monadize specialized to the Continuation monad, which required us to supply an initial continuation.

In our discussion of aborts, we showed how to rearrange code like this:

let foo x =
+---try begin----------------+
|       (if x = 1 then 10    |
|       else abort 20        |
|       ) + 100              |
+---end----------------------+
in (foo 2) + 1000;;

into a form like this:

let x = 2
in let snapshot = fun box ->
    let foo_result = box
    in (foo_result) + 1000
in let continue_normally = fun from_value ->
    let value = from_value + 100
    in snapshot value
in
    if x = 1 then continue_normally 10
    else snapshot 20;;

How did we figure out how to rearrange that code? There are algorithms that can do this for us mechanically. These algorithms are known as CPS transforms, because they transform code that might not yet be in CPS form into that form.

We won't attempt to give a full CPS transform for OCaml; instead we'll just focus on the lambda calculus and a few extras, to be introduced as we proceed.

In fact there are multiple ways to do a CPS transform. Here is one:

[x]     --> x
[\x. M] --> \k. k (\x. [M])
[M N]   --> \k. [M] (\m. m [N] k)

And here is another:

[x]     --> \k. k x
[\x. M] --> \k. k (\x. [M])
[M N]   --> \k. [M] (\m. [N] (\n. m n k))

These transforms have some interesting properties. One is that---assuming we never reduce inside a lambda term, but only when redexes are present in the outermost level---the formulas generated by these transforms will always only have a single candidate redex to be reduced at any stage. In other words, the generated expressions dictate in what order the components from the original expressions will be evaluated. As it happens, the first transform above forces a call-by-name reduction order: assuming M N to be a redex, redexes inside N will be evaluated only after N has been substituted into M. And the second transform forces a call-by-value reduction order. These reduction orders will be forced no matter what the native reduction order of the interpreter is, just so long as we're only allowed to reduce redexes not underneath lambdas.

Plotkin did important early work with CPS transforms, and they are now a staple of academic computer science. (See the end of his 1975 paper Call-by-name, call-by-value, and the lambda-calculus.)

Here's another interesting fact about these transforms. Compare the translations for variables and applications in the call-by-value transform:

[x]     --> \k. k x
[M N]   --> \k. [M] (\m. [N] (\n. m n k))

to the implementations we proposed for unit and bind when developing a Continuation monad, for example here. I'll relabel some of the variable names to help the comparison:

let cont_unit x = fun k -> k x
let cont_bind N M = fun k -> N (fun n -> M n k)

The transform for x is just cont_unit x! And the transform for M N is, though not here exactly the same as cont_bind N M, quite reminiscent of it. (I don't yet know whether there's an easy and satisfying explanation of why these two are related as they are.)

Doing CPS transforms by hand is very cumbersome. (Try it.) But you can leverage our lambda evaluator to help you out. Here's how to do it. From here on out, we'll be working with and extending the call-by-value CPS transform set out above:

let var = \x (\k. k x) in
let lam = \x_body (\k. k (\x. x_body x)) in
let app = \m n. (\k. m (\m. n (\n. m n k))) in
...

Then if you want to use [x], you'd write var x. If you want to use [\x. body], you'd write lam (\x. BODY), where BODY is whatever [body] amounts to. If you want to use [m n], you'd write app M N, where M and N are whatever [m] and [n] amount to.

To play around with this, you'll also want to help yourself to some primitives already in CPS form. (You won't want to rebuild everything again from scratch.) For a unary function like succ, you can take its primitive CPS analogue [succ] to be \u. u (\a k. k (succ a)) (where succ in this expansion is the familiar non-CPS form of succ). Then for example:

[succ x]
      = \k. [succ] (\m. [x] (\n. m n k))
    ~~> ...
    ~~> \k. k (succ x)

Or, using the lambda evaluator, that is:

...
let op1 = \op. \u. u (\a k. k (op a)) in
app (op1 succ) (var x)
~~> \k. k (succ x)

Some other handy tools:

let app2 = \a b c. app (app a b) c in
let app3 = \a b c d. app (app (app a b) c) d in
let op2 = \op. \u. u (\a v. v (\b k. k (op a b))) in
let op3 = \op. \u. u (\a v. v (\b w. w (\c k. k (op a b c)))) in
...

Then, for instance, [plus x y] would be rendered in the lambda evaluator as:

app2 (op2 plus) (var x) (var y)
~~> \k. k (plus x y)

To finish off a CPS computation, you have to supply it with an "initial" or "outermost" continuation. (This is somewhat like "running" a monadic computation.) Usually you'll give the identity function, representing that nothing further happens to the continuation-expecting value.

If the program you're working with is already in CPS form, then some elegant and powerful computational patterns become available, as we've been seeing. But it's tedious to convert to and work in fully-explicit CPS form. Usually you'll just want to be using the power of continuations at some few points in your program. It'd be nice if we had some way to make use of those patterns without having to convert our code explicitly into CPS form.

Callcc

Well, we can. Consider the space of lambda formulas. Consider their image under a CPS transform. There will be many well-formed lambda expressions not in that image---that is, expressions that aren't anybody's CPS transform. Some of these will be useful levers in the CPS patterns we want to make use of. We can think of them as being the CPS transforms of some new syntax in the original language. For example, the expression callcc is explained as a new bit of syntax having some of that otherwise unclaimed CPS real-estate. The meaning of the new syntax can be understood in terms of how the CPS transform we specify for it behaves, when the whole language is in CPS form.

I won't give the CPS transform for callcc itself, but instead for the complex form:

[callcc (\k. body)] = \outk. (\k. [body] outk) (\v localk. outk v)

The behavior of callcc is this. The whole expression callcc (\k. body), call it C, is being evaluated in a context, call it E[_]. When we convert to CPS form, the continuation of this occurrence of C will be bound to the variable outk. What happens then is that we bind the expression \v localk. outk v to the variable k and evaluate [body], passing through to it the existing continuation outk. Now if body is just, for example, x, then its CPS transform [x] will be \j. j x and this will accept the continuation outk and feed it x, and we'll continue on with nothing unusual occurring. If on the other hand body makes use of the variable k, what happens then? For example, suppose body includes foo (k v). In the reduction of the CPS transform [foo (k v)], v will be passed to k which as we said is now \v localk. outk v. The continuation of that application---what is scheduled to happen to k v after it's evaluated and foo gets access to it---will be bound next to localk. But notice that this localk is discarded. The computation goes on without it. Instead, it just continues evaluating outk v, where as we said outk is the outside continuation E[_] of the whole callcc (\k. body) invocation.

So in other words, since the continuation in which foo was to be applied to the value of k v was discarded, that application never gets evaluated. We escape from that whole block of code.

It's important to understand that callcc binds k to a pipe into the continuation as still then installed. Not just to a function that performs the same computation as the context E[_] does---that has the same normal form and extension. But rather, a pipe into E[_] in its continuation-playing role. This is manifested by the fact that when k v finishes evaluating, that value is not delivered to foo for the computation to proceed. Instead, when k v finishes evaluating, the program will then be done. Not because of some "stop here" block attached to k, but rather because of what it is that k represents. Walking through the explanation above several times may help you understand this better.

So too will examples. We'll give some examples, and show you how to try them out in a variety of formats:

  1. using the lambda evaluator to check how the CPS transforms reduce

    To do this, you can use the following helper function:

    let callcc = \k_body. \outk. (\k. (k_body k) outk) (\v localk. outk v) in
    ...
    

    Used like this: [callcc (\k. body)] = callcc (\k. BODY), where BODY is [body].

  2. using a callcc operation on our Continuation monad

    This is implemented like this:

    let callcc body = fun outk -> body (fun v localk -> outk v) outk
    

Ok, so now let's see how to perform these same computations via CPS.

In the lambda evaluator:

let var = \x (\k. k x) in
let lam = \x_body (\k. k (\x. x_body x)) in
let app = \m n. (\k. m (\m. n (\n. m n k))) in
let app2 = \a b c. app (app a b) c in
let app3 = \a b c d. app (app (app a b) c) d in
let op1 = \op. \u. u (\a k. k (op a)) in
let op2 = \op. \u. u (\a v. v (\b k. k (op a b))) in
let op3 = \op. \u. u (\a v. v (\b w. w (\c k. k (op a b c)))) in
let callcc = \k_body. \outk. (\k. (k_body k) outk) (\v localk. outk v) in

; (+ 100 (let/cc k (+ 10 1))) ~~> 111
app2 (op2 plus) (var hundred) (callcc (\k. app2 (op2 plus) (var ten) (var one)))
; evaluates to \k. k (plus hundred (plus ten one))

Next:

; (+ 100 (let/cc k (+ 10 (k 1)))) ~~> 101
app2 (op2 plus) (var hundred) (callcc (\k. app2 (op2 plus) (var ten) (app (var k) (var one))))
; evaluates to \k. k (plus hundred one)

We won't try to do the third example in this framework.

Finally, using the Continuation monad from our OCaml monad library. We begin:

# #use "path/to/monads.ml"
# module C = Continuation_monad;;

Now what we want to do is something like this:

# C.(run0 (100 + callcc (fun k -> 10 + 1)));;

run0 is a special function in the Continuation monad that runs a value of that monad using the identity function as its initial continuation. The above expression won't type-check, for several reasons. First, we're trying to add 100 to callcc (...) but the latter is a Continuation.m value, not an int. So we have to do this instead:

# C.(run0 (callcc (fun k -> 10 + 1) >>= fun i -> 100 + i));;

Except that's still no good, because 10 + 1 and 100 + i are of type int, but their context demands Continuation monadic values. So we have to throw in some units:

# C.(run0 (callcc (fun k -> unit (10 + 1)) >>= fun i -> unit (100 + i)));;
- : int = 111

This works and as you can see, delivers the same answer 111 that we got by the other methods.

Next we try:

# C.(run0 (callcc (fun k -> unit (10 + (k 1))) >>= fun i -> unit (100 + i)));;

That won't work because k 1 doesn't have type int, but we're trying to add it to 10. So we have to do instead:

# C.(run0 (callcc (fun k -> k 1 >>= fun j -> unit (10 + j)) >>= fun i -> unit (100 + i)));;
- : int = 101

This also works and as you can see, delivers the expected answer 101.

The third example is more difficult to make work with the monadic library, because its types are tricky. I was able to get the version below to work; it uses OCaml's "polymorphic variants." These are generally more relaxed about typing. There may be a version that works with regular OCaml types, but I haven't yet been able to identify it. Here's what does work:

# C.(run0 (callcc (fun k -> unit (1,`Box k)) >>= fun (p1,`Box p2) -> p2 (2,`Box unit) >>= fun p2' -> unit (p1,p2')));;
- : int * (int * [ `Box of 'b -> ('a, 'b) C.m ] as 'b) as 'a =
(2, (2, `Box <fun>))

CPS in the lambda evaluator

You can make the lambda evaluator perform the required CPS transforms with these helper functions:

let reset = \body. \outk. outk (body (\i i)) in
let shift = \k_body. \midk. (\k. (k_body k) (\i i)) (\a localk. localk (midk a)) in
let abort = \body. \midk. body (\i i) in
...

You use these like so:

  • [reset body] is reset BODY where BODY is [body]
  • [shift k body] is shift (\k. BODY) where BODY is [body]
  • and [abort value] is abort VALUE where VALUE is [value]

There are also reset and shift and abort operations in the Continuation monad in our OCaml monad library. You can check the code for details.

As we said, there are many varieties of delimited continuations. Another common pair is prompt and control. There's no difference in meaning between prompt and reset; it's just that people tend to say reset when talking about shift, and prompt when talking about control. control acts subtly differently from shift. In the uses you're likely to make as you're just learning about continuations, you won't see any difference. If you'll do more research in this vicinity, you'll soon enough learn about the differences.

(You can start by reading the Racket docs.)

Ken Shan has done terrific work exploring the relations of shift and control and other control operators to each other.

In collecting these CPS transforms and implementing the monadic versions, we've been helped by Ken and by Oleg and by these papers:

  • Danvy and Filinski, "Representing control: a study of the CPS transformation" (1992)
  • Sabry, "Note on axiomatizing the semantics of control operators" (1996)


Examples of shift/reset/abort

Here are some more examples of using delimited control operators. We present each example three ways: first a Scheme formulation; then we compute the same result using CPS and the lambda evaluator; then we do the same using the Continuation monad in OCaml. (We don't demonstrate the use of Oleg's delimcc library.)

Example 1:

; (+ 1000 (+ 100 (abort 11))) ~~> 11

app2 (op2 plus) (var thousand)
  (app2 (op2 plus) (var hundred) (abort (var eleven)))

# Continuation_monad.(run0(
    abort 11 >>= fun i ->
    unit (100+i) >>= fun j ->
    unit (1000+j)));;
- : int = 11

When no reset is specified, there's understood to be an implicit one surrounding the entire computation (but unlike in the case of callcc, you still can't capture up to and including the end of the computation). So it makes no difference if we say instead:

# Continuation_monad.(run0(
    reset (
      abort 11 >>= fun i ->
      unit (100+i) >>= fun j ->
      unit (1000+j))));;
- : int = 11

Example 2:

; (+ 1000 (reset (+ 100 (abort 11)))) ~~> 1011

app2 (op2 plus) (var thousand)
  (reset (app2 (op2 plus) (var hundred) (abort (var eleven))))

# Continuation_monad.(run0(
    reset (
      abort 11 >>= fun i ->
      unit (100+i)
    ) >>= fun j ->
    unit (1000+j)));;
- : int = 1011

Example 3:

; (+ 1000 (reset (+ 100 (shift k (+ 10 1))))) ~~> 1011

app2 (op2 plus) (var thousand)
  (reset (app2 (op2 plus) (var hundred)
    (shift (\k. ((op2 plus) (var ten) (var one))))))

Continuation_monad.(
  let v = reset (
    let u = shift (fun k -> unit (10 + 1))
    in u >>= fun x -> unit (100 + x)
  ) in let w = v >>= fun x -> unit (1000 + x)
  in run0 w);;
- : int = 1011

Example 4:

; (+ 1000 (reset (+ 100 (shift k (k (+ 10 1)))))) ~~> 1111

app2 (op2 plus) (var thousand)
  (reset (app2 (op2 plus) (var hundred)
    (shift (\k. (app (var k) ((op2 plus) (var ten) (var one)))))))

Continuation_monad.(
  let v = reset (
    let u = shift (fun k -> k (10 :: [1]))
    in u >>= fun x -> unit (100 :: x)
  ) in let w = v >>= fun x -> unit (1000 :: x)
  in run0 w);;
- : int list = [1000; 100; 10; 1]

To demonstrate the different adding order between Examples 4 and 5, we use :: in the OCaml version instead of +. Here is Example 5:

; (+ 1000 (reset (+ 100 (shift k (+ 10 (k 1)))))) ~~> 1111 but added differently

app2 (op2 plus) (var thousand)
  (reset (app2 (op2 plus) (var hundred)
    (shift (\k. ((op2 plus) (var ten) (app (var k) (var one)))))))

Continuation_monad.(let v = reset (
    let u = shift (fun k -> k [1] >>= fun x -> unit (10 :: x))
    in u >>= fun x -> unit (100 :: x)
  ) in let w = v >>= fun x -> unit (1000 :: x)
  in run0  w);;
- : int list = [1000; 10; 100; 1]

Example 6:

; (+ 100 ((reset (+ 10 (shift k k))) 1)) ~~> 111

app2 (op2 plus) (var hundred)
  (app (reset (app2 (op2 plus) (var ten)
    (shift (\k. (var k))))) (var one))

(* not sure if this example can be typed as-is in OCaml... this is the best I can do at the moment... *)

# type 'x either = Left of (int -> ('x,'x either) Continuation_monad.m) | Right of int;;
# Continuation_monad.(let v = reset (
    shift (fun k -> unit (Left k)) >>= fun i -> unit (Right (10+i))
  ) in let w = v >>= fun (Left k) ->
      k 1 >>= fun (Right i) ->
      unit (100+i)
  in run0 w);;
- : int = 111

Example 7:

; (+ 100 (reset (+ 10 (shift k (k (k 1)))))) ~~> 121

app2 (op2 plus) (var hundred)
  (reset (app2 (op2 plus) (var ten)
    (shift (\k. app (var k) (app (var k) (var one))))))

Continuation_monad.(let v = reset (
    let u = shift (fun k -> k 1 >>= fun x -> k x)
    in u >>= fun x -> unit (10 + x)
  ) in let w = v >>= fun x -> unit (100 + x)
  in run0 w);;
- : int = 121