X-Git-Url: http://lambda.jimpryor.net/git/gitweb.cgi?p=lambda.git;a=blobdiff_plain;f=cps.mdwn;h=6668b48ce1bcde7c63aed5a13b113c376e4bc884;hp=605e86a5fc4f038b00dc9cf9444f0fbfe4d52ead;hb=339a5442b568742f36ddab4a44f77cfb26b609a2;hpb=172783a2998a2d2d6a0e1cbda2e4f68710099fe3 diff --git a/cps.mdwn b/cps.mdwn index 605e86a5..6668b48c 100644 --- a/cps.mdwn +++ b/cps.mdwn @@ -1,61 +1,250 @@ -;; call-by-value CPS -; see Dancy and Filinski, "Representing control: a study of the CPS transformation" (1992) -; and Sabry, "Note on axiomatizing the semantics of control operators" (1996) - -; [x] = var x -let var = \x (\k. k x) in -; [\x. body] = lam (\x. [body]) -let lam = \x_body (\k. k (\x. x_body x)) in -; [M N] = app [M] [N] -let app = \m n. (\k. m (\m. n (\n. m n k))) in - -; helpers -let app3 = \a b c. app (app a b) c in -let app4 = \a b c d. app (app (app a b) c) d in -; [succ] = op1 succ -let op1 = \op. \u. u (\a k. k (op a)) in -; [plus] = op2 plus -let op2 = \op. \u. u (\a v. v (\b k. k (op a b))) in -let op3 = \op. \u. u (\a v. v (\b w. w (\c k. k (op a b c)))) in - -;; continuation operators -; [let/cc k M] = letcc (\k. [M]) -let callcc = \k. k (\f u. (\j. f j u) (\y w. u y)) in -let letcc = \x_body. app callcc (lam x_body) in -let letcc = \k_body. \k. (\j. (k_body j) k) (\y w. k y) in - -; [abort M] = abort [M] -let abort = \body. \k. body (\m m) in -; [prompt M] = prompt [M] -let prompt = \body. \k. k (body (\m m)) in -; [shift k M] = shift (\k. [M]) -let shift = \k_body. \k. (\j. (k_body j) (\m m)) (\y w. w (k y)) in - -;; examples -; (+ 100 (let/cc k (+ 10 1))) ~~> 111 -; app3 (op2 plus) (var hundred) (letcc (\k. app3 (op2 plus) (var ten) (var one))) - -; (+ 100 (let/cc k (+ 10 (k 1)))) ~~> 101 -; app3 (op2 plus) (var hundred) (letcc (\k. 
app3 (op2 plus) (var ten) (app (var k) (var one)))) - -; (+ 100 (+ 10 (abort 1))) ~~> 1 -; app3 (op2 plus) (var hundred) (app3 (op2 plus) (var ten) (abort (var one))) - -; (+ 100 (prompt (+ 10 (abort 1)))) ~~> 101 -; app3 (op2 plus) (var hundred) (prompt (app3 (op2 plus) (var ten) (abort (var one)))) - -; (+ 1000 (prompt (+ 100 (shift k (+ 10 1))))) ~~> 1011 -; app3 (op2 plus) (var thousand) (prompt (app3 (op2 plus) (var hundred) (shift (\k. ((op2 plus) (var ten) (var one)))))) - -; (+ 1000 (prompt (+ 100 (shift k (k (+ 10 1)))))) ~~> 1111 -; app3 (op2 plus) (var thousand) (prompt (app3 (op2 plus) (var hundred) (shift (\k. (app (var k) ((op2 plus) (var ten) (var one))))))) - -; (+ 1000 (prompt (+ 100 (shift k (+ 10 (k 1)))))) ~~> 1111 but added differently -; app3 (op2 plus) (var thousand) (prompt (app3 (op2 plus) (var hundred) (shift (\k. ((op2 plus) (var ten) (app (var k) (var one))))))) - -; (+ 100 ((prompt (+ 10 (shift k k))) 1)) ~~> 111 -; app3 (op2 plus) (var hundred) (app (prompt (app3 (op2 plus) (var ten) (shift (\k. (var k))))) (var one)) - -; (+ 100 (prompt (+ 10 (shift k (k (k 1)))))) ~~> 121 -; app3 (op2 plus) (var hundred) (prompt (app3 (op2 plus) (var ten) (shift (\k. app (var k) (app (var k) (var one)))))) +Gaining control over order of evaluation +---------------------------------------- +We know that evaluation order matters. We're beginning to learn how +to gain some control over order of evaluation (think of Jim's abort handler). +We continue to reason about order of evaluation. + +A lucid discussion of evaluation order in the +context of the lambda calculus can be found here: +[Sestoft: Demonstrating Lambda Calculus Reduction](http://www.itu.dk/~sestoft/papers/mfps2001-sestoft.pdf). +Sestoft also provides a lovely on-line lambda evaluator: +[Sestoft: Lambda calculus reduction workbench](http://www.itu.dk/~sestoft/lamreduce/index.html), +which allows you to select multiple evaluation strategies, +and to see reductions happen step by step. 
+ +Evaluation order matters +------------------------ + +We've seen this many times. For instance, consider the following +reductions. It will be convenient to use the abbreviation `w = +\x.xx`. I'll +indicate which lambda is about to be reduced with a * underneath: + +
+(\x.y)(ww)
+ *
+y
+
+ +Done! We have a normal form. But if we reduce using a different +strategy, things go wrong: + +
+(\x.y)(ww) =
+(\x.y)((\x.xx)w) =
+        *
+(\x.y)(ww) =
+(\x.y)((\x.xx)w) =
+        *
+(\x.y)(ww) 
+
+ +Etc. + +As a second reminder of when evaluation order matters, consider using +`Y = \f.(\h.f(hh))(\h.f(hh))` as a fixed point combinator to define a recursive function: + +
+Y (\f n. blah) =
+(\f.(\h.f(hh))(\h.f(hh))) (\f n. blah) 
+     *
+(\f.f((\h.f(hh))(\h.f(hh)))) (\f n. blah) 
+       *
+(\f.f(f((\h.f(hh))(\h.f(hh))))) (\f n. blah) 
+         *
+(\f.f(f(f((\h.f(hh))(\h.f(hh)))))) (\f n. blah) 
+
+
+And we never get the recursion off the ground.
+
+
+Using a Continuation Passing Style transform to control order of evaluation
+---------------------------------------------------------------------------
+
+We'll present a technique for controlling evaluation order by transforming a lambda term
+using a Continuation Passing Style (CPS) transform; then we'll explore
+what the transform is doing, and how.
+
+In order for the CPS transform to work, we have to adopt a new restriction on
+beta reduction: beta reduction does not occur underneath a lambda.
+That is, `(\x.y)z` reduces to `y`, but `\u.(\x.y)z` does not reduce to
+`\u.y`, because the `\u` protects the redex in the body from
+reduction. (In this context, a "redex" is a part of a term that matches
+the pattern `...((\xM)N)...`, i.e., something that can potentially be
+the target of beta reduction.)
+
+Start with a simple term that has two different reduction paths:
+
+reducing the leftmost lambda first: `(\x.y)((\x.z)u) ~~> y`
+
+reducing the rightmost lambda first: `(\x.y)((\x.z)u) ~~> (\x.y)z ~~> y`
+
+After using the following call-by-name CPS transform (and assuming
+that we never evaluate redexes protected by a lambda), only the first
+reduction path will be available: we will have gained control over the
+order in which beta reductions are allowed to be performed.
+
+Here's the definition of the CPS transform:
+
+    [x] = x
+    [\xM] = \k.k(\x[M])
+    [MN] = \k.[M](\m.m[N]k)
+
+Here's the result of applying the transform to our simple example:
+
+    [(\x.y)((\x.z)u)] =
+    \k.[\x.y](\m.m[(\x.z)u]k) =
+    \k.(\k.k(\x.[y]))(\m.m(\k.[\x.z](\m.m[u]k))k) =
+    \k.(\k.k(\x.y))(\m.m(\k.(\k.k(\x.z))(\m.muk))k)
+
+Because the initial `\k` protects (i.e., takes scope over) the entire
+transformed term, we can't perform any reductions. In order to watch
+the computation unfold, we have to apply the transformed term to a
+trivial continuation, usually the identity function `I = \x.x`.
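
The restriction introduced in this section, that beta reduction never happens underneath a lambda, can be sketched in OCaml. This is only an illustration under stated assumptions: the `term` type (with string variable names), `free_vars`, `fresh`, `subst`, and `step` are hypothetical names that do not come from the page, and `fresh` assumes that no existing variable already has a name of the form `x_1`.

```ocaml
(* Untyped lambda terms. String variable names are an assumption made
   for this sketch. *)
type term =
  | Var of string
  | Abs of string * term
  | App of term * term

(* Free variables of a term, as a list (duplicates are harmless here). *)
let rec free_vars = function
  | Var x -> [x]
  | Abs (x, body) -> List.filter (fun y -> y <> x) (free_vars body)
  | App (m, n) -> free_vars m @ free_vars n

(* Counter used to invent fresh names when a bound variable must be renamed. *)
let counter = ref 0
let fresh x = incr counter; x ^ "_" ^ string_of_int !counter

(* Capture-avoiding substitution: subst x s t replaces free x in t by s. *)
let rec subst x s t =
  match t with
  | Var y -> if y = x then s else t
  | App (m, n) -> App (subst x s m, subst x s n)
  | Abs (y, body) ->
    if y = x then t
    else if List.mem y (free_vars s) then
      let y' = fresh y in
      Abs (y', subst x s (subst y (Var y') body))
    else Abs (y, subst x s body)

(* One reduction step that never looks underneath a lambda.
   Returns None when no unprotected redex exists. When several
   unprotected redexes exist, this sketch picks the leftmost one. *)
let rec step = function
  | App (Abs (x, body), arg) -> Some (subst x arg body)
  | App (m, n) ->
    (match step m with
     | Some m' -> Some (App (m', n))
     | None ->
       (match step n with
        | Some n' -> Some (App (m, n'))
        | None -> None))
  | Var _ | Abs _ -> None

(* (\x.y)z is an unprotected redex; \u.(\x.y)z is protected by \u. *)
let redex = App (Abs ("x", Var "y"), Var "z")
let protected_redex = Abs ("u", redex)
```

Here `step redex` yields `Some (Var "y")`, while `step protected_redex` yields `None`. Applying a CPS-transformed term to `I`, as in the trace that follows, is what exposes a redex to a reducer of this kind.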
+
+    [(\x.y)((\x.z)u)] I =
+    (\k.[\x.y](\m.m[(\x.z)u]k)) I
+     *
+    [\x.y](\m.m[(\x.z)u] I) =
+    (\k.k(\x.y))(\m.m[(\x.z)u] I)
+     *           *
+    (\x.y)[(\x.z)u] I      --A--
+     *
+    y I
+
+The application to `I` unlocks the leftmost functor. Because that
+functor (`\x.y`) throws away its argument (consider the reduction in the
+line marked (A)), we never need to expand the
+CPS transform of the argument. This means that we never bother to
+reduce redexes inside the argument.
+
+Compare with a call-by-value xform:
+
+    {x} = \k.kx
+    {\aM} = \k.k(\a{M})
+    {MN} = \k.{M}(\m.{N}(\n.mnk))
+
+This time the reduction unfolds in a different manner:
+
+    {(\x.y)((\x.z)u)} I =
+    (\k.{\x.y}(\m.{(\x.z)u}(\n.mnk))) I
+     *
+    {\x.y}(\m.{(\x.z)u}(\n.mnI)) =
+    (\k.k(\x.{y}))(\m.{(\x.z)u}(\n.mnI))
+     *             *
+    {(\x.z)u}(\n.(\x.{y})nI) =
+    (\k.{\x.z}(\m.{u}(\n.mnk)))(\n.(\x.{y})nI)
+     *
+    {\x.z}(\m.{u}(\n.mn(\n.(\x.{y})nI))) =
+    (\k.k(\x.{z}))(\m.{u}(\n.mn(\n.(\x.{y})nI)))
+     *             *
+    {u}(\n.(\x.{z})n(\n.(\x.{y})nI)) =
+    (\k.ku)(\n.(\x.{z})n(\n.(\x.{y})nI))
+     *      *
+    (\x.{z})u(\n.(\x.{y})nI)      --A--
+     *
+    {z}(\n.(\x.{y})nI) =
+    (\k.kz)(\n.(\x.{y})nI)
+     *      *
+    (\x.{y})zI
+     *
+    {y}I =
+    (\k.ky)I
+     *
+    I y
+
+In this case, the argument does get evaluated: consider the reduction
+in the line marked (A).
+
+Both xforms make the following guarantee: as long as redexes
+underneath a lambda are never evaluated, there will be at most one
+reduction available at any step in the evaluation.
+That is, all choice is removed from the evaluation process.
+
+Now let's verify that the CBN CPS avoids the infinite reduction path
+discussed above (remember that `w = \x.xx`):
+
+    [(\x.y)(ww)] I =
+    (\k.[\x.y](\m.m[ww]k)) I
+     *
+    [\x.y](\m.m[ww]I) =
+    (\k.k(\x.y))(\m.m[ww]I)
+     *           *
+    (\x.y)[ww]I
+     *
+    y I
+
+
+Questions and exercises:
+
+1. Prove that the evaluation of `{(\x.y)(ww)} I` does not terminate.
+
+2. Why is the CBN xform for variables `[x] = x` instead of something
+involving kappas (i.e., `k`'s)?
+
+3. Write an OCaml function that takes a lambda term and returns a
+CPS-xformed lambda term. You can use the following data declaration:
+
+    type form = Var of char | Abs of char * form | App of form * form;;
+
+4. The discussion above talks about the "leftmost" redex, or the
+"rightmost". But these words apply accurately only in a special set
+of terms. Characterize the order of evaluation for CBN (likewise, for
+CBV) more completely and carefully.
+
+5. What happens (in terms of evaluation order) when the application
+rule for CBV CPS is changed to `{MN} = \k.{N}(\n.{M}(\m.mnk))`?
+
+
+Thinking through the types
+--------------------------
+
+This discussion is based on [Meyer and Wand 1985](http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.44.7943&rep=rep1&type=pdf).
+
+Let's say we're working in the simply-typed lambda calculus.
+Then if the original term is well-typed, the CPS xform will also be
+well-typed. But what will the type of the transformed term be?
+
+The transformed terms all have the form `\k.blah`. The rule for the
+CBN xform of a variable appears to be an exception, but instead of
+writing `[x] = x`, we can write `[x] = \k.xk`, which is
+eta-equivalent. The `k`'s are continuations: functions from something
+to a result. Let's use σ as the result type. Then each `k` in
+the transform will be a function of type ρ -> σ for some
+choice of ρ. In the tables below, σ is written `o`.
+
+We'll need an ancillary function ': for any ground type a, a' = a;
+for functional types a->b, (a->b)' = ((a' -> σ) -> σ) -> (b' -> σ) -> σ.
+
+    Call by name transform
+
+    Terms                            Types
+
+    [x] = \k.xk                      [a] = (a'->o)->o
+    [\xM] = \k.k(\x[M])              [a->b] = ((a->b)'->o)->o
+    [MN] = \k.[M](\m.m[N]k)          [b] = (b'->o)->o
+
+Remember that types associate to the right. Let's work through the
+application xform and make sure the types are consistent. We'll have
+the following types:
+
+    M:a->b
+    N:a
+    MN:b
+    k:b'->o
+    [N]:(a'->o)->o
+    m:((a'->o)->o)->(b'->o)->o
+    m[N]:(b'->o)->o
+    m[N]k:o
+    [M]:((a->b)'->o)->o = ((((a'->o)->o)->(b'->o)->o)->o)->o
+    [M](\m.m[N]k):o
+    [MN]:(b'->o)->o
+
+Be aware that even though the transform uses the same symbol for the
+translation of a variable (i.e., `[x] = x`), in general the variable
+in the transformed term will have a different type than in the source
+term.
+
+Exercise: what should the function ' be for the CBV xform? Hint:
+see the Meyer and Wand abstract linked above for the answer.
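
The call-by-name type translation in this section can be sketched in OCaml as a pair of mutually recursive functions. This is a minimal sketch, not part of the original page: the names `ty`, `sigma`, `prime`, `cbn`, and `show` are hypothetical, and the result type σ is modelled as the ground type `o`. It covers only the CBN case and leaves the CBV version of ' to the exercise.

```ocaml
(* Simple types: ground types and arrows. *)
type ty =
  | Ground of string
  | Arrow of ty * ty

(* The result type of continuations, written o as in the tables. *)
let sigma = Ground "o"

(* prime t computes t':
     a' = a                                      for a ground type a
     (a->b)' = ((a'->o)->o) -> (b'->o)->o        for arrow types
   cbn t computes [t] = (t'->o)->o. *)
let rec prime = function
  | Ground a -> Ground a
  | Arrow (a, b) -> Arrow (cbn a, cbn b)
and cbn t = Arrow (Arrow (prime t, sigma), sigma)

(* Print a type, using the fact that arrows associate to the right:
   only an arrow in argument position needs parentheses. *)
let rec show = function
  | Ground a -> a
  | Arrow (a, b) ->
    let left =
      match a with
      | Arrow _ -> "(" ^ show a ^ ")"
      | Ground _ -> show a
    in
    left ^ "->" ^ show b

(* The type of [M] when M : a->b, with a and b ground. *)
let m_type = show (cbn (Arrow (Ground "a", Ground "b")))
```

For ground types `a` and `b`, `m_type` is the string `((((a->o)->o)->(b->o)->o)->o)->o`, which agrees with the type listed for `[M]` in the worked application case once a' and b' are replaced by `a` and `b`.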