X-Git-Url: http://lambda.jimpryor.net/git/gitweb.cgi?p=lambda.git;a=blobdiff_plain;f=cps.mdwn;h=6668b48ce1bcde7c63aed5a13b113c376e4bc884;hp=35d0680c2319da37047c65f5702dd47465baad3f;hb=339a5442b568742f36ddab4a44f77cfb26b609a2;hpb=aa47742f7ab6e132d2afe3dd6703855bfaeb7ecf

diff --git a/cps.mdwn b/cps.mdwn
index 35d0680c..6668b48c 100644
--- a/cps.mdwn
+++ b/cps.mdwn
@@ -18,8 +18,8 @@ Evaluation order matters
 We've seen this many times. For instance, consider the following
 reductions. It will be convenient to use the abbreviation `w =
-\x.xx`. I'll indicate which lambda is about to be reduced with a *
-underneath:
+\x.xx`. I'll
+indicate which lambda is about to be reduced with a * underneath:
 (\x.y)(ww)
@@ -68,16 +68,17 @@ what the CPS is doing, and how.
 In order for the CPS to work, we have to adopt a new restriction on
 beta reduction: beta reduction does not occur underneath a lambda.
-That is, `(\x.y)z` reduces to `z`, but `\w.(\x.y)z` does not, because
-the `\w` protects the redex in the body from reduction.
-(A redex is a subform ...(\xM)N..., i.e., something that can be the
-target of reduction.)
+That is, `(\x.y)z` reduces to `z`, but `\u.(\x.y)z` does not reduce to
+`\u.z`, because the `\u` protects the redex in the body from
+reduction. (In this context, a "redex" is a part of a term that matches
+the pattern `...((\xM)N)...`, i.e., something that can potentially be
+the target of beta reduction.)
 
 Start with a simple form that has two different reduction paths:
 
-reducing the leftmost lambda first: `(\x.y)((\x.z)w) ~~> y`
+reducing the leftmost lambda first: `(\x.y)((\x.z)u) ~~> y`
 
-reducing the rightmost lambda first: `(\x.y)((\x.z)w) ~~> (\x.y)z ~~> y`
+reducing the rightmost lambda first: `(\x.y)((\x.z)u) ~~> (\x.y)z ~~> y`
 
 After using the following call-by-name CPS transform---and assuming
 that we never evaluate redexes protected by a lambda---only the first
@@ -90,31 +91,33 @@ Here's the CPS transform defined:
     [\xM] = \k.k(\x[M])
     [MN] = \k.[M](\m.m[N]k)
 
-Here's the result of applying the transform to our problem term:
+Here's the result of applying the transform to our simple example:
 
-    [(\x.y)((\x.z)w)] =
-    \k.[\x.y](\m.m[(\x.z)w]k) =
-    \k.(\k.k(\x.[y]))(\m.m(\k.[\x.z](\m.m[w]k))k) =
-    \k.(\k.k(\x.y))(\m.m(\k.(\k.k(\x.z))(\m.mwk))k)
+    [(\x.y)((\x.z)u)] =
+    \k.[\x.y](\m.m[(\x.z)u]k) =
+    \k.(\k.k(\x.[y]))(\m.m(\k.[\x.z](\m.m[u]k))k) =
+    \k.(\k.k(\x.y))(\m.m(\k.(\k.k(\x.z))(\m.muk))k)
 
-Because the initial `\k` protects the entire transformed term,
-we can't perform any reductions. In order to see the computation
-unfold, we have to apply the transformed term to a trivial
-continuation, usually the identity function `I = \x.x`.
+Because the initial `\k` protects (i.e., takes scope over) the entire
+transformed term, we can't perform any reductions. In order to watch
+the computation unfold, we have to apply the transformed term to a
+trivial continuation, usually the identity function `I = \x.x`.
 
-    [(\x.y)((\x.z)w)] I =
-    (\k.[\x.y](\m.m[(\x.z)w]k)) I
+    [(\x.y)((\x.z)u)] I =
+    (\k.[\x.y](\m.m[(\x.z)u]k)) I
      *
-    [\x.y](\m.m[(\x.z)w] I) =
-    (\k.k(\x.y))(\m.m[(\x.z)w] I)
+    [\x.y](\m.m[(\x.z)u] I) =
+    (\k.k(\x.y))(\m.m[(\x.z)u] I)
      *           *
-    (\x.y)[(\x.z)w] I
+    (\x.y)[(\x.z)u] I           --A--
      *
     y I
 
 The application to `I` unlocks the leftmost functor. Because that
-functor (`\x.y`) throws away its argument, we never need to expand the
-CPS transform of the argument.
+functor (`\x.y`) throws away its argument (consider the reduction in the
+line marked (A)), we never need to expand the
+CPS transform of the argument. This means that we never bother to
+reduce redexes inside the argument.
 
 Compare with a call-by-value xform:
 
@@ -124,22 +127,22 @@ Compare with a call-by-value xform:
 This time the reduction unfolds in a different manner:
 
-    {(\x.y)((\x.z)w)} I =
-    (\k.{\x.y}(\m.{(\x.z)w}(\n.mnk))) I
+    {(\x.y)((\x.z)u)} I =
+    (\k.{\x.y}(\m.{(\x.z)u}(\n.mnk))) I
      *
-    {\x.y}(\m.{(\x.z)w}(\n.mnI)) =
-    (\k.k(\x.{y}))(\m.{(\x.z)w}(\n.mnI))
+    {\x.y}(\m.{(\x.z)u}(\n.mnI)) =
+    (\k.k(\x.{y}))(\m.{(\x.z)u}(\n.mnI))
      *             *
-    {(\x.z)w}(\n.(\x.{y})nI) =
-    (\k.{\x.z}(\m.{w}(\n.mnk)))(\n.(\x.{y})nI)
+    {(\x.z)u}(\n.(\x.{y})nI) =
+    (\k.{\x.z}(\m.{u}(\n.mnk)))(\n.(\x.{y})nI)
      *
-    {\x.z}(\m.{w}(\n.mn(\n.(\x.{y})nI))) =
-    (\k.k(\x.{z}))(\m.{w}(\n.mn(\n.(\x.{y})nI)))
+    {\x.z}(\m.{u}(\n.mn(\n.(\x.{y})nI))) =
+    (\k.k(\x.{z}))(\m.{u}(\n.mn(\n.(\x.{y})nI)))
      *             *
-    {w}(\n.(\x.{z})n(\n.(\x.{y})nI)) =
-    (\k.kw)(\n.(\x.{z})n(\n.(\x.{y})nI))
+    {u}(\n.(\x.{z})n(\n.(\x.{y})nI)) =
+    (\k.ku)(\n.(\x.{z})n(\n.(\x.{y})nI))
      *      *
-    (\x.{z})w(\n.(\x.{y})nI)
+    (\x.{z})u(\n.(\x.{y})nI)       --A--
      *
     {z}(\n.(\x.{y})nI) =
     (\k.kz)(\n.(\x.{y})nI)
@@ -151,32 +154,48 @@ This time the reduction unfolds in a different manner:
      *
     I y
 
+In this case, the argument does get evaluated: consider the reduction
+in the line marked (A).
+
 Both xforms make the following guarantee: as long as redexes
 underneath a lambda are never evaluated, there will be at most one
 reduction available at any step in the evaluation.
 That is, all choice is removed from the evaluation process.
 
+Now let's verify that the CBN CPS avoids the infinite reduction path
+discussed above (remember that `w = \x.xx`):
+
+    [(\x.y)(ww)] I =
+    (\k.[\x.y](\m.m[ww]k)) I
+     *
+    [\x.y](\m.m[ww]I) =
+    (\k.k(\x.y))(\m.m[ww]I)
+     *           *
+    (\x.y)[ww]I
+     *
+    y I
+
+
 Questions and exercises:
 
-1. Why is the CBN xform for variables `[x] = x' instead of something
+1. Prove that {(\x.y)(ww)} does not terminate.
+
+2. Why is the CBN xform for variables `[x] = x' instead of something
 involving kappas?
 
-2. Write an Ocaml function that takes a lambda term and returns a
+3. Write an Ocaml function that takes a lambda term and returns a
 CPS-xformed lambda term. You can use the following data declaration:
 
     type form = Var of char | Abs of char * form | App of form * form;;
 
-3. What happens (in terms of evaluation order) when the application
-rule for CBN CPS is changed to `[MN] = \k.[N](\n.[M]nk)`? Likewise,
-What happens when the application rule for CBV CPS is changed to
-`{MN} = \k.{N}(\n.{M}(\m.mnk))`?
+4. The discussion above talks about the "leftmost" redex, or the
+"rightmost". But these words apply accurately only in a special set
+of terms. Characterize the order of evaluation for CBN (likewise, for
+CBV) more completely and carefully.
 
-4. What happens when the application rules for the CPS xforms are changed to
+5. What happens (in terms of evaluation order) when the application
+rule for CBV CPS is changed to `{MN} = \k.{N}(\n.{M}(\m.mnk))`?
-
-    [MN] = \k.{M}(\m.m{N}k)
-    {MN} = \k.[M](\m.[N](\n.mnk))
-
 Thinking through the types
 --------------------------
@@ -196,7 +215,7 @@ the transform will be a function of type ρ --> σ for some
 choice of ρ.
 
 We'll need an ancilliary function ': for any ground type a, a' = a;
-for functional types a->b, (a->b)' = ((a' -> o) -> o) -> (b' -> o) -> o.
+for functional types a->b, (a->b)' = ((a' -> σ) -> σ) -> (b' -> σ) -> σ.
 
 Call by name transform
@@ -222,7 +241,10 @@ the following types:
     [M](\m.m[N]k):o
     [MN]:(b'->o)->o
 
-Note that even though the transform uses the same symbol for the
-translation of a variable, in general it will have a different type in
-the transformed term.
+Be aware that even though the transform uses the same symbol for the
+translation of a variable (i.e., `[x] = x`), in general the variable
+in the transformed term will have a different type than in the source
+term.
 
+Excercise: what should the function ' be for the CBV xform? Hint:
+see the Meyer and Wand abstract linked above for the answer.
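
One of the exercises in the diff above asks for an OCaml function that CPS-transforms a lambda term, using the `form` type it declares. What follows is a minimal sketch of one possible answer, not part of the commit. The function names `cbn`, `cbv`, and `show` are hypothetical. The CBN clauses are the two shown as hunk context plus `[x] = x`, which the patch mentions in passing; the CBV clauses are read off the `{...}` reduction trace rather than quoted from a definition in the diff. The sketch also assumes that source terms never use the variables `k`, `m`, or `n`, so the binders introduced by the transform cannot capture anything.

```ocaml
type form = Var of char | Abs of char * form | App of form * form

(* Call-by-name transform, following the clauses
     [x]   = x
     [\xM] = \k.k(\x[M])
     [MN]  = \k.[M](\m.m[N]k)
   Assumption: the input never uses 'k' or 'm' as variable names. *)
let rec cbn (t : form) : form =
  match t with
  | Var x -> Var x
  | Abs (x, body) -> Abs ('k', App (Var 'k', Abs (x, cbn body)))
  | App (f, a) ->
      Abs ('k',
           App (cbn f,
                Abs ('m', App (App (Var 'm', cbn a), Var 'k'))))

(* Call-by-value transform, with clauses inferred from the trace:
     {x}   = \k.kx
     {\xM} = \k.k(\x{M})
     {MN}  = \k.{M}(\m.{N}(\n.mnk))
   Assumption: the input never uses 'k', 'm' or 'n' as variable names. *)
let rec cbv (t : form) : form =
  match t with
  | Var x -> Abs ('k', App (Var 'k', Var x))
  | Abs (x, body) -> Abs ('k', App (Var 'k', Abs (x, cbv body)))
  | App (f, a) ->
      Abs ('k',
           App (cbv f,
                Abs ('m',
                     App (cbv a,
                          Abs ('n',
                               App (App (Var 'm', Var 'n'), Var 'k'))))))

(* Fully parenthesized printer, so no precedence rules are needed. *)
let rec show (t : form) : string =
  match t with
  | Var x -> String.make 1 x
  | Abs (x, body) -> Printf.sprintf "(\\%c.%s)" x (show body)
  | App (f, a) -> Printf.sprintf "(%s %s)" (show f) (show a)

(* The simple example from the text: (\x.y)((\x.z)u) *)
let example : form =
  App (Abs ('x', Var 'y'), App (Abs ('x', Var 'z'), Var 'u'))
```

For instance, `show (cbn (Abs ('x', Var 'y')))` yields `(\k.(k (\x.y)))`, the fully parenthesized form of `\k.k(\x.y)`. Fixed binder names keep the output close to the notation in the text. A version meant for arbitrary input would need fresh-name generation instead.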