**Note to Chris**: [[don't forget this material to be merged in somehow|/topics/_cps_and_continuation_operators]]. I marked where I cut some material to put into week13_control_operators, but that page is still a work in progress in my browser...
-Gaining control over order of evaluation
-----------------------------------------
-
-We know that evaluation order matters. We're beginning to learn how
-to gain some control over order of evaluation (think of Jim's abort handler).
-We continue to reason about order of evaluation.
-
-A lucid discussion of evaluation order in the
-context of the lambda calculus can be found here:
-[Sestoft: Demonstrating Lambda Calculus Reduction](http://www.itu.dk/~sestoft/papers/mfps2001-sestoft.pdf).
-Sestoft also provides a lovely on-line lambda evaluator:
-[Sestoft: Lambda calculus reduction workbench](http://www.itu.dk/~sestoft/lamreduce/index.html),
-which allows you to select multiple evaluation strategies,
-and to see reductions happen step by step.
-
-Evaluation order matters
-------------------------
-
-We've seen this many times. For instance, consider the following
-reductions. It will be convenient to use the abbreviation `w = \x.xx`.
-I'll indicate which lambda is about to be reduced with a * underneath:
-
-<pre>
-(\x.y)(ww)
- *
-y
-</pre>
-
-Done! We have a normal form. But if we reduce using a different
-strategy, things go wrong:
-
-<pre>
-(\x.y)(ww) =
-(\x.y)((\x.xx)w) =
- *
-(\x.y)(ww) =
-(\x.y)((\x.xx)w) =
- *
-(\x.y)(ww)
-</pre>
-
-Etc.
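
The two reduction paths above can be replayed mechanically. What follows is only a sketch: it borrows the `form` type from the exercises further down, uses a naive substitution that assumes no variable capture can occur (true for this example), and the names `step_outer`, `step_arg_first`, and `run` are made up for illustration. The step budget of 100 is arbitrary; it merely lets us observe that the second strategy is still going.

```ocaml
type form = Var of char | Abs of char * form | App of form * form

(* Naive substitution of n for x; assumes no variable capture can occur. *)
let rec subst x n = function
  | Var y -> if y = x then n else Var y
  | Abs (y, b) -> if y = x then Abs (y, b) else Abs (y, subst x n b)
  | App (f, a) -> App (subst x n f, subst x n a)

(* Leftmost-outermost: contract the head redex before touching the argument. *)
let rec step_outer = function
  | App (Abs (x, b), a) -> Some (subst x a b)
  | App (f, a) ->
    (match step_outer f with
     | Some f' -> Some (App (f', a))
     | None -> Option.map (fun a' -> App (f, a')) (step_outer a))
  | Abs (x, b) -> Option.map (fun b' -> Abs (x, b')) (step_outer b)
  | Var _ -> None

(* Argument first: reduce inside the argument before contracting the redex. *)
let rec step_arg_first = function
  | App (f, a) ->
    (match step_arg_first a with
     | Some a' -> Some (App (f, a'))
     | None ->
       (match step_arg_first f, f with
        | Some f', _ -> Some (App (f', a))
        | None, Abs (x, b) -> Some (subst x a b)
        | None, _ -> None))
  | Abs (x, b) -> Option.map (fun b' -> Abs (x, b')) (step_arg_first b)
  | Var _ -> None

(* Take at most `fuel` steps; None means the budget ran out. *)
let rec run step fuel t =
  if fuel = 0 then None
  else
    match step t with
    | None -> Some t
    | Some t' -> run step (fuel - 1) t'

let w = Abs ('x', App (Var 'x', Var 'x'))         (* w = \x.xx *)
let term = App (Abs ('x', Var 'y'), App (w, w))   (* (\x.y)(ww) *)

let outer_result = run step_outer 100 term        (* reaches the normal form y *)
let inner_result = run step_arg_first 100 term    (* budget exhausted *)
```

Running out of budget does not prove divergence, of course; here it is the observation that `ww` steps to itself that does.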
-
-As a second reminder of when evaluation order matters, consider using
-`Y = \f.(\h.f(hh))(\h.f(hh))` as a fixed point combinator to define a recursive function:
-
-<pre>
-Y (\f n. blah) =
-(\f.(\h.f(hh))(\h.f(hh))) (\f n. blah)
- *
-(\f.f((\h.f(hh))(\h.f(hh)))) (\f n. blah)
- *
-(\f.f(f((\h.f(hh))(\h.f(hh))))) (\f n. blah)
- *
-(\f.f(f(f((\h.f(hh))(\h.f(hh)))))) (\f n. blah)
-</pre>
-
-And we never get the recursion off the ground.
-
-
-Using a Continuation Passing Style transform to control order of evaluation
----------------------------------------------------------------------------
-
-We'll present a technique for controlling evaluation order by transforming a lambda term
-using a Continuation Passing Style transform (CPS), then we'll explore
-what the CPS is doing, and how.
-
-In order for the CPS to work, we have to adopt a new restriction on
-beta reduction: beta reduction does not occur underneath a lambda.
-That is, `(\x.y)z` reduces to `y`, but `\u.(\x.y)z` does not reduce to
-`\u.y`, because the `\u` protects the redex in the body from
-reduction. (In this context, a "redex" is a part of a term that matches
-the pattern `...((\xM)N)...`, i.e., something that can potentially be
-the target of beta reduction.)
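
As a rough illustration of the restriction, here is a one-step reducer that simply refuses to look inside the body of a lambda. It is a sketch under assumptions: the `form` type is taken from the exercises below, substitution is naive (no capture avoidance), and `weak_step` is a hypothetical name. Where several unprotected redexes exist it picks the leftmost one, which is one choice among those the restriction permits.

```ocaml
type form = Var of char | Abs of char * form | App of form * form

(* Naive substitution of n for x; assumes no variable capture can occur. *)
let rec subst x n = function
  | Var y -> if y = x then n else Var y
  | Abs (y, b) -> if y = x then Abs (y, b) else Abs (y, subst x n b)
  | App (f, a) -> App (subst x n f, subst x n a)

(* One reduction step that never descends under a lambda.
   Returns None when no unprotected redex exists. *)
let rec weak_step = function
  | App (Abs (x, b), a) -> Some (subst x a b)
  | App (f, a) ->
    (match weak_step f with
     | Some f' -> Some (App (f', a))
     | None -> Option.map (fun a' -> App (f, a')) (weak_step a))
  | Abs _ | Var _ -> None

let redex = App (Abs ('x', Var 'y'), Var 'z')     (* (\x.y)z *)
let unprotected = weak_step redex                 (* the redex fires *)
let under_lambda = weak_step (Abs ('u', redex))   (* \u guards it: no step *)
```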
-
-Start with a simple form that has two different reduction paths:
-
-reducing the leftmost lambda first: `(\x.y)((\x.z)u) ~~> y`
-
-reducing the rightmost lambda first: `(\x.y)((\x.z)u) ~~> (\x.y)z ~~> y`
-
-After using the following call-by-name CPS transform---and assuming
-that we never evaluate redexes protected by a lambda---only the first
-reduction path will be available: we will have gained control over the
-order in which beta reductions are allowed to be performed.
-
-Here's the CPS transform defined:
-
- [x] = x
- [\xM] = \k.k(\x[M])
- [MN] = \k.[M](\m.m[N]k)
-
-Here's the result of applying the transform to our simple example:
-
- [(\x.y)((\x.z)u)] =
- \k.[\x.y](\m.m[(\x.z)u]k) =
- \k.(\k.k(\x.[y]))(\m.m(\k.[\x.z](\m.m[u]k))k) =
- \k.(\k.k(\x.y))(\m.m(\k.(\k.k(\x.z))(\m.muk))k)
-
-Because the initial `\k` protects (i.e., takes scope over) the entire
-transformed term, we can't perform any reductions. In order to watch
-the computation unfold, we have to apply the transformed term to a
-trivial continuation, usually the identity function `I = \x.x`.
-
- [(\x.y)((\x.z)u)] I =
- (\k.[\x.y](\m.m[(\x.z)u]k)) I
- *
- [\x.y](\m.m[(\x.z)u] I) =
- (\k.k(\x.y))(\m.m[(\x.z)u] I)
- * *
- (\x.y)[(\x.z)u] I --A--
- *
- y I
-
-The application to `I` unlocks the leftmost functor. Because that
-functor (`\x.y`) throws away its argument (consider the reduction in the
-line marked (A)), we never need to expand the
-CPS transform of the argument. This means that we never bother to
-reduce redexes inside the argument.
-
-Compare with a call-by-value xform:
-
- {x} = \k.kx
- {\aM} = \k.k(\a{M})
- {MN} = \k.{M}(\m.{N}(\n.mnk))
-
-This time the reduction unfolds in a different manner:
-
- {(\x.y)((\x.z)u)} I =
- (\k.{\x.y}(\m.{(\x.z)u}(\n.mnk))) I
- *
- {\x.y}(\m.{(\x.z)u}(\n.mnI)) =
- (\k.k(\x.{y}))(\m.{(\x.z)u}(\n.mnI))
- * *
- {(\x.z)u}(\n.(\x.{y})nI) =
- (\k.{\x.z}(\m.{u}(\n.mnk)))(\n.(\x.{y})nI)
- *
- {\x.z}(\m.{u}(\n.mn(\n.(\x.{y})nI))) =
- (\k.k(\x.{z}))(\m.{u}(\n.mn(\n.(\x.{y})nI)))
- * *
- {u}(\n.(\x.{z})n(\n.(\x.{y})nI)) =
- (\k.ku)(\n.(\x.{z})n(\n.(\x.{y})nI))
- * *
- (\x.{z})u(\n.(\x.{y})nI) --A--
- *
- {z}(\n.(\x.{y})nI) =
- (\k.kz)(\n.(\x.{y})nI)
- * *
- (\x.{y})zI
- *
- {y}I =
- (\k.ky)I
- *
- I y
-
-In this case, the argument does get evaluated: consider the reduction
-in the line marked (A).
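
The contrast can also be observed by transcribing the two transformed terms by hand into OCaml functions and recording which variables are consulted. This is a sketch, not a general transform: the free variables `y`, `z`, `u` are given arbitrary integer values, the answer type is fixed to `int`, and the helper names `var`, `trace`, and `run` are made up for illustration.

```ocaml
(* Evaluation trace, most recent entry first. *)
let trace : string list ref = ref []

(* A free variable that records when it is consulted, then hands the
   (arbitrarily chosen) value v to the continuation. *)
let var (s : string) (v : int) (k : int -> int) : int =
  trace := s :: !trace;
  k v

(* [(\x.y)((\x.z)u)] with [x] = x, [\xM] = \k.k(\x[M]),
   [MN] = \k.[M](\m.m[N]k). *)
let cbn k =
  let y = var "y" 1 and z = var "z" 2 and u = var "u" 3 in
  (fun k -> k (fun _x -> y))
    (fun m -> m (fun k -> (fun k -> k (fun _x -> z)) (fun m -> m u k)) k)

(* {(\x.y)((\x.z)u)} with {x} = \k.kx, {\aM} = \k.k(\a{M}),
   {MN} = \k.{M}(\m.{N}(\n.mnk)). *)
let cbv k =
  (fun k -> k (fun _x -> var "y" 1))
    (fun m ->
       (fun k ->
          (fun k -> k (fun _x -> var "z" 2))
            (fun m -> var "u" 3 (fun n -> m n k)))
         (fun n -> m n k))

(* Apply a transformed term to the identity continuation and report the
   result together with the variables consulted, in order. *)
let run term =
  trace := [];
  let v = term (fun v -> v) in
  (v, List.rev !trace)

let cbn_run = run cbn   (* only y is consulted *)
let cbv_run = run cbv   (* u, then z, then y *)
```

Both runs deliver the value of `y`, but only the call-by-value version ever touches the argument `(\x.z)u`.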
-
-Both xforms make the following guarantee: as long as redexes
-underneath a lambda are never evaluated, there will be at most one
-reduction available at any step in the evaluation.
-That is, all choice is removed from the evaluation process.
-
-Now let's verify that the CBN CPS avoids the infinite reduction path
-discussed above (remember that `w = \x.xx`):
-
- [(\x.y)(ww)] I =
- (\k.[\x.y](\m.m[ww]k)) I
- *
- [\x.y](\m.m[ww]I) =
- (\k.k(\x.y))(\m.m[ww]I)
- * *
- (\x.y)[ww]I
- *
- y I
-
-
-Questions and exercises:
-
-1. Prove that `{(\x.y)(ww)} I` does not terminate.
-
-2. Why is the CBN xform for variables `[x] = x` instead of something
-involving kappas (i.e., `k`'s)?
-
-3. Write an OCaml function that takes a lambda term and returns a
-CPS-xformed lambda term. You can use the following data declaration:
-
- type form = Var of char | Abs of char * form | App of form * form;;
-
-4. The discussion above talks about the "leftmost" redex, or the
-"rightmost". But these words apply accurately only in a special set
-of terms. Characterize the order of evaluation for CBN (likewise, for
-CBV) more completely and carefully.
-
-5. What happens (in terms of evaluation order) when the application
-rule for CBV CPS is changed to `{MN} = \k.{N}(\n.{M}(\m.mnk))`?
-
-6. A term and its CPS xform are different lambda terms. Yet in some
-sense they "do" the same thing computationally. Make this sense
-precise.
-
-
-Thinking through the types
---------------------------
-
-This discussion is based on [Meyer and Wand 1985](http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.44.7943&rep=rep1&type=pdf).
-
-Let's say we're working in the simply-typed lambda calculus.
-Then if the original term is well-typed, the CPS xform will also be
-well-typed. But what will the type of the transformed term be?
-
-The transformed terms all have the form `\k.blah`. The rule for the
-CBN xform of a variable appears to be an exception, but instead of
-writing `[x] = x`, we can write `[x] = \k.xk`, which is
-eta-equivalent. The `k`'s are continuations: functions from something
-to a result. Let's use σ as the result type (written `o` in the
-displays below). Then each `k` in the transform will be a function of
-type ρ --> σ for some choice of ρ.
-
-We'll need an ancillary function ': for any ground type a, a' = a;
-for functional types a->b, (a->b)' = ((a' -> σ) -> σ) -> (b' -> σ) -> σ.
-
- Call by name transform
-
- Terms                       Types
-
- [x] = \k.xk                 [a] = (a'->o)->o
- [\xM] = \k.k(\x[M])         [a->b] = ((a->b)'->o)->o
- [MN] = \k.[M](\m.m[N]k)     [b] = (b'->o)->o
-
-Remember that types associate to the right. Let's work through the
-application xform and make sure the types are consistent. We'll have
-the following types:
-
- M:a->b
- N:a
- MN:b
- k:b'->o
- [N]:(a'->o)->o
- m:((a'->o)->o)->(b'->o)->o
- m[N]:(b'->o)->o
- m[N]k:o
- [M]:((a->b)'->o)->o = ((((a'->o)->o)->(b'->o)->o)->o)->o
- [M](\m.m[N]k):o
- [MN]:(b'->o)->o
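
One way to double-check this bookkeeping is to hand it to a type checker. The sketch below encodes the three call-by-name clauses as OCaml combinators with explicit annotations. The names `comp`, `fn`, `var`, `lam`, `app`, and `ret` are hypothetical, the answer type `o` is an arbitrary stand-in, and a type parameter `'a` plays the role of an already-primed type a'.

```ocaml
(* Hypothetical answer type standing in for o; any fixed type would do. *)
type o = Answer of string

(* [a] = (a' -> o) -> o *)
type 'a comp = ('a -> o) -> o

(* (a -> b)' = ((a' -> o) -> o) -> (b' -> o) -> o *)
type ('a, 'b) fn = 'a comp -> 'b comp

(* [x] = \k.xk *)
let var (x : 'a comp) : 'a comp = fun k -> x k

(* [\xM] = \k.k(\x[M]) *)
let lam (body : 'a comp -> 'b comp) : ('a, 'b) fn comp = fun k -> k body

(* [MN] = \k.[M](\m.m[N]k) *)
let app (m : ('a, 'b) fn comp) (n : 'a comp) : 'b comp =
  fun k -> m (fun f -> f n k)

(* A ground-type computation, used only to exercise the combinators. *)
let ret (v : 'a) : 'a comp = fun k -> k v

(* [(\x.x) 3], supplied with a continuation that produces an answer. *)
let identity_applied : int comp = app (lam (fun x -> var x)) (ret 3)
let answer = identity_applied (fun n -> Answer (string_of_int n))
```

That `app` is accepted at the annotated type is exactly the consistency check carried out line by line above.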
-
-Be aware that even though the transform uses the same symbol for the
-translation of a variable (i.e., `[x] = x`), in general the variable
-in the transformed term will have a different type than in the source
-term.
-
-Exercise: what should the function ' be for the CBV xform? Hint:
-see the Meyer and Wand abstract linked above for the answer.
-
-
-Other CPS transforms
---------------------
-
-It is easy to think that CBN and CBV are the only two CPS transforms.
-(We've already seen a variant on call-by-value in one of the exercises above.)
-
-In fact, the number of distinct transforms is unbounded. For
-instance, here is a variant of CBV that uses the same types as CBN:
-
- <x> = x
- <\xM> = \k.k(\x<M>)
- <MN> = \k.<M>(\m.<N>(\n.m(\k.kn)k))
-
-Try reducing `<(\x.x) ((\y.y) (\z.z))> I` to convince yourself that
-this is a version of call-by-value.
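
A smaller experiment than the suggested reduction makes the same point. In the sketch below the two application clauses are written as OCaml combinators over the same computation type, and a counter records whether the argument computation is ever run. Everything named here (`comp`, `runs`, `counted`, `app_n`, `app_v`, `const7`) is hypothetical, and the answer type is fixed to `int` for convenience.

```ocaml
(* Answer type fixed to int for the demonstration. *)
type 'a comp = ('a -> int) -> int

(* Counts how many times the argument computation is actually run. *)
let runs = ref 0
let counted (v : int) : int comp = fun k -> incr runs; k v

(* [\xM] and <\xM> share the same clause. *)
let lam f = fun k -> k f

(* [MN] = \k.[M](\m.m[N]k): the argument is passed along unevaluated. *)
let app_n m n = fun k -> m (fun f -> f n k)

(* <MN> = \k.<M>(\m.<N>(\n.m(\k.kn)k)): the argument is run first, and
   its value n is rewrapped as the trivial computation \k.kn. *)
let app_v m n = fun k -> m (fun f -> n (fun v -> f (fun k' -> k' v) k))

(* \x.7, a function that ignores its argument. *)
let const7 : (int comp -> int comp) comp = lam (fun _x -> fun k -> k 7)

let by_name = app_n const7 (counted 1) (fun v -> v)
let runs_after_name = !runs      (* argument never run *)
let by_value = app_v const7 (counted 1) (fun v -> v)
let runs_after_value = !runs     (* argument run exactly once *)
```

Both applications accept the same `const7` and the same argument, which is what it means for the two strategies to rely on the same types.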
-
-Once we have two evaluation strategies that rely on the same types, we
-can mix and match:
-
- [x] = x
- <x> = x
- [\xM] = \k.k(\x<M>)
- <\xM> = \k.k(\x[M])
- [MN] = \k.<M>(\m.m<N>k)
- <MN> = \k.[M](\m.[N](\n.m(\k.kn)k))
-
-This xform interleaves call-by-name and call-by-value in layers,
-according to the depth of embedding.
-(Cf. page 4 of Reynolds's 1974 paper ftp://ftp.cs.cmu.edu/user/jcr/reldircont.pdf, equation (4) and the
-explanation in the paragraph below it.)