X-Git-Url: http://lambda.jimpryor.net/git/gitweb.cgi?p=lambda.git;a=blobdiff_plain;f=topics%2F_cps.mdwn;fp=topics%2F_cps.mdwn;h=d7752f4dfcffc9e1bc7c5d190580e876881db72b;hp=0000000000000000000000000000000000000000;hb=df2bc76a01104a5a0983b1fe16e073de7277f844;hpb=18a893a1567d4cc1db9826d4e1bb60588c71e1c9

diff --git a/topics/_cps.mdwn b/topics/_cps.mdwn
new file mode 100644
index 00000000..d7752f4d
--- /dev/null
+++ b/topics/_cps.mdwn
@@ -0,0 +1,286 @@
+Gaining control over order of evaluation
+----------------------------------------
+
+We know that evaluation order matters.  We're beginning to learn how
+to gain some control over order of evaluation (think of Jim's abort handler).
+We continue to reason about order of evaluation.
+
+A lucid discussion of evaluation order in the
+context of the lambda calculus can be found here:
+[Sestoft: Demonstrating Lambda Calculus Reduction](http://www.itu.dk/~sestoft/papers/mfps2001-sestoft.pdf).
+Sestoft also provides a lovely on-line lambda evaluator:
+[Sestoft: Lambda calculus reduction workbench](http://www.itu.dk/~sestoft/lamreduce/index.html),
+which allows you to select multiple evaluation strategies, 
+and to see reductions happen step by step.
+
+Evaluation order matters
+------------------------
+
+We've seen this many times.  For instance, consider the following
+reductions.  It will be convenient to use the abbreviation `w =
+\x.xx`.  I'll
+indicate which lambda is about to be reduced with a * underneath:
+
+<pre>
+(\x.y)(ww)
+ *
+y
+</pre>
+
+Done!  We have a normal form.  But if we reduce using a different
+strategy, things go wrong:
+
+<pre>
+(\x.y)(ww) =
+(\x.y)((\x.xx)w) =
+        *
+(\x.y)(ww) =
+(\x.y)((\x.xx)w) =
+        *
+(\x.y)(ww) 
+</pre>
+
+Etc.  
+
+As a second reminder of when evaluation order matters, consider using
+`Y = \f.(\h.f(hh))(\h.f(hh))` as a fixed point combinator to define a recursive function:
+
+<pre>
+Y (\f n. blah) =
+(\f.(\h.f(hh))(\h.f(hh))) (\f n. blah) 
+     *
+(\f.f((\h.f(hh))(\h.f(hh)))) (\f n. blah) 
+       *
+(\f.f(f((\h.f(hh))(\h.f(hh))))) (\f n. blah) 
+         *
+(\f.f(f(f((\h.f(hh))(\h.f(hh)))))) (\f n. blah) 
+</pre>
+
+And we never get the recursion off the ground.
+
+
+Using a Continuation Passing Style transform to control order of evaluation
+---------------------------------------------------------------------------
+
+We'll present a technique for controlling evaluation order by transforming a lambda term
+using a Continuation Passing Style transform (CPS), then we'll explore
+what the CPS is doing, and how.
+
+In order for the CPS to work, we have to adopt a new restriction on
+beta reduction: beta reduction does not occur underneath a lambda.
+That is, `(\x.y)z` reduces to `z`, but `\u.(\x.y)z` does not reduce to
+`\u.z`, because the `\u` protects the redex in the body from
+reduction.  (In this context, a "redex" is a part of a term that matches
+the pattern `...((\xM)N)...`, i.e., something that can potentially be
+the target of beta reduction.)
+
+Start with a simple form that has two different reduction paths:
+
+reducing the leftmost lambda first: `(\x.y)((\x.z)u)  ~~> y`
+
+reducing the rightmost lambda first: `(\x.y)((\x.z)u)  ~~> (\x.y)z ~~> y`
+
+After using the following call-by-name CPS transform---and assuming
+that we never evaluate redexes protected by a lambda---only the first
+reduction path will be available: we will have gained control over the
+order in which beta reductions are allowed to be performed.
+
+Here's the CPS transform defined:
+
+    [x] = x
+    [\xM] = \k.k(\x[M])
+    [MN] = \k.[M](\m.m[N]k)
+
+Here's the result of applying the transform to our simple example:
+
+    [(\x.y)((\x.z)u)] =
+    \k.[\x.y](\m.m[(\x.z)u]k) =
+    \k.(\k.k(\x.[y]))(\m.m(\k.[\x.z](\m.m[u]k))k) =
+    \k.(\k.k(\x.y))(\m.m(\k.(\k.k(\x.z))(\m.muk))k)
+
+Because the initial `\k` protects (i.e., takes scope over) the entire
+transformed term, we can't perform any reductions.  In order to watch
+the computation unfold, we have to apply the transformed term to a
+trivial continuation, usually the identity function `I = \x.x`.
+
+    [(\x.y)((\x.z)u)] I =
+    (\k.[\x.y](\m.m[(\x.z)u]k)) I
+     *
+    [\x.y](\m.m[(\x.z)u] I) =
+    (\k.k(\x.y))(\m.m[(\x.z)u] I)
+     *           *
+    (\x.y)[(\x.z)u] I           --A--
+     *
+    y I
+
+The application to `I` unlocks the leftmost functor.  Because that
+functor (`\x.y`) throws away its argument (consider the reduction in the
+line marked (A)), we never need to expand the
+CPS transform of the argument.  This means that we never bother to
+reduce redexes inside the argument.
+
+Compare with a call-by-value xform:
+
+    {x} = \k.kx
+    {\aM} = \k.k(\a{M})
+    {MN} = \k.{M}(\m.{N}(\n.mnk))
+
+This time the reduction unfolds in a different manner:
+
+    {(\x.y)((\x.z)u)} I =
+    (\k.{\x.y}(\m.{(\x.z)u}(\n.mnk))) I
+     *
+    {\x.y}(\m.{(\x.z)u}(\n.mnI)) =
+    (\k.k(\x.{y}))(\m.{(\x.z)u}(\n.mnI))
+     *             *
+    {(\x.z)u}(\n.(\x.{y})nI) =
+    (\k.{\x.z}(\m.{u}(\n.mnk)))(\n.(\x.{y})nI)
+     *
+    {\x.z}(\m.{u}(\n.mn(\n.(\x.{y})nI))) =
+    (\k.k(\x.{z}))(\m.{u}(\n.mn(\n.(\x.{y})nI)))
+     *             *
+    {u}(\n.(\x.{z})n(\n.(\x.{y})nI)) =
+    (\k.ku)(\n.(\x.{z})n(\n.(\x.{y})nI))
+     *      *
+    (\x.{z})u(\n.(\x.{y})nI)       --A--
+     *
+    {z}(\n.(\x.{y})nI) =
+    (\k.kz)(\n.(\x.{y})nI)
+     *      *
+    (\x.{y})zI
+     *
+    {y}I =
+    (\k.ky)I
+     *
+    I y
+
+In this case, the argument does get evaluated: consider the reduction
+in the line marked (A).
+
+Both xforms make the following guarantee: as long as redexes
+underneath a lambda are never evaluated, there will be at most one
+reduction available at any step in the evaluation.
+That is, all choice is removed from the evaluation process.
+
+Now let's verify that the CBN CPS avoids the infinite reduction path
+discussed above (remember that `w = \x.xx`):
+
+    [(\x.y)(ww)] I =
+    (\k.[\x.y](\m.m[ww]k)) I
+     *
+    [\x.y](\m.m[ww]I) =
+    (\k.k(\x.y))(\m.m[ww]I)
+     *             *
+    (\x.y)[ww]I
+     *
+    y I
+
+
+Questions and exercises:
+
+1. Prove that {(\x.y)(ww)} does not terminate.
+
+2. Why is the CBN xform for variables `[x] = x` instead of something
+involving kappas (i.e., `k`'s)?  
+
+3. Write an Ocaml function that takes a lambda term and returns a
+CPS-xformed lambda term.  You can use the following data declaration:
+
+    type form = Var of char | Abs of char * form | App of form * form;;
+
+4. The discussion above talks about the "leftmost" redex, or the
+"rightmost".  But these words apply accurately only in a special set
+of terms.  Characterize the order of evaluation for CBN (likewise, for
+CBV) more completely and carefully.
+
+5. What happens (in terms of evaluation order) when the application
+rule for CBV CPS is changed to `{MN} = \k.{N}(\n.{M}(\m.mnk))`?
+
+6. A term and its CPS xform are different lambda terms.  Yet in some
+sense they "do" the same thing computationally.  Make this sense
+precise.
+
+
+Thinking through the types
+--------------------------
+
+This discussion is based on [Meyer and Wand 1985](http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.44.7943&rep=rep1&type=pdf).
+
+Let's say we're working in the simply-typed lambda calculus.
+Then if the original term is well-typed, the CPS xform will also be
+well-typed.  But what will the type of the transformed term be?
+
+The transformed terms all have the form `\k.blah`.  The rule for the
+CBN xform of a variable appears to be an exception, but instead of
+writing `[x] = x`, we can write `[x] = \k.xk`, which is
+eta-equivalent.  The `k`'s are continuations: functions from something
+to a result.  Let's use &sigma; as the result type.  The each `k` in
+the transform will be a function of type &rho; --> &sigma; for some
+choice of &rho;.
+
+We'll need an ancilliary function ': for any ground type a, a' = a;
+for functional types a->b, (a->b)' = ((a' -> &sigma;) -> &sigma;) -> (b' -> &sigma;) -> &sigma;.
+
+    Call by name transform
+
+    Terms                            Types
+
+    [x] = \k.xk                      [a] = (a'->o)->o
+    [\xM] = \k.k(\x[M])              [a->b] = ((a->b)'->o)->o
+    [MN] = \k.[M](\m.m[N]k)          [b] = (b'->o)->o
+
+Remember that types associate to the right.  Let's work through the
+application xform and make sure the types are consistent.  We'll have
+the following types:
+
+    M:a->b
+    N:a
+    MN:b 
+    k:b'->o
+    [N]:(a'->o)->o
+    m:((a'->o)->o)->(b'->o)->o
+    m[N]:(b'->o)->o
+    m[N]k:o 
+    [M]:((a->b)'->o)->o = ((((a'->o)->o)->(b'->o)->o)->o)->o
+    [M](\m.m[N]k):o
+    [MN]:(b'->o)->o
+
+Be aware that even though the transform uses the same symbol for the
+translation of a variable (i.e., `[x] = x`), in general the variable
+in the transformed term will have a different type than in the source
+term.
+
+Excercise: what should the function ' be for the CBV xform?  Hint: 
+see the Meyer and Wand abstract linked above for the answer.
+
+
+Other CPS transforms
+--------------------
+
+It is easy to think that CBN and CBV are the only two CPS transforms.
+(We've already seen a variant on call-by-value one of the excercises above.) 
+
+In fact, the number of distinct transforms is unbounded.  For
+instance, here is a variant of CBV that uses the same types as CBN:
+
+    <x> = x
+    <\xM> = \k.k(\x<M>)
+    <MN> = \k.<M>(\m.<N>(\n.m(\k.kn)k))
+
+Try reducing `<(\x.x) ((\y.y) (\z.z))> I` to convince yourself that
+this is a version of call-by-value.
+
+Once we have two evaluation strategies that rely on the same types, we
+can mix and match:
+
+    [x] = x
+    <x> = x
+    [\xM] = \k.k(\x<M>)
+    <\xM] = \k.k(\x[M])
+    [MN] = \k.<M>(\m.m<N>k)
+    <MN> = \k.[M](\m.[N](\n.m(\k.kn)k))
+
+This xform interleaves call-by-name and call-by-value in layers,
+according to the depth of embedding.
+(Cf. page 4 of Reynold's 1974 paper ftp://ftp.cs.cmu.edu/user/jcr/reldircont.pdf (equation (4) and the
+explanation in the paragraph below.)