changes

[lambda.git] / cps.mdwn
diff --git a/cps.mdwn b/cps.mdwn

index 7f037ca..73edf0d 100644 (file)
--- a/cps.mdwn
+++ b/cps.mdwn
@@ -5,13 +5,14 @@ We know that evaluation order matters.  We're beginning to learn how
  to gain some control over order of evaluation (think of Jim's abort handler).
  We continue to reason about order of evaluation.
  
-A superbly clear and lucid discussion can be found here:
-[Sestoft: Demonstrating Lambda Calculus Reduction](http://tinyurl.com/27nd3ub).
-Sestoft also provides a really lovely lambda evaluator, 
+A lucid discussion of evaluation order in the
+context of the lambda calculus can be found here:
+[Sestoft: Demonstrating Lambda Calculus Reduction](http://www.itu.dk/~sestoft/papers/mfps2001-sestoft.pdf).
+Sestoft also provides a lovely on-line lambda evaluator:
+[Sestoft: Lambda calculus reduction workbench]
+(http://www.itu.dk/~sestoft/lamreduce/index.html),
  which allows you to select multiple evaluation strategies, 
-and to see reductions happen step by step:
-[Sestoft's lambda reduction webpage](http://ellemose.dina.kvl.dk/~sestoft/lamreduce/lamframes.html).
-
+and to see reductions happen step by step.
  
  Evaluation order matters
  ------------------------
@@ -59,67 +60,155 @@ Y (\f n. blah) =
  And we never get the recursion off the ground.
  
  
-Restricting evaluation: call by name, call by value
----------------------------------------------------
+Using a Continuation Passing Style transform to control order of evaluation
+---------------------------------------------------------------------------
  
-One way to begin to gain some control is by adjusting our notion of
-beta reduction.  This strategy has some of the flavor of adding types,
-but is actually a part of the untyped calculus.
+We'll exhibit and explore the technique of transforming a lambda term
+using a Continuation Passing Style transform (CPS), then we'll explore
+what the CPS is doing, and how.
  
-In order to have a precise discussion, we'll need some vocabulary.
-We've been talking about normal form (lambda terms that can't be
-further reduced), and leftmost evaluation (the reduction strategy of
-always choosing the left most reducible lambda); we'll need to refine
-both of those concepts.
+In order for the CPS to work, we have to adopt a new restriction on
+beta reduction: beta reduction does not occur underneath a lambda.
+That is, `(\x.y)z` reduces to `z`, but `\w.(\x.y)z` does not, because
+the `\w` protects the redex in the body from reduction.  
  
-Kinds of normal form:
----------------------
+Start with a simple form that has two different reduction paths:
  
-Recall that there are three kinds of lambda term.  Let `a` be an
-arbitrary variable, and let `M` and `N` be arbitrary terms:
+reducing the leftmost lambda first: `(\x.y)((\x.z)w)  ~~> y'
  
-<pre>
-The pure untyped lambda calculus (again):
+reducing the rightmost lambda first: `(\x.y)((\x.z)w)  ~~> (x.y)z ~~> y'
  
-             Form     Examples  
-Variable     a        x, y, z
-Abstract     \aM      \x.x, \x.y, \x.\y.y
-Application  MN       (x x), ((\x.x) y), ((\x.x)(\y.y))
-</pre>
+After using the following call-by-name CPS transform---and assuming
+that we never evaluate redexes protected by a lambda---only the first
+reduction path will be available: we will have gained control over the
+order in which beta reductions are allowed to be performed.
+
+Here's the CPS transform:
+
+    [x] => x
+    [\xM] => \k.k(\x[M])
+    [MN] => \k.[M](\m.m[N]k)
+
+Here's the result of applying the transform to our problem term:
+
+    [(\x.y)((\x.z)w)]
+    \k.[\x.y](\m.m[(\x.z)w]k)
+    \k.(\k.k(\x.[y]))(\m.m(\k.[\x.z](\m.m[w]k))k)
+    \k.(\k.k(\x.y))(\m.m(\k.(\k.k(\x.z))(\m.mwk))k)
+
+Because the initial `\k` protects the entire transformed term, 
+we can't perform any reductions.  In order to see the computation
+unfold, we have to apply the transformed term to a trivial
+continuation, usually the identity function `I = \x.x`.
+
+    [(\x.y)((\x.z)w)] I
+    \k.[\x.y](\m.m[(\x.z)w]k) I
+    [\x.y](\m.m[(\x.z)w] I)
+    (\k.k(\x.y))(\m.m[(\x.z)w] I)
+    (\x.y)[(\x.z)w] I
+    y I
+
+The application to `I` unlocks the leftmost functor.  Because that
+functor (`\x.y`) throws away its argument, we never need to expand the
+CPS transform of the argument.
+
+Compare with a call-by-value xform:
+
+    <x> => \k.kx
+    <\aM> => \k.k(\a<M>)
+    <MN> => \k.<M>(\m.<N>(\n.mnk))
+
+This time the reduction unfolds in a different manner:
  
-It will be helpful to define a *redex*, which is a lambda term of
-the form `((\aM) N)`.  That is, a redex is a form for which beta
-reduction is defined (ignoring, as usual, the manouvering required in
-order to avoid variable collision).
+    <(\x.y)((\x.z)w)> I
+    (\k.<\x.y>(\m.<(\x.z)w>(\n.mnk))) I
+    <\x.y>(\m.<(\x.z)w>(\n.mnI))
+    (\k.k(\x.<y>))(\m.<(\x.z)w>(\n.mnI))
+    <(\x.z)w>(\n.(\x.<y>)nI)
+    (\k.<\x.z>(\m.<w>(\n.mnk)))(\n.(\x.<y>)nI)
+    <\x.z>(\m.<w>(\n.mn(\n.(\x.<y>)nI)))
+    (\k.k(\x.<z>))(\m.<w>(\n.mn(\n.(\x.<y>)nI)))
+    <w>(\n.(\x.<z>)n(\n.(\x.<y>)nI))
+    (\k.kw)(\n.(\x.<z>)n(\n.(\x.<y>)nI))
+    (\x.<z>)w(\n.(\x.<y>)nI)
+    <z>(\n.(\x.<y>)nI)
+    (\k.kz)(\n.(\x.<y>)nI)
+    (\x.<y>)zI
+    <y>I
+    (\k.ky)I
+    I y
  
-It will also be helpful to have names for the two components of an
-application `(M N)`: we'll call `M` the *functor*, and `N` the
-*argument*.  Note that functor position can be occupied by a variable,
-since `x` is in the functor position of the term `(x y)`.
+Both xforms make the following guarantee: as long as redexes
+underneath a lambda are never evaluated, there will be at most one
+reduction avaialble at any step in the evaluation.
+That is, all choice is removed from the evaluation process.
  
-Ok, with that vocabulary, we can distinguish four different types of 
-normal form:
+Questions and excercises:
  
+1. Why is the CBN xform for variables `[x] = x' instead of something
+involving kappas?  
  
+2. Write an Ocaml function that takes a lambda term and returns a
+CPS-xformed lambda term.
  
+    type form = Var of char | Abs of char * form | App of form * form;;
  
+3. What happens (in terms of evaluation order) when the application
+rule for CBN CPS is changed to `[MN] = \k.[N](\n.[M]nk)`?  Likewise,
+What happens when the application rule for CBV CPS is changed to `<MN>
+= \k.[N](\n.[M](\m.mnk))'?
  
+4. What happens when the application rules for the CPS xforms are changed to
  
+    [MN] = \k.<M>(\m.m<N>k)
+    <MN> = \k.[M](\m.[N](\n.mnk))
  
-Take Plotkin's CBN CPS:
  
-[x] ~~> x
-[\xM] ~~> \k.k(\x[M])
-[MN] ~~> \k.[M](\m.m[N]k)
+Thinking through the types
+--------------------------
+
+This discussion is based on [Meyer and Wand 1985](http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.44.7943&rep=rep1&type=pdf).
  
-let w = \x.xx in
+Let's say we're working in the simply-typed lambda calculus.
+Then if the original term is well-typed, the CPS xform will also be
+well-typed.  But what will the type of the transformed term be?
  
-[(\xy)(ww)] ~~>
-\k.[\xy](\m.m[ww]k) ~~>
-\k.[\xy](\m.m(\k.[w](\m.m[w]k))k) ~~>
-\k.[\xy](\m.m(\k.(\k.k(\x[xx]))(\m.m[w]k))k) ~~> beta*
-\k.[\xy](\m.m(\k.(\x[xx])[w]k)k) ~~>
-\k.[\xy](\m.m(\k.(\x(\k.[x](\m.m[x]k)))[w]k)k) ~~>
-\k.[\xy](\m.m(\k.(\x(\k.x(\m.mxk)))[w]k)k) ~~> beta
-\k.[\xy](\m.m(\k.[w](\m.m[w]k))k) --- same as second line!
+The transformed terms all have the form `\k.blah`.  The rule for the
+CBN xform of a variable appears to be an exception, but instead of
+writing `[x] => x`, we can write `[x] => \k.xk`, which is
+eta-equivalent.  The `k`'s are continuations: functions from something
+to a result.  Let's use $sigma; as the result type.  The each `k` in
+the transform will be a function of type `&rho; --> &sigma;` for some
+choice of &rho;.
+
+We'll need an ancilliary function ': for any ground type a, a' = a;
+for functional types a->b, (a->b)' = a' -> (b' -> o) -> o.
+
+    Call by name transform
+
+    Terms                            Types
+
+    [x] => \k.xk                     [a] => (a'->o)->o
+    [\xM] => \k.k(\x[M])             [a->b] => ((a->b)'->o)->o
+    [MN] => \k.[M](\m.m[N]k)         [b] => (b'->o)->o
+
+Remember that types associate to the right.  Let's work through the
+application xform and make sure the types are consistent.  We'll have
+the following types:
+
+    M:a->b
+    N:a
+    MN:b 
+    k:b'->o
+    [N]:a'
+    m:a'->(b'->o)->o
+    m[N]:(b'->o)->o
+    m[N]k:o 
+    [M]:((a->b)'->o)->o = ((a'->(b'->o)->o)->o)->o
+    [M](\m.m[N]k):o
+    [MN]:(b'->o)->o
+
+Note that even though the transform uses the same symbol for the
+translation of a variable, in general it will have a different type in
+the transformed term.