[lambda.git] / cps.mdwn
diff --git a/cps.mdwn b/cps.mdwn

index f869b79..d7752f4 100644
--- a/cps.mdwn
+++ b/cps.mdwn
@@ -69,16 +69,16 @@ what the CPS is doing, and how.
  In order for the CPS to work, we have to adopt a new restriction on
  beta reduction: beta reduction does not occur underneath a lambda.
  That is, `(\x.y)z` reduces to `y`, but `\u.(\x.y)z` does not reduce to
-`\w.z`, because the `\w` protects the redex in the body from
-reduction.  (In this context, a redex is a part of a term that matches
+`\u.y`, because the `\u` protects the redex in the body from
+reduction.  (In this context, a "redex" is a part of a term that matches
  the pattern `...((\xM)N)...`, i.e., something that can potentially be
  the target of beta reduction.)
  
  Start with a simple form that has two different reduction paths:
  
-reducing the leftmost lambda first: `(\x.y)((\x.z)w)  ~~> y`
+reducing the leftmost lambda first: `(\x.y)((\x.z)u)  ~~> y`
  
-reducing the rightmost lambda first: `(\x.y)((\x.z)w)  ~~> (\x.y)z ~~> y`
+reducing the rightmost lambda first: `(\x.y)((\x.z)u)  ~~> (\x.y)z ~~> y`
  
  After using the following call-by-name CPS transform---and assuming
  that we never evaluate redexes protected by a lambda---only the first
@@ -91,7 +91,7 @@ Here's the CPS transform defined:
      [\xM] = \k.k(\x[M])
      [MN] = \k.[M](\m.m[N]k)
  
-Here's the result of applying the transform to our problem term:
+Here's the result of applying the transform to our simple example:
  
      [(\x.y)((\x.z)u)] =
      \k.[\x.y](\m.m[(\x.z)u]k) =
@@ -109,13 +109,15 @@ trivial continuation, usually the identity function `I = \x.x`.
      [\x.y](\m.m[(\x.z)u] I) =
      (\k.k(\x.y))(\m.m[(\x.z)u] I)
       *           *
-    (\x.y)[(\x.z)u] I
+    (\x.y)[(\x.z)u] I           --A--
       *
      y I
  
  The application to `I` unlocks the leftmost functor.  Because that
-functor (`\x.y`) throws away its argument, we never need to expand the
-CPS transform of the argument.
+functor (`\x.y`) throws away its argument (consider the reduction in the
+line marked (A)), we never need to expand the
+CPS transform of the argument.  This means that we never bother to
+reduce redexes inside the argument.
  
  Compare with a call-by-value xform:
  
@@ -125,7 +127,7 @@ Compare with a call-by-value xform:
  
  This time the reduction unfolds in a different manner:
  
-    {(\x.y)((\x.z)w)} I =
+    {(\x.y)((\x.z)u)} I =
      (\k.{\x.y}(\m.{(\x.z)u}(\n.mnk))) I
       *
      {\x.y}(\m.{(\x.z)u}(\n.mnI)) =
@@ -140,7 +142,7 @@ This time the reduction unfolds in a different manner:
      {u}(\n.(\x.{z})n(\n.(\x.{y})nI)) =
      (\k.ku)(\n.(\x.{z})n(\n.(\x.{y})nI))
       *      *
-    (\x.{z})u(\n.(\x.{y})nI)
+    (\x.{z})u(\n.(\x.{y})nI)       --A--
       *
      {z}(\n.(\x.{y})nI) =
      (\k.kz)(\n.(\x.{y})nI)
@@ -152,6 +154,9 @@ This time the reduction unfolds in a different manner:
       *
      I y
  
+In this case, the argument does get evaluated: consider the reduction
+in the line marked (A).
+
  Both xforms make the following guarantee: as long as redexes
  underneath a lambda are never evaluated, there will be at most one
  reduction available at any step in the evaluation.
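
The guarantee can be checked mechanically on the running example. The following is a minimal Python sketch, not part of the original notes: the tuple encoding of terms and the helper names (`subst`, `exposed_redexes`, `step`, `run`) are hypothetical. The CBV clauses are read off the reduction steps shown above (`{x} = \k.kx`, `{\xM} = \k.k(\x{M})`, `{MN} = \k.{M}(\m.{N}(\n.mnk))`), since the definition itself falls outside this excerpt. It assumes source terms never use `k`, `m`, or `n` as variable names.

```python
import itertools

# Terms: ("var", x) | ("lam", x, body) | ("app", f, a)
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, *args):
    # app(f, a, b) builds the left-nested application (f a) b
    for a in args:
        f = ("app", f, a)
    return f

_fresh = itertools.count()

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def subst(t, x, s):
    """Capture-avoiding substitution t[x := s]."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    y, body = t[1], t[2]
    if y == x:                      # x is shadowed below this binder
        return t
    if y in free_vars(s):           # rename the binder to avoid capture
        z = f"{y}_{next(_fresh)}"
        y, body = z, subst(body, y, var(z))
    return ("lam", y, subst(body, x, s))

def cbn(t):
    # [x] = x;  [\xM] = \k.k(\x[M]);  [MN] = \k.[M](\m.m[N]k)
    if t[0] == "var":
        return t
    if t[0] == "lam":
        return lam("k", app(var("k"), lam(t[1], cbn(t[2]))))
    return lam("k", app(cbn(t[1]), lam("m", app(var("m"), cbn(t[2]), var("k")))))

def cbv(t):
    # {x} = \k.kx;  {\xM} = \k.k(\x{M});  {MN} = \k.{M}(\m.{N}(\n.mnk))
    if t[0] == "var":
        return lam("k", app(var("k"), t))
    if t[0] == "lam":
        return lam("k", app(var("k"), lam(t[1], cbv(t[2]))))
    inner = lam("n", app(var("m"), var("n"), var("k")))
    return lam("k", app(cbv(t[1]), lam("m", app(cbv(t[2]), inner))))

def exposed_redexes(t):
    """Count beta redexes that are not underneath a lambda."""
    if t[0] != "app":
        return 0
    here = 1 if t[1][0] == "lam" else 0
    return here + exposed_redexes(t[1]) + exposed_redexes(t[2])

def step(t):
    """Contract the leftmost redex not under a lambda; None if there is none."""
    if t[0] != "app":
        return None
    f, a = t[1], t[2]
    if f[0] == "lam":
        return subst(f[2], f[1], a)
    r = step(f)
    if r is not None:
        return ("app", r, a)
    r = step(a)
    return None if r is None else ("app", f, r)

def run(t, limit=100):
    """Reduce t, checking the at-most-one-redex guarantee before every step."""
    trace = [t]
    for _ in range(limit):
        assert exposed_redexes(t) <= 1
        nxt = step(t)
        if nxt is None:
            break
        t = nxt
        trace.append(t)
    return trace

I = lam("x", var("x"))
term = app(lam("x", var("y")), app(lam("x", var("z")), var("u")))
cbn_trace = run(app(cbn(term), I))   # ends in  y I
cbv_trace = run(app(cbv(term), I))   # passes through  I y, then ends in  y
```

In this sketch the CBV trace goes one step past the last line of the reduction shown above, because `I y` is itself an unprotected redex and the reducer contracts it to `y`. The CBV trace also contains the state `(\x.{z})u(...)`, the line marked (A), whereas the CBN trace never exposes the argument's redex.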
@@ -175,8 +180,8 @@ Questions and exercises:
  
  1. Prove that {(\x.y)(ww)} does not terminate.
  
-2. Why is the CBN xform for variables `[x] = x' instead of something
-involving kappas?  
+2. Why is the CBN xform for variables `[x] = x` instead of something
+involving kappas (i.e., `k`'s)?  
  
  3. Write an OCaml function that takes a lambda term and returns a
  CPS-xformed lambda term.  You can use the following data declaration:
@@ -191,6 +196,10 @@ CBV) more completely and carefully.
  5. What happens (in terms of evaluation order) when the application
  rule for CBV CPS is changed to `{MN} = \k.{N}(\n.{M}(\m.mnk))`?
  
+6. A term and its CPS xform are different lambda terms.  Yet in some
+sense they "do" the same thing computationally.  Make this sense
+precise.
+
  
  Thinking through the types
  --------------------------
@@ -216,9 +225,9 @@ for functional types a->b, (a->b)' = ((a' -> &sigma;) -> &sigma;) -> (b' -> &sig
  
      Terms                            Types
  
-    [x] = \k.xk                      [a] = (a'->&sigma;)->&sigma;
-    [\xM] = \k.k(\x[M])              [a->b] = ((a->b)'->&sigma;)->&sigma;
-    [MN] = \k.[M](\m.m[N]k)          [b] = (b'->&sigma;)->&sigma;
+    [x] = \k.xk                      [a] = (a'->o)->o
+    [\xM] = \k.k(\x[M])              [a->b] = ((a->b)'->o)->o
+    [MN] = \k.[M](\m.m[N]k)          [b] = (b'->o)->o
  
  Remember that types associate to the right.  Let's work through the
  application xform and make sure the types are consistent.  We'll have
@@ -227,14 +236,14 @@ the following types:
      M:a->b
      N:a
      MN:b 
-    k:b'->&sigma;
-    [N]:(a'->&sigma;)->&sigma;
-    m:((a'->&sigma;)->&sigma;)->(b'->&sigma;)->&sigma;
-    m[N]:(b'->&sigma;)->&sigma;
-    m[N]k:&sigma; 
-    [M]:((a->b)'->&sigma;)->&sigma; = ((((a'->&sigma;)->&sigma;)->(b'->&sigma;)->&sigma;)->&sigma;)->&sigma;
-    [M](\m.m[N]k):&sigma;
-    [MN]:(b'->&sigma;)->&sigma;
+    k:b'->o
+    [N]:(a'->o)->o
+    m:((a'->o)->o)->(b'->o)->o
+    m[N]:(b'->o)->o
+    m[N]k:o 
+    [M]:((a->b)'->o)->o = ((((a'->o)->o)->(b'->o)->o)->o)->o
+    [M](\m.m[N]k):o
+    [MN]:(b'->o)->o
  
  Be aware that even though the transform uses the same symbol for the
  translation of a variable (i.e., `[x] = x`), in general the variable
@@ -243,3 +252,35 @@ term.
  
  Exercise: what should the function ' be for the CBV xform?  Hint: 
  see the Meyer and Wand abstract linked above for the answer.
+
+
+Other CPS transforms
+--------------------
+
+It is easy to think that CBN and CBV are the only two CPS transforms.
+(We've already seen a variant on call-by-value in one of the exercises above.)
+
+In fact, the number of distinct transforms is unbounded.  For
+instance, here is a variant of CBV that uses the same types as CBN:
+
+    <x> = x
+    <\xM> = \k.k(\x<M>)
+    <MN> = \k.<M>(\m.<N>(\n.m(\k.kn)k))
+
+Try reducing `<(\x.x) ((\y.y) (\z.z))> I` to convince yourself that
+this is a version of call-by-value.
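
One way to carry out this reduction is to record the order in which the source-level lambdas get applied. The following is a minimal Python sketch under stated assumptions, not part of the original notes: the term encoding and the helpers (`subst`, `step`, `firing_order`) are hypothetical, the substitution does no renaming (safe here only because the example term is closed, so every substituted term is closed), and the trivial continuation is written `\i.i` so that its binder cannot be confused with the source binder `x`.

```python
# Terms: ("var", x) | ("lam", x, body) | ("app", f, a)
def var(x): return ("var", x)
def lam(x, body): return ("lam", x, body)
def app(f, *args):
    # app(f, a, b) builds the left-nested application (f a) b
    for a in args:
        f = ("app", f, a)
    return f

def cbn(t):
    # [x] = x;  [\xM] = \k.k(\x[M]);  [MN] = \k.[M](\m.m[N]k)
    if t[0] == "var":
        return t
    if t[0] == "lam":
        return lam("k", app(var("k"), lam(t[1], cbn(t[2]))))
    return lam("k", app(cbn(t[1]), lam("m", app(var("m"), cbn(t[2]), var("k")))))

def cbv_variant(t):
    # <x> = x;  <\xM> = \k.k(\x<M>);  <MN> = \k.<M>(\m.<N>(\n.m(\k.kn)k))
    if t[0] == "var":
        return t
    if t[0] == "lam":
        return lam("k", app(var("k"), lam(t[1], cbv_variant(t[2]))))
    wrapped_n = lam("k", app(var("k"), var("n")))          # \k.kn
    inner = lam("n", app(var("m"), wrapped_n, var("k")))   # \n.m(\k.kn)k
    return lam("k", app(cbv_variant(t[1]), lam("m", app(cbv_variant(t[2]), inner))))

def subst(t, x, s):
    """Naive substitution t[x := s]; no renaming, so s must be closed."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    return t if t[1] == x else ("lam", t[1], subst(t[2], x, s))

def step(t):
    """Contract the leftmost redex not under a lambda.
    Returns (new_term, binder_of_the_applied_lambda), or None if stuck."""
    if t[0] != "app":
        return None
    f, a = t[1], t[2]
    if f[0] == "lam":
        return subst(f[2], f[1], a), f[1]
    r = step(f)
    if r is not None:
        return ("app", r[0], a), r[1]
    r = step(a)
    if r is not None:
        return ("app", f, r[0]), r[1]
    return None

def firing_order(t, limit=100):
    """Reduce t; return (final_term, binders of the applied lambdas, in order)."""
    order = []
    for _ in range(limit):
        r = step(t)
        if r is None:
            break
        t, binder = r
        order.append(binder)
    return t, order

I = lam("i", var("i"))
term = app(lam("x", var("x")), app(lam("y", var("y")), lam("z", var("z"))))
source = {"x", "y", "z"}

final_v, fired_v = firing_order(app(cbv_variant(term), I))
final_n, fired_n = firing_order(app(cbn(term), I))
order_v = [b for b in fired_v if b in source]   # ['y', 'x']: argument first
order_n = [b for b in fired_n if b in source]   # ['x', 'y']: function first
```

Both runs end in `\z.z`, but under the variant `\y.y` is applied before `\x.x`, so the argument is reduced to a value before the outer function sees it. Under the CBN transform the order is reversed.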
+
+Once we have two evaluation strategies that rely on the same types, we
+can mix and match:
+
+    [x] = x
+    <x> = x
+    [\xM] = \k.k(\x<M>)
+    <\xM> = \k.k(\x[M])
+    [MN] = \k.<M>(\m.m<N>k)
+    <MN> = \k.[M](\m.[N](\n.m(\k.kn)k))
+
+This xform interleaves call-by-name and call-by-value in layers,
+according to the depth of embedding.
+(Cf. page 4 of Reynolds's 1974 paper, ftp://ftp.cs.cmu.edu/user/jcr/reldircont.pdf, equation (4) and the
+explanation in the paragraph below it.)