cps.mdwn

   1 Gaining control over order of evaluation
   2 ----------------------------------------
   3
   4 We know that evaluation order matters.  We're beginning to learn how
   5 to gain some control over order of evaluation (think of Jim's abort handler).
   6 We continue to reason about order of evaluation.
   7
   8 A lucid discussion of evaluation order in the
   9 context of the lambda calculus can be found here:
  10 [Sestoft: Demonstrating Lambda Calculus Reduction](http://www.itu.dk/~sestoft/papers/mfps2001-sestoft.pdf).
  11 Sestoft also provides a lovely on-line lambda evaluator:
  12 [Sestoft: Lambda calculus reduction workbench](http://www.itu.dk/~sestoft/lamreduce/index.html),
  13 which allows you to select multiple evaluation strategies,
  14 and to see reductions happen step by step.
  15
  16 Evaluation order matters
  17 ------------------------
  18
  19 We've seen this many times.  For instance, consider the following
  20 reductions.  It will be convenient to use the abbreviation `w =
  21 \x.xx`.  I'll
  22 indicate which lambda is about to be reduced with a * underneath:
  23
  24 <pre>
  25 (\x.y)(ww)
  26  *
  27 y
  28 </pre>
  29
  30 Done!  We have a normal form.  But if we reduce using a different
  31 strategy, things go wrong:
  32
  33 <pre>
  34 (\x.y)(ww) =
  35 (\x.y)((\x.xx)w) =
  36         *
  37 (\x.y)(ww) =
  38 (\x.y)((\x.xx)w) =
  39         *
  40 (\x.y)(ww)
  41 </pre>
  42
  43 Etc.
  44
  45 As a second reminder of when evaluation order matters, consider using
  46 `Y = \f.(\h.f(hh))(\h.f(hh))` as a fixed point combinator to define a recursive function:
  47
  48 <pre>
  49 Y (\f n. blah) =
  50 (\f.(\h.f(hh))(\h.f(hh))) (\f n. blah)
  51      *
  52 (\f.f((\h.f(hh))(\h.f(hh)))) (\f n. blah)
  53        *
  54 (\f.f(f((\h.f(hh))(\h.f(hh))))) (\f n. blah)
  55          *
  56 (\f.f(f(f((\h.f(hh))(\h.f(hh)))))) (\f n. blah)
  57 </pre>
  58
  59 And we never get the recursion off the ground.
  60
  61
  62 Using a Continuation Passing Style transform to control order of evaluation
  63 ---------------------------------------------------------------------------
  64
  65 We'll present a technique for controlling evaluation order by transforming a lambda term
  66 using a Continuation Passing Style transform (CPS), then we'll explore
  67 what the CPS is doing, and how.
  68
  69 In order for the CPS to work, we have to adopt a new restriction on
  70 beta reduction: beta reduction does not occur underneath a lambda.
  71 That is, `(\x.y)z` reduces to `z`, but `\u.(\x.y)z` does not reduce to
  72 `\w.z`, because the `\w` protects the redex in the body from
  73 reduction.  (In this context, a redex is a part of a term that matches
  74 the pattern `...((\xM)N)...`, i.e., something that can potentially be
  75 the target of beta reduction.)
  76
  77 Start with a simple form that has two different reduction paths:
  78
  79 reducing the leftmost lambda first: `(\x.y)((\x.z)w)  ~~> y`
  80
  81 reducing the rightmost lambda first: `(\x.y)((\x.z)w)  ~~> (\x.y)z ~~> y`
  82
  83 After using the following call-by-name CPS transform---and assuming
  84 that we never evaluate redexes protected by a lambda---only the first
  85 reduction path will be available: we will have gained control over the
  86 order in which beta reductions are allowed to be performed.
  87
  88 Here's the CPS transform defined:
  89
  90     [x] = x
  91     [\xM] = \k.k(\x[M])
  92     [MN] = \k.[M](\m.m[N]k)
  93
  94 Here's the result of applying the transform to our problem term:
  95
  96     [(\x.y)((\x.z)u)] =
  97     \k.[\x.y](\m.m[(\x.z)u]k) =
  98     \k.(\k.k(\x.[y]))(\m.m(\k.[\x.z](\m.m[u]k))k) =
  99     \k.(\k.k(\x.y))(\m.m(\k.(\k.k(\x.z))(\m.muk))k)
 100
 101 Because the initial `\k` protects (i.e., takes scope over) the entire
 102 transformed term, we can't perform any reductions.  In order to watch
 103 the computation unfold, we have to apply the transformed term to a
 104 trivial continuation, usually the identity function `I = \x.x`.
 105
 106     [(\x.y)((\x.z)u)] I =
 107     (\k.[\x.y](\m.m[(\x.z)u]k)) I
 108      *
 109     [\x.y](\m.m[(\x.z)u] I) =
 110     (\k.k(\x.y))(\m.m[(\x.z)u] I)
 111      *           *
 112     (\x.y)[(\x.z)u] I
 113      *
 114     y I
 115
 116 The application to `I` unlocks the leftmost functor.  Because that
 117 functor (`\x.y`) throws away its argument, we never need to expand the
 118 CPS transform of the argument.
 119
 120 Compare with a call-by-value xform:
 121
 122     {x} = \k.kx
 123     {\aM} = \k.k(\a{M})
 124     {MN} = \k.{M}(\m.{N}(\n.mnk))
 125
 126 This time the reduction unfolds in a different manner:
 127
 128     {(\x.y)((\x.z)w)} I =
 129     (\k.{\x.y}(\m.{(\x.z)u}(\n.mnk))) I
 130      *
 131     {\x.y}(\m.{(\x.z)u}(\n.mnI)) =
 132     (\k.k(\x.{y}))(\m.{(\x.z)u}(\n.mnI))
 133      *             *
 134     {(\x.z)u}(\n.(\x.{y})nI) =
 135     (\k.{\x.z}(\m.{u}(\n.mnk)))(\n.(\x.{y})nI)
 136      *
 137     {\x.z}(\m.{u}(\n.mn(\n.(\x.{y})nI))) =
 138     (\k.k(\x.{z}))(\m.{u}(\n.mn(\n.(\x.{y})nI)))
 139      *             *
 140     {u}(\n.(\x.{z})n(\n.(\x.{y})nI)) =
 141     (\k.ku)(\n.(\x.{z})n(\n.(\x.{y})nI))
 142      *      *
 143     (\x.{z})u(\n.(\x.{y})nI)
 144      *
 145     {z}(\n.(\x.{y})nI) =
 146     (\k.kz)(\n.(\x.{y})nI)
 147      *      *
 148     (\x.{y})zI
 149      *
 150     {y}I =
 151     (\k.ky)I
 152      *
 153     I y
 154
 155 Both xforms make the following guarantee: as long as redexes
 156 underneath a lambda are never evaluated, there will be at most one
 157 reduction available at any step in the evaluation.
 158 That is, all choice is removed from the evaluation process.
 159
 160 Now let's verify that the CBN CPS avoids the infinite reduction path
 161 discussed above (remember that `w = \x.xx`):
 162
 163     [(\x.y)(ww)] I =
 164     (\k.[\x.y](\m.m[ww]k)) I
 165      *
 166     [\x.y](\m.m[ww]I) =
 167     (\k.k(\x.y))(\m.m[ww]I)
 168      *             *
 169     (\x.y)[ww]I
 170      *
 171     y I
 172
 173
 174 Questions and exercises:
 175
 176 1. Prove that {(\x.y)(ww)} does not terminate.
 177
 178 2. Why is the CBN xform for variables `[x] = x' instead of something
 179 involving kappas?
 180
 181 3. Write an Ocaml function that takes a lambda term and returns a
 182 CPS-xformed lambda term.  You can use the following data declaration:
 183
 184     type form = Var of char | Abs of char * form | App of form * form;;
 185
 186 4. The discussion above talks about the "leftmost" redex, or the
 187 "rightmost".  But these words apply accurately only in a special set
 188 of terms.  Characterize the order of evaluation for CBN (likewise, for
 189 CBV) more completely and carefully.
 190
 191 5. What happens (in terms of evaluation order) when the application
 192 rule for CBV CPS is changed to `{MN} = \k.{N}(\n.{M}(\m.mnk))`?
 193
 194
 195 Thinking through the types
 196 --------------------------
 197
 198 This discussion is based on [Meyer and Wand 1985](http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.44.7943&rep=rep1&type=pdf).
 199
 200 Let's say we're working in the simply-typed lambda calculus.
 201 Then if the original term is well-typed, the CPS xform will also be
 202 well-typed.  But what will the type of the transformed term be?
 203
 204 The transformed terms all have the form `\k.blah`.  The rule for the
 205 CBN xform of a variable appears to be an exception, but instead of
 206 writing `[x] = x`, we can write `[x] = \k.xk`, which is
 207 eta-equivalent.  The `k`'s are continuations: functions from something
 208 to a result.  Let's use &sigma; as the result type.  The each `k` in
 209 the transform will be a function of type &rho; --> &sigma; for some
 210 choice of &rho;.
 211
 212 We'll need an ancilliary function ': for any ground type a, a' = a;
 213 for functional types a->b, (a->b)' = ((a' -> &sigma;) -> &sigma;) -> (b' -> &sigma;) -> &sigma;.
 214
 215     Call by name transform
 216
 217     Terms                            Types
 218
 219     [x] = \k.xk                      [a] = (a'->&sigma;)->&sigma;
 220     [\xM] = \k.k(\x[M])              [a->b] = ((a->b)'->&sigma;)->&sigma;
 221     [MN] = \k.[M](\m.m[N]k)          [b] = (b'->&sigma;)->&sigma;
 222
 223 Remember that types associate to the right.  Let's work through the
 224 application xform and make sure the types are consistent.  We'll have
 225 the following types:
 226
 227     M:a->b
 228     N:a
 229     MN:b
 230     k:b'->&sigma;
 231     [N]:(a'->&sigma;)->&sigma;
 232     m:((a'->&sigma;)->&sigma;)->(b'->&sigma;)->&sigma;
 233     m[N]:(b'->&sigma;)->&sigma;
 234     m[N]k:&sigma;
 235     [M]:((a->b)'->&sigma;)->&sigma; = ((((a'->&sigma;)->&sigma;)->(b'->&sigma;)->&sigma;)->&sigma;)->&sigma;
 236     [M](\m.m[N]k):&sigma;
 237     [MN]:(b'->&sigma;)->&sigma;
 238
 239 Be aware that even though the transform uses the same symbol for the
 240 translation of a variable (i.e., `[x] = x`), in general the variable
 241 in the transformed term will have a different type than in the source
 242 term.
 243
 244 Excercise: what should the function ' be for the CBV xform?  Hint:
 245 see the Meyer and Wand abstract linked above for the answer.