cps.mdwn

   1 Gaining control over order of evaluation
   2 ----------------------------------------
   3
   4 We know that evaluation order matters.  We're beginning to learn how
   5 to gain some control over order of evaluation (think of Jim's abort handler).
   6 We continue to reason about order of evaluation.
   7
   8 A lucid discussion of evaluation order in the
   9 context of the lambda calculus can be found here:
  10 [Sestoft: Demonstrating Lambda Calculus Reduction](http://www.itu.dk/~sestoft/papers/mfps2001-sestoft.pdf).
  11 Sestoft also provides a lovely on-line lambda evaluator:
  12 [Sestoft: Lambda calculus reduction workbench]
  13 (http://www.itu.dk/~sestoft/lamreduce/index.html),
  14 which allows you to select multiple evaluation strategies,
  15 and to see reductions happen step by step.
  16
  17 Evaluation order matters
  18 ------------------------
  19
  20 We've seen this many times.  For instance, consider the following
  21 reductions.  It will be convenient to use the abbreviation `w =
  22 \x.xx`.  I'll indicate which lambda is about to be reduced with a *
  23 underneath:
  24
  25 <pre>
  26 (\x.y)(ww)
  27  *
  28 y
  29 </pre>
  30
  31 Done!  We have a normal form.  But if we reduce using a different
  32 strategy, things go wrong:
  33
  34 <pre>
  35 (\x.y)(ww) =
  36 (\x.y)((\x.xx)w) =
  37         *
  38 (\x.y)(ww) =
  39 (\x.y)((\x.xx)w) =
  40         *
  41 (\x.y)(ww)
  42 </pre>
  43
  44 Etc.
  45
  46 As a second reminder of when evaluation order matters, consider using
  47 `Y = \f.(\h.f(hh))(\h.f(hh))` as a fixed point combinator to define a recursive function:
  48
  49 <pre>
  50 Y (\f n. blah) =
  51 (\f.(\h.f(hh))(\h.f(hh))) (\f n. blah)
  52      *
  53 (\f.f((\h.f(hh))(\h.f(hh)))) (\f n. blah)
  54        *
  55 (\f.f(f((\h.f(hh))(\h.f(hh))))) (\f n. blah)
  56          *
  57 (\f.f(f(f((\h.f(hh))(\h.f(hh)))))) (\f n. blah)
  58 </pre>
  59
  60 And we never get the recursion off the ground.
  61
  62
  63 Using a Continuation Passing Style transform to control order of evaluation
  64 ---------------------------------------------------------------------------
  65
  66 We'll exhibit and explore the technique of transforming a lambda term
  67 using a Continuation Passing Style transform (CPS), then we'll explore
  68 what the CPS is doing, and how.
  69
  70 In order for the CPS to work, we have to adopt a new restriction on
  71 beta reduction: beta reduction does not occur underneath a lambda.
  72 That is, `(\x.y)z` reduces to `z`, but `\w.(\x.y)z` does not, because
  73 the `\w` protects the redex in the body from reduction.
  74
  75 Start with a simple form that has two different reduction paths:
  76
  77 reducing the leftmost lambda first: `(\x.y)((\x.z)w)  ~~> y'
  78
  79 reducing the rightmost lambda first: `(\x.y)((\x.z)w)  ~~> (x.y)z ~~> y'
  80
  81 After using the following call-by-name CPS transform---and assuming
  82 that we never evaluate redexes protected by a lambda---only the first
  83 reduction path will be available: we will have gained control over the
  84 order in which beta reductions are allowed to be performed.
  85
  86 Here's the CPS transform:
  87
  88     [x] => x
  89     [\xM] => \k.k(\x[M])
  90     [MN] => \k.[M](\m.m[N]k)
  91
  92 Here's the result of applying the transform to our problem term:
  93
  94     [(\x.y)((\x.z)w)]
  95     \k.[\x.y](\m.m[(\x.z)w]k)
  96     \k.(\k.k(\x.[y]))(\m.m(\k.[\x.z](\m.m[w]k))k)
  97     \k.(\k.k(\x.y))(\m.m(\k.(\k.k(\x.z))(\m.mwk))k)
  98
  99 Because the initial `\k` protects the entire transformed term,
 100 we can't perform any reductions.  In order to see the computation
 101 unfold, we have to apply the transformed term to a trivial
 102 continuation, usually the identity function `I = \x.x`.
 103
 104     [(\x.y)((\x.z)w)] I
 105     \k.[\x.y](\m.m[(\x.z)w]k) I
 106     [\x.y](\m.m[(\x.z)w] I)
 107     (\k.k(\x.y))(\m.m[(\x.z)w] I)
 108     (\x.y)[(\x.z)w] I
 109     y I
 110
 111 The application to `I` unlocks the leftmost functor.  Because that
 112 functor (`\x.y`) throws away its argument, we never need to expand the
 113 CPS transform of the argument.
 114
 115 Compare with a call-by-value xform:
 116
 117     <x> => \k.kx
 118     <\aM> => \k.k(\a<M>)
 119     <MN> => \k.<M>(\m.<N>(\n.mnk))
 120
 121 This time the reduction unfolds in a different manner:
 122
 123     <(\x.y)((\x.z)w)> I
 124     (\k.<\x.y>(\m.<(\x.z)w>(\n.mnk))) I
 125     <\x.y>(\m.<(\x.z)w>(\n.mnI))
 126     (\k.k(\x.<y>))(\m.<(\x.z)w>(\n.mnI))
 127     <(\x.z)w>(\n.(\x.<y>)nI)
 128     (\k.<\x.z>(\m.<w>(\n.mnk)))(\n.(\x.<y>)nI)
 129     <\x.z>(\m.<w>(\n.mn(\n.(\x.<y>)nI)))
 130     (\k.k(\x.<z>))(\m.<w>(\n.mn(\n.(\x.<y>)nI)))
 131     <w>(\n.(\x.<z>)n(\n.(\x.<y>)nI))
 132     (\k.kw)(\n.(\x.<z>)n(\n.(\x.<y>)nI))
 133     (\x.<z>)w(\n.(\x.<y>)nI)
 134     <z>(\n.(\x.<y>)nI)
 135     (\k.kz)(\n.(\x.<y>)nI)
 136     (\x.<y>)zI
 137     <y>I
 138     (\k.ky)I
 139     I y
 140
 141 Both xforms make the following guarantee: as long as redexes
 142 underneath a lambda are never evaluated, there will be at most one
 143 reduction avaialble at any step in the evaluation.
 144 That is, all choice is removed from the evaluation process.
 145
 146 Questions and excercises:
 147
 148 1. Why is the CBN xform for variables `[x] = x' instead of something
 149 involving kappas?
 150
 151 2. Write an Ocaml function that takes a lambda term and returns a
 152 CPS-xformed lambda term.
 153
 154     type form = Var of char | Abs of char * form | App of form * form;;
 155
 156 3. What happens (in terms of evaluation order) when the application
 157 rule for CBN CPS is changed to `[MN] = \k.[N](\n.[M]nk)`?  Likewise,
 158 What happens when the application rule for CBV CPS is changed to `<MN>
 159 = \k.[N](\n.[M](\m.mnk))'?
 160
 161 4. What happens when the application rules for the CPS xforms are changed to
 162
 163     [MN] = \k.<M>(\m.m<N>k)
 164     <MN> = \k.[M](\m.[N](\n.mnk))
 165
 166
 167 Thinking through the types
 168 --------------------------
 169
 170 This discussion is based on [Meyer and Wand 1985](http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.44.7943&rep=rep1&type=pdf).
 171
 172 Let's say we're working in the simply-typed lambda calculus.
 173 Then if the original term is well-typed, the CPS xform will also be
 174 well-typed.  But what will the type of the transformed term be?
 175
 176 The transformed terms all have the form `\k.blah`.  The rule for the
 177 CBN xform of a variable appears to be an exception, but instead of
 178 writing `[x] => x`, we can write `[x] => \k.xk`, which is
 179 eta-equivalent.  The `k`'s are continuations: functions from something
 180 to a result.  Let's use $sigma; as the result type.  The each `k` in
 181 the transform will be a function of type `&rho; --> &sigma;` for some
 182 choice of &rho;.
 183
 184 We'll need an ancilliary function ': for any ground type a, a' = a;
 185 for functional types a->b, (a->b)' = a' -> (b' -> o) -> o.
 186
 187     Call by name transform
 188
 189     Terms                            Types
 190
 191     [x] => \k.xk                     [a] => (a'->o)->o
 192     [\xM] => \k.k(\x[M])             [a->b] => ((a->b)'->o)->o
 193     [MN] => \k.[M](\m.m[N]k)         [b] => (b'->o)->o
 194
 195 Remember that types associate to the right.  Let's work through the
 196 application xform and make sure the types are consistent.  We'll have
 197 the following types:
 198
 199     M:a->b
 200     N:a
 201     MN:b
 202     k:b'->o
 203     [N]:a'
 204     m:a'->(b'->o)->o
 205     m[N]:(b'->o)->o
 206     m[N]k:o
 207     [M]:((a->b)'->o)->o = ((a'->(b'->o)->o)->o)->o
 208     [M](\m.m[N]k):o
 209     [MN]:(b'->o)->o
 210
 211 Note that even though the transform uses the same symbol for the
 212 translation of a variable, in general it will have a different type in
 213 the transformed term.
 214