topics/_week7_eval_cl.mdwn

   1 <!-- λ Λ ∀ ≡ α β ω Ω -->
   2
   3 [[!toc levels=2]]
   4
   5 # Reasoning about evaluation order in Combinatory Logic
   6
   7 We've discussed evaluation order before, primarily in connection with
   8 the untyped lambda calculus.  Whenever a term has more than one redex,
   9 we have to choose which one to reduce, and this choice can make a
  10 difference.  For instance, in the term `((\x.I)(ωω)`, if we reduce the
  11 leftmost redex first, the term reduces to the normal form `I` in one
  12 step.  But if we reduce the left most redex instead (namely, `(ωω)`),
  13 we do not arrive at a normal form, and are in danger of entering an
  14 infinite loop.
  15
  16 Thanks to the introduction of sum types (disjoint union), we are now
  17 in a position to gain a deeper understanding by writing a program that
  18 allows us to experiment with different evaluation order strategies.
  19
  20 One thing we'll see is that it is all too easy for the evaluation
  21 order properties of an evaluator to depend on the evaluation order
  22 properties of the programming language in which the evaluator is
  23 written.  We will seek to write an evaluator in which the order of
  24 evaluation is insensitive to the evaluator language.  The goal is to
  25 find an order-insensitive way to reason about evaluation order.
  26
  27 The first evaluator we develop will evaluate terms in Combinatory
  28 Logic.  That will significantly simplify the program, since we won't
  29 need to worry about variables or substitution.  As we develop and
  30 extend our evaluator in future weeks, we'll switch to lambdas, but for
  31 now, working with the elegant simplicity of Combinatory Logic will
  32 make the evaluation order issues easier to grasp.
  33
  34 A brief review: a term in CL is the combination of three basic
  35 expressions, `S`, `K`, and `I`, governed by the following reduction
  36 rules:
  37
  38     Ia ~~> a
  39     Kab ~~> b
  40     Sabc ~~> ac(bc)
  41
  42 where `a`, `b`, and `c` stand for an arbitrary term of CL.  We've seen
  43 that how to embed the untyped lambda calculus in CL, so it's no
  44 surprise that the same evaluation order issues arise in CL.
  45
  46     skomega = SII(SII)
  47             ~~> I(SII)(I(SII))
  48             ~~> SII(SII)
  49
  50 Instead of the lambda term `Ω`, we'll use the CL term skomega, defined
  51 here, though we'll use the same symbol, `Ω`.  If we consider the term
  52
  53     KIΩ == KI(SII(SII))
  54
  55 we can choose to reduce the leftmost redex by firing the reduction
  56 rule for `K`, in which case the term reduces to the normal form `I` in
  57 one step; or we can choose to reduce the skomega part, by firing the
  58 reduction rule fo `S`, in which case we do not get a normal form,
  59 we're headed towards an infinite loop.
  60
  61 With sum types, we can define terms in CL in OCaml as follows:
  62
  63     type term = I | S | K | FA of (term * term)
  64
  65     let skomega = FA (FA (FA (S,I), I), FA (FA (S,I), I))
  66     let test1 = FA (FA (K,I), skomega)
  67
  68 This recursive type definition says that a term in CL is either one of
  69 the three simple expressions, or else a pair of CL expressions.
  70 Following Heim and Kratzer, `FA` stands for Functional Application.
  71 With this type definition, we can encode skomega, as well as other
  72 terms whose reduction behavior we want to control.
  73
  74 Using pattern matching, it is easy to code the one-step reduction
  75 rules for CL:
  76
  77     let reduce_one_step (t:term):term = match t with
  78         FA(I,a) -> a
  79       | FA(FA(K,a),b) -> a
  80       | FA(FA(FA(S,a),b),c) -> FA(FA(a,c),FA(b,c))
  81       | _ -> t
  82
  83     # reduce_one_step (FA(FA(K,S),I));;
  84     - : term = S
  85     # reduce_one_step skomega;;
  86     - : term = FA (FA (I, FA (FA (S, I), I)), FA (I, FA (FA (S, I), I)))
  87
  88 The need to explicitly insert the type constructor `FA` obscures
  89 things a bit, but it's still possible to see how the one-step
  90 reduction function is just the reduction rules for CL.  The
  91 OCaml interpreter shows us that the function faithfully recognizes
  92 that `KSI ~~> S`, and `skomega ~~> I(SII)(I(SII))`.
  93
  94 We can now say precisely what it means to be a redex in CL.
  95
  96     let is_redex (t:term):bool = not (t = reduce_one_step t)
  97
  98     # is_redex K;;
  99     - : bool = false
 100     # is_redex (FA(K,I));;
 101     - : bool = false
 102     # is_redex (FA(FA(K,I),S));;
 103     - : bool = true
 104     # is_redex skomega;;
 105     - : book = true
 106
 107 Warning: this definition relies on the fact that the one-step
 108 reduction of a CL term is never identical to the original term.  This
 109 would not work for the untyped lambda calculus, since
 110 `((\x.xx)(\x.xx)) ~~> ((\x.xx)(\x.xx))` in one step.  Note that in
 111 order to decide whether two terms are equal, OCaml has to recursively
 112 compare the elements of complex CL terms.  It is able to figure out
 113 how to do this because we provided an explicit definition of the
 114 datatype Term.
 115
 116 As you would expect, a term in CL is in normal form when it contains
 117 no redexes.
 118
 119 In order to fully reduce a term, we need to be able to reduce redexes
 120 that are not at the top level of the term.
 121
 122     (KKI)SI ~~> KSI ~~> S
 123
 124 That is, we want to be able to first evaluate the redex `KKI` that is
 125 a proper subpart of the larger term to produce a new intermediate term
 126 that we can then evaluate to the final normal form.
 127
 128 Because we need to process subparts, and because the result after
 129 processing a subpart may require further processing, the recursive
 130 structure of our evaluation function has to be quite subtle.  To truly
 131 understand it, you will need to do some sophisticated thinking about
 132 how recursion works.  We'll show you how to keep track of what is
 133 going on by constructing an recursive execution trace of inputs and
 134 outputs.
 135
 136 We'll develop our full reduction function in stages.  Once we have it
 137 working, we'll then consider some variants.  Just to be precise, we'll
 138 distinguish each microvariant with its own index number embedded in
 139 its name.
 140
 141     let rec reduce1 (t:term):term =
 142       if (is_redex t) then reduce1 (reduce_one_step t)
 143                       else t
 144
 145 If the input is a redex, we ship it off to `reduce_one_step` for
 146 processing.  But just in case the result of the one-step reduction is
 147 itself a redex, we recursively call `reduce1`.  The recursion will
 148 continue until the result is no longer a redex.
 149
 150     #trace reduce1;;
 151     reduce1 is now traced.
 152     # reduce1 (FA (I, FA (I, K)));;
 153     reduce1 <-- FA (I, FA (I, K))
 154       reduce1 <-- FA (I, K)
 155         reduce1 <-- K
 156         reduce1 --> K
 157       reduce1 --> K
 158     reduce1 --> K
 159     - : term = K
 160
 161 Since the initial input (`I(IK)`) is a redex, the result after the
 162 one-step reduction is `IK`.  We recursively call `reduce1` on this
 163 input.  Since `IK` is itself a redex, the result after one-step
 164 reduction is `K`.  We recursively call `reduce1` on this input.  Since
 165 `K` is not a redex, the recursion bottoms out, and we simply return
 166 it.
 167
 168 But this function doesn't do enough reduction.
 169
 170     # reduce1 (FA (FA (I, I), K));;
 171     - : term = FA (FA (I, I), K)
 172
 173 Because the top-level term is not a redex, `reduce1` returns it
 174 without any evaluation.  What we want is to evaluate the subparts of a
 175 complex term.
 176
 177     let rec reduce2 (t:term):term = match t with
 178         I -> I
 179       | K -> K
 180       | S -> S
 181       | FA (a, b) ->
 182           let t' = FA (reduce2 a, reduce2 b) in
 183             if (is_redex t') then reduce2 (reduce_one_step t')
 184                              else t'
 185
 186 Since we need access to the subterms, we do pattern matching on the
 187 input term.  If the input is simple, we return it.  If the input is
 188 complex, we first process the subexpressions, and only then see if we
 189 have a redex.  To understand how this works, follow the trace
 190 carefully:
 191
 192     # reduce2 (FA(FA(I,I),K));;
 193     reduce2 <-- FA (FA (I, I), K)
 194
 195       reduce2 <-- K          ; first main recursive call
 196       reduce2 --> K
 197
 198       reduce2 <-- FA (I, I)  ; second main recursive call
 199         reduce2 <-- I
 200         reduce2 --> I
 201         reduce2 <-- I
 202         reduce2 --> I
 203       reduce2 <-- I
 204
 205       reduce2 --> I           ; third main recursive call
 206       reduce2 --> I
 207
 208       reduce2 <-- K           ; fourth
 209       reduce2 --> K
 210     reduce2 --> K
 211     - : term = K
 212
 213 Ok, there's a lot going on here.  Since the input is complex, the
 214 first thing the function does is construct `t'`.  In order to do this,
 215 it must reduce the two main subexpressions, `II` and `K`.  But we see
 216 from the trace that it begins with the right-hand expression, `K`.  We
 217 didn't explicitly tell it to begin with the right-hand subexpression,
 218 so control over evaluation order is starting to spin out of our
 219 control.  (We'll get it back later, don't worry.)
 220
 221 In any case, in the second main recursive call, we evaluate `II`.  The
 222 result is `I`.  The third main recursive call tests whether this
 223 result needs any further processing; it doesn't.
 224
 225 At this point, we have constructed `t' == FA(I,K)`.  Since that's a
 226 redex, we ship it off to reduce_one_step, getting the term `K` as a
 227 result.  The fourth recursive call checks that there is no more
 228 reduction work to be done (there isn't), and that's our final result.
 229
 230 You can see in more detail what is going on by tracing both reduce2
 231 and reduce_one_step, but that makes for some long traces.
 232
 233 So we've solved our first problem: reduce2 recognizes that `IIK ~~>
 234 K`, as desired.
 235
 236 Because the OCaml interpreter evaluates the rightmost expression
 237 in the course of building `t'`, however, it will always evaluate the
 238 right hand subexpression, whether it needs to or not.  And sure
 239 enough,
 240
 241     # reduce2 (FA(FA(K,I),skomega));;
 242       C-c C-cInterrupted.
 243
 244 instead of performing the leftmost reduction first, and recognizing
 245 that this term reduces to the normal form `I`, we get lost endlessly
 246 trying to reduce skomega.
 247
 248 To emphasize that our evaluation order here is at the mercy of the
 249 evaluation order of OCaml, here is the exact same program translated
 250 into Haskell.  We'll put them side by side to emphasize the exact parallel.
 251
 252 <pre>
 253 OCaml                                                          Haskell
 254 ==========================================================     =========================================================
 255
 256 type term = I | S | K | FA of (term * term)                    data Term = I | S | K | FA Term Term deriving (Eq, Show)
 257
 258 let skomega = FA (FA (FA (S,I), I), FA (FA (S,I), I))          skomega = (FA (FA (FA S I) I) (FA (FA S I) I))
 259
 260                                                                reduce_one_step :: Term -> Term
 261 let reduce_one_step (t:term):term = match t with               reduce_one_step t = case t of
 262     FA(I,a) -> a                                                 FA I a -> a
 263   | FA(FA(K,a),b) -> a                                           FA (FA K a) b -> a
 264   | FA(FA(FA(S,a),b),c) -> FA(FA(a,c),FA(b,c))                   FA (FA (FA S a) b) c -> FA (FA a c) (FA b c)
 265   | _ -> t                                                       _ -> t
 266
 267                                                                is_redex :: Term -> Bool
 268 let is_redex (t:term):bool = not (t = reduce_one_step t)       is_redex t = not (t == reduce_one_step t)
 269
 270                                                                reduce2 :: Term -> Term
 271 let rec reduce2 (t:term):term = match t with                   reduce2 t = case t of
 272     I -> I                                                       I -> I
 273   | K -> K                                                       K -> K
 274   | S -> S                                                       S -> S
 275   | FA (a, b) ->                                                 FA a b ->
 276       let t' = FA (reduce2 a, reduce2 b) in                        let t' = FA (reduce2 a) (reduce2 b) in
 277         if (is_redex t') then reduce2 (reduce_one_step t')           if (is_redex t') then reduce2 (reduce_one_step t')
 278                          else t'                                                      else t'
 279 </pre>
 280
 281 There are some differences in the way types are made explicit, and in
 282 the way terms are specified (`FA(a,b)` for Ocaml versus `FA a b` for
 283 Haskell).  But the two programs are essentially identical.
 284
 285 Yet the Haskell program finds the normal form for KI -->
 286
 287     *Main> reduce2 (FA (FA K I) skomega)
 288     I
 289
 290 Woa!  First of all, this is wierd.  Haskell's evaluation strategy is
 291 called "lazy".  Apparently, Haskell is so lazy that even after we've
 292 asked it to construct t' by evaluating `reduce2 a` and `reduce2 b`, it
 293 doesn't bother computing `reduce2 b`.  Instead, it waits to see if we
 294 ever really need to use the result.
 295
 296 So despite intuitions, the program as written does NOT determine
 297 evaluation order behavior.  At this stage, we have defined an
 298 evaluation order that still depends on the evaluation order of the
 299 underlying interpreter.
 300
 301 There are two questions we could ask: Can we adjust the OCaml
 302 evaluator to exhibit lazy behavior?  and Can we adjust the Haskell
 303 evaluator to exhibit eager behavior?  The answer to the first question
 304 is easy and interesting, and we'll give it right away.  The answer to
 305 the second question is also interesting, but not easy.  There are
 306 various tricks available in Haskell we could use (such as the `seq`
 307 operator), but a fully general, satisifying resolution will have to
 308 wait until we have Continuation Passing Transforms.
 309
 310 The answer to the first question (Can we adjust the OCaml evaluator to
 311 exhibit lazy behavior?) is quite simple:
 312
 313 <pre>
 314 let rec reduce3 (t:term):term = match t with
 315     I -> I
 316   | K -> K
 317   | S -> S
 318   | FA (a, b) ->
 319       let t' = FA (reduce3 a, b) in
 320         if (is_redex t') then reduce3 (reduce_one_step t')
 321                          else t'
 322 </pre>
 323
 324 There is only one small difference: instead of setting `t'` to
 325 `FA (reduce a, reduce b)`, we have `FA (reduce a, b)`.  That is, we
 326 don't evaluate the right-hand subexpression at all.  Ever!  The only
 327 way to get evaluated is to somehow get into functor position.
 328
 329     # reduce3 (FA(FA(K,I),skomega));;
 330     - : term = I
 331     # reduce3 skomega;;
 332     C-c C-cInterrupted.
 333
 334 The evaluator now has no trouble finding the normal form for `KIΩ`,
 335 but evaluating skomega still gives an infinite loop.
 336
 337 As a final note, we can clarify the larger question at the heart of
 338 this discussion:
 339
 340 *How can we can we specify the evaluation order of a computational
 341 system in a way that is completely insensitive to the evaluation order
 342 of the specification language?*