X-Git-Url: http://lambda.jimpryor.net/git/gitweb.cgi?p=lambda.git;a=blobdiff_plain;f=topics%2Fweek7_combinatory_evaluator.mdwn;fp=topics%2Fweek7_combinatory_evaluator.mdwn;h=9c3ac87ac569806e31bda24712704af5146e2978;hp=0000000000000000000000000000000000000000;hb=d198ef88b87d24fb20d2153a5c49be6d1285668e;hpb=65dc3facfaa2756eae422eb4bf0a65966e20cbc6 diff --git a/topics/week7_combinatory_evaluator.mdwn b/topics/week7_combinatory_evaluator.mdwn new file mode 100644 index 00000000..9c3ac87a --- /dev/null +++ b/topics/week7_combinatory_evaluator.mdwn @@ -0,0 +1,402 @@ + + +[[!toc levels=2]] + +## Reasoning about evaluation order in Combinatory Logic + +We've discussed [[evaluation order|topics/week3_evaluation_order]] +before, primarily in connection with the untyped lambda calculus. +Whenever a term contains more than one redex, we have to choose which +one to reduce, and this choice can make a difference. For instance, +recall that: + + Ω == ωω == (\x.xx)(\x.xx) + +so: + + ((\x.I)Ω) == ((\x.I)((\x.xx)(\x.xx))) + * * + +There are two redexes in this term; we've marked the operative lambdas +with a star. If we reduce the leftmost redex first, the term reduces +to the normal form `I` in one step. But if we reduce the rightmost +redex instead, the "reduced" form is `(\x.I)Ω` again, and we're starting off on an infinite loop. + +Thanks to the introduction of sum types (disjoint union) in the last lecture, we +are now in a position to gain a deeper understanding of evaluation +order by writing a program that allows us to reason explicitly about evaluation. + +One thing we'll see is that it is all too easy for the evaluation +order properties of an evaluator to depend on the evaluation order +properties of the programming language in which the evaluator is +written. We would like to write an evaluator in which the order of +evaluation is insensitive to the evaluator language. That is, the goal is to +find an order-insensitive way to reason about evaluation order. We +will not fully succeed in this first attempt, but we will make good +progress. + +The first evaluator we will develop will evaluate terms in Combinatory +Logic. This significantly simplifies the discussion, since we won't +need to worry about variables or substitution. As we develop and +extend our evaluator in homework and other lectures, we'll switch to lambdas, but for +now, working with the simplicity of Combinatory Logic will make it +easier to highlight evaluation order issues. + +A brief review of Combinatory Logic: a term in CL is the combination +of three basic expressions, `S`, `K`, and `I`, governed by the +following reduction rules: + + Ia ~~> a + Kab ~~> a + Sabc ~~> ac(bc) + +where `a`, `b`, and `c` stand for an arbitrary term of CL. We've seen +how to embed the untyped lambda calculus in CL, so it's no surprise +that evaluation order issues arise in CL. To illustrate, we'll use +the following definition: + + skomega = SII + Skomega = skomega skomega == SII(SII) + ~~> I(SII)(I(SII)) + ~~> SII(SII) + +We'll use the same symbol, `Ω`, for Omega and Skomega: in a lambda +term, `Ω` refers to Omega, but in a CL term, `Ω` refers to Skomega as +defined here. + +Just as in the corresponding term in the lambda calculus, CL terms can +contain more than one redex: + + KIΩ == KI(SII(SII)) + * * + +We can choose to reduce the leftmost redex by applying the reduction +rule for `K`, in which case the term reduces to the normal form `I` in +one step; or we can choose to reduce the Skomega part, by applying the +reduction rule `S`, in which case we do not get a normal form, and +we're headed into an infinite loop. + +With sum types, we can define CL terms in OCaml as follows: + + type term = I | K | S | App of (term * term) + + let skomega = App (App (App (S, I), I), App (App (S, I), I)) + +This type definition says that a term in CL is either one of the three +atomic expressions (`I`, `K`, or `S`), or else a pair of CL +expressions. `App` stands for Functional Application. With this type +definition, we can encode Skomega, as well as other terms whose +reduction behavior we want to try to control. We can *control* it because +the `App` variant of our datatype merely *encodes* the application of the +head to the argument, and doesn't actually *perform* that application. +We have to explicitly model the application ourself. + +Using pattern matching, it is easy to code the one-step reduction +rules for CL: + + let reduce_if_redex (t:term) : term = match t with + | App(I,a) -> a + | App(App(K,a),b) -> a + | App(App(App(S,a),b),c) -> App(App(a,c),App(b,c)) + | _ -> t + + # reduce_if_redex (App(App(K,S),I));; + - : term = S + # reduce_if_redex skomega;; + - : term = App (App (I, App (App (S, I), I)), App (I, App (App (S, I), I))) + +The definition of `reduce_if_redex` explicitly says that it expects +its input argument `t` to have type `term`, and the second `: term` +says that the result the function delivers will also be of +type `term`. + +The type constructor `App` obscures things a bit, but it's still +possible to see how the one-step reduction function is just the +reduction rules for CL. The OCaml interpreter responses given above show us that the +function faithfully recognizes that `KSI ~~> S`, and that `Skomega ~~> +I(SII)(I(SII))`. + +As you would expect, a term in CL is in normal form when it contains +no redexes (analogously for head normal form, weak head normal form, etc.) + +How can we tell whether a term is a redex? Here's one way: + + let is_redex (t:term):bool = not (t = reduce_if_redex t) + + # is_redex K;; + - : bool = false + # is_redex (App(K,I));; + - : bool = false + # is_redex (App(App(K,I),S));; + - : bool = true + # is_redex skomega;; + - : book = true + +In order to decide whether two terms are equal, OCaml has to +recursively compare the elements of complex CL terms. It is able to +figure out how to do this because we provided an explicit definition +of the datatype `term`. + +Warning: this method for telling whether a term is a redex relies on the accidental fact that the +one-step reduction of a CL term is never identical to the original +term. This would not work for the untyped lambda calculus, since +`((\x.xx)(\x.xx)) ~~> ((\x.xx)(\x.xx))` in one step. Neither would it work if +we had chosen some other combinators as primitives (`W W1 W2` reduces to +`W1 W2 W2`, so if they are all `W`s we'd be in trouble.) We will discuss +some alternative strategies in other notes. + +So far, we've only asked whether a term _is_ a redex, not whether it _contains_ other redexes as subterms. But +in order to fully reduce a term, we need to be able to reduce redexes +that are not at the top level of the term. Because we need to process subterms, and because the result after +processing a subterm may require further processing, the recursive +structure of our evaluation function has to be somewhat subtle. To +truly understand, we will need to do some sophisticated thinking +about how recursion works. + +We'll develop our full reduction function in two stages. Once we have +it working, we'll then consider a variant. + + let rec reduce_try1 (t:term) : term = + if (is_redex t) then let t' = reduce_if_redex t + in reduce_try1 t' + else t + +If the input is a redex, we ship it off to `reduce_if_redex` for +processing. But just in case the result of the one-step reduction is +itself a redex, we recursively apply `reduce_try1` to the result. The recursion +will continue until the result is no longer a redex. We're aiming at +allowing the evaluator to recognize that + + I (I K) ~~> I K ~~> K + +When trying to understand how recursive functions work, it can be +extremely helpful to examine an execution trace of inputs and +outputs. + + # #trace reduce_try1;; + reduce_try1 is now traced. + +The first `#` there is OCaml's prompt. The text beginning `#trace ...` is what we typed. Now OCaml will report on all the input to, and results from, the `reduce_try1` function. Watch: + + # reduce_try1 (App (I, App (I, K)));; + reduce_try1 <-- App (I, App (I, K)) + reduce_try1 <-- App (I, K) + reduce_try1 <-- K + reduce_try1 --> K + reduce_try1 --> K + reduce_try1 --> K + - : term = K + +In the trace, "`<--`" shows the input argument to a call to +`reduce_try1`, and "`-->`" shows the output result. + +Since the initial input (`I(IK)`) is a redex, the result after the +one-step reduction is `IK`. We recursively call `reduce_try1` on +this input. Since `IK` is itself a redex, the result after one-step +reduction is `K`. We recursively call `reduce_try1` on this input. Since +`K` is not a redex, the recursion bottoms out, and we return the +result. + +But this function doesn't do enough reduction. We want to recognize +the following reduction path: + + I I K ~~> I K ~~> K + +But the reduction function as written above does not deliver this result: + + # reduce_try1 (App (App (I, I), K));; + - : term = App (App (I, I), K) + +The reason is that the top-level term is not a redex to start with, +so `reduce_try1` returns it without any evaluation. + +What we want is to evaluate the subterms of a complex term. We'll do this by pattern matching our +top-level term to see when it _has_ subterms: + + let rec reduce_try2 (t : term) : term = match t with + | I -> I + | K -> K + | S -> S + | App (a, b) -> + let t' = App (reduce_try2 a, reduce_try2 b) in + if (is_redex t') then let t'' = reduce_if_redex t' + in reduce_try2 t'' + else t' + +Since we need access to the subterms, we do pattern matching on the +input. If the input is simple (the first three `match` cases), we +return it without further processing. But if the input is complex, we +first process the subexpressions, and only then see if we have a redex +at the top level. + +To understand how this works, follow the trace +carefully: + + # reduce_try2 (App(App(I,I),K));; + reduce_try2 <-- App (App (I, I), K) + + reduce_try2 <-- K ; first main recursive call + reduce_try2 --> K + + reduce_try2 <-- App (I, I) ; second main recursive call + reduce_try2 <-- I + reduce_try2 --> I + reduce_try2 <-- I + reduce_try2 --> I + reduce_try2 <-- I + reduce_try2 --> I + reduce_try2 --> I + + reduce_try2 <-- K ; third + reduce_try2 --> K + reduce_try2 --> K + - : term = K + +Ok, there's a lot going on here. Since the input is complex, the +first thing the function does is construct `t'`. In order to do this, +it must reduce the two main subexpressions, `II` and `K`. + +There are three recursive calls to the `reduce` function, each of +which gets triggered during the processing of this example. They have +been marked in the trace. + +The don't quite go in the order in which they appear in the code, +however! We see from the trace that it begins with the right-hand +expression, `K`. We didn't explicitly tell it to begin with the +right-hand subexpression, so control over evaluation order is starting +to spin out of our control. (We'll get it back later, don't worry.) + +In any case, in the second main recursive call, we evaluate `II`. The +result is `I`. + +At this point, we have constructed `t' == App(I,K)`. Since that's a +redex, we ship it off to reduce_if_redex, getting the term `K` as a +result. The third recursive call checks that there is no more +reduction work to be done (there isn't), and that's our final result. + +You can see in more detail what is going on by tracing both reduce +and reduce_if_redex, but that makes for some long traces. + +So we've solved our first problem: `reduce` now recognizes that `IIK ~~> +K`, as desired. + +Because the OCaml interpreter evaluates each subexpression in the +course of building `t'`, however, it will always evaluate the right +hand subexpression, whether it needs to or not. And sure enough, + + # reduce_try2 (App(App(K,I),skomega));; + C-c C-cInterrupted. + +Running the evaluator with this input leads to an infinite loop, and +the only way to get out is to kill the interpreter with control-c. + +Instead of performing the leftmost reduction first, and recognizing +that this term reduces to the normal form `I`, we get lost endlessly +trying to reduce Skomega. + +## Laziness is hard to overcome + +To emphasize that our evaluation order here is at the mercy of the +evaluation order of OCaml, here is the exact same program translated +into Haskell. We'll put them side by side to emphasize the exact parallel. + +
+OCaml                                                          Haskell
+==========================================================     =========================================================
+
+type term = I | S | K | App of (term * term)                   data Term = I | S | K | App Term Term deriving (Eq, Show)      
+                                                                                                                             
+let skomega = App (App (App (S,I), I), App (App (S,I), I))     skomega = (App (App (App S I) I) (App (App S I) I))                      
+                                                                                                                             
+                                                               reduce_if_redex :: Term -> Term                                      
+let reduce_if_redex (t:term):term = match t with               reduce_if_redex t = case t of                                      
+  | App(I,a) -> a                                                App I a -> a                                                      
+  | App(App(K,a),b) -> a                                         App (App K a) b -> a                                              
+  | App(App(App(S,a),b),c) -> App(App(a,c),App(b,c))             App (App (App S a) b) c -> App (App a c) (App b c)                      
+  | _ -> t                                                       _ -> t                                                      
+                                                                                                                             
+                                                               is_redex :: Term -> Bool                                      
+let is_redex (t:term):bool = not (t = reduce_if_redex t)       is_redex t = not (t == reduce_if_redex t)                      
+                                                                                                                             
+                                                               reduce_try2 :: Term -> Term                                              
+let rec reduce_try2 (t : term) : term = match t with           reduce_try2 t = case t of                                              
+  | I -> I                                                       I -> I                                                      
+  | K -> K                                                       K -> K                                                      
+  | S -> S                                                       S -> S                                                      
+  | App (a, b) ->                                                App a b ->                                                       
+      let t' = App (reduce_try2 a, reduce_try2 b) in                 let t' = App (reduce_try2 a) (reduce_try2 b) in                      
+      if (is_redex t') then let t'' = reduce_if_redex t'             if (is_redex t') then reduce_try2 (reduce_if_redex t')      
+                            in reduce_try2 t''                                        else t'                                
+                       else t'
+
+ +There are some differences in the way types are made explicit, and in +the way terms are specified (`App(a,b)` for Ocaml versus `App a b` for +Haskell). But the two programs are essentially identical. + +Yet the Haskell program finds the normal form for `KIΩ`: + + *Main> reduce_try2 (App (App K I) skomega) + I + +Woa! First of all, this is wierd. Haskell's evaluation strategy is +called "lazy". Apparently, Haskell is so lazy that even after we've +asked it to construct `t'` by evaluating `reduce_try2 a` and `reduce_try2 b`, it +doesn't bother computing `reduce_try2 b`. Instead, it waits to see if we +ever really need to use the result. + +So the program as written does _not_ fully determine evaluation order +behavior. At this stage, we have defined an evaluation order that +still depends on the evaluation order of the underlying interpreter. + +There are two questions we could ask: + +* Can we adjust the OCaml evaluator to exhibit lazy behavior? + +* Can we adjust the Haskell evaluator to exhibit eager behavior? + +The answer to the first question is easy and interesting, and we'll +give it right away. The answer to the second question is also +interesting, but not easy. There are various tricks available in +Haskell we could use (such as the `seq` operator, or the `deepseq` +operator), but a fully general, satisifying resolution will have to +wait until we have Continuation Passing Style transforms. + +The answer to the first question (Can we adjust the OCaml evaluator to +exhibit lazy behavior?) is quite simple: + + let rec reduce_lazy (t : term) : term = match t with + | I -> I + | K -> K + | S -> S + | App (a, b) -> + let t' = App (reduce_lazy a, b) in + if (is_redex t') then let t'' = reduce_if_redex t' + in reduce_lazy t'' + else t' + +There is only one small difference from `reduce_try2`: instead of setting `t'` to `App +(reduce_try3 a, reduce_try3 b)`, we omit one of the recursive calls, and have +`App (reduce_try3 a, b)`. That is, we don't evaluate the right-hand +subexpression at all. Ever! The only way to get evaluated is to +somehow get into functor position. + + # reduce3 (App(App(K,I),skomega));; + - : term = I + # reduce3 skomega;; + C-c C-cInterrupted. + +The evaluator now has no trouble finding the normal form for `KIΩ`, +but evaluating skomega still gives an infinite loop. + +We can now clarify the larger question at the heart of +this discussion: + +*How can we can we specify the evaluation order of a computational +system in a way that is completely insensitive to the evaluation order +of the specification language?* + +As a final note, we should mention that the evaluators given here are +absurdly inefficient computationally. Some computer scientists have +trouble even looking at code this inefficient, but the emphasis here +is on getting the concepts across as simply as possible.