exercises/assignment7.mdwn

   1 There is no separate assignment 6. (There was a single big assignment for weeks 5 and 6, and
   2 we're going to keep the assignment numbers in synch with the weeks.)
   3
   4 [[!toc]]
   5
   6 ## Evaluation order in Combinatory Logic
   7
   8 1. Give a term that the lazy evaluators (either [[the Haskell
   9 evaluator|code/ski_evaluator.hs]], or the lazy version of [[the OCaml
  10 evaluator|code/ski_evaluator.ml]]) do not evaluate all the way to a
  11 normal form, that is, that contains a redex somewhere inside of it after
  12 it has been reduced.
  13
  14 2. One of the [[criteria we established for classifying reduction
  15 strategies|topics/week3_evaluation_order]] strategies is whether they
  16 reduce subterms hidden under lambdas.  That is, for a term like
  17 `(\x y. x z) (\x. x)`, do we reduce to `\y.(\x.x) z` and stop, or do
  18 we reduce further to `\y.z`?  Explain what the corresponding question
  19 would be for CL. Using the eager version of the OCaml CL evaluator (`reduce_try2`),
  20 prove that the evaluator does reduce terms inside of at least
  21 some "functional" CL terms.  Then provide a modified evaluator
  22 that does not perform reductions in those positions. (Just give the
  23 modified version of your recursive reduction function.)
  24
  25
  26 ## Evaluation in the untyped lambda calculus: substitute-and-repeat
  27
  28 Having sketched the issues with a discussion of Combinatory Logic,
  29 we're going to begin to construct an evaluator for a simple language
  30 that includes lambda abstraction.  In this problem set, we're going to
  31 work through the issues twice: once with a function that does
  32 substitution in the obvious way, and keeps reducing-and-repeating until
  33 there are no more eligible redexes. You'll see it's somewhat
  34 complicated.  The complications come from the need to worry about
  35 variable capture. (Seeing these complications should give you an
  36 inkling of why we presented the evaluation order discussion using
  37 Combinatory Logic, since we don't need to worry about variables in
  38 CL.)
  39
  40 We're not going to ask you to write the entire program yourself.
  41 Instead, we're going to give you almost the complete program, with a few gaps
  42 in it that you have to complete. You have to understand enough to
  43 add the last pieces to make the program function.
  44
  45 You can find the skeleton code [[here|/code/untyped_evaluator.ml]].
  46
  47 The first place you need to add code is in the `free_in` function. You already
  48 wrote such a function [[in a previous homework|assignment5#occurs_free]], so this
  49 part should be easy. The intended behavior is for the function to return results
  50 like this:
  51
  52     # free_in "x" (App (Lambda("x, Var "x"), Var "x"));;
  53     - : bool = true
  54
  55 The second place you need to add code is in the `reduce_head_once` function. As we
  56 explain in the code comments, this is similar to the `reduce_if_redex` function in the
  57 combinatory logic interpreters. Three of the clauses near the end of this function are incomplete.
  58
  59 As we explain in the code comments, this interpreter uses an eager/call-by-value reduction
  60 strategy. What are its other evaluation order properties? Does it perform beta-reduction
  61 underneath lambdas? Does it perform eta-reduction?
  62
  63
  64 ## Evaluation in the untyped lambda calculus: environments
  65
  66 The previous interpreter strategy is nice because it corresponds so closely to the
  67 reduction rules we give when specifying our lambda calculus. (Including
  68 specifying evaluation order, which redexes it's allowed to reduce, and
  69 so on.) But keeping track of free and bound variables, computing fresh
  70 variables when needed, that's all a pain.
  71
  72 Here's a better strategy. Instead of keeping all of the information
  73 about which variables have been bound or are still free implicitly
  74 inside of the terms, we'll keep a separate scorecard, which we will call an "environment".  This is a
  75 familiar strategy for philosophers of language and for linguists,
  76 since it amounts to evaluating terms relative to an assignment
  77 function. The difference between the substitute-and-repeat approach
  78 above, and this approach, is one huge step towards monads.
  79
  80 The skeleton code for this is at the [[same link as before|/code/untyped_evaluator.ml]].
  81 This part of the exercise is the "VB" part of that code.
  82
  83 You'll see that in the `eval` function, a new kind of value `Closure (ident) (term) (env)`
  84 is used. What's that about?
  85
  86 The `Closure`s are for handling the binding of terms that have locally free variables in them. Let's
  87 see how this works. For exposition, I'll pretend that the code you're working with has
  88 primitive numbers in it, though it doesn't. (But the fuller code it's simplified from does; see below.)
  89 Now consider:
  90
  91     term                             environment
  92     ----                             -----------
  93     (\w.(\y.y) w) 2                  []
  94     (\y.y) w                         [w->2]
  95     y                                [y->w, w->2]
  96
  97 In the first step, we bind `w` to the term `2`, by saving this association in our environment. In the second step, we bind `y` to the term `w`. In the third step, we would like to replace `y` with whatever its current value is according to our scorecard/environment. But a naive handling of this would replace `y` with `w`; and that's not the right result, because `w` should itself be mapped onto `2`. On the other hand, in:
  98
  99     term                             environment
 100     ----                             -----------
 101     (\x w. x) w 2                    []
 102     (\w. x) 2                        [x->w]
 103     x                                [w->2, x->w]
 104
 105 Now our final term _should_ be replaced with `w`, not with `2`. So evidently it's not so simple as just recursively re-looking up variables in the environment.
 106
 107 A good strategy for handling these cases would be not to bind `y` to the term `w`, but rather to bind it to _what the term `w` then fully evaluates to_. Then in the first case we'd get:
 108
 109 <pre>
 110 term                             environment
 111 ----                             -----------
 112 (\w.(\y.y) w) 2                  []
 113 (\y.y) w                         [w->2]
 114 y                                [<b>y->2</b>, w->2]
 115 </pre>
 116
 117 And at the next step, `y` will evaluate directly to `2`, as desired. In the other example, though, `x` gets bound to `w` when `w` is already free. (In fact the code skeleton we gave you won't permit that to happen; it refuses to perform any applications except when the arguments are "result values", and it doesn't count free variables as such. As a result, other variables can never get bound to free variables.)
 118
 119 So far, so good.
 120
 121 But now consider the term:
 122
 123     (\f. x) ((\x y. y x) 0)
 124
 125 Since the outermost head `(\f. x)` is already a `Lambda`, we begin by evaluating the argument `((\x y. y x) 0)`. This results in:
 126
 127     term                             environment
 128     ----                             -----------
 129     (\f. x) (\y. y x)                [x->0]
 130
 131 But that's not right, since it will result in the variable `x` in the head also being associated with the argument `0`. Instead, we want the binding of `x` to `0` to be local to the argument term `(\y. y x)`. For the moment, let's notate that like this:
 132
 133 <pre>
 134 term                             environment
 135 ----                             -----------
 136 (\f. x)<sub>1</sub> (\y. y x)<sub>2</sub>              []<sub>1</sub>   [x->0]<sub>2</sub>
 137 </pre>
 138
 139 Now, let's suppose the head is more complex, like so:
 140
 141     (\f. (\x. f x) I) ((\x y. y x) 0)
 142
 143 That might be easier to understand if you transform it to:
 144
 145     let f = (let x = 0 in \y. y x) in
 146     let x = I in
 147     f x
 148
 149 Since the outermost head `(\f. (\x. f x) I)` is already a `Lambda`, we begin by evaluating the argument `((\x y. y x) 0)`. This results in:
 150
 151 <pre>
 152 term                             environment
 153 ----                             -----------
 154 (\f. (\x. f x) I)<sub>1</sub> (\y. y x)<sub>2</sub>    []<sub>1</sub> [x->0]<sub>2</sub>
 155 </pre>
 156
 157 Now the argument is not itself any longer an `App`, and so we are ready to evaluate the outermost application, binding `f` to the argument term. So we get:
 158
 159 <pre>
 160 term                             environment
 161 ----                             -----------
 162 ((\x. f x) I)<sub>1</sub>                   [f->(\y. y x)<sub>2</sub>]<sub>1</sub> [x->0]<sub>2</sub>
 163 </pre>
 164
 165 Next we bind `x` to the argument `I`, getting:
 166
 167 <pre>
 168 term                             environment
 169 ----                             -----------
 170 (f x)<sub>1</sub>                           [x->I, f->(\y. y x)<sub>2</sub>]<sub>1</sub> [x->0]<sub>2</sub>
 171 </pre>
 172
 173 Now we have to apply the value that `f` is bound to to the value that `x` is bound to. But notice there is a free variable `x` in the term that `f` is bound to. How should we interpret that term? Should the evaluation proceed:
 174
 175 <pre>
 176 (\y. y <b>x<sub>1</sub></b>) x<sub>1</sub>                    [x->I, ...]<sub>1</sub> [x->0]<sub>2</sub>
 177 y <b>x<sub>1</sub></b>                             [y->I, x->I, ...]<sub>1</sub> [x->0]<sub>2</sub>
 178 I I
 179 I
 180 </pre>
 181
 182 using the value that `x` is bound to in context<sub>1</sub>, _where the `f` value is applied_? Or should it proceed:
 183
 184 <pre>
 185 (\y. y <b>x<sub>2</sub></b>) x<sub>1</sub>                    [x->I, ...]<sub>1</sub> [x->0]<sub>2</sub>
 186 y <b>x<sub>2</sub></b>                             [y->I, x->I, ...]<sub>1</sub> [x->0]<sub>2</sub>
 187 I 0
 188 0
 189 </pre>
 190
 191 using the value that `x` was bound to in context<sub>2</sub>, _where `f` was bound_?
 192
 193 In fact, when we specified rules for the Lambda Calculus, we committed ourselves to taking the second of these strategies. But both of the kinds of binding described here are perfectly coherent. The first is called "dynamic binding" or "dynamic scoping" and the second is called "lexical or static binding/scoping". Neither is intrinsically more correct or appropriate than the other. The first is somewhat easier for the people who write implementations of programming languages to handle; so historically it used to predominate. But the second is easier for programmers to reason about. In Scheme, variables are bound in the lexical/static way by default, just as in the Lambda Calculus; but there is special vocabulary for dealing with dynamic binding too, which is useful in some situations. (As far as I'm aware, Haskell and OCaml only provide the lexical/static binding.)
 194
 195 In any case, if we're going to have the same semantics as the untyped Lambda Calculus, we're going to have to make sure that when we bind the variable `f` to the value `\y. y x`, that (locally) free variable `x` remains associated with the value `2` that `x` was bound to in the context where `f` is bound, not the possibly different value that `x` may be bound to later, when `f` is applied. One thing we might consider doing is _evaluating the body_ of the abstract that we want to bind `f` to, using the then-current environment to evaluate the variables that the abstract doesn't itself bind. But that's not a good idea. What if the body of that abstract never terminates? The whole program might be OK, because it might never go on to apply `f`. But we'll be stuck trying to evaluate `f`'s body anyway, and will never get to the rest of the program. Another thing we could consider doing is to substitute the `2` in for the variable `x`, and then bind `f` to `\y. y 2`. That would work, but the whole point of this evaluation strategy is to avoid doing those complicated (and inefficient) substitutions. Can you think of a third idea?
 196
 197 What we will do is have our environment associate `f` not just with a `Lambda` term, but also with the environment that was in place _when_ `f` was bound. Then later, if we ever do use `f`, we have that saved environment around to look up any free variables in. More specifically, we'll associate `f` not with the _term_:
 198
 199     Lambda("y", BODY)
 200
 201 but instead with the `Closure` structure:
 202
 203     Closure("y", BODY, SAVED_ENV)
 204
 205 where `BODY` is the term that constituted the body of the corresponding `Lambda`, and `SAVED_ENV` is the environment in place when `f` is being bound. (In this simple implementation, we will just save the _whole environment_ then in place. But in a more efficient implementation, we'd sift through it and only keep those bindings for variables that are free in `BODY`. That would take up less space.)
 206
 207 So that's what's going on with those `Closure`s. In the simple code we gave you to work with, we just made these another clause in the `term` datatype. That's really not correct. `Closure`s aren't terms. The syntax for our language doesn't have any constituents that get parsed into `Closure`s. `Closure`s are only created *during the course of evaluating terms*: specifically, when a variable gets bound to an abstract, which may itself contain variables that are locally free (not bound by the abstract itself). So really we should have two datatypes, one for terms, and another for the *results* (sometimes called "values") that terms can evaluate to. `Closure`s are results, but they aren't terms. `App`s are terms, but not results. If we had primitive numbers or other constants in our language, they might be both terms and results. In the fuller code from which your homework is simplified, this is how the types are in fact defined. But it makes things more complicated. So to keep things simple for the homework, we just pretended like `Closure`s were a new, exotic kind of `term`.
 208
 209 In any case, now you know what's going on with the `Closure`s, and you should be able to complete the missing pieces of the `eval` function in the code skeleton linked above.
 210
 211 If you've completed all the missing parts correctly (there are six gaps for the previous stage of the homework, and two gaps for this stage), then you should be able to compile the code skeleton, and use it as described in the comments at the start of the code.
 212
 213
 214 ## Fuller interpreter
 215
 216 We've also prepared a fuller version of the interpreter, that has user-friendly input
 217 and printing of results. We'll provide a link to that shortly. It
 218 will be easiest for you to understand that code if you've
 219 completed the gaps in the simplified skeleton linked above.
 220
 221 There's nothing you need to do with this; it's just for you to play with. If you're interested,
 222 you can compare the code you completed for the previous two segments of the homework
 223 to the (only moderately more complex) code in the `engine.ml` file of this fuller program.
 224
 225 ## Monads
 226
 227 Mappables (functors), MapNables (applicative functors), and Monads
 228 (composables) are ways of lifting computations from unboxed types into
 229 boxed types.  Here, a "boxed type" is a type function with one unsaturated
 230 hole (which may have several occurrences, as in `(α,α) tree`). We can think of the box type
 231 as a function from a type to a type.
 232
 233 Recall that a monad requires a singleton function <code>⇧ (\* mid \*) : P-> <u>P</u></code>, and a
 234 composition operator like <code>&gt;=&gt; : (P-><u>Q</u>) -> (Q-><u>R</u>) -> (P-><u>R</u>)</code>.
 235
 236 As we said in the notes, we'll move freely back and forth between using `>=>` and using `<=<` (aka `mcomp`), which
 237 is just `>=>` with its arguments flipped. `<=<` has the virtue that it corresponds more
 238 closely to the ordinary mathematical symbol `○`. But `>=>` has the virtue
 239 that its types flow more naturally from left to right.
 240
 241 Anyway, `mid`/`⇧` and (let's say) `<=<` have to obey the Monad Laws:
 242
 243     ⇧ <=< k == k
 244     k <=< ⇧ == k
 245     j <=< (k <=< l) == (j <=< k) <=< l
 246
 247 For example, the Identity monad has the identity function `I` for `⇧`
 248 and ordinary function composition `○` for `<=<`.  It is easy to prove
 249 that the laws hold for any terms `j`, `k`, and `l` whose types are
 250 suitable for `⇧` and `<=<`:
 251
 252     ⇧ <=< k == I ○ k == \p. I (k p) ~~> \p. k p ~~> k
 253     k <=< ⇧ == k ○ I == \p. k (I p) ~~> \p. k p ~~> k
 254
 255     (j <=< k) <=< l == (\p. j (k p)) ○ l == \q. (\p. j (k p)) (l q) ~~> \q. j (k (l q))
 256     j <=< (k <=< l) == j ○ (k ○ l) == j ○ (\p. k (l p)) == \q. j ((\p. k (l p)) q) ~~> \q. j (k (l q))
 257
 258 1.  On a number of occasions, we've used the Option/Maybe type to make our
 259 conceptual world neat and tidy (for instance, think of [[our discussion
 260 of Kaplan's Plexy|topics/week6_plexy]]).  As we learned in class, there is a natural monad
 261 for the Option type. Using the vocabulary of OCaml, let's say that
 262 `'a option` is the type of a boxed `'a`, whatever type `'a` is.
 263 More specifically,
 264
 265         type 'a option = None | Some 'a
 266
 267     (If you have trouble keeping straight what is the OCaml terminology for this and what is the Haskell terminology, don't worry, we do too.)
 268
 269     Now the obvious singleton for the Option monad is `\p. Some p`.  Give
 270 (or reconstruct) either of the composition operators `>=>` or `<=<`.
 271 Show that your composition operator obeys the Monad Laws.
 272
 273 2.  Do the same with lists.  That is, given an arbitrary type
 274 `'a`, let the boxed type be `['a]` or `'a list`, that is, lists of values of type `'a`.  The `⇧`
 275 is the singleton function `\p. [p]`, and the composition operator is:
 276
 277         let (>=>) (j : 'a -> 'b list) (k : 'b -> 'c list) : 'a -> 'c list =
 278           fun a -> List.flatten (List.map k (j a))
 279
 280     For example:
 281
 282         let j a = [a; a+1];;
 283         let k b = [b*b; b+b];;
 284         (j >=> k) 7
 285         (* which OCaml evaluates to:
 286         - : int list = [49; 14; 64; 16]
 287         *)
 288
 289     Show that these obey the Monad Laws.