week10.mdwn

   1 [[!toc]]
   2
   3 In class on November 22, we just reviewed mutation and aliasing, especially in relation to the Groenendijk, Stokhof and Veltman paper. Below, we bring in more material. This takes the form of making gradual improvements to the calculator we developed in [week7](/reader_monad_for_variable_binding). Part of what we do below is a review of the mutation techniques developed in [[Week9]]; but we also do other things we haven't discussed in class, like defining new complex functions in our calculator.
   4
   5
   6 ##Original Calculator##
   7
   8 In a real programming application, one would usually start with a string that needs to be parsed and interpreted, such as:
   9
  10         let x = 1 in let y = x + 2 in x * y
  11
  12 The parsing phase converts this to an "abstract syntax tree" (AST), which in this case might be:
  13
  14         Let ('x', Constant 1,
  15                  Let ('y', Addition (Variable 'x', Constant 2),
  16                           Multiplication (Variable 'x', Variable 'y')))
  17
  18 Then the interpreter (or "evaluator") would convert that AST into an "expressed value": in this case, to the integer 3. We're not concerning ourselves with the parsing phase here, so we're just thinking about how to interpret expressions that are already in AST form.
  19
  20 The language we had in week 7 looked like this:
  21
  22         type term =
  23           Constant of int
  24         | Multiplication of (term * term)
  25         | Addition of (term * term)
  26         | Variable of char
  27         | Let of (char * term * term)
  28         ;;
  29
  30 and the evaluation function looked like this:
  31
  32         let rec eval (t : term) (e: (char * int) list) = match t with
  33           Constant x -> x
  34         | Multiplication (t1, t2) -> (eval t1 e) * (eval t2 e)
  35         | Addition (t1, t2) -> (eval t1 e) + (eval t2 e)
  36         | Variable c ->
  37                 (* lookup the value of c in the current environment
  38                    This will fail if c isn't assigned anything by e *)
  39                 List.assoc c e
  40         | Let (c, t1, t2) ->
  41                 (* evaluate t2 in a new environment where c has been associated
  42                    with the result of evaluating t1 in the current environment *)
  43                 eval t2 ((c, eval t1 e) :: e)
  44         ;;
  45
  46
  47 ##Adding Booleans and Pairs##
  48
  49 Let's tweak this a bit.
  50
  51 First, let's abstract away from the assumption that our terms always evaluate to `int`s. Let's suppose they evaluate to a more general type, which might have an `int` payload, or might have, for example, a `bool` payload.
  52
  53         type expressed_value = Int of int | Bool of bool;;
  54
  55 We'll add one boolean predicate, `Iszero`, and an `If...` construction.
  56
  57 Similarly, we might allow for some terms to express pairs of other terms:
  58
  59         type expressed_value = Int of int | Bool of bool | Pair of expressed_value * expressed_value;;
  60
  61 We'd then want to add the ability to construct pairs, and extract their components.
  62
  63 We won't try here to catch any type errors, such as attempts to add a `bool` to an `int`, or attempts to check whether a `bool` iszero. Neither will we try here to monadize anything: these will be implementations of a calculator with all the plumbing exposed. What we will do is add more and more features to the calculator.
  64
  65 We'll switch over to using variable `g` for assignment functions, which is a convention many of you seem familiar with. As we mentioned a few times in [[week9]], for some purposes it's easier to implement environment or assignment functions as functions from `char`s to `int`s (or whatever variables are bound to), rather than as lists of pairs. However, we'll stick with this implementation for now. We will however abstract out the type that the variables are bound to. For now, we'll suppose that they're bound to the same types that terms can express.
  66
  67         type bound_value = expressed_value;;
  68         type assignment = (char * bound_value) list;;
  69
  70 Here's where we should be now. We'll work with the language:
  71
  72         type term =
  73           Intconstant of int
  74         | Multiplication of (term * term)
  75         | Addition of (term * term)
  76         | Variable of char
  77         | Let of (char * term * term)
  78         | Iszero of term
  79         | If of (term * term * term)
  80         | Makepair of (term * term)
  81         | First of term
  82         ;;
  83
  84 Here is our evaluation function. We expand some of the clauses and rename a few variables for clarity. Our implementation should make it clear how to add additional constants or native predicates, such as a `Second` predicate for extracting the second element of a pair.
  85
  86         let rec eval (t : term) (g : assignment) = match t with
  87           Intconstant x -> Int x
  88         | Multiplication (t1, t2) ->
  89                 (* we don't handle cases where the subterms don't evaluate to Ints *)
  90                 let Int i1 = eval t1 g
  91                 in let Int i2 = eval t2 g
  92                 (* Multiplication (t1, t2) should evaluate to an Int *)
  93                 in Int (i1 * i2)
  94         | Addition (t1, t2) ->
  95                 let Int i1 = eval t1 g
  96                 in let Int i2 = eval t2 g
  97                 in Int (i1 + i2)
  98         | Variable (var) ->
  99                 (* we don't handle cases where g doesn't bind var to any value *)
 100                 List.assoc var g
 101         | Let (var_to_bind, t2, t3) ->
 102                 (* evaluate t3 under a new assignment where var_to_bind has been bound to
 103                    the result of evaluating t2 under the current assignment *)
 104                 let value2 = eval t2 g
 105                 in let g' = (var_to_bind, value2) :: g
 106                 in eval t3 g'
 107         | Iszero (t1) ->
 108                 (* we don't handle cases where t1 doesn't evaluate to an Int *)
 109                 let Int i1 = eval t1 g
 110                 (* Iszero t1 should evaluate to a Bool *)
 111                 in Bool (i1 = 0)
 112         | If (t1, t2, t3) ->
 113                 (* we don't handle cases where t1 doesn't evaluate to a boolean *)
 114                 let Bool b1 = eval t1 g
 115                 in if b1 then eval t2 g
 116                 else eval t3 g
 117         | Makepair (t1, t2) ->
 118                 let value1 = eval t1 g
 119                 in let value2 = eval t2 g
 120                 in Pair (value1, value2)
 121         | First (t1) ->
 122                 (* we don't handle cases where t1 doesn't evaluate to a Pair *)
 123                 let Pair (value1, value2) = eval t1 g
 124                 in value1
 125         ;;
 126
 127 The complete code is available [here](/code/calculator/calc1.ml).
 128
 129 ##Adding Function Values##
 130
 131 Now we want to add function values to our language, so that we can interpret (the abstract syntax trees of) expressions like this:
 132
 133         let x = 1 in let f = lambda y -> y + x in apply f 2
 134
 135 What changes do we need to handle this?
 136
 137 We can begin with our language:
 138
 139         type term =
 140           Intconstant of int
 141         | Multiplication of (term * term)
 142         | Addition of (term * term)
 143         | Variable of char
 144         | Let of (char * term * term)
 145         | Iszero of term
 146         | If of (term * term * term)
 147         | Makepair of (term * term)
 148         | First of term
 149         | Lambda of (char * term)
 150         | Apply of (term * term)
 151         ;;
 152
 153 Next, we need to expand our stock of `expressed_value`s to include function values as well. How should we think of these? We've several times mentioned the issue of how to handle free variables in a function's body, like the `x` in `lambda y -> y + x`. We'll follow the usual functional programming standard for these (known as "lexical scoping"), which keeps track of what value `x` has in the function declaration's lexical environment. That shouldn't get shadowed by any different value `x` may have when the function value is later applied. So:
 154
 155         let x = 1 in let f = lambda y -> y + x in let x = 2 in apply f 2
 156
 157 should evaluate to `3` not to `4`. To handle this, the function values we construct need to keep track of the present values of all free variables in the function's body. The combination of the function's body and the values of its free variables is called a "function closure." We'll implement these closures in a straightforward though inefficient way: we'll just stash away a copy of the assignment in effect when the function value is being constructed. Our function values also need to keep track of which of their variables are to be bound to the arguments they get applied to. All together, then, we need three pieces of information: which variables are to be bound to arguments, what the function's body is, and something that keeps track of the right values for the free variables in the function body. We'll pack this all together into an additional variant for our `expressed_value` type:
 158
 159         type expressed_value = Int of int | Bool of bool | Pair of expressed_value * expressed_value | Closure of char * term * assignment;;
 160
 161 We'd like to define `bound_value`s and `assignment`s just as before:
 162
 163         type bound_value = expressed_value;;
 164         type assignment = (char * bound_value) list;;
 165
 166 However, note that we have a recursive relation between these types: `expressed_value` is defined partly in terms of `assignment`, which is defined partly in terms of `bound_value`, which is equivalent to `expressed_value`. In OCaml one has to define such types using the following form:
 167
 168         type bound_value = expressed_value
 169         and assignment = (char * bound_value) list
 170         and expressed_value = Int of int | Bool of bool | Pair of expressed_value * expressed_value | Closure of char * term * assignment;;
 171
 172 Now our evaluation function needs two further clauses to interpret the two new expression forms `Lambda (...)` and `Apply (...)`:
 173
 174         let rec eval (t : term) (g : assignment) = match t with
 175         ...
 176         | Lambda (arg_var, t2) -> Closure (arg_var, t2, g)
 177         | Apply (t1, t2) ->
 178                 (* we don't handle cases where t1 doesn't evaluate to a function value *)
 179                 let Closure (arg_var, body, savedg) = eval t1 g
 180                 in let value2 = eval t2 g
 181                 (* evaluate body under savedg, except with arg_var bound to value2 *)
 182                 in let savedg' = (arg_var, value2) :: savedg
 183                 in eval body savedg'
 184         ;;
 185
 186 The complete code is available [here](/code/calculator/calc2.ml).
 187
 188 ##Adding Recursive Functions##
 189
 190 There are different ways to include recursion in our calculator. First, let's imagine our language expanded like this:
 191
 192         let x = 1 in letrec f = lambda y -> if iszero y then x else y * apply f (y - 1) in apply f 3
 193
 194 where the AST would be:
 195
 196         Let ('x', Intconstant 1,
 197                 Letrec ('f',
 198                         Lambda ('y',
 199                                 If (Iszero (Variable 'y'),
 200                                         Variable 'x',
 201                                         Multiplication (Variable 'y',
 202                                                 Apply (Variable 'f',
 203                                                         Addition (Variable 'y', Intconstant (-1)))))),
 204                         Apply (Variable 'f', Intconstant 3)))
 205
 206 Here is the expanded definition for our language type:
 207
 208         type term =
 209           Intconstant of int
 210         | Multiplication of (term * term)
 211         | Addition of (term * term)
 212         | Variable of char
 213         | Let of (char * term * term)
 214         | Iszero of term
 215         | If of (term * term * term)
 216         | Makepair of (term * term)
 217         | First of term
 218         | Lambda of (char * term)
 219         | Apply of (term * term)
 220         | Letrec of (char * term * term)
 221         ;;
 222
 223 Now consider what we'll need to do when evaluating a term like `Letrec ('f', Lambda (...), t2)`. The subterm `Lambda (...)` will evaluate to something of the form `Closure ('y', body, savedg)`, where `Variable 'f'` may occur free in `body`. What we'll want to do is to ensure that when `body` is applied, it's applied using not the assignment `savedg` but a modified assignment `savedg'` which binds `'f'` to this very function value. That is, we want to bind `'f'` not to:
 224
 225         Closure ('y', body, savedg)
 226
 227 but instead to:
 228
 229         let orig_closure = Closure ('y', body, savedg)
 230         in let savedg' = ('f', orig_closure) :: savedg
 231         in let new_closure = Closure ('y', body, savedg')
 232         in new_closure
 233
 234 Except, this isn't quite right. It's almost what we want, but not exactly. Can you see the flaw?
 235
 236 The flaw is this: inside `new_closure`, what is `'f'` bound to? It's bound by `savedg'` to `orig_closure`, which in turn leaves `'f'` free (or bound to whatever existing value it had according to `savedg`). This isn't what we want. It'll break if we need to make applications of `Variable 'f'` which recurse more than once.
 237
 238 What we really want is for `'f'` to be bound to `new_closure`, something like this:
 239
 240         let rec new_closure = Closure ('y', body, ('f', new_closure) :: savedg)
 241         in new_closure
 242
 243 And as a matter of fact, OCaml *does* permit us to recursively define cyclical lists in this way. So a minimal change to our evaluation function would suffice:
 244
 245         let rec eval (t : term) (g : assignment) = match t with
 246         ...
 247         | Letrec (var_to_bind, t2, t3) ->
 248                 (* we don't handle cases where t2 doesn't evaluate to a function value *)
 249                 let Closure (arg_var, body, savedg) = eval t2 g
 250         in let rec new_closure = Closure (arg_var, body, (var_to_bind, new_closure) :: savedg)
 251         in let g' = (var_to_bind, new_closure) :: g
 252                 in eval t3 g'
 253         ;;
 254
 255 However, this is a somewhat exotic ability in a programming language, so it would be good to work out how to interpret `Letrec (...)` forms without relying on it.
 256
 257 If we implemented assignments as functions rather than as lists of pairs, the corresponding move would be less exotic. In that case, our `Let (...)` and `Letrec (...)` clauses would look something like this:
 258
 259
 260         | Let (var_to_bind, t2, t3) ->
 261                 let value2 = eval t2 g
 262                 in let g' = fun var -> if var = var_to_bind then value2 else g var
 263                 in eval t3 g'
 264         ...
 265         | Letrec (var_to_bind, t2, t3) ->
 266                 let Closure (arg_var, body, savedg) = eval t2 g
 267                 in let rec savedg' = fun var -> if var = var_to_bind then Closure (arg_var, body, savedg') else savedg var
 268                 in let g' = fun var -> if var = var_to_bind then Closure (arg_var, body, savedg') else g var
 269                 in eval t3 g'
 270         ;;
 271
 272 and this is just a run-of-the-mill use of recursive functions. However, for this exercise we'll continue using lists of pairs, and work out how to interpret `Letrec (...)` forms using them.
 273
 274 The way we'll do this is that, when we bind a variable to a value, we'll keep track of whether the term was bound via `Let` or `Letrec`. We'll rely on that to interpret pairs of terms like these differently:
 275
 276         Let ('f',
 277                 Intconstant 1,
 278                 Let ('f', Lambda ('y', Variable 'f')),
 279                 ...)
 280
 281         Let ('f',
 282                 Intconstant 1,
 283                 Letrec ('f', Lambda ('y', Variable 'f')),
 284                 ...)
 285
 286 In the first case, an application of `Variable 'f'` to any argument should evaluate to `Int 1`; in the second case, it should evaluate to the same function closure that `Variable 'f'` evaluates to. We'll keep track of which way a variable was bound by expanding our `bound_value` type:
 287
 288         type bound_value = Nonrecursive of expressed_value |
 289                 Recursive_Closure of char * char * term * assignment
 290         and assignment = (char * bound_value) list
 291         and expressed_value = Int of int | Bool of bool | Pair of expressed_value * expressed_value | Closure of char * term * assignment;;
 292
 293
 294 Since we're not permitting ourselves OCaml's ability to recursively define cyclical lists, we're not going to be able to update the saved assignment in a closure when that closure is recursively bound to a variable. Instead, we'll just make a note that variable `'f'` is supposed to be the recursively bound one---by binding it not to `Nonrecursive (Closure (arg_var, body, savedg))` but rather to `Recursive_Closure ('f', arg_var, body, savedg)`. We'll do the work to make the saved assignment recursive in the right way *later*, when we *evaluate* `Variable 'f'`. The result will look like this:
 295
 296         let rec eval (t : term) (g : assignment) = match t with
 297         ...
 298         | Variable (var) -> (
 299                 (* we don't handle cases where g doesn't bind var to any value *)
 300                 match List.assoc var g with
 301           | Nonrecursive value -> value
 302           | Recursive_Closure (self_var, arg_var, body, savedg) as rec_closure ->
 303                           (* we update savedg to bind self_var to rec_closure here *)
 304               let savedg' = (self_var, rec_closure) :: savedg
 305               in Closure (arg_var, body, savedg')
 306         )
 307         | Let (var_to_bind, t2, t3) ->
 308                 (* evaluate t3 under a new assignment where var_to_bind has been bound to
 309            the result of evaluating t2 under the current assignment *)
 310                 let value2 = eval t2 g
 311                 (* we have to wrap value2 in Nonrecursive *)
 312                 in let g' = (var_to_bind, Nonrecursive value2) :: g
 313                 in eval t3 g'
 314         ...
 315         | Lambda (arg_var, t2) -> Closure (arg_var, t2, g)
 316         | Apply (t1, t2) ->
 317                 (* we don't handle cases where t1 doesn't evaluate to a function value *)
 318                 let Closure (arg_var, body, savedg) = eval t1 g
 319                 in let value2 = eval t2 g
 320                 (* evaluate body under savedg, except with arg_var bound to Nonrecursive value2 *)
 321                 in let savedg' = (arg_var, Nonrecursive value2) :: savedg
 322                 in eval body savedg'
 323         | Letrec (var_to_bind, t2, t3) ->
 324                 (* we don't handle cases where t2 doesn't evaluate to a function value *)
 325                 let Closure (arg_var, body, savedg) = eval t2 g
 326         (* evaluate t3 under a new assignment where var_to_bind has been recursively bound to that function value *)
 327                 in let g' = (var_to_bind, Recursive_Closure (var_to_bind, arg_var, body, savedg)) :: g
 328                 in eval t3 g'
 329         ;;
 330
 331 The complete code is available [here](/code/calculator/calc3.ml).
 332
 333 ##Adding Mutable Cells##
 334
 335 Next, we'll add mutable cells (explicit-style mutation) to our calculator, as we did in [[week9]].
 336
 337 We'll add a few more syntactic forms to the language:
 338
 339         type term =
 340           Intconstant of int
 341         | Multiplication of (term * term)
 342         | Addition of (term * term)
 343         | Variable of char
 344         | Let of (char * term * term)
 345         | Iszero of term
 346         | If of (term * term * term)
 347         | Makepair of (term * term)
 348         | First of term
 349         | Lambda of (char * term)
 350         | Apply of (term * term)
 351         | Letrec of (char * term * term)
 352         | Newref of term
 353         | Deref of term
 354         | Setref of (term * term)
 355         ;;
 356
 357 And we now have to allow for `Mutcell`s as an additional kind of `expressed_value`. These are implemented as wrappers around an index into a `store`:
 358
 359         type index = int;;
 360
 361         type bound_value = Nonrecursive of expressed_value |
 362                 Recursive_Closure of char * char * term * assignment
 363         and assignment = (char * bound_value) list
 364         and expressed_value = Int of int | Bool of bool | Pair of expressed_value * expressed_value | Closure of char * term * assignment | Mutcell of index;;
 365
 366         type store = expressed_value list;;
 367
 368 Our evaluation function will now expect a `store` argument as well as an `assignment`, and will return an `expressed_value * store` pair:
 369
 370         let rec eval (t : term) (g : assignment) (s : store) = match t with
 371           Intconstant x -> (Int x, s)
 372           ...
 373         | Variable (var) -> ((
 374                 (* we don't handle cases where g doesn't bind var to any value *)
 375                 match List.assoc var g with
 376           | Nonrecursive value -> value
 377           | Recursive_Closure (self_var, arg_var, body, savedg) as rec_closure ->
 378                           (* we update savedg to bind self_var to rec_closure here *)
 379               let savedg' = (self_var, rec_closure) :: savedg
 380               in Closure (arg_var, body, savedg')
 381         ), s)
 382           ...
 383         | Lambda (arg_var, t2) -> (Closure (arg_var, t2, g), s)
 384           ...
 385
 386 also, we'll need to be sure to thread the store argument through the evaluation of any subterms, as here:
 387
 388         ...
 389         | Multiplication (t1, t2) ->
 390                 (* we don't handle cases where the subterms don't evaluate to Ints *)
 391                 let (Int i1, s') = eval t1 g s
 392                 in let (Int i2, s'') = eval t2 g s'
 393                 (* Multiplication (t1, t2) should evaluate to an Int *)
 394                 in (Int (i1 * i2), s'')
 395         | Addition (t1, t2) ->
 396                 let (Int i1, s') = eval t1 g s
 397                 in let (Int i2, s'') = eval t2 g s'
 398                 in (Int (i1 + i2), s'')
 399         ...
 400         | Let (var_to_bind, t2, t3) ->
 401                 (* evaluate t3 under a new assignment where var_to_bind has been bound to
 402            the result of evaluating t2 under the current assignment *)
 403                 let (value2, s') = eval t2 g s
 404                 (* we have to wrap value2 in Nonrecursive *)
 405                 in let g' = (var_to_bind, Nonrecursive value2) :: g
 406                 in eval t3 g' s'
 407         | Iszero (t1) ->
 408                 (* we don't handle cases where t1 doesn't evaluate to an Int *)
 409                 let (Int i1, s') = eval t1 g s
 410                 (* Iszero t1 should evaluate to a Bool *)
 411                 in (Bool (i1 = 0), s')
 412         ...
 413         | Makepair (t1, t2) ->
 414                 let (value1, s') = eval t1 g s
 415                 in let (value2, s'') = eval t2 g s'
 416                 in (Pair (value1, value2), s'')
 417         | First (t1) ->
 418                 (* we don't handle cases where t1 doesn't evaluate to a Pair *)
 419                 let (Pair (value1, value2), s') = eval t1 g s
 420                 in (value1, s')
 421         ...
 422         | Apply (t1, t2) ->
 423                 (* we don't handle cases where t1 doesn't evaluate to a function value *)
 424                 let (Closure (arg_var, body, savedg), s') = eval t1 g s
 425                 in let (value2, s'') = eval t2 g s'
 426                 (* evaluate body under savedg, except with arg_var bound to Nonrecursive value2 *)
 427                 in let savedg' = (arg_var, Nonrecursive value2) :: savedg
 428                 in eval body savedg' s''
 429         | Letrec (var_to_bind, t2, t3) ->
 430                 (* we don't handle cases where t2 doesn't evaluate to a function value *)
 431                 let (Closure (arg_var, body, savedg), s') = eval t2 g s
 432         (* evaluate t3 under a new assignment where var_to_bind has been recursively bound to that function value *)
 433                 in let g' = (var_to_bind, Recursive_Closure (var_to_bind, arg_var, body, savedg)) :: g
 434                 in eval t3 g' s'
 435         ...
 436
 437 The clause for `If (...)` is notable:
 438
 439         ...
 440         | If (t1, t2, t3) ->
 441                 (* we don't handle cases where t1 doesn't evaluate to a boolean *)
 442                 let (Bool b1, s') = eval t1 g s
 443         (* note we thread s' through only one of the then/else clauses *)
 444                 in if b1 then eval t2 g s'
 445                 else eval t3 g s'
 446         ...
 447
 448 Now we need to formulate the clauses for evaluating the new forms `Newref (...)`, `Deref (...)`, and `Setref (...)`.
 449
 450         ...
 451         | Newref (t1) ->
 452                 let (value1, s') = eval t1 g s
 453                 (* note that s' may be different from s, if t1 itself contained any mutation operations *)
 454                 (* now we want to retrieve the next free index in s' *)
 455                 in let new_index = List.length s'
 456                 (* now we want to insert value1 there; the following is an easy but inefficient way to do it *)
 457                 in let s'' = List.append s' [value1]
 458                 (* now we return a pair of a wrapped new_index, and the new store *)
 459                 in (Mutcell new_index, s'')
 460         | Deref (t1) ->
 461                 (* we don't handle cases where t1 doesn't evaluate to a Mutcell *)
 462                 let (Mutcell index1, s') = eval t1 g s
 463                 (* note that s' may be different from s, if t1 itself contained any mutation operations *)
 464                 in (List.nth s' index1, s')
 465         | Setref (t1, t2) ->
 466                 (* we don't handle cases where t1 doesn't evaluate to a Mutcell *)
 467                 let (Mutcell index1, s') = eval t1 g s
 468                 (* note that s' may be different from s, if t1 itself contained any mutation operations *)
 469                 in let (value2, s'') = eval t2 g s'
 470                 (* now we create a list which is just like s'' except it has value2 in index1 *)
 471                 in let rec replace_nth lst m =
 472                         match lst with
 473                         | [] -> failwith "list too short"
 474                         | x::xs when m = 0 -> value2 :: xs
 475                         | x::xs -> x :: replace_nth xs (m - 1)
 476                 in let s''' = replace_nth s'' index1
 477                 (* we'll arbitrarily return Int 42 as the expressed_value of a Setref operation *)
 478                 in (Int 42, s''')
 479         ;;
 480
 481 The complete code is available [here](/code/calculator/calc4.ml).
 482
 483 ##Adding Mutable Pairs##
 484
 485 Suppose we wanted to work with pairs where we could mutate either component of the pair. (If you've got a copy of *The Seasoned Schemer*, which we recommended for the seminar, see the discussion of mutable lists at pp. 143-153.)
 486
 487 Well, we've already given ourselves pairs, and mutable cells, so we could just work here with pairs of mutable cells. But it might sometimes be more wieldy to work with a structure that fused these two structures together, to give us a mutable pair. With the mutable pair, we wouldn't ask for the first element, and then apply `Deref` to it to get the value it then temporarily contains. Instead, asking for the first element would *constitute* asking for the value the mutable pair then temporarily contains in its first position.
 488
 489 This means a mutable pair is an interesting hybrid between explicit-style and implicit-style mutation. Looked at one way, it's just a generalization of an explicit mutable cell: it's just that where the mutable cells we implemented before were boxes with only one position, now we have boxes with two positions. Looked at another way, though, mutable pairs are similar to implicit-style mutation: for we don't have separate ways of referring to the first position of the mutable pair, and its dereferenced value. Peeking at the first position *just will be* peeking at its current dereferenced value.
 490
 491 To keep our codebase smaller, we'll implement mutable pairs instead of, not in addition to, the mutable cells from the previous section. Also, we'll leave out the immutable pairs we've been working with up to this point; in this implementation, all pairs will be mutable.
 492
 493 This implementation will largely parallel the previous one. Here are the differences. First, we remove the `Newref`, `Deref`, and `Setref` forms from the language. Our existing form `Makepair` will serve to create mutable pairs, and so will take over a role analogous to `Newref`. Our existing form `First` will take over a role analogous to `Deref`. We'll introduce one new form `Setfirst` that will take over a role analogous to `Setref`:
 494
 495         type term =
 496           Intconstant of int
 497         | Multiplication of (term * term)
 498         | Addition of (term * term)
 499         | Variable of char
 500         | Let of (char * term * term)
 501         | Iszero of term
 502         | If of (term * term * term)
 503         | Makepair of (term * term)
 504         | First of term
 505         | Lambda of (char * term)
 506         | Apply of (term * term)
 507         | Letrec of (char * term * term)
 508         | Setfirst of (term * term)
 509         ;;
 510
 511 Our `expressed_value` type changes in two ways: first, we eliminate the `Mutcell` variant added in the previous implementation. Instead, we now have our `Pair` variant wrap `index`es into the `store`:
 512
 513         type index = int;;
 514
 515         type bound_value = Nonrecursive of expressed_value |
 516                 Recursive_Closure of char * char * term * assignment
 517         and assignment = (char * bound_value) list
 518         and expressed_value = Int of int | Bool of bool | Pair of index * index | Closure of char * term * assignment;;
 519
 520         type store = expressed_value list;;
 521
 522 Finally, here are the changed or added clauses to the evaluation function:
 523
 524         let rec eval (t : term) (g : assignment) (s : store) = match t with
 525         ...
 526         | Makepair (t1, t2) ->
 527                 let (value1, s') = eval t1 g s
 528                 in let (value2, s'') = eval t2 g s'
 529         (* now we want to retrieve the next free index in s'' *)
 530         in let new_index = List.length s''
 531         (* now we want to insert value1 and value2 there; the following is an easy but inefficient way to do it *)
 532         in let s''' = List.append s'' [value1; value2]
 533                 in (Pair (new_index, new_index + 1), s''')
 534         | First (t1) ->
 535                 (* we don't handle cases where t1 doesn't evaluate to a Pair *)
 536                 let (Pair (index1, index2), s') = eval t1 g s
 537         (* note that s' may be different from s, if t1 itself contained any mutation operations *)
 538         in (List.nth s' index1, s')
 539         ...
 540     | Setfirst (t1, t2) ->
 541         (* we don't handle cases where t1 doesn't evaluate to a Pair *)
 542         let (Pair (index1, index2), s') = eval t1 g s
 543         (* note that s' may be different from s, if t1 itself contained any mutation operations *)
 544         in let (value2, s'') = eval t2 g s'
 545         (* now we create a list which is just like s'' except it has value2 in index1 *)
 546         in let rec replace_nth lst m =
 547             match lst with
 548             | [] -> failwith "list too short"
 549             | x::xs when m = 0 -> value2 :: xs
 550             | x::xs -> x :: replace_nth xs (m - 1)
 551         in let s''' = replace_nth s'' index1
 552         in (Int 42, s''')
 553         ;;
 554
 555 Compare these to the clauses for `Newref`, `Deref`, and `Setref` in the previous implementation.
 556
 557 The complete code is available [here](/code/calculator/calc5.ml).
 558
 559 ##Adding Implicit Mutation##
 560
 561 Next we implement implicit-style mutation, as we did in [[week9]]. Here we don't have any explicit reference cells or mutable pairs; we'll return pairs back to their original immutable form. Instead, all variables will have mutable bindings. New reference cells will be implicitly introduced by the `Let` form. They'll also be implicitly introduced by the `Apply` form---we didn't have function values on the table during the [[week9]] discussion, so this didn't come up then. The reason we introduce new reference cells when `Apply`ing a function value to arguments is that we don't want mutation of those arguments inside the body of the function to propagate out and affect the reference cell that may have supplied the argument. When we call functions in this implementation, we just want to supply them with *values*, not with the reference cells we may be drawing those values from. Below, after we discuss *aliases*, we'll consider another strategy, where function bodies are given the ability to mutate the reference cells implicitly associated with the arguments they're supplied.
 562
 563 Our language for the present implementation will be the language for the calculator with recursive functions, with one added syntactic form, `Change (...)`:
 564
 565         type term =
 566           Intconstant of int
 567         | Multiplication of (term * term)
 568         | Addition of (term * term)
 569         | Variable of char
 570         | Let of (char * term * term)
 571         | Iszero of term
 572         | If of (term * term * term)
 573         | Makepair of (term * term)
 574         | First of term
 575         | Lambda of (char * term)
 576         | Apply of (term * term)
 577         | Letrec of (char * term * term)
 578         | Change of (char * term * term)
 579         ;;
 580
 581 In the present implementation, we separate the roles of the `bound_value` and `expressed_value` types. As we discussed in [[week9]], our assignment will bind all variables to indexes in the store, and the latter will contain the `expressed_value`s that the variables evaluate to. A consequence of this is that our definitions of the `bound_value` and `expressed_value` types no longer need to be mutually recursive:
 582
 583         type index = int;;
 584
 585         type bound_value = index;;
 586         type assignment = (char * bound_value) list;;
 587         type expressed_value = Int of int | Bool of bool | Pair of expressed_value * expressed_value | Closure of char * term * assignment;;
 588
 589         type store = expressed_value list;;
 590
 591 Our evaluation function still interacts with a `store` argument in much the same way it did with explicit-style mutation. The clause for `Variable (...)` works differently, because all `expressed_value`s now need to be retrieved from the `store`:
 592
 593         let rec eval (t : term) (g : assignment) (s : store) = match t with
 594         ...
 595         | Variable (var) ->
 596                 (* we don't handle cases where g doesn't bind var to any value *)
 597                 let index = List.assoc var g
 598                 (* get value stored at location index in s *)
 599                 in let value = List.nth s index
 600                 in (value, s)
 601         ...
 602
 603 So this clause takes over the roles that were separately played by `Variable` and `Deref` in the calculator with mutable cells. The role played by `Newref` is absorbed into `Let`, `Letrec`, and `Apply`:
 604
 605         ...
 606         | Let (var_to_bind, t2, t3) ->
 607                 let (value2, s') = eval t2 g s
 608                 (* note that s' may be different from s, if t2 itself contained any mutation operations *)
 609                 (* get next free index in s' *)
 610                 in let new_index = List.length s'
 611                 (* now we want to insert value2 there; the following is an easy but inefficient way to do it *)
 612                 in let s'' = List.append s' [value2]
 613                 (* bind var_to_bind to location new_index in the store *)
 614                 in let g' = ((var_to_bind, new_index) :: g)
 615                 in eval t3 g' s''
 616         ...
 617         | Apply (t1, t2) ->
 618                 (* we don't handle cases where t1 doesn't evaluate to a function value *)
 619                 let (Closure (arg_var, body, savedg), s') = eval t1 g s
 620                 in let (value2, s'') = eval t2 g s'
 621                 (* evaluate body under savedg, except with arg_var bound to a new location containing value2 *)
 622                 in let new_index = List.length s''
 623                 in let s''' = List.append s'' [value2]
 624                 in let savedg' = (arg_var, new_index) :: savedg
 625                 in eval body savedg' s'''
 626         ...
 627
 628 `Letrec` requires some reworking from what we had before. Earlier, we resorted to a `Recursive_Closure` variant on `bound_value`s because it gave us a non-exotic way to update the `savedg` component of a `Closure` to refer to a `new_closure` that contained that very updated `savedg`. Now that we we've got a mutation-supporting infrastructure in place, we can do this directly, without needing the unwieldy `Recursive_Closure` wrapper:
 629
 630         ...
 631         | Letrec (var_to_bind, t2, t3) ->
 632                 (* we don't handle cases where t2 doesn't evaluate to a function value *)
 633                 let (Closure (arg_var, body, savedg), s') = eval t2 g s
 634                 in let new_index = List.length s'
 635                 in let savedg' = (var_to_bind, new_index) :: savedg
 636                 in let new_closure = Closure (arg_var, body, savedg')
 637                 in let s'' = List.append s' [new_closure]
 638                 in let g' = (var_to_bind, new_index) :: g
 639                 in eval t3 g' s''
 640         ...
 641
 642 Finally, here is the clause for `Change (...)`, which takes over the role earlier played by `Setref`:
 643
 644         ...
 645         | Change (var, t2, t3) ->
 646                 (* we don't handle cases where g doesn't bind var to any value *)
 647                 let index = List.assoc var g
 648                 in let (value2, s') = eval t2 g s
 649                 (* note that s' may be different from s, if t2 itself contained any mutation operations *)
 650                 (* now we create a list which is just like s' except it has value2 at index *)
 651                 in let rec replace_nth lst m =
 652                         match lst with
 653                         | [] -> failwith "list too short"
 654                         | x::xs when m = 0 -> value2 :: xs
 655                         | x::xs -> x :: replace_nth xs (m - 1)
 656                 in let s'' = replace_nth s' index
 657                 (* evaluate t3 using original assignment function and new store *)
 658                 in eval t3 g s''
 659         ;;
 660
 661 Note that because the `savedg` component of a `Closure` keeps track of which `index`es in the store---rather than which values---free variables were bound to, the values at those `index`es can later be changed, and later applications of the `Closure` will use the changed values.
 662
 663 The complete code is available [here](/code/calculator/calc6.ml).
 664
 665 ##Adding Aliasing and Passing by Reference##
 666
 667 Next we'll add aliasing as described at the end of [[week9]]. We'll also add the ability to pass (implicit) reference cells as arguments to a function, which lets changes made within the function body be effective in the outside environment. When we discussed this in [[week9]], we proposed a different syntactic form for the function values that get called in this way. Instead of:
 668
 669         let f = lambda (y) -> ...
 670         ...
 671         in f x
 672
 673 one would write:
 674
 675         let f = lambda (alias y) -> ...
 676         ...
 677         in f x
 678
 679 Real programming languages that have this ability, such as C++, do something analagous. Here the function is declared so that *all* of its applications are expected to alias the supplied argument. You can always work around that in a particular case, though, like this:
 680
 681         let f = lambda (alias y) -> ...
 682         ...
 683         in let y = x ; creates new (implicit) reference cell with x's value
 684         in f y
 685
 686 In our present framework, it will be easier to do things differently. We will
 687 introduce a new syntactic form at the location where a function value is
 688 applied, rather than in the function's declaration. We say:
 689
 690         Let ('f',
 691                 Lambda ('y', ...),
 692                 ...
 693                 Apply(Variable 'f', Variable 'x')...)
 694
 695 for the familiar, passing-by-value behavior, and will instead say:
 696
 697         Let ('f',
 698                 Lambda ('y', ...),
 699                 ...
 700                 Applyalias(Variable 'f', 'x')...)
 701
 702 for the proposed new, passing-by-reference behavior. (Besides being easier to implement here, this strategy also has the advantage of more closely aligning with the formal system Jim discusses in his "Hyper-evaluativity" paper.) Note that the second parameter to the `Applyalias` form is just `'x'`, not `Variable 'x'`. This is because (1) only variables are acceptable there, not arbitrary expressions, and (2) we don't need at that point to compute the variable's present value.
 703
 704 Here is our expanded language:
 705
 706         type term =
 707           Intconstant of int
 708         | Multiplication of (term * term)
 709         | Addition of (term * term)
 710         | Variable of char
 711         | Let of (char * term * term)
 712         | Iszero of term
 713         | If of (term * term * term)
 714         | Makepair of (term * term)
 715         | First of term
 716         | Lambda of (char * term)
 717         | Apply of (term * term)
 718         | Letrec of (char * term * term)
 719         | Change of (char * term * term)
 720         | Alias of (char * char * term)
 721         | Applyalias of (term * char)
 722         ;;
 723
 724 The definitions of `index`, `bound_value`, `assignment`, `expressed_value`, and `store` can remain as they were in the implementation of implicit-style mutation. Here are the changes to our evaluation function:
 725
 726         let rec eval (t : term) (g : assignment) (s : store) = match t with
 727         ...
 728         | Alias (var_to_bind, orig_var, t3) ->
 729                 (* we don't handle cases where g doesn't bind orig_var to any value *)
 730                 let index = List.assoc orig_var g
 731                 (* bind var_to_bind to the same index in the store *)
 732                 in let g' = ((var_to_bind, index) :: g)
 733                 in eval t3 g' s
 734         | Applyalias (t1, var) ->
 735                 (* we don't handle cases where t1 doesn't evaluate to a function value *)
 736                 let (Closure (arg_var, body, savedg), s') = eval t1 g s
 737                 (* we don't handle cases where g doesn't bind var to any value *)
 738                 in let index = List.assoc var g
 739                 (* evaluate body under savedg, except with arg_var bound to existing index *)
 740                 in let savedg' = (arg_var, index) :: savedg
 741                 in eval body savedg' s'
 742         ;;
 743
 744 The complete code is available [here](/code/calculator/calc7.ml).
 745