various cleanup

[lambda.git] / reader_monad.mdwn
diff --git a/reader_monad.mdwn b/reader_monad.mdwn

index ef049d5..784d0a4 100644 (file)
--- a/reader_monad.mdwn
+++ b/reader_monad.mdwn
@@ -7,7 +7,7 @@ Let's step back and consider the semantics we've assumed so far for our lambda c
  
  where `M {x <- N}` is the result of substituting N in for free occurrences of `x` in `M`.
  
-What do I mean by calling this a "semantics"? Wasn't this instead part of our proof-theory? Haven't we been neglected to discuss any model theory for the lambda calculus?
+What do I mean by calling this a "semantics"? Wasn't this instead part of our proof-theory? Haven't we neglected to discuss any model theory for the lambda calculus?
  
  Well, yes, that's right, we haven't much discussed what computer scientists call *denotational semantics* for the lambda calculus. That's what philosophers and linguists tend to think of as "semantics."
  
@@ -21,7 +21,7 @@ deriving what a formula's denotation is. But it's not necessary to think of the
  
  In any case, the lambda evaluator we use on our website does evaluate expressions using the kind of operational semantics described above. We can call that a "substitution-based" semantics.
  
-Let's consider a different kind of operational semantics. Instead of substituting `N` in for `x`, why don't we keep some extra data-structure on the side, where we note that `x` should now be considered to evaluate to whatever `N` does? In computer science jargon, such a data-structure is called an **environment**. Philosophers and linguists would tend to call it an **assignment** (though there are some subtleties about whether these are the same thing, which we'll discuss).
+Let's consider a different kind of operational semantics. Instead of substituting `N` in for `x`, why don't we keep some extra data-structure on the side, where we note that `x` should now be understood to evaluate to whatever `N` does? In computer science jargon, such a data-structure is called an **environment**. Philosophers and linguists would tend to call it an **assignment** (though there are some subtleties about whether these are the same thing, which we'll discuss).
  
  [Often in computer science settings, the lambda expression to be evaluated is first translated into **de Bruijn notation**, which we discussed in the first week of term. That is, instead of:
  
@@ -46,7 +46,7 @@ Now with the environment-based semantics, instead of evaluating terms using subs
  
  we'll instead evaluate them by manipulating their environment. (To intepret `(\x. M) N` in environment `e`, we'll interpret `M` in an environment like `e {x:= N}` where `x` may have been changed to now be assigned to `N`.
  
-A few comments. First, what the environment associates with the variable `x` will be expressions of the lambda calculus we're evaluating. If we understand the evaluation to be call-by-name, these expressions may be complexes of the form `N P`. If on the other hand, we understand the evaluation to be call-by-value, then these expressions will always be fully evaluated before being inserted into the environment. That means they'll never be of the form `N P`; but onlu of the form `y` or `\y. P`. The latter kinds of expressions are called "values." But "values" here are just certain kinds of expressions. (They could be the denotations of lambda expressions, if one thinks of the lambda expressions as all having (preferably) normal-form lambda terms as their denotations.)
+A few comments. First, what the environment associates with the variable `x` will be expressions of the lambda calculus we're evaluating. If we understand the evaluation to be call-by-name, these expressions may be complexes of the form `N P`. If on the other hand, we understand the evaluation to be call-by-value, then these expressions will always be fully evaluated before being inserted into the environment. That means they'll never be of the form `N P`; but only of the form `y` or `\y. P`. The latter kinds of expressions are called "values." But "values" here are just certain kinds of expressions. (They *could* be the denotations of lambda expressions, if one thinks of the lambda expressions as all having normal-form lambda terms as their denotations, when possible.)
  
  Second, there is a subtlety here we haven't yet discussed. Consider what should be the result of this:
  
@@ -66,11 +66,11 @@ operational semantics for the lambda calculus, or the underpinnings of how Schem
  
  With these ideas settled, then, we can present an environment-based operational semantics for the untyped lambda calculus. Here is a call-by-value version, which assumes that expressions are always fully evaluated before being inserted into the environment.
  
-1.     When `e` assigns some term `v` to `x`, then `x`reduces in environment `e` to `v`. We write that as: `(e |- x) ==> v`.
+1.     When `e` assigns some term `v` to `x`, then `x`fully (that is, terminally) reduces in environment `e` to `v`. We write that as: `(e |- x) ==> v`.
  
  2.     `(e |- \x. M) ==> closure(e, \x. M)`, where a closure is some data-structure (it might just be a pair of the data-structure `e` and the formula `\x. M`).
  
-3.     If `(e |- M) ==> closure(e2, \x. M2)` and `(e |- N) ==> v` and `(e2 {x:=v} |- M2) ==> u`, then `(e |- M N) ==> u`. Here `e {x:=v}` is the environment which is just like `e` except it assigns `v` to `x`.
+3.     If `(e |- M) ==> closure(e2, \x. M2)` and `(e |- N) ==> v` and `(e2 {x:=v} |- M2) ==> u`, then `(e |- M N) ==> u`. Here `e2 {x:=v}` is the environment which is just like `e2` except it assigns `v` to `x`.
  
  
  Explicitly manipulating the environment
@@ -80,7 +80,11 @@ In the machinery we just discussed, the environment handling was no part of the
  
  For example, a common programming exercise when students are learning languages like OCaml is to implement a simple arithmetic calculator. You'll suppose you're given some expressions of the following type:
  
-       type term = Constant of int | Multiplication of (term * term) | Addition of (term*term) | Variable of char | Let of (char*term*term);;
+       type term = Constant of int
+               | Multiplication of (term * term)
+               | Addition of (term*term)
+               | Variable of char
+               | Let of (char*term*term);;
  
  and then you'd evaluate it something like this:
  
@@ -91,9 +95,11 @@ and then you'd evaluate it something like this:
                 | Variable c -> ... (* we'll come back to this *)
                 | Let (c,t1,t2) -> ... (* this too *)
  
-With that in hand, you could then evaluate complex terms like `Addition(Constant 1, Multiplication(2, 3))`.
+With that in hand, you could then evaluate complex terms like `Addition(Constant 1, Multiplication(Constant 2, Constant 3))`.
  
-But then how should you evaluate terms like `Let('x',Constant 1,Addition(Variable 'x', Constant 2))`? We'd want to carry along an environment that recorded that 'x' had been associated with the term `Constant 1`, so that we could retrieve that value when evaluating `Addition(Variable 'x', Constant 2)`.
+But then how should you evaluate terms like `Let('x',Constant 1,Addition(Variable 'x', Constant 2))`? We'd want to carry along an environment that recorded that `'x'` had been associated with the term `Constant 1`, so that we could retrieve that value when evaluating `Addition(Variable 'x', Constant 2)`.
+
+Notice that here our environments associate variables with (what from the perspective of our calculator language are) *real* values, like `2`, not just value-denoting terms like `Constant 2`.
  
  We'll work with a simple model of environments. They'll just be lists. So the empty environment is `[]`. To modify an environment `e` like this: `e {x:=1}`, we'll use:
  
@@ -133,7 +139,7 @@ As the calculator gets more complex though, it will become more tedious and unsa
  
  have to explicitly pass around an environment that they're not themselves making any use of. Would there be any way to hide that bit of plumbing behind the drywall?
  
-Yes! You can do with a monad, in much the same way we did with our checks for divide-by-zero failures.
+Yes! You can do so with a monad, in much the same way we did with our checks for divide-by-zero failures.
  
  Here we'll use a different monad. It's called the **reader monad**. We define it like this:
  
@@ -179,7 +185,7 @@ Now if we try:
         # let result = eval (Let('x',Constant 1,Addition(Variable 'x',Constant 2)));;
         - : int reader = <fun>
  
-Now how do we see what integer that evaluates to? Well, it's an int-reader monad, which is a function from an `env` to an `int`, so we need to give it some `env` to operate on. We can give it the empty environment:
+How do we see what integer that evaluates to? Well, it's an int-reader monad, which is a function from an `env` to an `int`, so we need to give it some `env` to operate on. We can give it the empty environment:
  
         # result [];;
         - : int = 3
@@ -219,7 +225,7 @@ Another exception is that Heim and Kratzer have to postulate a special rule to h
         \[[man who(i): Alice spurned i]]
  
  is the result of combining \[[man]], an `e->t` type predicate value, with the adjective-type value of \[[who(i): Alice spurned i]]. As I said, we'll ignore complexities about their treatment of adjectives. But how is \[[who(i): Alice spurned i]] derived? Heim and Kratzer say this is defined only relative to an
-assignment g, and it's defined to be a function from objects x in the domain to the value \[[Alice spurned i]] has relative to shifted assignment g{i:=x}, which is like g except for assigning object x to variable i. So this is not the result of taking some value \[[who(i)]], and some separate value \[[Alice spurned i]], and supplying one of them to the other as argument to function.
+assignment g, and it's defined to be a function from objects x in the domain to the value \[[Alice spurned i]] has relative to shifted assignment g {i:=x}, which is like g except for assigning object x to variable i. So this is not the result of taking some value \[[who(i)]], and some separate value \[[Alice spurned i]], and supplying one of them to the other as argument to function.
  
  Getting any ideas?
  
@@ -236,7 +242,7 @@ We have to lift the semantic values of predicates into the reader-monad. So if b
  Finally, we set \[[who(i)]] to be:
  
         fun (u : bool reader) (v : entity reader) ->
-               fun (g : env) -> u (g{i:=v g})
+               fun (g : env) -> u (g {i:=v g})
  
  That is, it takes as arguments a clause-type reader-monad `u`, and an entity-type reader-monad `v`, and returns a reader-monad that evaluates `u` in an environment that's modified to assign `v`'s value in that environment to the variable `i`. In other words, this is:
  
@@ -244,7 +250,7 @@ That is, it takes as arguments a clause-type reader-monad `u`, and an entity-typ
  
  You can trace through what happens then if we apply \[[who(i)]] to (\[[spurned]] applied to \[[Alice]] and \[[i]]):
  
-       [[Alice spurned i]] = [[spurned]] [[Alice]] [[i]]
+       \[[Alice spurned i]] = \[[spurned]] \[[Alice]] \[[i]]
                 = (lift2 S) (unit Alice) (lookup i)
                 = bind (unit Alice) (fun x -> bind (lookup i) (fun y -> unit (S x y)))
  
@@ -263,7 +269,7 @@ Substituting in the definition of `unit`, this is:
  
  And now supplying \[[Alice spurned i]] as an argument to \[[who(i)]], we get:
  
-       [[who(i): Alice spurned i]] = [[who(i)]] [[Alice spurned i]]
+       \[[who(i): Alice spurned i]] = \[[who(i)]] \[[Alice spurned i]]
                 = (fun u v -> shift i v u) (fun e -> S Alice (lookup i e))
                 = fun v -> shift i v (fun e -> S Alice (lookup i e))
  
@@ -283,7 +289,7 @@ Well, some of our semantic values here are assignment-shifters:
  
         \[[who(i)]] = fun u v -> shift i v u
  
-and some philosophers count assignment-shifters as "monsters" and think there can't be any such thing. Clearly they're wrong.
+and some philosophers count assignment-shifters as "monsters" and think there can't be any such thing. Well, everyone has their own issues they need to work through.
  
  Later, techniques parallel to what we do here can be used to implement semantics for mutation and dynamic predicate logic. And then again, parallel techniques can be used to implement the "coordinated" semantics that Kit Fine and Jim Pryor favor. We just need different monads in each case.