X-Git-Url: http://lambda.jimpryor.net/git/gitweb.cgi?p=lambda.git;a=blobdiff_plain;f=topics%2Fweek8_reader_monad.mdwn;fp=topics%2Fweek8_reader_monad.mdwn;h=4dc0da64fe6cea3a6a223e85f588f872323094f3;hp=0000000000000000000000000000000000000000;hb=c50085365cca4f5c5064ab4d1bc9cde71ff5961e;hpb=d2d95e464a412673aa952740608f5e96e8fa9848 diff --git a/topics/week8_reader_monad.mdwn b/topics/week8_reader_monad.mdwn new file mode 100644 index 00000000..4dc0da64 --- /dev/null +++ b/topics/week8_reader_monad.mdwn @@ -0,0 +1,682 @@ + + +[[!toc levels=2]] + +The Reader Monad +================ + +The goal for this part is to introduce the Reader Monad, and present +two linguistics applications: binding and intensionality. Along the +way, we'll continue to think through issues related to order, and a +related notion of flow of information. + +At this point, we've seen monads in general, and three examples of +monads: the identity monad (invisible boxes), the Maybe monad (option +types), and the List monad. + +We've also seen an application of the Maybe monad to safe division. +The starting point was to allow the division function to return an int +option instead of an int. If we divide 6 by 2, we get the answer Just +3. But if we divide 6 by 0, we get the answer Nothing. + +The next step was to adjust the other arithmetic functions to teach +them what to do if they received Nothing instead of a boxed integer. +This meant changing the type of their input from ints to int options. +But we didn't need to do this piecemeal; rather, we "lift"ed the +ordinary arithmetic operations into the monad using the various tools +provided by the monad. + +We'll go over this lifting operation in detail in the next section. + +## Tracing the effect of safe-div on a larger computation + +So let's see how this works in terms of a specific computation. + + +
+(+ 1 (* (/ 6 2) 4)) in tree format:
+
+ ___________
+ |         |
+_|__    ___|___
+|  |    |     |
++  1  __|___  4
+      |    |
+      *  __|___
+         |    |
+        _|__  2
+        |  |
+        /  6
+
+ +No matter which arithmetic operation we begin with, this computation +should eventually reduce to 13. Given a specific reduction strategy, +we can watch the order in which the computation proceeds. Following +on the lambda evaluator developed during the previous homework, let's +adopt the following reduction strategy: + + In order to reduce an expression of the form (head arg), do the following in order: + 1. Reduce head to h' + 2. Reduce arg to a'. + 3. If (h' a') is a redex, reduce it. + +There are many details left unspecified here, but this will be enough +for today. The order in which the computation unfolds will be + + 1. Reduce head (+ 1) to itself + 2. Reduce arg ((* ((/ 6) 2)) 3) + 1. Reduce head (* ((/ 6) 2)) + 1. Reduce head * + 2. Reduce arg ((/ 6) 2) + 1. Reduce head (/ 6) to itself + 2. Reduce arg 2 to itself + 3. Reduce ((/ 6) 2) to 3 + 3. Reduce (* 3) to itself + 2. Reduce arg 4 to itself + 3. Reduce ((* 3) 4) to 12 + 3. Reduce ((+ 1) 12) to 13 + +This reduction pattern follows the structure of the original +expression exactly, at each node moving first to the left branch, +processing the left branch, then moving to the right branch, and +finally processing the results of the two subcomputation. (This is +called depth-first postorder traversal of the tree.) + +[diagram with arrows traversing the tree] + +It will be helpful to see how the types change as we make adjustments. + + type num = int + type contents = Num of num | Op2 of (num -> num -> num) + type tree = Leaf of contents | Branch of tree * tree + +Never mind that these types will allow us to construct silly +arithmetric trees such as `+ *` or `2 3`. Note that during the +reduction sequence, the result of reduction was in every case a +well-formed subtree. So the process of reduction could be animated by +replacing subtrees with the result of reduction on that subtree, till +the entire tree is replaced by a single integer (namely, 13). + +Now we replace the number 2 with 0: + + +
+ ___________
+ |         |
+_|__    ___|___
+|  |    |     |
++  1  __|___  4
+      |    |
+      *  __|___
+         |    |
+        _|__  0
+        |  |
+        /  6
+
+ +When we reduce, we get quite a ways into the computation before things +break down: + + 1. Reduce head (+ 1) to itself + 2. Reduce arg ((* ((/ 6) 0)) 3) + 1. Reduce head (* ((/ 6) 0)) + 1. Reduce head * + 2. Reduce arg ((/ 6) 0) + 1. Reduce head (/ 6) to itself + 2. Reduce arg 0 to itself + 3. Reduce ((/ 6) 0) to ACKKKK + +This is where we replace `/` with `safe-div`. +Safe-div returns not an int, but an int option. If the division goes +through, the result is Just n, where n is the integer result. +But if the division attempts to divide by zero, the result is Nothing. + +We could try changing the type of the arithmetic operators from `int +-> int -> int` to `int -> int -> int option`; but since we now have to +anticipate the possibility that any argument might involve division by +zero inside of it, it would be better to prepare for the possibility +that any subcomputation might return here is the net result for our +types. The easy way to do that is to change (only) the type of num +from int to int option, leaving everying else the same: + + type num = int option + type contents = Num of num | Op2 of (num -> num -> num) + type tree = Leaf of contents | Branch of tree * tree + +The only difference is that instead of defining our numbers to be +simple integers, they are now int options; and so Op is an operator +over int options. + +At this point, we bring in the monadic machinery. In particular, here +is the ⇧ and the map2 function from the notes on safe division: + + ⇧ (a: 'a) = Just a;; + + map2 (g : 'a -> 'b -> 'c) (u : 'a option) (v : 'b option) = + match u with + | None -> None + | Some x -> + (match v with + | None -> None + | Some y -> Some (g x y));; + +Then we lift the entire computation into the monad by applying ⇧ to +the integers, and by applying `map2` to the operators: + + +
+     ___________________
+     |                 |
+  ___|____         ____|_____
+  |      |         |        |
+map2 +  ⇧1    _____|_____  ⇧4
+              |         |
+            map2 *  ____|____
+                    |       |
+                 ___|____  ⇧0
+                 |      |
+               map2 /  ⇧6
+
+ +With these adjustments, the faulty computation now completes smoothly: + + 1. Reduce head ((map2 +) ⇧1) + 2. Reduce arg (((map2 *) (((map2 /) ⇧6) ⇧2)) ⇧3) + 1. Reduce head ((map2 *) (((map2 /) ⇧6) ⇧2)) + 1. Reduce head * + 2. Reduce arg (((map2 /) ⇧6) ⇧0) + 1. Reduce head ((map2 /) ⇧6) + 2. Reduce arg ⇧0 + 3. Reduce (((map2 /) ⇧6) ⇧0) to Nothing + 3. Reduce ((map2 *) Nothing) to Nothing + 2. Reduce arg ⇧4 + 3. Reduce (((map2 *) Nothing) ⇧4) to Nothing + 3. Reduce (((map2 +) ⇧1) Nothing) to Nothing + +As soon as we try to divide by 0, safe-div returns Nothing. +Thanks to the details of map2, the fact that Nothing has been returned +by one of the arguments of a map2-ed operator guarantees that the +map2-ed operator will pass on the Nothing as its result. So the +result of each enclosing computation will be Nothing, up to the root +of the tree. + +It is unfortunate that we need to continue the computation after +encountering our first Nothing. We know immediately at the result of +the entire computation will be Nothing, yet we continue to compute +subresults and combinations. It would be more efficient to simply +jump to the top as soon as Nothing is encoutered. Let's call that +strategy Abort. We'll arrive at an Abort operator later in the semester. + +So at this point, we can see how the Maybe/option monad provides +plumbing that allows subcomputations to send information from one part +of the computation to another. In this case, the safe-div function +can send the information that division by zero has been attempted +throughout the rest of the computation. If you think of the plumbing +as threaded through the tree in depth-first, postorder traversal, then +safe-div drops Nothing into the plumbing half way through the +computation, and that Nothing travels through the rest of the plumbing +till it comes out of the result faucet at the top of the tree. + +## Information flowing in the other direction: top to bottom + +We can think of this application as facilitating information flow. +In the save-div example, a subcomputation created a message that +propagated upwards to the larger computation: + +
+                 message: Division by zero occurred!
+                      ^
+ ___________          |
+ |         |          |
+_|__    ___|___       |
+|  |    |     |       |
++  1  __|___  4       |
+      |    |          |
+      *  __|___  -----|
+         |    |
+        _|__  0
+        |  |
+        /  6
+
+ +We might want to reverse the direction of information flow, making +information available at the top of the computation available to the +subcomputations: + +
+                    [λx]  
+ ___________          |
+ |         |          |
+_|__    ___|___       |
+|  |    |     |       |
++  1  __|___  4       |
+      |    |          |
+      *  __|___       |
+         |    |       |
+        _|__  x  <----|
+        |  |
+        /  6
+
+ +We've seen exactly this sort of configuration before: it's exactly +what we have when a lambda binds a variable that occurs in a deeply +embedded position. Whatever the value of the argument that the lambda +form combines with, that is what will be substituted in for free +occurrences of that variable within the body of the lambda. + +## Binding + +So our next step is to add a (primitive) version of binding to our +computation. We'll allow for just one binding dependency for now, and +then generalize later. + +Binding is independent of the safe division, so we'll return to a +situation in which the Maybe monad hasn't been added. One of the nice +properties of programming with monads is that it is possible to add or +subtract layers according to the needs of the moment. Since we need +simplicity, we'll set the Maybe monad aside for now. + +So we'll go back to the point where the num type is simple int, not +int options. + + type num = int + +And we'll start with the computation the map2 or the ⇧ from the option +monad. + +As you might guess, the technique we'll use to arrive at binding will +be to use the Reader monad, defined here in terms of m-identity and bind: + + α ==> int -> α (* The ==> is a Kleisli arrow *) + ⇧a = \x.a + u >>= f = \x.f(ux)x + map f u = \x.f(ux) + map2 f u v = \x.f(ux)(vx) + + +A boxed type in this monad will be a function from an integer to an +object in the original type. The unit function ⇧ lifts a value `a` to +a function that expects to receive an integer, but throws away the +integer and returns `a` instead (most values in the computation don't +depend on the input integer). + +The bind function in this monad takes a monadic object `u`, a function +`f` lifting non-monadic objects into the monad, and returns a function +that expects an integer `x`. It feeds `x` to `u`, which delivers a +result in the orginal type, which is fed in turn to `f`. `f` returns +a monadic object, which upon being fed an integer, returns an object +of the orginal type. + +The map2 function corresponding to this bind operation is given +above. It should look familiar---we'll be commenting on this +familiarity in a moment. + +Lifing the computation into the monad, we have the following adjusted +types: + +type num = int -> int + +That is, `num` is once again replaced with the type of a boxed int. +When we were dealing with the Maybe monad, a boxed int had type `int +option`. In this monad, a boxed int has type `int -> int`. + + +
+     __________________
+     |                |
+  ___|____        ____|_____
+  |      |        |        |
+map2 +  ⇧1    ____|_____  ⇧4
+              |        |
+            map2 *  ___|____
+                    |      |
+                 ___|____  x
+                 |      |
+               map2 /  ⇧6
+
+ +It remains only to decide how the variable `x` will access the value input +at the top of the tree. Since the input value is supposed to be the +value put in place of the variable `x`. Like every leaf in the tree +in argument position, the code we want in order to represent the +variable will have the type of a boxed int, namely, `int -> int`. So +we have the following: + + x = (\fun (i:int) = i) + +That is, variables in this system denote the indentity function! + +The result of evaluating this tree will be a boxed integer: a function +from any integer `x` to `(+ 1 (* (/ 6 x)) 4)`. + +Take a look at the definition of the reader monad again. The +midentity takes some object `a` and returns `\x.a`. In other words, +`⇧a = Ka`, so `⇧ = K`. Likewise, the reason the `map2` function +looked familiar is that it is essentially the `S` combinator. + +We've seen this before as a strategy for translating a binding +construct into a set of combinators. To remind you, here is a part of +the general scheme for translating a lambda abstract into Combinatory +Logic. The translation function `[.]` translates a lambda term into a +term in Combinatory Logic: + + [(MN)] = ([M] [N]) + [\a.a] = I + [\a.M] = K[M] (assuming a not free in M) + [\a.(MN)] = S[\a.M][\a.N] + +The reason we can make do with this subset of the full translation +scheme is that we're making the simplifying assumption that there is +at most a single lambda involved. So once again we have the identity +function I as the translation of the bound variable, K as the function +governing expressions that don't contain an instance of the bound +variable, S as the operation that manages the combination of complex +expressions. + +## Jacobson's Variable Free Semantics as a Reader Monad + +We've designed the presentation above to make it as easy as possible +to show that Jacobson's Variable Free Semantics (e.g., Jacobson 1999, +[Towards a Variable-Free +Semantics](http://www.springerlink.com/content/j706674r4w217jj5/)) +implements a reader monad. + +More specifically, it will turn out that Jacobson's geach combinator +*g* is exactly our `lift` operator, and her binding combinator *z* is +exactly our `bind` (though with the arguments reversed). + +Jacobson's system contains two main combinators, *g* and *z*. She +calls *g* the Geach rule, and *z* performs binding. Here is a typical +computation. This implementation is based closely on email from Simon +Charlow, with beta reduction as performed by the on-line evaluator: + +
+; Jacobson's analysis of "Everyone_i thinks he_i left"
+let g = \f u. \x. f (u x) in
+let z = \f u. \x. f (u x) x in
+let he = \x. x in
+let everyone = \P. FORALL x (P x) in
+
+everyone (z thinks (g left he))
+
+~~>  FORALL x (thinks (left x) x)
+
+ +Two things to notice: First, pronouns once again denote identity +functions, just as we saw in the reader monad in the previous section. + +Second, *g* plays the role of transmitting a binding dependency for an +embedded constituent to a containing constituent. + +The basic recipe in Jacobson's system is that you transmit the +dependence of a pronoun upwards through the tree using *g* until just +before you are about to combine with the binder, when you finish off +with *z*. Here is an example with a longer chain of *g*'s: + +
+everyone (z thinks (g (t bill) (g said (g left he))))
+
+~~> FORALL x (thinks (said (left x) bill) x)
+
+ +If you compare Jacobson's values for *g* and *z* to the functions in +the reader monad given above, you'll see that Jacobson's *g* +combinator is exactly our `map` operator. Furthermore, Jacobson's *z* +combinator is identital to `>>=`, except with the order of the +arguments reversed (i.e., `(z f u) == (u >>= f)`). + +(The `t` combinator in the derivations above is given by `t x = +\xy.yx`; it handles situations in which English word order reverses +the usual function/argument order.) + +In other words, +Jacobson's variable-free semantics is essentially a Reader monad. + +One of the peculiar aspects of Jacobson's system is that binding is +accomplished not by applying *z* to the element that will (in some +pre-theoretic sense) bind the pronoun, here, *everyone*, but rather by +applying *z* instead to the predicate that will take *everyone* as an +argument, here, *thinks*. + + +## The Reader monad for intensionality + +[This section has not been revised since 2010, so there may be a few +places where it doesn't follow the convetions we've adopted this time; +nevertheless, it should be followable.] + +Now we'll look at using the Reader monad to do intensional function application. + +In Shan (2001) [Monads for natural +language semantics](http://arxiv.org/abs/cs/0205026v1), Ken shows that +making expressions sensitive to the world of evaluation is conceptually +the same thing as making use of the Reader monad. +This technique was beautifully re-invented +by Ben-Avi and Winter (2007) in their paper [A modular +approach to +intensionality](http://parles.upf.es/glif/pub/sub11/individual/bena_wint.pdf), +though without explicitly using monads. + +All of the code in the discussion below can be found here: [[code/intensionality-monad.ml]]. +To run it, download the file, start OCaml, and say + + # #use "intensionality-monad.ml";; + +Note the extra `#` attached to the directive `use`. + +First, the familiar linguistic problem: + + Bill left. + Cam left. + Ann believes [Bill left]. + Ann believes [Cam left]. + +We want an analysis on which the first three sentences can be true at +the same time that the last sentence is false. If sentences denoted +simple truth values or booleans, we have a problem: if the sentences +*Bill left* and *Cam left* are both true, they denote the same object, +and Ann's beliefs can't distinguish between them. + +The traditional solution to the problem sketched above is to allow +sentences to denote a function from worlds to truth values, what +Montague called an intension. So if `s` is the type of possible +worlds, we have the following situation: + + +
+Extensional types              Intensional types       Examples
+-------------------------------------------------------------------
+
+S         t                    s->t                    John left
+DP        e                    s->e                    John
+VP        e->t                 (s->e)->s->t            left
+Vt        e->e->t              (s->e)->(s->e)->s->t    saw
+Vs        t->e->t              (s->t)->(s->e)->s->t    thought
+
+ +This system is modeled on the way Montague arranged his grammar. +There are significant simplifications compared to Montague: for +instance, determiner phrases are thought of here as corresponding to +individuals rather than to generalized quantifiers. + +The main difference between the intensional types and the extensional +types is that in the intensional types, the arguments are functions +from worlds to extensions: intransitive verb phrases like "left" now +take so-called "individual concepts" as arguments (type s->e) rather than plain +individuals (type e), and attitude verbs like "think" now take +propositions (type s->t) rather than truth values (type t). +In addition, the result of each predicate is an intension. +This expresses the fact that the set of people who left in one world +may be different than the set of people who left in a different world. + +Normally, the dependence of the extension of a predicate to the world +of evaluation is hidden inside of an evaluation coordinate, or built +into the the lexical meaning function, but we've made it explicit here +in the way that the intensionality monad makes most natural. + +The intensional types are more complicated than the extensional +types. Wouldn't it be nice to make the complicated types available +for those expressions like attitude verbs that need to worry about +intensions, and keep the rest of the grammar as extensional as +possible? This desire is parallel to our earlier desire to limit the +concern about division by zero to the division function, and let the +other functions, like addition or multiplication, ignore +division-by-zero problems as much as possible. + +So here's what we do: + +In OCaml, we'll use integers to model possible worlds. Characters (characters in the computational sense, i.e., letters like `'a'` and `'b'`, not Kaplanian characters) will model individuals, and OCaml booleans will serve for truth values: + + type s = int;; + type e = char;; + type t = bool;; + + let ann = 'a';; + let bill = 'b';; + let cam = 'c';; + + let left1 (x:e) = true;; + let saw1 (x:e) (y:e) = y < x;; + + left1 ann;; (* true *) + saw1 bill ann;; (* true *) + saw1 ann bill;; (* false *) + +So here's our extensional system: everyone left, including Ann; +and Ann saw Bill (`saw1 bill ann`), but Bill didn't see Ann. (Note that the word +order we're using is VOS, verb-object-subject.) + +Now we add intensions. Because different people leave in different +worlds, the meaning of *leave* must depend on the world in which it is +being evaluated: + + let left (x:e) (w:s) = match (x, w) with ('c', 2) -> false | _ -> true;; + left ann 1;; (* true: Ann left in world 1 *) + left cam 2;; (* false: Cam didn't leave in world 2 *) + +This new definition says that everyone always left, except that +in world 2, Cam didn't leave. + +Note that although this general *left* is sensitive to world of +evaluation, it does not have the fully intensionalized type given in +the chart above, which was `(s->e)->s->t`. This is because +*left* does not exploit the additional resolving power provided by +making the subject an individual concept. In semantics jargon, we say +that *leave* is extensional with respect to its first argument. + +Therefore we will adopt the general strategy of defining predicates +in a way that they take arguments of the lowest type that will allow +us to make all the distinctions the predicate requires. When it comes +time to combine this predicate with monadic arguments, we'll have to +make use of various lifting predicates. + +Likewise, although *see* depends on the world of evaluation, it is +extensional in both of its syntactic arguments: + + let saw x y w = (w < 2) && (y < x);; + saw bill ann 1;; (* true: Ann saw Bill in world 1 *) + saw bill ann 2;; (* false: no one saw anyone in world 2 *) + +This (again, partially) intensionalized version of *see* coincides +with the `saw1` function we defined above for world 1; in world 2, no +one saw anyone. + +Just to keep things straight, let's review the facts: + +
+     World 1: Everyone left.
+              Ann saw Bill, Ann saw Cam, Bill saw Cam, no one else saw anyone.              
+     World 2: Ann left, Bill left, Cam didn't leave.
+              No one saw anyone.
+
+ +Now we are ready for the intensionality monad: + +
+type 'a intension = s -> 'a;;
+let unit x = fun (w:s) -> x;;
+(* as before, bind can be written more compactly, but having
+   it spelled out like this will be useful down the road *)
+let bind u f = fun (w:s) -> let a = u w in let u' = f a in u' w;;
+
+ +Then the individual concept `unit ann` is a rigid designator: a +constant function from worlds to individuals that returns `'a'` no +matter which world is used as an argument. This is a typical kind of +thing for a monad unit to do. + +Then combining a predicate like *left* which is extensional in its +subject argument with an intensional subject like `unit ann` is simply bind +in action: + + bind (unit ann) left 1;; (* true: Ann left in world 1 *) + bind (unit cam) left 2;; (* false: Cam didn't leave in world 2 *) + +As usual, bind takes a monad box containing Ann, extracts Ann, and +feeds her to the extensional *left*. In linguistic terms, we take the +individual concept `unit ann`, apply it to the world of evaluation in +order to get hold of an individual (`'a'`), then feed that individual +to the extensional predicate *left*. + +We can arrange for a transitive verb that is extensional in both of +its arguments to take intensional arguments: + + let lift2' f u v = bind u (fun x -> bind v (fun y -> f x y));; + +This is almost the same `lift2` predicate we defined in order to allow +addition in our division monad example. The difference is that this +variant operates on verb meanings that take extensional arguments but +returns an intensional result. Thus the original `lift2` predicate +has `unit (f x y)` where we have just `f x y` here. + +The use of `bind` here to combine *left* with an individual concept, +and the use of `lift2'` to combine *see* with two intensional +arguments closely parallels the two of Montague's meaning postulates +(in PTQ) that express the relationship between extensional verbs and +their uses in intensional contexts. + +
+lift2' saw (unit bill) (unit ann) 1;;  (* true *)
+lift2' saw (unit bill) (unit ann) 2;;  (* false *)
+
+ +Ann did see bill in world 1, but Ann didn't see Bill in world 2. + +Finally, we can define our intensional verb *thinks*. *Think* is +intensional with respect to its sentential complement, though still extensional +with respect to its subject. (As Montague noticed, almost all verbs +in English are extensional with respect to their subject; a possible +exception is "appear".) + + let thinks (p:s->t) (x:e) (w:s) = + match (x, p 2) with ('a', false) -> false | _ -> p w;; + +Ann disbelieves any proposition that is false in world 2. Apparently, +she firmly believes we're in world 2. Everyone else believes a +proposition iff that proposition is true in the world of evaluation. + + bind (unit ann) (thinks (bind (unit bill) left)) 1;; + +So in world 1, Ann thinks that Bill left (because in world 2, Bill did leave). + + bind (unit ann) (thinks (bind (unit cam) left)) 1;; + +But in world 1, Ann doesn't believe that Cam left (even though he +did leave in world 1: `bind (unit cam) left 1 == true`). Ann's thoughts are hung up on +what is happening in world 2, where Cam doesn't leave. + +*Small project*: add intersective ("red") and non-intersective + adjectives ("good") to the fragment. The intersective adjectives + will be extensional with respect to the nominal they combine with + (using bind), and the non-intersective adjectives will take + intensional arguments. + +notes: cascade, general env