<!-- λ Λ ∀ ≡ α β γ ρ ω Ω ○ μ η δ ζ ξ ⋆ ★ • ∙ ● 𝟎 𝟏 𝟐 𝟘 𝟙 𝟚 𝟬 𝟭 𝟮 ¢ ⇧ -->

[[!toc levels=2]]

The Reader Monad
================

The goal for this part is to introduce the Reader Monad, and present
two linguistics applications: binding and intensionality.  Along the
way, we'll continue to think through issues related to order, and a
related notion of flow of information.

At this point, we've seen monads in general, and three examples of
monads: the identity monad (invisible boxes), the Maybe monad (option
types), and the List monad.  

We've also seen an application of the Maybe monad to safe division.
The starting point was to allow the division function to return an int
option instead of an int.  If we divide 6 by 2, we get the answer Just
3.  But if we divide 6 by 0, we get the answer Nothing.  

The next step was to adjust the other arithmetic functions to teach them what to do if they received Nothing instead of a boxed integer.
This meant
changing the type of their input from ints to int options.  But we
didn't need to do this piecemeal; rather, we could "lift" the ordinary
arithmetic operations into the monad using the various tools provided
by the monad.  We'll go over this lifting operation in detail in the next section.

## Tracing the effect of safe-div on a larger computation

So let's see how this works in terms of a specific computation.

<pre>
\tree ((((+) (1)) (((*) (((/) (6)) (2))) (4))))

 ___________
 |         |
_|__    ___|___
|  |    |     |
+  1  __|___  4
      |    |
      *  __|___
         |    |
        _|__  2
        |  |
        /  6
</pre>

This computation should reduce to 13.  But given a specific reduction
strategy, we can watch the order in which the computation proceeds.
Following on the lambda evaluator developed during the previous
homework, let's adopt the following reduction strategy:

    In order to reduce (head arg), do the following in order:
    1. Reduce head to h'
    2. Reduce arg to a'.
    3. If (h' a') is a redex, reduce it.

There are many details left unspecified here, but this will be enough
for today.  The order in which the computation unfolds will be

    1. Reduce head (+ 1) to itself
    2. Reduce arg ((* ((/ 6) 2)) 3)
         1. Reduce head (* ((/ 6) 2))
              1. Reduce head * 
              2. Reduce arg ((/ 6) 2)
                   1. Reduce head (/ 6) to itself
                   2. Reduce arg 2 to itself
                   3. Reduce ((/ 6) 2) to 3
              3. Reduce (* 3) to itself
         2. Reduce arg 4 to itself
         3. Reduce ((* 3) 4) to 12
     3. Reduce ((+ 1) 12) to 13

This reduction pattern follows the structure of the original
expression exactly, at each node moving first to the left branch,
processing the left branch, then moving to the right branch, and
finally processing the results of the two subcomputation.  (This is
called depth-first postorder traversal of the tree.)

It will be helpful to see how the types change as we make adjustments.

    type num = int
    type contents = Num of num   | Op of (num -> num -> num)
    type tree = Leaf of contents | Branch of tree * tree

Never mind that these types will allow us to construct silly
arithmetric trees such as `+ *` or `2 3`.  Note that during the
reduction sequence, the result of reduction was in every case a
well-formed subtree.  So the process of reduction could be animated by
replacing subtrees with the result of reduction on that subtree, till
the entire tree is replaced by a single integer (namely, 13).

Now we replace the number 2 with 0:

<pre>
\tree ((((+) (1)) (((*) (((/) (6)) (0))) (4))))

 ___________
 |         |
_|__    ___|___
|  |    |     |
+  1  __|___  4
      |    |
      *  __|___
         |    |
        _|__  0
        |  |
        /  6
</pre>

When we reduce, we get quite a ways into the computation before things
go south:

    1. Reduce head (+ 1) to itself
    2. Reduce arg ((* ((/ 6) 0)) 3)
         1. Reduce head (* ((/ 6) 0))
              1. Reduce head * 
              2. Reduce arg ((/ 6) 0)
                   1. Reduce head (/ 6) to itself
                   2. Reduce arg 0 to itself
                   3. Reduce ((/ 6) 0) to ACKKKK

This is where we replace `/` with `safe-div`.  This means changing the
type of the arithmetic operators from `int -> int -> int` to 
`int -> int -> int option`; and since we now have to anticipate the
possibility that any argument might involve division by zero inside of
it, here is the net result for our types:

    type num = int option
    type contents = Num of num   | Op of (num -> num -> num)
    type tree = Leaf of contents | Branch of tree * tree

The only difference is that instead of defining our numbers to be
simple integers, they are now int options; and so Op is an operator
over int options.

At this point, we bring in the monadic machinery.  In particular, here
is the ⇧ and the map2 function from the notes on safe division:

    ⇧ (a: 'a) = Just a;;

    map2 (g : 'a -> 'b -> 'c) (u : 'a option) (v : 'b option) =
      match u with
      | None -> None
      | Some x ->
          (match v with
          | None -> None
          | Some y -> Some (g x y));;

Then we lift the entire computation into the monad by applying ⇧ to
the integers, and by applying `map1` to the operators:

<pre>
\tree ((((map2 +) (⇧1)) (((map2 *) (((map2 /) (⇧6)) (⇧0))) (⇧4))))

     ___________________
     |                 |
  ___|____         ____|_____
  |      |         |        |
map2 +  ⇧1    _____|_____  ⇧4
              |         |
            map2 *  ____|____
                    |       |
                 ___|____  ⇧0
                 |      |
               map2 /  ⇧6
</pre>

With these adjustments, the faulty computation now completes smoothly:

    1. Reduce head ((map2 +) ⇧1)
    2. Reduce arg (((map2 *) (((map2 /) ⇧6) ⇧2)) ⇧3)
         1. Reduce head ((map2 *) (((map2 /) ⇧6) ⇧2))
              1. Reduce head * 
              2. Reduce arg (((map2 /) ⇧6) ⇧0)
                   1. Reduce head ((map2 /) ⇧6)
                   2. Reduce arg ⇧0 
                   3. Reduce (((map2 /) ⇧6) ⇧0) to Nothing
              3. Reduce ((map2 *) Nothing) to Nothing
         2. Reduce arg ⇧4
         3. Reduce (((map2 *) Nothing) ⇧4) to Nothing
     3. Reduce (((map2 +) ⇧1) Nothing) to Nothing

As soon as we try to divide by 0, safe-div returns Nothing.
Thanks to the details of map2, the fact that Nothing has been returned
by one of the arguments of a map2-ed operator guarantees that the
map2-ed operator will pass on the Nothing as its result.  So the
result of each enclosing computation will be Nothing, up to the root
of the tree.

It is unfortunate that we need to continue the computation after
encountering our first Nothing.  We know immediately at the result of
the entire computation will be Nothing, yet we continue to compute
subresults and combinations.  It would be more efficient to simply
jump to the top as soon as Nothing is encoutered.  Let's call that
strategy Abort.  We'll arrive at an Abort operator later in the semester.

So at this point, we can see how the Maybe/option monad provides
plumbing that allows subcomputations to send information from one part
of the computation to another.  In this case, the safe-div function
can send the information that division by zero has been attempted
throughout the rest of the computation.  If you think of the plumbing
as threaded through the tree in depth-first, postorder traversal, then
safe-div drops Nothing into the plumbing half way through the
computation, and that Nothing travels through the rest of the plumbing
till it comes out of the result faucet at the top of the tree.

## Information flowing in the other direction: top to bottom

In the save-div example, a subcomputation created a message that
propagated upwards to the larger computation:

<pre>
                 message: Division by zero occurred!
                      ^
 ___________          |
 |         |          |
_|__    ___|___       |
|  |    |     |       |
+  1  __|___  4       |
      |    |          |
      *  __|___  -----|
         |    |
        _|__  0
        |  |
        /  6
</pre>

We might want to reverse the direction of information flow, making 
information available at the top of the computation available to the
subcomputations: 

<pre>
                    [λx]  
 ___________          |
 |         |          |
_|__    ___|___       |
|  |    |     |       |
+  1  __|___  4       |
      |    |          |
      *  __|___       |
         |    |       |
        _|__  x  <----|
        |  |
        /  6
</pre>

We've seen exactly this sort of configuration before: it's exactly
what we have when a lambda binds a variable that occurs in a deeply
embedded position.  Whatever the value of the argument that the lambda
form combines with, that is what will be substituted in for free
occurrences of that variable within the body of the lambda.

So our next step is to add a (primitive) version of binding to our
computation.  Rather than anticipating any number of binding
operators, we'll allow for just one binding dependency for now.

This example is independent of the safe-div example, so we'll return
to a situation in which the Maybe monad hasn't been added.  So the
types are the ones where numbers are just integers, not int options.
(In a couple of weeks, we'll start combining monads into a single
system; if you're impatient, you might think about how to do that now.)

    type num = int

And the computation will be without the map2 or the ⇧ from the option
monad.

As you might guess, the technique we'll use to arrive at binding will
be to use the Reader monad, defined here in terms of m-identity and bind:

    α --> int -> α
    ⇧a = \x.a
    u >>= f = \x.f(ux)x
    map2 u v = \x.ux(vx)

A boxed type in this monad will be a function from an integer to an
object in the original type.  The unit function ⇧ lifts a value `a` to
a function that expects to receive an integer, but throws away the
integer and returns `a` instead (most values in the computation don't
depend on the input integer).  

The bind function in this monad takes a monadic object `u`, a function
`f` lifting non-monadic objects into the monad, and returns a function
that expects an integer `x`.  It feeds `x` to `u`, which delivers a
result in the orginal type, which is fed in turn to `f`.  `f` returns
a monadic object, which upon being fed an integer, returns an object
of the orginal type.  

The map2 function corresponding to this bind operation is given
above.  It should look familiar---we'll be commenting on this
familiarity in a moment.

Lifing the computation into the monad, we have the following adjusted
types:

type num = int -> int

That is, `num` is once again replaced with the type of a boxed int.
When we were dealing with the Maybe monad, a boxed int had type `int
option`.  In this monad, a boxed int has type `int -> int`.

<pre>
\tree ((((map2 +) (⇧1)) (((map2 *) (((map2 /) (⇧6)) (x))) (⇧4))))

     __________________
     |                |
  ___|____        ____|_____
  |      |        |        |
map2 +  ⇧1    ____|_____  ⇧4
              |        |
            map2 *  ___|____
                    |      |
                 ___|____  x
                 |      |
               map2 /  ⇧6
</pre>

It remains only to decide how the variable `x` will access the value input
at the top of the tree.  Since the input value is supposed to be the
value put in place of the variable `x`.  Like every leaf in the tree
in argument position, the code we want in order to represent the
variable will have the type of a boxed int, namely, `int -> int`.  So
we have the following:

    x (i:int):int = i

That is, variables in this system denote the indentity function!

The result of evaluating this tree will be a boxed integer: a function
from any integer `x` to `(+ 1 (* (/ 6 x)) 4)`.  

Take a look at the definition of the reader monad again.  The
midentity takes some object `a` and returns `\x.a`.  In other words,
`⇧a = Ka`, so `⇧ = K`.  Likewise, `map2` for this monad is the `S`
combinator.  We've seen this before as a strategy for translating a
lambda abstract into a set of combinators.  Here is a part of the
general scheme for translating a lambda abstract into Combinatory
Logic.  The translation function `[.]` translates a lambda term into a
term in Combinatory Logic:

    [(MN)] = ([M] [N])
    [\a.a] = I
    [\a.M] = K[M]       (assuming a not free in M)
    [\a.(MN)] = S[\a.M][\a.N]

The reason we can make do with this subset of the full function is
that we're making the simplifying assumption that there is at most a
single lambda involved.  So here you see the I (the translation of the
bound variable), the K and the S.


## Jacobson's Variable Free Semantics as a Reader Monad

We've designed the discussion so far to make the following claim as
easy to show as possible: Jacobson's Variable Free Semantics
(e.g., Jacobson 1999, [Towards a
Variable-Free
Semantics](http://www.springerlink.com/content/j706674r4w217jj5/))
is a reader monad.

More specifically, it will turn out that Jacobson's geach combinator
*g* is exactly our `lift` operator, and her binding combinator *z* is
exactly our `bind` (though with the arguments reversed)!

Jacobson's system contains two main combinators, *g* and *z*.  She
calls *g* the Geach rule, and *z* performs binding.  Here is a typical
computation.  This implementation is based closely on email from Simon
Charlow, with beta reduction as performed by the on-line evaluator:

<pre>
; Analysis of "Everyone_i thinks he_i left"
let g = \f g x. f (g x) in
let z = \f g x. f (g x) x in
let he = \x. x in
let everyone = \P. FORALL x (P x) in

everyone (z thinks (g left he))

~~>  FORALL x (thinks (left x) x)
</pre>

Several things to notice: First, pronouns once again denote identity functions.
As Jeremy Kuhn has pointed out, this is related to the fact that in
the mapping from the lambda calculus into combinatory logic that we
discussed earlier in the course, bound variables translated to I, the
identity combinator (see additional comments below).  We'll return to
the idea of pronouns as identity functions in later discussions.

Second, *g* plays the role of transmitting a binding dependency for an
embedded constituent to a containing constituent.

Third, one of the peculiar aspects of Jacobson's system is that
binding is accomplished not by applying *z* to the element that will
(in some pre-theoretic sense) bind the pronoun, here, *everyone*, but
rather by applying *z* instead to the predicate that will take
*everyone* as an argument, here, *thinks*.  

The basic recipe in Jacobson's system, then, is that you transmit the
dependence of a pronoun upwards through the tree using *g* until just
before you are about to combine with the binder, when you finish off
with *z*.  (There are examples with longer chains of *g*'s below.)

Jacobson's *g* combinator is exactly our `lift` operator: it takes a
functor and lifts it into the monad.  
Furthermore, Jacobson's *z* combinator, which is what she uses to
create binding links, is essentially identical to our reader-monad
`bind`!  

<pre>
everyone (z thinks (g left he))

~~> forall w (thinks (left w) w)

everyone (z thinks (g (t bill) (g said (g left he))))

~~> forall w (thinks (said (left w) bill) w)
</pre>

(The `t` combinator is given by `t x = \xy.yx`; it handles situations
in which English word order places the argument (in this case, a
grammatical subject) before the predicate.)

So *g* is exactly `lift` (a combination of `bind` and `unit`), and *z*
is exactly `bind` with the arguments reversed.  It appears that
Jacobson's variable-free semantics is essentially a Reader monad.

## The Reader monad for intensionality

Now we'll look at using monads to do intensional function application.
This is just another application of the Reader monad, not a new monad.
In Shan (2001) [Monads for natural
language semantics](http://arxiv.org/abs/cs/0205026v1), Ken shows that
making expressions sensitive to the world of evaluation is conceptually
the same thing as making use of the Reader monad.
This technique was beautifully re-invented
by Ben-Avi and Winter (2007) in their paper [A modular
approach to
intensionality](http://parles.upf.es/glif/pub/sub11/individual/bena_wint.pdf),
though without explicitly using monads.

All of the code in the discussion below can be found here: [[code/intensionality-monad.ml]].
To run it, download the file, start OCaml, and say 

	# #use "intensionality-monad.ml";;

Note the extra `#` attached to the directive `use`.

First, the familiar linguistic problem:

       Bill left.  
	   Cam left.
	   Ann believes [Bill left].
	   Ann believes [Cam left].

We want an analysis on which the first three sentences can be true at
the same time that the last sentence is false.  If sentences denoted
simple truth values or booleans, we have a problem: if the sentences
*Bill left* and *Cam left* are both true, they denote the same object,
and Ann's beliefs can't distinguish between them.

The traditional solution to the problem sketched above is to allow
sentences to denote a function from worlds to truth values, what
Montague called an intension.  So if `s` is the type of possible
worlds, we have the following situation:


<pre>
Extensional types              Intensional types       Examples
-------------------------------------------------------------------

S         t                    s->t                    John left
DP        e                    s->e                    John
VP        e->t                 (s->e)->s->t            left
Vt        e->e->t              (s->e)->(s->e)->s->t    saw
Vs        t->e->t              (s->t)->(s->e)->s->t    thought
</pre>

This system is modeled on the way Montague arranged his grammar.
There are significant simplifications compared to Montague: for
instance, determiner phrases are thought of here as corresponding to
individuals rather than to generalized quantifiers.

The main difference between the intensional types and the extensional
types is that in the intensional types, the arguments are functions
from worlds to extensions: intransitive verb phrases like "left" now
take so-called "individual concepts" as arguments (type s->e) rather than plain
individuals (type e), and attitude verbs like "think" now take
propositions (type s->t) rather than truth values (type t).
In addition, the result of each predicate is an intension.
This expresses the fact that the set of people who left in one world
may be different than the set of people who left in a different world.

Normally, the dependence of the extension of a predicate to the world
of evaluation is hidden inside of an evaluation coordinate, or built
into the the lexical meaning function, but we've made it explicit here
in the way that the intensionality monad makes most natural.

The intensional types are more complicated than the extensional
types.  Wouldn't it be nice to make the complicated types available
for those expressions like attitude verbs that need to worry about
intensions, and keep the rest of the grammar as extensional as
possible?  This desire is parallel to our earlier desire to limit the
concern about division by zero to the division function, and let the
other functions, like addition or multiplication, ignore
division-by-zero problems as much as possible.

So here's what we do:

In OCaml, we'll use integers to model possible worlds. Characters (characters in the computational sense, i.e., letters like `'a'` and `'b'`, not Kaplanian characters) will model individuals, and OCaml booleans will serve for truth values:

	type s = int;;
	type e = char;;
	type t = bool;;

	let ann = 'a';;
	let bill = 'b';;
	let cam = 'c';;

	let left1 (x:e) = true;; 
	let saw1 (x:e) (y:e) = y < x;; 

	left1 ann;; (* true *)
	saw1 bill ann;; (* true *)
	saw1 ann bill;; (* false *)

So here's our extensional system: everyone left, including Ann;
and Ann saw Bill (`saw1 bill ann`), but Bill didn't see Ann.  (Note that the word
order we're using is VOS, verb-object-subject.)

Now we add intensions.  Because different people leave in different
worlds, the meaning of *leave* must depend on the world in which it is
being evaluated:

    let left (x:e) (w:s) = match (x, w) with ('c', 2) -> false | _ -> true;;
    left ann 1;; (* true: Ann left in world 1 *)
    left cam 2;; (* false: Cam didn't leave in world 2 *) 

This new definition says that everyone always left, except that 
in world 2, Cam didn't leave.

Note that although this general *left* is sensitive to world of
evaluation, it does not have the fully intensionalized type given in
the chart above, which was `(s->e)->s->t`.  This is because
*left* does not exploit the additional resolving power provided by
making the subject an individual concept.  In semantics jargon, we say
that *leave* is extensional with respect to its first argument.  

Therefore we will adopt the general strategy of defining predicates
in a way that they take arguments of the lowest type that will allow
us to make all the distinctions the predicate requires.  When it comes
time to combine this predicate with monadic arguments, we'll have to
make use of various lifting predicates.

Likewise, although *see* depends on the world of evaluation, it is
extensional in both of its syntactic arguments:

    let saw x y w = (w < 2) && (y < x);;
    saw bill ann 1;; (* true: Ann saw Bill in world 1 *)
    saw bill ann 2;; (* false: no one saw anyone in world 2 *)

This (again, partially) intensionalized version of *see* coincides
with the `saw1` function we defined above for world 1; in world 2, no
one saw anyone.

Just to keep things straight, let's review the facts:

<pre>
     World 1: Everyone left.
              Ann saw Bill, Ann saw Cam, Bill saw Cam, no one else saw anyone.              
     World 2: Ann left, Bill left, Cam didn't leave.
              No one saw anyone.
</pre>

Now we are ready for the intensionality monad:

<pre>
type 'a intension = s -> 'a;;
let unit x = fun (w:s) -> x;;
(* as before, bind can be written more compactly, but having
   it spelled out like this will be useful down the road *)
let bind u f = fun (w:s) -> let a = u w in let u' = f a in u' w;;
</pre>

Then the individual concept `unit ann` is a rigid designator: a
constant function from worlds to individuals that returns `'a'` no
matter which world is used as an argument.  This is a typical kind of
thing for a monad unit to do.

Then combining a predicate like *left* which is extensional in its
subject argument with an intensional subject like `unit ann` is simply bind
in action:

    bind (unit ann) left 1;; (* true: Ann left in world 1 *)
    bind (unit cam) left 2;; (* false: Cam didn't leave in world 2 *)

As usual, bind takes a monad box containing Ann, extracts Ann, and
feeds her to the extensional *left*.  In linguistic terms, we take the
individual concept `unit ann`, apply it to the world of evaluation in
order to get hold of an individual (`'a'`), then feed that individual
to the extensional predicate *left*.

We can arrange for a transitive verb that is extensional in both of
its arguments to take intensional arguments:

    let lift2' f u v = bind u (fun x -> bind v (fun y -> f x y));;

This is almost the same `lift2` predicate we defined in order to allow
addition in our division monad example.  The difference is that this
variant operates on verb meanings that take extensional arguments but
returns an intensional result.  Thus the original `lift2` predicate
has `unit (f x y)` where we have just `f x y` here.
  
The use of `bind` here to combine *left* with an individual concept,
and the use of `lift2'` to combine *see* with two intensional
arguments closely parallels the two of Montague's meaning postulates
(in PTQ) that express the relationship between extensional verbs and
their uses in intensional contexts.

<pre>
lift2' saw (unit bill) (unit ann) 1;;  (* true *)
lift2' saw (unit bill) (unit ann) 2;;  (* false *)
</pre>

Ann did see bill in world 1, but Ann didn't see Bill in world 2.

Finally, we can define our intensional verb *thinks*.  *Think* is
intensional with respect to its sentential complement, though still extensional
with respect to its subject.  (As Montague noticed, almost all verbs
in English are extensional with respect to their subject; a possible
exception is "appear".)

    let thinks (p:s->t) (x:e) (w:s) = 
      match (x, p 2) with ('a', false) -> false | _ -> p w;;

Ann disbelieves any proposition that is false in world 2.  Apparently,
she firmly believes we're in world 2.  Everyone else believes a
proposition iff that proposition is true in the world of evaluation.

    bind (unit ann) (thinks (bind (unit bill) left)) 1;;

So in world 1, Ann thinks that Bill left (because in world 2, Bill did leave).

    bind (unit ann) (thinks (bind (unit cam) left)) 1;;

But in world 1, Ann doesn't believe that Cam left (even though he
did leave in world 1: `bind (unit cam) left 1 == true`).  Ann's thoughts are hung up on
what is happening in world 2, where Cam doesn't leave.

*Small project*: add intersective ("red") and non-intersective
 adjectives ("good") to the fragment.  The intersective adjectives
 will be extensional with respect to the nominal they combine with
 (using bind), and the non-intersective adjectives will take
 intensional arguments.