[[!toc levels=2]]
# Doing things with monads
## Extended application: Groenendijk, Stokhof and Veltman's *Coreference and Modality*
GSV are interested in developing and establishing a reasonable theory
of discourse update. One way of looking at this paper is like this:

    GSV = GS + V, where

    GS = Dynamic theories of binding of Groenendijk and Stokhof, e.g.,
         Dynamic Predicate Logic, L&P 1991: dynamic binding, donkey anaphora
         Dynamic Montague Grammar, 1990: generalized quantifiers, discourse referents

    V  = a dynamic theory of epistemic modality, e.g.,
         Veltman, Frank. "Data semantics."
         In Truth, Interpretation and Information, Foris, Dordrecht (1984): 43-63.

That is, Groenendijk and Stokhof have a well-known theory of dynamic
semantics, and Veltman has a well-known theory of epistemic modality,
and this fragment brings both of those strands together into a single
system.
We will be interested in this paper both from a theoretical point of
view and from a practical engineering point of view. On the
theoretical level, these scholars are proposing a strategy for
managing the connection between variables and the objects they
designate in a way that is flexible enough to be useful for describing
natural language.
## Basics of GSV's fragment
The fragment in this paper is unusually elegant. We'll present it on
its own terms, with the exception that we will not use pegs. See the
digression below concerning pegs for an explanation. After presenting
the paper, we'll re-engineer the fragment using explicit monads.
In this fragment, points of evaluation are not just worlds, but pairs
of a world and an assignment function. This is familiar from Heim's
1983 File Change Semantics. We'll follow GSV and call a
world-assignment pair a "possibility". Then a context (an
"information state", or "infostate") is a set of possibilities. Infostates
simultaneously track both information about the world (which possible
worlds are live possibilities?) and information about the
discourse (which objects do the variables refer to?).
Worlds in general settle all matters of fact in the world. In
particular, they determine the extensions of predicates and relations.
The formal language the fragment interprets is Predicate Calculus with
equality, existential and universal quantification, along with one
unary modality (box and diamond, corresponding to epistemic necessity
and epistemic possibility).
An implementation in OCaml is available [[here|code/gsv.ml]]; consult
that code for details of syntax, types, and values.
Terms in this language are either individuals such as Alice or Bob, or
else variables. So in general, the referent of a term can depend on a
possibility:

    ref(i, t) = t      if t is an individual, and
                g(t)   if t is a variable, where i = (w,g)

Here are the main clauses for update (their definition 3.1).
Following GSV, we'll write `update(s, φ)` (the update of information
state `s` with the information in φ) as `s[φ]`.

    s[P(t)] = {i in s | w(P)(ref(i,t))}, where i = (w,g)

So `s[man(x)]` is the set of live possibilities `i = (w,g)` in `s` such that
the extension of `man` in `w`, given by `w(man)`, maps the object referred to by
`x`, namely `g("x")`, to `true`. That is, update with `man(x)`
discards all possibilities in which `x` fails to refer to a man.

    s[t1 = t2] = {i in s | ref(i,t1) == ref(i,t2)}

    s[φ and ψ] = s[φ][ψ]

When updating with a conjunction, first update with the left conjunct,
then update with the right conjunct.
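
The clauses so far can be sketched in a few lines of OCaml. This is a minimal illustration under our own assumptions, not the linked `gsv.ml`: the type names, the encoding of a world as a list of predicate extensions, and the names `ref_of`, `pred`, `eq` and `conj` are all hypothetical. Formulas are represented directly as functions from infostates to infostates.

```ocaml
(* Hypothetical encoding for illustration; not the course's gsv.ml. *)
type entity = Alice | Bob | Carl
type term = Ind of entity | Var of string
type world = (string * entity list) list    (* predicate name -> extension *)
type assignment = (string * entity) list    (* variable name -> entity *)
type poss = world * assignment
type state = poss list                      (* a set, represented as a list *)

(* ref(i, t): an individual denotes itself; a variable is looked up in g.
   List.assoc raises Not_found if the variable is unassigned. *)
let ref_of ((_, g) : poss) (t : term) : entity =
  match t with
  | Ind e -> e
  | Var x -> List.assoc x g

(* s[P(t)]: keep the possibilities whose world puts ref(i,t) in P's extension *)
let pred (p : string) (t : term) (s : state) : state =
  List.filter (fun ((w, _) as i) -> List.mem (ref_of i t) (List.assoc p w)) s

(* s[t1 = t2]: keep the possibilities in which the two terms corefer *)
let eq (t1 : term) (t2 : term) (s : state) : state =
  List.filter (fun i -> ref_of i t1 = ref_of i t2) s

(* s[φ and ψ] = s[φ][ψ]: update with the left conjunct, then the right *)
let conj (u1 : state -> state) (u2 : state -> state) (s : state) : state =
  u2 (u1 s)

(* An assumed world in which Bob and Carl are the men *)
let w : world = [ ("man", [Bob; Carl]) ]
let s0 : state = [ (w, [("x", Alice)]); (w, [("x", Bob)]); (w, [("x", Carl)]) ]

let men = pred "man" (Var "x") s0
let men_named_bob = conj (pred "man" (Var "x")) (eq (Var "x") (Ind Bob)) s0
```

Here `men` keeps the two possibilities in which `x` refers to Bob or to Carl, and `men_named_bob` keeps only the one in which `x` refers to Bob.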
Existential quantification is somewhat intricate.

    s[∃xφ] = Union {{(w, g[x->a]) | (w,g) in s}[φ] | a in ent}

Here's the recipe: given a starting infostate s, choose an object a
from the domain of discourse. Construct a modified infostate s' by
adjusting the assignment function of each possibility so as to map the variable x to a.
Then update s' with φ. Finally, take the union over the results of
doing this for every object a in the domain of discourse.
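
The recipe translates fairly directly into code. The following is a sketch under assumed names (`rebind`, `exists`, `domain`) and an assumed list-based encoding of worlds and assignments; it is not taken from `gsv.ml`. The body of the quantifier is passed in as an update function.

```ocaml
(* Hypothetical encoding for illustration; not the course's gsv.ml. *)
type entity = Alice | Bob | Carl
type term = Ind of entity | Var of string
type world = (string * entity list) list
type assignment = (string * entity) list
type poss = world * assignment
type state = poss list

let domain = [Alice; Bob; Carl]

let ref_of ((_, g) : poss) (t : term) : entity =
  match t with
  | Ind e -> e
  | Var x -> List.assoc x g

let pred (p : string) (t : term) (s : state) : state =
  List.filter (fun ((w, _) as i) -> List.mem (ref_of i t) (List.assoc p w)) s

(* g[x->a]: drop any earlier binding of x, then bind x to a *)
let rebind (x : string) (a : entity) (g : assignment) : assignment =
  (x, a) :: List.remove_assoc x g

(* s[∃xφ]: for each object a, reset x to a throughout s, update the
   result with φ, and take the union over all choices of a *)
let exists (x : string) (body : state -> state) (s : state) : state =
  domain
  |> List.map (fun a -> body (List.map (fun (w, g) -> (w, rebind x a g)) s))
  |> List.concat
  |> List.sort_uniq compare     (* union: sort and drop duplicates *)

(* An assumed world in which Bob and Carl are the men *)
let w : world = [ ("man", [Bob; Carl]) ]
let result = exists "x" (pred "man" (Var "x")) [ (w, []) ]
```

Starting from a single possibility with an empty assignment, `result` contains one possibility for each man: `x` bound to Bob, and `x` bound to Carl.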
Negation is natural enough:

    s[neg φ] = {i in s | {i}[φ] = {}}

If updating the information state that contains only the
possibility i with φ returns the empty information state, then not φ is true
with respect to i, and i survives the update.
In GSV, disjunction, the conditional, and the universal quantifier are defined
in terms of negation and the other connectives (see fact 3.2).
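
Here is a sketch of negation as a test on singleton infostates, followed by derived connectives. The abbreviations used for `disj`, `impl` and `forall` are the usual classical dualities; we are assuming, without quoting it, that this is what fact 3.2 amounts to, so check them against the paper. All names and the encoding are hypothetical and are not taken from `gsv.ml`.

```ocaml
(* Hypothetical encoding for illustration; not the course's gsv.ml. *)
type entity = Alice | Bob | Carl
type term = Ind of entity | Var of string
type world = (string * entity list) list
type assignment = (string * entity) list
type poss = world * assignment
type state = poss list

let domain = [Alice; Bob; Carl]

let ref_of ((_, g) : poss) (t : term) : entity =
  match t with
  | Ind e -> e
  | Var x -> List.assoc x g

let pred (p : string) (t : term) (s : state) : state =
  List.filter (fun ((w, _) as i) -> List.mem (ref_of i t) (List.assoc p w)) s

let conj u1 u2 (s : state) : state = u2 (u1 s)

let exists (x : string) (body : state -> state) (s : state) : state =
  domain
  |> List.map (fun a ->
         body (List.map (fun (w, g) -> (w, (x, a) :: List.remove_assoc x g)) s))
  |> List.concat
  |> List.sort_uniq compare

(* s[neg φ]: i survives just in case {i}[φ] is empty *)
let neg (u : state -> state) (s : state) : state =
  List.filter (fun i -> u [i] = []) s

(* Assumed abbreviations (the classical dualities) *)
let disj u1 u2 = neg (conj (neg u1) (neg u2))
let impl u1 u2 = neg (conj u1 (neg u2))
let forall x u = neg (exists x (neg u))

(* Assumed facts: Bob and Carl are men, Alice is a woman *)
let w : world = [ ("man", [Bob; Carl]); ("woman", [Alice]) ]
let man_x = pred "man" (Var "x")
let woman_x = pred "woman" (Var "x")

let not_man = neg man_x [ (w, [("x", Alice)]); (w, [("x", Bob)]) ]
let everyone_sorted = forall "x" (disj man_x woman_x) [ (w, []) ]
let everyone_man = forall "x" man_x [ (w, []) ]
```

`not_man` keeps only the possibility in which `x` refers to Alice. Because the derived connectives are built from `neg`, they behave as tests: `everyone_sorted` returns its input unchanged, and `everyone_man` returns the empty state.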
Exercise: assume that there are three entities in the domain of
discourse, Alice, Bob, and Carl. Assume that Alice is a woman, and
Bob and Carl are men.
Compute the following:

    1. {(w,g)}[∃x.man(x)]

       = {(w,g[x->a])}[man(x)] ++ {(w,g[x->b])}[man(x)]
                               ++ {(w,g[x->c])}[man(x)]
       = {} ++ {(w,g[x->b])} ++ {(w,g[x->c])}
       = {(w,g[x->b]),(w,g[x->c])}
       -- Bob and Carl are men

    2. {(w,g)}[∃x.woman(x)]

    3. {(w,g)}[∃x∃y.man(x) and man(y)]

    4. {(w,g)}[∃x∃y.x=y]

Running the [[code|code/gsv.ml]] gives the answers.
## Order and modality
The final remaining update rule concerns modality:

    s[◊φ] = {i in s | s[φ] ≠ {}}

This is a peculiar rule: a possibility `i` will survive update just in
case something is true of the information state `s` as a whole. That
means that either every `i` in `s` will survive, or none of them will.
The criterion is that updating `s` with the information in the
prejacent φ does not produce the contradictory information state
(i.e., `{}`).
So let's explore what this means. GSV offer a contrast between two
discourses that differ only in the order in which the updates occur.
The fact that the predictions of the fragment differ depending on
order shows that the system is order-sensitive.
1. Alice isn't hungry. #Alice might be hungry.
According to GSV, the combination of these sentences in this order is
"inconsistent", and they mark the second sentence with the star of
ungrammaticality. We'll say instead that the discourse is
grammatical, and leave the exact way to think about its intuitive status
up for grabs. What is important for our purposes is to get clear on
how the fragment behaves with respect to these sentences.
We'll start with an infostate containing two possibilities. In one
possibility, Alice is hungry (call this possibility "hungry"); in the
other, she is not (call it "full").

    {hungry, full}[Alice isn't hungry][Alice might be hungry]
    = {full}[Alice might be hungry]
    = {}

As usual in dynamic theories, a sequence of sentences is treated as if
the sentences were conjoined. This is the same thing as updating with
the first sentence, then updating with the second sentence.
Update with *Alice isn't hungry* eliminates the possibility in which
Alice is hungry, leaving only the possibility in which she is full.
Subsequent update with *Alice might be hungry* depends on the result
of updating with the prejacent, *Alice is hungry*. Let's do that side
calculation:

    {full}[Alice is hungry]
    = {}

Because the only possibility in the information state is one in which
Alice is not hungry, update with *Alice is hungry* results in an empty
information state. That means that update with *Alice might be
hungry* will also be empty, as indicated above.
In order for update with *Alice might be hungry* to be non-empty,
there must be at least one possibility in the input state in which
Alice is hungry. That is what epistemic might means in this fragment:
the prejacent must be possible. But update with *Alice isn't hungry*
eliminates all possibilities in which Alice is hungry. So the
prediction of the fragment is that update with the sequence in (1)
will always produce an empty information state.
In contrast, consider the sentences in the opposite order:
2. Alice might be hungry. Alice isn't hungry.
We'll start with the same two possibilities.

    {hungry, full}[Alice might be hungry][Alice isn't hungry]
    = {hungry, full}[Alice isn't hungry]
    = {full}

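
The two calculations can be checked mechanically. Below is a minimal sketch of the modal rule together with both orders of update. The encoding and all names (`might`, `neg`, `hungry`, `full`, `order1`, `order2`) are our own assumptions for illustration and are not taken from `gsv.ml`.

```ocaml
(* Hypothetical encoding for illustration; not the course's gsv.ml. *)
type entity = Alice | Bob | Carl
type term = Ind of entity | Var of string
type world = (string * entity list) list
type assignment = (string * entity) list
type poss = world * assignment
type state = poss list

let ref_of ((_, g) : poss) (t : term) : entity =
  match t with
  | Ind e -> e
  | Var x -> List.assoc x g

let pred (p : string) (t : term) (s : state) : state =
  List.filter (fun ((w, _) as i) -> List.mem (ref_of i t) (List.assoc p w)) s

let neg (u : state -> state) (s : state) : state =
  List.filter (fun i -> u [i] = []) s

(* s[◊φ]: a global test. If s[φ] is empty, nothing survives;
   otherwise s passes through unchanged. *)
let might (u : state -> state) (s : state) : state =
  if u s = [] then [] else s

(* Two possibilities with empty assignments *)
let hungry : poss = ([ ("hungry", [Alice]) ], [])
let full : poss = ([ ("hungry", []) ], [])

let is_hungry = pred "hungry" (Ind Alice)

(* (1) Alice isn't hungry. Alice might be hungry. *)
let order1 = [hungry; full] |> neg is_hungry |> might is_hungry

(* (2) Alice might be hungry. Alice isn't hungry. *)
let order2 = [hungry; full] |> might is_hungry |> neg is_hungry
```

`order1` is the empty state, while `order2` is the state containing only `full`.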
GSV comment that a single speaker couldn't possibly be in a position
to utter the discourse in (2). The reason is that in order for the
speaker to appropriately assert that Alice isn't hungry, that speaker
would have to possess knowledge (or sufficient justification,
depending on your theory of the norms for assertion) that Alice isn't
hungry. But if they know that Alice isn't hungry, they couldn't
appropriately assert *Alice might be hungry*, based on the predictions
of the fragment.
Another view is that it can be acceptable to assert a sentence if it
is supported by the information in the common ground. So if the
speaker assumes that as far as the listener knows, Alice might be
hungry, they can utter the discourse in (2). Here's a variant that
makes this thought more vivid:
3. Based on public evidence, Alice might be hungry.
But in fact I have private knowledge that she's not hungry.
The main point to appreciate here is that the update behavior of the
discourses depends on the order in which the updates due to the
individual sentences occur.
Note, incidentally, that there is an asymmetry in the fragment
concerning negation: order matters for the negated pair in (1) and (2),
but not for the corresponding positive pair.
4. Alice might be hungry. Alice *is* hungry.
5. Alice is hungry. (So of course) Alice might be hungry.
Both of these discourses lead to the same update effect: all and only
those possibilities in which Alice is hungry survive. You might think
that asserting *might* requires that the prejacent be not only
possible, but undecided. If you like this idea, you can easily write
an update rule for the diamond on which update with the prejacent and
its negation must both be non-empty.
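
One way to write such a rule is sketched below. The name `might_undecided` and the encoding are hypothetical. This is a variant of our own devising for illustration, not a rule from GSV's paper or from `gsv.ml`.

```ocaml
(* Hypothetical encoding for illustration; not the course's gsv.ml. *)
type entity = Alice | Bob | Carl
type term = Ind of entity | Var of string
type world = (string * entity list) list
type assignment = (string * entity) list
type poss = world * assignment
type state = poss list

let ref_of ((_, g) : poss) (t : term) : entity =
  match t with
  | Ind e -> e
  | Var x -> List.assoc x g

let pred (p : string) (t : term) (s : state) : state =
  List.filter (fun ((w, _) as i) -> List.mem (ref_of i t) (List.assoc p w)) s

let neg (u : state -> state) (s : state) : state =
  List.filter (fun i -> u [i] = []) s

(* Stricter diamond: both the prejacent and its negation must be live *)
let might_undecided (u : state -> state) (s : state) : state =
  if u s = [] || neg u s = [] then [] else s

let hungry : poss = ([ ("hungry", [Alice]) ], [])
let full : poss = ([ ("hungry", []) ], [])
let is_hungry = pred "hungry" (Ind Alice)

(* (4) Alice might be hungry. Alice is hungry. *)
let d4 = [hungry; full] |> might_undecided is_hungry |> is_hungry

(* (5) Alice is hungry. Alice might be hungry. *)
let d5 = [hungry; full] |> is_hungry |> might_undecided is_hungry
```

On this variant, discourses (4) and (5) come apart: `d4` leaves the `hungry` possibility, but `d5` is empty, since once Alice is known to be hungry the question is no longer undecided.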
## Order and binding
The GSV fragment differs from the DPL and the DMG dynamic semantics in
important details. Nevertheless, it says something highly similar to
DPL about anaphora, binding, quantificational binding, and donkey
anaphora (at least, when modality is absent, as we'll discuss below).
In particular, continuing the theme of order-based asymmetries,
6. A man^x entered. He_x sat.
7. He_x sat. A man^x entered.
These discourses differ only in the order of the sentences. Yet the
first allows for coreference between the indefinite and the pronoun,
where the second discourse does not. In order to demonstrate, we'll
need an information state whose assignment function is defined for at
least one variable.

    8. {(w,g[x->b])}

This infostate contains a single possibility, whose assignment maps the
variable x to Bob. Here are the facts in world w:

    extension w "enter" a = false
    extension w "enter" b = true
    extension w "enter" c = true

    extension w "sit" a = true
    extension w "sit" b = true
    extension w "sit" c = false

We can now consider the discourses in (6) and (7) (after magically
converting them to the Predicate Calculus):

    9. Someone^x entered.  He_x sat.

         {(w,g[x->b])}[∃x.enter(x)][sit(x)]

       = (   {(w,g[x->b][x->a])}[enter(x)]
          ++ {(w,g[x->b][x->b])}[enter(x)]
          ++ {(w,g[x->b][x->c])}[enter(x)])[sit(x)]

          -- "enter(x)" filters out the possibility in which x refers
          -- to Alice, since Alice didn't enter

       = (   {}
          ++ {(w,g[x->b][x->b])}
          ++ {(w,g[x->b][x->c])})[sit(x)]

          -- "sit(x)" filters out the possibility in which x refers
          -- to Carl, since Carl didn't sit

       = {(w,g[x->b][x->b])}

One of the key facts here is that even though the existential has
scope only over the first sentence, in effect it binds the pronoun in
the following clause. This is characteristic of dynamic theories in
the style of Groenendijk and Stokhof, including DPL and DMG.
The outcome is different if the order of the sentences is reversed.

    10. He_x sat.  Someone^x entered.

          {(w,g[x->b])}[sit(x)][∃x.enter(x)]

          -- evaluating `sit(x)` rules out nothing, since (coincidentally)
          -- x refers to Bob, and Bob is a sitter

        = {(w,g[x->b])}[∃x.enter(x)]

          -- Just as before, the existential resets x to each object
          -- in turn

        =    {(w,g[x->b][x->a])}[enter(x)]
          ++ {(w,g[x->b][x->b])}[enter(x)]
          ++ {(w,g[x->b][x->c])}[enter(x)]

          -- enter(x) eliminates all those possibilities in which x did
          -- not enter

        = {} ++ {(w,g[x->b][x->b])}
             ++ {(w,g[x->b][x->c])}

        = {(w,g[x->b][x->b]), (w,g[x->b][x->c])}

The result is different than before. Before, there was only one
possibility: that x referred to the only person who both entered and
sat. Here, there remain two possibilities: that x refers to Bob, or
that x refers to Carl. This makes predictions about the
interpretation of continuations of the dialogs:
11. A man^x entered. He_x sat. He_x spoke.
12. He_x sat. A man^x entered. He_x spoke.
The construal of (11) as marked entails that the person who spoke also
entered and sat. The construal of (12) guarantees only that the
person who spoke also entered. There is no guarantee that the person
who spoke sat.
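
The derivations in (9) and (10) can be replayed with a small sketch. The encoding and the names (`exists`, `pred`, `d9`, `d10`) are assumptions made for illustration and are not taken from `gsv.ml`. The facts about who entered and who sat are the ones listed for world w above.

```ocaml
(* Hypothetical encoding for illustration; not the course's gsv.ml. *)
type entity = Alice | Bob | Carl
type term = Ind of entity | Var of string
type world = (string * entity list) list
type assignment = (string * entity) list
type poss = world * assignment
type state = poss list

let domain = [Alice; Bob; Carl]

let ref_of ((_, g) : poss) (t : term) : entity =
  match t with
  | Ind e -> e
  | Var x -> List.assoc x g

let pred (p : string) (t : term) (s : state) : state =
  List.filter (fun ((w, _) as i) -> List.mem (ref_of i t) (List.assoc p w)) s

(* s[∃xφ]: reset x to each object in turn, update with φ, take the union *)
let exists (x : string) (body : state -> state) (s : state) : state =
  domain
  |> List.map (fun a ->
         body (List.map (fun (w, g) -> (w, (x, a) :: List.remove_assoc x g)) s))
  |> List.concat
  |> List.sort_uniq compare

(* Facts in w: Bob and Carl entered; Alice and Bob sat *)
let w : world = [ ("enter", [Bob; Carl]); ("sit", [Alice; Bob]) ]
let s0 : state = [ (w, [("x", Bob)]) ]

let enter_x = pred "enter" (Var "x")
let sit_x = pred "sit" (Var "x")

(* (9) Someone^x entered. He_x sat. *)
let d9 = s0 |> exists "x" enter_x |> sit_x

(* (10) He_x sat. Someone^x entered. *)
let d10 = s0 |> sit_x |> exists "x" enter_x
```

`d9` contains the single possibility in which `x` refers to Bob; `d10` contains two possibilities, one for Bob and one for Carl.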
Intuitively, there is a strong impression in (12) that the person who
entered and spoke not only should not be identified as the person who
sat, he should be different from the person who sat. Some dynamic
systems, such as Heim's File Change Semantics, guarantee non-identity.
That is not guaranteed by the GSV fragment. If you wanted to add this
as a refinement to the fragment, you could require that the
existential only consider objects in the domain that are not in the
range of the starting assignment function.
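
Here is one way such a refinement might look. This is a sketch of our own, not part of GSV's fragment or of `gsv.ml`: the name `exists_novel` is hypothetical, and we have chosen to check novelty separately for each possibility, against the range of that possibility's assignment.

```ocaml
(* Hypothetical encoding for illustration; not the course's gsv.ml. *)
type entity = Alice | Bob | Carl
type term = Ind of entity | Var of string
type world = (string * entity list) list
type assignment = (string * entity) list
type poss = world * assignment
type state = poss list

let domain = [Alice; Bob; Carl]

let ref_of ((_, g) : poss) (t : term) : entity =
  match t with
  | Ind e -> e
  | Var x -> List.assoc x g

let pred (p : string) (t : term) (s : state) : state =
  List.filter (fun ((w, _) as i) -> List.mem (ref_of i t) (List.assoc p w)) s

(* Is the object a already the value of some variable in g? *)
let in_range (a : entity) (g : assignment) : bool =
  List.exists (fun (_, e) -> e = a) g

(* Like the existential, but for each object a, only those possibilities
   whose assignment does not already mention a are extended *)
let exists_novel (x : string) (body : state -> state) (s : state) : state =
  domain
  |> List.map (fun a ->
         s
         |> List.filter (fun (_, g) -> not (in_range a g))
         |> List.map (fun (w, g) -> (w, (x, a) :: List.remove_assoc x g))
         |> body)
  |> List.concat
  |> List.sort_uniq compare

(* Facts in w: Bob and Carl entered; Alice and Bob sat *)
let w : world = [ ("enter", [Bob; Carl]); ("sit", [Alice; Bob]) ]
let s0 : state = [ (w, [("x", Bob)]) ]

(* He_x sat. Someone^x entered. *)
let result =
  s0 |> pred "sit" (Var "x") |> exists_novel "x" (pred "enter" (Var "x"))
```

With this variant, the person who entered can no longer be Bob, who is already the value of `x` when the existential is evaluated, so `result` contains only the possibility in which `x` refers to Carl.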
As usual with dynamic semantics, a point of pride is the ability to
give a good account of donkey anaphora, as in
13. If a woman entered, she sat.
See the paper for details.
## Interactions of binding with modality
At this point, we have a fragment that handles modality, and that
handles indefinites and pronouns. It is only interesting to combine
these two elements if they interact in non-trivial ways. This is
exactly what GSV argue.
The discussion of indefinites in the previous section established the
following dynamic equivalence:

    (∃x.enter(x)) and (sit(x)) ≡ ∃x (enter(x) and sit(x))

In words, existentials take effective scope over subsequent clauses.
The presence of modal possibility, however, disrupts this
generalization. GSV illustrate this with the following story.
The Broken Vase:
There are three children: Alice, Bob, and Carl.
One of them broke a vase.
Alice is known to be innocent.
Someone is hiding in the closet.

    (∃x.closet(x)) and (◊guilty(x)) ≢ ∃x (closet(x) and ◊guilty(x))

To see this, we'll start with the left hand side. We'll need at least
two worlds.

             in closet        guilty
             ---------------  ---------------
        w:   a  true          a  false
             b  false         b  true
             c  false         c  false

        w':  a  false         a  false
             b  false         b  false
             c  true          c  true

GSV say that (∃x.closet(x)) and (◊guilty(x)) is true if there is at
least one possibility in which a person in the closet is guilty. In
this scenario, world w' is the verifying world: Carl is in the closet,
and he's guilty. It remains possible that there are closet hiders who
are not guilty in any world. Alice fits this bill: she's in the
closet in world w, but she is not guilty in any world.
Let's see how this works out in detail.

    14. Someone^x is in the closet.  He_x might be guilty.

          {(w,g), (w',g)}[∃x.closet(x)][◊guilty(x)]

          -- the existential resets x to each object in turn

        = (   {(w,g[x->a])}[closet(x)]
           ++ {(w,g[x->b])}[closet(x)]
           ++ {(w,g[x->c])}[closet(x)]
           ++ {(w',g[x->a])}[closet(x)]
           ++ {(w',g[x->b])}[closet(x)]
           ++ {(w',g[x->c])}[closet(x)])[◊guilty(x)]

          -- only possibilities in which x is in the closet survive
          -- the first update

        = {(w,g[x->a]), (w',g[x->c])}[◊guilty(x)]

          -- Is there any possibility in which x is guilty?
          -- yes: for x = Carl, in world w' Carl broke the vase
          -- that's enough for the possibility modal to allow the entire
          -- infostate to pass through unmodified.

        = {(w,g[x->a]),(w',g[x->c])}

Now we consider the right hand side:

    15. Someone^x is in the closet who_x might be guilty.

          {(w,g), (w',g)}[∃x(closet(x) & ◊guilty(x))]

          -- one infostate for each choice of object for x

        =    {(w,g[x->a]), (w',g[x->a])}[closet(x)][◊guilty(x)]
          ++ {(w,g[x->b]), (w',g[x->b])}[closet(x)][◊guilty(x)]
          ++ {(w,g[x->c]), (w',g[x->c])}[closet(x)][◊guilty(x)]

          -- filter out possibilities in which x is not in the closet

        =    {(w,g[x->a])}[◊guilty(x)]
          ++ {}[◊guilty(x)]
          ++ {(w',g[x->c])}[◊guilty(x)]

          -- the modal test now applies separately for each choice of x:
          -- Alice is not guilty in w, so the first infostate fails;
          -- the only person who was guilty in the closet was Carl in
          -- world w'

        = {(w',g[x->c])}

The result is different. Fewer possibilities remain.
We have eliminated both possible worlds and possible discourses.
So the second formula is more informative.
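
The contrast between the two formulas can be reproduced with a short sketch. All names and the encoding are hypothetical and are not taken from `gsv.ml`. We assume the facts that the derivations in (14) and (15) rely on: Alice is the only one hiding in `w`, where Bob is guilty, and Carl is the only one hiding in `w'`, where Carl is guilty.

```ocaml
(* Hypothetical encoding for illustration; not the course's gsv.ml. *)
type entity = Alice | Bob | Carl
type term = Ind of entity | Var of string
type world = (string * entity list) list
type assignment = (string * entity) list
type poss = world * assignment
type state = poss list

let domain = [Alice; Bob; Carl]

let ref_of ((_, g) : poss) (t : term) : entity =
  match t with
  | Ind e -> e
  | Var x -> List.assoc x g

let pred (p : string) (t : term) (s : state) : state =
  List.filter (fun ((w, _) as i) -> List.mem (ref_of i t) (List.assoc p w)) s

let conj u1 u2 (s : state) : state = u2 (u1 s)

let might (u : state -> state) (s : state) : state =
  if u s = [] then [] else s

let exists (x : string) (body : state -> state) (s : state) : state =
  domain
  |> List.map (fun a ->
         body (List.map (fun (w, g) -> (w, (x, a) :: List.remove_assoc x g)) s))
  |> List.concat
  |> List.sort_uniq compare

(* Assumed facts for the Broken Vase story *)
let w : world = [ ("closet", [Alice]); ("guilty", [Bob]) ]
let w' : world = [ ("closet", [Carl]); ("guilty", [Carl]) ]
let s0 : state = [ (w, []); (w', []) ]

let closet_x = pred "closet" (Var "x")
let guilty_x = pred "guilty" (Var "x")

(* (14): the modal is outside the scope of the existential *)
let lhs = s0 |> exists "x" closet_x |> might guilty_x

(* (15): the modal is inside the scope of the existential *)
let rhs = s0 |> exists "x" (conj closet_x (might guilty_x))
```

`lhs` keeps two possibilities (Alice hiding in `w`, Carl hiding in `w'`), while `rhs` keeps only the one in which Carl is hiding in `w'`.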
As we discovered in class, there is considerable work to be done to
decide which expressions in natural language (if any) are capable of
expressing which of the two translations into the GSV fragment. We
can certainly grasp the truth conditions, but that is not the same
thing as discovering that there are natural language sentences that
express one or the other or both.
## Binding, modality, and identity
The fragment correctly predicts the following contrast:

    16. Someone^x entered.  He_x might be Bob.  He_x might not be Bob.

          (∃x.enter(x)) & ◊x=b & ◊not(x=b)

          -- This discourse requires a possibility in which Bob entered
          -- and another possibility in which someone who is not Bob entered

    17. Someone^x entered who might be Bob and who might not be Bob.

          ∃x (enter(x) & ◊x=b & ◊not(x=b))

          -- This is a contradiction: there is no single person who might be Bob
          -- and who simultaneously might be someone else

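
To see the contrast concretely, here is a sketch with two assumed worlds: `w1`, in which only Bob entered, and `w2`, in which only Carl entered. The encoding and all names are hypothetical and are not taken from `gsv.ml`.

```ocaml
(* Hypothetical encoding for illustration; not the course's gsv.ml. *)
type entity = Alice | Bob | Carl
type term = Ind of entity | Var of string
type world = (string * entity list) list
type assignment = (string * entity) list
type poss = world * assignment
type state = poss list

let domain = [Alice; Bob; Carl]

let ref_of ((_, g) : poss) (t : term) : entity =
  match t with
  | Ind e -> e
  | Var x -> List.assoc x g

let pred (p : string) (t : term) (s : state) : state =
  List.filter (fun ((w, _) as i) -> List.mem (ref_of i t) (List.assoc p w)) s

let eq (t1 : term) (t2 : term) (s : state) : state =
  List.filter (fun i -> ref_of i t1 = ref_of i t2) s

let conj u1 u2 (s : state) : state = u2 (u1 s)
let neg u (s : state) : state = List.filter (fun i -> u [i] = []) s
let might u (s : state) : state = if u s = [] then [] else s

let exists (x : string) (body : state -> state) (s : state) : state =
  domain
  |> List.map (fun a ->
         body (List.map (fun (w, g) -> (w, (x, a) :: List.remove_assoc x g)) s))
  |> List.concat
  |> List.sort_uniq compare

(* Assumed worlds: Bob entered in w1, Carl entered in w2 *)
let w1 : world = [ ("enter", [Bob]) ]
let w2 : world = [ ("enter", [Carl]) ]
let s0 : state = [ (w1, []); (w2, []) ]

let enter_x = pred "enter" (Var "x")
let x_is_bob = eq (Var "x") (Ind Bob)

(* (16): the modals are tests on the whole infostate *)
let d16 =
  s0 |> exists "x" enter_x |> might x_is_bob |> might (neg x_is_bob)

(* (17): the modals are evaluated separately for each choice of x *)
let d17 =
  s0 |> exists "x"
          (conj enter_x (conj (might x_is_bob) (might (neg x_is_bob))))
```

`d16` retains both possibilities (`x` is Bob in `w1`, `x` is Carl in `w2`), whereas `d17` is empty.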
These formulas are expressing extensional, de-reish intuitions. If we
add individual concepts to the fragment, the ability to express
fancier claims would come along.
## GSV's "Identifiers"
Let α be a term which differs from x. Then α is an identifier if the
following formula is supported by every information state:

    ∀x(◊(x=α) --> (x=α))

The idea is that α is an identifier just in case there is only one
object that it can refer to. Here is what GSV say:
A term is an identifier per se if no matter what the information
state is, it cannot fail to decide what the denotation of the term is.
## Digression on pegs
One of the more salient aspects of the technical part of the paper is
that GSV insert an extra level in between the variable and the object:
instead of having an assignment function that maps variables directly
onto objects, GSV provide *pegs*: variables map onto pegs, and pegs
map onto objects. It happens that pegs play no role in the paper
whatsoever. We'll demonstrate this by providing a faithful
implementation of the paper that does not use pegs at all.
Nevertheless, it makes sense to pause here to discuss pegs briefly,
since this technique is highly relevant to one of the main
applications of the course, namely, reference and coreference.
What are pegs? The term harks back to a 1986 paper by Fred Landman
called "Pegs and Alecs". Pegs are simply hooks for hanging properties
on. Pegs are supposed to be as anonymous as possible. Think of
hanging your coat on a physical peg: you don't care which peg it is,
only that there are enough pegs for everyone's coat to hang from.
Likewise, for the pegs of GSV, all that matters is that there are
enough of them. (Incidentally, there is nothing in Groenendijk and
Stokhof's original DPL paper that corresponds naturally to pegs; but
in their Dynamic Montague Grammar paper, pegs serve a purpose similar
to discourse referents there, though the connection is not simple.)
Pegs can be highly useful for exploring puzzles of reference and
coreference.

    Standard assignment function     System with Pegs (drefs)
    ----------------------------     ------------------------
    Variable    Object               Var    Peg    Object
    ---------   -------              ---    ---    ------
    x     -->   a                    x  --> 0  --> a
    y     -/                         y  -/
    z     -->   b                    z  --> 1  --> a

A standard assignment function can map two different variables onto
the same object. In the diagram, x and y are both mapped onto the
object a. With discourse referents in view, we can have two different
flavors of coreference. Just as with ordinary assignment functions,
variables can be mapped onto pegs (discourse referents) that are in
turn mapped onto the same object. In the diagram, x is mapped onto
the peg 0, which in turn is mapped onto the object a, and z is mapped
onto a discourse referent that is mapped onto a. On a deeper level,
we can suppose that y is mapped onto the same discourse referent as
x. With a system like this, we are free to reassign the discourse
referent associated with z to a different object, in which case x and
z will no longer refer to the same object. But there is no way to
change the object associated with x without necessarily changing the
object associated with y. They are coreferent in a deeper, less
accidental sense.
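
The two flavors of coreference can be made concrete with a small sketch of the two-level lookup shown in the right-hand side of the diagram. The type and function names (`refsys`, `peg_assignment`, `lookup`, `reassign`) are our own choices for illustration; they are not claimed to match GSV's definitions or `gsv.ml`.

```ocaml
(* Hypothetical two-level lookup: variables -> pegs -> objects *)
type entity = A | B
type var = string
type peg = int
type refsys = (var * peg) list              (* variables to pegs *)
type peg_assignment = (peg * entity) list   (* pegs to objects *)

(* Find the peg for the variable, then the object for that peg *)
let lookup (r : refsys) (g : peg_assignment) (v : var) : entity =
  List.assoc (List.assoc v r) g

(* Point a peg at a different object *)
let reassign (p : peg) (e : entity) (g : peg_assignment) : peg_assignment =
  (p, e) :: List.remove_assoc p g

(* x and y share peg 0; z has its own peg 1; both pegs map to A *)
let r : refsys = [ ("x", 0); ("y", 0); ("z", 1) ]
let g : peg_assignment = [ (0, A); (1, A) ]

(* Reassigning z's peg breaks the accidental coreference of x and z *)
let g_z = reassign 1 B g

(* Reassigning x's peg necessarily carries y along with it *)
let g_x = reassign 0 B g
```

Under `g_z`, `x` still refers to `A` while `z` now refers to `B`. Under `g_x`, both `x` and `y` refer to `B`: there is no way to move one without the other.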
GSV could make use of this expressive power. But they don't. In
fact, their system is carefully designed to guarantee that every
variable is assigned a discourse referent distinct from all previous
discourse referents.
End of digression on pegs.