[[!toc levels=2]]

# Doing things with monads

## Extended application: Groenendijk, Stokhof and Veltman's *Coreference and Modality*

GSV are interested in developing and establishing a reasonable theory of discourse update. One way of looking at this paper is like this:

    GSV = GS + V

    GS = Dynamic Predicate Logic L&P 1991: dynamic binding, donkey anaphora
         Dynamic Montague Grammar 1990: generalized quantifiers, discourse referents

    V = epistemic modality

That is, Groenendijk and Stokhof have a well-known theory of dynamic semantics, and Veltman has a well-known theory of epistemic modality, and this fragment brings both of those strands together into a single system.

We will be interested in this paper both from a theoretical point of view and from a practical engineering point of view.

On the theoretical level, these scholars are proposing a strategy for managing the connection between variables and the objects they designate in a way that is flexible enough to be useful for describing natural language. The main way they attempt to do this is by inserting an extra level in between the variable and the object: instead of having an assignment function that maps variables directly onto objects, GSV provide *pegs*: variables map onto pegs, and pegs map onto objects. We'll discuss in considerable detail what pegs allow us to do, since it is highly relevant to one of the main applications of the course, namely, reference and coreference.

What are pegs? The term harks back to a paper by Landman called "Pegs and Alecs". There, pegs are simply hooks for hanging properties on. Pegs are supposed to be as anonymous as possible. Think of hanging your coat on a physical peg: you don't care which peg it is, only that there are enough pegs for everyone's coat to hang from. Likewise, for the pegs of GSV, all that matters is that there are enough of them.
(Incidentally, there is nothing in Groenendijk and Stokhof's original DPL paper that corresponds naturally to pegs; but in their Dynamic Montague Grammar paper, pegs serve a purpose similar to discourse referents there, though the connection is not simple.)

On an engineering level, the fact that GSV are combining anaphora and bound quantification with epistemic quantification means that they are gluing together related but distinct subsystems into a single fragment. These subsystems naturally cleave into separate layers in a way that is obscured in the paper. We will argue in detail that re-engineering GSV using monads will lead to a cleaner system that does all of the same theoretical work.

Empirical targets: on the anaphoric side, GSV want to

On the epistemic side, GSV aim to account for asymmetries such as

    It might be raining. It's not raining.
    #It's not raining. It might be raining.

## Basics

There are a lot of formal details in the paper in advance of the empirical discussion. Here are the ones that matter for our purposes:

    type var = string
    type peg = int
    type refsys = var -> peg
    type ent = Alice | Bob | Carl
    type assignment = peg -> ent

So in order to get from a variable to an object, we have to compose a refsys `r` with an assignment `g`, applying `r` first. For instance, we might have `g (r ("x")) = Alice`.

A question to keep in mind as we proceed is why the mapping from variables to objects has been articulated into two functions. Why not map variables directly to objects? (We'll return to this question later.)

    type pred = string
    type world = pred -> ent -> bool
    type pegcount = int
    type poss = world * pegcount * refsys * assignment
    type infostate = poss list   (* a set of possibilities, represented as a list *)

Worlds in general settle all matters of fact in the world. In particular, they determine the extensions of predicates and relations. In this discussion, we'll (crudely) approximate worlds by making them a function from predicates such as "man" to a function mapping each entity to a boolean.
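
The declarations above can be turned into a small runnable OCaml sketch. This is only an illustration, not part of GSV's paper: the sample world `w`, the helper `extend`, the empty mappings `empty_r` and `empty_g`, and the function `lookup` are hypothetical names introduced here, and sets of possibilities are approximated by lists.

```ocaml
(* A minimal sketch of the basic GSV types; helper names are assumptions. *)
type var = string
type peg = int
type ent = Alice | Bob | Carl
type refsys = var -> peg
type assignment = peg -> ent
type pred = string
type world = pred -> ent -> bool
type pegcount = int
type poss = world * pegcount * refsys * assignment
type infostate = poss list

(* Hypothetical sample world: Bob and Carl are men, nothing else holds. *)
let w : world = fun p e ->
  match p, e with
  | "man", (Bob | Carl) -> true
  | _ -> false

(* A refsys and an assignment that are undefined everywhere. *)
let empty_r : refsys = fun v -> failwith ("unbound variable " ^ v)
let empty_g : assignment = fun n -> failwith ("unused peg " ^ string_of_int n)

(* Functional update: the result agrees with f except that k maps to v. *)
let extend f k v = fun k' -> if k' = k then v else f k'

let r : refsys = extend empty_r "x" 0      (* x is hung on peg 0 *)
let g : assignment = extend empty_g 0 Alice (* peg 0 is assigned Alice *)

(* Looking up a variable applies the refsys first, then the assignment. *)
let lookup ((_, _, r, g) : poss) (x : var) : ent = g (r x)

(* One possibility: world w, one peg in use, and the mappings above. *)
let i : poss = (w, 1, r, g)

let () =
  print_endline
    (match lookup i "x" with Alice -> "Alice" | Bob -> "Bob" | Carl -> "Carl")
```
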
As we'll see, indefinites increase the number of pegs by one as a side effect. GSV assume that we can determine what integer the next unused peg corresponds to by examining the range of the refsys function. We'll make things easy on ourselves by simply tracking the total number of used pegs in a counter called `pegcount`.

So information states track both facts about the world (e.g., which objects count as a man), and facts about the discourse (e.g., how many pegs have been used).

The formal language the fragment interprets is Predicate Calculus with equality, existential and universal quantification, and one unary modality (box and diamond, corresponding to epistemic necessity and epistemic possibility).

Terms in this language are either individuals such as Alice or Bob, or else variables. So in general, the referent of a term can depend on a possibility:

    ref(i, t) = t if t is an individual, and
                g(r(t)) if t is a variable, where i = (w,n,r,g)

Here are the main clauses for update (their definition 3.1).

Following GSV, we'll write `update(s, φ)` (the update of information state `s` with the information in φ) as `s[φ]`.

    s[P(t)] = {i in s | w(P)(ref(i,t))}

So `s[man(x)]` is the set of live possibilities `i = (w,n,r,g)` in `s` such that `w(man)` maps the object referred to by `x`, namely `g(r("x"))`, to `true`. That is, update with "man(x)" discards all possibilities in which "x" fails to refer to a man.

    s[t1 = t2] = {i in s | ref(i,t1) = ref(i,t2)}

    s[φ and ψ] = s[φ][ψ]

When updating with a conjunction, first update with the left conjunct, then update with the right conjunct.

Existential quantification is somewhat intricate.

    s[∃xφ] = Union {{(w, n+1, r[x->n], g[n->a]) | (w,n,r,g) in s}[φ] | a in ent}

Here's the recipe: given a starting infostate s, choose an object a from the domain of discourse.
Construct a modified infostate s' by adding a peg to each possibility in s and adjusting the refsys and the assignment in order to map the variable x to a. Then update s' with φ, and collect the results of doing this for every object a in the domain of discourse.

Negation is natural enough:

    s[neg φ] = {i in s | {i}[φ] = {}}

If updating the information state that contains only the possibility i with φ returns the empty information state, then not φ is true with respect to i.

In GSV, disjunction, the conditional, and the universals are defined in terms of negation and the other connectives (see fact 3.2).

Exercise: assume that there are two entities in the domain of discourse, Alice and Bob. Assume that Alice is a woman, and Bob is a man. We're using `++` here to mean set union.

    1.  {(w,n,r,g)}[∃x.person(x)]

        = {(w,n+1,r[x->n],g[n->a])}[person(x)]
          ++ {(w,n+1,r[x->n],g[n->b])}[person(x)]
        = {(w,n+1,r[x->n],g[n->a])}
          ++ {(w,n+1,r[x->n],g[n->b])}
        = {(w,n+1,r[x->n],g[n->a]), (w,n+1,r[x->n],g[n->b])}
          -- both a and b are people

    2.  {(w,n,r,g)}[∃x.man(x)]

        = {(w,n+1,r[x->n],g[n->a])}[man(x)]
          ++ {(w,n+1,r[x->n],g[n->b])}[man(x)]
        = {} ++ {(w,n+1,r[x->n],g[n->b])}
        = {(w,n+1,r[x->n],g[n->b])}
          -- only b is a man

    3.  {(w,n,r,g)}[∃x∃y.person(x) and person(y)]

        = {(w,n+1,r[x->n],g[n->a])}[∃y.person(x) and person(y)]
          ++ {(w,n+1,r[x->n],g[n->b])}[∃y.person(x) and person(y)]
        = (   {(w, n+2, r[x->n][y->n+1], g[n->a][n+1->a])}[person(x)][person(y)]
           ++ {(w, n+2, r[x->n][y->n+1], g[n->a][n+1->b])}[person(x)][person(y)])
          ++
          (   {(w, n+2, r[x->n][y->n+1], g[n->b][n+1->a])}[person(x)][person(y)]
           ++ {(w, n+2, r[x->n][y->n+1], g[n->b][n+1->b])}[person(x)][person(y)])
        = {(w, n+2, r[x->n][y->n+1], g[n->a][n+1->a]),
           (w, n+2, r[x->n][y->n+1], g[n->a][n+1->b])}
          ++
          {(w, n+2, r[x->n][y->n+1], g[n->b][n+1->a]),
           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->b])}
        = {(w, n+2, r[x->n][y->n+1], g[n->a][n+1->a]),
           (w, n+2, r[x->n][y->n+1], g[n->a][n+1->b]),
           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->a]),
           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->b])}
          -- there are four ways of assigning x and y to people

    4.  {(w,n,r,g)}[∃x∃y.x=y]

        = (   {(w, n+2, r[x->n][y->n+1], g[n->a][n+1->a])}[x=y]
           ++ {(w, n+2, r[x->n][y->n+1], g[n->a][n+1->b])}[x=y])
          ++
          (   {(w, n+2, r[x->n][y->n+1], g[n->b][n+1->a])}[x=y]
           ++ {(w, n+2, r[x->n][y->n+1], g[n->b][n+1->b])}[x=y])
        = {(w, n+2, r[x->n][y->n+1], g[n->a][n+1->a])}
          ++
          {(w, n+2, r[x->n][y->n+1], g[n->b][n+1->b])}
        = {(w, n+2, r[x->n][y->n+1], g[n->a][n+1->a]),
           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->b])}
          -- two ways to assign x and y to the same value

## Order and modality

The final remaining update rule concerns modality:

    s[◊φ] = {i in s | s[φ] ≠ {}}

This is a peculiar rule: a possibility `i` will survive update just in case something is true of the information state `s` as a whole. That means that either every `i` in `s` will survive, or none of them will. The criterion is that updating `s` with the information in φ does not produce the contradictory information state (i.e., `{}`).

So let's explore what this means. GSV offer a contrast between two discourses that differ only in the order in which the updates occur.
The fact that the predictions of the fragment differ depending on order shows that the system is order-sensitive.

    1. Alice isn't hungry. #Alice might be hungry.

According to GSV, the combination of these sentences in this order is "inconsistent", and they mark the second sentence with the star of ungrammaticality. We'll say instead that the discourse is grammatical, and leave open what to call its intuitive oddness. What is important for our purposes is to get clear on how the fragment behaves with respect to these sentences.

We'll start with an infostate containing two possibilities. In one possibility, Alice is hungry (call this possibility "hungry"); in the other, she is not (call it "full").

      {hungry, full}[Alice isn't hungry][Alice might be hungry]
    = {full}[Alice might be hungry]
    = {}

As usual in dynamic theories, a sequence of sentences is treated as if the sentences were conjoined. This is the same thing as updating with the first sentence, then updating with the second sentence. Update with *Alice isn't hungry* eliminates the possibility in which Alice is hungry, leaving only the possibility in which she is full. Subsequent update with *Alice might be hungry* depends on the result of updating with the prejacent, *Alice is hungry*. Let's do that side calculation:

      {full}[Alice is hungry]
    = {}

Because the only possibility in the information state is one in which Alice is not hungry, update with *Alice is hungry* results in an empty information state. That means that update with *Alice might be hungry* will also be empty, as indicated above.

In order for update with *Alice might be hungry* to be non-empty, there must be at least one possibility in the input state in which Alice is hungry. That is what epistemic might means in this fragment: the prejacent must be possible. But update with *Alice isn't hungry* eliminates all possibilities in which Alice is hungry.
So the prediction of the fragment is that update with the sequence in (1) will always produce an empty information state.

In contrast, consider the sentences in the opposite order:

    2. Alice might be hungry. Alice isn't hungry.

We'll start with the same two possibilities.

      {hungry, full}[Alice might be hungry][Alice isn't hungry]
    = {hungry, full}[Alice isn't hungry]
    = {full}

Update with *Alice might be hungry* depends on the result of updating with the prejacent, *Alice is hungry*. Here's the side calculation:

      {hungry, full}[Alice is hungry]
    = {hungry}

Since this update is non-empty, all of the original possibilities survive update with *Alice might be hungry*. By now it should be obvious that update with a *might* sentence either has no effect, or produces an empty information state. The net result is that we can then go on to update with *Alice isn't hungry*, yielding an updated information state that contains only possibilities in which Alice isn't hungry.

GSV comment that a single speaker couldn't possibly be in a position to utter the discourse in (2). The reason is that in order for the speaker to appropriately assert that Alice isn't hungry, that speaker would have to possess knowledge (or sufficient justification, depending on your theory of the norms for assertion) that Alice isn't hungry. But if they know that Alice isn't hungry, they couldn't appropriately assert *Alice might be hungry*, based on the predictions of the fragment.

Another view is that it can be acceptable to assert a sentence if it is supported by the information in the common ground. So if the speaker assumes that as far as the listener knows, Alice might be hungry, they can utter the discourse in (2). Here's a variant that makes this thought more vivid:

    3. Based on public evidence, Alice might be hungry. But in fact she's not hungry.
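
The two calculations for (1) and (2) can be replayed in a small OCaml sketch. It rests on simplifying assumptions: a possibility is reduced to a world (pegs and assignments play no role in these examples), information states are lists, and the names `formula`, `update`, `names`, `d1` and `d2` are hypothetical. Under these assumptions `d1` comes out empty and `d2` contains only the "full" possibility.

```ocaml
(* Assumption: a possibility is just a world; pegs are ignored here. *)
type world = { name : string; alice_hungry : bool }
type infostate = world list

type formula =
  | Hungry               (* "Alice is hungry" *)
  | Not of formula
  | Might of formula

let rec update (s : infostate) (phi : formula) : infostate =
  match phi with
  | Hungry -> List.filter (fun w -> w.alice_hungry) s
  (* s[neg psi]: keep i just in case {i}[psi] is empty *)
  | Not psi -> List.filter (fun i -> update [i] psi = []) s
  (* s[might psi]: a test on s as a whole; all of s survives or none of it *)
  | Might psi -> if update s psi = [] then [] else s

let hungry = { name = "hungry"; alice_hungry = true }
let full = { name = "full"; alice_hungry = false }
let s0 = [hungry; full]

let names s = List.map (fun w -> w.name) s

(* (1) Alice isn't hungry. Alice might be hungry. *)
let d1 = update (update s0 (Not Hungry)) (Might Hungry)

(* (2) Alice might be hungry. Alice isn't hungry. *)
let d2 = update (update s0 (Might Hungry)) (Not Hungry)

let () =
  Printf.printf "(1): [%s]\n" (String.concat "; " (names d1));
  Printf.printf "(2): [%s]\n" (String.concat "; " (names d2))
```
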
The main point to appreciate here is that the update behavior of the discourses depends on the order in which the updates due to the individual sentences occur.

Note, incidentally, that this order sensitivity depends on negation. Compare the positive counterparts:

    4. Alice might be hungry. Alice *is* hungry.
    5. Alice is hungry. (So of course) Alice might be hungry.

Both of these discourses lead to the same update effect: all and only those possibilities in which Alice is hungry survive. If you think that asserting *might* requires that the prejacent be undecided, you will have to consider an update rule for the diamond on which update with the prejacent and its negation must both be non-empty.

## Binding

The GSV fragment differs from the DPL and the DMG dynamic semantics in important details. Nevertheless, it has more or less the same things to say about anaphora, binding, quantificational binding, and donkey anaphora.

In particular, continuing the theme of order-based asymmetries,

    6. A man^x entered. He_x sat.
    7. He_x sat. A man^x entered.

These discourses differ only in the order of the sentences. Yet the first allows for coreference between the indefinite and the pronoun, where the second discourse does not. In order to demonstrate, we'll need an information state whose refsys is defined for at least one variable.

    8. {(w,1,r[x->0],g[0->b])}

This infostate contains a refsys and an assignment that together map the variable x to Bob (by way of peg 0). Here are the facts in world w:

    w "enter" a = false
    w "enter" b = true
    w "enter" c = true

    w "sit" a = true
    w "sit" b = true
    w "sit" c = false

We can now consider the discourses in (6) and (7) (after magically converting them to the Predicate Calculus):

    9. Someone^x entered. He_x sat.

         {(w,1,r[x->0],g[0->b])}[∃x.enter(x)][sit(x)]

         -- the existential adds a new peg and assigns it to each
         -- entity in turn

       = (   {(w,2,r[x->0][x->1],g[0->b][1->a])}[enter(x)]
          ++ {(w,2,r[x->0][x->1],g[0->b][1->b])}[enter(x)]
          ++ {(w,2,r[x->0][x->1],g[0->b][1->c])}[enter(x)])[sit(x)]

         -- "enter(x)" filters out the possibility in which x refers
         -- to Alice, since Alice didn't enter

       = (   {}
          ++ {(w,2,r[x->0][x->1],g[0->b][1->b])}
          ++ {(w,2,r[x->0][x->1],g[0->b][1->c])})[sit(x)]

         -- "sit(x)" filters out the possibility in which x refers
         -- to Carl, since Carl didn't sit

       = {(w,2,r[x->0][x->1],g[0->b][1->b])}

Note that `r[x->0][x->1]` maps `x` to 1: the outermost adjustment is the operative one. In other words, `r[x->0][x->1] == (r[x->0])[x->1]`.

One of the key facts here is that even though the existential has scope only over the first sentence, in effect it binds the pronoun in the following clause. This is characteristic of dynamic theories in the style of Groenendijk and Stokhof, including DPL and DMG.

The outcome is different if the order of the sentences is reversed.

    10. He_x sat. Someone^x entered.

         {(w,1,r[x->0],g[0->b])}[sit(x)][∃x.enter(x)]

         -- evaluating `sit(x)` rules out nothing, since (coincidentally)
         -- x refers to Bob, and Bob is a sitter

       = {(w,1,r[x->0],g[0->b])}[∃x.enter(x)]

         -- Just as before, the existential adds a new peg and assigns
         -- it to each object

       =    {(w,2,r[x->0][x->1],g[0->b][1->a])}[enter(x)]
         ++ {(w,2,r[x->0][x->1],g[0->b][1->b])}[enter(x)]
         ++ {(w,2,r[x->0][x->1],g[0->b][1->c])}[enter(x)]

         -- enter(x) eliminates all those possibilities in which x did
         -- not enter

       =    {}
         ++ {(w,2,r[x->0][x->1],g[0->b][1->b])}
         ++ {(w,2,r[x->0][x->1],g[0->b][1->c])}

       = {(w,2,r[x->0][x->1],g[0->b][1->b]),
          (w,2,r[x->0][x->1],g[0->b][1->c])}

The result is different than before. Before, there was only one possibility: that x referred to the only person who both entered and sat. Here, there remain two possibilities: that x refers to Bob, or that x refers to Carl.
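
The two derivations in (9) and (10) can be checked mechanically. The following OCaml sketch is an approximation under stated assumptions: the refsys and the assignment are association lists with the newest binding first (so `r[x->0][x->1]` is `[("x",1); ("x",0)]`), information states are lists, and the names `poss`, `ref_of`, `update`, `seq`, `d9`, `d10` and `referents` are hypothetical. With the facts about `w` given above, `x` ends up referring only to Bob in (9), and to either Bob or Carl in (10).

```ocaml
type ent = Alice | Bob | Carl
type world = string -> ent -> bool

(* Assumption: refsys and assignment as association lists, newest first. *)
type poss = { w : world; n : int; r : (string * int) list; g : (int * ent) list }

type formula =
  | Pred of string * string      (* P(x), where x is a variable *)
  | Exists of string * formula   (* ∃x.φ *)

let domain = [Alice; Bob; Carl]

(* Referent of variable x in possibility i: refsys first, then assignment. *)
let ref_of i x = List.assoc (List.assoc x i.r) i.g

let rec update s phi =
  match phi with
  | Pred (p, x) -> List.filter (fun i -> i.w p (ref_of i x)) s
  | Exists (x, body) ->
    (* For each entity a: add peg n, map x to it, assign it a, update with body. *)
    List.concat
      (List.map
         (fun a ->
            let s' =
              List.map
                (fun i ->
                   { i with n = i.n + 1; r = (x, i.n) :: i.r; g = (i.n, a) :: i.g })
                s
            in
            update s' body)
         domain)

(* The facts in world w, as listed in the text. *)
let w p e =
  match p, e with
  | "enter", (Bob | Carl) -> true
  | "sit", (Alice | Bob) -> true
  | _ -> false

(* The infostate in (8): x is on peg 0, and peg 0 is Bob. *)
let s0 = [ { w; n = 1; r = [("x", 0)]; g = [(0, Bob)] } ]

(* A discourse is a sequence of updates, applied left to right. *)
let seq s phis = List.fold_left update s phis

let d9 = seq s0 [Exists ("x", Pred ("enter", "x")); Pred ("sit", "x")]
let d10 = seq s0 [Pred ("sit", "x"); Exists ("x", Pred ("enter", "x"))]

(* What x refers to in each surviving possibility. *)
let referents s = List.map (fun i -> ref_of i "x") s
```
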
This makes predictions about the interpretation of continuations of the discourses:

    11. A man^x entered. He_x sat. He_x spoke.
    12. He_x sat. A man^x entered. He_x spoke.

The construal of (11) as marked entails that the person who spoke also entered and sat. The construal of (12) guarantees only that the person who spoke also entered. There is no guarantee that the person who spoke sat.

Intuitively, there is a strong impression in (12) that the person who entered and spoke is not merely left unidentified with the person who sat, but should be different from the person who sat. Some dynamic systems, such as Heim's File Change Semantics, guarantee non-identity. That is not guaranteed by the GSV fragment. The GSV fragment guarantees that the indefinite introduces a novel peg, but there is no requirement that the peg refers to a novel object. If you wanted to add this as a refinement to the fragment, you could require that whenever a new peg gets added, it must be mapped onto an object that is not in the range of the original assignment function.

As usual with dynamic semantics, a point of pride is the ability to give a good account of donkey anaphora, as in

    13. If a woman entered, she sat.

See the paper for details.

## Interactions of binding with modality

At this point, we have a fragment that handles modality, and that handles indefinites and pronouns. It is only interesting to combine these two elements if they interact in non-trivial ways. This is exactly what GSV argue.

The discussion of indefinites in the previous section established the following dynamic equivalence:

    (∃x.enter(x)) and (sit(x)) ≡ ∃x (enter(x) and sit(x))

In words, existentials take effective scope over subsequent clauses.

The presence of modal possibility, however, disrupts this generalization. GSV illustrate this with the following story.

    The Broken Vase:
    There are three sons, Bob, Carl, and Dave.
    One of them broke a vase.
    Bob is known to be innocent.
    Someone is hiding in the closet.

    (∃x.closet(x)) and (◊guilty(x)) ≢ ∃x (closet(x) and ◊guilty(x))

To see this, we'll start with the left hand side. We'll need at least two worlds.

           in closet        guilty
           ---------------  ---------------
    w:     b  false         b  false
           c  false         c  false
           d  true          d  true

    w':    b  false         b  false
           c  true          c  false
           d  false         d  true

GSV observe that (∃x.closet(x)) and (◊guilty(x)) is true if there is at least one possibility in which a person in the closet is guilty. In this scenario, world w is the verifying world. It remains possible that there are closet hiders who are not guilty in any world. Carl fits this bill: he's in the closet in world w', but he is not guilty in any world.

Let's see how this works out in detail.

    14. Someone^x is in the closet. He_x might be guilty.

         {(w,0,r,g), (w',0,r,g)}[∃x.closet(x)][◊guilty(x)]

         -- existential introduces new peg

       = (   {(w,1,r[x->0],g[0->b])}[closet(x)]
          ++ {(w,1,r[x->0],g[0->c])}[closet(x)]
          ++ {(w,1,r[x->0],g[0->d])}[closet(x)]
          ++ {(w',1,r[x->0],g[0->b])}[closet(x)]
          ++ {(w',1,r[x->0],g[0->c])}[closet(x)]
          ++ {(w',1,r[x->0],g[0->d])}[closet(x)])[◊guilty(x)]

         -- only possibilities in which x is in the closet survive

       = {(w,1,r[x->0],g[0->d]),
          (w',1,r[x->0],g[0->c])}[◊guilty(x)]

         -- Is there any possibility in which x is guilty?
         -- yes: for x = Dave, in world w Dave broke the vase

       = {(w,1,r[x->0],g[0->d]),
          (w',1,r[x->0],g[0->c])}

Now we consider the second half:

    14. Someone^x is in the closet who_x might be guilty.

         {(w,0,r,g), (w',0,r,g)}[∃x(closet(x) & ◊guilty(x))]

         -- existential introduces new peg; following the update rule
         -- for ∃, the possibilities are grouped by the entity assigned
         -- to the new peg

       =    {(w,1,r[x->0],g[0->b]), (w',1,r[x->0],g[0->b])}[closet(x)][◊guilty(x)]
         ++ {(w,1,r[x->0],g[0->c]), (w',1,r[x->0],g[0->c])}[closet(x)][◊guilty(x)]
         ++ {(w,1,r[x->0],g[0->d]), (w',1,r[x->0],g[0->d])}[closet(x)][◊guilty(x)]

         -- filter out possibilities in which x is not in the closet,
         -- then test, entity by entity, whether x might be guilty
         -- the only person who was guilty in the closet was Dave in
         -- world w

       = {(w,1,r[x->0],g[0->d])}

The result is different, and more informative.

## Binding, modality, and identity

The fragment correctly predicts the following contrast:

    15. Someone^x entered. He_x might be Bob. He_x might not be Bob.
        (∃x.enter(x)) & ◊x=b & ◊not(x=b)
        -- This discourse requires a possibility in which Bob entered
        -- and another possibility in which someone who is not Bob entered

    16. Someone^x entered who might be Bob and who might not be Bob.
        ∃x (enter(x) & ◊x=b & ◊not(x=b))
        -- This is a contradiction: there is no single person who might be Bob
        -- and who simultaneously might be someone else

These formulas are expressing extensional, de-reish intuitions. If we add individual concepts to the fragment, the ability to express fancier claims would come along.

### Identifiers

Let α be a term which differs from x. Then α is an identifier if the following formula is supported by every information state:

    ∀x(◊(x=α) --> (x=α))

The idea is that α is an identifier just in case there is only one object that it can refer to. Here is what GSV say:

    A term is an identifier per se if no matter what the information
    state is, it cannot fail to decide what the denotation of the term is.

## Why articulate the mapping from variables to objects into two parts?

In the current system, variables are associated with values in two steps.

    Variables      Pegs      Entities
    ---------  r   ----  g   --------
    x         -->  0    -->  a
    y         -->  1    -->  b
    z         -->  2    -->  c

Here, r is a refsys mapping variables to pegs, and g is an assignment function mapping pegs to entities.

Assignment functions are free to map different pegs to the same entity:

    Variables      Pegs      Entities
    ---------  r   ----  g   --------
    x         -->  0    -->  a
    y         -->  1    -->  a
    z         -->  2    -->  c

But this is possible with ordinary assignment functions as well.

It is possible to imagine a refsys that maps more than one variable to the same peg. But the fragment is designed to prevent that from ever happening: the only way to associate a variable with a peg is by evaluating an existential quantifier, and the existential quantifier always introduces a fresh, unused peg.

So what does the bipartite system do that ordinary assignment functions can't do?
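
For concreteness, the two diagrams can be written out as data. This OCaml sketch only restates the pictures and does not attempt to answer the question just posed; the association-list encoding and the names `lookup` and `injective` are assumptions made for the illustration.

```ocaml
type ent = A | B | C

(* Assumption: both mappings are encoded as association lists. *)
let r = [("x", 0); ("y", 1); ("z", 2)]   (* refsys: variables to pegs *)
let g1 = [(0, A); (1, B); (2, C)]        (* first diagram *)
let g2 = [(0, A); (1, A); (2, C)]        (* second diagram: pegs 0 and 1 share an entity *)

(* Two-step lookup: variable to peg, then peg to entity. *)
let lookup r g x = List.assoc (List.assoc x r) g

(* A refsys is injective if no two variables are mapped to the same peg. *)
let injective r =
  let pegs = List.map snd r in
  List.length (List.sort_uniq compare pegs) = List.length pegs
```

Here `lookup r g1 "y"` evaluates to `B` and `lookup r g2 "y"` to `A`, while `injective r` holds in both cases, matching the observation that the fragment never maps two variables to one peg.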