topics/_week10_gsv.mdwn

   1 <!-- λ ◊ ≠ ∃ Λ ∀ ≡ α β γ ρ ω φ ψ Ω ○ μ η δ ζ ξ ⋆ ★ • ∙ ● ⚫ 𝟎 𝟏 𝟐 𝟘 𝟙 𝟚 𝟬 𝟭 𝟮 ⇧ (U+2e17) ¢ -->
   2
   3 [[!toc levels=2]]
   4
   5 # Doing things with monads
   6
   7 ## Extended application: Groenendijk, Stokhof and Veltman's *Coreference and Modality*
   8
   9 GSV are interested in developing and establishing a reasonable theory
  10 of discourse update.  One way of looking at this paper is like this:
  11
  12   GSV = GS + V
  13
  14 That is, Groenendijk and Stokhof have a well-known theory of dynamic
  15 semantics, and Veltman has a well-known theory of epistemic modality,
  16 and this fragment brings both of those strands together into a single
  17 system.
  18
  19 We will be interested in this paper both from a theoretical point of
  20 view and from a practical engineering point of view.  On the
  21 theoretical level, these scholars are proposing a strategy for
  22 managing the connection between variables and the objects they
  23 designate in way that is flexible enough to be useful for describing
  24 natural language.  The main way they attempt to do this is by
  25 inserting an extra level in between the variable and the object:
  26 instead of having an assignment function that maps variables directly
  27 onto objects, GSV provide *pegs*: variables map onto pegs, and pegs
  28 map onto objects.  We'll discuss in considerable detail what pegs
  29 allow us to do, since it is highly relevant to one of the main
  30 applications of the course, namely, reference and coreference.
  31
  32 What are pegs?  The term harks back to a paper by Landman called `Pegs
  33 and Alecs'.  There pegs are simply hooks for hanging properties on.
  34 Pegs are supposed to be as anonymous as possible.  Think of hanging
  35 your coat on a physical peg: you don't care which peg it is, only that
  36 there are enough pegs for everyone's coat to hang from.  Likewise, for
  37 the pegs of GSV, all that matters is that there are enough of them.
  38 (Incidentally, there is nothing in Gronendijk and Stokhof's original
  39 DPL paper that corresponds naturally to pegs; but in their Dynamic
  40 Montague Grammar paper, pegs serve a purpose similar to discourse
  41 referents there, though the connection is not simple.)
  42
  43 On an engineering level, the fact that GSV are combining anaphora and
  44 bound quantification with epistemic quantification means that they are
  45 gluing together related but distinct subsystems into a single
  46 fragment.  These subsystems naturally cleave into separate layers in a
  47 way that is obscured in the paper.  We will argue in detail that
  48 re-engineering GSV using monads will lead to a cleaner system that
  49 does all of the same theoretical work.
  50
  51 Empirical targets: on the anaphoric side, GSV want to
  52
  53 On the epistemic side, GSV aim to account for asymmetries such as
  54
  55     It might be raining.  It's not raining.
  56     #It's not raining.  It might be raining.
  57
  58 ## Basics
  59
  60 There are a lot of formal details in the paper in advance of the
  61 empirical discussion.  Here are the ones that matter:
  62
  63     type var = string
  64     type peg = int
  65     type refsys = var -> peg
  66     type ent = Alice | Bob | Carl
  67     type assignment = peg -> ent
  68
  69 So in order to get from a variable to an object, we have to compose a
  70 refsys `r` with an assignment `g`.  For instance, we might have
  71 r (g ("x")) = Alice.
  72
  73     type pred = string
  74     type world = pred -> ent -> bool
  75     type pegcount = int
  76     type poss = world * pegcount * refsys * assignment
  77     type infostate = [poss]
  78
  79 Worlds in general settle all matters of fact in the world.  In
  80 particular, they determine the extensions of predicates and relations.
  81 In this discussion, we'll (crudely) approximate worlds by making them
  82 a function from predicates such as "man" to a function mapping each
  83 entity to a boolean.
  84
  85 As we'll see, indefinites as a side effect increase the number of pegs
  86 by one.  GSV assume that we can determine what integer the next unused
  87 peg corresponds to by examining the range of the refsys function.
  88 We'll make things easy on ourselves by simply tracking the total
  89 number of used pegs in a counter called `pegcount`.
  90
  91 So information states track both facts about the world (e.g., which
  92 objects count as a man), and facts about the discourse (e.g., how many
  93 pegs have been used).
  94
  95 The formal language the fragment interprets is Predicate Calculus with
  96 equality, existential and universal quantification, and one unary
  97 modality (box and diamond, corresponding to epistemic necessity and
  98 epistemic possibility).
  99
 100 Terms in this language are either individuals such as Alice or Bob, or
 101 else variables.  So in general, the referent of a term can depend on a
 102 possibility:
 103
 104     ref(i, t) = t if t is an individual, and
 105                 g(r(t)) if t is a variable, where i = (w,n,r,g)
 106
 107 Here are the main clauses for update (their definition 3.1).
 108
 109 Following GSV, we'll write `update(s, φ)` (the update of information
 110 state `s` with the information in φ) as `s[φ]`.
 111
 112     s[P(t)] = {i in s | w(P)(ref(i,t))}
 113
 114 So `man(x)` is the set of live possibilities `i = (w,r,g)` in s such that
 115 the set of men in `w` given by `w(man)` maps the object referred to by
 116 `x`, namely, `r(g("x"))`, to `true`.   That is, update with "man(x)"
 117 discards all possibilities in which "x" fails to refer to a man.
 118
 119     s[t1 = t2] = {i in s | ref(i,t1) = ref(i,t2)}
 120
 121     s[φ and ψ] = s[φ][ψ]
 122
 123 When updating with a conjunction, first update with the left conjunct,
 124 then update with the right conjunct.
 125
 126 Existential quantification requires adding a new peg to the set of
 127 discourse referents.
 128
 129     s[∃xφ] = {(w, n+1, r[x->n], g[n->a]) | (w,n,r,g) in s and a in ent}[φ]
 130
 131 Here's the recipe: for every possibility (w,n,r,g) in s, and for every
 132 entity a in the domain of discourse, construct a new possibility with
 133 the same world w, an incrementd peg count n+1, and a new r and g
 134 adjusted in such a way that the variable x refers to the object a.
 135
 136 Note that this recipe does not examine φ.  This means that this
 137 analysis treats the formula prefix `∃x` as if it were a meaningful
 138 constituent independent of φ.
 139
 140 Negation is natural enough:
 141
 142     s[neg φ] =  {i | {i}[φ] = {}}
 143
 144 If updating φ with the information state that contains only the
 145 possibility i returns the empty information state, then not φ is true
 146 with respect to i.
 147
 148 In GSV, disjunction, the conditional, and the universals are defined
 149 in terms of negation and the other connectives.
 150
 151 Exercise: assume that there are two entities in the domain of
 152 discourse, Alice and Bob.  Assume that Alice is a woman, and Bob is a
 153 man.  Show the following computations, where `i = (w,n,r,g)`:
 154
 155     1. {i}[∃x.person(x)]
 156
 157        = {(w,n+1,r[x->n],g[n->a]),(w,n+1,r[x->n],g[n->b])}[person(x)]
 158        = {(w,n+1,r[x->n],g[n->a]),(w,n+1,r[x->n],g[n->b])}
 159
 160     2. {i}[∃x.man(x)]
 161
 162        = {(w,n+1,r[x->n],g[n->a]),(w,n+1,r[x->n],g[n->b])}[person(x)]
 163        = {(w,n+1,r[x->n],g[n->b])}
 164
 165
 166     3. {i}[∃x∃y.person(x) and person(y)]
 167
 168        = {(w,n+1,r[x->n],g[n->a]),(w,n+1,r[x->n],g[n->b])}[∃y.person(x) and person(y)]
 169        = {(w, n+2, r[x->n][y->n+1], g[n->a][n+1->a]),
 170           (w, n+2, r[x->n][y->n+1], g[n->a][n+1->b]),
 171           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->a]),
 172           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->b])
 173          }[person(x) and person(y)]
 174        = {(w, n+2, r[x->n][y->n+1], g[n->a][n+1->a]),
 175           (w, n+2, r[x->n][y->n+1], g[n->a][n+1->b]),
 176           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->a]),
 177           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->b])
 178          }
 179
 180     4. {i}[∃x∃y.x=x]
 181
 182        = {(w, n+2, r[x->n][y->n+1], g[n->a][n+1->a]),
 183           (w, n+2, r[x->n][y->n+1], g[n->a][n+1->b]),
 184           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->a]),
 185           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->b])
 186          }[∃x∃y.x=x]
 187        = {(w, n+2, r[x->n][y->n+1], g[n->a][n+1->a]),
 188           (w, n+2, r[x->n][y->n+1], g[n->a][n+1->b]),
 189           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->a]),
 190           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->b])
 191          }
 192
 193     5. {i}[∃x∃y.x=y]
 194
 195        = {(w, n+2, r[x->n][y->n+1], g[n->a][n+1->a]),
 196           (w, n+2, r[x->n][y->n+1], g[n->a][n+1->b]),
 197           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->a]),
 198           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->b])
 199          }[∃x∃y.x=y]
 200        = {(w, n+2, r[x->n][y->n+1], g[n->a][n+1->a]),
 201           (w, n+2, r[x->n][y->n+1], g[n->b][n+1->b])
 202          }
 203
 204 ## Order and modality
 205
 206 The final remaining update rule concerns modality:
 207
 208     s[◊φ] = {i in s | s[φ] ≠ {}}
 209
 210 This is a peculiar rule: a possibility `i` will survive update just in
 211 case something is true of the information state `s` as a whole.  That
 212 means that either every `i` in `s` will survive, or none of them will.  The
 213 criterion is that updating `s` with the information in φ does not
 214 produce the contradictory information state (i.e., `{}`).
 215
 216 So let's explore what this means.  GSV offer a contrast between two
 217 discourses that differ only in the order in which the updates occur.
 218 The fact that the predictions of the fragment differ depending on
 219 order shows that the system is order-sensitive.
 220
 221     1. Alice isn't hungry.  #Alice might be hungry.
 222
 223 According to GSV, the combination of these sentences in this order is
 224 `inconsistent', and they mark the second sentence with the star of
 225 ungrammaticality.  We'll say instead that the discourse is
 226 gramamtical, leave the exact word to use for its intuitive effect up
 227 for grabs.  What is important for our purposes is to get clear on how
 228 the fragment behaves with respect to these sentences.
 229
 230 We'll start with an infostate containing two possibilities.  In one
 231 possibility, Alice is hungry (call this possibility "hungry"); in the
 232 other, she is not (call it "full").
 233
 234       {hungry, full}[Alice isn't hungry][Alice might be hungry]
 235     = {full}[Alice might be hungry]
 236     = {}
 237
 238 As usual in dynamic theories, a sequence of sentences is treated as if
 239 the sentence were conjoined.  This is the same thing as updating with
 240 the first sentence, then updating with the second sentence.
 241 Update with *Alice isn't hungry* eliminates the possibility in which
 242 Alice is hungry, leaving only the possibility in which she is full.
 243 Subsequent update with *Alice might be hungry* depends on the result
 244 of updating with the prejacent, *Alice is hungry*.  Let's do that side
 245 calculation:
 246
 247       {full}[Alice is hungry]
 248     = {}
 249
 250 Because the only possibility in the information state is one in which
 251 Alice is not hungry, update with *Alice is hungry* results in an empty
 252 information state.  That means that update with *Alice might be
 253 hungry* will also be empty, as indicated above.
 254
 255 In order for update with *Alice might be hungry* to be non-empty,
 256 there must be at least one possibility in the input state in which
 257 Alice is hungry.  That is what epistemic might means in this fragment:
 258 the prejacent must be possible.  But update with *Alice isn't hungry*
 259 eliminates all possibilities in which Alice is hungry.  So the
 260 prediction of the fragment is that update with the sequence in (1)
 261 will always produce an empty information state.
 262
 263 In contrast, consider the sentences in the opposite order:
 264
 265     2. Alice might be hungry.  Alice isn't hungry.
 266
 267 We'll start with the same two possibilities.
 268
 269
 270     = {hungry, full}[Alice might be hungry][Alice isn't hungry]
 271     = {hungry, full}[Alice isn't hungry]
 272     = {full}
 273
 274 Update with *Alice might be hungry* depends on the result of updating
 275 with the prejacent, *Alice is hungry*.  Here's the side calculation:
 276
 277       {hungry, full}[Alice is hungry]
 278     = {hungry}
 279
 280 Since this update is non-empty, all of the original possibilities
 281 survive update with *Alice might be hungry*.  By now it should be
 282 obvious that update with a *might* sentence either has no effect, or
 283 produces an empty information state.  The net result is that we can
 284 then go on to update with *Alice isn't hungry*, yielding an updated
 285 information state that contains only possibilities in which Alice
 286 isn't hungry.
 287
 288 GSV comment that a single speaker couldn't possibly be in a position
 289 to utter the discourse in (2).  The reason is that in order for the
 290 speaker to appropriately assert that Alice isn't hungry, that speaker
 291 would have to possess knowledge (or sufficient justification,
 292 depending on your theory of the norms for assertion) that Alice isn't
 293 hungry.  But if they know that Alice isn't hungry, they couldn't
 294 appropriately assert *Alice might be hungry*, based on the predictions
 295 of the fragment.
 296
 297 Another view is that it can be acceptable to assert a sentence if it
 298 is supported by the information in the common ground.  So if the
 299 speaker assumes that as far as the listener knows, Alice might be
 300 hungry, they can utter the discourse in (2).  Here's a variant that
 301 makes this thought more vivid:
 302
 303     3. Based on public evidence, Alice might be hungry.  But in fact she's not hungry.
 304
 305 The main point to appreciate here is that the update behavior of the
 306 discourses depends on the order in which the updates due to the
 307 individual sentence occur.
 308
 309 Note, incidentally, that there is an asymmetry in the fragment
 310 concerning negation.
 311
 312     4. Alice might be hungry.  Alice *is* hungry.
 313     5. Alice is hungry.  (So of course) Alice might be hungry.
 314
 315 Both of these discourses lead to the same update effect: all and only
 316 those possibilites in which Alice is hungry survive.  If you think
 317 that asserting *might* requires that the prejacent be undecided, you
 318 will have to consider an update rule for the diamond on which update
 319 with the prejacent and its negation must both be non-empty.
 320
 321 ## Binding
 322
 323 The GSV fragment differs from the DPL and the DMG dynamic semantics in
 324 important details.  Nevertheless, it has more or less the same things
 325 to say about anaphora, binding, quantificational binding, and donkey
 326 anaphora.
 327
 328 In particular, continuing the theme of order-based asymmetries,
 329
 330     6. A man^x entered.  He_x sat.
 331     7. He_x sat.  A man^x entered.
 332
 333 These discourses differ only in the order of the sentences.  Yet the
 334 first allows for coreference between the indefinite and the pronoun,
 335 where the second discourse does not.  In order to demonstrate, we'll
 336 need an information state whose refsys is defined for at least one
 337 variable.
 338
 339     8. {(w,1,r[x->0],g[0->b])}
 340
 341 This infostate contains a refsys and an assignment that maps the
 342 variable x to Bob.  Here are the facts in world w:
 343
 344     w "enter" a = false
 345     w "enter" b = true
 346     w "enter" c = true
 347     w "sit" a = true
 348     w "sit" b = true
 349     w "sit" c = false
 350
 351 We can now consider the discourses in (6) and (7) (after magically
 352 converting them to the Predicate Calculus):
 353
 354     9. Someone^x entered.  He_x sat.
 355
 356          {(w,1,r[x->0],g[0->b])}[∃x.enter(x)][sit(x)]
 357
 358           -- the existential adds a new peg and assigns it to each
 359           -- entity in turn
 360
 361        = {(w,2,r[x->0][x->1],g[0->b][1->a]),
 362           (w,2,r[x->0][x->1],g[0->b][1->b]),
 363           (w,2,r[x->0][x->1],g[0->b][1->c])}[enter(x)][sit(x)]
 364
 365           -- "enter(x)" filters out the possibility in which x refers
 366           -- to Alice, since Alice didn't enter
 367
 368        = {(w,2,r[x->0][x->1],g[0->b][1->b]),
 369           (w,2,r[x->0][x->1],g[0->b][1->c])}[sit(x)]
 370
 371           -- "sit(x)" filters out the possibility in which x refers
 372           -- to Carl, since Carl didn't sit
 373
 374        = {(w,2,r[x->0][x->1],g[0->b][1->b])}
 375
 376 Note that `r[x->0][x->1]` maps `x` to 1---the outermost adjustment is
 377 the operative one.  In other words, `r[x->0][x->1] == (r[x->0])[x->1]`.
 378
 379 One of the key facts here is that even though the existential has
 380 scope only over the first sentence, in effect it binds the pronoun in
 381 the following clause.  This is characteristic of dynamic theories in
 382 the style of Groenendijk and Stokhof, including DPL and DMG.
 383
 384     10. He_x sat.  Someone^x entered.
 385
 386          {(w,1,r[x->0],g[0->b])}[sit(x)][∃x.enter(x)]
 387
 388          -- evaluating `sit(x)` rules out nothing, since (coincidentally)
 389          -- x refers to Bob, and Bob is a sitter
 390
 391        = {(w,1,r[x->0],g[0->b])}[∃x.enter(x)]
 392
 393          -- Just as before, the existential adds a new peg and assigns
 394          -- it to each object
 395
 396        = {(w,2,r[x->0][x->1],g[0->b][1->a]),
 397           (w,2,r[x->0][x->1],g[0->b][1->b]),
 398           (w,2,r[x->0][x->1],g[0->b][1->c])}[enter(x)]
 399
 400          -- enter(x) eliminates all those possibilities in which x did
 401          -- not enter
 402
 403        = {(w,2,r[x->0][x->1],g[0->b][1->b]),
 404           (w,2,r[x->0][x->1],g[0->b][1->c])}
 405
 406 The result is different than before.  Before, there was only one
 407 possibility: that x refered to the only person who both entered and
 408 sat.  Here, there remain two possibilities: that x refers to Bob, or
 409 that x refers to Carl.  This makes predictions about the
 410 interpretation of continuations of the dialogs:
 411
 412     11. A man^x entered.  He_x sat.  He_x spoke.
 413     12. He_x sat.  A man^x entered.  He_x spoke.
 414
 415 The construal of (11) as marked entails that the person who spoke also
 416 entered and sat.  The construal of (12) guarantees only that the
 417 person who spoke also entered.  There is no guarantee that the person
 418 who spoke sat.
 419
 420 Intuitively, there is a strong impression in (12) that the person who
 421 entered and spoke not only should not be identified as the person who
 422 sat, he should be different from the person who sat.  Some dynamic
 423 systems, such as Heim's File Change Semantics, guarantee non-identity.
 424 That is not guaranteed by the GSV fragment.  The GSV guarantees that
 425 the indefinite introduces a novel peg, but there is no requirement
 426 that the peg refers to a novel object.  If you wanted to add this as a
 427 refinement to the fragment, you could required that whenever a new peg
 428 gets added, it must be mapped onto an object that is not in the range
 429 of the original assignment function.
 430
 431 As usual with dynamic semantics, a point of pride is the ability to
 432 give a good account of donkey anaphora, as in
 433
 434     13. If a woman entered, she sat.
 435
 436 See the paper for details.
 437
 438 ## Interactions of binding with modality
 439
 440 At this point, we have a fragment that handles modality, and that
 441 handles indefinites and pronouns.  It it only interesting to combine
 442 these two elements if they interact in non-trivial ways.  This is
 443 exactly what GSV argue.
 444
 445 The discussion of indefinites in the previous section established the
 446 following dynamic equivalence:
 447
 448     (∃x.enter(x)) and (sit(x)) ≡ ∃x (enter(x) and sit(x))
 449
 450 In words, existentials in effect take scope over subsequent clauses.
 451
 452 The presence of modal possibility, however, disrupts this
 453 generalization:
 454
 455     (∃x.enter(x)) and (◊sit(x)) ≡/≡ ∃x (enter(x) and ◊sit(x))
 456
 457 To see this, we'll start with the left hand side.
 458
 459     14. Someone^x entered.  He_x might sit.
 460
 461          {(w,1,r[x->0],g[0->b])}[∃x.enter(x)][◊sit(x)]
 462
 463           -- same computation up to the point of the modal
 464
 465        = {(w,2,r[x->0][x->1],g[0->b][1->b]),
 466           (w,2,r[x->0][x->1],g[0->b][1->c])}[◊sit(x)]
 467
 468           -- modal returns all or none, depending on whether the
 469           -- prejacent is consistent with the starting infostate.
 470           -- since there is one choice for x who sat, returns all:
 471
 472        = {(w,2,r[x->0][x->1],g[0->b][1->b]),
 473           (w,2,r[x->0][x->1],g[0->b][1->c])}
 474
 475 To paraphrase, the requirements are that there must be a person who
 476 entered, and it might be possible that that person sat.  But this is
 477 not metaphysical possibility: we're not choosing a person an wondering
 478 whether that person sat.  If that's what we had in mind, we'd go off
 479 to a bunch of non-actual possible worlds and see what is happening
 480 there.  Instead, this is supposed to be epistemic possibility.  The
 481 paraphrase should be something like: there must be a person who
 482 entered, and for all we know, that person might have sat.
 483
 484 The peculiar thing is that the uncertainty has nothing to do with the
 485 facts of the world, but only with the fact about the discourse: it's
 486 uncertainty about which object the pronoun refers to.  GSV work hard
 487 to make this interpretation plausible.  Here's their story:
 488
 489     There are three kids.  One of them breaks a vase.  One is known to
 490     be innocent.  There are sounds coming out of the closet.
 491
 492     15. Someone^x is in the closet.  He_x might be guilty.
 493
 494 You have enough information to know that someone is in the closet.
 495 You use the pronoun to refer to the person in the closet, and assert
 496 that, for all you know, that person might be guilty.  The fragment
 497 gives you guaranteed coreference---it's whoever is in the closet who
 498 might be guilty---in the presence of uncertainty about who the pronoun
 499 refers to.
 500
 501 Now we consider the second half:
 502
 503     14. Someone^x entered who_x might sit.
 504
 505          {(w,1,r[x->0],g[0->b])}[∃x.(enter(x) & ◊sit(x)]
 506
 507        = {(w,2,r[x->0][x->1],g[0->a]),
 508           (w,2,r[x->0][x->1],g[0->b]),
 509           (w,2,r[x->0][x->1],g[0->c])}[enter(x)][◊sit(x)]
 510
 511           -- recall that Alice didn't enter, so
 512
 513        = {(w,2,r[x->0][x->1],g[0->b]),
 514           (w,2,r[x->0][x->1],g[0->c])}[◊sit(x)]
 515