[[!toc]]
# Applications of continuations to natural language
We've seen a number of applications of monads to natural language,
including presupposition projection, binding, intensionality, and the
dynamics of the GSV fragment.
In the past couple of weeks, we've introduced continuations, first as
a functional programming technique, then in terms of list and tree
zippers, then as a monad. In this lecture, we will generalize
continuations slightly beyond a monad, and then begin to outline some
of the applications of monads. In brief, the generalization can be
summarized in terms of types: instead of using a Kleisli arrow mapping
a type α to a continuized type α > ρ > ρ, we'll allow the result
types to differ, i.e., we'll map α to α > β > γ. This will be
crucial for some natural language applications.
Many (though not all) of the applications are discussed in detail in
Barker and Shan 2014, *Continuations in Natural Language*, OUP.
In terms of list zippers, the continuation of a focussed element in
the list is the front part of the list.
list zipper for the list [a;b;c;d;e;f] with focus on d:
([c;b;a], [d;e;f])

defunctionalized
continuation
In terms of tree zippers, the continuation is the entire context of
the focussed elementthe entire rest of the tree.
[drawing of a broken tree]
Last week we had trouble computing the doubling task when there was more
than one shifty operator after moving from a list perspective to a
tree perspective. That is, it remained unclear why "aScSe" was
"aacaceecaacaceecee"
We'll burn through that conceptual fog today. The natural thing to
try would have been to defunctionalize the continuationbased solution
using a tree zipper. But that would not have been easy, since the
natural way to implement the doubling behavior of the shifty operator
would have been to simply copy the context provided by the zipper.
This would have produced two uncoordinated copies of the other shifty
operator, and we'd have been in the situation described in class of
having a reduction strategy that never reduced the number of shifty
operators below 2.
Instead, we'll reinterpreting what the continuation monad was doing
in defunctionalized terms by using Quantifier Raising (a technique
from linguistics).
But first, motivating quantifier scope as a linguistic application.
# The primary application of continuations to natural language: scopetaking
We have seen that continuations allow a deeplyembedded element to
take control over (a portion of) the entire computation that contains
it. In natural language semantics, this is exactly what it means for
a scopetaking expression to take scope.
1. [Ann put a copy of [everyone]'s homeworks in her briefcase]
2. For every x, [Ann put a copy of x's homeworks in her briefcase]
The sentence in (1) can be paraphrased as in (2), in which the
quantificational DP *every student* takes scope over the rest of the sentence.
Even if you suspect that there could be an analysis of (2) on which
"every student's term paper" could denote some kind of mereological
fusion of a set of papers, it is much more difficult to be satisfied
with a referential analysis when *every student* is replaced with
*no student*, or *fewer than three students*, and so onsee any
semantics text book for abundant discussion.
We can arrive at an analysis by expressing the meaning of
quantificational DP such as *everyone* using continuations:
3. everyone = shift (\k.∀x.kx)
Assuming there is an implicit reset at the top of the sentence (we'll
explicitly address determining where there is or isn't a reset), the
reduction rules for `shift` will apply the handler function (\k.∀x.kx)
to the remainder of the sentence after abstracting over the position
of the shift expression:
[Ann put a copy of [shift (\k.∀x.kx)]'s homeworks in her briefcase]
~~> (\k.∀x.kx) (\v. Ann put a copy of v's homeworks in her briefcase)
~~> ∀x. Ann put a copy of x's homeworks in her briefcase
(To be a bit pedantic, this reduction sequence is more suitable for
shift0 than for shift, but we're not being fussy here about subflavors
of shifty operators.)
The standard technique for handling scopetaking in linguistics is
Quantifier Raising (QR). As you might suppose, the rule for Quantifier
Raising closely resembles the reduction rule for shift:
Quantifier Raising: given a sentence [... [QDP] ...], build a new
sentence [QDP (\x.[... [x] ...])].
Just to emphasize the similarity between QR and shift, we can use QR
to provide insight into the tree task that mystified us earlier.
\tree (. (a)((S)((d)((S)(e)))))
.
_____
 
a _____
 
S ____
 
d ___
 
S e
First we QR the lower shift operator
\tree (. (S) ((\\x) ((a)((S)((d)((x)(e)))))))
.
______
 
S ______
 
\x _____
 
a _____
 
S ____
 
d ___
 
x e
Next, we QR the upper shift operator
\tree (. (S) ((\\y) ((S) ((\\x) ((a)((y)((d)((x)(e)))))))))
.
______
 
S _______
 
\y ______
 
S ______
 
\x _____
 
a _____
 
y ____
 
d ___
 
x e
We then evaluate, using the same value for the shift operator proposed before:
shift = \k.k(k "")
It will be easiest to begin evaluating this tree with the lower shift
operator (we get the same result if we start with the upper one).
The relevant value for k is (\x.a(y(d(x e)))). Then k "" is
a(y(d(""(e)))), and k(k "") is a(y(d((a(y(d(""(e)))))(e)))). In tree
form:
\tree (. (S) ((\\y) ((a)((y)((d)(((a)((y)((d)(("")(e)))))(e)))))))
.
______
 
S _______
 
\y ______
 
a ______
 
y ______
 
d ______
 
_____ e
 
a _____
 
y _____
 
d ____
 
"" e
Repeating the process for the upper shift operator replaces each
occurrence of y with a copy of the whole tree.
\tree (. ((a)((((a)(("")((d)(((a)(("")((d)(("")(e)))))(e))))))((d)(((a)((((a)(("")((d)(((a)(("")((d)(("")(e)))))(e))))))((d)(("")(e)))))(e))))))
.

____________
 
a ___________________
 
 ______
______  
  d _______
a _______  
  _______ e
"" ______  
  a _________
d ______  
   _____
______ e ______  
    d ____
a ______ a _______  
    "" e
"" _____ "" ______
   
d ____ d ______
   
"" e ______ e
 
a ______
 
"" _____
 
d ____
 
"" e
The yield of this tree (the sequence of leaf nodes) is aadadeedaadadeedee.
Exercise: the result is different, by the way, if the QR occurs in a
different order.
Three lessons:
* Generalizing from onesided, listbased continuation
operators to twosided, treebased continuation operators is a
dramatic increase in power and complexity.
* Operators that
compose multiple copies of a context can be hard to understand.
* When considering twosided, treebased continuation operators,
quantifier raising is a good tool for visualizing (defunctionalizing)
the computation.
## Tower notation
At this point, we have three ways of representing computations
involving control operators such as shift and reset: using a CPS
transform, lifting into a continuation monad, and by using QR.
QR is the traditional system in linguistics, but it will not be
adequate for us in general. The reason has to do with order. As
we've discussed, especially with respect to the CPS transform,
continuations allow finegrained control over the order of evaluation.
One of the main empirical claims of Barker and Shan 2014 is that
natural language is sensitive to evaluation order. Unlike other
presentations of continuations, QR does not lend itself to reasoning
about evaluation order, so we will need to use a different strategy.
[Note to self: it is interesting to consider what it would take to
reproduce the analyses giving in Barker and Shan in purely QR terms.
Simple quantificational binding using parasitic scope should be easy,
but how reconstruction would work is not so clear.]
We'll present tower notation, then comment and motivate several of its
features as we consider various applications. For now, we'll motivate
the tower notation by thinking about box types. In the discussion of
monads, we've thought of monadic types as values inside of a box. The
box will often contain information in addition to the core object.
For instance, in the Reader monad, a boxed int contains an expression
of type int as the payload, but also contains a function that
manipulates a list of information. It is natural to imagine
separating a box into two regions, the payload and the hidden scratch
space:
_______________ _______________ _______________
 [x>2, y>3]   [x>2, y>3]   [x>2, y>3] 
  
  ¢   =  
 +2   y   5 
______________ ______________ ______________
(Imagine the + operation has been lifted into the Reader monad too.)
For people who are familiar with Discourse Representation Theory (Kamp
1981, Kamp and Reyle 1993), this separation of boxes into payload and
discourse scorekeeping will be familiar (although many details differ).
The general pattern is that monadic treatments separate computation
into an atissue (premonadic) computation with a layer at which
sideeffects occur.
The tower notation is a precise way of articulating continuationbased
computations into a payload and (potentially multiple) layers of sideeffects.
We won't keep the outer box, but we will keep the horizontal line
dividing main effects from sideeffects.
Tower convention for types:
γ  β
(α > β) > γ can be equivalently written 
α
Tower convention for values:
g[]
\k.g[k(x)] can be equivalently written 
x
If \k.g[k(x)] has type (α > β) > γ, then k has type (α > β).
Here "g[ ]" is a *context*, that is, an expression with (exactly) one
hole in it. For instance, we might have g[x] = \forall x.P[x].
We'll use a simplytyped system with two atomic types, DP (the type of
individuals) and S (the type of truth values).
Then in the spirit of monadic thinking, we'll have a way of lifting an
arbitrary value into the tower system:
[] γβ
LIFT (x:α) = \k.kx : (α > β) > γ ==  : 
x α
Obviously, LIFT is exactly the midentity (the unit) for the continuation monad.
The name comes from Partee's 1987 theory of typeshifters for
determiner phrases. Importantly, LIFT applied to an
individualdenoting expression yields the generalized quantifier
proposed by Montague as the denotation for proper names:
[] SS
LIFT (j:DP) = \k.kx : (DP > S) > S ==  : 
j DP
So if the proper name *John* denotes the individual j, LIFT(j) is the
generalized quantifier that maps each property k of type DP > S to true
just in case kj is true.
Once we have expressions of type (α > β) > γ, we'll need to combine
them. We'll use the ¢ operator from the continuation monad:
g[] γ  δ h[] δ  ρ g[h[]] γ  ρ
 :  ¢  :  ==  : 
f α > β x α fx β
Note that the types below the horizontal line combine just like
functional application (i.e, f:(α>β) (x:α) = fx:β).
To demonstrate that this is indeed the continuation monad's ¢
operator:
¢ (\k.g[kf]) (\k.h[kx])
= (\MNk.M(\m.N(\n.k(mn)))) (\k.g[kf]) (\k.h[kx])
~~> \k.(\k.g[kf])(\m.(\k.h[kx])(\n.k(mn))
~~> \k.g[(\k.h[kx])(\n.k(fn))
~~> \k.g[h[k(fx)]]
g[h[]]
== 
fx
Not a monad (Wadler); would be if the types were
Neverthless, obeys the monad laws.
This is (almost) all we need to get some significant linguistic work
done.