continuations for natural langauge sketch

author Chris Barker <barker@kappa.linguistics.fas.nyu.edu>

Mon, 6 Dec 2010 20:06:22 +0000 (15:06 -0500)

committer Chris Barker <barker@kappa.linguistics.fas.nyu.edu>

Mon, 6 Dec 2010 20:06:22 +0000 (15:06 -0500)
author Chris Barker <barker@kappa.linguistics.fas.nyu.edu>
Mon, 6 Dec 2010 20:06:22 +0000 (15:06 -0500)
committer Chris Barker <barker@kappa.linguistics.fas.nyu.edu>
Mon, 6 Dec 2010 20:06:22 +0000 (15:06 -0500)
diff --git a/manipulating_trees_with_monads.mdwn b/manipulating_trees_with_monads.mdwn

index 38f8ff3..94a88e7 100644 (file)
--- a/manipulating_trees_with_monads.mdwn
+++ b/manipulating_trees_with_monads.mdwn
@@ -318,7 +318,18 @@ So for example, we compute:
         # tree_monadize (fun a -> fun k -> a :: k a) t1 (fun t -> []);;
         - : int list = [2; 3; 5; 7; 11]
  
         # tree_monadize (fun a -> fun k -> a :: k a) t1 (fun t -> []);;
         - : int list = [2; 3; 5; 7; 11]
  
-We have found a way of collapsing a tree into a list of its leaves. Can you trace how this is working? Think first about what the operation `fun a -> fun k -> a :: k a` does when you apply it to a plain `int`, and the continuation `fun _ -> []`. Then given what we've said about `tree_monadize`, what should we expect `tree_monadize (fun a -> fun k -> a :: k a` to do?
+We have found a way of collapsing a tree into a list of its
+leaves. Can you trace how this is working? Think first about what the
+operation `fun a -> fun k -> a :: k a` does when you apply it to a
+plain `int`, and the continuation `fun _ -> []`. Then given what we've
+said about `tree_monadize`, what should we expect `tree_monadize (fun
+a -> fun k -> a :: k a` to do?
+
+In a moment, we'll return to the same-fringe problem.  Since the
+simple but inefficient way to solve it is to map each tree to a list
+of its leaves, this transformation is on the path to a more efficient
+solution.  We'll just have to figure out how to postpone computing the
+tail of the list until its needed...
  
  The Continuation monad is amazingly flexible; we can use it to
  simulate some of the computations performed above.  To see how, first
  
  The Continuation monad is amazingly flexible; we can use it to
  simulate some of the computations performed above.  To see how, first
@@ -360,25 +371,219 @@ Using continuations to solve the same fringe problem
  ----------------------------------------------------
  
  We've seen two solutions to the same fringe problem so far.  
  ----------------------------------------------------
  
  We've seen two solutions to the same fringe problem so far.  
-The simplest is to map each tree to a list of its leaves, then compare
-the lists.  But if the fringes differ in an early position, we've
-wasted our time visiting the rest of the tree. 
+The problem, recall, is to take two trees and decide whether they have
+the same leaves in the same order.
+
+<pre>
+ ta            tb          tc
+ .             .           .
+_|__          _|__        _|__
+|  |          |  |        |  |
+1  .          .  3        1  .
+  _|__       _|__           _|__
+  |  |       |  |           |  |
+  2  3       1  2           3  2
+
+let ta = Node (Leaf 1, Node (Leaf 2, Leaf 3));;
+let tb = Node (Node (Leaf 1, Leaf 2), Leaf 3);;
+let tc = Node (Leaf 1, Node (Leaf 3, Leaf 2));;
+</pre>
+
+So `ta` and `tb` are different trees that have the same fringe, but
+`ta` and `tc` are not.
+
+The simplest solution is to map each tree to a list of its leaves,
+then compare the lists.  But because we will have computed the entire
+fringe before starting the comparison, if the fringes differ in an
+early position, we've wasted our time examining the rest of the trees.
  
  The second solution was to use tree zippers and mutable state to
  
  The second solution was to use tree zippers and mutable state to
-simulate coroutines.  We would unzip the first tree until we found the
-next leaf, then store the zipper structure in the mutable variable
-while we turned our attention to the other tree.  Because we stop as
-soon as we find the first mismatched leaf, this solution does not have
-the flaw just mentioned of the solution that maps both trees to a list
-of leaves before beginning comparison.
+simulate coroutines (see [[coroutines and aborts]]).  In that
+solution, we pulled the zipper on the first tree until we found the
+next leaf, then stored the zipper structure in the mutable variable
+while we turned our attention to the other tree.  Because we stopped
+as soon as we find the first mismatched leaf, this solution does not
+have the flaw just mentioned of the solution that maps both trees to a
+list of leaves before beginning comparison.
  
  Since zippers are just continuations reified, we expect that the
  solution in terms of zippers can be reworked using continuations, and
  
  Since zippers are just continuations reified, we expect that the
  solution in terms of zippers can be reworked using continuations, and
-this is indeed the case.  To make this work in the most convenient
-way, we need to use the fully general type for continuations just mentioned.
-
-tree_monadize (fun a k -> a, k a) t1 (fun t -> 0);;
+this is indeed the case.  Before we can arrive at a solution, however,
+we must define a data structure called a stream:
+
+    type 'a stream = End | Next of 'a * (unit -> 'a stream);;
+
+A stream is like a list in that it contains a series of objects (all
+of the same type, here, type `'a`).  The first object in the stream
+corresponds to the head of a list, which we pair with a stream
+representing the rest of a the list.  There is a special stream called
+`End` that represents a stream that contains no (more) elements,
+analogous to the empty list `[]`.  
+
+Actually, we pair each element not with a stream, but with a thunked
+stream, that is, a function from the unit type to streams.  The idea
+is that the next element in the stream is not computed until we forced
+the thunk by applying it to the unit:
+
+<pre>
+# let rec make_int_stream i = Next (i, fun () -> make_int_stream (i + 1));;
+val make_int_stream : int -> int stream = <fun>
+# let int_stream = make_int_stream 1;;
+val int_stream : int stream = Next (1, <fun>)         (* First element: 1 *)
+# match int_stream with Next (i, rest) -> rest;;      
+- : unit -> int stream = <fun>                        (* Rest: a thunk *)
+
+(* Force the thunk to compute the second element *)
+# (match int_stream with Next (i, rest) -> rest) ();;
+- : int stream = Next (2, <fun>)      
+</pre>
+
+You can think of `int_stream` as a functional object that provides
+access to an infinite sequence of integers, one at a time.  It's as if
+we had written `[1;2;...]` where `...` meant "continue indefinitely".
+
+So, with streams in hand, we need only rewrite our continuation tree
+monadizer so that instead of mapping trees to lists, it maps them to 
+streams.  Instead of 
+
+       # tree_monadize (fun a k -> a :: k a) t1 (fun t -> []);;
+       - : int list = [2; 3; 5; 7; 11]
  
  
+as above, we have 
+
+        # tree_monadize (fun i k -> Next (i, fun () -> k ())) t1 (fun _ -> End);;
+        - : int stream = Next (2, <fun>)
+
+We can see the first element in the stream, the first leaf (namely,
+2), but in order to see the next, we'll have to force a thunk.
+
+Then to complete the same-fringe function, we simply convert both
+trees into leaf-streams, then compare the streams element by element.
+The code is enitrely routine, but for the sake of completeness, here it is:
+
+<pre>
+let rec compare_streams stream1 stream2 =
+    match stream1, stream2 with 
+    | End, End -> true (* Done!  Fringes match. *)
+    | Next (next1, rest1), Next (next2, rest2) when next1 = next2 -> compare_streams (rest1 ()) (rest2 ())
+    | _ -> false;;
+
+let same_fringe t1 t2 =
+  let stream1 = tree_monadize (fun i k -> Next (i, fun () -> k ())) t1 (fun _ -> End) in 
+  let stream2 = tree_monadize (fun i k -> Next (i, fun () -> k ())) t2 (fun _ -> End) in 
+  compare_streams stream1 stream2;;
+</pre>
+
+Notice the forcing of the thunks in the recursive call to
+`compare_streams`.  So indeed:
+
+<pre>
+# same_fringe ta tb;;
+- : bool = true
+# same_fringe ta tc;;
+- : bool = false
+</pre>
+
+Now, this implementation is a bit silly, since in order to convert the
+trees to leaf streams, our tree_monadizer function has to visit every
+node in the tree.  But if we needed to compare each tree to a large
+set of other trees, we could arrange to monadize each tree only once,
+and then run compare_streams on the monadized trees.
+
+By the way, what if you have reason to believe that the fringes of
+your trees are more likely to differ near the right edge than the left
+edge?  If we reverse evaluation order in the tree_monadizer function,
+as shown above when we replaced leaves with their ordinal position,
+then the resulting streams would produce leaves from the right to the
+left.
+
+The idea of using continuations to characterize natural language meaning
+------------------------------------------------------------------------
+
+We might a philosopher or a linguist be interested in continuations,
+especially if efficiency of computation is usually not an issue?
+Well, the application of continuations to the same-fringe problem
+shows that continuations can manage order of evaluation in a
+well-controlled manner.  In a series of papers, one of us (Barker) and
+Ken Shan have argued that a number of phenomena in natural langauge
+semantics are sensitive to the order of evaluation.  We can't
+reproduce all of the intricate arguments here, but we can give a sense
+of how the analyses use continuations to achieve an analysis of
+natural language meaning.
+
+**Quantification and default quantifier scope construal**.  
+
+We saw in the copy-string example and in the same-fringe example that
+local properties of a tree (whether a character is `S` or not, which
+integer occurs at some leaf position) can control global properties of
+the computation (whether the preceeding string is copied or not,
+whether the computation halts or proceeds).  Local control of
+surrounding context is a reasonable description of in-situ
+quantification.
+
+    (1) John saw everyone yesterday.
+
+This sentence means (roughly)
+
+    &Forall; x . yesterday(saw x) john
+
+That is, the quantifier *everyone* contributes a variable in the
+direct object position, and a universal quantifier that takes scope
+over the whole sentence.  If we have a lexical meaning function like
+the following:
+
+<pre>
+let lex (s:string) k = match s with 
+  | "everyone" -> Node (Leaf "forall x", k "x")
+  | "someone" -> Node (Leaf "exists y", k "y")
+  | _ -> k s;;
+
+let sentence1 = Node (Leaf "John", 
+                      Node (Node (Leaf "saw", 
+                                  Leaf "everyone"), 
+                            Leaf "yesterday"));;
+</pre>
+
+Then we can crudely approximate quantification as follows:
+
+<pre>
+# tree_monadize lex sentence1 (fun x -> x);;
+- : string tree =
+Node
+ (Leaf "forall x",
+  Node (Leaf "John", Node (Node (Leaf "saw", Leaf "x"), Leaf "yesterday")))
+</pre>
+
+In order to see the effects of evaluation order, 
+observe what happens when we combine two quantifiers in the same
+sentence:
+
+<pre>
+# let sentence2 = Node (Leaf "everyone", Node (Leaf "saw", Leaf "someone"));;
+# tree_monadize lex sentence2 (fun x -> x);;
+- : string tree =
+Node
+ (Leaf "forall x",
+  Node (Leaf "exists y", Node (Leaf "x", Node (Leaf "saw", Leaf "y"))))
+</pre>
+
+The universal takes scope over the existential.  If, however, we
+replace the usual tree_monadizer with tree_monadizer_rev, we get
+inverse scope:
+
+<pre>
+# tree_monadize_rev lex sentence2 (fun x -> x);;
+- : string tree =
+Node
+ (Leaf "exists y",
+  Node (Leaf "forall x", Node (Leaf "x", Node (Leaf "saw", Leaf "y"))))
+</pre>
+
+There are many crucially important details about quantification that
+are being simplified here, and the continuation treatment here is not
+scalable for a number of reasons.  Nevertheless, it will serve to give
+an idea of how continuations can provide insight into the behavior of
+quantifiers.  
  
  
  The Binary Tree monad
  
  
  The Binary Tree monad
author	Chris Barker <barker@kappa.linguistics.fas.nyu.edu>
	Mon, 6 Dec 2010 20:06:22 +0000 (15:06 -0500)
committer	Chris Barker <barker@kappa.linguistics.fas.nyu.edu>
	Mon, 6 Dec 2010 20:06:22 +0000 (15:06 -0500)