X-Git-Url: http://lambda.jimpryor.net/git/gitweb.cgi?p=lambda.git;a=blobdiff_plain;f=topics%2Fweek13_from_list_zippers_to_continuations.mdwn;h=dcd11cec2181b4b98f37477e0fbd693ba889a7fa;hp=0000000000000000000000000000000000000000;hb=8c58865c9934ef39e8452e07de96db9af64e22a4;hpb=c5d481fb6e394b6d439cca2928e07a38e758d357 diff --git a/topics/week13_from_list_zippers_to_continuations.mdwn b/topics/week13_from_list_zippers_to_continuations.mdwn new file mode 100644 index 00000000..dcd11cec --- /dev/null +++ b/topics/week13_from_list_zippers_to_continuations.mdwn @@ -0,0 +1,222 @@ +Refunctionalizing zippers: from lists to continuations +------------------------------------------------------ + +If zippers are continuations reified (defuntionalized), then one route +to continuations is to re-functionalize a zipper. Then the +concreteness and understandability of the zipper provides a way of +understanding an equivalent treatment using continuations. + +Let's work with lists of `char`s for a change. We'll sometimes write +"abSd" as an abbreviation for +`['a'; 'b'; 'S'; 'd']`. + +We will set out to compute a deceptively simple-seeming **task: given a +string, replace each occurrence of 'S' in that string with a copy of +the string up to that point.** + +We'll define a function `t` (for "task") that maps strings to their +updated version. + +Expected behavior: + + t "abSd" ~~> "ababd" + + +In linguistic terms, this is a kind of anaphora +resolution, where `'S'` is functioning like an anaphoric element, and +the preceding string portion is the antecedent. + +This task can give rise to considerable complexity. +Note that it matters which 'S' you target first (the position of the * +indicates the targeted 'S'): + + t "aSbS" + * + ~~> t "aabS" + * + ~~> "aabaab" + +versus + + t "aSbS" + * + ~~> t "aSbaSb" + * + ~~> t "aabaSb" + * + ~~> "aabaaabab" + +versus + + t "aSbS" + * + ~~> t "aSbaSb" + * + ~~> t "aSbaaSbab" + * + ~~> t "aSbaaaSbaabab" + * + ~~> ... + +Apparently, this task, as simple as it is, is a form of computation, +and the order in which the `'S'`s get evaluated can lead to divergent +behavior. + +For now, we'll agree to always evaluate the leftmost `'S'`, which +guarantees termination, and a final string without any `'S'` in it. + +This is a task well-suited to using a zipper. We'll define a function +`tz` (for task with zippers), which accomplishes the task by mapping a +`char list zipper` to a `char list`. We'll call the two parts of the +zipper `unzipped` and `zipped`; we start with a fully zipped list, and +move elements to the unzipped part by pulling the zipper down until the +entire list has been unzipped, at which point the zipped half of the +zipper will be empty. + + type 'a list_zipper = ('a list) * ('a list);; + + let rec tz (z : char list_zipper) = + match z with + | (unzipped, []) -> List.rev(unzipped) (* Done! *) + | (unzipped, 'S'::zipped) -> tz ((List.append unzipped unzipped), zipped) + | (unzipped, target::zipped) -> tz (target::unzipped, zipped);; (* Pull zipper *) + + # tz ([], ['a'; 'b'; 'S'; 'd']);; + - : char list = ['a'; 'b'; 'a'; 'b'; 'd'] + + # tz ([], ['a'; 'S'; 'b'; 'S']);; + - : char list = ['a'; 'a'; 'b'; 'a'; 'a'; 'b'] + +Note that the direction in which the zipper unzips enforces the +evaluate-leftmost rule. Task completed. + +One way to see exactly what is going on is to watch the zipper in +action by tracing the execution of `tz`. By using the `#trace` +directive in the OCaml interpreter, the system will print out the +arguments to `tz` each time it is called, including when it is called +recursively within one of the `match` clauses. Note that the +lines with left-facing arrows (`<--`) show (both initial and recursive) calls to `tz`, +giving the value of its argument (a zipper), and the lines with +right-facing arrows (`-->`) show the output of each recursive call, a +simple list. + + # #trace tz;; + t1 is now traced. + # tz ([], ['a'; 'b'; 'S'; 'd']);; + tz <-- ([], ['a'; 'b'; 'S'; 'd']) (* Initial call *) + tz <-- (['a'], ['b'; 'S'; 'd']) (* Pull zipper *) + tz <-- (['b'; 'a'], ['S'; 'd']) (* Pull zipper *) + tz <-- (['b'; 'a'; 'b'; 'a'], ['d']) (* Special 'S' step *) + tz <-- (['d'; 'b'; 'a'; 'b'; 'a'], []) (* Pull zipper *) + tz --> ['a'; 'b'; 'a'; 'b'; 'd'] (* Output reversed *) + tz --> ['a'; 'b'; 'a'; 'b'; 'd'] + tz --> ['a'; 'b'; 'a'; 'b'; 'd'] + tz --> ['a'; 'b'; 'a'; 'b'; 'd'] + tz --> ['a'; 'b'; 'a'; 'b'; 'd'] + - : char list = ['a'; 'b'; 'a'; 'b'; 'd'] + +The nice thing about computations involving lists is that it's so easy +to visualize them as a data structure. Eventually, we want to get to +a place where we can talk about more abstract computations. In order +to get there, we'll first do the exact same thing we just did with +concrete zipper using procedures instead. + +Think of a list as a procedural recipe: `['a'; 'b'; 'c'; 'd']` is the result of +the computation `'a'::('b'::('c'::('d'::[])))` (or, in our old style, +`make_list 'a' (make_list 'b' (make_list 'c' (make_list 'd' empty)))`). The +recipe for constructing the list goes like this: + +> (0) Start with the empty list [] +> (1) make a new list whose first element is 'd' and whose tail is the list constructed in step (0) +> (2) make a new list whose first element is 'c' and whose tail is the list constructed in step (1) +> ----------------------------------------- +> (3) make a new list whose first element is 'b' and whose tail is the list constructed in step (2) +> (4) make a new list whose first element is 'a' and whose tail is the list constructed in step (3) + +What is the type of each of these steps? Well, it will be a function +from the result of the previous step (a list) to a new list: it will +be a function of type `char list -> char list`. We'll call each step +(or group of steps) a **continuation** of the previous steps. So in this +context, a continuation is a function of type `char list -> char +list`. For instance, the continuation corresponding to the portion of +the recipe below the horizontal line is the function `fun (tail : char +list) -> 'a'::('b'::tail)`. What is the continuation of the 4th step? That is, after we've built up `'a'::('b'::('c'::('d'::[])))`, what more has to happen to that for it to become the list `['a'; 'b'; 'c'; 'd']`? Nothing! Its continuation is the function that does nothing: `fun tail -> tail`. + +In what follows, we'll be thinking about the result list that we're building up in this procedural way. We'll treat our input list just as a plain old static list data structure, that we recurse through in the normal way we're accustomed to. We won't need a zipper data structure, because the continuation-based representation of our result list will take over the same role. + +So our new function `tc` (for task with continuations) takes an input list (not a zipper) and a also takes a continuation `k` (it's conventional to use `k` for continuation variables). `k` is a function that represents how the result list is going to continue being built up after this invocation of `tc` delivers up a value. When we invoke `tc` for the first time, we expect it to deliver as a value the very de-S'd list we're seeking, so the way for the list to continue being built up is for nothing to happen to it. That is, our initial invocation of `tc` will supply `fun tail -> tail` as the value for `k`. Here is the whole `tc` function. Its structure and behavior follows `tz` from above, which we've repeated here to facilitate detailed comparison: + + let rec tz (z : char list_zipper) = + match z with + | (unzipped, []) -> List.rev(unzipped) (* Done! *) + | (unzipped, 'S'::zipped) -> tz ((List.append unzipped unzipped), zipped) + | (unzipped, target::zipped) -> tz (target::unzipped, zipped);; (* Pull zipper *) + + let rec tc (l: char list) (k: (char list) -> (char list)) = + match l with + | [] -> List.rev (k []) + | 'S'::zipped -> tc zipped (fun tail -> k (k tail)) + | target::zipped -> tc zipped (fun tail -> target::(k tail));; + + # tc ['a'; 'b'; 'S'; 'd'] (fun tail -> tail);; + - : char list = ['a'; 'b'; 'a'; 'b'] + + # tc ['a'; 'S'; 'b'; 'S'] (fun tail -> tail);; + - : char list = ['a'; 'a'; 'b'; 'a'; 'a'; 'b'] + +To emphasize the parallel, we've re-used the names `zipped` and +`target`. The trace of the procedure will show that these variables +take on the same values in the same series of steps as they did during +the execution of `tz` above: there will once again be one initial and +four recursive calls to `tc`, and `zipped` will take on the values +`"bSd"`, `"Sd"`, `"d"`, and `""` (and, once again, on the final call, +the first `match` clause will fire, so the the variable `zipped` will +not be instantiated). + +We have not named the continuation argument `unzipped`, although that is +what the parallel would suggest. The reason is that `unzipped` (in +`tz`) is a list, but `k` (in `tc`) is a function. That's the most crucial +difference between the solutions---it's the +point of the excercise, and it should be emphasized. For instance, +you can see this difference in the fact that in `tz`, we have to glue +together the two instances of `unzipped` with an explicit (and, +computationally speaking, relatively inefficient) `List.append`. +In the `tc` version of the task, we simply compose `k` with itself: +`k o k = fun tail -> k (k tail)`. + +A call `tc ['a'; 'b'; 'S'; 'd']` would yield a partially-applied function; it would still wait for another argument, a continuation of type `char list -> char list`. So we have to give it an "initial continuation" to get started. As mentioned above, we supply *the identity function* as the initial continuation. Why did we choose that? Again, if +you have already constructed the result list `"ababd"`, what's the desired continuation? What's the next step in the recipe to produce the desired result, i.e, the very same list, `"ababd"`? Clearly, the identity function. + +A good way to test your understanding is to figure out what the +continuation function `k` must be at the point in the computation when +`tc` is applied to the argument `"Sd"`. Two choices: is it +`fun tail -> 'a'::'b'::tail`, or it is `fun tail -> 'b'::'a'::tail`? The way to see if you're right is to execute the following command and see what happens: + + tc ['S'; 'd'] (fun tail -> 'a'::'b'::tail);; + +There are a number of interesting directions we can go with this task. +The reason this task was chosen is because the task itself (as opposed +to the functions used to implement the task) can be viewed as a +simplified picture of a computation using continuations, where `'S'` +plays the role of a continuation operator. (It works like the Scheme +operators `shift` or `control`; the differences between them don't +manifest themselves in this example. +See Ken Shan's paper [Shift to control](http://www.cs.rutgers.edu/~ccshan/recur/recur.pdf), +which inspired some of the discussion in this topic.) +In the analogy, the input list portrays a +sequence of functional applications, where `[f1; f2; f3; x]` represents +`f1(f2(f3 x))`. The limitation of the analogy is that it is only +possible to represent computations in which the applications are +always right-branching, i.e., the computation `((f1 f2) f3) x` cannot +be directly represented. + +One way to extend this exercise would be to add a special symbol `'#'`, +and then the task would be to copy from the target `'S'` only back to +the closest `'#'`. This would allow our task to simulate delimited +continuations with embedded `prompt`s (also called `reset`s). + +The reason the task is well-suited to the list zipper is in part +because the List monad has an intimate connection with continuations. +We'll explore this next. + +