Refunctionalizing zippers: from lists to continuations ------------------------------------------------------ If zippers are continuations reified (defuntionalized), then one route to continuations is to re-functionalize a zipper. Then the concreteness and understandability of the zipper provides a way of understanding and equivalent treatment using continuations. Let's work with lists of `char`s for a change. To maximize readability, we'll indulge in an abbreviatory convention that "abSd" abbreviates the list `['a'; 'b'; 'S'; 'd']`. We will set out to compute a deceptively simple-seeming **task: given a string, replace each occurrence of 'S' in that string with a copy of the string up to that point.** We'll define a function `t` (for "task") that maps strings to their updated version. Expected behavior: t "abSd" ~~> "ababd" In linguistic terms, this is a kind of anaphora resolution, where `'S'` is functioning like an anaphoric element, and the preceding string portion is the antecedent. This deceptively simple task gives rise to some mind-bending complexity. Note that it matters which 'S' you target first (the position of the * indicates the targeted 'S'): t "aSbS" * ~~> t "aabS" * ~~> "aabaab" versus t "aSbS" * ~~> t "aSbaSb" * ~~> t "aabaSb" * ~~> "aabaaabab" versus t "aSbS" * ~~> t "aSbaSb" * ~~> t "aSbaaSbab" * ~~> t "aSbaaaSbaabab" * ~~> ... Aparently, this task, as simple as it is, is a form of computation, and the order in which the `'S'`s get evaluated can lead to divergent behavior. For now, we'll agree to always evaluate the leftmost `'S'`, which guarantees termination, and a final string without any `'S'` in it. This is a task well-suited to using a zipper. We'll define a function `tz` (for task with zippers), which accomplishes the task by mapping a `char list zipper` to a `char list`. We'll call the two parts of the zipper `unzipped` and `zipped`; we start with a fully zipped list, and move elements to the zipped part by pulling the zipper down until the entire list has been unzipped (and so the zipped half of the zipper is empty). type 'a list_zipper = ('a list) * ('a list);; let rec tz (z : char list_zipper) = match z with | (unzipped, []) -> List.rev(unzipped) (* Done! *) | (unzipped, 'S'::zipped) -> tz ((List.append unzipped unzipped), zipped) | (unzipped, target::zipped) -> tz (target::unzipped, zipped);; (* Pull zipper *) # tz ([], ['a'; 'b'; 'S'; 'd']);; - : char list = ['a'; 'b'; 'a'; 'b'; 'd'] # tz ([], ['a'; 'S'; 'b'; 'S']);; - : char list = ['a'; 'a'; 'b'; 'a'; 'a'; 'b'] Note that this implementation enforces the evaluate-leftmost rule. Task completed. One way to see exactly what is going on is to watch the zipper in action by tracing the execution of `tz`. By using the `#trace` directive in the Ocaml interpreter, the system will print out the arguments to `tz` each time it is (recurcively) called. Note that the lines with left-facing arrows (`<--`) show (recursive) calls to `tz`, giving the value of its argument (a zipper), and the lines with right-facing arrows (`-->`) show the output of each recursive call, a simple list.
# #trace tz;;
t1 is now traced.
# tz ([], ['a'; 'b'; 'S'; 'd']);;
tz <-- ([], ['a'; 'b'; 'S'; 'd'])
tz <-- (['a'], ['b'; 'S'; 'd'])         (* Pull zipper *)
tz <-- (['b'; 'a'], ['S'; 'd'])         (* Pull zipper *)
tz <-- (['b'; 'a'; 'b'; 'a'], ['d'])    (* Special step *)
tz <-- (['d'; 'b'; 'a'; 'b'; 'a'], [])  (* Pull zipper *)
tz --> ['a'; 'b'; 'a'; 'b'; 'd']        (* Output reversed *)
tz --> ['a'; 'b'; 'a'; 'b'; 'd']
tz --> ['a'; 'b'; 'a'; 'b'; 'd']
tz --> ['a'; 'b'; 'a'; 'b'; 'd']
tz --> ['a'; 'b'; 'a'; 'b'; 'd']
- : char list = ['a'; 'b'; 'a'; 'b'; 'd'] 
The nice thing about computations involving lists is that it's so easy to visualize them as a data structure. Eventually, we want to get to a place where we can talk about more abstract computations. In order to get there, we'll first do the exact same thing we just did with concrete zipper using procedures. Think of a list as a procedural recipe: `['a'; 'b'; 'S'; 'd']` is the result of the computation `'a'::('b'::('S'::('d'::[])))` (or, in our old style, `make_list 'a' (make_list 'b' (make_list 'S' (make_list 'd' empty)))`). The recipe for constructing the list goes like this:
(0)  Start with the empty list []
(1)  make a new list whose first element is 'd' and whose tail is the list constructed in step (0)
(2)  make a new list whose first element is 'S' and whose tail is the list constructed in step (1)
-----------------------------------------
(3)  make a new list whose first element is 'b' and whose tail is the list constructed in step (2)
(4)  make a new list whose first element is 'a' and whose tail is the list constructed in step (3)
What is the type of each of these steps? Well, it will be a function from the result of the previous step (a list) to a new list: it will be a function of type `char list -> char list`. We'll call each step (or group of steps) a **continuation** of the recipe. So in this context, a continuation is a function of type `char list -> char list`. For instance, the continuation corresponding to the portion of the recipe below the horizontal line is the function `fun (tail : char list) -> 'a'::('b'::tail)`. This means that we can now represent the unzipped part of our zipper---the part we've already unzipped---as a continuation: a function describing how to finish building the list. We'll write a new function, `tc` (for task with continuations), that will take an input list (not a zipper!) and a continuation and return a processed list. The structure and the behavior will follow that of `tz` above, with some small but interesting differences. We've included the orginal `tz` to facilitate detailed comparison: let rec tz (z : char list_zipper) = match z with | (unzipped, []) -> List.rev(unzipped) (* Done! *) | (unzipped, 'S'::zipped) -> tz ((List.append unzipped unzipped), zipped) | (unzipped, target::zipped) -> tz (target::unzipped, zipped);; (* Pull zipper *) let rec tc (l: char list) (c: (char list) -> (char list)) = match l with | [] -> List.rev (c []) | 'S'::zipped -> tc zipped (fun x -> c (c x)) | target::zipped -> tc zipped (fun x -> target::(c x));; # tc ['a'; 'b'; 'S'; 'd'] (fun x -> x);; - : char list = ['a'; 'b'; 'a'; 'b'] # tc ['a'; 'S'; 'b'; 'S'] (fun x -> x);; - : char list = ['a'; 'a'; 'b'; 'a'; 'a'; 'b'] To emphasize the parallel, I've re-used the names `zipped` and `target`. The trace of the procedure will show that these variables take on the same values in the same series of steps as they did during the execution of `tz` above. There will once again be one initial and four recursive calls to `tc`, and `zipped` will take on the values `"bSd"`, `"Sd"`, `"d"`, and `""` (and, once again, on the final call, the first `match` clause will fire, so the the variable `zipper` will not be instantiated). I have not called the functional argument `unzipped`, although that is what the parallel would suggest. The reason is that `unzipped` is a list, but `c` is a function. That's the most crucial difference, the point of the excercise, and it should be emphasized. For instance, you can see this difference in the fact that in `tz`, we have to glue together the two instances of `unzipped` with an explicit (and relatively inefficient) `List.append`. In the `tc` version of the task, we simply compose `c` with itself: `c o c = fun x -> c (c x)`. Why use the identity function as the initial continuation? Well, if you have already constructed the initial list `"abSd"`, what's the next step in the recipe to produce the desired result, i.e, the very same list, `"abSd"`? Clearly, the identity continuation. A good way to test your understanding is to figure out what the continuation function `c` must be at the point in the computation when `tc` is called with the first argument `"Sd"`. Two choices: is it `fun x -> a::b::x`, or it is `fun x -> b::a::x`? The way to see if you're right is to execute the following command and see what happens: tc ['S'; 'd'] (fun x -> 'a'::'b'::x);; There are a number of interesting directions we can go with this task. The reason this task was chosen is because it can be viewed as a simplified picture of a computation using continuations, where `'S'` plays the role of a control operator with some similarities to what is often called `shift`. In the analogy, the input list portrays a sequence of functional applications, where `[f1; f2; f3; x]` represents `f1(f2(f3 x))`. The limitation of the analogy is that it is only possible to represent computations in which the applications are always right-branching, i.e., the computation `((f1 f2) f3) x` cannot be directly represented. One possibile development is that we could add a special symbol `'#'`, and then the task would be to copy from the target `'S'` only back to the closest `'#'`. This would allow the task to simulate delimited continuations with embedded prompts. The reason the task is well-suited to the list zipper is in part because the list monad has an intimate connection with continuations. The following section explores this connection. We'll return to the list task after talking about generalized quantifiers below.