Refunctionalizing list zippers
------------------------------

If zippers are continuations reified (defunctionalized), then one route
to continuations is to re-functionalize a zipper.  The concreteness and
understandability of the zipper then provide a way of understanding an
equivalent treatment using continuations.

Let's work with lists of `char`s for a change.  To maximize readability, we'll
indulge in an abbreviatory convention that "abSd" abbreviates the
list `['a'; 'b'; 'S'; 'd']`.

We will set out to compute a deceptively simple-seeming **task: given a
string, replace each occurrence of 'S' in that string with a copy of
the string up to that point.**

We'll define a function `t` (for "task") that maps strings to their
updated version.

Expected behavior:

        t "abSd" ~~> "ababd"


In linguistic terms, this is a kind of anaphora
resolution, where `'S'` is functioning like an anaphoric element, and
the preceding string portion is the antecedent.

This deceptively simple task gives rise to some mind-bending complexity.
Note that it matters which 'S' you target first (the position of the *
indicates the targeted 'S'):

            t "aSbS"
                *
        ~~> t "aabS"
                  *
        ~~> "aabaab"

versus

            t "aSbS"
                  *
        ~~> t "aSbaSb"
                *
        ~~> t "aabaSb"
                   *
        ~~> "aabaaabab"

versus

            t "aSbS"
                  *
        ~~> t "aSbaSb"
                   *
        ~~> t "aSbaaSbab"
                    *
        ~~> t "aSbaaaSbaabab"
                     *
        ~~> ...

Apparently, this task, as simple as it is, is a form of computation,
and the order in which the `'S'`s get evaluated can lead to divergent
behavior.

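To experiment with the evaluation orders traced above, here is a minimal
sketch of a single rewriting step.  The helper `step` and its 0-based index
argument are assumptions introduced for illustration only: `step n l`
replaces the `n`-th `'S'` in `l` (counting from the left) with a copy of
everything that precedes it.

```ocaml
(* Hypothetical helper: rewrite the n-th 'S' (counting from 0, left to
   right) with a copy of the prefix before it.  A list with fewer than
   n+1 occurrences of 'S' is returned unchanged. *)
let step (n : int) (l : char list) : char list =
  (* seen: the prefix so far, in reverse; remaining: 'S's still to skip *)
  let rec go seen remaining rest =
    match rest with
    | [] -> List.rev seen
    | 'S' :: tl when remaining = 0 ->
        let prefix = List.rev seen in
        prefix @ prefix @ tl
    | c :: tl ->
        let remaining = if c = 'S' then remaining - 1 else remaining in
        go (c :: seen) remaining tl
  in
  go [] n l

let first  = step 0 ['a'; 'S'; 'b'; 'S']   (* "aabS"   *)
let second = step 1 ['a'; 'S'; 'b'; 'S']   (* "aSbaSb" *)
```

Under these assumptions, always calling `step 0` reproduces the first
trace, while always calling `step 1` keeps two `'S'`s in play forever, as
in the third.
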
For now, we'll agree to always evaluate the leftmost `'S'`, which
guarantees both termination and a final string without any `'S'` in it.

This is a task well-suited to using a zipper.  We'll define a function
`tz` (for task with zippers), which accomplishes the task by mapping a
`char list_zipper` to a `char list`.  We'll call the two parts of the
zipper `unzipped` and `zipped`; we start with a fully zipped list, and
move elements to the unzipped part by pulling the zipper down until the
entire list has been unzipped (and so the zipped half of the zipper is empty).

        type 'a list_zipper = ('a list) * ('a list);;

        let rec tz (z : char list_zipper) =
            match z with
            | (unzipped, []) -> List.rev unzipped (* Done! *)
            | (unzipped, 'S'::zipped) -> tz ((List.append unzipped unzipped), zipped) (* Special step *)
            | (unzipped, target::zipped) -> tz (target::unzipped, zipped);; (* Pull zipper *)

        # tz ([], ['a'; 'b'; 'S'; 'd']);;
        - : char list = ['a'; 'b'; 'a'; 'b'; 'd']

        # tz ([], ['a'; 'S'; 'b'; 'S']);;
        - : char list = ['a'; 'a'; 'b'; 'a'; 'a'; 'b']

Note that this implementation enforces the evaluate-leftmost rule: the
zipper only ever moves from left to right, so the first `'S'` it meets
is always the leftmost one remaining.  Task completed.

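The text promises a function `t` from strings to strings, whereas `tz`
works on zippers of `char` lists.  One way to bridge the two is sketched
below.  The helpers `explode` and `implode` are hypothetical names (they
are not standard library functions), and `tz` is repeated so that the
block is self-contained.

```ocaml
type 'a list_zipper = 'a list * 'a list

let rec tz (z : char list_zipper) : char list =
  match z with
  | (unzipped, []) -> List.rev unzipped
  | (unzipped, 'S' :: zipped) -> tz (List.append unzipped unzipped, zipped)
  | (unzipped, target :: zipped) -> tz (target :: unzipped, zipped)

(* Hypothetical conversion helpers between strings and char lists *)
let explode (s : string) : char list =
  List.init (String.length s) (String.get s)

let implode (l : char list) : string =
  String.concat "" (List.map (String.make 1) l)

(* Start from a fully zipped list: nothing has been unzipped yet *)
let t (s : string) : string = implode (tz ([], explode s))
```

With these definitions, `t "abSd"` evaluates to `"ababd"`, matching the
expected behavior stated at the outset.
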
One way to see exactly what is going on is to watch the zipper in
action by tracing the execution of `tz`.  If we use the `#trace`
directive in the OCaml interpreter, the system will print out the
arguments to `tz` each time it is (recursively) called.  Note that the
lines with left-facing arrows (`<--`) show (recursive) calls to `tz`,
giving the value of its argument (a zipper), and the lines with
right-facing arrows (`-->`) show the output of each recursive call, a
simple list.

        # #trace tz;;
        tz is now traced.
        # tz ([], ['a'; 'b'; 'S'; 'd']);;
        tz <-- ([], ['a'; 'b'; 'S'; 'd'])
        tz <-- (['a'], ['b'; 'S'; 'd'])         (* Pull zipper *)
        tz <-- (['b'; 'a'], ['S'; 'd'])         (* Pull zipper *)
        tz <-- (['b'; 'a'; 'b'; 'a'], ['d'])    (* Special step *)
        tz <-- (['d'; 'b'; 'a'; 'b'; 'a'], [])  (* Pull zipper *)
        tz --> ['a'; 'b'; 'a'; 'b'; 'd']        (* Output reversed *)
        tz --> ['a'; 'b'; 'a'; 'b'; 'd']
        tz --> ['a'; 'b'; 'a'; 'b'; 'd']
        tz --> ['a'; 'b'; 'a'; 'b'; 'd']
        tz --> ['a'; 'b'; 'a'; 'b'; 'd']
        - : char list = ['a'; 'b'; 'a'; 'b'; 'd']

The nice thing about computations involving lists is that it's so easy
to visualize them as a data structure.  Eventually, we want to get to
a place where we can talk about more abstract computations.  In order
to get there, we'll first do the exact same thing we just did with
a concrete zipper, this time using procedures.

Think of a list as a procedural recipe: `['a'; 'b'; 'S'; 'd']` is the result of
the computation `'a'::('b'::('S'::('d'::[])))` (or, in our old style,
`make_list 'a' (make_list 'b' (make_list 'S' (make_list 'd' empty)))`). The
recipe for constructing the list goes like this:

>       (0)  Start with the empty list []
>       (1)  make a new list whose first element is 'd' and whose tail is the list constructed in step (0)
>       (2)  make a new list whose first element is 'S' and whose tail is the list constructed in step (1)
>       -----------------------------------------
>       (3)  make a new list whose first element is 'b' and whose tail is the list constructed in step (2)
>       (4)  make a new list whose first element is 'a' and whose tail is the list constructed in step (3)

What is the type of each of these steps?  Well, it will be a function
from the result of the previous step (a list) to a new list: it will
be a function of type `char list -> char list`.  We'll call each step
(or group of steps) a **continuation** of the previous steps.  So in this
context, a continuation is a function of type `char list -> char list`.
For instance, the continuation corresponding to the portion of
the recipe below the horizontal line is the function
`fun (tail : char list) -> 'a'::('b'::tail)`.

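The recipe can be transcribed directly, one function per step.  The names
`step1` through `step4` and `below_line` are made up for this sketch; each
of them has type `char list -> char list`.

```ocaml
(* Each recipe step maps the list built so far to a new list. *)
let step1 (tail : char list) : char list = 'd' :: tail
let step2 (tail : char list) : char list = 'S' :: tail
let step3 (tail : char list) : char list = 'b' :: tail
let step4 (tail : char list) : char list = 'a' :: tail

(* The continuation of steps (0)-(2): everything below the horizontal line *)
let below_line (tail : char list) : char list = step4 (step3 tail)

(* Running the whole recipe, starting from the empty list of step (0) *)
let whole_list = below_line (step2 (step1 []))
(* whole_list = ['a'; 'b'; 'S'; 'd'] *)
```
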
This means that we can now represent the unzipped part of our
zipper (the part we've already unzipped) as a continuation: a function
describing how to finish building a list.  We'll write a new
function, `tc` (for task with continuations), that will take an input
list (not a zipper!) and a continuation `k` (it's conventional to use
`k` for continuation variables) and return a processed list.
The structure and the behavior will follow that of `tz` above, with
some small but interesting differences.  We've included the original
`tz` to facilitate detailed comparison:

        let rec tz (z : char list_zipper) =
            match z with
            | (unzipped, []) -> List.rev unzipped (* Done! *)
            | (unzipped, 'S'::zipped) -> tz ((List.append unzipped unzipped), zipped) (* Special step *)
            | (unzipped, target::zipped) -> tz (target::unzipped, zipped);; (* Pull zipper *)

        let rec tc (l : char list) (k : char list -> char list) =
            match l with
            | [] -> List.rev (k []) (* Done! *)
            | 'S'::zipped -> tc zipped (fun tail -> k (k tail)) (* Special step *)
            | target::zipped -> tc zipped (fun tail -> target::(k tail));; (* Extend k *)

        # tc ['a'; 'b'; 'S'; 'd'] (fun tail -> tail);;
        - : char list = ['a'; 'b'; 'a'; 'b'; 'd']

        # tc ['a'; 'S'; 'b'; 'S'] (fun tail -> tail);;
        - : char list = ['a'; 'a'; 'b'; 'a'; 'a'; 'b']

To emphasize the parallel, I've re-used the names `zipped` and
`target`.  The trace of the procedure will show that these variables
take on the same values in the same series of steps as they did during
the execution of `tz` above.  There will once again be one initial and
four recursive calls to `tc`, and `zipped` will take on the values
`"bSd"`, `"Sd"`, `"d"`, and `""` (and, once again, on the final call,
the first `match` clause will fire, so the variable `zipped` will
not be instantiated).

I have not called the functional argument `unzipped`, although that is
what the parallel would suggest.  The reason is that `unzipped` is a
list, but `k` is a function.  That's the most crucial difference, the
point of the exercise, and it should be emphasized.  For instance,
you can see this difference in the fact that in `tz`, we have to glue
together the two instances of `unzipped` with an explicit (and
relatively inefficient) `List.append`.
In the `tc` version of the task, we simply compose `k` with itself:
`k o k = fun tail -> k (k tail)`.

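One way to see the connection between the two versions: a list can always
be traded for the function that prepends it to a tail.  The sketch below
uses an arbitrary example prefix and the hypothetical names `doubled_list`
and `doubled_fun` to check that doubling the list with `List.append` and
composing the function with itself agree.

```ocaml
(* An arbitrary example of an unzipped (reversed) prefix *)
let unzipped = ['y'; 'x']

(* The same information as a function: prepend the prefix to any tail *)
let k (tail : char list) : char list = List.append unzipped tail

(* What tz does at an 'S': glue two copies of the list together *)
let doubled_list = List.append unzipped unzipped

(* What tc does at an 'S': compose the function with itself *)
let doubled_fun = fun tail -> k (k tail)
```

Applying `doubled_fun` to the empty list gives back `doubled_list`.  In
`tc` itself the continuation is built up one `::` at a time rather than
from an existing list, so `List.append` is never called there.
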
A call `tc ['a'; 'b'; 'S'; 'd']` yields a partially-applied function; it
still waits for another argument, a continuation of type
`char list -> char list`.  We have to give it an "initial continuation"
to get started.  Here we supply *the identity function* as the initial
continuation.  Why did we choose that?  Well, if you have already
constructed the initial list `"abSd"`, what's the desired continuation?
What's the next step in the recipe to produce the desired result, i.e.,
the very same list, `"abSd"`?  Clearly, the identity function.

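Since the identity function is the starting point every time, it can be
packaged once and for all.  In this sketch, `t_list` is a hypothetical
wrapper name, and `tc` is repeated so that the block stands alone.

```ocaml
let rec tc (l : char list) (k : char list -> char list) : char list =
  match l with
  | [] -> List.rev (k [])
  | 'S' :: zipped -> tc zipped (fun tail -> k (k tail))
  | target :: zipped -> tc zipped (fun tail -> target :: k tail)

(* Hypothetical wrapper: supply the identity as the initial continuation *)
let t_list (l : char list) : char list = tc l (fun tail -> tail)

let result = t_list ['a'; 'b'; 'S'; 'd']
(* result = ['a'; 'b'; 'a'; 'b'; 'd'] *)
```
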
A good way to test your understanding is to figure out what the
continuation function `k` must be at the point in the computation when
`tc` is called with the first argument `"Sd"`.  Two choices: is it
`fun tail -> 'a'::'b'::tail`, or is it `fun tail -> 'b'::'a'::tail`?  The way to see if
you're right is to execute the following command and see what happens:

    tc ['S'; 'd'] (fun tail -> 'a'::'b'::tail);;

There are a number of interesting directions we can go with this task.
This task was chosen because it can be viewed as a
simplified picture of a computation using continuations, where `'S'`
plays the role of a continuation operator.  (It works like the Scheme
operators `shift` or `control`; the differences between them don't
manifest themselves in this example.)  In the analogy, the input list
portrays a sequence of functional applications, where `[f1; f2; f3; x]`
represents `f1(f2(f3 x))`.  The limitation of the analogy is that it is only
possible to represent computations in which the applications are
always right-branching, i.e., the computation `((f1 f2) f3) x` cannot
be directly represented.

One way to extend this exercise would be to add a special symbol `'#'`,
and then the task would be to copy from the target `'S'` only back to
the closest `'#'`.  This would allow our task to simulate delimited
continuations with embedded `prompt`s (also called `reset`s).

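A minimal sketch of that extension in the zipper style, under two
assumptions that the description above does not settle: the `'#'` markers
are dropped from the final output, and an `'S'` with no preceding `'#'`
copies everything before it, just as in `tz`.  The name `tz_prompt` is
hypothetical.

```ocaml
(* Hypothetical variant of tz: 'S' copies only the material unzipped
   since the closest '#'.  Assumption: '#' is removed from the output. *)
let rec tz_prompt ((unzipped, zipped) : char list * char list) : char list =
  match zipped with
  | [] -> List.rev (List.filter (fun c -> c <> '#') unzipped)
  | 'S' :: rest ->
      (* unzipped is reversed, so the closest '#' is the first one met *)
      let rec since_prompt = function
        | [] | '#' :: _ -> []
        | c :: tl -> c :: since_prompt tl
      in
      tz_prompt (List.append (since_prompt unzipped) unzipped, rest)
  | target :: rest -> tz_prompt (target :: unzipped, rest)

let delimited = tz_prompt ([], ['a'; 'b'; '#'; 'c'; 'S'; 'd'])
(* delimited = ['a'; 'b'; 'c'; 'c'; 'd'] *)
```

Here only `'c'` is copied, because the `'#'` shields `"ab"` from the
`'S'`.  On inputs without any `'#'`, this sketch behaves like `tz`.
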
The task is well-suited to the list zipper in part
because the list monad has an intimate connection with continuations.
We'll explore this next.