Refunctionalizing zippers: from lists to continuations
------------------------------------------------------

If zippers are continuations reified (defunctionalized), then one route
to continuations is to re-functionalize a zipper.  Then the
concreteness and understandability of the zipper provide a way of
understanding an equivalent treatment using continuations.

Let's work with lists of chars for a change.  To maximize readability, we'll
indulge in an abbreviatory convention that "abSd" abbreviates the
list `['a'; 'b'; 'S'; 'd']`.

We will set out to compute a deceptively simple-seeming **task: given a
string, replace each occurrence of 'S' in that string with a copy of
the string up to that point.**

We'll define a function `t` (for "task") that maps strings to their
updated version.

Expected behavior:

<pre>
t "abSd" ~~> "ababd"
</pre>

In linguistic terms, this is a kind of anaphora
resolution, where `'S'` is functioning like an anaphoric element, and
the preceding string portion is the antecedent.

This deceptively simple task gives rise to some mind-bending complexity.
Note that it matters which `'S'` you target first (the position of the `*`
indicates the targeted `'S'`):

<pre>
    t "aSbS"
        *
~~> t "aabS"
          *
~~> "aabaab"
</pre>

versus

<pre>
    t "aSbS"
          *
~~> t "aSbaSb"
        *
~~> t "aabaSb"
           *
~~> "aabaaabab"
</pre>

versus

<pre>
    t "aSbS"
          *
~~> t "aSbaSb"
           *
~~> t "aSbaaSbab"
            *
~~> t "aSbaaaSbaabab"
             *
~~> ...
</pre>

Apparently, this task, as simple as it is, is a form of computation,
and the order in which the `'S'`s get evaluated can lead to divergent
behavior.
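
The divergent case can be made concrete with a small sketch. It assumes char lists as the representation and introduces three hypothetical names that do not appear in the text: the helpers `explode` and `implode`, and `step_rightmost`, which always targets the last `'S'`. Because the unbounded version would not terminate on `"aSbS"`, the driver takes an explicit step bound.

```ocaml
(* Hypothetical helpers: convert between strings and char lists. *)
let explode (s : string) : char list =
  List.init (String.length s) (String.get s)
let implode (l : char list) : string = String.of_seq (List.to_seq l)

(* One rewriting step that targets the RIGHTMOST 'S'.
   Returns None when there is no 'S' left. *)
let step_rightmost (l : char list) : char list option =
  let rec go suffix rev_rest =
    match rev_rest with
    | [] -> None
    | 'S' :: rev_prefix ->
        let prefix = List.rev rev_prefix in
        Some (prefix @ prefix @ suffix)
    | c :: rev_rest' -> go (c :: suffix) rev_rest'
  in
  go [] (List.rev l)

(* Apply the step at most n times.  The bound is needed because the
   copied prefix may itself contain an 'S'. *)
let rec iterate_rightmost (n : int) (l : char list) : char list =
  if n = 0 then l
  else
    match step_rightmost l with
    | None -> l
    | Some l' -> iterate_rightmost (n - 1) l'
```

Evaluating `implode (iterate_rightmost 3 (explode "aSbS"))` should retrace the three steps of the last derivation above.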

For now, we'll agree to always evaluate the leftmost `'S'`, which
guarantees termination, and a final string without any `'S'` in it.
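
Here is a minimal sketch of the evaluate-leftmost strategy as a naive rewriting loop, before any zipper is involved. The names `step_leftmost` and `t_naive` are hypothetical and introduced only for illustration. The comment on `t_naive` spells out why the loop terminates.

```ocaml
(* One rewriting step that targets the LEFTMOST 'S'.
   Returns None when there is no 'S' left. *)
let step_leftmost (l : char list) : char list option =
  let rec go rev_prefix rest =
    match rest with
    | [] -> None
    | 'S' :: tail ->
        let prefix = List.rev rev_prefix in
        Some (prefix @ prefix @ tail)
    | c :: tail -> go (c :: rev_prefix) tail
  in
  go [] l

(* Repeat until no 'S' remains.  The prefix before the leftmost 'S'
   contains no 'S', so each step lowers the number of 'S's by one. *)
let rec t_naive (l : char list) : char list =
  match step_leftmost l with
  | None -> l
  | Some l' -> t_naive l'
```

This version rescans the list from the start after every step, which is exactly the repeated work that a zipper avoids.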

This is a task well-suited to using a zipper.  We'll define a function
`tz` (for task with zippers), which accomplishes the task by mapping a
char list zipper to a char list.  We'll call the two parts of the
zipper `unzipped` and `zipped`; we start with a fully zipped list, and
move elements to the unzipped part by pulling the zipper down until the
entire list has been unzipped (and so the zipped half of the zipper is empty).

<pre>
type 'a list_zipper = ('a list) * ('a list);;

let rec tz (z:char list_zipper) =
    match z with (unzipped, []) -> List.rev(unzipped) (* Done! *)
               | (unzipped, 'S'::zipped) -> tz ((List.append unzipped unzipped), zipped)
               | (unzipped, target::zipped) -> tz (target::unzipped, zipped);; (* Pull zipper *)

# tz ([], ['a'; 'b'; 'S'; 'd']);;
- : char list = ['a'; 'b'; 'a'; 'b'; 'd']

# tz ([], ['a'; 'S'; 'b'; 'S']);;
- : char list = ['a'; 'a'; 'b'; 'a'; 'a'; 'b']
</pre>

Note that this implementation enforces the evaluate-leftmost rule.
Task completed.
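
To connect `tz` back to the string-level function `t` promised earlier, one possible wrapper is sketched below. The conversion helpers `explode` and `implode` are hypothetical names, not part of the text, and `tz` is repeated only so that the block stands on its own.

```ocaml
type 'a list_zipper = 'a list * 'a list

(* tz as defined above, repeated so the block is self-contained. *)
let rec tz (z : char list_zipper) : char list =
  match z with
  | (unzipped, []) -> List.rev unzipped
  | (unzipped, 'S' :: zipped) -> tz (List.append unzipped unzipped, zipped)
  | (unzipped, target :: zipped) -> tz (target :: unzipped, zipped)

(* Hypothetical helpers: convert between strings and char lists. *)
let explode (s : string) : char list =
  List.init (String.length s) (String.get s)
let implode (l : char list) : string = String.of_seq (List.to_seq l)

(* t: the string-level task, starting from a fully zipped list. *)
let t (s : string) : string = implode (tz ([], explode s))
```

With these definitions, `t "abSd"` should evaluate to `"ababd"`, matching the expected behavior stated at the outset.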

One way to see exactly what is going on is to watch the zipper in
action by tracing the execution of `tz`.  If you use the `#trace`
directive in the OCaml interpreter, the system will print out the
argument to `tz` each time it is (recursively) called.  Note that the
lines with left-facing arrows (`<--`) show (recursive) calls to `tz`,
giving the value of its argument (a zipper), and the lines with
right-facing arrows (`-->`) show the output of each recursive call, a
simple list.

<pre>
# #trace tz;;
tz is now traced.
# tz ([], ['a'; 'b'; 'S'; 'd']);;
tz <-- ([], ['a'; 'b'; 'S'; 'd'])
tz <-- (['a'], ['b'; 'S'; 'd'])         (* Pull zipper *)
tz <-- (['b'; 'a'], ['S'; 'd'])         (* Pull zipper *)
tz <-- (['b'; 'a'; 'b'; 'a'], ['d'])    (* Special step *)
tz <-- (['d'; 'b'; 'a'; 'b'; 'a'], [])  (* Pull zipper *)
tz --> ['a'; 'b'; 'a'; 'b'; 'd']        (* Output reversed *)
tz --> ['a'; 'b'; 'a'; 'b'; 'd']
tz --> ['a'; 'b'; 'a'; 'b'; 'd']
tz --> ['a'; 'b'; 'a'; 'b'; 'd']
tz --> ['a'; 'b'; 'a'; 'b'; 'd']
- : char list = ['a'; 'b'; 'a'; 'b'; 'd']
</pre>

The nice thing about computations involving lists is that it's so easy
to visualize them as a data structure.  Eventually, we want to get to
a place where we can talk about more abstract computations.  In order
to get there, we'll first do the exact same thing we just did with
a concrete zipper, this time using procedures.

Think of a list as a procedural recipe: `['a'; 'b'; 'S'; 'd']`
is the result of the computation `'a'::('b'::('S'::('d'::[])))` (or, in our old
style, `makelist 'a' (makelist 'b' (makelist 'S' (makelist 'd' empty)))`).
The recipe for constructing the list goes like this:

<pre>
(0)  Start with the empty list []
(1)  make a new list whose first element is 'd' and whose tail is the list constructed in step (0)
(2)  make a new list whose first element is 'S' and whose tail is the list constructed in step (1)
-----------------------------------------
(3)  make a new list whose first element is 'b' and whose tail is the list constructed in step (2)
(4)  make a new list whose first element is 'a' and whose tail is the list constructed in step (3)
</pre>

What is the type of each of these steps?  Well, it will be a function
from the result of the previous step (a list) to a new list: it will
be a function of type `char list -> char list`.  We'll call each step
(or group of steps) a **continuation** of the recipe.  So in this
context, a continuation is a function of type
`char list -> char list`.  For instance, the continuation corresponding
to the portion of the recipe below the horizontal line is the function
`fun (tail:char list) -> 'a'::('b'::tail)`.
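
The recipe can be transcribed directly, as a sketch in which each step after step (0) becomes a function of type `char list -> char list`. The names `step1` through `step4`, `above_line`, `below_line`, and `whole` are hypothetical labels for the recipe's lines, not names used elsewhere in the text.

```ocaml
(* Each recipe step after step (0) is a function from a list to a list. *)
let step1 (tail : char list) : char list = 'd' :: tail
let step2 (tail : char list) : char list = 'S' :: tail
let step3 (tail : char list) : char list = 'b' :: tail
let step4 (tail : char list) : char list = 'a' :: tail

(* The continuation for the portion of the recipe below the line:
   steps (3) and (4) grouped into a single function. *)
let below_line (tail : char list) : char list = step4 (step3 tail)

(* Steps (0) through (2) build the tail above the line. *)
let above_line : char list = step2 (step1 [])

(* Feeding that tail to the continuation finishes the list. *)
let whole : char list = below_line above_line
```

Here `whole` is the original list `['a'; 'b'; 'S'; 'd']`, and `below_line` behaves like `fun tail -> 'a'::('b'::tail)`.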

This means that we can now represent the unzipped part of our
zipper--the part we've already unzipped--as a continuation: a function
describing how to finish building the list.  We'll write a new
function, `tc` (for task with continuations), that will take an input
list (not a zipper!) and a continuation and return a processed list.
The structure and the behavior will follow that of `tz` above, with
some small but interesting differences.  We've included the original
`tz` to facilitate detailed comparison:

<pre>
let rec tz (z:char list_zipper) =
    match z with (unzipped, []) -> List.rev(unzipped) (* Done! *)
               | (unzipped, 'S'::zipped) -> tz ((List.append unzipped unzipped), zipped)
               | (unzipped, target::zipped) -> tz (target::unzipped, zipped);; (* Pull zipper *)

let rec tc (l: char list) (c: (char list) -> (char list)) =
  match l with [] -> List.rev (c [])
             | 'S'::zipped -> tc zipped (fun x -> c (c x))
             | target::zipped -> tc zipped (fun x -> target::(c x));;

# tc ['a'; 'b'; 'S'; 'd'] (fun x -> x);;
- : char list = ['a'; 'b'; 'a'; 'b'; 'd']

# tc ['a'; 'S'; 'b'; 'S'] (fun x -> x);;
- : char list = ['a'; 'a'; 'b'; 'a'; 'a'; 'b']
</pre>

To emphasize the parallel, I've re-used the names `zipped` and
`target`.  The trace of the procedure will show that these variables
take on the same values in the same series of steps as they did during
the execution of `tz` above.  There will once again be one initial and
four recursive calls to `tc`, and `zipped` will take on the values
`"bSd"`, `"Sd"`, `"d"`, and `""` (and, once again, on the final call,
the first `match` clause will fire, so the variable `zipped` will
not be instantiated).
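
Since `#trace` cannot print a function argument in any informative way, one way to watch `tc` at work is to record what the current continuation does when applied to the empty list. The following is only a sketch: `tc_probe` is a hypothetical variant of `tc`, introduced here for illustration, that returns a log of such snapshots alongside the usual result.

```ocaml
(* A variant of tc that also records, at each call, the remaining input
   and the result of applying the current continuation to []. *)
let rec tc_probe (l : char list) (c : char list -> char list)
    : (char list * char list) list * char list =
  let snapshot = (l, c []) in
  match l with
  | [] -> ([snapshot], List.rev (c []))
  | 'S' :: zipped ->
      let (log, result) = tc_probe zipped (fun x -> c (c x)) in
      (snapshot :: log, result)
  | target :: zipped ->
      let (log, result) = tc_probe zipped (fun x -> target :: c x) in
      (snapshot :: log, result)

(* Run on the example from the text, starting with the identity. *)
let (log, result) = tc_probe ['a'; 'b'; 'S'; 'd'] (fun x -> x)
```

Comparing the second component of each entry in `log` with the `unzipped` half of the zipper in the `tz` trace is a useful check on the claimed parallel.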

I have not called the functional argument `unzipped`, although that is
what the parallel would suggest.  The reason is that `unzipped` is a
list, but `c` is a function.  That's the most crucial difference, the
point of the exercise, and it should be emphasized.  For instance,
you can see this difference in the fact that in `tz`, we have to glue
together the two instances of `unzipped` with an explicit (and
relatively inefficient) `List.append`.
In the `tc` version of the task, we simply compose `c` with itself:
`c o c = fun x -> c (c x)`.
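
The contrast between appending and composing can be seen side by side in a small sketch. The chars `'x'` and `'y'` are arbitrary, and `compose`, `doubled_list`, and `doubled_c` are hypothetical names chosen for this illustration.

```ocaml
(* Function composition, written `o` in the text. *)
let compose f g = fun x -> f (g x)

(* The same prefix in both representations. *)
let unzipped : char list = ['y'; 'x']
let c : char list -> char list = fun tail -> 'y' :: 'x' :: tail

(* Doubling the list needs an explicit append ... *)
let doubled_list : char list = List.append unzipped unzipped

(* ... whereas doubling the continuation is just composition. *)
let doubled_c : char list -> char list = compose c c
```

Applying `doubled_c` to `[]` yields the same list as `doubled_list`, but no list is traversed until the continuation is finally applied.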

Why use the identity function as the initial continuation?  Well, if
you have already constructed the initial list `"abSd"`, what's the next
step in the recipe to produce the desired result, i.e., the very same
list, `"abSd"`?  Clearly, the identity continuation.

A good way to test your understanding is to figure out what the
continuation function `c` must be at the point in the computation when
`tc` is called with the first argument `"Sd"`.  Two choices: is it
`fun x -> 'a'::'b'::x`, or is it `fun x -> 'b'::'a'::x`?  The way to see if
you're right is to execute the following command and see what happens:

    tc ['S'; 'd'] (fun x -> 'a'::'b'::x);;

There are a number of interesting directions we can go with this task.
This task was chosen because it can be viewed as a
simplified picture of a computation using continuations, where `'S'`
plays the role of a control operator with some similarities to what is
often called `shift`.  In the analogy, the input list portrays a
sequence of functional applications, where `[f1; f2; f3; x]` represents
`f1(f2(f3 x))`.  The limitation of the analogy is that it is only
possible to represent computations in which the applications are
always right-branching, i.e., the computation `((f1 f2) f3) x` cannot
be directly represented.
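
The reading of a list as right-branching application can be sketched with a fold. This is an illustration only: OCaml lists are homogeneous, so the sketch assumes the functions and the final argument are passed separately, and `apply_all`, `f1`, `f2`, and `f3` are hypothetical names with arbitrarily chosen bodies.

```ocaml
(* Interpret a list of functions plus an argument as nested,
   right-branching application: [f1; f2; f3] and x give f1 (f2 (f3 x)). *)
let apply_all (fs : ('a -> 'a) list) (x : 'a) : 'a =
  List.fold_right (fun f acc -> f acc) fs x

(* Arbitrary example functions on integers. *)
let f1 n = n + 1
let f2 n = n * 2
let f3 n = n - 3
```

Here `apply_all [f1; f2; f3] 10` computes `f1 (f2 (f3 10))`. A left-branching computation such as `((f1 f2) f3) x` has no counterpart in this encoding, which is the limitation noted above.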

One possible development is that we could add a special symbol `'#'`,
and then the task would be to copy from the target `'S'` only back to
the closest `'#'`.  This would allow the task to simulate delimited
continuations with embedded prompts.
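
One way this development might look in the zipper setting is sketched below. It rests on assumptions the text does not settle: that `'#'` remains in the output, and that an `'S'` with no preceding `'#'` copies everything before it, as in `tz`. The names `up_to_prompt` and `tz_delim` are hypothetical.

```ocaml
(* The longest initial segment of l that contains no '#'. *)
let rec up_to_prompt (l : char list) : char list =
  match l with
  | [] -> []
  | '#' :: _ -> []
  | c :: rest -> c :: up_to_prompt rest

(* Like tz, except that 'S' copies only back to the closest '#'.
   Assumption: '#' itself is kept in the output. *)
let rec tz_delim (z : char list * char list) : char list =
  match z with
  | (unzipped, []) -> List.rev unzipped
  | (unzipped, 'S' :: zipped) ->
      (* unzipped is stored in reverse, so its initial segment is the
         material closest to the 'S'. *)
      tz_delim (List.append (up_to_prompt unzipped) unzipped, zipped)
  | (unzipped, target :: zipped) -> tz_delim (target :: unzipped, zipped)
```

Under these assumptions, the input "ab#cdSe" becomes "ab#cdcde": only the "cd" after the prompt is copied.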

The reason the task is well-suited to the list zipper is in part
because the list monad has an intimate connection with continuations.
The following section explores this connection.  We'll return to the
list task after talking about generalized quantifiers below.