+ ta tb tc + . . . +_|__ _|__ _|__ +| | | | | | +1 . . 3 1 . + _|__ _|__ _|__ + | | | | | | + 2 3 1 2 3 2 + +let ta = Node (Leaf 1, Node (Leaf 2, Leaf 3));; +let tb = Node (Node (Leaf 1, Leaf 2), Leaf 3);; +let tc = Node (Leaf 1, Node (Leaf 3, Leaf 2));; ++ +So `ta` and `tb` are different trees that have the same fringe, but +`ta` and `tc` are not. + +We've seen two solutions to the same fringe problem so far. +The simplest solution is to map each tree to a list of its leaves, +then compare the lists. But because we will have computed the entire +fringe before starting the comparison, if the fringes differ in an +early position, we've wasted our time examining the rest of the trees. + +The second solution was to use tree zippers and mutable state to +simulate coroutines (see [[coroutines and aborts]], and +[[assignment8]]). In that solution, we pulled the zipper on the first +tree until we found the next leaf, then stored the zipper structure in +a mutable variable while we turned our attention to the other tree. +This solution is efficient: the zipper doesn't visit any leaves beyond +the first mismatch. + +Since zippers are just continuations reified, we expect that the +solution in terms of zippers can be reworked using continuations, and +this is indeed the case. Your assignment is to show how. + +The first step is to review your answer to [[assignment8]], and make +sure you understand what is going on. + + +Two strategies for solving the problem +-------------------------------------- + + +1. Review the list-zipper/list-continuation example given in + class in [[from list zippers to continuations]]; then + figure out how to re-functionalize the zippers used in the zipper + solution. + +2. Review how the continuation-flavored `tree_monadizer` managed to + map a tree to a list of its leaves, in [[manipulating trees with monads]]. + Spend some time trying to understand exactly what it + does: compute the tree-to-list transformation for a tree with two + leaves, performing all beta reduction by hand using the + definitions for `continuation_bind`, `continuation_unit` and so on. + If you take this route, study the description of **streams** (a + particular kind of data structure) below. The goal will be to + arrange for the continuation-flavored `tree_monadizer` to transform + a tree into a stream instead of into a list. Once you've done + that, completing the same-fringe problem will be easy. + +------------------------------------- + +Whichever method you choose, here are some goals to consider. + +1. Make sure that your solution gives the right results on the trees +given above (`ta`, `tb`, and `tc`). + +2. Make sure your function works on trees that contain only a single +leaf, as well as when the two trees have different numbers of leaves. + +3. Figure out a way to prove that your solution satisfies the main +requirement of the problem; in particular, that when the trees differ +in an early position, your code does not waste time visiting the rest +of the tree. One way to do this is to add print statements to your +functions so that every time you visit a leaf (say), a message is +printed on the output. (In OCaml: `print_int 1` prints an `int`, `print_string "foo"` prints a `string`, `print_newline ()` prints a line break, and `print_endline "foo"` prints a string followed by a line break.) If two trees differ in the middle of their fringe, you should show that your solution prints debugging information for the first half of the fringe, but then stops. + +4. What if you had some reason to believe that the trees you were +going to compare were more likely to differ in the rightmost region? +What would you have to change in your solution so that it worked from +right to left? + +Streams +------- + +A stream is like a list in that it wraps a series of elements of a single type. +It differs from a list in that the tail of the series is left uncomputed +until needed. We will turn the stream on and off by thunking it (see +class notes for [[week6]] on thunks, as well as [[assignment5]]). + + type 'a stream = End | Next of 'a * (unit -> 'a stream);; + +There is a special stream called `End` that represents a stream that +contains no (more) elements, analogous to the empty list `[]`. +Streams that are not empty contain a first object, paired with a +thunked stream representing the rest of the series. In order to get +access to the next element in the stream, we must *force* the thunk by +applying it to the unit. Watch the behavior of this stream in detail. +This stream delivers the natural numbers, in order: 1, 2, 3, ... + +

+# let rec make_int_stream i = Next (i, fun () -> make_int_stream (i + 1));; +val make_int_stream : int -> int stream = [fun] + +# let int_stream = make_int_stream 1;; +val int_stream : int stream = Next (1, [fun]) (* First element: 1 *) + +# let tail = match int_stream with Next (i, rest) -> rest;; +val tail : unit -> int stream = [fun] (* Tail: a thunk *) + +(* Force the thunk to compute the second element *) +# tail ();; +- : int stream = Next (2, [fun]) (* Second element: 2 *) + +# match tail () with Next (_, rest) -> rest ();; +- : int stream = Next (3, [fun]) (* Third element: 3 *) ++ +You can think of `int_stream` as a functional object that provides +access to an infinite sequence of integers, one at a time. It's as if +we had written `[1;2;...]` where `...` meant "continue for as long as +some other process needs new integers". + + + diff --git a/exercises/_assignment14.mdwn b/exercises/_assignment14.mdwn new file mode 100644 index 00000000..ba4df372 --- /dev/null +++ b/exercises/_assignment14.mdwn @@ -0,0 +1,179 @@ +***Non-required but strongly suggested work***: Here are some +additional homework problems that will consolidate your understanding +of what we've covered in the last weeks of the seminar. Those who are +taking the class for credit: since we're so late to post these, and +they add up to a substantial chunk of thinking, we won't officially +require you to do them, since we don't want to get into a bureaucratic +bind if you've planned your next month in a way that would not permit +you to get the work done. But ***we strongly encourage*** you to work on +the problem set: solving these problems will produce a qualitatively +deeper understanding of continuations. If you plan to do some or all +of these problems, and would like us to take them into account in our +evaluation of your work for the course, please let us know when to +expect to see them. (Up to the start of next term, which begins on 24 +January 2011, would be viable.) + +Of course, if you need help or want us to review your efforts, we'll be glad to discuss with you at any later point as well. + + +1. This problem is taken from _The Craft of Functional Programming_ by Simon Thompson, Addison-Wesley 1999

+(\x.y)(ww) + * +y ++ +Done! We have a normal form. But if we reduce using a different +strategy, things go wrong: + +

+(\x.y)(ww) = +(\x.y)((\x.xx)w) = + * +(\x.y)(ww) = +(\x.y)((\x.xx)w) = + * +(\x.y)(ww) ++ +Etc. + +As a second reminder of when evaluation order matters, consider using +`Y = \f.(\h.f(hh))(\h.f(hh))` as a fixed point combinator to define a recursive function: + +

+Y (\f n. blah) = +(\f.(\h.f(hh))(\h.f(hh))) (\f n. blah) + * +(\f.f((\h.f(hh))(\h.f(hh)))) (\f n. blah) + * +(\f.f(f((\h.f(hh))(\h.f(hh))))) (\f n. blah) + * +(\f.f(f(f((\h.f(hh))(\h.f(hh)))))) (\f n. blah) ++ +And we never get the recursion off the ground. + + +Using a Continuation Passing Style transform to control order of evaluation +--------------------------------------------------------------------------- + +We'll present a technique for controlling evaluation order by transforming a lambda term +using a Continuation Passing Style transform (CPS), then we'll explore +what the CPS is doing, and how. + +In order for the CPS to work, we have to adopt a new restriction on +beta reduction: beta reduction does not occur underneath a lambda. +That is, `(\x.y)z` reduces to `z`, but `\u.(\x.y)z` does not reduce to +`\u.z`, because the `\u` protects the redex in the body from +reduction. (In this context, a "redex" is a part of a term that matches +the pattern `...((\xM)N)...`, i.e., something that can potentially be +the target of beta reduction.) + +Start with a simple form that has two different reduction paths: + +reducing the leftmost lambda first: `(\x.y)((\x.z)u) ~~> y` + +reducing the rightmost lambda first: `(\x.y)((\x.z)u) ~~> (\x.y)z ~~> y` + +After using the following call-by-name CPS transform---and assuming +that we never evaluate redexes protected by a lambda---only the first +reduction path will be available: we will have gained control over the +order in which beta reductions are allowed to be performed. + +Here's the CPS transform defined: + + [x] = x + [\xM] = \k.k(\x[M]) + [MN] = \k.[M](\m.m[N]k) + +Here's the result of applying the transform to our simple example: + + [(\x.y)((\x.z)u)] = + \k.[\x.y](\m.m[(\x.z)u]k) = + \k.(\k.k(\x.[y]))(\m.m(\k.[\x.z](\m.m[u]k))k) = + \k.(\k.k(\x.y))(\m.m(\k.(\k.k(\x.z))(\m.muk))k) + +Because the initial `\k` protects (i.e., takes scope over) the entire +transformed term, we can't perform any reductions. In order to watch +the computation unfold, we have to apply the transformed term to a +trivial continuation, usually the identity function `I = \x.x`. + + [(\x.y)((\x.z)u)] I = + (\k.[\x.y](\m.m[(\x.z)u]k)) I + * + [\x.y](\m.m[(\x.z)u] I) = + (\k.k(\x.y))(\m.m[(\x.z)u] I) + * * + (\x.y)[(\x.z)u] I --A-- + * + y I + +The application to `I` unlocks the leftmost functor. Because that +functor (`\x.y`) throws away its argument (consider the reduction in the +line marked (A)), we never need to expand the +CPS transform of the argument. This means that we never bother to +reduce redexes inside the argument. + +Compare with a call-by-value xform: + + {x} = \k.kx + {\aM} = \k.k(\a{M}) + {MN} = \k.{M}(\m.{N}(\n.mnk)) + +This time the reduction unfolds in a different manner: + + {(\x.y)((\x.z)u)} I = + (\k.{\x.y}(\m.{(\x.z)u}(\n.mnk))) I + * + {\x.y}(\m.{(\x.z)u}(\n.mnI)) = + (\k.k(\x.{y}))(\m.{(\x.z)u}(\n.mnI)) + * * + {(\x.z)u}(\n.(\x.{y})nI) = + (\k.{\x.z}(\m.{u}(\n.mnk)))(\n.(\x.{y})nI) + * + {\x.z}(\m.{u}(\n.mn(\n.(\x.{y})nI))) = + (\k.k(\x.{z}))(\m.{u}(\n.mn(\n.(\x.{y})nI))) + * * + {u}(\n.(\x.{z})n(\n.(\x.{y})nI)) = + (\k.ku)(\n.(\x.{z})n(\n.(\x.{y})nI)) + * * + (\x.{z})u(\n.(\x.{y})nI) --A-- + * + {z}(\n.(\x.{y})nI) = + (\k.kz)(\n.(\x.{y})nI) + * * + (\x.{y})zI + * + {y}I = + (\k.ky)I + * + I y + +In this case, the argument does get evaluated: consider the reduction +in the line marked (A). + +Both xforms make the following guarantee: as long as redexes +underneath a lambda are never evaluated, there will be at most one +reduction available at any step in the evaluation. +That is, all choice is removed from the evaluation process. + +Now let's verify that the CBN CPS avoids the infinite reduction path +discussed above (remember that `w = \x.xx`): + + [(\x.y)(ww)] I = + (\k.[\x.y](\m.m[ww]k)) I + * + [\x.y](\m.m[ww]I) = + (\k.k(\x.y))(\m.m[ww]I) + * * + (\x.y)[ww]I + * + y I + + +Questions and exercises: + +1. Prove that {(\x.y)(ww)} does not terminate. + +2. Why is the CBN xform for variables `[x] = x` instead of something +involving kappas (i.e., `k`'s)? + +3. Write an Ocaml function that takes a lambda term and returns a +CPS-xformed lambda term. You can use the following data declaration: + + type form = Var of char | Abs of char * form | App of form * form;; + +4. The discussion above talks about the "leftmost" redex, or the +"rightmost". But these words apply accurately only in a special set +of terms. Characterize the order of evaluation for CBN (likewise, for +CBV) more completely and carefully. + +5. What happens (in terms of evaluation order) when the application +rule for CBV CPS is changed to `{MN} = \k.{N}(\n.{M}(\m.mnk))`? + +6. A term and its CPS xform are different lambda terms. Yet in some +sense they "do" the same thing computationally. Make this sense +precise. + + +Thinking through the types +-------------------------- + +This discussion is based on [Meyer and Wand 1985](http://citeseer.ist.psu.edu/viewdoc/download?doi=10.1.1.44.7943&rep=rep1&type=pdf). + +Let's say we're working in the simply-typed lambda calculus. +Then if the original term is well-typed, the CPS xform will also be +well-typed. But what will the type of the transformed term be? + +The transformed terms all have the form `\k.blah`. The rule for the +CBN xform of a variable appears to be an exception, but instead of +writing `[x] = x`, we can write `[x] = \k.xk`, which is +eta-equivalent. The `k`'s are continuations: functions from something +to a result. Let's use σ as the result type. The each `k` in +the transform will be a function of type ρ --> σ for some +choice of ρ. + +We'll need an ancilliary function ': for any ground type a, a' = a; +for functional types a->b, (a->b)' = ((a' -> σ) -> σ) -> (b' -> σ) -> σ. + + Call by name transform + + Terms Types + + [x] = \k.xk [a] = (a'->o)->o + [\xM] = \k.k(\x[M]) [a->b] = ((a->b)'->o)->o + [MN] = \k.[M](\m.m[N]k) [b] = (b'->o)->o + +Remember that types associate to the right. Let's work through the +application xform and make sure the types are consistent. We'll have +the following types: + + M:a->b + N:a + MN:b + k:b'->o + [N]:(a'->o)->o + m:((a'->o)->o)->(b'->o)->o + m[N]:(b'->o)->o + m[N]k:o + [M]:((a->b)'->o)->o = ((((a'->o)->o)->(b'->o)->o)->o)->o + [M](\m.m[N]k):o + [MN]:(b'->o)->o + +Be aware that even though the transform uses the same symbol for the +translation of a variable (i.e., `[x] = x`), in general the variable +in the transformed term will have a different type than in the source +term. + +Excercise: what should the function ' be for the CBV xform? Hint: +see the Meyer and Wand abstract linked above for the answer. + + +Other CPS transforms +-------------------- + +It is easy to think that CBN and CBV are the only two CPS transforms. +(We've already seen a variant on call-by-value one of the excercises above.) + +In fact, the number of distinct transforms is unbounded. For +instance, here is a variant of CBV that uses the same types as CBN: + +

`Θ`

. Expressions using them
+will have non-terminating reductions, with Scheme's eager/call-by-value
+strategy. There are other fixed-point combinators you can use with Scheme (in
+the [week 3 notes](/week3/#index7h2) they were `Y′`

and
+`Θ′`

. But even with them, evaluation order still
+matters: for some (admittedly unusual) evaluation strategies, expressions using
+them will also be non-terminating.
+
+> The fixed-point combinators may be the conceptual stars. They are cool and
+mathematically elegant. But for efficiency and implementation elegance, it's
+best to know how to do as much as you can without them. (Also, that knowledge
+could carry over to settings where the fixed point combinators are in principle
+unavailable.)
+
+
+So again, what if we're using v3 lists? What options would we have then for
+aborting a search or list traversal before it runs to completion?
+
+Suppose we're searching through the list `[5;4;3;2;1]` to see if it
+contains the number `3`. The expression which represents this search would have
+something like the following form:
+
+ ..................`extract_fst ≡ \pair. pair (\x y. x)`

+
+but at a lower level, the pair is still accepting its handler as an argument,
+rather than the handler taking the pair as an argument. (The handler gets *the
+pair's elements*, not the pair itself, as arguments.)
+
+> *Terminology*: we'll try to use names of the form `get_foo` for handlers, and
+names of the form `extract_foo` for lifted versions of them, that accept the
+lists (or whatever data structure we're working with) as arguments. But we may
+sometimes forget.
+
+The v2 implementation of lists followed a similar strategy:
+
+ v2list (\h t. do_something_with_h_and_t) result_if_empty
+
+If the `v2list` here is not empty, then this will reduce to the result of
+supplying the list's head and tail to the handler `(\h t.
+do_something_with_h_and_t)`.
+
+Now, what we've been imagining ourselves doing with the search through the v3
+list is something like this:
+
+
+ larger_computation (search_through_the_list_for_3) other_arguments
+
+That is, the result of our search is supplied as an argument (perhaps together
+with other arguments) to the "larger computation". Without knowing the
+evaluation order/reduction strategy, we can't say whether the search is
+evaluated before or after it's substituted into the larger computation. But
+semantically, the search is the argument and the larger computation is the
+function to which it's supplied.
+
+What if, instead, we did the same kind of thing we did with pairs and v2
+lists? That is, what if we made the larger computation a "handler" that we
+passed as an argument to the search?
+
+ the_search (\search_result. larger_computation search_result other_arguments)
+
+What's the advantage of that, you say. Other than to show off how cleverly
+you can lift.
+
+Well, think about it. Think about the difficulty we were having aborting the
+search. Does this switch-around offer us anything useful?
+
+It could.
+
+What if the way we implemented the search procedure looked something like this?
+
+At a given stage in the search, we wouldn't just apply some function `f` to the
+head at this stage and the result accumulated so far (from folding the same
+function, and a base value, to the tail at this stage)...and then pass the result
+of that application to the embedding, more leftward computation.
+
+We'd *instead* give `f` a "handler" that expects the result of the current
+stage *as an argument*, and then evaluates to what you'd get by passing that
+result leftwards up the list, as before.
+
+Why would we do that, you say? Just more flamboyant lifting?
+
+Well, no, there's a real point here. If we give the function a "handler" that
+encodes the normal continuation of the fold leftwards through the list, we can
+also give it other "handlers" too. For example, we can also give it the underlined handler:
+
+
+ the_search (\search_result. larger_computation search_result other_arguments)
+ ------------------------------------------------------------------
+
+This "handler" encodes the search's having finished, and delivering a final
+answer to whatever else you wanted your program to do with the result of the
+search. If you like, at any stage in the search you might just give an argument
+to *this* handler, instead of giving an argument to the handler that continues
+the list traversal leftwards. Semantically, this would amount to *aborting* the
+list traversal! (As we've said before, whether the rest of the list traversal
+really gets evaluated will depend on what evaluation order is in place. But
+semantically we'll have avoided it. Our larger computation won't depend on the
+rest of the list traversal having been computed.)
+
+Do you have the basic idea? Think about how you'd implement it. A good
+understanding of the v2 lists will give you a helpful model.
+
+In broad outline, a single stage of the search would look like before, except
+now `f` would receive two extra, "handler" arguments. We'll reserve the name `f` for the original fold function, and use `f2` for the function that accepts two additional handler arguments. To get the general idea, you can regard these as interchangeable. If the extra precision might help, then you can pay attention to when we're talking about the handler-taking `f2` or the original `f`. You'll only be *supplying* the `f2` function; the idea will be that the behavior of the original `f` will be implicitly encoded in `f2`'s behavior.
+
+ f2 3