changes to offsite-reading

[lambda.git] / advanced_lambda.mdwn
diff --git a/advanced_lambda.mdwn b/advanced_lambda.mdwn

index df05c9e..d23245c 100644 (file)
--- a/advanced_lambda.mdwn
+++ b/advanced_lambda.mdwn
@@ -1,40 +1,9 @@
  [[!toc]]
  
-#Reversing a list#
  
-How would you define an operation to reverse a list? (Don't peek at the
-[[lambda_library]]! Try to figure it out on your own.) Choose whichever
-implementation of list you like. Even then, there are various strategies you
-can use.
+#Version 4 lists: Efficiently extracting tails#
  
-
-#Efficiently extracting tails and predecessors#
-
-An advantage of the v3 lists and v3 (aka "Church") numerals is that they
-have a recursive capacity built into their skeleton. So for many natural
-operations on them, you won't need to use a fixed point combinator. Why is
-that an advantage? Well, if you use a fixed point combinator, then the terms
-you get
-won't be strongly normalizing: whether their reduction stops at a normal form
-will depend on what evaluation order you use. Our online [[lambda evaluator]]
-uses normal-order reduction, so it finds a normal form if there's one to be
-had. But if you want to build lambda terms in, say, Scheme, and you wanted to
-roll your own recursion as we've been doing, rather than relying on Scheme's
-native `let rec` or `define`, then you can't use the fixed-point combinators
-`Y` or <code>&Theta;</code>. Expressions using them will have non-terminating
-reductions, with Scheme's eager/call-by-value strategy. There are other
-fixed-point combinators you can use with Scheme (in the [week 3 notes](/week3/#index7h2) they
-were <code>Y&prime;</code> and <code>&Theta;&prime;</code>. But even with
-them, evaluation order still matters: for some (admittedly unusual)
-evaluation strategies, expressions using them will also be non-terminating.
-
-The fixed-point combinators may be the conceptual stars. They are cool and
-mathematically elegant. But for efficiency and implementation elegance, it's
-best to know how to do as much as you can without them. (Also, that knowledge
-could carry over to settings where the fixed point combinators are in
-principle unavailable.)
-
-This is why the v3 lists and numbers are so lovely. However, one disadvantage
+Version 3 lists and Church numerals are lovely, because they have their recursive capacity built into their very bones. However, one disadvantage
  to them is that it's relatively inefficient to extract a list's tail, or get a
  number's predecessor. To get the tail of the list `[a;b;c;d;e]`, one will
  basically be performing some operation that builds up the tail afresh: at
@@ -98,597 +67,3 @@ unsurprisingly, he wasn't the first to do so. See for example [Oleg's report
  on P-numerals](http://okmij.org/ftp/Computation/lambda-calc.html#p-numerals).
  
  
-
-#Sets#
-
-You're now already in a position to implement sets: that is, collections with
-no intrinsic order where elements can occur at most once. Like lists, we'll
-understand the basic set structures to be *type-homogenous*. So you might have
-a set of integers, or you might have a set of pairs of integers, but you
-wouldn't have a set that mixed both types of elements. Something *like* the
-last option is also achievable, but it's more difficult, and we won't pursue it
-now. In fact, we won't talk about sets of pairs, either. We'll just talk about
-sets of integers. The same techniques we discuss here could also be applied to
-sets of pairs of integers, or sets of triples of booleans, or sets of pairs
-whose first elements are booleans, and whose second elements are triples of
-integers. And so on.
-
-(You're also now in a position to implement *multi*sets: that is, collections
-with no intrinsic order where elements can occur multiple times: the multiset
-{a,a} is distinct from the multiset {a}. But we'll leave these as an exercise.)
-
-The easiest way to implement sets of integers would just be to use lists. When
-you "add" a member to a set, you'd get back a list that was either identical to
-the original list, if the added member already was present in it, or consisted
-of a new list with the added member prepended to the old list. That is:
-
-       let empty_set = empty  in
-       ; see the library for definitions of any and eq
-       let make_set = \new_member old_set. any (eq new_member) old_set
-                                               ; if any element in old_set was eq new_member
-                                               old_set
-                                               ; else
-                                               make_list new_member old_set
-
-Think about how you'd implement operations like `set_union`,
-`set_intersection`, and `set_difference` with this implementation of sets.
-
-The implementation just described works, and it's the simplest to code.
-However, it's pretty inefficient. If you had a 100-member set, and you wanted
-to create a set which had all those 100-members and some possibly new element
-`e`, you might need to check all 100 members to see if they're equal to `e`
-before concluding they're not, and returning the new list. And comparing for
-numeric equality is a moderately expensive operation, in the first place.
-
-(You might say, well, what's the harm in just prepending `e` to the list even
-if it already occurs later in the list. The answer is, if you don't keep track
-of things like this, it will likely mess up your implementations of
-`set_difference` and so on. You'll have to do the book-keeping for duplicates
-at some point in your code. It goes much more smoothly if you plan this from
-the very beginning.)
-
-How might we make the implementation more efficient? Well, the *semantics* of
-sets says that they have no intrinsic order. That means, there's no difference
-between the set {a,b} and the set {b,a}; whereas there is a difference between
-the *list* `[a;b]` and the list `[b;a]`. But this semantic point can be respected
-even if we *implement* sets with something ordered, like list---as we're
-already doing. And we might *exploit* the intrinsic order of lists to make our
-implementation of sets more efficient.
-
-What we could do is arrange it so that a list that implements a set always
-keeps in elements in some specified order. To do this, there'd have *to be*
-some way to order its elements. Since we're talking now about sets of numbers,
-that's easy. (If we were talking about sets of pairs of numbers, we'd use
-"lexicographic" ordering, where `(a,b) < (c,d)` iff `a < c or (a == c and b <
-d)`.)
-
-So, if we were searching the list that implements some set to see if the number
-`5` belonged to it, once we get to elements in the list that are larger than `5`,
-we can stop. If we haven't found `5` already, we know it's not in the rest of the
-list either.
-
-This is an improvement, but it's still a "linear" search through the list.
-There are even more efficient methods, which employ "binary" searching. They'd
-represent the set in such a way that you could quickly determine whether some
-element fell in one half, call it the left half, of the structure that
-implements the set, if it belonged to the set at all. Or that it fell in the
-right half, it it belonged to the set at all. And then the same sort of
-determination could be made for whichever half you were directed to. And then
-for whichever quarter you were directed to next. And so on. Until you either
-found the element or exhausted the structure and could then conclude that the
-element in question was not part of the set. These sorts of structures are done
-using **binary trees** (see below).
-
-
-#Aborting a search through a list#
-
-We said that the sorted-list implementation of a set was more efficient than
-the unsorted-list implementation, because as you were searching through the
-list, you could come to a point where you knew the element wasn't going to be
-found. So you wouldn't have to continue the search.
-
-If your implementation of lists was, say v1 lists plus the Y-combinator, then
-this is exactly right. When you get to a point where you know the answer, you
-can just deliver that answer, and not branch into any further recursion. If
-you've got the right evaluation strategy in place, everything will work out
-fine.
-
-But what if you're using v3 lists? What options would you have then for
-aborting a search?
-
-Well, suppose we're searching through the list `[5;4;3;2;1]` to see if it
-contains the number `3`. The expression which represents this search would have
-something like the following form:
-
-       ..................<eq? 1 3>  ~~>
-       .................. false     ~~>
-       .............<eq? 2 3>       ~~>
-       ............. false          ~~>
-       .........<eq? 3 3>           ~~>
-       ......... true               ~~>
-       ?
-
-Of course, whether those reductions actually followed in that order would
-depend on what reduction strategy was in place. But the result of folding the
-search function over the part of the list whose head is `3` and whose tail is `[2;
-1]` will *semantically* depend on the result of applying that function to the
-more rightmost pieces of the list, too, regardless of what order the reduction
-is computed by. Conceptually, it will be easiest if we think of the reduction
-happening in the order displayed above.
-
-Well, once we've found a match between our sought number `3` and some member of
-the list, we'd like to avoid any further unnecessary computations and just
-deliver the answer `true` as "quickly" or directly as possible to the larger
-computation in which the search was embedded.
-
-With a Y-combinator based search, as we said, we could do this by just not
-following a recursion branch.
-
-But with the v3 lists, the fold is "pre-programmed" to continue over the whole
-list. There is no way for us to bail out of applying the search function to the
-parts of the list that have head `4` and head `5`, too.
-
-We *can* avoid *some* unneccessary computation. The search function can detect
-that the result we've accumulated so far during the fold is now `true`, so we
-don't need to bother comparing `4` or `5` to `3` for equality. That will simplify the
-computation to some degree, since as we said, numerical comparison in the
-system we're working in is moderately expensive.
-
-However, we're still going to have to traverse the remainder of the list. That
-`true` result will have to be passed along all the way to the leftmost head of
-the list. Only then can we deliver it to the larger computation in which the
-search was embedded.
-
-It would be better if there were some way to "abort" the list traversal. If,
-having found the element we're looking for (or having determined that the
-element isn't going to be found), we could just immediately stop traversing the
-list with our answer. **Continuations** will turn out to let us do that.
-
-We won't try yet to fully exploit the terrible power of continuations. But
-there's a way that we can gain their benefits here locally, without yet having
-a fully general machinery or understanding of what's going on.
-
-The key is to recall how our implementations of booleans and pairs worked.
-Remember that with pairs, we supply the pair "handler" to the pair as *an
-argument*, rather than the other way around:
-
-       pair (\x y. add x y)
-
-or:
-
-       pair (\x y. x)
-
-to get the first element of the pair. Of course you can lift that if you want:
-
-<pre><code>extract_fst &equiv; \pair. pair (\x y. x)</code></pre>
-
-but at a lower level, the pair is still accepting its handler as an argument,
-rather than the handler taking the pair as an argument. (The handler gets *the
-pair's elements*, not the pair itself, as arguments.)
-
->      *Terminology*: we'll try to use names of the form `get_foo` for handlers, and
-names of the form `extract_foo` for lifted versions of them, that accept the
-lists (or whatever data structure we're working with) as arguments. But we may
-sometimes forget.
-
-The v2 implementation of lists followed a similar strategy:
-
-       v2list (\h t. do_something_with_h_and_t) result_if_empty
-
-If the `v2list` here is not empty, then this will reduce to the result of
-supplying the list's head and tail to the handler `(\h t.
-do_something_with_h_and_t)`.
-
-Now, what we've been imagining ourselves doing with the search through the v3
-list is something like this:
-
-
-       larger_computation (search_through_the_list_for_3) other_arguments
-
-That is, the result of our search is supplied as an argument (perhaps together
-with other arguments) to the "larger computation". Without knowing the
-evaluation order/reduction strategy, we can't say whether the search is
-evaluated before or after it's substituted into the larger computation. But
-semantically, the search is the argument and the larger computation is the
-function to which it's supplied.
-
-What if, instead, we did the same kind of thing we did with pairs and v2
-lists? That is, what if we made the larger computation a "handler" that we
-passed as an argument to the search?
-
-       the_search (\search_result. larger_computation search_result other_arguments)
-
-What's the advantage of that, you say. Other than to show off how cleverly
-you can lift.
-
-Well, think about it. Think about the difficulty we were having aborting the
-search. Does this switch-around offer us anything useful?
-
-It could.
-
-What if the way we implemented the search procedure looked something like this?
-
-At a given stage in the search, we wouldn't just apply some function `f` to the
-head at this stage and the result accumulated so far (from folding the same
-function, and a base value, to the tail at this stage)...and then pass the result
-of that application to the embedding, more leftward computation.
-
-We'd *instead* give `f` a "handler" that expects the result of the current
-stage *as an argument*, and then evaluates to what you'd get by passing that
-result leftwards up the list, as before. 
-
-Why would we do that, you say? Just more flamboyant lifting?
-
-Well, no, there's a real point here. If we give the function a "handler" that
-encodes the normal continuation of the fold leftwards through the list, we can
-also give it other "handlers" too. For example, we can also give it the underlined handler:
-
-
-       the_search (\search_result. larger_computation search_result other_arguments)
-                          ------------------------------------------------------------------
-
-This "handler" encodes the search's having finished, and delivering a final
-answer to whatever else you wanted your program to do with the result of the
-search. If you like, at any stage in the search you might just give an argument
-to *this* handler, instead of giving an argument to the handler that continues
-the list traversal leftwards. Semantically, this would amount to *aborting* the
-list traversal! (As we've said before, whether the rest of the list traversal
-really gets evaluated will depend on what evaluation order is in place. But
-semantically we'll have avoided it. Our larger computation  won't depend on the
-rest of the list traversal having been computed.)
-
-Do you have the basic idea? Think about how you'd implement it. A good
-understanding of the v2 lists will give you a helpful model.
-
-In broad outline, a single stage of the search would look like before, except
-now f would receive two extra, "handler" arguments.
-
-       f 3 <result of folding f and z over [2; 1]> <handler to continue folding leftwards> <handler to abort the traversal>
-
-`f`'s job would be to check whether `3` matches the element we're searching for
-(here also `3`), and if it does, just evaluate to the result of passing `true` to
-the abort handler. If it doesn't, then evaluate to the result of passing
-`false` to the continue-leftwards handler.
-
-In this case, `f` wouldn't need to consult the result of folding `f` and `z` over `[2;
-1]`, since if we had found the element `3` in more rightward positions of the
-list, we'd have called the abort handler and this application of `f` to `3` etc
-would never be needed. However, in other applications the result of folding `f`
-and `z` over the more rightward parts of the list would be needed. Consider if
-you were trying to multiply all the elements of the list, and were going to
-abort (with the result `0`) if you came across any element in the list that was
-zero. If you didn't abort, you'd need to know what the more rightward elements
-of the list multiplied to, because that would affect the answer you passed
-along to the continue-leftwards handler.
-
-A **version 5** list encodes the kind of fold operation we're envisaging here, in
-the same way that v3 (and v4) lists encoded the simpler fold operation.
-Roughly, the list `[5;4;3;2;1]` would look like this:
-
-
-       \f z continue_leftwards_handler abort_handler.
-               <fold f and z over [4;3;2;1]>
-               (\result_of_fold_over_4321. f 5 result_of_fold_over_4321  continue_leftwards_handler abort_handler)
-               abort_handler
-
-       ; or, expanding the fold over [4;3;2;1]:
-
-       \f z continue_leftwards_handler abort_handler.
-               (\continue_leftwards_handler abort_handler.
-                       <fold f and z over [3;2;1]>
-                       (\result_of_fold_over_321. f 4 result_of_fold_over_321 continue_leftwards_handler abort_handler)
-                       abort_handler
-               )
-               (\result_of_fold_over_4321. f 5 result_of_fold_over_4321  continue_leftwards_handler abort_handler)
-               abort_handler
-
-       ; and so on             
-       
-Remarks: the `larger_computation` handler should be supplied as both the
-`continue_leftwards_handler` and the `abort_handler` for the leftmost
-application, where the head `5` is supplied to `f`; because the result of this
-application should be passed to the larger computation, whether it's a "fall
-off the left end of the list" result or it's a "I'm finished, possibly early"
-result. The `larger_computation` handler also then gets passed to the next
-rightmost stage, where the head `4` is supplied to `f`, as the `abort_handler` to
-use if that stage decides it has an early answer.
-
-Finally, notice that we don't have the result of applying `f` to `4` etc given as
-an argument to the application of `f` to `5` etc. Instead, we pass
-
-       (\result_of_fold_over_4321. f 5 result_of_fold_over_4321 <one_handler> <another_handler>)
-
-*to* the application of `f` to `4` as its "continue" handler. The application of `f`
-to `4` can decide whether this handler, or the other, "abort" handler, should be
-given an argument and constitute its result.
-
-
-I'll say once again: we're using temporally-loaded vocabulary throughout this,
-but really all we're in a position to mean by that are claims about the result
-of the complex expression semantically depending only on this, not on that. A
-demon evaluator who custom-picked the evaluation order to make things maximally
-bad for you could ensure that all the semantically unnecessary computations got
-evaluated anyway. We don't have any way to prevent that. Later,
-we'll see ways to *semantically guarantee* one evaluation order rather than
-another. Though even then the demonic evaluation-order-chooser could make it
-take unnecessarily long to compute the semantically guaranteed result. Of
-course, in any real computing environment you'll know you're dealing with a
-fixed evaluation order and you'll be able to program efficiently around that.
-
-In detail, then, here's what our v5 lists will look like:
-
-       let empty = \f z continue_handler abort_handler. continue_handler z  in
-       let make_list = \h t. \f z continue_handler abort_handler.
-               t f z (\sofar. f h sofar continue_handler abort_handler) abort_handler  in
-       let isempty = \lst larger_computation. lst
-                       ; here's our f
-                       (\hd sofar continue_handler abort_handler. abort_handler false)
-                       ; here's our z
-                       true
-                       ; here's the continue_handler for the leftmost application of f
-                       larger_computation
-                       ; here's the abort_handler
-                       larger_computation  in
-       let extract_head = \lst larger_computation. lst
-                       ; here's our f
-                       (\hd sofar continue_handler abort_handler. continue_handler hd)
-                       ; here's our z
-                       junk
-                       ; here's the continue_handler for the leftmost application of f
-                       larger_computation
-                       ; here's the abort_handler
-                       larger_computation  in
-       let extract_tail = ; left as exercise
-
-These functions are used like this:
-
-       let my_list = make_list a (make_list b (make_list c empty) in
-       extract_head my_list larger_computation
-
-If you just want to see `my_list`'s head, the use `I` as the
-`larger_computation`.
-
-What we've done here does take some work to follow. But it should be within
-your reach. And once you have followed it, you'll be well on your way to
-appreciating the full terrible power of continuations.
-
-<!-- (Silly [cultural reference](http://www.newgrounds.com/portal/view/33440).) -->
-
-Of course, like everything elegant and exciting in this seminar, [Oleg
-discusses it in much more
-detail](http://okmij.org/ftp/Streams.html#enumerator-stream).
-
-*Comments*:
-
-1.     The technique deployed here, and in the v2 lists, and in our implementations
-       of pairs and booleans, is known as **continuation-passing style** programming.
-
-2.     We're still building the list as a right fold, so in a sense the
-       application of `f` to the leftmost element `5` is "outermost". However,
-       this "outermost" application is getting lifted, and passed as a *handler*
-       to the next right application. Which is in turn getting lifted, and
-       passed to its next right application, and so on. So if you
-       trace the evaluation of the `extract_head` function to the list `[5;4;3;2;1]`,
-       you'll see `1` gets passed as a "this is the head sofar" answer to its
-       `continue_handler`; then that answer is discarded and `2` is
-       passed as a "this is the head sofar" answer to *its* `continue_handler`,
-       and so on. All those steps have to be evaluated to finally get the result
-       that `5` is the outer/leftmost head of the list. That's not an efficient way
-       to get the leftmost head.
-
-       We could improve this by building lists as left folds when implementing them
-       as continuation-passing style folds. We'd just replace above:
-
-               let make_list = \h t. \f z continue_handler abort_handler.
-                       f h z (\z. t f z continue_handler abort_handler) abort_handler
-
-       now `extract_head` should return the leftmost head directly, using its `abort_handler`:
-
-               let extract_head = \lst larger_computation. lst
-                               (\hd sofar continue_handler abort_handler. abort_handler hd)
-                               junk
-                               larger_computation
-                               larger_computation
-
-3.     To extract tails efficiently, too, it'd be nice to fuse the apparatus developed
-       in these v5 lists with the ideas from the v4 lists, above.
-       But that also is left as an exercise.
-
-
-#Implementing trees#
-
-In [[Assignment3]] we proposed a very ad-hoc-ish implementation of trees.
-
-Think about how you'd implement them in a more principled way. You could
-use any of the version 1 -- version 5 implementation of lists as a model.
-
-To keep things simple, we'll stick to binary trees. A node will either be a
-*leaf* of the tree, or it will have exactly two children.
-
-There are two kinds of trees to think about. In one sort of tree, it's only
-the tree's leaves that are labeled:
-
-               .
-          / \ 
-         .   3
-        / \
-       1   2 
-
-Linguists often use trees of this sort. The inner, non-leaf nodes of the
-tree do have associated values. But what values they are can be determined from
-the structure of the tree and the values of the node's left and right children.
-So the inner node doesn't need its own independent label.
-
-In another sort of tree, the tree's inner nodes are also labeled:
-
-               4
-          / \ 
-         2   5
-        / \
-       1   3 
-
-When you want to efficiently arrange an ordered collection, so that it's
-easy to do a binary search through it, this is the way you usually structure
-your data.
-
-These latter sorts of trees can helpfully be thought of as ones where
-*only* the inner nodes are labeled. Leaves can be thought of as special,
-dead-end branches with no label:
-
-                  .4.
-                 /   \ 
-                2     5
-               / \   / \
-          1   3  x x
-         / \ / \
-        x  x x  x
-
-In our earlier discussion of lists, we said they could be thought of as
-data structures of the form:
-
-       Empty_list | Non_empty_list (its_head, its_tail)
-
-And that could in turn be implemented in v2 form as:
-
-       the_list (\head tail. non_empty_handler) empty_handler
-
-Similarly, the leaf-labeled tree:
-
-               .
-          / \ 
-         .   3
-        / \
-       1   2 
-
-can be thought of as a data structure of the form:
-
-       Leaf (its_label) | Non_leaf (its_left_subtree, its_right_subtree)
-
-and that could be implemented in v2 form as:
-
-       the_tree (\left right. non_leaf_handler) (\label. leaf_handler)
-
-And the node-labeled tree:
-
-                  .4.
-                 /   \ 
-                2     5
-               / \   / \
-          1   3  x x
-         / \ / \
-        x  x x  x
-
-can be thought of as a data structure of the form:
-
-       Leaf | Non_leaf (its_left_subtree, its_label, its_right_subtree)
-
-and that could be implemented in v2 form as:
-
-       the_tree (\left label right. non_leaf_handler) leaf_result
-
-
-What would correspond to "folding" a function `f` and base value `z` over a
-tree? Well, if it's an empty tree:
-
-       x
-
-we should presumably get back `z`. And if it's a simple, non-empty tree:
-
-         1
-        / \
-       x   x
-
-we should expect something like `f z 1 z`, or `f <result of folding f and z
-over left subtree> label_of_this_node <result of folding f and z over right
-subtree>`. (It's not important what order we say `f` has to take its arguments
-in.)
-
-A v3-style implementation of node-labeled trees, then, might be:
-
-       let empty_tree = \f z. z  in
-       let make_tree = \left label right. \f z. f (left f z) label (right f z)  in
-       ...
-
-Think about how you might implement other tree operations, such as getting
-the label of the root (topmost node) of a tree; extracting the left subtree of
-a node; and so on.
-
-Think about different ways you might implement leaf-labeled trees.
-
-If you had one tree and wanted to make a larger tree out of it, adding in a
-new element, how would you do that?
-
-When using trees to represent linguistic structures, one doesn't have
-latitude about *how* to build a larger tree. The linguistic structure you're
-trying to represent will determine where the new element should be placed, and
-where the previous tree should be placed.
-
-However, when using trees as a computational tool, one usually does have
-latitude about how to structure a larger tree---in the same way that we had the
-freedom to implement our sets with lists whose members were just appended in
-the order we built the set up, or instead with lists whose members were ordered
-numerically.
-
-When building a new tree, one strategy for where to put the new element and
-where to put the existing tree would be to always lean towards a certain side.
-For instance, to add the element `2` to the tree:
-
-         1
-        / \
-       x   x
-
-we might construct the following tree:
-
-         1
-        / \
-       x   2
-          / \
-         x   x
-
-or perhaps we'd do it like this instead:
-
-         2
-        / \
-       x   1
-          / \
-         x   x
-
-However, if we always leaned to the right side in this way, then the tree
-would get deeper and deeper on that side, but never on the left:
-
-         1
-        / \
-       x   2
-          / \
-         x   3
-                / \
-               x   4
-                  / \
-                 x   5
-                        / \
-                       x   x
-
-and that wouldn't be so useful if you were using the tree as an arrangement
-to enable *binary searches* over the elements it holds. For that, you'd prefer
-the tree to be relatively "balanced", like this:
-
-                  .4.
-                 /   \ 
-                2     5
-               / \   / \
-          1   3  x x
-         / \ / \
-        x  x x  x
-
-Do you have any ideas about how you might efficiently keep the new trees
-you're building pretty "balanced" in this way?
-
-This is a large topic in computer science. There's no need for you to learn
-the various strategies that they've developed for doing this. But
-thinking in broad brush-strokes about what strategies might be promising will
-help strengthen your understanding of trees, and useful ways to implement them
-in a purely functional setting like the lambda calculus.
-
-
-
-