on P-numerals](http://okmij.org/ftp/Computation/lambda-calc.html#p-numerals).
-3. Implementing (self-balancing) trees
+
+3. **Sets**
+
+ You're now already in a position to implement sets: that is, collections with
+ no intrinsic order where elements can occur at most once. Like lists, we'll
+ understand the basic set structures to be *type-homogenous*. So you might have
+ a set of integers, or you might have a set of pairs of integers, but you
+ wouldn't have a set that mixed both types of elements. Something *like* the
+ last option is also achievable, but it's more difficult, and we won't pursue it
+ now. In fact, we won't talk about sets of pairs, either. We'll just talk about
+ sets of integers. The same techniques we discuss here could also be applied to
+ sets of pairs of integers, or sets of triples of booleans, or sets of pairs
+ whose first elements are booleans, and whose second elements are triples of
+ integers. And so on.
+
+ (You're also now in a position to implement *multi*sets: that is, collections
+ with no intrinsic order where elements can occur multiple times: the multiset
+ {a,a} is distinct from the multiset {a}. But we'll leave these as an exercise.)
+
+ The easiest way to implement sets of integers would just be to use lists. When
+ you "add" a member to a set, you'd get back a list that was either identical to
+ the original list, if the added member already was present in it, or consisted
+ of a new list with the added member prepended to the old list. That is:
+
+ let empty_set = empty in
+ ; see the library for definition of any
+ let make_set = \new_member old_set. any (eq new_member) old_set
+ ; if any element in old_set was eq new_member
+ old_set
+ ; else
+ make_list new_member old_set
+
+ Think about how you'd implement operations like `set_union`,
+ `set_intersection`, and `set_difference` with this implementation of sets.
+
+ The implementation just described works, and it's the simplest to code.
+ However, it's pretty inefficient. If you had a 100-member set, and you wanted
+ to create a set which had all those 100-members and some possibly new element
+ `e`, you might need to check all 100 members to see if they're equal to `e`
+ before concluding they're not, and returning the new list. And comparing for
+ numeric equality is a moderately expensive operation, in the first place.
+
+ (You might say, well, what's the harm in just prepending `e` to the list even
+ if it already occurs later in the list. The answer is, if you don't keep track
+ of things like this, it will likely mess up your implementations of
+ `set_difference` and so on. You'll have to do the book-keeping for duplicates
+ at some point in your code. It goes much more smoothly if you plan this from
+ the very beginning.)
+
+ How might we make the implementation more efficient? Well, the *semantics* of
+ sets says that they have no intrinsic order. That means, there's no difference
+ between the set {a,b} and the set {b,a}; whereas there is a difference between
+ the *list* [a;b] and the list [b;a]. But this semantic point can be respected
+ even if we *implement* sets with something ordered, like list---as we're
+ already doing. And we might *exploit* the intrinsic order of lists to make our
+ implementation of sets more efficient.
+
+ What we could do is arrange it so that a list that implements a set always
+ keeps in elements in some specified order. To do this, there'd have *to be*
+ some way to order its elements. Since we're talking now about sets of numbers,
+ that's easy. (If we were talking about sets of pairs of numbers, we'd use
+ "lexicographic" ordering, where `(a,b) < (c,d)` iff `a < c or (a == c and b <
+ d)`.)
+
+ So, if we were searching the list that implements some set to see if the number
+ 5 belonged to it, once we get to elements in the list that are larger than 5,
+ we can stop. If we haven't found 5 already, we know it's not in the rest of the
+ list either.
+
+ This is an improvement, but it's still a "linear" search through the list.
+ There are even more efficient methods, which employ "binary" searching. They'd
+ represent the set in such a way that you could quickly determine whether some
+ element fell in one half, call it the left half, of the structure that
+ implements the set, if it belonged to the set at all. Or that it fell in the
+ right half, it it belonged to the set at all. And then the same sort of
+ determination could be made for whichever half you were directed to. And then
+ for whichever quarter you were directed to next. And so on. Until you either
+ found the element or exhausted the structure and could then conclude that the
+ element in question was not part of the set. These sorts of structures are done
+ using **binary trees** (see below).
+
+
+4. **Aborting a search through a list**
+
+ We said that the sorted-list implementation of a set was more efficient than
+ the unsorted-list implementation, because as you were searching through the
+ list, you could come to a point where you knew the element wasn't going to be
+ found. So you wouldn't have to continue the search.
+
+ If your implementation of lists was, say v1 lists plus the Y-combinator, then
+ this is exactly right. When you get to a point where you know the answer, you
+ can just deliver that answer, and not branch into any further recursion. If
+ you've got the right evaluation strategy in place, everything will work out
+ fine.
+
+ But what if you're using v3 lists? What options would you have then for
+ aborting a search?
+
+ Well, suppose we're searching through the list [5; 4; 3; 2; 1] to see if it
+ contains the number 3. The expression which represents this search would have
+ something like the following form:
+
+ ..................<eq? 1 3> ~~>
+ .................. false ~~>
+ .............<eq? 2 3> ~~>
+ ............. false ~~>
+ .........<eq? 3 3> ~~>
+ ......... true ~~>
+ ?
+
+ Of course, whether those reductions actually followed in that order would
+ depend on what reduction strategy was in place. But the result of folding the
+ search function over the part of the list whose head is 3 and whose tail is [2;
+ 1] will *semantically* depend on the result of applying that function to the
+ more rightmost pieces of the list, too, regardless of what order the reduction
+ is computed by. Conceptually, it will be easiest if we think of the reduction
+ happening in the order displayed above.
+
+ Well, once we've found a match between our sought number 3 and some member of
+ the list, we'd like to avoid any further unnecessary computations and just
+ deliver the answer `true` as "quickly" or directly as possible to the larger
+ computation in which the search was embedded.
+
+ With a Y-combinator based search, as we said, we could do this by just not
+ following a recursion branch.
+
+ But with the v3 lists, the fold is "pre-programmed" to continue over the whole
+ list. There is no way for us to bail out of applying the search function to the
+ parts of the list that have head 4 and head 5, too.
+
+ We *can* avoid some unneccessary computation. The search function can detect
+ that the result we've accumulated so far during the fold is now true, so we
+ don't need to bother comparing 4 or 5 to 3 for equality. That will simplify the
+ computation to some degree, since as we said, numerical comparison in the
+ system we're working in is moderately expensive.
+
+ However, we're still going to have to traverse the remainder of the list. That
+ `true` result will have to be passed along all the way to the leftmost head of
+ the list. Only then can we deliver it to the larger computation in which the
+ search was embedded.
+
+ It would be better if there were some way to "abort" the list traversal. If,
+ having found the element we're looking for (or having determined that the
+ element isn't going to be found), we could just immediately stop traversing the
+ list with our answer. Continuations will turn out to let us do that.
+
+ We won't try yet to fully exploit the terrible power of continuations. But
+ there's a way that we can gain their benefits here locally, without yet having
+ a fully general machinery or understanding of what's going on.
+
+ The key is to recall how our implementations of booleans and pairs worked.
+ Remember that with pairs, we supply the pair "handler" to the pair as *an
+ argument*, rather than the other way around:
+
+ pair (\x y. add x y)
+
+ or:
+
+ pair (\x y. x)
+
+ to get the first element of the pair. Of course you can lift that if you want:
+
+ extract_1st === \pair. pair (\x y. x)
+
+ but at a lower level, the pair is still accepting its handler as an argument,
+ rather than the handler taking the pair as an argument. (The handler gets *the
+ pair's elements*, not the pair itself, as arguments.)
+
+ The v2 implementation of lists followed a similar strategy:
+
+ v2list (\h t. do_something_with_h_and_t) result_if_empty
+
+ If the v2list here is not empty, then this will reduce to the result of
+ supplying the list's head and tail to the handler `(\h t.
+ do_something_with_h_and_t)`.
+
+ Now, what we've been imagining ourselves doing with the search through the v3
+ list is something like this:
+
+
+ larger_computation (search_through_the_list_for_3) other_arguments
+
+ That is, the result of our search is supplied as an argument (perhaps together
+ with other arguments) to the "larger computation". Without knowing the
+ evaluation order/reduction strategy, we can't say whether the search is
+ evaluated before or after it's substituted into the larger computation. But
+ semantically, the search is the argument and the larger computation is the
+ function to which it's supplied.
+
+ What if, instead, we did the same kind of thing we did with pairs and v2 lists? That is, what if we made the larger computation a "handler" that we passed as an argument to the search?
+
+ the_search (\search_result. larger_computation search_result other_arguments)
+
+ What's the advantage of that, you say. Other than to show off how cleverly you can lift.
+
+ Well, think about it. Think about the difficulty we were having aborting the
+ search. Does this switch-around offer us anything useful?
+
+ It could.
+
+ What if the way we implemented the search procedure looked something like this?
+
+ At a given stage in the search, we wouldn't just apply some function f to the
+ head at this stage and the result accumulated so far, from folding the same
+ function (and a base value) to the tail at this stage. And then pass the result
+ of doing so leftward along the rest of the list.
+
+ We'd also give that function a "handler" that expected the result of the
+ current stage as an argument, and evaluated to passing that result leftwards
+ along the rest of the list.
+
+ Why would we do that, you say? Just more flamboyant lifting?
+
+ Well, no, there's a real point here. If we give the function a "handler" that
+ encodes the normal continuation of the fold leftwards through the list. We can
+ give it another "handler" as well. We can also give it the underlined handler:
+
+
+ the_search (\search_result. larger_computation search_result other_arguments)
+ ------------------------------------------------------------------
+
+ This "handler" encodes the search's having finished, and delivering a final
+ answer to whatever else you wanted your program to do with the result of the
+ search. If you like, at any stage in the search you might just give an argument
+ to this handler, instead of giving an argument to the handler that continues
+ the list traversal leftwards. Semantically, this would amount to *aborting* the
+ list traversal! (As we've said before, whether the rest of the list traversal
+ really gets evaluated will depend on what evaluation order is in place. But
+ semantically we'll have avoided it. Our larger computation won't depend on the
+ rest of the list traversal having been computed.)
+
+ Do you have the basic idea? Think about how you'd implement it. A good
+ understanding of the v2 lists will give you a helpful model.
+
+ In broad outline, a single stage of the search would look like before, except
+ now f would receive two extra, "handler" arguments.
+
+ f 3 <result of folding f and z over [2; 1]> <handler to continue folding leftwards> <handler to abort the traversal>
+
+ f's job would be to check whether 3 matches the element we're searching for
+ (here also 3), and if it does, just evaluate to the result of passing `true` to
+ the abort handler. If it doesn't, then evaluate to the result of passing
+ `false` to the continue-leftwards handler.
+
+ In this case, f wouldn't need to consult the result of folding f and z over [2;
+ 1], since if we had found the element 3 in more rightward positions of the
+ list, we'd have called the abort handler and this application of f to 3 etc
+ would never be needed. However, in other applications the result of folding f
+ and z over the more rightward parts of the list would be needed. Consider if
+ you were trying to multiply all the elements of the list, and were going to
+ abort (with the result 0) if you came across any element in the list that was
+ zero. If you didn't abort, you'd need to know what the more rightward elements
+ of the list multiplied to, because that would affect the answer you passed
+ along to the continue-leftwards handler.
+
+ A **version 5** list would encode this kind of fold operation over the list, in
+ the same way that v3 (and v4) lists encoded the simpler fold operation.
+ Roughly, the list [5; 4; 3; 2; 1] would look like this:
+
+
+ \f z continue_leftwards_handler abort_handler.
+ <fold f and z over [4; 3; 2; 1]>
+ (\result_of_fold_over_4321. f 5 result_of_fold_over_4321 continue_leftwards_handler abort_handler)
+ abort_handler
+
+
+ \f z continue_leftwards_handler abort_handler.
+ (\continue_leftwards_handler abort_handler.
+ <fold f and z over [3; 2; 1]>
+ (\result_of_fold_over_321. f 4 result_of_fold_over_321 continue_leftwards_handler abort_handler)
+ abort_handler
+ )
+ (\result_of_fold_over_4321. f 5 result_of_fold_over_4321 continue_leftwards_handler abort_handler)
+ abort_handler
+
+ and so on
+
+ Remarks: the `larger_computation_handler` should be supplied as both the
+ `continue_leftwards_handler` and the `abort_handler` for the leftmost
+ application, where the head 5 is supplied to f. Because the result of this
+ application should be passed to the larger computation, whether it's a "fall
+ off the left end of the list" result or it's a "I'm finished, possibly early"
+ result. The `larger_computation_handler` also then gets passed to the next
+ rightmost stage, where the head 4 is supplied to f, as the `abort_handler` to
+ use if that stage decides it has an early answer.
+
+ Finally, notice that we don't have the result of applying f to 4 etc given as
+ an argument to the application of f to 5 etc. Instead, we pass
+
+ (\result_of_fold_over_4321. f 5 result_of_fold_over_4321 one_handler another_handler)
+
+ *to* the application of f to 4 as its "continue" handler. The application of f
+ to 4 can decide whether this handler, or the other, "abort" handler, should be
+ given an argument and constitute its result.
+
+
+ I'll say once again: we're using temporally-loaded vocabulary throughout this,
+ but really all we're in a position to mean by that are claims about the result
+ of the complex expression semantically depending only on this, not on that. A
+ demon evaluator who custom-picked the evaluation order to make things maximally
+ bad for you could ensure that all the semantically unnecessary computations got
+ evaluated anyway. At this stage, we don't have any way to prevent that. Later,
+ we'll see ways to semantically guarantee one evaluation order rather than
+ another. Though even then the demonic evaluation-order-chooser could make it
+ take unnecessarily long to compute the semantically guaranteed result. Of
+ course, in any real computing environment you'll know you're dealing with a
+ fixed evaluation order and you'll be able to program efficiently around that.
+
+ In detail, then, here's what our v5 lists will look like:
+
+ let empty = \f z continue_handler abort_handler. continue_handler z in
+ let make_list = \h t. \f z continue_handler abort_handler.
+ t f z (\sofar. f h sofar continue_handler abort_handler) abort_handler in
+ let isempty = \lst larger_computation. lst
+ ; here's our f
+ (\hd sofar continue_handler abort_handler. abort_handler false)
+ ; here's our z
+ true
+ ; here's the continue_handler for the leftmost application of f
+ larger_computation
+ ; here's the abort_handler
+ larger_computation in
+ let extract_head = \lst larger_computation. lst
+ ; here's our f
+ (\hd sofar continue_handler abort_handler. continue_handler hd)
+ ; here's our z
+ junk
+ ; here's the continue_handler for the leftmost application of f
+ larger_computation
+ ; here's the abort_handler
+ larger_computation in
+ let extract_tail = ; left as exercise
+ ;; for real efficiency, it'd be nice to fuse the apparatus developed
+ ;; in these v5 lists with the ideas from the v4 lists, above
+ ;; but that also is left as an exercise
+
+ These functions are used like this:
+
+ let my_list = make_list a (make_list b (make_list c empty) in
+ extract_head my_list larger_computation
+
+ If you just want to see `my_list`'s head, the use `I` as the
+ `larger_computation`.
+
+ What we've done here does take some work to follow. But it should be within
+ your reach. And once you have followed it, you'll be well on your way to
+ appreciating the full terrible power of continuations. (Silly [cultural
+ reference](http://www.newgrounds.com/portal/view/33440).)
+
+ Of course, like everything elegant and exciting in this seminar, [Oleg
+ discusses it in much more
+ detail](http://okmij.org/ftp/Streams.html#enumerator-stream).
+
+
+
+5. Implementing (self-balancing) trees