+#Sets#
+
+You're now already in a position to implement sets: that is, collections with
+no intrinsic order where elements can occur at most once. Like lists, we'll
+understand the basic set structures to be *type-homogenous*. So you might have
+a set of integers, or you might have a set of pairs of integers, but you
+wouldn't have a set that mixed both types of elements. Something *like* the
+last option is also achievable, but it's more difficult, and we won't pursue it
+now. In fact, we won't talk about sets of pairs, either. We'll just talk about
+sets of integers. The same techniques we discuss here could also be applied to
+sets of pairs of integers, or sets of triples of booleans, or sets of pairs
+whose first elements are booleans, and whose second elements are triples of
+integers. And so on.
+
+(You're also now in a position to implement *multi*sets: that is, collections
+with no intrinsic order where elements can occur multiple times: the multiset
+{a,a} is distinct from the multiset {a}. But we'll leave these as an exercise.)
+
+The easiest way to implement sets of integers would just be to use lists. When
+you "add" a member to a set, you'd get back a list that was either identical to
+the original list, if the added member already was present in it, or consisted
+of a new list with the added member prepended to the old list. That is:
+
+ let empty_set = empty in
+ ; see the library for definitions of any and eq
+ let make_set = \new_member old_set. any (eq new_member) old_set
+ ; if any element in old_set was eq new_member
+ old_set
+ ; else
+ make_list new_member old_set
+
+Think about how you'd implement operations like `set_union`,
+`set_intersection`, and `set_difference` with this implementation of sets.
+
+The implementation just described works, and it's the simplest to code.
+However, it's pretty inefficient. If you had a 100-member set, and you wanted
+to create a set which had all those 100-members and some possibly new element
+`e`, you might need to check all 100 members to see if they're equal to `e`
+before concluding they're not, and returning the new list. And comparing for
+numeric equality is a moderately expensive operation, in the first place.
+
+(You might say, well, what's the harm in just prepending `e` to the list even
+if it already occurs later in the list. The answer is, if you don't keep track
+of things like this, it will likely mess up your implementations of
+`set_difference` and so on. You'll have to do the book-keeping for duplicates
+at some point in your code. It goes much more smoothly if you plan this from
+the very beginning.)
+
+How might we make the implementation more efficient? Well, the *semantics* of
+sets says that they have no intrinsic order. That means, there's no difference
+between the set {a,b} and the set {b,a}; whereas there is a difference between
+the *list* `[a;b]` and the list `[b;a]`. But this semantic point can be respected
+even if we *implement* sets with something ordered, like list---as we're
+already doing. And we might *exploit* the intrinsic order of lists to make our
+implementation of sets more efficient.
+
+What we could do is arrange it so that a list that implements a set always
+keeps in elements in some specified order. To do this, there'd have *to be*
+some way to order its elements. Since we're talking now about sets of numbers,
+that's easy. (If we were talking about sets of pairs of numbers, we'd use
+"lexicographic" ordering, where `(a,b) < (c,d)` iff `a < c or (a == c and b <
+d)`.)
+
+So, if we were searching the list that implements some set to see if the number
+`5` belonged to it, once we get to elements in the list that are larger than `5`,
+we can stop. If we haven't found `5` already, we know it's not in the rest of the
+list either.
+
+> *Comment*: This is an improvement, but it's still a "linear" search through the list.
+There are even more efficient methods, which employ "binary" searching. They'd
+represent the set in such a way that you could quickly determine whether some
+element fell in one half, call it the left half, of the structure that
+implements the set, if it belonged to the set at all. Or that it fell in the
+right half, it it belonged to the set at all. And then the same sort of
+determination could be made for whichever half you were directed to. And then
+for whichever quarter you were directed to next. And so on. Until you either
+found the element or exhausted the structure and could then conclude that the
+element in question was not part of the set. These sorts of structures are done
+using [binary trees](/implementing_trees).
+
+
+#Aborting a search through a list#
+
+We said that the sorted-list implementation of a set was more efficient than
+the unsorted-list implementation, because as you were searching through the
+list, you could come to a point where you knew the element wasn't going to be
+found. So you wouldn't have to continue the search.
+
+If your implementation of lists was, say v1 lists plus the Y-combinator, then
+this is exactly right. When you get to a point where you know the answer, you
+can just deliver that answer, and not branch into any further recursion. If
+you've got the right evaluation strategy in place, everything will work out
+fine.
+
+But what if we wanted to use v3 lists instead?
+
+> Why would we want to do that? The advantage of the v3 lists and v3 (aka
+"Church") numerals is that they have their recursive capacity built into their
+very bones. So for many natural operations on them, you won't need to use a fixed
+point combinator.
+
+> Why is that an advantage? Well, if you use a fixed point combinator, then
+the terms you get won't be strongly normalizing: whether their reduction stops
+at a normal form will depend on what evaluation order you use. Our online
+[[lambda evaluator]] uses normal-order reduction, so it finds a normal form if
+there's one to be had. But if you want to build lambda terms in, say, Scheme,
+and you wanted to roll your own recursion as we've been doing, rather than
+relying on Scheme's native `let rec` or `define`, then you can't use the
+fixed-point combinators `Y` or <code>Θ</code>. Expressions using them
+will have non-terminating reductions, with Scheme's eager/call-by-value
+strategy. There are other fixed-point combinators you can use with Scheme (in
+the [week 3 notes](/week3/#index7h2) they were <code>Y′</code> and
+<code>Θ′</code>. But even with them, evaluation order still
+matters: for some (admittedly unusual) evaluation strategies, expressions using
+them will also be non-terminating.
+
+> The fixed-point combinators may be the conceptual stars. They are cool and
+mathematically elegant. But for efficiency and implementation elegance, it's
+best to know how to do as much as you can without them. (Also, that knowledge
+could carry over to settings where the fixed point combinators are in principle
+unavailable.)
+
+
+So again, what if we're using v3 lists? What options would we have then for
+aborting a search or list traversal before it runs to completion?
+
+Suppose we're searching through the list `[5;4;3;2;1]` to see if it
+contains the number `3`. The expression which represents this search would have
+something like the following form:
+
+ ..................<eq? 1 3> ~~>
+ .................. false ~~>
+ .............<eq? 2 3> ~~>
+ ............. false ~~>
+ .........<eq? 3 3> ~~>
+ ......... true ~~>
+ ?
+
+Of course, whether those reductions actually followed in that order would
+depend on what reduction strategy was in place. But the result of folding the
+search function over the part of the list whose head is `3` and whose tail is `[2;
+1]` will *semantically* depend on the result of applying that function to the
+more rightmost pieces of the list, too, regardless of what order the reduction
+is computed by. Conceptually, it will be easiest if we think of the reduction
+happening in the order displayed above.
+
+Once we've found a match between our sought number `3` and some member of
+the list, we'd like to avoid any further unnecessary computations and just
+deliver the answer `true` as "quickly" or directly as possible to the larger
+computation in which the search was embedded.
+
+With a Y-combinator based search, as we said, we could do this by just not
+following a recursion branch.