X-Git-Url: http://lambda.jimpryor.net/git/gitweb.cgi?p=lambda.git;a=blobdiff_plain;f=week4.mdwn;h=5e219484b7cae7ff81e112bad9481044747c1204;hp=7cd8a921dbfebc07558180aa7717a3b83e0cd0d5;hb=7b3b1c578a8c6d50c71ab570a1a49a1c3385ca62;hpb=5dad3bae051905473c6cc3d01bf261cbcce0968e diff --git a/week4.mdwn b/week4.mdwn index 7cd8a921..5e219484 100644 --- a/week4.mdwn +++ b/week4.mdwn @@ -6,11 +6,11 @@ A: That's easy: let `T` be an arbitrary term in the lambda calculus. If `T` has a fixed point, then there exists some `X` such that `X <~~> TX` (that's what it means to *have* a fixed point). -
-let W = \x.T(xx) in
-let X = WW in
-X = WW = (\x.T(xx))W = T(WW) = TX
-
+

+let L = \x. T (x x) in
+let X = L L in
+X ≡ L L ≡ (\x. T (x x)) L ~~> T (L L) ≡ T X
+
Please slow down and make sure that you understand what justified each of the equalities in the last line. @@ -173,129 +173,399 @@ so A 4 x is to A 3 x as hyper-exponentiation is to exponentiation... Is leastness important? -##The simply-typed lambda calculus## - -The uptyped lambda calculus is pure computation. It is much more -common, however, for practical programming languages to be typed. -Likewise, systems used to investigate philosophical or linguistic -issues are almost always typed. Types will help us reason about our -computations. They will also facilitate a connection between logic -and computation. -Soon we will consider polymorphic type systems. First, however, we -will consider the simply-typed lambda calculus. There's good news and -bad news: the good news is that the simply-type lambda calculus is -strongly normalizing: every term has a normal form. We shall see that -self-application is outlawed, so Ω can't even be written, let -alone undergo reduction. The bad news is that fixed-point combinators -are also forbidden, so recursion is neither simple nor direct. +#Sets# + +You're now already in a position to implement sets: that is, collections with +no intrinsic order where elements can occur at most once. Like lists, we'll +understand the basic set structures to be *type-homogenous*. So you might have +a set of integers, or you might have a set of pairs of integers, but you +wouldn't have a set that mixed both types of elements. Something *like* the +last option is also achievable, but it's more difficult, and we won't pursue it +now. In fact, we won't talk about sets of pairs, either. We'll just talk about +sets of integers. The same techniques we discuss here could also be applied to +sets of pairs of integers, or sets of triples of booleans, or sets of pairs +whose first elements are booleans, and whose second elements are triples of +integers. And so on. + +(You're also now in a position to implement *multi*sets: that is, collections +with no intrinsic order where elements can occur multiple times: the multiset +{a,a} is distinct from the multiset {a}. But we'll leave these as an exercise.) + +The easiest way to implement sets of integers would just be to use lists. When +you "add" a member to a set, you'd get back a list that was either identical to +the original list, if the added member already was present in it, or consisted +of a new list with the added member prepended to the old list. That is: + + let empty_set = empty in + ; see the library for definitions of any and eq + let make_set = \new_member old_set. any (eq new_member) old_set + ; if any element in old_set was eq new_member + old_set + ; else + make_list new_member old_set + +Think about how you'd implement operations like `set_union`, +`set_intersection`, and `set_difference` with this implementation of sets. + +The implementation just described works, and it's the simplest to code. +However, it's pretty inefficient. If you had a 100-member set, and you wanted +to create a set which had all those 100-members and some possibly new element +`e`, you might need to check all 100 members to see if they're equal to `e` +before concluding they're not, and returning the new list. And comparing for +numeric equality is a moderately expensive operation, in the first place. + +(You might say, well, what's the harm in just prepending `e` to the list even +if it already occurs later in the list. The answer is, if you don't keep track +of things like this, it will likely mess up your implementations of +`set_difference` and so on. You'll have to do the book-keeping for duplicates +at some point in your code. It goes much more smoothly if you plan this from +the very beginning.) + +How might we make the implementation more efficient? Well, the *semantics* of +sets says that they have no intrinsic order. That means, there's no difference +between the set {a,b} and the set {b,a}; whereas there is a difference between +the *list* `[a;b]` and the list `[b;a]`. But this semantic point can be respected +even if we *implement* sets with something ordered, like list---as we're +already doing. And we might *exploit* the intrinsic order of lists to make our +implementation of sets more efficient. + +What we could do is arrange it so that a list that implements a set always +keeps in elements in some specified order. To do this, there'd have *to be* +some way to order its elements. Since we're talking now about sets of numbers, +that's easy. (If we were talking about sets of pairs of numbers, we'd use +"lexicographic" ordering, where `(a,b) < (c,d)` iff `a < c or (a == c and b < +d)`.) + +So, if we were searching the list that implements some set to see if the number +`5` belonged to it, once we get to elements in the list that are larger than `5`, +we can stop. If we haven't found `5` already, we know it's not in the rest of the +list either. + +This is an improvement, but it's still a "linear" search through the list. +There are even more efficient methods, which employ "binary" searching. They'd +represent the set in such a way that you could quickly determine whether some +element fell in one half, call it the left half, of the structure that +implements the set, if it belonged to the set at all. Or that it fell in the +right half, it it belonged to the set at all. And then the same sort of +determination could be made for whichever half you were directed to. And then +for whichever quarter you were directed to next. And so on. Until you either +found the element or exhausted the structure and could then conclude that the +element in question was not part of the set. These sorts of structures are done +using **binary trees** (see below). + + +#Aborting a search through a list# + +We said that the sorted-list implementation of a set was more efficient than +the unsorted-list implementation, because as you were searching through the +list, you could come to a point where you knew the element wasn't going to be +found. So you wouldn't have to continue the search. + +If your implementation of lists was, say v1 lists plus the Y-combinator, then +this is exactly right. When you get to a point where you know the answer, you +can just deliver that answer, and not branch into any further recursion. If +you've got the right evaluation strategy in place, everything will work out +fine. + +But what if you're using v3 lists? What options would you have then for +aborting a search? + +Well, suppose we're searching through the list `[5;4;3;2;1]` to see if it +contains the number `3`. The expression which represents this search would have +something like the following form: + + .................. ~~> + .................. false ~~> + ............. ~~> + ............. false ~~> + ......... ~~> + ......... true ~~> + ? + +Of course, whether those reductions actually followed in that order would +depend on what reduction strategy was in place. But the result of folding the +search function over the part of the list whose head is `3` and whose tail is `[2; +1]` will *semantically* depend on the result of applying that function to the +more rightmost pieces of the list, too, regardless of what order the reduction +is computed by. Conceptually, it will be easiest if we think of the reduction +happening in the order displayed above. + +Well, once we've found a match between our sought number `3` and some member of +the list, we'd like to avoid any further unnecessary computations and just +deliver the answer `true` as "quickly" or directly as possible to the larger +computation in which the search was embedded. + +With a Y-combinator based search, as we said, we could do this by just not +following a recursion branch. + +But with the v3 lists, the fold is "pre-programmed" to continue over the whole +list. There is no way for us to bail out of applying the search function to the +parts of the list that have head `4` and head `5`, too. -#Types# +We *can* avoid *some* unneccessary computation. The search function can detect +that the result we've accumulated so far during the fold is now `true`, so we +don't need to bother comparing `4` or `5` to `3` for equality. That will simplify the +computation to some degree, since as we said, numerical comparison in the +system we're working in is moderately expensive. + +However, we're still going to have to traverse the remainder of the list. That +`true` result will have to be passed along all the way to the leftmost head of +the list. Only then can we deliver it to the larger computation in which the +search was embedded. + +It would be better if there were some way to "abort" the list traversal. If, +having found the element we're looking for (or having determined that the +element isn't going to be found), we could just immediately stop traversing the +list with our answer. **Continuations** will turn out to let us do that. -We will have at least one ground type, `o`. From a linguistic point -of view, think of the ground types as the bar-level 0 categories, that -is, the lexical types, such as Noun, Verb, Preposition (glossing over -the internal complexity of those categories in modern theories). +We won't try yet to fully exploit the terrible power of continuations. But +there's a way that we can gain their benefits here locally, without yet having +a fully general machinery or understanding of what's going on. -In addition, there will be a recursively-defined class of complex -types `T`, the smallest set such that +The key is to recall how our implementations of booleans and pairs worked. +Remember that with pairs, we supply the pair "handler" to the pair as *an +argument*, rather than the other way around: -* ground types, including `o`, are in `T` + pair (\x y. add x y) -* for any types σ and τ in `T`, the type σ --> - τ is in `T`. +or: -For instance, here are some types in `T`: + pair (\x y. x) - o - o --> o - o --> o --> o - (o --> o) --> o - (o --> o) --> o --> o +to get the first element of the pair. Of course you can lift that if you want: -and so on. +
extract_fst ≡ \pair. pair (\x y. x)
-#Typed lambda terms# +but at a lower level, the pair is still accepting its handler as an argument, +rather than the handler taking the pair as an argument. (The handler gets *the +pair's elements*, not the pair itself, as arguments.) -Given a set of types `T`, we define the set of typed lambda terms Λ_T, -which is the smallest set such that +> *Terminology*: we'll try to use names of the form `get_foo` for handlers, and +names of the form `extract_foo` for lifted versions of them, that accept the +lists (or whatever data structure we're working with) as arguments. But we may +sometimes forget. -* each type `t` has an infinite set of distinct variables, {x^t}_1, - {x^t}_2, {x^t}_3, ... +The v2 implementation of lists followed a similar strategy: -* If a term `M` has type σ --> τ, and a term `N` has type - σ, then the application `(M N)` has type τ. + v2list (\h t. do_something_with_h_and_t) result_if_empty -* If a variable `a` has type σ, and term `M` has type τ, - then the abstract λ a M has type σ --> τ. +If the `v2list` here is not empty, then this will reduce to the result of +supplying the list's head and tail to the handler `(\h t. +do_something_with_h_and_t)`. -The definitions of types and of typed terms should be highly familiar -to semanticists, except that instead of writing σ --> τ, -linguists (following Montague, who followed Church) write <σ, -τ>. We will use the arrow notation, since it is more iconic. +Now, what we've been imagining ourselves doing with the search through the v3 +list is something like this: -Some examples (assume that `x` has type `o`): - x o - \x.x o --> o - ((\x.x) x) o + larger_computation (search_through_the_list_for_3) other_arguments -Excercise: write down terms that have the following types: +That is, the result of our search is supplied as an argument (perhaps together +with other arguments) to the "larger computation". Without knowing the +evaluation order/reduction strategy, we can't say whether the search is +evaluated before or after it's substituted into the larger computation. But +semantically, the search is the argument and the larger computation is the +function to which it's supplied. - o --> o --> o - (o --> o) --> o --> o - (o --> o --> o) --> o +What if, instead, we did the same kind of thing we did with pairs and v2 +lists? That is, what if we made the larger computation a "handler" that we +passed as an argument to the search? -#Associativity of types versus terms# + the_search (\search_result. larger_computation search_result other_arguments) -As we have seen many times, in the lambda calculus, function -application is left associative, so that `f x y z == (((f x) y) z)`. -Types, *THEREFORE*, are right associative: if `f`, `x`, `y`, and `z` -have types `a`, `b`, `c`, and `d`, respectively, then `f` has type `a ---> b --> c --> d == (a --> (b --> (c --> d)))`. +What's the advantage of that, you say. Other than to show off how cleverly +you can lift. -It is a serious faux pas to associate to the left for types. You may -as well use your salad fork to stir your tea. +Well, think about it. Think about the difficulty we were having aborting the +search. Does this switch-around offer us anything useful? -#The simply-typed lambda calculus is strongly normalizing# +It could. -If `M` is a term with type τ in Λ_T, then `M` has a -normal form. The proof is not particularly complex, but we will not -present it here; see Berendregt or Hankin. +What if the way we implemented the search procedure looked something like this? -Since Ω does not have a normal form, it follows that Ω -cannot have a type in Λ_T. We can easily see why: +At a given stage in the search, we wouldn't just apply some function `f` to the +head at this stage and the result accumulated so far (from folding the same +function, and a base value, to the tail at this stage)...and then pass the result +of that application to the embedding, more leftward computation. - Ω = (\x.xx)(\x.xx) +We'd *instead* give `f` a "handler" that expects the result of the current +stage *as an argument*, and then evaluates to what you'd get by passing that +result leftwards up the list, as before. -Assume Ω has type τ, and `\x.xx` has type σ. Then -because `\x.xx` takes an argument of type σ and returns -something of type τ, `\x.xx` must also have type `σ --> -τ`. By repeating this reasoning, `\x.xx` must also have type -`(σ --> τ) --> τ`; and so on. Since variables have -finite types, there is no way to choose a type for the variable `x` -that can satisfy all of the requirements imposed on it. +Why would we do that, you say? Just more flamboyant lifting? -In general, there is no way for a function to have a type that can -take itself for an argument. It follows that there is no way to -define the identity function in such a way that it can take itself as -an argument. Instead, there must be many different identity -functions, one for each type. +Well, no, there's a real point here. If we give the function a "handler" that +encodes the normal continuation of the fold leftwards through the list, we can +also give it other "handlers" too. For example, we can also give it the underlined handler: -#Typing numerals# -Version 1 type numerals are not a good choice for the simply-typed -lambda calculus. The reason is that each different numberal has a -different type! For instance, if zero has type σ, then `false` -has type τ --> τ --> &tau, for some τ. Since one is -represented by the function `\x.x false 0`, one must have type `(τ ---> τ --> &tau) --> &sigma --> σ`. But this is a different -type than zero! Because each number has a different type, it becomes -impossible to write arithmetic operations that can combine zero with -one. We would need as many different addition operations as we had -pairs of numbers that we wanted to add. + the_search (\search_result. larger_computation search_result other_arguments) + ------------------------------------------------------------------ -Fortunately, the Church numberals are well behaved with respect to -types. They can all be given the type `(σ --> σ) --> -σ --> σ`. +This "handler" encodes the search's having finished, and delivering a final +answer to whatever else you wanted your program to do with the result of the +search. If you like, at any stage in the search you might just give an argument +to *this* handler, instead of giving an argument to the handler that continues +the list traversal leftwards. Semantically, this would amount to *aborting* the +list traversal! (As we've said before, whether the rest of the list traversal +really gets evaluated will depend on what evaluation order is in place. But +semantically we'll have avoided it. Our larger computation won't depend on the +rest of the list traversal having been computed.) + +Do you have the basic idea? Think about how you'd implement it. A good +understanding of the v2 lists will give you a helpful model. + +In broad outline, a single stage of the search would look like before, except +now f would receive two extra, "handler" arguments. + + f 3 + +`f`'s job would be to check whether `3` matches the element we're searching for +(here also `3`), and if it does, just evaluate to the result of passing `true` to +the abort handler. If it doesn't, then evaluate to the result of passing +`false` to the continue-leftwards handler. + +In this case, `f` wouldn't need to consult the result of folding `f` and `z` over `[2; +1]`, since if we had found the element `3` in more rightward positions of the +list, we'd have called the abort handler and this application of `f` to `3` etc +would never be needed. However, in other applications the result of folding `f` +and `z` over the more rightward parts of the list would be needed. Consider if +you were trying to multiply all the elements of the list, and were going to +abort (with the result `0`) if you came across any element in the list that was +zero. If you didn't abort, you'd need to know what the more rightward elements +of the list multiplied to, because that would affect the answer you passed +along to the continue-leftwards handler. + +A **version 5** list encodes the kind of fold operation we're envisaging here, in +the same way that v3 (and [v4](/advanced/#v4)) lists encoded the simpler fold operation. +Roughly, the list `[5;4;3;2;1]` would look like this: + + + \f z continue_leftwards_handler abort_handler. + + (\result_of_fold_over_4321. f 5 result_of_fold_over_4321 continue_leftwards_handler abort_handler) + abort_handler + + ; or, expanding the fold over [4;3;2;1]: + + \f z continue_leftwards_handler abort_handler. + (\continue_leftwards_handler abort_handler. + + (\result_of_fold_over_321. f 4 result_of_fold_over_321 continue_leftwards_handler abort_handler) + abort_handler + ) + (\result_of_fold_over_4321. f 5 result_of_fold_over_4321 continue_leftwards_handler abort_handler) + abort_handler + + ; and so on + +Remarks: the `larger_computation` handler should be supplied as both the +`continue_leftwards_handler` and the `abort_handler` for the leftmost +application, where the head `5` is supplied to `f`; because the result of this +application should be passed to the larger computation, whether it's a "fall +off the left end of the list" result or it's a "I'm finished, possibly early" +result. The `larger_computation` handler also then gets passed to the next +rightmost stage, where the head `4` is supplied to `f`, as the `abort_handler` to +use if that stage decides it has an early answer. + +Finally, notice that we don't have the result of applying `f` to `4` etc given as +an argument to the application of `f` to `5` etc. Instead, we pass + + (\result_of_fold_over_4321. f 5 result_of_fold_over_4321 ) + +*to* the application of `f` to `4` as its "continue" handler. The application of `f` +to `4` can decide whether this handler, or the other, "abort" handler, should be +given an argument and constitute its result. + + +I'll say once again: we're using temporally-loaded vocabulary throughout this, +but really all we're in a position to mean by that are claims about the result +of the complex expression semantically depending only on this, not on that. A +demon evaluator who custom-picked the evaluation order to make things maximally +bad for you could ensure that all the semantically unnecessary computations got +evaluated anyway. We don't have any way to prevent that. Later, +we'll see ways to *semantically guarantee* one evaluation order rather than +another. Though even then the demonic evaluation-order-chooser could make it +take unnecessarily long to compute the semantically guaranteed result. Of +course, in any real computing environment you'll know you're dealing with a +fixed evaluation order and you'll be able to program efficiently around that. + +In detail, then, here's what our v5 lists will look like: + + let empty = \f z continue_handler abort_handler. continue_handler z in + let make_list = \h t. \f z continue_handler abort_handler. + t f z (\sofar. f h sofar continue_handler abort_handler) abort_handler in + let isempty = \lst larger_computation. lst + ; here's our f + (\hd sofar continue_handler abort_handler. abort_handler false) + ; here's our z + true + ; here's the continue_handler for the leftmost application of f + larger_computation + ; here's the abort_handler + larger_computation in + let extract_head = \lst larger_computation. lst + ; here's our f + (\hd sofar continue_handler abort_handler. continue_handler hd) + ; here's our z + junk + ; here's the continue_handler for the leftmost application of f + larger_computation + ; here's the abort_handler + larger_computation in + let extract_tail = ; left as exercise + +These functions are used like this: + + let my_list = make_list a (make_list b (make_list c empty) in + extract_head my_list larger_computation + +If you just want to see `my_list`'s head, the use `I` as the +`larger_computation`. + +What we've done here does take some work to follow. But it should be within +your reach. And once you have followed it, you'll be well on your way to +appreciating the full terrible power of continuations. + + + +Of course, like everything elegant and exciting in this seminar, [Oleg +discusses it in much more +detail](http://okmij.org/ftp/Streams.html#enumerator-stream). + +*Comments*: + +1. The technique deployed here, and in the v2 lists, and in our implementations + of pairs and booleans, is known as **continuation-passing style** programming. + +2. We're still building the list as a right fold, so in a sense the + application of `f` to the leftmost element `5` is "outermost". However, + this "outermost" application is getting lifted, and passed as a *handler* + to the next right application. Which is in turn getting lifted, and + passed to its next right application, and so on. So if you + trace the evaluation of the `extract_head` function to the list `[5;4;3;2;1]`, + you'll see `1` gets passed as a "this is the head sofar" answer to its + `continue_handler`; then that answer is discarded and `2` is + passed as a "this is the head sofar" answer to *its* `continue_handler`, + and so on. All those steps have to be evaluated to finally get the result + that `5` is the outer/leftmost head of the list. That's not an efficient way + to get the leftmost head. + + We could improve this by building lists as left folds when implementing them + as continuation-passing style folds. We'd just replace above: + + let make_list = \h t. \f z continue_handler abort_handler. + f h z (\z. t f z continue_handler abort_handler) abort_handler + + now `extract_head` should return the leftmost head directly, using its `abort_handler`: + + let extract_head = \lst larger_computation. lst + (\hd sofar continue_handler abort_handler. abort_handler hd) + junk + larger_computation + larger_computation + +3. To extract tails efficiently, too, it'd be nice to fuse the apparatus developed + in these v5 lists with the ideas from [v4](/advanced/#v4) lists. + But that also is left as an exercise.