X-Git-Url: http://lambda.jimpryor.net/git/gitweb.cgi?p=lambda.git;a=blobdiff_plain;f=topics%2F_week4_fixed_point_combinator.mdwn;h=c93a16ba2c887ff8fb64bc8c320f3e8d99fb3bff;hp=bc714f374dd84d884accdff0c06c604180aa39a0;hb=ebbcaf727cfdc7d8f183a2b6af0b77262efb691f;hpb=31cebc8050836005ee17dd1d20ae81b2ab9afa3c diff --git a/topics/_week4_fixed_point_combinator.mdwn b/topics/_week4_fixed_point_combinator.mdwn index bc714f37..c93a16ba 100644 --- a/topics/_week4_fixed_point_combinator.mdwn +++ b/topics/_week4_fixed_point_combinator.mdwn @@ -19,15 +19,15 @@ How could we compute the length of a list? Without worrying yet about what lambd In OCaml, you'd define that like this: - let rec get_length = fun lst -> - if lst == [] then 0 else 1 + get_length (tail lst) - in ... (* here you go on to use the function "get_length" *) + let rec length = fun lst -> + if lst == [] then 0 else 1 + length (tail lst) + in ... (* here you go on to use the function "length" *) In Scheme you'd define it like this: - (letrec [(get_length - (lambda (lst) (if (null? lst) 0 [+ 1 (get_length (cdr lst))] )) )] - ... ; here you go on to use the function "get_length" + (letrec [(length + (lambda (lst) (if (null? lst) 0 [+ 1 (length (cdr lst))] )) )] + ... ; here you go on to use the function "length" ) Some comments on this: @@ -36,67 +36,67 @@ Some comments on this: 2. `cdr` is function that gets the tail of a Scheme list. (By definition, it's the function for getting the second member of an ordered pair. It just turns out to return the tail of a list because of the particular way Scheme implements lists.) -3. I use `get_length` instead of the convention we've been following so far of hyphenated names, as in `make-list`, because we're discussing OCaml code here, too, and OCaml doesn't permit the hyphenated variable names. OCaml requires variables to always start with a lower-case letter (or `_`), and then continue with only letters, numbers, `_` or `'`. Most other programming languages are similar. Scheme is very relaxed, and permits you to use `-`, `?`, `/`, and all sorts of other crazy characters in your variable names. +3. I use `length` instead of the convention we've been following so far of hyphenated names, as in `make-list`, because we're discussing OCaml code here, too, and OCaml doesn't permit the hyphenated variable names. OCaml requires variables to always start with a lower-case letter (or `_`), and then continue with only letters, numbers, `_` or `'`. Most other programming languages are similar. Scheme is very relaxed, and permits you to use `-`, `?`, `/`, and all sorts of other crazy characters in your variable names. 4. I alternate between `[ ]`s and `( )`s in the Scheme code just to make it more readable. These have no syntactic difference. The main question for us to dwell on here is: What are the `let rec` in the OCaml code and the `letrec` in the Scheme code? -Answer: These work like the `let` expressions we've already seen, except that they let you use the variable `get_length` *inside* the body of the function being bound to it---with the understanding that it will there refer to the same function that you're then in the process of binding to `get_length`. So our recursively-defined function works the way we'd expect it to. In OCaml: +Answer: These work like the `let` expressions we've already seen, except that they let you use the variable `length` *inside* the body of the function being bound to it---with the understanding that it will there refer to the same function that you're then in the process of binding to `length`. So our recursively-defined function works the way we'd expect it to. In OCaml: - let rec get_length = fun lst -> - if lst == [] then 0 else 1 + get_length (tail lst) - in get_length [20; 30] + let rec length = fun lst -> + if lst == [] then 0 else 1 + length (tail lst) + in length [20; 30] (* this evaluates to 2 *) In Scheme: - (letrec [(get_length - (lambda (lst) (if (null? lst) 0 [+ 1 (get_length (cdr lst))] )) )] - (get_length (list 20 30))) + (letrec [(length + (lambda (lst) (if (null? lst) 0 [+ 1 (length (cdr lst))] )) )] + (length (list 20 30))) ; this evaluates to 2 If you instead use an ordinary `let` (or `let*`), here's what would happen, in OCaml: - let get_length = fun lst -> - if lst == [] then 0 else 1 + get_length (tail lst) - in get_length [20; 30] + let length = fun lst -> + if lst == [] then 0 else 1 + length (tail lst) + in length [20; 30] (* fails with error "Unbound value length" *) Here's Scheme: - (let* [(get_length - (lambda (lst) (if (null? lst) 0 [+ 1 (get_length (cdr lst))] )) )] - (get_length (list 20 30))) - ; fails with error "reference to undefined identifier: get_length" + (let* [(length + (lambda (lst) (if (null? lst) 0 [+ 1 (length (cdr lst))] )) )] + (length (list 20 30))) + ; fails with error "reference to undefined identifier: length" Why? Because we said that constructions of this form: - let get_length = A + let length = A in B really were just another way of saying: - (\get_length. B) A + (\length. B) A -and so the occurrences of `get_length` in A *aren't bound by the `\get_length` that wraps B*. Those occurrences are free. +and so the occurrences of `length` in A *aren't bound by the `\length` that wraps B*. Those occurrences are free. -We can verify this by wrapping the whole expression in a more outer binding of `get_length` to some other function, say the constant function from any list to the integer 99: +We can verify this by wrapping the whole expression in a more outer binding of `length` to some other function, say the constant function from any list to the integer 99: - let get_length = fun lst -> 99 - in let get_length = fun lst -> - if lst == [] then 0 else 1 + get_length (tail lst) - in get_length [20; 30] + let length = fun lst -> 99 + in let length = fun lst -> + if lst == [] then 0 else 1 + length (tail lst) + in length [20; 30] (* evaluates to 1 + 99 *) -Here the use of `get_length` in `1 + get_length (tail lst)` can clearly be seen to be bound by the outermost `let`. +Here the use of `length` in `1 + length (tail lst)` can clearly be seen to be bound by the outermost `let`. -And indeed, if you tried to define `get_length` in the lambda calculus, how would you do it? +And indeed, if you tried to define `length` in the lambda calculus, how would you do it? - \lst. (isempty lst) zero (add one (get_length (extract-tail lst))) + \lst. (isempty lst) zero (add one (length (extract-tail lst))) -We've defined all of `isempty`, `zero`, `add`, `one`, and `extract-tail` in earlier discussion. But what about `get_length`? That's not yet defined! In fact, that's the very formula we're trying here to specify. +We've defined all of `isempty`, `zero`, `add`, `one`, and `extract-tail` in earlier discussion. But what about `length`? That's not yet defined! In fact, that's the very formula we're trying here to specify. What we really want to do is something like this: @@ -116,14 +116,14 @@ So how could we do it? And how do OCaml and Scheme manage to do it, with their ` 2. If you tried this in Scheme: - (define get_length - (lambda (lst) (if (null? lst) 0 [+ 1 (get_length (cdr lst))] )) ) + (define length + (lambda (lst) (if (null? lst) 0 [+ 1 (length (cdr lst))] )) ) - (get_length (list 20 30)) + (length (list 20 30)) You'd find that it works! This is because `define` in Scheme is really shorthand for `letrec`, not for plain `let` or `let*`. So we should regard this as cheating, too. -3. In fact, it *is* possible to define the `get_length` function in the lambda calculus despite these obstacles. This depends on using the "version 3" implementation of lists, and exploiting its internal structure: that it takes a function and a base value and returns the result of folding that function over the list, with that base value. So we could use this as a definition of `get_length`: +3. In fact, it *is* possible to define the `length` function in the lambda calculus despite these obstacles. This depends on using the "version 3" implementation of lists, and exploiting its internal structure: that it takes a function and a base value and returns the result of folding that function over the list, with that base value. So we could use this as a definition of `length`: \lst. lst (\x sofar. successor sofar) zero @@ -134,9 +134,9 @@ lists and numbers. The reason we can do this is that our "version 3," fold-based implementation of lists, and Church's implementations of numbers, have a internal structure that *mirrors* the common recursive operations we'd use lists and numbers for. In a sense, the recursive -structure of the `get_length` operation is built into the data +structure of the `length` operation is built into the data structure we are using to represent the list. The non-recursive -version of get_length exploits this embedding of the recursion into +version of length exploits this embedding of the recursion into the data type. This is one of the themes of the course: using data structures to @@ -183,9 +183,9 @@ But functions like the Ackermann function require us to develop a more general t ###Fixed points### -In general, we call a **fixed point** of a function f any value *x* -such that f x is equivalent to *x*. For example, -consider the squaring function `sqare` that maps natural numbers to their squares. +In general, a **fixed point** of a function `f` is any value `x` +such that `f x` is equivalent to `x`. For example, +consider the squaring function `square` that maps natural numbers to their squares. `square 2 = 4`, so `2` is not a fixed point. But `square 1 = 1`, so `1` is a fixed point of the squaring function. @@ -193,7 +193,7 @@ There are many beautiful theorems guaranteeing the existence of a fixed point for various classes of interesting functions. For instance, imainge that you are looking at a map of Manhattan, and you are standing somewhere in Manhattan. The the [[!wikipedia Brouwer -fixed point]] guarantees that there is a spot on the map that is +fixed-point theorem]] guarantees that there is a spot on the map that is directly above the corresponding spot in Manhattan. It's the spot where the blue you-are-here dot should be. @@ -204,13 +204,13 @@ attention to the natural numbers, then this function has no fixed point. (See the discussion below concerning a way of understanding the successor function on which it does have a fixed point.) -In the lambda calculus, we say a fixed point of an expression `f` is any formula `X` such that: +In the lambda calculus, we say a fixed point of a term `f` is any term `X` such that: X <~~> f X You should be able to immediately provide a fixed point of the -identity combinator I. In fact, you should be able to provide a whole -bunch of distinct fixed points. +identity combinator I. In fact, you should be able to provide a +whole bunch of distinct fixed points. With a little thought, you should be able to provide a fixed point of the false combinator, KI. Here's how to find it: recall that KI @@ -227,13 +227,13 @@ a fixed point. In fact, it will have infinitely many, non-equivalent fixed points. And we don't just know that they exist: for any given formula, we can explicit define many of them. -Yes, even the formula that you're using the define the successor -function will have a fixed point. Isn't that weird? Think about how it -might be true. We'll return to this point below. +Yes, as we've mentioned, even the formula that you're using the define +the successor function will have a fixed point. Isn't that weird? +Think about how it might be true. We'll return to this point below. -###How fixed points help definie recursive functions### +###How fixed points help define recursive functions### -Recall our initial, abortive attempt above to define the `get_length` function in the lambda calculus. We said "What we really want to do is something like this: +Recall our initial, abortive attempt above to define the `length` function in the lambda calculus. We said "What we really want to do is something like this: \list. if empty list then zero else add one (... (tail lst)) @@ -249,29 +249,34 @@ it's not complete, since we don't know what value to use for the symbol `length`. Technically, it has the status of an unbound variable. -Imagine now binding the mysterious variable: +Imagine now binding the mysterious variable, and calling the resulting +function `h`: h := \length \list . if empty list then zero else add one (length (tail list)) Now we have no unbound variables, and we have complete non-recursive definitions of each of the other symbols. -Let's call this function `h`. Then `h` takes an argument, and returns -a function that accurately computes the length of a list---as long as -the argument we supply is already the length function we are trying to -define. (Dehydrated water: to reconstitute, just add water!) +So `h` takes an argument, and returns a function that accurately +computes the length of a list---as long as the argument we supply is +already the length function we are trying to define. (Dehydrated +water: to reconstitute, just add water!) -But this is just another way of saying that we are looking for a fixed point. -Assume that `h` has a fixed point, call it `LEN`. To say that `LEN` -is a fixed point means that +Here is where the discussion of fixed points becomes relevant. Saying +that `h` is looking for an argument (call it `LEN`) that has the same +behavior as the result of applying `h` to `LEN` is just another way of +saying that we are looking for a fixed point for `h`. h LEN <~~> LEN -But this means that +Replacing `h` with its definition, we have (\list . if empty list then zero else add one (LEN (tail list))) <~~> LEN -So at this point, we are going to search for fixed point. +If we can find a value for `LEN` that satisfies this constraint, we'll +have a function we can use to compute the length of an arbitrary list. +All we have to do is find a fixed point for `h`. + The strategy we will present will turn out to be a general way of finding a fixed point for any lambda term. @@ -285,22 +290,24 @@ work, but examining the way in which it fails will lead to a solution. h h <~~> \list . if empty list then zero else 1 + h (tail list) -There's a problem. The diagnosis is that in the subexpression `h -(tail list)`, we've applied `h` to a list, but `h` expects as its -first argument the length function. +The problem is that in the subexpression `h (tail list)`, we've +applied `h` to a list, but `h` expects as its first argument the +length function. So let's adjust h, calling the adjusted function H: H = \h \list . if empty list then zero else one plus ((h h) (tail list)) -This is the key creative step. Since `h` is expecting a -length-computing function as its first argument, the adjustment -tries supplying the closest candidate avaiable, namely, `h` itself. +This is the key creative step. Instead of applying `h` to a list, we +apply it first to itself. After applying `h` to an argument, it's +ready to apply to a list, so we've solved the problem just noted. +We're not done yet, of course; we don't yet know what argument to give +to `H` that will behave in the desired way. -We now reason about `H`. What exactly is H expecting as its first -argument? Based on the excerpt `(h h) (tail l)`, it appears that `H`'s -argument, `h`, should be a function that is ready to take itself as an -argument, and that returns a function that takes a list as an +So let's reason about `H`. What exactly is H expecting as its first +argument? Based on the excerpt `(h h) (tail l)`, it appears that +`H`'s argument, `h`, should be a function that is ready to take itself +as an argument, and that returns a function that takes a list as an argument. `H` itself fits the bill: H H <~~> (\h \list . if empty list then zero else 1 + ((h h) (tail list))) H @@ -322,11 +329,9 @@ finite lambda term with no undefined symbols. Since `H H` turns out to be the length function, we can think of `H` by itself as half of the length function (which is why we called it -`H`, of course). Given the implementation of addition as function -application for Church numerals, this (H H) is quite literally H + H. -Can you think up a recursion strategy that involves "dividing" the -recursive function into equal thirds `T`, such that the length -function <~~> T T T? +`H`, of course). Can you think up a recursion strategy that involves +"dividing" the recursive function into equal thirds `T`, such that the +length function <~~> T T T? We've starting with a particular recursive definition, and arrived at a fixed point for that definition. @@ -352,9 +357,67 @@ That is, Yh is a fixed point for h. Works! +Let's do one more example to illustrate. We'll do `K`, since we +wondered above whether it had a fixed point. + +Before we begin, we can reason a bit about what the fixed point must +be like. We're looking for a fixed point for `K`, i.e., `\xy.x`. `K` +ignores its second argument. That means that no matter what we give +`K` as its first argument, the result will ignore the next argument +(that is, `KX` ignores its first argument, no matter what `X` is). So +if `KX <~~> X`, `X` had also better ignore its first argument. But we +also have `KX == (\xy.x)X ~~> \y.X`. This means that if `X` ignores +its first argument, then `\y.X` will ignore its first two arguments. +So once again, if `KX <~~> X`, `X` also had better ignore at least its +first two arguments. Repeating this reasoning, we realize that `X` +must be a function that ignores an infinite series of arguments. +Our expectation, then, is that our recipe for finding fixed points +will build us a function that somehow manages to ignore an infinite +series of arguments. + + h := \xy.x + H := \f.h(ff) == \f.(\xy.x)(ff) ~~> \fy.ff + H H := (\fy.ff)(\fy.ff) ~~> \y.(\fy.ff)(\fy.ff) + +Let's check that it is in fact a fixed point: + + K(H H) == (\xy.x)((\fy.ff)(\fy.ff) + ~~> \y.(\fy.ff)(\fy.ff) + +Yep, `H H` and `K(H H)` both reduce to the same term. + +To see what this fixed point does, let's reduce it a bit more: + + H H == (\fy.ff)(\fy.ff) + ~~> \y.(\fy.ff)(\fy.ff) + ~~> \yy.(\fy.ff)(\fy.ff) + ~~> \yyy.(\fy.ff)(\fy.ff) + +Sure enough, this fixed point ignores an endless, infinite series of +arguments. It's a write-only memory, a black hole. + +Now that we have one fixed point, we can find others, for instance, + + (\fy.fff)(\fy.fff) + ~~> \y.(\fy.fff)(\fy.fff)(\fy.fff) + ~~> \yy.(\fy.fff)(\fy.fff)(\fy.fff)(\fy.fff) + ~~> \yyy.(\fy.fff)(\fy.fff)(\fy.fff)(\fy.fff)(\fy.fff) + +Continuing in this way, you can now find an infinite number of fixed +points, all of which have the crucial property of ignoring an infinite +series of arguments. ##What is a fixed point for the successor function?## +As we've seen, the recipe just given for finding a fixed point worked +great for our `h`, which we wrote as a definition for the length +function. But the recipe doesn't make any assumptions about the +internal structure of the function it works with. That means it can +find a fixed point for literally any function whatsoever. + +In particular, what could the fixed point for the +successor function possibly be like? + Well, you might think, only some of the formulas that we might give to the `successor` as arguments would really represent numbers. If we said something like: successor make-pair @@ -363,18 +426,20 @@ who knows what we'd get back? Perhaps there's some non-number-representing formu Yes! That's exactly right. And which formula this is will depend on the particular way you've implemented the successor function. -Moreover, the recipes that enable us to name fixed points for any -given formula aren't *guaranteed* to give us *terminating* fixed -points. They might give us formulas X such that neither `X` nor `f X` -have normal forms. (Indeed, what they give us for the square function -isn't any of the Church numerals, but is rather an expression with no -normal form.) However, if we take care we can ensure that we *do* get -terminating fixed points. And this gives us a principled, fully -general strategy for doing recursion. It lets us define even functions -like the Ackermann function, which were until now out of our reach. It -would also let us define arithmetic and list functions on the "version -1" and "version 2" implementations, where it wasn't always clear how -to force the computation to "keep going." +One (by now obvious) upshot is that the recipes that enable us to name +fixed points for any given formula aren't *guaranteed* to give us +*terminating* fixed points. They might give us formulas X such that +neither `X` nor `f X` have normal forms. (Indeed, what they give us +for the square function isn't any of the Church numerals, but is +rather an expression with no normal form.) However, if we take care we +can ensure that we *do* get terminating fixed points. And this gives +us a principled, fully general strategy for doing recursion. It lets +us define even functions like the Ackermann function, which were until +now out of our reach. It would also let us define arithmetic and list +functions on the "version 1" and "version 2" implementations, where it +wasn't always clear how to force the computation to "keep going." + +###Varieties of fixed-point combinators### OK, so how do we make use of this?