From: Chris
Date: Thu, 12 Feb 2015 15:50:55 +0000 (-0500)
Subject: added discussion of computation
X-Git-Url: http://lambda.jimpryor.net/git/gitweb.cgi?a=commitdiff_plain;ds=sidebyside;h=31cebc8050836005ee17dd1d20ae81b2ab9afa3c;hp=-c;p=lambda.git

added discussion of computation
---

31cebc8050836005ee17dd1d20ae81b2ab9afa3c
diff --combined topics/_week4_fixed_point_combinator.mdwn
index 253fd975,17b7b1c3..bc714f37
--- a/topics/_week4_fixed_point_combinator.mdwn
+++ b/topics/_week4_fixed_point_combinator.mdwn
@@@ -1,5 -1,14 +1,14 @@@
 [[!toc]]

+ #Recursion: fixed points in the lambda calculus#
+ 
+ Sometimes when you type in a web search, Google will suggest
+ alternatives. For instance, if you type in "Lingusitics", it will ask
+ you "Did you mean Linguistics?". But the engineers at Google have
+ added some playfulness to the system. For instance, if you search for
+ "anagram", Google asks you "Did you mean: nag a ram?" And if you
+ search for "recursion", Google asks: "Did you mean: recursion?"
+ 
 ##What is the "rec" part of "letrec" doing?##

 How could we compute the length of a list? Without worrying yet about what lambda-calculus implementation we're using for the list, the basic idea would be to define this recursively:

@@@ -43,11 -52,11 +52,11 @@@ Answer: These work like the `let` expre
 In Scheme:

-     (letrec [(get_length 
+     (letrec [(get_length
               (lambda (lst) (if (null? lst) 0 [+ 1 (get_length (cdr lst))] )) )]
             (get_length (list 20 30)))
     ; this evaluates to 2
- 
+
 If you instead use an ordinary `let` (or `let*`), here's what would happen, in OCaml:

     let get_length = fun lst ->
@@@ -57,7 -66,7 +66,7 @@@ Here's Scheme:

-     (let* [(get_length 
+     (let* [(get_length
              (lambda (lst) (if (null? lst) 0 [+ 1 (get_length (cdr lst))] )) )]
            (get_length (list 20 30)))
     ; fails with error "reference to undefined identifier: get_length"

@@@ -107,9 -116,9 +116,9 @@@ So how could we do it? And how do OCam
 2.  If you tried this in Scheme:

-         (define get_length 
+         (define get_length
                  (lambda (lst) (if (null? lst) 0 [+ 1 (get_length (cdr lst))] )) )
-         
+
         (get_length (list 20 30))

     You'd find that it works! This is because `define` in Scheme is really shorthand for `letrec`, not for plain `let` or `let*`. So we should regard this as cheating, too.

@@@ -151,7 -160,7 +160,7 @@@ an implementation may not terminate doe
 isn't well-defined. The point of interest here is that its definition
 requires recursion in the function definition.)

 -Neither do the resources we've so far developed suffice to define the 
 +Neither do the resources we've so far developed suffice to define the
 [[!wikipedia Ackermann function]]:

     A(m,n) =

@@@ -175,10 -184,10 +184,10 @@@ But functions like the Ackermann functi
 ###Fixed points###

 In general, we call a **fixed point** of a function f any value *x*
 -such that f x is equivalent to *x*. For example, 
 +such that f x is equivalent to *x*. For example,
 consider the squaring function `square` that maps natural numbers to their
 squares. `square 2 = 4`, so `2` is not a fixed point. But `square 1 = 1`, so `1` is a
 -fixed point of the squaring function. 
 +fixed point of the squaring function.

 There are many beautiful theorems guaranteeing the existence of a
 fixed point for various classes of interesting functions. For

@@@ -201,7 -210,7 +210,7 @@@ In the lambda calculus, we say a fixed
 You should be able to immediately provide a fixed point of the
 identity combinator I. In fact, you should be able to provide a whole
 -bunch of distinct fixed points. 
 +bunch of distinct fixed points.

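Before looking at more examples, here is a minimal OCaml sketch (our own illustration, not part of the original page) of where this is heading: `let rec` can be factored into an ordinary `let` plus an explicit fixed-point operator. The helper name `fix` is ours, and the sketch cheats in one place: OCaml's type checker rejects the kind of self-application used later on this page unless you compile with `-rectypes`, so `fix` is itself built with a single `let rec`.

    (* fix f returns a fixed point of f, up to the eta-expansion
       (g x = f g x) that keeps call-by-value evaluation from looping *)
    let fix f = let rec g x = f g x in g

    (* get_length is bound with a plain let; the recursion comes from fix *)
    let get_length =
      fix (fun self lst ->
        match lst with
        | [] -> 0
        | _ :: rest -> 1 + self rest)

    let () = print_int (get_length [20; 30])   (* prints 2 *)
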
With a little thought, you should be able to provide a fixed point of the false combinator, KI. Here's how to find it: recall that KI @@@ -238,7 -247,7 +247,7 @@@ length function. Call that function `l At this point, we have a definition of the length function, though it's not complete, since we don't know what value to use for the symbol `length`. Technically, it has the status of an unbound -variable. +variable. Imagine now binding the mysterious variable: @@@ -252,13 -261,13 +261,13 @@@ a function that accurately computes th the argument we supply is already the length function we are trying to define. (Dehydrated water: to reconstitute, just add water!) -But this is just another way of saying that we are looking for a fixed point. +But this is just another way of saying that we are looking for a fixed point. Assume that `h` has a fixed point, call it `LEN`. To say that `LEN` is a fixed point means that h LEN <~~> LEN -But this means that +But this means that (\list . if empty list then zero else add one (LEN (tail list))) <~~> LEN @@@ -266,7 -275,7 +275,7 @@@ So at this point, we are going to searc The strategy we will present will turn out to be a general way of finding a fixed point for any lambda term. -##Deriving Y, a fixed point combinator## +##Deriving Y, a fixed point combinator## How shall we begin? Well, we need to find an argument to supply to `h`. The argument has to be a function that computes the length of a @@@ -274,11 -283,11 +283,11 @@@ list. The function `h` is *almost* a f length of a list. Let's try applying `h` to itself. It won't quite work, but examining the way in which it fails will lead to a solution. - h h <~~> \list . if empty list then zero else 1 + h (tail list) + h h <~~> \list . if empty list then zero else 1 + h (tail list) There's a problem. The diagnosis is that in the subexpression `h (tail list)`, we've applied `h` to a list, but `h` expects as its -first argument the length function. +first argument the length function. So let's adjust h, calling the adjusted function H: @@@ -297,12 -306,12 +306,12 @@@ argument. `H` itself fits the bill H H <~~> (\h \list . if empty list then zero else 1 + ((h h) (tail list))) H <~~> \list . if empty list then zero else 1 + ((H H) (tail list)) == \list . if empty list then zero else 1 + ((\list . if empty list then zero else 1 + ((H H) (tail list))) (tail list)) - <~~> \list . if empty list then zero + <~~> \list . if empty list then zero else 1 + (if empty (tail list) then zero else 1 + ((H H) (tail (tail list)))) - + We're in business! -How does the recursion work? +How does the recursion work? We've defined `H` in such a way that `H H` turns out to be the length function. In order to evaluate `H H`, we substitute `H` into the body of the lambda term. Inside the lambda term, once the substitution has @@@ -344,7 -353,7 +353,7 @@@ That is, Yh is a fixed point for h Works! -##What is a fixed point for the successor function?## +##What is a fixed point for the successor function?## Well, you might think, only some of the formulas that we might give to the `successor` as arguments would really represent numbers. If we said something like: @@@ -448,9 -457,9 +457,9 @@@ returns itself (a copy of `sink`); if t sink true true false ~~> I sink true true true false ~~> I -So we make `sink = Y (\f b. b f I)`: +So we make `sink = Y (\f b. b f I)`: - 1. sink false + 1. sink false 2. Y (\fb.bfI) false 3. (\f. (\h. f (h h)) (\h. f (h h))) (\fb.bfI) false 4. (\h. [\fb.bfI] (h h)) (\h. 
[\fb.bfI] (h h)) false
@@@ -467,7 -476,7 +476,7 @@@ argument, we can throw it away unreduce

 Now we try the next most complex example:

-     1. sink true false 
+     1. sink true false
      2. Y (\fb.bfI) true false
      3. (\f. (\h. f (h h)) (\h. f (h h))) (\fb.bfI) true false
      4. (\h. [\fb.bfI] (h h)) (\h. [\fb.bfI] (h h)) true false
@@@ -481,7 -490,7 +490,7 @@@ is again I

 You should be able to see that `sink` will consume as many `true`s as
 we throw at it, then turn into the identity function after it
-encounters the first `false`. 
+encounters the first `false`.

 The key to the recursion is that, thanks to Y, the definition of
 `sink` contains within it the ability to fully regenerate itself as
@@@ -533,7 -542,7 +542,7 @@@ true, then (1) is not true

 Without pretending to give a serious analysis of the paradox, let's
 assume that sentences can have for their meaning boolean functions
 like the ones we have been working with here. Then the sentence *John
-is John* might denote the function `\x y. x`, our `true`. 
+is John* might denote the function `\x y. x`, our `true`.

 Then (1) denotes a function from whatever the referent of *this
 sentence* is to a boolean. So (1) denotes `\f. f true false`, where
@@@ -576,7 -585,7 +585,7 @@@ context, (2) might denot

       Y C
       (\f. (\h. f (h h)) (\h. f (h h))) C
-      (\h. C (h h)) (\h. C (h h)) 
+      (\h. C (h h)) (\h. C (h h))
       C ((\h. C (h h)) (\h. C (h h)))
       C (C ((\h. C (h h))(\h. C (h h))))
       C (C (C ((\h. C (h h))(\h. C (h h)))))
@@@ -618,7 -627,7 +627,7 @@@ just this juncture

 One obstacle to thinking this through is the fact that a sentence
 normally has only two truth values. We might consider instead a noun
-phrase such as 
+phrase such as

     (3) the entity that this noun phrase refers to

@@@ -630,7 -639,7 +639,7 @@@ for any object

 The chameleon nature of (3), by the way (a description that is equally
 good at describing any object), makes it particularly well suited as a
-gloss on pronouns such as *it*. In the system of 
+gloss on pronouns such as *it*. In the system of
 [Jacobson 1999](http://www.springerlink.com/content/j706674r4w217jj5/),
 pronouns denote (you guessed it!) identity functions...

diff --combined topics/week3_what_is_computation.mdwn
index 47748a16,79fd3999..32a4572d
--- a/topics/week3_what_is_computation.mdwn
+++ b/topics/week3_what_is_computation.mdwn
@@@ -6,13 -6,26 +6,36 @@@ expression and replacing it with an equ

     3 + 4 == 7

 This equation can be interpreted as expressing the thought that the
+ complex expression `3 + 4` evaluates to `7`. In this case, the
+ evaluation of the expression involves computing a sum. There is a
+ clear sense in which the expression `7` is simpler than the expression
+ `3 + 4`: `7` is syntactically simple, and `3 + 4` is syntactically
+ complex.
+ 
+ Now let's take this folk notion of computation, and put some pressure
+ on it.
+ 
+ It's worth pausing a moment and wondering why replacing a complex
+ expression like `3 + 4` with a simplex expression like `7` feels like
+ we've accomplished something. If they really are equivalent, why
+ shouldn't we consider them to be equally valuable, or even to prefer
+ the longer expression? For instance, should we prefer 2^9, or 512?
+ Likewise, in the realm of logic, why should we ever prefer `B` to the
+ conjunction of `A` with `A --> B`?
+ 
+ The question to ask here is whether our intuitions about what counts
+ as more evaluated always track simplicity of expression, or whether
+ they track what is more useful to us in a given larger situation.
+ 
+ But even deciding which expression ought to count as simpler is not
+ always so clear.

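Before turning to Church arithmetic, here is a small OCaml sketch that makes this folk picture concrete; the type and function names are ours, purely for illustration. An evaluator maps a syntactically complex expression such as `3 + 4` to a syntactically simple one, and many distinct expressions land on the same result.

    (* a toy expression language *)
    type expr =
      | Lit of int
      | Plus of expr * expr

    (* evaluation replaces a complex expression with a simpler
       equivalent: a bare literal *)
    let rec eval (e : expr) : int =
      match e with
      | Lit n -> n
      | Plus (l, r) -> eval l + eval r

    (* distinct expressions evaluate to the same value, so the value
       alone does not record how it was arrived at *)
    let () =
      assert (eval (Plus (Lit 3, Lit 4)) = 7);
      assert (eval (Plus (Lit 5, Lit 2)) = 7)
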
 ##Church arithmetic##

@@@ -64,14 -77,14 +87,14 @@@ But now consider multiplication

 Is the final result simpler? This time, the answer is not so
 straightforward. Compare the starting expression with the final
 expression:

-     * 3 4 
+     * 3 4
      (\lrf.l(rf))(\fz.f(f(fz)))(\fz.f(f(f(fz)))) ~~>
      12
      (\fz.f(f(f(f(f(f(f(f(f(f(f(fz))))))))))))

 And if we choose different numbers, the result is even less clear:

-     * 3 6 
+     * 3 6
      (\lrf.l(rf))(\fz.f(f(fz)))(\fz.f(f(f(f(f(fz)))))) ~~>
      18
      (\fz.f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(fz))))))))))))))))))

@@@ -84,7 -97,7 +107,7 @@@ encoding of `18` is just a uniform sequ
 This example shows that computation can't be just simplicity as
 measured by the number of symbols in the representation. There is
 still some sense in which the evaluated expression is simpler, but the
- right way to characterize simpler is elusive.
+ right way to characterize "simpler" is elusive.

 One possibility is to define "simpler" in terms of irreversibility. The
 reduction rules of the lambda calculus define an asymmetric relation
@@@ -97,16 -110,16 +120,22 @@@ that reduce to that term

     (y((\xx)y)) ~~> yy etc.

+ Likewise, in the arithmetic example, there is only one number that
+ corresponds to the sum of 3 and 4 (namely, 7). But there are many
+ sums that add up to 7: 3+4, 4+3, 5+2, 2+5, 6+1, 1+6, etc.

 So the unevaluated expression contains information that is missing
 from the evaluated value: information about *how* that value was
 arrived at. This suggests the following way of thinking about what
 counts as evaluated:

-     Given two expressions such that one reduced to the other, 
+     Given two expressions such that one reduces to the other,
      the more evaluated one is the one that contains less information.

 This definition is problematic, though, if we try to define the amount
@@@ -124,12 -137,14 +153,14 @@@ pathological examples where the result

 In this example, reduction returns the exact same lambda term. There
 is no simplification at all.

-     (\x.xxx)(\x.xxx) ~~> ((\x.xxx)(\x.xxx)(\x.xxx)) 
+     (\x.xxx)(\x.xxx) ~~> ((\x.xxx)(\x.xxx)(\x.xxx))

 Even worse, in this case, the "reduced" form is longer and more
 complex by any measure.

+ We may have to settle for the idea that a well-chosen reduction system
+ will characterize our intuitive notion of evaluation in most cases, or
+ in some useful class of non-pathological cases.
+ 
 These are some of the deeper issues to keep in mind as we discuss the
 ins and outs of reduction strategies.
- 
-
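
As a concrete check on the multiplication example above, here is a short OCaml sketch of Church numerals; the encoding and the helper `to_int` are ours, spelling out the text's `\l r f. l (r f)` so you can watch `* 3 4` evaluate to the Church numeral for 12.

    (* a Church numeral n applies its first argument n times *)
    let three = fun f z -> f (f (f z))
    let four  = fun f z -> f (f (f (f z)))

    (* the multiplication combinator \l r f. l (r f) from the text *)
    let mult l r = fun f -> l (r f)

    (* read a Church numeral back as an ordinary int by counting *)
    let to_int n = n (fun k -> k + 1) 0

    let () = print_int (to_int (mult three four))   (* prints 12 *)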