added discussion of computation

author Chris <chris.barker@nyu.edu>

Thu, 12 Feb 2015 15:50:55 +0000 (10:50 -0500)

committer Chris <chris.barker@nyu.edu>

Thu, 12 Feb 2015 15:50:55 +0000 (10:50 -0500)
author Chris <chris.barker@nyu.edu>
Thu, 12 Feb 2015 15:50:55 +0000 (10:50 -0500)
committer Chris <chris.barker@nyu.edu>
Thu, 12 Feb 2015 15:50:55 +0000 (10:50 -0500)
diff --combined topics/_week4_fixed_point_combinator.mdwn

index 253fd97,17b7b1c..bc714f3
--- 1/topics/_week4_fixed_point_combinator.mdwn
--- 2/topics/_week4_fixed_point_combinator.mdwn
+++ b/topics/_week4_fixed_point_combinator.mdwn
@@@ -1,5 -1,14 +1,14 @@@
   [[!toc]]
   
+ #Recursion: fixed points in the lambda calculus##
+ 
+ Sometimes when you type in a web search, Google will suggest
+ alternatives.  For instance, if you type in "Lingusitics", it will ask
+ you "Did you mean Linguistics?".  But the engineers at Google have
+ added some playfulness to the system.  For instance, if you search for
+ "anagram", Google asks you "Did you mean: nag a ram?"  And if you
+ search for "recursion", Google asks: "Did you mean: recursion?"
+ 
   ##What is the "rec" part of "letrec" doing?##
   
   How could we compute the length of a list? Without worrying yet about what lambda-calculus implementation we're using for the list, the basic idea would be to define this recursively:
@@@ -43,11 -52,11 +52,11 @@@ Answer: These work like the `let` expre
   
   In Scheme:
   
- -      (letrec [(get_length 
+ +      (letrec [(get_length
                                 (lambda (lst) (if (null? lst) 0 [+ 1 (get_length (cdr lst))] )) )]
                         (get_length (list 20 30)))
         ; this evaluates to 2
- -      
+ +
   If you instead use an ordinary `let` (or `let*`), here's what would happen, in OCaml:
   
         let get_length = fun lst ->
@@@ -57,7 -66,7 +66,7 @@@
   
   Here's Scheme:
   
- -      (let* [(get_length 
+ +      (let* [(get_length
                                 (lambda (lst) (if (null? lst) 0 [+ 1 (get_length (cdr lst))] )) )]
                         (get_length (list 20 30)))
         ; fails with error "reference to undefined identifier: get_length"
@@@ -107,9 -116,9 +116,9 @@@ So how could we do it? And how do OCam
   
   2.    If you tried this in Scheme:
   
- -              (define get_length 
+ +              (define get_length
                                 (lambda (lst) (if (null? lst) 0 [+ 1 (get_length (cdr lst))] )) )
- -              
+ +
                 (get_length (list 20 30))
   
         You'd find that it works! This is because `define` in Scheme is really shorthand for `letrec`, not for plain `let` or `let*`. So we should regard this as cheating, too.
@@@ -151,7 -160,7 +160,7 @@@ an implementation may not terminate doe
   isn't well-defined.  The point of interest here is that its definition
   requires recursion in the function definition.)
   
- -Neither do the resources we've so far developed suffice to define the 
+ +Neither do the resources we've so far developed suffice to define the
   [[!wikipedia Ackermann function]]:
   
         A(m,n) =
@@@ -175,10 -184,10 +184,10 @@@ But functions like the Ackermann functi
   ###Fixed points###
   
   In general, we call a **fixed point** of a function f any value *x*
- -such that f <em>x</em> is equivalent to *x*. For example, 
+ +such that f <em>x</em> is equivalent to *x*. For example,
   consider the squaring function `sqare` that maps natural numbers to their squares.
   `square 2 = 4`, so `2` is not a fixed point.  But `square 1 = 1`, so `1` is a
- -fixed point of the squaring function.  
+ +fixed point of the squaring function.
   
   There are many beautiful theorems guaranteeing the existence of a
   fixed point for various classes of interesting functions.  For
@@@ -201,7 -210,7 +210,7 @@@ In the lambda calculus, we say a fixed 
   
   You should be able to immediately provide a fixed point of the
   identity combinator I.  In fact, you should be able to provide a whole
- -bunch of distinct fixed points.  
+ +bunch of distinct fixed points.
   
   With a little thought, you should be able to provide a fixed point of
   the false combinator, KI.  Here's how to find it: recall that KI
@@@ -238,7 -247,7 +247,7 @@@ length function.  Call that function `l
   At this point, we have a definition of the length function, though
   it's not complete, since we don't know what value to use for the
   symbol `length`.  Technically, it has the status of an unbound
- -variable.  
+ +variable.
   
   Imagine now binding the mysterious variable:
   
@@@ -252,13 -261,13 +261,13 @@@ a function that accurately computes th
   the argument we supply is already the length function we are trying to
   define.  (Dehydrated water: to reconstitute, just add water!)
   
- -But this is just another way of saying that we are looking for a fixed point.  
+ +But this is just another way of saying that we are looking for a fixed point.
   Assume that `h` has a fixed point, call it `LEN`.  To say that `LEN`
   is a fixed point means that
   
       h LEN <~~> LEN
   
- -But this means that 
+ +But this means that
   
       (\list . if empty list then zero else add one (LEN (tail list))) <~~> LEN
   
@@@ -266,7 -275,7 +275,7 @@@ So at this point, we are going to searc
   The strategy we will present will turn out to be a general way of
   finding a fixed point for any lambda term.
   
- -##Deriving Y, a fixed point combinator## 
+ +##Deriving Y, a fixed point combinator##
   
   How shall we begin?  Well, we need to find an argument to supply to
   `h`.  The argument has to be a function that computes the length of a
@@@ -274,11 -283,11 +283,11 @@@ list.  The function `h` is *almost* a f
   length of a list.  Let's try applying `h` to itself.  It won't quite
   work, but examining the way in which it fails will lead to a solution.
   
- -    h h <~~> \list . if empty list then zero else 1 + h (tail list) 
+ +    h h <~~> \list . if empty list then zero else 1 + h (tail list)
   
   There's a problem.  The diagnosis is that in the subexpression `h
   (tail list)`, we've applied `h` to a list, but `h` expects as its
- -first argument the length function.  
+ +first argument the length function.
   
   So let's adjust h, calling the adjusted function H:
   
@@@ -297,12 -306,12 +306,12 @@@ argument.  `H` itself fits the bill
       H H <~~> (\h \list . if empty list then zero else 1 + ((h h) (tail list))) H
           <~~> \list . if empty list then zero else 1 + ((H H) (tail list))
           == \list . if empty list then zero else 1 + ((\list . if empty list then zero else 1 + ((H H) (tail list))) (tail list))
- -        <~~> \list . if empty list then zero 
+ +        <~~> \list . if empty list then zero
                       else 1 + (if empty (tail list) then zero else 1 + ((H H) (tail (tail list))))
- -   
+ +
   We're in business!
   
- -How does the recursion work?  
+ +How does the recursion work?
   We've defined `H` in such a way that `H H` turns out to be the length function.
   In order to evaluate `H H`, we substitute `H` into the body of the
   lambda term.  Inside the lambda term, once the substitution has
@@@ -344,7 -353,7 +353,7 @@@ That is, Yh is a fixed point for h
   Works!
   
   
- -##What is a fixed point for the successor function?## 
+ +##What is a fixed point for the successor function?##
   
   Well, you might think, only some of the formulas that we might give to the `successor` as arguments would really represent numbers. If we said something like:
   
@@@ -448,9 -457,9 +457,9 @@@ returns itself (a copy of `sink`); if t
       sink true true false ~~> I
       sink true true true false ~~> I
   
- -So we make `sink = Y (\f b. b f I)`: 
+ +So we make `sink = Y (\f b. b f I)`:
   
- -    1. sink false 
+ +    1. sink false
       2. Y (\fb.bfI) false
       3. (\f. (\h. f (h h)) (\h. f (h h))) (\fb.bfI) false
       4. (\h. [\fb.bfI] (h h)) (\h. [\fb.bfI] (h h)) false
@@@ -467,7 -476,7 +476,7 @@@ argument, we can throw it away unreduce
   
   Now we try the next most complex example:
   
- -    1. sink true false 
+ +    1. sink true false
       2. Y (\fb.bfI) true false
       3. (\f. (\h. f (h h)) (\h. f (h h))) (\fb.bfI) true false
       4. (\h. [\fb.bfI] (h h)) (\h. [\fb.bfI] (h h)) true false
@@@ -481,7 -490,7 +490,7 @@@ is again I
   
   You should be able to see that `sink` will consume as many `true`s as
   we throw at it, then turn into the identity function after it
- -encounters the first `false`. 
+ +encounters the first `false`.
   
   The key to the recursion is that, thanks to Y, the definition of
   `sink` contains within it the ability to fully regenerate itself as
@@@ -533,7 -542,7 +542,7 @@@ true, then (1) is not true
   Without pretending to give a serious analysis of the paradox, let's
   assume that sentences can have for their meaning boolean functions
   like the ones we have been working with here.  Then the sentence *John
- -is John* might denote the function `\x y. x`, our `true`.  
+ +is John* might denote the function `\x y. x`, our `true`.
   
   Then (1) denotes a function from whatever the referent of *this
   sentence* is to a boolean.  So (1) denotes `\f. f true false`, where
@@@ -576,7 -585,7 +585,7 @@@ context, (2) might denot
   
        Y C
        (\f. (\h. f (h h)) (\h. f (h h))) I
- -     (\h. C (h h)) (\h. C (h h))) 
+ +     (\h. C (h h)) (\h. C (h h)))
        C ((\h. C (h h)) (\h. C (h h)))
        C (C ((\h. C (h h))(\h. C (h h))))
        C (C (C ((\h. C (h h))(\h. C (h h)))))
@@@ -618,7 -627,7 +627,7 @@@ just this juncture
   
   One obstacle to thinking this through is the fact that a sentence
   normally has only two truth values.  We might consider instead a noun
- -phrase such as 
+ +phrase such as
   
   (3)  the entity that this noun phrase refers to
   
@@@ -630,7 -639,7 +639,7 @@@ for any object
   
   The chameleon nature of (3), by the way (a description that is equally
   good at describing any object), makes it particularly well suited as a
- -gloss on pronouns such as *it*.  In the system of 
+ +gloss on pronouns such as *it*.  In the system of
   [Jacobson 1999](http://www.springerlink.com/content/j706674r4w217jj5/),
   pronouns denote (you guessed it!) identity functions...
   
diff --combined topics/week3_what_is_computation.mdwn

index 47748a1,79fd399..32a4572
--- 1/topics/_week3_what_is_computation.mdwn
--- 2/topics/week3_what_is_computation.mdwn
+++ b/topics/week3_what_is_computation.mdwn
@@@ -6,13 -6,26 +6,36 @@@ expression and replacing it with an equ
       3 + 4 == 7
   
   This equation can be interpreted as expressing the thought that the
++<<<<<<< HEAD:topics/_week3_what_is_computation.mdwn
+ +complex expression `3 + 4` evaluates to `7`.  The evaluation of the
+ +expression computing a sum.  There is a clear sense in which the
+ +expression `7` is simpler than the expression `3 + 4`: `7` is
+ +syntactically simple, and `3 + 4` is syntactically complex.
+ +
+ +Now let's take this folk notion of computation, and put some pressure
+ +on it.
++=======
+ complex expression `3 + 4` evaluates to `7`.  In this case, the
+ evaluation of the expression involves computing a sum.  There is a
+ clear sense in which the expression `7` is simpler than the expression
+ `3 + 4`: `7` is syntactically simple, and `3 + 4` is syntactically
+ complex.
+ 
+ It's worth pausing a moment and wondering why we feel that replacing a
+ complex expression like `3 + 4` with a simplex expression like `7`
+ feels like we've accomplished something.  If they really are
+ equivalent, why shouldn't we consider them to be equally valuable, or
+ even to prefer the longer expression?  For instance, should we prefer
+ 2^9, or 512?  Likewise, in the realm of logic, why shold we ever
+ prefer `B` to the conjunction of `A` with `A --> B`?
+ 
+ The question to ask here is whether our intuitions about what counts
+ as more evaluated always tracks simplicity of expression, or whether
+ it tracks what is more useful to us in a given larger situation.
+ 
+ But even deciding which expression ought to count as simpler is not
+ always so clear.
++>>>>>>> working:topics/week3_what_is_computation.mdwn
   
   ##Church arithmetic##
   
@@@ -64,14 -77,14 +87,14 @@@ But now consider multiplication
   Is the final result simpler?  This time, the answer is not so straightfoward.
   Compare the starting expression with the final expression:
   
- -        *           3             4 
+ +        *           3             4
           (\lrf.l(rf))(\fz.f(f(fz)))(\fz.f(f(f(fz))))
       ~~> 12
           (\fz.f(f(f(f(f(f(f(f(f(f(f(fz))))))))))))
   
   And if we choose different numbers, the result is even less clear:
   
- -        *           3             6 
+ +        *           3             6
           (\lrf.l(rf))(\fz.f(f(fz)))(\fz.f(f(f(f(f(fz))))))
       ~~> 18
           (\fz.f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(fz))))))))))))))))))
@@@ -84,7 -97,7 +107,7 @@@ encoding of `18` is just a uniform sequ
   This example shows that computation can't be just simplicity as
   measured by the number of symbols in the representation.  There is
   still some sense in which the evaluated expression is simpler, but the
- right way to characterize simpler is elusive.
+ right way to characterize "simpler" is elusive.
   
   One possibility is to define simpler in terms of irreversability.  The
   reduction rules of the lambda calculus define an asymmetric relation
@@@ -97,16 -110,16 +120,22 @@@ that reduce to that term
       (y((\xx)y)) ~~> yy
       etc.
   
++<<<<<<< HEAD:topics/_week3_what_is_computation.mdwn
+ +In the arithmetic example, there is only one number that corresponds
+ +to the sum of 3 and 4 (namely, 7).  But there are many sums that add
+ +up to 7: 3+4, 4+3, 5+2, 2+5, 6+1, 1+6, etc.
++=======
+ Likewise, in the arithmetic example, there is only one number that
+ corresponds to the sum of 3 and 4 (namely, 7).  But there are many
+ sums that add up to 7: 3+4, 4+3, 5+2, 2+5, 6+1, 1+6, etc.
++>>>>>>> working:topics/week3_what_is_computation.mdwn
   
   So the unevaluated expression contains information that is missing
   from the evaluated value: information about *how* that value was
   arrived at.  So this suggests the following way of thinking about what
   counts as evaluated:
   
-     Given two expressions such that one reduced to the other,
+     Given two expressions such that one reduces to the other,
       the more evaluated one is the one that contains less information.
   
   This definition is problematic, though, if we try to define the amount
@@@ -124,12 -137,14 +153,14 @@@ pathological examples where the result
   In this example, reduction returns the exact same lambda term.  There
   is no simplification at all.
   
- -    (\x.xxx)(\x.xxx) ~~> ((\x.xxxx)(\x.xxxx)(\x.xxxx)) 
+ +    (\x.xxx)(\x.xxx) ~~> ((\x.xxxx)(\x.xxxx)(\x.xxxx))
   
   Even worse, in this case, the "reduced" form is longer and more
   complex by any measure.
   
+ We may have to settle for the idea that a well-chosen reduction system
+ will characterize our intuitive notion of evaluation in most cases, or
+ in some useful class of non-pathological cases.  
+ 
   These are some of the deeper issues to keep in mind as we discuss the
   ins and outs of reduction strategies.
- 
-
author	Chris <chris.barker@nyu.edu>
	Thu, 12 Feb 2015 15:50:55 +0000 (10:50 -0500)
committer	Chris <chris.barker@nyu.edu>
	Thu, 12 Feb 2015 15:50:55 +0000 (10:50 -0500)
		1	2
topics/_week4_fixed_point_combinator.mdwn	patch \|	diff1 \|	diff2 \|	blob \| history
topics/week3_what_is_computation.mdwn	patch \|	diff1 \|	diff2 \|	blob \| history