added discussion of computation

author Chris <chris.barker@nyu.edu>

Thu, 12 Feb 2015 02:21:44 +0000 (21:21 -0500)

committer Chris <chris.barker@nyu.edu>

Thu, 12 Feb 2015 02:21:44 +0000 (21:21 -0500)
author Chris <chris.barker@nyu.edu>
Thu, 12 Feb 2015 02:21:44 +0000 (21:21 -0500)
committer Chris <chris.barker@nyu.edu>
Thu, 12 Feb 2015 02:21:44 +0000 (21:21 -0500)
diff --git a/topics/_week3_what_is_computation.mdwn b/topics/_week3_what_is_computation.mdwn

new file mode 100644 (file)

index 0000000..7575abc
--- /dev/null
+++ b/topics/_week3_what_is_computation.mdwn
@@ -0,0 +1,135 @@
+#What is computation?#
+
+The folk notion of computation involves taking a complicated
+expression and replacing it with an equivalent simpler expression.
+
+    3 + 4 == 7
+
+This equation can be interpreted as expressing the thought that the
+complex expression `3 + 4` evaluates to `7`.  The evaluation of the
+expression computing a sum.  There is a clear sense in which the
+expression `7` is simpler than the expression `3 + 4`: `7` is
+syntactically simple, and `3 + 4` is syntactically complex.  
+
+Now let's take this folk notion of computation, and put some pressure
+on it.  
+
+##Church arithmetic##
+
+In our system of computing with Church encodings, we have the
+following representations:
+
+    3 :=  \f z . f (f (f z))
+    4 :=  \f z . f (f (f (f z)))
+    7 :=  \f z . f (f (f (f (f (f (f z))))))
+
+[By the way, those of you who did the extra credit problem on homework
+2, solving the reverse function with Oleg's holes---do you see the
+resemblance?]
+
+The addition of 3 and 4, then, clearly needs to just tack 4's string
+of f's onto the end of 3's string of f's.  This suggests the following
+implementation of the arithmetic addition:
+
+    + := \lrfz.lf(rfz)
+
+    + 3 4 == (\lrfz.lf(rfz)) 3 4
+          ~~> \fz.3f(4fz)
+          ==  \fz.(\fz.f(f(fz))) f (4fz)
+          ~~> \fz.f(f(f(4fz)))
+          ==  \fz.f(f(f((\fz.f(f(f(fz))) f z))))
+          ~~> \fz.f(f(f(f(f(f(fz))))))
+          ==  7
+
+as desired.  Is there still a sense in which the encoded version of `3
++ 4` is simpler than the encoded version of `7`?  Well, yes: once the
+numerals `3` and `4` have been replaced with their encodings, the
+encoding of `3 + 4` contains more symbols than the encoding of `7`.
+
+But now consider multiplication:
+
+    * := \lrf.l(rf)
+    * 3 4 :=  (\lrf.l(rf)) 3 4
+          ~~> (\rf.3(rf)) 4
+          ~~> \f.3(4f)
+          ==  \f.(\fz.f(f(fz)))(4f)
+          ~~> \fz.(4f)((4f)((4f)z))
+          ==  \fz.((\fz.f(f(f(fz))))f) (((\fz.f(f(f(fz))))f) (((\fz.f(f(f(fz))))f) z))
+          ~~> \fz.((\z.f(f(f(fz)))) ((\z.f(f(f(fz)))) ((\z.f(f(f(fz))))z)))
+          ~~> \fz.((\z.f(f(f(fz)))) ((\z.f(f(f(fz)))) (f(f(f(fz))))))
+          ~~> \fz.((\z.f(f(f(fz)))) (f(f(f(f(f(f(f(fz)))))))))
+          ~~> \fz.f(f(f(f(f(f(f(f(f(f(f(fz)))))))))))
+          ==  12
+
+Is the final result simpler?  This time, the answer is not so straightfoward.
+Compare the starting expression with the final expression:
+
+        *           3             4 
+        (\lrf.l(rf))(\fz.f(f(fz)))(\fz.f(f(f(fz))))
+    ~~> 12
+        (\fz.f(f(f(f(f(f(f(f(f(f(f(fz))))))))))))
+
+And if we choose different numbers, the result is even less clear:
+
+        *           3             6 
+        (\lrf.l(rf))(\fz.f(f(fz)))(\fz.f(f(f(f(f(fz))))))
+    ~~> 18
+        (\fz.f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(fz))))))))))))))))))
+
+On the on hand, there are more symbols in the encoding of `18` than in
+the encoding of `* 3 6`.  On the other hand, there is more complexity
+(speaking pre-theoretically) in the encoding of `* 3 6`, since the
+encoding of `18` is just a uniform sequence of nested `f`'s.
+
+This example shows that computation can't be just simplicity as
+measured by the number of symbols in the representation.  There is
+still some sense in which the evaluated expression is simpler, but the
+right way to characterize simpler is elusive.
+
+One possibility is to define simpler in terms of irreversability.  The
+reduction rules of the lambda calculus define an asymmetric relation
+over terms: for any given redex, there is a unique reduced term (up to
+alphabetic variance).  But for any given term, there are many redexes
+that reduce to that term.
+
+    ((\x.xx)y) ~~> yy
+    ((\xx)yy) ~~> yy
+    (y((\xx)y)) ~~> yy
+    etc.
+
+In the arithmetic example, there is only one number that corresponds
+to the sum of 3 and 4 (namely, 7).  But there are many sums that add
+up to 7: 3+4, 4+3, 5+2, 2+5, 6+1, 1+6, etc.  
+
+So the unevaluated expression contains information that is missing
+from the evaluated value: information about *how* that value was
+arrived at.  So this suggests the following way of thinking about what
+counts as evaluated:
+
+    Given two expressions such that one reduced to the other,
+    the more evaluated one is the one that contains less information.
+
+This definition is problematic, though, if we try to define the amount
+of information using, say, [[!wikipedia Komolgorov complexity]].
+
+Ultimately, we have to decide that the reduction rules determine what
+counts as evaluated.  If we're lucky, that will align well with our
+intuitive desires about what should count as simpler.  But we're not
+always lucky.  In particular, although beta reduction in general lines
+up well with our intuitive sense of simplification, there are
+pathological examples where the results do not align so well:
+
+    (\x.xx)(\x.xx) ~~> (\x.xx)(\x.xx) ~~> (\x.xx)(\x.xx)
+
+In this example, reduction returns the exact same lambda term.  There
+is no simplification at all.
+
+    (\x.xxx)(\x.xxx) ~~> ((\x.xxxx)(\x.xxxx)(\x.xxxx)) 
+
+Even worse, in this case, the "reduced" form is longer and more
+complex by any measure.
+
+These are some of the deeper issues to keep in mind as we discuss the
+ins and outs of reduction strategies.
+
+
author	Chris <chris.barker@nyu.edu>
	Thu, 12 Feb 2015 02:21:44 +0000 (21:21 -0500)
committer	Chris <chris.barker@nyu.edu>
	Thu, 12 Feb 2015 02:21:44 +0000 (21:21 -0500)