topics/_week3_what_is_computation.mdwn

   1 #What is computation?#
   2
   3 The folk notion of computation involves taking a complicated
   4 expression and replacing it with an equivalent simpler expression.
   5
   6     3 + 4 == 7
   7
   8 This equation can be interpreted as expressing the thought that the
   9 complex expression `3 + 4` evaluates to `7`.  The evaluation of the
  10 expression computing a sum.  There is a clear sense in which the
  11 expression `7` is simpler than the expression `3 + 4`: `7` is
  12 syntactically simple, and `3 + 4` is syntactically complex.
  13
  14 Now let's take this folk notion of computation, and put some pressure
  15 on it.
  16
  17 ##Church arithmetic##
  18
  19 In our system of computing with Church encodings, we have the
  20 following representations:
  21
  22     3 :=  \f z . f (f (f z))
  23     4 :=  \f z . f (f (f (f z)))
  24     7 :=  \f z . f (f (f (f (f (f (f z))))))
  25
  26 [By the way, those of you who did the extra credit problem on homework
  27 2, solving the reverse function with Oleg's holes---do you see the
  28 resemblance?]
  29
  30 The addition of 3 and 4, then, clearly needs to just tack 4's string
  31 of f's onto the end of 3's string of f's.  This suggests the following
  32 implementation of the arithmetic addition:
  33
  34     + := \lrfz.lf(rfz)
  35
  36     + 3 4 == (\lrfz.lf(rfz)) 3 4
  37           ~~> \fz.3f(4fz)
  38           ==  \fz.(\fz.f(f(fz))) f (4fz)
  39           ~~> \fz.f(f(f(4fz)))
  40           ==  \fz.f(f(f((\fz.f(f(f(fz))) f z))))
  41           ~~> \fz.f(f(f(f(f(f(fz))))))
  42           ==  7
  43
  44 as desired.  Is there still a sense in which the encoded version of `3
  45 + 4` is simpler than the encoded version of `7`?  Well, yes: once the
  46 numerals `3` and `4` have been replaced with their encodings, the
  47 encoding of `3 + 4` contains more symbols than the encoding of `7`.
  48
  49 But now consider multiplication:
  50
  51     * := \lrf.l(rf)
  52     * 3 4 :=  (\lrf.l(rf)) 3 4
  53           ~~> (\rf.3(rf)) 4
  54           ~~> \f.3(4f)
  55           ==  \f.(\fz.f(f(fz)))(4f)
  56           ~~> \fz.(4f)((4f)((4f)z))
  57           ==  \fz.((\fz.f(f(f(fz))))f) (((\fz.f(f(f(fz))))f) (((\fz.f(f(f(fz))))f) z))
  58           ~~> \fz.((\z.f(f(f(fz)))) ((\z.f(f(f(fz)))) ((\z.f(f(f(fz))))z)))
  59           ~~> \fz.((\z.f(f(f(fz)))) ((\z.f(f(f(fz)))) (f(f(f(fz))))))
  60           ~~> \fz.((\z.f(f(f(fz)))) (f(f(f(f(f(f(f(fz)))))))))
  61           ~~> \fz.f(f(f(f(f(f(f(f(f(f(f(fz)))))))))))
  62           ==  12
  63
  64 Is the final result simpler?  This time, the answer is not so straightfoward.
  65 Compare the starting expression with the final expression:
  66
  67         *           3             4
  68         (\lrf.l(rf))(\fz.f(f(fz)))(\fz.f(f(f(fz))))
  69     ~~> 12
  70         (\fz.f(f(f(f(f(f(f(f(f(f(f(fz))))))))))))
  71
  72 And if we choose different numbers, the result is even less clear:
  73
  74         *           3             6
  75         (\lrf.l(rf))(\fz.f(f(fz)))(\fz.f(f(f(f(f(fz))))))
  76     ~~> 18
  77         (\fz.f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(fz))))))))))))))))))
  78
  79 On the on hand, there are more symbols in the encoding of `18` than in
  80 the encoding of `* 3 6`.  On the other hand, there is more complexity
  81 (speaking pre-theoretically) in the encoding of `* 3 6`, since the
  82 encoding of `18` is just a uniform sequence of nested `f`'s.
  83
  84 This example shows that computation can't be just simplicity as
  85 measured by the number of symbols in the representation.  There is
  86 still some sense in which the evaluated expression is simpler, but the
  87 right way to characterize simpler is elusive.
  88
  89 One possibility is to define simpler in terms of irreversability.  The
  90 reduction rules of the lambda calculus define an asymmetric relation
  91 over terms: for any given redex, there is a unique reduced term (up to
  92 alphabetic variance).  But for any given term, there are many redexes
  93 that reduce to that term.
  94
  95     ((\x.xx)y) ~~> yy
  96     ((\xx)yy) ~~> yy
  97     (y((\xx)y)) ~~> yy
  98     etc.
  99
 100 In the arithmetic example, there is only one number that corresponds
 101 to the sum of 3 and 4 (namely, 7).  But there are many sums that add
 102 up to 7: 3+4, 4+3, 5+2, 2+5, 6+1, 1+6, etc.
 103
 104 So the unevaluated expression contains information that is missing
 105 from the evaluated value: information about *how* that value was
 106 arrived at.  So this suggests the following way of thinking about what
 107 counts as evaluated:
 108
 109     Given two expressions such that one reduced to the other,
 110     the more evaluated one is the one that contains less information.
 111
 112 This definition is problematic, though, if we try to define the amount
 113 of information using, say, [[!wikipedia Komolgorov complexity]].
 114
 115 Ultimately, we have to decide that the reduction rules determine what
 116 counts as evaluated.  If we're lucky, that will align well with our
 117 intuitive desires about what should count as simpler.  But we're not
 118 always lucky.  In particular, although beta reduction in general lines
 119 up well with our intuitive sense of simplification, there are
 120 pathological examples where the results do not align so well:
 121
 122     (\x.xx)(\x.xx) ~~> (\x.xx)(\x.xx) ~~> (\x.xx)(\x.xx)
 123
 124 In this example, reduction returns the exact same lambda term.  There
 125 is no simplification at all.
 126
 127     (\x.xxx)(\x.xxx) ~~> ((\x.xxxx)(\x.xxxx)(\x.xxxx))
 128
 129 Even worse, in this case, the "reduced" form is longer and more
 130 complex by any measure.
 131
 132 These are some of the deeper issues to keep in mind as we discuss the
 133 ins and outs of reduction strategies.
 134
 135