topics/week3_what_is_computation.mdwn

   1 #What is computation?#
   2
   3 The folk notion of computation involves taking a complicated
   4 expression and replacing it with an equivalent simpler expression.
   5
   6     3 + 4 == 7
   7
   8 This equation can be interpreted as expressing the thought that the
   9 <<<<<<< HEAD:topics/_week3_what_is_computation.mdwn
  10 complex expression `3 + 4` evaluates to `7`.  The evaluation of the
  11 expression computing a sum.  There is a clear sense in which the
  12 expression `7` is simpler than the expression `3 + 4`: `7` is
  13 syntactically simple, and `3 + 4` is syntactically complex.
  14
  15 Now let's take this folk notion of computation, and put some pressure
  16 on it.
  17 =======
  18 complex expression `3 + 4` evaluates to `7`.  In this case, the
  19 evaluation of the expression involves computing a sum.  There is a
  20 clear sense in which the expression `7` is simpler than the expression
  21 `3 + 4`: `7` is syntactically simple, and `3 + 4` is syntactically
  22 complex.
  23
  24 It's worth pausing a moment and wondering why we feel that replacing a
  25 complex expression like `3 + 4` with a simplex expression like `7`
  26 feels like we've accomplished something.  If they really are
  27 equivalent, why shouldn't we consider them to be equally valuable, or
  28 even to prefer the longer expression?  For instance, should we prefer
  29 2^9, or 512?  Likewise, in the realm of logic, why shold we ever
  30 prefer `B` to the conjunction of `A` with `A --> B`?
  31
  32 The question to ask here is whether our intuitions about what counts
  33 as more evaluated always tracks simplicity of expression, or whether
  34 it tracks what is more useful to us in a given larger situation.
  35
  36 But even deciding which expression ought to count as simpler is not
  37 always so clear.
  38 >>>>>>> working:topics/week3_what_is_computation.mdwn
  39
  40 ##Church arithmetic##
  41
  42 In our system of computing with Church encodings, we have the
  43 following representations:
  44
  45     3 :=  \f z . f (f (f z))
  46     4 :=  \f z . f (f (f (f z)))
  47     7 :=  \f z . f (f (f (f (f (f (f z))))))
  48
  49 [By the way, those of you who did the extra credit problem on homework
  50 2, solving the reverse function with Oleg's holes---do you see the
  51 resemblance?]
  52
  53 The addition of 3 and 4, then, clearly needs to just tack 4's string
  54 of f's onto the end of 3's string of f's.  This suggests the following
  55 implementation of the arithmetic addition:
  56
  57     + := \lrfz.lf(rfz)
  58
  59     + 3 4 == (\lrfz.lf(rfz)) 3 4
  60           ~~> \fz.3f(4fz)
  61           ==  \fz.(\fz.f(f(fz))) f (4fz)
  62           ~~> \fz.f(f(f(4fz)))
  63           ==  \fz.f(f(f((\fz.f(f(f(fz))) f z))))
  64           ~~> \fz.f(f(f(f(f(f(fz))))))
  65           ==  7
  66
  67 as desired.  Is there still a sense in which the encoded version of `3
  68 + 4` is simpler than the encoded version of `7`?  Well, yes: once the
  69 numerals `3` and `4` have been replaced with their encodings, the
  70 encoding of `3 + 4` contains more symbols than the encoding of `7`.
  71
  72 But now consider multiplication:
  73
  74     * := \lrf.l(rf)
  75     * 3 4 :=  (\lrf.l(rf)) 3 4
  76           ~~> (\rf.3(rf)) 4
  77           ~~> \f.3(4f)
  78           ==  \f.(\fz.f(f(fz)))(4f)
  79           ~~> \fz.(4f)((4f)((4f)z))
  80           ==  \fz.((\fz.f(f(f(fz))))f) (((\fz.f(f(f(fz))))f) (((\fz.f(f(f(fz))))f) z))
  81           ~~> \fz.((\z.f(f(f(fz)))) ((\z.f(f(f(fz)))) ((\z.f(f(f(fz))))z)))
  82           ~~> \fz.((\z.f(f(f(fz)))) ((\z.f(f(f(fz)))) (f(f(f(fz))))))
  83           ~~> \fz.((\z.f(f(f(fz)))) (f(f(f(f(f(f(f(fz)))))))))
  84           ~~> \fz.f(f(f(f(f(f(f(f(f(f(f(fz)))))))))))
  85           ==  12
  86
  87 Is the final result simpler?  This time, the answer is not so straightfoward.
  88 Compare the starting expression with the final expression:
  89
  90         *           3             4
  91         (\lrf.l(rf))(\fz.f(f(fz)))(\fz.f(f(f(fz))))
  92     ~~> 12
  93         (\fz.f(f(f(f(f(f(f(f(f(f(f(fz))))))))))))
  94
  95 And if we choose different numbers, the result is even less clear:
  96
  97         *           3             6
  98         (\lrf.l(rf))(\fz.f(f(fz)))(\fz.f(f(f(f(f(fz))))))
  99     ~~> 18
 100         (\fz.f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(fz))))))))))))))))))
 101
 102 On the on hand, there are more symbols in the encoding of `18` than in
 103 the encoding of `* 3 6`.  On the other hand, there is more complexity
 104 (speaking pre-theoretically) in the encoding of `* 3 6`, since the
 105 encoding of `18` is just a uniform sequence of nested `f`'s.
 106
 107 This example shows that computation can't be just simplicity as
 108 measured by the number of symbols in the representation.  There is
 109 still some sense in which the evaluated expression is simpler, but the
 110 right way to characterize "simpler" is elusive.
 111
 112 One possibility is to define simpler in terms of irreversability.  The
 113 reduction rules of the lambda calculus define an asymmetric relation
 114 over terms: for any given redex, there is a unique reduced term (up to
 115 alphabetic variance).  But for any given term, there are many redexes
 116 that reduce to that term.
 117
 118     ((\x.xx)y) ~~> yy
 119     ((\xx)yy) ~~> yy
 120     (y((\xx)y)) ~~> yy
 121     etc.
 122
 123 <<<<<<< HEAD:topics/_week3_what_is_computation.mdwn
 124 In the arithmetic example, there is only one number that corresponds
 125 to the sum of 3 and 4 (namely, 7).  But there are many sums that add
 126 up to 7: 3+4, 4+3, 5+2, 2+5, 6+1, 1+6, etc.
 127 =======
 128 Likewise, in the arithmetic example, there is only one number that
 129 corresponds to the sum of 3 and 4 (namely, 7).  But there are many
 130 sums that add up to 7: 3+4, 4+3, 5+2, 2+5, 6+1, 1+6, etc.
 131 >>>>>>> working:topics/week3_what_is_computation.mdwn
 132
 133 So the unevaluated expression contains information that is missing
 134 from the evaluated value: information about *how* that value was
 135 arrived at.  So this suggests the following way of thinking about what
 136 counts as evaluated:
 137
 138     Given two expressions such that one reduces to the other,
 139     the more evaluated one is the one that contains less information.
 140
 141 This definition is problematic, though, if we try to define the amount
 142 of information using, say, [[!wikipedia Komolgorov complexity]].
 143
 144 Ultimately, we have to decide that the reduction rules determine what
 145 counts as evaluated.  If we're lucky, that will align well with our
 146 intuitive desires about what should count as simpler.  But we're not
 147 always lucky.  In particular, although beta reduction in general lines
 148 up well with our intuitive sense of simplification, there are
 149 pathological examples where the results do not align so well:
 150
 151     (\x.xx)(\x.xx) ~~> (\x.xx)(\x.xx) ~~> (\x.xx)(\x.xx)
 152
 153 In this example, reduction returns the exact same lambda term.  There
 154 is no simplification at all.
 155
 156     (\x.xxx)(\x.xxx) ~~> ((\x.xxxx)(\x.xxxx)(\x.xxxx))
 157
 158 Even worse, in this case, the "reduced" form is longer and more
 159 complex by any measure.
 160
 161 We may have to settle for the idea that a well-chosen reduction system
 162 will characterize our intuitive notion of evaluation in most cases, or
 163 in some useful class of non-pathological cases.
 164
 165 These are some of the deeper issues to keep in mind as we discuss the
 166 ins and outs of reduction strategies.