What is computation?
The folk notion of computation involves taking a complicated expression and replacing it with an equivalent simpler expression.
4 + 3 == 7
This equation can be interpreted as expressing the thought that the
expression 4 + 3 evaluates to the value 7. In this case, the
evaluation of the expression involves computing a sum. There is a
clear sense in which the expression 7 is simpler than the expression
4 + 3: 7 is syntactically simple, and 4 + 3 is syntactically complex.
It's worth pausing a moment and wondering why we feel that replacing a
complex expression like 4 + 3 with a simpler expression like 7
feels like we've accomplished something. If they really are
equivalent, why shouldn't we consider them to be equally valuable, or
even to prefer the longer expression? For instance, should we prefer
2^9, or 512? Likewise, in the realm of logic, why should we ever
prefer B to the conjunction of A and A ⊃ B?
The question to ask here is whether our intuitions about what counts as more evaluated always track simplicity of expression, or whether they track what is more useful to us in a given larger situation.
But even deciding which expression ought to count as simpler is not always so clear.
In our system of computing with Church encodings, we have the following representations:
3 ≡ \f z. f (f (f z))
4 ≡ \f z. f (f (f (f z)))
7 ≡ \f z. f (f (f (f (f (f (f z))))))
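These encodings can be modeled directly with Python's own lambdas, which makes them easy to experiment with. Here is a sketch (the names three, four, seven, and to_int are ours, introduced just for illustration):

```python
# Church numeral n: \f z. f (f (... (f z)))  -- n applications of f to z
three = lambda f: lambda z: f(f(f(z)))
four  = lambda f: lambda z: f(f(f(f(z))))
seven = lambda f: lambda z: f(f(f(f(f(f(f(z)))))))

# Decode a Church numeral by applying it to the successor function and 0
to_int = lambda n: n(lambda x: x + 1)(0)

print(to_int(three), to_int(four), to_int(seven))  # → 3 4 7
```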
The addition of 4 and 3, then, clearly needs to just insert 3's
fs in front of 4's string of fs. That guides our
implementation of addition:
add ≡ \l r. \f z. r f (l f z)

add 4 3 ≡ (\l r. \f z. r f (l f z)) 4 3
      ~~> \f z. 3 f (4 f z)
        ≡ \f z. (\f z. f (f (f z))) f (4 f z)
      ~~> \f z. f (f (f (4 f z)))
        ≡ \f z. f (f (f ((\f z. f (f (f (f z)))) f z)))
      ~~> \f z. f (f (f (f (f (f (f z))))))
        ≡ 7
as desired. Is there still a sense in which the encoded version of
add 4 3 is simpler than the encoded version of 7? Well, yes: once the
4 and the 3 have been replaced with their encodings, the encoding of
add 4 3 contains more symbols than the encoding of 7.
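The derivation of add 4 3 can also be checked by running the encodings as Python lambdas (a sketch; the helper to_int, which decodes a numeral by applying it to successor and zero, is ours):

```python
# add ≡ \l r. \f z. r f (l f z), with Python lambdas standing in
add = lambda l: lambda r: lambda f: lambda z: r(f)(l(f)(z))

three = lambda f: lambda z: f(f(f(z)))        # 3 ≡ \f z. f (f (f z))
four  = lambda f: lambda z: f(f(f(f(z))))     # 4 ≡ \f z. f (f (f (f z)))
to_int = lambda n: n(lambda x: x + 1)(0)      # decode: apply to successor and 0

print(to_int(add(four)(three)))  # → 7
```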
But now consider multiplication:
mul ≡ \l r. \f z. r (l f) z

mul 4 3 ≡ (\l r. \f z. r (l f) z) 4 3
      ~~> \f z. 3 (4 f) z
        ≡ \f z. (\f z. f (f (f z))) (4 f) z
      ~~> \f z. (4 f) ((4 f) ((4 f) z))
        ≡ \f z. ((\f z. f (f (f (f z)))) f) (((\f z. f (f (f (f z)))) f) ((\f z. f (f (f (f z)))) f z))
      ~~> \f z. (\z. f (f (f (f z)))) ((\z. f (f (f (f z)))) (f (f (f (f z)))))
      ~~> \f z. (\z. f (f (f (f z)))) (f (f (f (f (f (f (f (f z))))))))
      ~~> \f z. f (f (f (f (f (f (f (f (f (f (f (f z)))))))))))
        ≡ 12
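As with addition, the multiplication derivation can be verified by running the encodings as Python lambdas (a sketch; the numeral names and the to_int decoder are ours):

```python
# mul ≡ \l r. \f z. r (l f) z, with Python lambdas standing in
mul = lambda l: lambda r: lambda f: lambda z: r(l(f))(z)

three = lambda f: lambda z: f(f(f(z)))           # 3
four  = lambda f: lambda z: f(f(f(f(z))))        # 4
six   = lambda f: lambda z: f(f(f(f(f(f(z))))))  # 6
to_int = lambda n: n(lambda x: x + 1)(0)         # decode: apply to successor and 0

print(to_int(mul(four)(three)))  # → 12
print(to_int(mul(six)(three)))   # → 18
```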
Is the final result simpler? This time, the answer is not so straightforward. Compare the starting expression with the final expression:
mul 4 3 ≡ (\l r. \f z. r (l f) z) (\f z. f (f (f (f z)))) (\f z. f (f (f z)))
~~>  12 ≡ (\f z. f (f (f (f (f (f (f (f (f (f (f (f z))))))))))))
And if we choose different numbers, the result is even less clear:
mul 6 3 ≡ (\l r. \f z. r (l f) z) (\f z. f (f (f (f (f (f z)))))) (\f z. f (f (f z)))
~~>  18 ≡ (\f z. f (f (f (f (f (f (f (f (f (f (f (f (f (f (f (f (f (f z))))))))))))))))))
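To make the symbol comparison concrete, the two terms can be generated as strings and measured (the church helper and the variable names are ours, purely for illustration):

```python
def church(n):
    """Render the Church numeral n as a string, e.g. church(3) = '(\\f z. f (f (f z)))'."""
    body = "f z"
    for _ in range(n - 1):
        body = "f (" + body + ")"
    return r"(\f z. " + body + ")"

# The unevaluated and evaluated terms from the mul 6 3 example
mul_6_3  = r"(\l r. \f z. r (l f) z) " + church(6) + " " + church(3)
eighteen = church(18)

# The evaluated term has more characters, but is just one repeated pattern
print(len(mul_6_3), len(eighteen))  # → 75 79
```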
On the one hand, there are more symbols in the encoding of 18 than in
the encoding of mul 6 3. On the other hand, there is more complexity
(speaking pre-theoretically) in the encoding of mul 6 3, since the
encoding of 18 is just a uniform sequence of nested fs.
This example shows that computation can't be just simplicity as measured by the number of symbols in the representation. There is still some sense in which the evaluated expression is simpler, but the right way to characterize "simpler" is elusive.
One possibility is to define simpler in terms of irreversibility. The reduction rules of the lambda calculus define a non-symmetric relation over terms: for any given redex, there is a unique reduced term (up to alphabetic variance). But for any given term, there will be many redexes that reduce to that term.
((\x. x x) y) ~~> y y
((\x. x) y y) ~~> y y
(y ((\x. x) y)) ~~> y y
etc.
Likewise, in the arithmetic example, there is only one number that
corresponds to the sum of 4 and 3 (namely, 7). But there are many
ways to add numbers to get 7.
And that's only looking at sums.
So the unevaluated expression contains information that is missing from the evaluated value: information about how that value was arrived at. This suggests the following way of thinking about what counts as evaluated:
Given two expressions such that one reduces to the other, the more evaluated one is the one that contains less information.
This definition is problematic, though, if we try to define the amount of information using, say, Kolmogorov complexity (which, for one thing, is not computable).
Ultimately, we have to decide that the reduction rules determine what counts as evaluated. If we're lucky, that will align well with our intuitive desires about what should count as simpler. But we're not always lucky. In particular, although beta reduction in general lines up well with our intuitive sense of simplification, there are pathological examples where the results do not align so well:
(\x. x x) (\x. x x) ~~> (\x. x x) (\x. x x) ~~> (\x. x x) (\x. x x) ~~> ...
In this example, reduction returns the exact same lambda term. There
is no simplification at all. (As we mentioned in class, the term
(\x. x x) is often referred to in these discussions as (little) ω or "omega", or sometimes M; and its self-application
ω ω, displayed above, is called (big) Ω or "Omega".)
Even worse, consider this term:
(\x. x x x) (\x. x x x) ~~> (\x. x x x) (\x. x x x) (\x. x x x) ~~> ...
Here, the "reduced" form is longer and more complex by any reasonable measure.
We may have to settle for the idea that a well-chosen reduction system will characterize our intuitive notion of evaluation in most cases, or in some useful class of non-pathological cases.
These are some of the deeper issues to keep in mind as we discuss the ins and outs of reduction strategies.