--- /dev/null
+This discussion elaborates on the treatment of evaluation order in the
+class notes from week 2. It makes use of the reduction diagrams drawn
+in class, which make the choice of evaluation strategy a choice of
+which direction to move through a network of reduction links.
+
+
+Some lambda terms can be reduced in different ways:
+
+<pre>
+          ((\x.x)((\y.y) z))
+             /         \
+            /           \
+           /             \
+          /               \
+   ((\y.y) z)         ((\x.x) z)
+          \               /
+           \             /
+            \           /
+             \         /
+              \       /
+               \     /
+                  z
+</pre>
+
+But because the lambda calculus is confluent (has the diamond
+property, named after the shape of the diagram above), no matter which
+lambda you choose to target for reduction first, you end up at the
+same place. It's like travelling in Manhattan: if you walk uptown
+first and then head east, you end up in the same place as if you walk
+east and then head uptown.
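+
+To make the diamond concrete, here is a minimal sketch of a reducer in
+Haskell. The term type, the capture-naive substitution, and the two
+steppers are scaffolding of my own (naive substitution is safe here
+only because the bound variables in our examples are all distinct).
+Both strategies arrive at `z`:
+
+<pre>
+-- A toy term language: variables, lambdas, applications.
+data Term = Var String          -- x
+          | Lam String Term     -- \x.body
+          | App Term Term       -- (f a)
+
+instance Show Term where
+  show (Var x)   = x
+  show (Lam x b) = "(\\" ++ x ++ "." ++ show b ++ ")"
+  show (App f a) = "(" ++ show f ++ " " ++ show a ++ ")"
+
+-- Substitute t for x. Naive: assumes no variable capture can occur.
+subst :: String -> Term -> Term -> Term
+subst x t (Var y)   = if x == y then t else Var y
+subst x t (Lam y b) = if x == y then Lam y b else Lam y (subst x t b)
+subst x t (App f a) = App (subst x t f) (subst x t a)
+
+-- One "leftmost reducible lambda" step, if any redex exists.
+stepL :: Term -> Maybe Term
+stepL (App (Lam x b) a) = Just (subst x a b)
+stepL (App f a)         = case stepL f of
+                            Just f' -> Just (App f' a)
+                            Nothing -> fmap (App f) (stepL a)
+stepL (Lam x b)         = fmap (Lam x) (stepL b)
+stepL (Var _)           = Nothing
+
+-- One "rightmost reducible lambda" step: simplify arguments first.
+stepR :: Term -> Maybe Term
+stepR (App f a) = case stepR a of
+  Just a' -> Just (App f a')
+  Nothing -> case stepR f of
+    Just f' -> Just (App f' a)
+    Nothing -> case f of
+                 Lam x b -> Just (subst x a b)
+                 _       -> Nothing
+stepR (Lam x b) = fmap (Lam x) (stepR b)
+stepR (Var _)   = Nothing
+
+-- Keep stepping until no redex remains.
+normalize :: (Term -> Maybe Term) -> Term -> Term
+normalize step t = maybe t (normalize step) (step t)
+
+-- The term at the top of the diamond: ((\x.x)((\y.y) z))
+example :: Term
+example = App (Lam "x" (Var "x")) (App (Lam "y" (Var "y")) (Var "z"))
+
+main :: IO ()
+main = do
+  print (normalize stepL example)   -- z, via the left side of the diamond
+  print (normalize stepR example)   -- z, via the right side
+</pre>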
+
+But which lambda you target has implications for efficiency and for
+termination. (Later in the course, it will have implications for
+the order in which side effects occur.)
+
+First, efficiency:
+
+<pre>
+     ((\x.w)((\y.y) z))
+         \         \
+          \     ((\x.w) z)
+           \       /
+            \     /
+             \   /
+              \ /
+               w
+</pre>
+
+If a function discards its argument (as `\x.w` does), it makes sense
+to target that function for reduction, rather than wasting effort
+reducing the argument only to have the result of all that work thrown
+away. So in this situation, the strategy of "always reduce the
+leftmost reducible lambda" wins.
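+
+The same point, transposed into a lazy host language. Haskell's
+evaluation order is (roughly) leftmost-first, so a discarded argument
+is never touched; the names below are mine, for illustration only:
+
+<pre>
+-- A function that discards its argument, like \x.w above.
+discard :: a -> String
+discard _ = "w"
+
+main :: IO ()
+main = do
+  -- Leftmost-first: the argument is thrown away unevaluated, so
+  -- even `undefined` is harmless here.
+  putStrLn (discard undefined)
+  -- Rightmost-first would evaluate the argument before the call;
+  -- `seq` simulates that, and this line would crash if uncommented:
+  -- putStrLn (undefined `seq` discard undefined)
+</pre>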
+
+But:
+
+<pre>
+         ((\x.xx)((\y.y) z))
+             /           \
+ (((\y.y) z)((\y.y) z))   ((\x.xx) z)
+      /         |             /
+     /    (((\y.y)z)z)       /
+    /           |           /
+   /            |          /
+ (z ((\y.y)z))  |         /
+        \       |        /
+         -------.-------
+                |
+                zz
+</pre>
+
+This time, the leftmost function `\x.xx` copies its argument.
+If we reduce the rightmost lambda first (rightmost branch of the
+diagram), the argument is already simplified before we do the
+copying. We arrive at the normal form (i.e., the form that cannot be
+further reduced) in two steps.
+
+But if we reduce the leftmost lambda first (the two leftmost branches
+of the diagram), we copy the argument before it has been evaluated.
+In effect, when we copy the unreduced argument, we double the amount
+of work we need to do to deal with that argument.
+
+So when the function copies its argument, the "always reduce the
+rightmost reducible lambda" strategy wins.
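+
+One caveat worth flagging here: GHC actually implements call by need
+(call by name plus sharing), so in Haskell a copied argument is still
+reduced at most once. `Debug.Trace.trace` makes the sharing visible
+in this sketch; a purely textual leftmost-first substitution would
+print the message twice:
+
+<pre>
+import Debug.Trace (trace)
+
+-- A function that copies its argument, like \x.xx above.
+double :: Int -> Int
+double x = x + x
+
+main :: IO ()
+main = print (double (trace "reducing the argument" (2 + 3)))
+-- Prints "reducing the argument" once: both copies share one thunk.
+-- Textual call-by-name substitution would reduce (2 + 3) twice.
+</pre>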
+
+So evaluation strategies have a strong effect on how many reduction
+steps it takes to arrive at a stopping point (e.g., normal form).
+
+Now for termination:
+
+<pre>
+ (\x.w)((\x.xxx)(\x.xxx))
+   |         \
+   |       (\x.w)((\x.xxx)(\x.xxx)(\x.xxx))
+   |         /        \
+   |        /       (\x.w)((\x.xxx)(\x.xxx)(\x.xxx)(\x.xxx))
+   |       /          /          \
+   .----------------------        etc.
+   |
+   w
+</pre>
+
+Here we have a function that discards its argument on the left, and a
+non-terminating term on the right. The right-hand term is even more
+evil than Omega: not only does it never reach a normal form, it
+generates more and more work with each so-called reduction. If we
+unluckily adopt the "always reduce the rightmost reducible lambda"
+strategy, we'll spend our days relentlessly writing out new copies of
+`\x.xxx`. But if we even once reduce the leftmost reducible lambda,
+we'll immediately jump to the normal form, `w`.
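+
+In a lazy host language the same escape comes for free. `grow` below
+is my stand-in for the ever-growing term:
+
+<pre>
+-- A computation that, like (\x.xxx)(\x.xxx), only generates more
+-- work for itself each time it is touched.
+grow :: Int -> Int
+grow n = grow (n + 1)
+
+discard :: a -> String
+discard _ = "w"
+
+main :: IO ()
+main = putStrLn (discard (grow 0))
+-- Lazy (leftmost-first) evaluation prints "w" immediately; a strict
+-- (rightmost-first) language would run grow 0 forever and print nothing.
+</pre>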
+
+We can roughly associate the "leftmost always" strategy with call by
+name (normal order), and the "rightmost always" strategy with call by
+value (applicative order). There are fine-grained distinctions among
+these terms of art that we will not pause to untangle here.
+
+If a term has a normal form (a form to which no further reduction
+applies), then a leftmost-always reduction will find it. (That's why
+it's called "normal order": it's the evaluation order that guarantees
+finding a normal form if one exists.)
+
+Preview: if the evaluation of a function has side effects, then the
+choice of an evaluation strategy will make a big difference in which
+side effects occur and in which order.