+ (\z z)
+
+both represent the same function, the identity function. However, we said
+(FIXME in the advanced notes) that we would be regarding these expressions as
+synactically equivalent, so they aren't yet really examples of *distinct*
+lambda expressions representing a single function. However, all three of these
+are distinct lambda expressions:
+
+ (\y x. y x) (\z z)
+
+ (\x. (\z z) x)
+
+ (\z z)
+
+yet when applied to any argument M, all of these will always return M. So they
+have the same extension. It's also true, though you may not yet be in a
+position to see, that no other function can differentiate between them when
+they're supplied as an argument to it. However, these expressions are all
+syntactically distinct.
+
+The first two expressions are (beta-)*convertible*: in particular the first
+reduces to the second via a single instance of beta reduction. So they
+can be regarded as proof-theoretically equivalent even though they're
+not syntactically identical. However, the proof theory we've given so
+far doesn't permit you to reduce the second expression to the
+third. So these lambda expressions are non-equivalent.
+
+In other words, we have here different (non-equivalent) lambda terms with the
+same extension. This introduces some tension between the popular way of
+thinking of functions as represented by (identical to?) their graphs or
+extensions, and the idea that lambda terms express functions. Well, perhaps the
+lambda terms are just a finer-grained way of expressing functions than
+extensions are?
+
+As we'll see, though, there are even further sources of
+tension between the idea of functions-as-extensions and the idea of functions
+embodied in the lambda calculus.
+
+
+1. One reason is that that general
+mathematical conception permits many *uncomputable* functions, but the
+lambda calculus can't express those.
+
+2. More problematically, some lambda terms express "functions" that can take
+themselves as arguments. If we wanted to represent that set-theoretically, and
+identified functions with their extensions, then we'd have to have some
+extension that contained (an ordered pair containing) itself as a member. Which
+we're not allowed to do in mainstream set-theory. But in the lambda calculus
+this is permitted and common --- and in fact will turn out to be indispensable.
+
+ Here are some simple examples. We can apply the identity function to itself:
+
+ (\x x) (\x x)
+
+ This is a redex that reduces to the identity function (of course). We can
+apply the **K** function to another argument and itself:
+
+ > <code><b>K</b> z <b>K</b></code>
+
+ That is:
+
+ (\x (\y x)) z (\x (\y x))
+
+ That reduces to just `z`.
+
+3. As we'll see in coming weeks, some lambda terms will turn out to be
+impossible to associate with any extension. This is related to the previous point.
+
+
+In fact it *does* turn out to be possible to represent the Lambda Calculus
+set-theoretically. But not in the straightforward way that identifies
+functions with their graphs. For years, it wasn't known whether it would be possible to do this. But then
+[[!wikipedia Dana Scott]] figured out how to do it in late 1969, that is,
+he formulated the first "denotational semantics" for this lambda calculus.
+Scott himself had expected that this wouldn't be possible to do. He argued
+for its unlikelihood in a paper he wrote only a month before the discovery.
+
+(You can find an exposition of Scott's semantics in Chapters 15 and 16 of the
+Hindley & Seldin book we recommended, and an exposition of some simpler
+models that have since been discovered in Chapter 5 of the Hankin book, and
+Section 16F of Hindley & Seldin. But realistically, you really ought to
+wait a while and get more familiar with these systems before you'll be at all
+ready to engage with those ideas.)
+
+We've characterized the rule of beta-reduction as a kind of "proof theory" for
+the Lambda Calculus. In fact, the usual proof theory is expressed in terms of
+*convertibility* rather than in terms of reduction; but it's natural to
+understand reduction as being the conceptually more fundamental notion. And
+there are some proof theories for this system that are expressed in terms of
+reduction.
+
+To use a linguistic analogy, you can think of what we're calling
+"proof theory" as a kind of syntactic transformation. The
+sentences *John saw Mary* and *Mary was seen by John* are not
+syntactically identical, yet (on some theories) one can be derived
+from the other. The key element in the analogy is that this kind of
+syntactic derivation is supposed to preserve meaning, so that the two
+sentences mean (roughly) the same thing.
+
+There's an extension of the proof-theory we've presented so far which
+does permit a further move of just that sort. It would permit us to
+also count these functions:
+
+ (\x. (\z z) x)
+
+ (\z z)
+
+as equivalent. This additional move is called **eta-reduction**. It's
+crucial to eta-reduction that the outermost variable binding in the
+abstract we begin with (here, `\x`) be of a variable that occurs free
+exactly once in the body of that abstract, and that it be in the
+rightmost position.