-Each variable is an expression. For any expressions M and N and variable a, the following are also expressions:
+Each variable is an expression. For any variable `a` and (possibly complex) expressions `M` and `N`, the following are also expressions:

-Variables: `x`, `y`, `z` ...
+Variables: `x`, `y`, `z` ...

Abstract: `(λa M)`

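The inductive definition of terms above can be mirrored in a small Python sketch (ours, not part of the notes or this diff): variables are strings, and abstracts and applications are tagged tuples, so checking termhood follows the grammar clause by clause.

```python
# A sketch (ours, not the notes') of the term syntax in Python: a variable is
# a string, an abstract (λa M) is ("lam", a, M), and an application (M N) is
# ("app", M, N). is_term mirrors the inductive definition clause by clause.

VARIABLES = {"x", "y", "z"}   # finitely many stand-ins for x, y, z, ...

def is_term(t):
    if isinstance(t, str):                        # Variables are expressions
        return t in VARIABLES
    if isinstance(t, tuple) and len(t) == 3:
        tag, left, right = t
        if tag == "lam":                          # Abstract: (λa M)
            return left in VARIABLES and is_term(right)
        if tag == "app":                          # Application: (M N)
            return is_term(left) and is_term(right)
    return False

assert is_term(("lam", "x", ("app", "x", "y")))      # (λx (x y))
assert not is_term(("lam", ("app", "x", "y"), "x"))  # binder must be a variable
```

The tagged-tuple representation is one arbitrary choice; any datatype with one constructor per grammar clause would do as well.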
@@ -66,13 +63,13 @@

Expressions in the lambda calculus are called "terms". Here is the syntax of
the lambda calculus given in the form of a context-free grammar:

-    T --> Var
-    T --> ( lambda Var T)
-    T --> ( T T )
-    Var --> x
-    Var --> y
-    Var --> z
-    ...
+>    T --> Var
+>    T --> ( λ Var T)
+>    T --> ( T T )
+>    Var --> x
+>    Var --> y
+>    Var --> z
+>    ...

Very, very simple.

@@ -107,11 +104,11 @@

called the **argument**, and `M` is called the **body**. The rule of
beta-reduction permits a transition from that expression to the following:

-    M [a <-- N]
+> `M` [`a` <-- `N`]

What this means is just `M`, with any *free occurrences* inside `M` of
the variable `a` replaced with the term `N` (--- "without capture", which
-we'll explain in the [[advanced notes|FIXME]]).
+we'll explain in the [[advanced notes|week2_lambda_advanced]]).

What is a free occurrence?

@@ -160,7 +157,7 @@

say "reduction" for one or more reduction steps. So when we write:

    M ~~> N

-we'll mean that you can get from M to N by one or more reduction
+we'll mean that you can get from `M` to `N` by one or more reduction
steps.

When `M` and `N` are such that there is some common term (perhaps just one
@@ -171,9 +168,7 @@

like this:

    M <~~> N

More details about the notation and metatheory of
-the lambda calculus here:
-
-* [[ week2 lambda calculus fine points]]
+the lambda calculus are in [[this week's advanced notes|topics/week2_lambda_advanced]].

## Shorthand ##

@@ -203,9 +198,7 @@

and:

    z (\x (x y))

-**Dot notation** Dot means "assume a left paren here, and the matching right
-paren as far the right as possible without creating unbalanced
-parentheses". So:
+**Dot notation** Dot means "Insert a left parenthesis here, and the matching right parenthesis as far to the right as possible without creating unbalanced parentheses---so long as doing so would enclose an application or abstract not already wrapped in parentheses." Thus:

    \x (\y (x y))

@@ -233,7 +226,15 @@

abbreviates:

    ((\x (\y (x y))) x)

-We didn't have to insert any parentheses around the inner body of `\y. (x y)` because they were already there.
+The outermost parentheses were added because we have an application. `(\x. \y.
+...)` became `(\x (\y. ...))` because of the rule for dots. We didn't
+insert any parentheses around the inner body of `\y. (x y)` because they were
+already there. That is, in expressions of the form `\y. (...)`, the dot abbreviates
+nothing. It's harmless to write such a dot, though, and it can be conceptually
+helpful especially in light of the next convention.
+
+Similarly, we permit `\x. x`, which is shorthand for `\x x`, not for `\x (x)`, which
+our syntax forbids. (The [[lambda evaluator|/code/lambda_evaluator]] however tolerates such expressions.)

**Merging lambdas** An expression of the form `(\x (\y M))`, or equivalently, `(\x. \y. M)`, can be abbreviated as:

@@ -252,7 +253,7 @@

function? One popular answer is that a function can be represented by a
set of ordered pairs. This set is called the **graph** of the function. If
the ordered pair `(a, b)` is a member of the graph of `f`, that means that
`f` maps the argument `a` to the value `b`. In
-symbols, `f: a` ↦ `b`, or `f (a) == b`.
+symbols, `f: a` ↦ `b`, or `f (a) == b`.

In order to count as a *function* (rather than as merely a more general
*relation*), we require that the graph not contain two
@@ -262,10 +263,10 @@
a unique result.

The lambda calculus seems to be wonderfully well-suited for
representing functions. In fact, the untyped
-lambda calculus is Turing Complete (see [[!wikipedia Turing Completeness]]):
+lambda calculus is Turing Complete (see [[!wikipedia Turing completeness]]):
all (recursively computable) functions can be represented by lambda
terms. Which, by most people's lights, means that all functions we can "effectively decide" ---
-that is, compute in a mechanical way without requiring any ingenuity or insight ---
+that is, always apply in a mechanical way without requiring any ingenuity or insight, and be guaranteed of a correct answer after some finite number of steps ---
can be represented by lambda terms. As we'll see, though, it will be fun
(that is, not straightforward) unpacking how these things can be so "represented."

@@ -275,7 +276,7 @@

For some lambda terms, it is easy to see what function they represent:

simply returns `M`: `((\x x) M) ~~> M`.
> `(\x (x x))` duplicates its argument (applies it to itself):
-`((\x (x x)) M) ~~> (M M)`
+`((\x (x x)) M) ~~> (M M)`

> `(\x (\y (y x)))` reorders its two arguments:
`(((\x (\y (y x))) M) N) ~~> (N M)`

@@ -288,7 +289,7 @@

lambda calculus, note that duplicating, reordering, and deleting
elements is all that it takes to simulate the behavior of a general
word processing program. That means that if we had a big enough lambda
term, it could take a representation of *Emma* as input and
-produce *Hamlet* as a result.
+produce *Hamlet* as a result.

Some of these functions are so useful that we'll give them special
names. In particular, we'll call the identity function `(\x x)`

@@ -306,7 +307,7 @@

and:

    (\z z)

both represent the same function, the identity function. However, we said
-(FIXME in the advanced notes) that we would be regarding these expressions as
+[[in the advanced notes|week2_lambda_advanced]] that we would be regarding these expressions as
syntactically equivalent, so they aren't yet really examples of *distinct*
lambda expressions representing a single function. However, all three of these
are distinct lambda expressions:

@@ -317,7 +318,7 @@

    (\z z)

-yet when applied to any argument M, all of these will always return M. So they
+yet when applied to any argument `M`, all of these will always return `M`. So they
have the same extension. It's also true, though you may not yet be in a
position to see, that no other function can differentiate between them when
they're supplied as an argument to it. However, these expressions are all

@@ -346,8 +347,7 @@

embodied in the lambda calculus.

mathematical conception permits many *uncomputable* functions, but the
lambda calculus can't express those.

-2. More problematically, some lambda terms express "functions" that can take
-themselves as arguments. If we wanted to represent that set-theoretically, and
+2. More problematically, lambda terms express "functions" that can *take themselves* as arguments. If we wanted to represent that set-theoretically, and
identified functions with their extensions, then we'd have to have some
extension that contained (an ordered pair containing) itself as a member. Which
we're not allowed to do in mainstream set-theory. But in the lambda calculus
this is permitted and common --- and in fact will turn out to be indispensable.

@@ -357,8 +357,9 @@

    (\x x) (\x x)

- This is a redex that reduces to the identity function (of course). We can
-apply the **K** function to another argument and itself:
+ This is a redex that reduces to the identity function (of course).
+
+ We can apply the **K** function to another argument and itself:

> `K z K`

@@ -376,7 +377,7 @@

In fact it *does* turn out to be possible to represent the Lambda
Calculus set-theoretically. But not in the straightforward way that identifies
functions with their graphs. For years, it wasn't known whether it would be possible to do this. But then
[[!wikipedia Dana Scott]] figured out how to do it in late 1969, that is,
-he formulated the first "denotational semantics" for this lambda calculus.
+he formulated the first "denotational semantics" for this Lambda Calculus.
Scott himself had expected that this wouldn't be possible to do. He argued for
its unlikelihood in a paper he wrote only a month before the discovery.

@@ -413,11 +414,15 @@

also count these functions:

as equivalent. This additional move is called **eta-reduction**. It's
crucial to eta-reduction that the outermost variable binding in the
abstract we begin with (here, `\x`) be of a variable that occurs free
-exactly once in the body of that abstract, and that it be in the
-rightmost position.
+exactly once in the body of that abstract, and that that free occurrence be the rightmost outermost constituent.
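The self-application point above can be seen concretely with ordinary Python functions (our sketch, not the notes' own code): Python functions, like lambda terms and unlike graphs-as-sets, can take themselves as arguments without any set-theoretic trouble.

```python
# Our illustration (not from the notes) of self-application using Python
# lambdas as stand-ins for the identity function and the K function.

I = lambda x: x              # the identity function (\x x)
K = lambda x: lambda y: x    # the K function (\x (\y x))

assert I(I) is I             # (\x x) (\x x): I applied to itself yields I back
assert K("z")(K) == "z"      # K applied to another argument and to itself
assert K(K)(I) is K          # K can even take itself as its first argument
```

Of course Python evaluates eagerly rather than by the lambda calculus's reduction rules, so this only models the behavior of these particular terms, not reduction in general.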
-In the extended proof theory/theories we get be permitting eta-reduction/conversion
-as well as beta-reduction, *all computable functions with the same
+The expression:
+
+    (\x (\y (y x)))
+
+can't be eta-reduced, because the rightmost outermost constituent is not `x` but `(\y (y x))`.
+
+In the extended proof theory/theories we get by permitting eta-reduction/conversion as well as beta-reduction, *all computable functions with the same
extension do turn out to be equivalent*, that is, convertible.

However, we still shouldn't assume we're working with functions
@@ -428,18 +433,20 @@
reasons sketched above.

## The analogy with `let` ##

In our basic functional programming language, we used `let`
-expressions to assign values to variables. For instance, 
+expressions to assign values to variables. For instance,

    let x match 2
-       in (x, x)
+       in (x, x)

-evaluates to the ordered pair (2, 2). It may be helpful to think of
+evaluates to the ordered pair `(2, 2)`. It may be helpful to think of
a redex in the lambda calculus as a particular sort of `let`
construction.

-    ((\x BODY) ARG) is analogous to
+    ((\x BODY) ARG)
+
+is analogous to

-    let x match ARG
+    let x match ARG
     in BODY

This analogy should be treated with caution. For one thing, our `letrec`
@@ -461,7 +468,7 @@
calculus, an abstract such as `(\x (x x))` is perfectly well-formed and
coherent, but it is not possible to write a `let` expression that does
not have an `ARG`. That would be like:

-    `let x match` *missing*
+    `let x match` *missing*
     `in x x`

Nevertheless, the correspondence is close enough that it can guide our
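The `let` analogy can be made concrete in Python (our illustration; the notes' `let x match ... in ...` belongs to their own toy language): a redex `((\x BODY) ARG)` behaves like binding `x` to `ARG` and then evaluating `BODY`, while a bare abstract is well-formed on its own and simply waits for an argument.

```python
# A redex ((\x BODY) ARG), modeled as a Python lambda applied immediately.
# This mirrors:  let x match 2 in (x, x)
result = (lambda x: (x, x))(2)
assert result == (2, 2)         # evaluates to the ordered pair (2, 2)

# An abstract without an argument is fine on its own, unlike a `let` whose
# ARG is missing; it just waits to be applied.
pair_up = lambda x: (x, x)
assert pair_up(3) == (3, 3)
```

The caution from the notes still applies: Python's binding constructs, like `letrec`, carry extra machinery (assignment, recursion) that a plain redex does not.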