15. Define a `neg` operator that negates `true` and `false`.
Expected behavior:
(((neg true) 10) 20)
evaluates to `20`, and
(((neg false) 10) 20)
8. Recall our proposed encoding for the numbers, called "Church's encoding". As we explained last week, it's similar to our proposed encoding of lists in terms of their folds. In last week's homework, you defined `succ` for numbers so encoded. Can you now define `pred` in the Lambda Calculus? Let `pred 0` result in whatever `err` is bound to. This is challenging. For some time theorists weren't sure it could be done. (Here is [some interesting commentary](http://okmij.org/ftp/Computation/lambda-calc.html#predecessor).) However, in this week's notes we examined one strategy for defining `tail` for our chosen encodings of lists, and given the similarities we explained between lists and numbers, perhaps that will give you some guidance in defining `pred` for numbers.
9. Define `leq` for numbers (that is, ≤) in the Lambda Calculus. Here is the expected behavior,
where `one` abbreviates `succ zero`, and `two` abbreviates `succ (succ zero)`.
leq zero zero ~~> true
Calculus. Call it Skomega.
24. Are there evaluation strategies in CL corresponding to leftmost
reduction and rightmost reduction in the lambda calculus?
What counts as a redex in CL?
25. Consider the CL term K I Skomega. Does leftmost (alternatively,
rightmost) evaluation give results similar to the behavior of K I
<pre>
((\x.x)((\y.y) z))
/ \
/ \
/ \
/ \
((\y.y) z) ((\x.x) z)
first and then head east, you end up in the same place as if you walk
east and then head uptown.
But which lambda you target has implications for efficiency and for
termination. (Later in the course, it will have implications for
the order in which side effects occur.)
\ /
\ /
\ /
w
</pre>
If a function discards its argument (as `\x.w` does), it makes sense
to target that function for reduction, rather than wasting effort
(((\y.y) z)((\y.y) z) ((\x.xx) z)
/ | /
/ (((\y.y)z)z) /
/ | /
/ | /
/ | /
(z ((\y.y)z)) | /
\ | /
-----------.---------
|
If we reduce the rightmost lambda first (rightmost branch of the
diagram), the argument is already simplified before we do the
copying. We arrive at the normal form (i.e., the form that cannot be
further reduced) in two steps.
But if we reduce the leftmost lambda first (the two leftmost branches
of the diagram), we copy the argument before it has been evaluated.
complex expression `3 + 4` evaluates to `7`. The evaluation of the
expression consists in computing a sum. There is a clear sense in which the
expression `7` is simpler than the expression `3 + 4`: `7` is
syntactically simple, and `3 + 4` is syntactically complex.
Now let's take this folk notion of computation, and put some pressure
on it.
##Church arithmetic##
Is the final result simpler? This time, the answer is not so straightforward.
Compare the starting expression with the final expression:
    * 3 4
(\lrf.l(rf))(\fz.f(f(fz)))(\fz.f(f(f(fz))))
~~> 12
(\fz.f(f(f(f(f(f(f(f(f(f(f(fz))))))))))))
And if we choose different numbers, the result is even less clear:
    * 3 6
(\lrf.l(rf))(\fz.f(f(fz)))(\fz.f(f(f(f(f(fz))))))
~~> 18
(\fz.f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(fz))))))))))))))))))
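These multiplications can be checked mechanically. Here is a minimal sketch in Python notation (Python lambdas stand in for the lambda terms; the decoder `church_to_int` is a helper introduced only for inspecting results, not part of the encoding itself):

```python
# Church numeral n: takes f and z, and applies f to z exactly n times.
zero = lambda f: lambda z: z
succ = lambda n: lambda f: lambda z: f(n(f)(z))

# Multiplication, the \lrf.l(rf) from the text:
# iterate r's f-iterator l times, giving l*r applications of f.
mult = lambda l: lambda r: lambda f: l(r(f))

# Helper: decode a Church numeral into an ordinary Python int.
def church_to_int(n):
    return n(lambda x: x + 1)(0)

three = succ(succ(succ(zero)))
four = succ(three)
six = succ(succ(four))

print(church_to_int(mult(three)(four)))  # 12
print(church_to_int(mult(three)(six)))   # 18
```

The decoded results confirm the long chains of `f`s above: twelve applications in the first case, eighteen in the second.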
In the arithmetic example, there is only one number that corresponds
to the sum of 3 and 4 (namely, 7). But there are many sums that add
up to 7: 3+4, 4+3, 5+2, 2+5, 6+1, 1+6, etc.
So the unevaluated expression contains information that is missing
from the evaluated value: information about *how* that value was
In this example, reduction returns the exact same lambda term. There
is no simplification at all.
    (\x.xxx)(\x.xxx) ~~> ((\x.xxx)(\x.xxx)(\x.xxx))
Even worse, in this case, the "reduced" form is longer and more
complex by any measure.
In Scheme:
(letrec [(get_length
(lambda (lst) (if (null? lst) 0 [+ 1 (get_length (cdr lst))] )) )]
(get_length (list 20 30)))
; this evaluates to 2

If you instead use an ordinary `let` (or `let*`), here's what would happen, in OCaml:
let get_length = fun lst ->
    if lst = [] then 0 else 1 + get_length (List.tl lst) in
    get_length [20; 30]
(* fails with error "Unbound value get_length" *)
Here's Scheme:
(let* [(get_length
(lambda (lst) (if (null? lst) 0 [+ 1 (get_length (cdr lst))] )) )]
(get_length (list 20 30)))
; fails with error "reference to undefined identifier: get_length"
2. If you tried this in Scheme:
(define get_length
(lambda (lst) (if (null? lst) 0 [+ 1 (get_length (cdr lst))] )) )

(get_length (list 20 30))
You'd find that it works! This is because `define` in Scheme is really shorthand for `letrec`, not for plain `let` or `let*`. So we should regard this as cheating, too.
isn't well-defined. The point of interest here is that its definition
requires recursion in the function definition.)
Neither do the resources we've so far developed suffice to define the
[[!wikipedia Ackermann function]]:
A(m,n) =
###Fixed points###
In general, we call a **fixed point** of a function f any value *x*
such that f <em>x</em> is equivalent to *x*. For example,
consider the squaring function `square` that maps natural numbers to their squares.
`square 2 = 4`, so `2` is not a fixed point. But `square 1 = 1`, so `1` is a
fixed point of the squaring function.
There are many beautiful theorems guaranteeing the existence of a
fixed point for various classes of interesting functions. For
You should be able to immediately provide a fixed point of the
identity combinator I. In fact, you should be able to provide a whole
bunch of distinct fixed points.
With a little thought, you should be able to provide a fixed point of
the false combinator, KI. Here's how to find it: recall that KI
At this point, we have a definition of the length function, though
it's not complete, since we don't know what value to use for the
symbol `length`. Technically, it has the status of an unbound
variable.
Imagine now binding the mysterious variable:
the argument we supply is already the length function we are trying to
define. (Dehydrated water: to reconstitute, just add water!)
But this is just another way of saying that we are looking for a fixed point.
Assume that `h` has a fixed point, call it `LEN`. To say that `LEN`
is a fixed point means that
h LEN <~~> LEN
But this means that
(\list . if empty list then zero else add one (LEN (tail list))) <~~> LEN
The strategy we will present will turn out to be a general way of
finding a fixed point for any lambda term.
##Deriving Y, a fixed point combinator##
How shall we begin? Well, we need to find an argument to supply to
`h`. The argument has to be a function that computes the length of a
length of a list. Let's try applying `h` to itself. It won't quite
work, but examining the way in which it fails will lead to a solution.
h h <~~> \list . if empty list then zero else 1 + h (tail list)
There's a problem. The diagnosis is that in the subexpression `h
(tail list)`, we've applied `h` to a list, but `h` expects as its
first argument the length function.
So let's adjust h, calling the adjusted function H:
H H <~~> (\h \list . if empty list then zero else 1 + ((h h) (tail list))) H
<~~> \list . if empty list then zero else 1 + ((H H) (tail list))
== \list . if empty list then zero else 1 + ((\list . if empty list then zero else 1 + ((H H) (tail list))) (tail list))
<~~> \list . if empty list then zero
else 1 + (if empty (tail list) then zero else 1 + ((H H) (tail (tail list))))

We're in business!
How does the recursion work?
We've defined `H` in such a way that `H H` turns out to be the length function.
In order to evaluate `H H`, we substitute `H` into the body of the
lambda term. Inside the lambda term, once the substitution has
Works!
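The derivation can be checked in Python notation. One caveat: Python evaluates arguments eagerly, so Y itself would loop forever; the sketch below uses Z, the standard eta-expanded variant of Y for strict languages. The list operations `lst == []` and `lst[1:]` stand in for `empty` and `tail`:

```python
# Z: a fixed-point combinator that works under eager evaluation.
# Z = \f. (\h. f (\v. h h v)) (\h. f (\v. h h v))
Z = lambda f: (lambda h: f(lambda v: h(h)(v)))(lambda h: f(lambda v: h(h)(v)))

# h from the text: expects the length function as its first argument,
# and returns the length function built in terms of it.
h = lambda length: lambda lst: 0 if lst == [] else 1 + length(lst[1:])

get_length = Z(h)   # the fixed point of h
print(get_length([20, 30]))   # 2
```

Delaying the self-application behind `lambda v: ...` is exactly what keeps the strict evaluator from unfolding `h h` forever before any list arrives.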
##What is a fixed point for the successor function?##
Well, you might think, only some of the formulas that we might give to the `successor` as arguments would really represent numbers. If we said something like:
sink true true false ~~> I
sink true true true false ~~> I
So we make `sink = Y (\f b. b f I)`:
1. sink false
2. Y (\fb.bfI) false
3. (\f. (\h. f (h h)) (\h. f (h h))) (\fb.bfI) false
4. (\h. [\fb.bfI] (h h)) (\h. [\fb.bfI] (h h)) false
Now we try the next most complex example:
1. sink true false
2. Y (\fb.bfI) true false
3. (\f. (\h. f (h h)) (\h. f (h h))) (\fb.bfI) true false
4. (\h. [\fb.bfI] (h h)) (\h. [\fb.bfI] (h h)) true false
You should be able to see that `sink` will consume as many `true`s as
we throw at it, then turn into the identity function after it
encounters the first `false`.
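This behavior can be spot-checked in Python notation (a sketch; Z, the eager-evaluation variant of Y, stands in for Y, since Python would loop on Y itself; the booleans are the Church encodings from the text, and the final `42` is just a probe to show that the identity function came out the other end):

```python
# Z, the strict-language stand-in for Y.
Z = lambda f: (lambda h: f(lambda v: h(h)(v)))(lambda h: f(lambda v: h(h)(v)))

true = lambda x: lambda y: x    # \x y. x
false = lambda x: lambda y: y   # \x y. y
I = lambda x: x

# sink = Y (\f b. b f I), with Z standing in for Y.
sink = Z(lambda f: lambda b: b(f)(I))

print(sink(false)(42))                # 42: sink false ~~> I
print(sink(true)(false)(42))          # 42: one true consumed, then I
print(sink(true)(true)(false)(42))    # 42: two trues consumed, then I
```

Each `true` selects the regenerated `sink` and discards `I`; the first `false` does the opposite, selecting `I` and discarding `sink`.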
The key to the recursion is that, thanks to Y, the definition of
`sink` contains within it the ability to fully regenerate itself as
Without pretending to give a serious analysis of the paradox, let's
assume that sentences can have for their meaning boolean functions
like the ones we have been working with here. Then the sentence *John
is John* might denote the function `\x y. x`, our `true`.
Then (1) denotes a function from whatever the referent of *this
sentence* is to a boolean. So (1) denotes `\f. f true false`, where
Y C
(\f. (\h. f (h h)) (\h. f (h h))) C
(\h. C (h h)) (\h. C (h h))
C ((\h. C (h h)) (\h. C (h h)))
C (C ((\h. C (h h))(\h. C (h h))))
C (C (C ((\h. C (h h))(\h. C (h h)))))
One obstacle to thinking this through is the fact that a sentence
normally has only two truth values. We might consider instead a noun
phrase such as
(3) the entity that this noun phrase refers to
The chameleon nature of (3), by the way (a description that is equally
good at describing any object), makes it particularly well suited as a
gloss on pronouns such as *it*. In the system of
[Jacobson 1999](http://www.springerlink.com/content/j706674r4w217jj5/),
pronouns denote (you guessed it!) identity functions...
a set of ordered pairs. This set is called the **graph** of the
function. If the ordered pair `(a, b)` is a member of the graph of `f`,
that means that `f` maps the argument `a` to the value `b`. In
symbols, `f: a` ↦ `b`, or `f (a) == b`.
In order to count as a *function* (rather
than as merely a more general *relation*), we require that the graph not contain two
elements is all that it takes to simulate the behavior of a general
word processing program. That means that if we had a big enough
lambda term, it could take a representation of *Emma* as input and
produce *Hamlet* as a result.
Some of these functions are so useful that we'll give them special
names. In particular, we'll call the identity function `(\x x)`
## The analogy with `let` ##
In our basic functional programming language, we used `let`
expressions to assign values to variables. For instance,
let x match 2
in (x, x)
evaluates to the ordered pair `(2, 2)`. It may be helpful to think of
a redex in the lambda calculus as a particular sort of `let`
is analogous to
let x match ARG
in BODY
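In Python notation the correspondence is immediate (a sketch; the Python tuple plays the role of our ordered pair):

```python
# let x match 2 in (x, x)   corresponds to   ((\x. (x, x)) 2)
pair = (lambda x: (x, x))(2)
print(pair)   # (2, 2)
```

Supplying the argument to the lambda is exactly the `match` step: the bound variable takes on the argument's value throughout the body.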
This analogy should be treated with caution. For one thing, our `letrec`
everyone hit himself
S/(S!NP) (S!NP)/NP (S!NP)!((S!NP)/NP)
\fAx[fx] \y\z[HIT y z] \h\u[huu]
---------------------------------
S!NP \u[HIT u u]
--------------------------------------------
Here, "A" is our crude markdown approximation of the universal quantifier.
Notice that the semantic value of *himself* is exactly `W`.
The reflexive pronoun in direct object position combines with the transitive verb. The result is an intransitive verb phrase that takes a subject argument, duplicates that argument, and feeds the two copies to the transitive verb meaning.
Note that `W <~~> S(CI)`:
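This equivalence can be spot-checked extensionally in Python notation (a sketch; here C is assumed to be the flip combinator `\f x y. f y x`, and the sample `h` and argument are arbitrary probes):

```python
S = lambda f: lambda g: lambda x: f(x)(g(x))
C = lambda f: lambda x: lambda y: f(y)(x)   # assumed: the flip combinator
I = lambda x: x
W = lambda h: lambda u: h(u)(u)             # the duplicator, \h u. h u u

# Check that W h u and S (C I) h u agree on a sample h and u:
h = lambda a: lambda b: (a, b)
print(W(h)(7))              # (7, 7)
print(S(C(I))(h)(7))        # (7, 7)
```

Tracing the right-hand side: `S (C I) h u` reduces to `C I u (h u)`, which flips to `I (h u) u`, i.e. `h u u`, the same duplication W performs directly.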
###A different set of reduction rules###
Ok, here comes a shift in thinking. Instead of defining combinators as equivalent to certain lambda terms,
we can define combinators by what they do. If we have the I combinator followed by any expression X,
I will take that expression as its argument and return that same expression as the result. In pictures,
IX ~~> X
KXY ~~> X
That is, K throws away its second argument. The reduction rule for S can be constructed by examining
the defining lambda term:
<pre><code>S ≡ \fgx.fx(gx)</code></pre>
SFGX ~~> FX(GX)
If the meaning of a function is nothing more than how it behaves with respect to its arguments,
these reduction rules capture the behavior of the combinators S, K, and I completely.
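The three rules transcribe directly into code. Here is a sketch in Python notation, treating each combinator as a curried function and letting Python's function application do the work of reduction:

```python
I = lambda x: x                               # IX ~~> X
K = lambda x: lambda y: x                     # KXY ~~> X
S = lambda f: lambda g: lambda x: f(x)(g(x))  # SFGX ~~> FX(GX)

# K throws away its second argument:
print(K('first')('second'))   # first

# SKK behaves like I: SKKX ~~> KX(KX) ~~> X
print(S(K)(K)('anything'))    # anything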
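The three rules transcribe directly into code. Here is a sketch in Python notation, treating each combinator as a curried function and letting Python's function application do the work of reduction:

```python
I = lambda x: x                               # IX ~~> X
K = lambda x: lambda y: x                     # KXY ~~> X
S = lambda f: lambda g: lambda x: f(x)(g(x))  # SFGX ~~> FX(GX)

# K throws away its second argument:
print(K('first')('second'))   # first

# SKK behaves like I: SKKX ~~> KX(KX) ~~> X
print(S(K)(K)('anything'))    # anything
```

Each `print` exercises one reduction rule; nothing beyond plain application is needed.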
We can use these rules to compute without resorting to beta reduction.
For instance, we can show how the I combinator is equivalent to a
certain crafty combination of Ss and Ks:
form). It's worth noting that the reduction rules for Combinatory
Logic are considerably simpler than, say, beta reduction. Also, since
there are no variables in Combinatory Logic, there is no need to worry
about variable collision.
Combinatory Logic is what you have when you choose a set of combinators and regulate their behavior with a set of reduction rules. As we said, the most common system uses S, K, and I as defined here.
since we've just seen that we can simulate I using only S and K. In
order to get an intuition about what it takes to be Turing complete,
recall our discussion of the lambda calculus in terms of a text editor.
A text editor has the power to transform any arbitrary text into any other arbitrary text. The way it does this is by deleting, copying, and reordering characters. We've already seen that K deletes its second argument, so we have deletion covered. S duplicates and reorders, so we have some reason to hope that S and K are enough to define arbitrary functions.
We've already established that the behavior of combinatory terms can
be perfectly mimicked by lambda terms: just replace each combinator
with its equivalent lambda term, i.e., replace I with `\x.x`, replace
K with `\fxy.x`, and replace S with `\fgx.fx(gx)`. So the behavior of
any combination of combinators in Combinatory Logic can be exactly
reproduced by a lambda term.
How about the other direction? Here is a method for converting an
arbitrary lambda term into an equivalent Combinatory Logic term using
5. [\a.(M N)] = S[\a.M][\a.N]
6. [\a\b.M] = [\a[\b.M]]
It's easy to understand these rules based on what S, K and I do. The first rule says
that variables are mapped to themselves.
The second rule says that the way to translate an application is to translate the
first element and the second element separately.
first element and the second element separately.
The third rule should be obvious.
The fourth rule should also be fairly self-evident: since what a lambda term such as `\x.y` does is throw away its first argument and return `y`, that's exactly what the combinatory logic translation should do. And indeed, `Ky` is a function that throws away its argument and returns `y`.
The fifth rule deals with an abstract whose body is an application: the S combinator takes its next argument (which will fill the role of the original variable a) and copies it, feeding one copy to the translation of \a.M, and the other copy to the translation of \a.N. This ensures that any free occurrences of a inside M or N will end up taking on the appropriate value. Finally, the last rule says that if the body of an abstract is itself an abstract, translate the inner abstract first, and then do the outermost. (Since the translation of [\b.M] will not have any lambdas in it, we can be sure that we won't end up applying rule 6 again in an infinite loop.)
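The whole translation algorithm is short enough to transcribe. Here is a sketch in Python notation (the term representation — strings for variables and combinator names, tagged tuples for applications and abstractions — is my own choice for illustration, not from the text):

```python
# Terms: a variable or combinator is a string; ('app', M, N) is an
# application; ('lam', a, M) is the abstraction \a.M.
# Variables are lowercase, so 'S', 'K', 'I' never clash with them.

def free_in(a, t):
    """Does variable a occur free in term t?"""
    if isinstance(t, str):
        return t == a
    if t[0] == 'app':
        return free_in(a, t[1]) or free_in(a, t[2])
    return t[1] != a and free_in(a, t[2])       # abstraction binds t[1]

def translate(t):
    if isinstance(t, str):
        return t                                          # rule 1: [a] = a
    if t[0] == 'app':
        return ('app', translate(t[1]), translate(t[2]))  # rule 2
    a, body = t[1], t[2]
    if body == a:
        return 'I'                                        # rule 3: [\a.a] = I
    if not free_in(a, body):
        return ('app', 'K', translate(body))              # rule 4: [\a.M] = K[M]
    if body[0] == 'app':                                  # rule 5: S splits the app
        return ('app',
                ('app', 'S', translate(('lam', a, body[1]))),
                translate(('lam', a, body[2])))
    return translate(('lam', a, translate(body)))         # rule 6: inner abstract first

# The boolean false \x.\y.y translates to KI, as noted below:
print(translate(('lam', 'x', ('lam', 'y', 'y'))))   # ('app', 'K', 'I')
```

Note how rule 6 is implemented: the inner abstraction is translated first, yielding a lambda-free body, so the recursive call on the outer abstraction is guaranteed to bottom out in rules 3–5.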
[Fussy notes: if the original lambda term has free variables in it, so will the combinatory logic translation. Feel free to worry about this, though you should be confident that it makes sense. You should also convince yourself that if the original lambda term contains no free variables---i.e., is a combinator---then the translation will consist only of S, K, and I (plus parentheses). One other detail: this translation algorithm builds expressions that combine lambdas with combinators. For instance, the translation of our boolean false `\x.\y.y` is `[\x[\y.y]] = [\x.I] = KI`. In the intermediate stage, we have `\x.I`, which mixes combinators in the body of a lambda abstract. It's possible to avoid this if you want to, but it takes some careful thought. See, e.g., Barendregt 1984, page 156.]
[Various, slightly differing translation schemes from combinatory
logic to the lambda calculus are also possible. These generate
Voilà: the combinator takes any X and Y as arguments, and returns Y applied to X.
One very nice property of combinatory logic is that there is no need to worry about alphabetic variance, or
variable collision---since there are no (bound) variables, there is no possibility of accidental variable capture,
and so reduction can be performed without any fear of variable collision. We haven't mentioned the intricacies of
alpha equivalence or safe variable substitution, but they are in fact quite intricate. (The best way to gain
an appreciation of that intricacy is to write a program that performs lambda reduction.)
Back to linguistic applications: one consequence of the equivalence between the lambda calculus and combinatory
logic is that anything that can be done by binding variables can just as well be done with combinators.
This has given rise to a style of semantic analysis called Variable Free Semantics (in addition to
Szabolcsi's papers, see, for instance,
Pauline Jacobson's 1999 *Linguistics and Philosophy* paper, "Towards a variable-free Semantics").

Somewhat ironically, reading strings of combinators is so difficult that most practitioners of variable-free semantics
express their meanings using the lambda-calculus rather than combinatory logic; perhaps they should call their
enterprise Free Variable Free Semantics.
Cresswell has also developed a variable-free approach of some philosophical and linguistic interest
in two books in the 1990's.
A final linguistic application: Steedman's Combinatory Categorial Grammar, where the "Combinatory" is
from combinatory logic (see especially his 2012 book, <cite>Taking Scope</cite>). Steedman attempts to build
a syntax/semantics interface using a small number of combinators, including T ≡ `\xy.yx`, B ≡ `\fxy.f(xy)`,
and our friend S. Steedman used Smullyan's fanciful bird
names for the combinators, Thrush, Bluebird, and Starling.
Many of these combinatory logics, in particular, the SKI system,
are Turing complete. In other words: every computation we know how to describe can be represented in a logical system consisting of only a single primitive operation!
The combinators K and S correspond to two well-known axioms of sentential logic: