+Armed with the encoding we developed for lists above, Church's method for encoding numbers in the Lambda Calculus is very natural. But this is not the order that these ideas were historically developed.
+
+It's noteworthy that when he was developing the Lambda Calculus, Church had not predicted it would be possible to encode the numbers. It came as a welcome surprise.
+
+The idea here is that a number is something like a list, only without any slot for an interesting, possibly varying head element at each level. Whereas in a list like:
+
+ [a, b, c]
+
+we have:
+
+ head=a, sublist=(head=b, sublist=(head=c, sublist=[]))
+
+with the number 3 we instead only have:
+
+ head=(), subnumber=(head=(), subnumber=(head=(), subnumber=zero))
+
+where `()` is the zero-length tuple, the one and only member of its type. That is, the head here is a type of thing that cannot interestingly vary. Or we could just eliminate the head, and have:
+
+ subnumber=(subnumber=(subnumber=zero))
+
+probably more familiar to you as:
+
+ succ (succ (succ zero))
+
+In other words, where a list is inductively built up out of some interesting new datum and its "sublist" or tail, a number is inductively built up out of no interesting new datum and its "subnumber" or predecessor. In this light, a number can be seen as a kind of stunted or undernourished list.
+
+Now, we had a strategy for encoding the list `[a, b, c]` as:
+
+ \f z. f a (f b (f c z))
+
+So perhaps we can apply the same kind of idea to the number 3, and encode it as:
+
+ \f z. f (f (f z))
+
+In fact, there's a way of looking at this that makes it look incredibly natural. You may have encountered the idea of a composition of two functions `f` and `g`, written as <code>f ○ g</code>, or in Kapulet as `f comp g`. This is defined to be:
+
+ \x. f (g x)
+
+The composition of a function `f` with itself, namely:
+
+ \x. f (f x)
+
+is sometimes abbreviated as <code>f<sup>2</sup></code>. Similarly, <code>f ○ f ○ f</code> or `\x. f (f (f x))` is sometimes abbreviated as <code>f<sup>3</sup></code>. <code>f<sup>1</sup></code> is understood to be just `f` itself, and <code>f<sup>0</sup></code> is understood to be the identity function.
+
+So in proposing to encode the number 3 as:
+
+ \f z. f (f (f z))
+
+we are proposing to encode it as:
+
+<code>\f z. f<sup>3</sup> z</code>
+
+(The <code>f<sup><em>n</em></sup></code> isn't some bit of new Lambda Calculus syntax. It's just our metalanguage abbreviation for the same expressions we had before. It does help focus our attention on what we're essentially doing here.)
+
+And indeed this is the Church encoding of the numbers:
+
+<code>0 ≡ \f z. I z ; or \f z. f<sup>0</sup> z</code>
+<code>1 ≡ \f z. f z ; or \f z. f<sup>1</sup> z</code>
+<code>2 ≡ \f z. f (f z) ; or \f z. f<sup>2</sup> z</code>
+<code>3 ≡ \f z. f (f (f z)) ; or \f z. f<sup>3</sup> z</code>
+<code>...</code>
+
+The encoding for `0` can also be written as `\f z. z`, which we've also proposed as the encoding for `[]` and for `false`. Don't read too much into this.
+
+Given the above, can you figure out how to define the `succ` function? We already worked through the definition of `cons`, and this is just a simplification of that, so you should be able to do it. We'll make it a homework.
+
+We saw that using the `cons` operation and `[]` as arguments to `fold_right` over a list gave us back that very same list. In the Lambda Calculus, that would come down to:
+
+ [a, b, c] cons []
+
+with the appropriate lambda terms substituted throughout, would just reduce to the same lambda term we used to encode `[a, b, c]`. What do you think this would reduce to:
+
+ 3 succ 0
+
+with the appropriate lambda terms substituted throughout? Since 3 is `\f z. f (f (f z))`, it should reduce to whatever:
+
+ succ (succ (succ 0))
+
+does. And if we define `succ` properly, we should expect that to give us the same lambda term that we used to encode `3` in the first place.
+