+This is not because there is any difficulty typing what the functions
+involved do "from the outside": for instance, the predecessor function
+is a function from numbers to numbers, or τ -> τ, where τ
+is our type for Church numbers (i.e., (σ -> σ) -> σ
+-> σ). (Though this type will only be correct if we decide that
+the predecessor of zero should be a number, perhaps zero.)
+
+Rather, the problem is that the definition of the function requires
+subterms that can't be simply-typed. We'll illustrate with our
+implementation of the predecessor function, based on the discussion in
+Pierce 2002:547:
+
+ let zero = \s z. z in
+ let fst = \x y. x in
+ let snd = \x y. y in
+ let pair = \x y . \f . f x y in
+ let succ = \n s z. s (n s z) in
+ let shift = \p. pair (succ (p fst)) (p fst) in
+ let pred = \n. n shift (pair zero zero) snd in
+
+Note that `shift` takes a pair `p` as argument, but makes use of only
+the first element of the pair. Why does it do that? In order to
+understand what this code is doing, it is helpful to go through a
+sample computation, the predecessor of 3:
+
+ pred 3
+ 3 shift (pair zero zero) snd
+ (\s z.s(s(s z))) shift (pair zero zero) snd
+ shift (shift (shift (\f.f 0 0))) snd
+ shift (shift (pair (succ ((\f.f 0 0) fst)) ((\f.f 0 0) fst))) snd
+ shift (shift (\f.f 1 0)) snd
+ shift (\f. f 2 1) snd
+ (\f. f 3 2) snd
+ snd 3 2
+ 2
+
+At each stage, `shift` sees an ordered pair that contains two numbers
+related by the successor function. It can safely discard the second
+element without losing any information. The reason we carry around
+the second element at all is that when it comes time to complete the
+computation---that is, when we finally apply the top-level ordered
+pair to `snd`---it's the second element of the pair that will serve as
+the final result.
+
+Let's see how far we can get typing these terms. `zero` is the Church
+encoding of zero. Using `N` as the type for Church numbers (i.e.,
+<code>N ≡ (σ -> σ) -> σ -> σ</code> for
+some σ, `zero` has type `N`. `snd` takes two numbers, and
+returns the second, so `snd` has type `N -> N -> N`. Then the type of
+`pair` is `N -> N -> (type(snd)) -> N`, that is, `N -> N -> (N -> N ->
+N) -> N`. Likewise, `succ` has type `N -> N`, and `shift` has type
+`pair -> pair`, where `pair` is the type of an ordered pair of
+numbers, namely, <code>pair ≡ (N -> N -> N) -> N</code>. So far
+so good.
+
+The problem is the way in which `pred` puts these parts together. In
+particular, `pred` applies its argument, the number `n`, to the
+`shift` function. Since `n` is a number, its type is <code>(σ
+-> σ) -> σ -> σ</code>. This means that the type of
+`shift` has to match <code>σ -> σ</code>. But we
+concluded above that the type of `shift` also had to be `pair ->
+pair`. Putting these constraints together, it appears that
+<code>σ</code> must be the type of a pair of numbers. But we
+already decided that the type of a pair of numbers is `(N -> N -> N)
+-> N`. Here's the difficulty: `N` is shorthand for a type involving
+<code>σ</code>. If <code>σ</code> turns out to depend on
+`N`, and `N` depends in turn on <code>σ</code>, then
+<code>σ</code> is a proper subtype of itself, which is not
+allowed in the simply-typed lambda calculus.
+
+The way we got here is that the `pred` function relies on the built-in
+right-fold structure of the Church numbers to recursively walk down
+the spine of its argument. In order to do that, the argument had to
+apply to the `shift` operation. And since `shift` had to be the
+sort of operation that manipulates numbers, the infinite regress is
+established.
+
+Now, of course, this is only one of myriad possible implementations of
+the predecessor function in the lambda calculus. Could one of them
+possibly be simply-typeable? It turns out that this can't be done.
+See Oleg Kiselyov's discussion and works cited there for details:
+[[predecessor and lists can't be represented in the simply-typed
+lambda
+calculus|http://okmij.org/ftp/Computation/lambda-calc.html#predecessor]].