Peano Arithmetic

We'll let the language of arithmetic be the language of first-order predicate logic with \(=\), with the extra-logical signature \((0, S, +, *, <)\). Our aim is to have \(S\) express the successor function (we'll parse \(SS0\) as \(S(S(0))\)), and the other symbols express their familiar meanings. So far we haven't done anything to secure that aim; we've just introduced some uninterpreted syntax. We'll call any sentence of this language an arithmetic sentence.

Now you know what model we intend this language to have. The domain is just \(\mathbb{N}\), the symbol \(0\) expresses the number \(0\), and so on. We'll call this the intended model of arithmetic, or \(\mathscr{N}\). Like any model of the language we're considering, \(\mathscr{N}\) will ascribe some definite truth value to every sentence in the language. Call the set of sentences that it makes true the theory of true arithmetic, or true arithmetic for short. That is, true arithmetic is \(\{\, \phi \mid [\![\, \phi \,]\!]_{\mathscr{N}} = \mathsf{True}\,\}\). Observe that for every arithmetic sentence, either that sentence or its negation will be included in true arithmetic. Hence any set of arithmetic sentences which is a proper superset of true arithmetic must be contain at least one sentence \(\chi\) and its negation, and so be deductively inconsistent and unsatisfiable.

Mathematicians like to prove things from minimal sets of axioms, so there have been various attempts to axiomatize arithmetic. The best known of these was first proposed by Dedekind and then shortly thereafter refined by Peano, and is now known as the Peano axioms. Actually this name is applied to a family of axiomatizations, not just a single one. We'll discuss that family in a moment, and settle on a particular choice. We call the set of sentences provable from the Peano axioms the theory of Peano arithmetic, or Peano arithmetic for short.

One question we may ask is whether the Peano axioms are deductively consistent. If they aren't, then every sentence will be provable from them. As you'll see when we list these axioms, they sure seem to be consistent, and most everybody believes they are consistent. But there are known obstacles to proving that they are. One thing we know is that certain natural kinds of proofs of this will not be possible; we may discuss that in later weeks. Some theorists think that certain other proofs that we do have demonstrate the consistency of the Peano axioms in an oblique way; others don't think so. Let's just assume for the purposes of discussion that the Peano axioms are consistent, and that the intended model \(\mathscr{N}\) satisfies them.

If they were inconsistent, and so proved every sentence in the language of arithmetic, they'd prove some sentences that we assume the intended model \(\mathscr{N}\) does not satisfy. That is, we assume that even if the Peano axioms are screwy in some subtle way we haven't yet identified, we still do intend some model of the language of arithmetic, which like any model refuses to satisfy \(\neg\chi\) when it satisfies \(\chi\). (Whether we do manage to intend this could also be challenged.)

Now, if \(\mathscr{N}\) does satisfy Peano arithmetic, and true arithmetic by definition includes every sentence that \(\mathscr{N}\) satisfies, then Peano arithmetic \(\subseteq \) true arithmetic. Another question we may ask is whether Peano arithmetic \(=\) true arithmetic? Or is it merely a proper subset of true arithmetic? It turns out the answer is the latter; we'll discuss why in later weeks. For the time being I just offer it as a fact. What this means is that there are some arithmetic sentences \(\psi\) that are true in \(\mathscr{N}\) (and hence \(\psi \in {}\) true arithmetic), but \(\psi \notin {}\) Peano arithmetic. But since \(\neg\psi\) isn't true in \(\mathscr{N}\), and we're assuming \(\mathscr{N}\) does satisfy all the sentences in Peano arithmetic, it follows that \(\neg\psi \notin {}\) Peano arithmetic, either. Hence there are arithmetic sentences such that neither they nor their negation are included in the theory of Peano arithmetic; in other words, neither they nor their negations are provable from the Peano axioms. As I said, we'll discuss this all later. For present, I just want to note this result that the Peano axioms will turn out to prove some but not all of the sentences of true arithmetic; that is, they will prove some but not all of the arithmetic sentences true in \(\mathscr{N}\).

There are also several weaker axiomatizations of arithmetic, that prove even fewer of the sentences of true arithmetic. Sometimes it is useful to theorize with or about these weaker axiomatizations. We may discuss some of these later. Here I just flag that the Peano axiomatization(s) are the most familiar but not the only game in town.

What do the Peano axioms say?

Well, as I said they're formulated in a predicate logic that includes \(=\). If such a logic is itself axiomatized, it will include axioms for \(=\); we discussed what those might look like in our notes on proof theory. They include an infinite set of schema instances, of the form:

\(\alpha=\beta \supset (\phi[x\leftarrow\alpha] \subset\supset \phi[x\leftarrow\beta])\)

In the present context, we could get by with only a few consequences of that set, expressing the facts that \(=\) is reflexive, symmetric, and transitive. Sometimes these are presented as part of a set of Peano axioms, themselves, rather than as being part of the proof system for the background logic. But I will assume that any axioms or rules needed for \(=\) are included in our background proof system, and don't need to be added specially as extra-logical axioms for arithmetic.

Sometimes formulations of the Peano axioms include claims like:

\(0\) is a number
\(\forall x. x\) is a number \({} \supset Sx\) is a number

However, we will treat "number" as being equivalent to "belongs to our domain." So these claims just say that \(0\) belongs to our domain (that will be true for any term constant in a mainstream, non-free logic), and that for every \(x\) in our domain, \(Sx\) is also in the domain, that is, \(S\) is interpreted as a total function (this again will be true for any functor in a mainstream, non-free logic).

We will however want to include the next axioms, which articulate some facts about \(0\) and \(S\):

[Peano-1] \(\forall x. 0 \ne Sx\), that is, \(0\) is nobody's successor
[Peano-2] \(\forall x \forall y. Sx = Sy \supset x = y\), that is, \(S\) is injective

Next, we want some claims involving \(<\) to be true. For instance, it should come out true that \(\forall x. x=0 \vee 0<x\); and so on. But in fact, none of these need to be added as axioms. Instead we can just define \(x < y\) to mean:

\(\forall z. x \ne y+z\),

or alternatively:

\(\exists z. x + Sz = y\),

in the same way that Sider defines \(\phi\vee\psi\) to mean \(\neg\phi\supset \psi\). (Observe that these all are direct, non-recursive definitions.) It will then turn out that the results we desire for \(<\) will follow from our other axioms. On this strategy, the language of arithmetic strictly speaking won't include the predicate \(<\); that will instead be part of our unofficial metalanguge shorthand.

What about \(+\) and \(*\)? Well, informally we can understand these as "defined" like this:

[Peano-3] \(x+0 = x\)
[Peano-4] \(x+Sy = S(x+y)\)

and:

[Peano-5] \(x*0 = 0\)
[Peano-6] \(x*Sy = (x*y) + x\)

But note that those are recursive definitions. In second-order logic, there will be translations of them into direct, non-recursive formulas, where the defining formulas don't include any occurrences of the terms being defined. We could then treat those definitions the same way we're treating the definition of \(<\): wherever the defined term appears in a formula, we suppose that to be shorthand for the relevant defining expression. In first-order logic, however, we can't do this. Instead, we have to add (the universal generalizations of) the preceding four claims to our set of axioms. That's why I labeled them as I did.

Now there are other facts about addition and multiplication you might think of including, such as that they're commutative and associative. In the end, these will turn out to be provable, and so don't need to be explicitly included.

Finally, we'll want to add the ability to do mathematical induction. Here too, there is a difference between what we can do in the setting of second-order logic and what we have to do in the setting of first-order logic. In first-order logic, we have to add an infinite set of axioms for induction, one for each instance of the schema:

[Peano-Induction] \(\theta 0 \wedge (\forall x. \theta x \supset \theta (Sx)) \supset \forall x \theta x\)

where any predicate expressible in the language of arithmetic is substituted for \(\theta\). Hence, we'll have infinitely many first-order Peano axioms. But that's alright.

Given those axioms, some things we can prove are (universal generalizations of) the following. I'll state some of them in their intuitive English form. See if you can translate them into the official language of arithmetic. (Allow yourself to use \(<\), rather than unpacking its definition.)

Nonstandard models

Take the sentences of true arithmetic. Now consider a model \(\mathscr{N}'\) which differs from \(\mathscr{N}\). Its domain will consist of \(\{\, -1 \,\}\cup \mathbb{N}\). The interpretation function of this model will map the symbol \(0\) to the element \(-1\), and will map the symbol \(S\) to a function mapping element \(-1\) to element \(0\), \(0\) to \(1\), \(1\) to \(2\), and so on. We make similar changes for \(+\) and \(*\). So \([\![\, 0+2 \,]\!]_{\mathscr{N}'}\) will be \([\![\, + \,]\!]_{\mathscr{N}'}(-1, 1)\) --- which won't be element \(0\) because \([\![\, + \,]\!]_{\mathscr{N}'}\) isn't the ordinary addition function. Instead it will be element \(1\), that is, \([\![\, 2 \,]\!]_{\mathscr{N}'}\). (In place of \(-1\) we could just as easily have used \(\pi\) or the Eiffel Tower. We'd just need to make the corresponding changes to the interpretation of \(S\) and \(+\) and \(*\).) Our end result will be a model that's mathematically distinct from \(\mathscr{N}\), but which makes all the same arithmetic sentences true. So it's not the intended model, but it is elementarily equivalent to it.

The two models \(\mathscr{N}'\) and \(\mathscr{N}\) are distinct, but I trust it's clear they are also isomorphic. Can we find models that are elementarily equivalent to \(\mathscr{N}\) --- that is, that make all the same arithmetic sentences true --- but which are not isomorphic?

Let's begin like this. Instead of adding \(-1\) to the domain and having it displace the existing elements, let's instead add a different element \(i\) to the domain without changing the interpretations of the symbols \(0, S0, SS0, \dots\). We'll call the latter the standard elements of the domain, and \(i\) a non-standard element. We'll need to persuade ourselves that some model of this sort will both satisfy all of true arithmetic and be non-isomorphic to the intended model \(\mathscr{N}\).

If we're going to satisfy the sentences of true arithmetic, among these will be sentences that tell us that exactly one of any pair of distinct objects is \(<\) the other, and also sentences that tell us that:

\(x=0 \vee x=S0 \vee \dots \vee x=S^n0 \subset\supset x<S^{n+1}0\),

where \(S^n0\) abbreviates \(n\) applications of \(S\) to \(0\). Since \(i\) won't satisfy the left-hand side of these biconditionals, it can't satisfy the right-hand side either. So if there can be any models of the sort we're looking for, \(i\) will have to be an element that "comes after" all the standard numbers.

Here's how we can persuade ourselves that there are such models. Consider a new language that extends the language of arithmetic by adding a single new term constant \(I\). Let \(\Delta\) be the union of the sentences of true arithmetic (now construed as sentences in the extended language) and all sentences of the form \(S^n0<I\). (That is, the sentences \(0<I, S0<I, SS0<I, \dots\)) Consider any finite subset \(\Xi\) of \(\Delta\). Because \(\Xi\) is finite, there will be a smallest number \(\xi\) such that \(\Xi\) doesn't contain \(S^n0<I\) for any \(n\ge\xi\). Hence, any model \(\mathscr{N}_\xi\) which extends the intended model \(\mathscr{N}\) by interpreting \(I\) to be the number \(\xi\) will satisfy that set of sentences \(\Xi\). So all of \(\Delta\)'s finite subsets are satisfiable. By compactness, it follows that the whole set \(\Delta\) must be satisfiable too. Any model that satisfies \(\Delta\) must be different from all the \(\mathscr{N}_\xi\) models; it has to interpret \(I\) to be some element in the domain other than it interprets any \(S^n0\) to be. But compactness tells us that such a model exists. Since it satisfies \(\Delta\) and true arithmetic \({}\subset\Delta\), this model will also satisfy true arithmetic. The same holds for the restriction of this model that differs by ignoring the constant \(I\), but retaining the element \(i\) of the domain that \(I\) was interpreted to designate. This restricted model will be an interpretation of the original language of arithmetic that satisfies all the sentences in true arithmetic and has an additional non-standard number \(i\) in its domain that "comes after" all the standard numbers.

Can \(i\) be the only non-standard number in this model? No, because it's a truth of arithmetic that every number has a successor, so \(i\) needs a successor too. But none of \(0, S0, SS0, \dots\) can be its successor. So we'll need an additional element to be \(i\)'s successor. Call this additional element \(i+1\). Then we'll need \(i+2\) and so on. But then since everything other than \(0\) is something's successor, we need to add \(i-1\) for \(i\) to be the successor of, and then \(i-2\), and so on.

It may be helpful to consider the complex plane, and let \(i\) be the familiar imaginary unit. Our domain now includes all the complex numbers \(0, 1, 2, \dots\) These are the non-negative integers on the x-axis. It also includes \(\dots, i-2, i-1, i, i+1, i+2, \dots\), which are all the integers on the horizontal line where \(y=1\).

But now what about addition? Adding \(i\) to the standard number \(1\) gives us \(i+1\), which is already in the domain, no problem. But what about adding \(i\) to \(i\)? That will have to be distinct from everything we've so far included, so we'll also have to add all the integers on the horizontal line where \(y=2\), and then again where \(y=3\), and so on. At this point we'll have all these point in the complex plane:

\(\{\, (x,y) \in \mathbb{Z}\times\mathbb{N}\mid (y = 0 \wedge x\ge 0) \vee y > 0 \,\}\).

We interpret \(<\) so that a point \(u<{}\) another point \(v\) iff \(u\) is vertically lower than \(v\), or it's at the same vertical level but to the left of \(v\).

We haven't yet fully specified our model. Consider that we can express \(x~is~even\) in the language of arithmetic (\(\exists y. x = y+y\)). Similarly for \(x~is~odd\). And it will be a truth of arithmetic that every number has one of these properties. So if we want to preserve that sentence, \(i\) must also be come out either "even" or "odd." If it's even, then we'll need to include some further elements in the domain, because nothing so far is such that double it is \(i\). We'll need to add \(\frac{i}2\), and then add every other integer point on the same horizontal line (\(\dots, \frac{i}2-1, \dots, \frac{i}2+1, \dots\)) to take care of successors and predecessors. If on the other hand \(i\) is odd, then \(i+1\) will be even, and we'll need to add \(\frac{i+1}2\), and every other integer point on that horizontal line. Similar reasoning shows that we have to add points on the horizontal lines \(\frac{i}3\) and \(\frac{2i}3\) (or else on lines slightly offset from those), and so on. In the end, we'll need to include all the integers on every horizontal line where \(y\) is a positive rational. That is, now we'll have all these points in the complex plane:

\(\{\, (x,y) \in \mathbb{Z}\times\pmb{\mathbb{Q}} \mid (y = 0 \wedge x\ge 0) \vee y > 0 \,\}\).

Have we fully characterized multiplication? We've said what twice \(i\) is, and half \(i\), and so on. But what is \(i*i\)? We don't want to use the ordinary operation of complex multiplication because in true arithmetic \(\forall z. z*z*z*z=1 \supset z=1\). So to characterize multiplication where non-standard elements are multiplied by other non-standard elements, as when we multiply \(i\) by \(i\), we have to do something more complicated. Currently all our non-standard elements are in the part of the complex plane that lies above the axis \(y=0\). Let's vertically "squash" this region of the complex plane so that it occupies instead the just the open vertical band above \(y=0\) and below \(y=1\). We can do this by mapping every \(y\)-value through a monotonically increasing continuous function whose domain is the first interval and whose range is the second interval. One such function is graphed here. We have to adjust the interpretation of \(+\) to compensate, and also the partial interpretation of \(*\) that we've already identified. Our first non-standard element \(i\) might now appear at \((0,\frac12)\) in the complex plane; but I will continue to refer to it as \(i\). Our problem was that we hadn't specified the interpretation of operations like \(i*i\). We can now do that, by adding another copy of the newly squashed region, on top of the first copy. This new region will lie above \(y=1\) and below \(y=2\) (none of the points on the horizontal lines where \(y=1\) or \(y=2\) will themselves be included). The result of squaring our \(i\) element from the original non-standard block will be the element at the corresponding location in the second block. The result of multiplying the \(i\) element from the first block by the \(i\) element from the second block will then have to be an \(i\) element in yet a third block, above the previous two, and so on. In the end, we'll have all these points in the complex plane:

\(\{\, (x,y) \in \mathbb{Z}\times\mathbb{R}\mid (y=0 \wedge x\ge 0) \vee (y>0 \wedge y \notin \mathbb{N}\wedge \operatorname{squash}^{-1}(y \bmod 1)\in \mathbb{Q}) \,\}\),

where \(\operatorname{squash}\) is whatever we use for our vertical "squashing" function. Formulated differently, that's the same as:

\(\{\, (x,0) \mid x\in\mathbb{N}\,\} \cup \{\, (x,n + \operatorname{squash}(y)) \mid x\in\mathbb{Z}\wedge n\in\mathbb{N}\wedge y\in \mathbb{Q}^+ \,\}\).

Let's give this model a name, call it \(\mathscr{N}''\). The size of this model's domain, like those of \(\mathscr{N}\) and \(\mathscr{N}'\), is still countable. But \(\mathscr{N}''\) won't be isomorphic to the other models.

Although \(\mathscr{N}''\) is not isomorphic to \(\mathscr{N}\), it does make all the same sentences true (it's elementarily equivalent). That is, it satisfies all the sentences of true arithmetic. When we have a model that satisfies a theory but isn't isomorphic to that theory's intended model, we call it a non-standard model. So \(\mathscr{N}''\) is a non-standard model of true arithmetic. Furthermore, up to isomorphism this is the only countable non-standard model. What that means is that any countable models not isomorphic the intended model have to be isomorphic to this one.

Warning: I am sure that the model I've described has the right general shape to be the countable non-standard model of arithmetic. (Specifically, it has the right "order type," and it will satisfy many obvious truths of arithmetic.) But I haven't fully specified the interpretation of \(+\) and \(*\), and there are known difficult issues about doing so. So I can't warrant that what I've described is adequate in all respects. However, it should be good enough for you to have a working idea of what this model is like. See comments at the end of the page.

There seems to be a clear structural difference between our first two models and this new model \(\mathscr{N}''\). In the first two, a chain of predecessors always "bottoms out" at \(0\). Whereas in \(\mathscr{N}''\), \(i\) has a predecessor who has a predecessor who \(\dots\), but this sequence never "bottoms out". When we learn more about order relations, we'll see that the official way to say this is that in \(\mathscr{N}''\), the domain is not well-ordered by \(<\), but in the other models it was. And as it turns out, whether or not a relation is a well-order is one of those things you can only express in second-order logic. So it's not surprising that our first-order theories cannot capture this difference between the models.

This is connected to a question Sam asked in class. He asked us to consider the set \(\{\, \neg\forall x\theta x,\theta(0),\theta(S0),\theta(SS0),\dots \,\}\). (Sets of this sort are called \(\omega\)-inconsistent.) Observe that for any finite subset of this, you could find some \(\theta\) where that subset is satisfied on the intended model of arithmetic. But there's no single \(\theta\) for which all the finite subsets would be true, on the intended model. Still, every finite subset would be satisfiable, for some model, which interpreted symbols differently than the intended model does. And so compactness then tells us that the whole set must be satisfiable. But intuitively it might seem like it shouldn't be. Doesn't the first sentence contradict what all the rest of the sentences together say? Well, only if we've managed to somehow introduce the requirement that the only objects there be are \(0, S0, SS0, \dots\). Let's call objects of this sort, that are reachable by applying the successor function a finite number of times to \(0\), "finitely generated." If we haven't excluded the possibility of there being additional objects that weren't finitely generated, then a model could satisfy all the sentences \(\theta(0), \theta(S0), \theta(SS0), \dots\) while giving \(\theta\) an interpretation that doesn't apply to the additional objects.

And as we were saying, in first-order logic it's not possible to require that the only objects there be were finitely generated. So there are no axioms we could state in a first-order language that would rule out the truth of every such set of sentences. As mentioned above, though, on the intended model \(\mathscr{N}\) of arithmetic, no such set is true --- there is no \(\theta\) where \(\mathscr{N}\) satisfies the whole such set.

The preceding remarks still leave it open that there might be some other difference that the theory of true arithmetic could catch: some sentence that one of the models would make true and the other make false. But I report that there isn't. These models satisfy all the same first-order sentences in the language of arithmetic.

Let's press on and extend our model still further. Let \(\mathscr{N}'''\) include all the same elements from \(\mathscr{N}''\), and also all the integer points on the horizontal line where \(y=\operatorname{squash}(\pi)\), too, and so on for every real. Now we'll have all these points in the complex plane:

\(\{\, (x,0) \mid x\in\mathbb{N}\,\} \cup \{\, (x,n + \operatorname{squash}(y)) \mid x\in\mathbb{Z}\wedge n\in\mathbb{N}\wedge y\in \mathbb{R}^+ \,\}\).

which can be simplified to:

\(\{\, (x,0) \mid x\in\mathbb{N}\,\} \cup \{\, (x,y) \in \mathbb{Z}\times\pmb{\mathbb{R}^+} \mid y\notin \mathbb{N}\,\}\).

If we keep in mind that the cardinality of the reals exceeds the cardinality of \(\mathbb{Q}\), it should be clear that this model's domain is larger than that of \(\mathscr{N}\) and the other models so far considered. There is no bijection between their domains; so they cannot be isomorphic. Yet this model too will also satisfy true arithmetic.

So we had \(\mathscr{N}, \mathscr{N}', \mathscr{N}'', \mathscr{N}'''\). The first two were obviously isomorphic, and the fourth has a bigger domain so is obviously not isomorphic to any of the others. The third was non-obviously not isomorphic to the first two. \(\mathscr{N}''\) is a (the) countable non-standard model of true arithmetic, and \(\mathscr{N}'''\) is one uncountable non-standard model.

One of the important metalogical results introduced this week, the Loewenheim-Skolem theorem (in its "upward" form) will tell us that we should expect to find non-standard models like \(\mathscr{N}'''\). When a theory has any infinite model, it will also have models with larger domains. First-order theories are "blind" to the differences between different infinite cardinalities.

To read more, have a look at Wikipedia.

Warning: This page contains a mix of

Let me be explicit about which is which. The text beginning "Here's how we can persuade ourselves..." sketches a proof that, because of compactness, there must be models of true arithmetic that have at least one additional element in the domain, coming after all the standard numbers. It is provable but not here proven that this model will be non-isomorphic to the intended model, and hence it will be non-standard. We go on to sketch some of the additional characteristics that such a model would have. We just take it for granted that there can be such a model with a countable domain. It is provable that there can be, and that it has the "order type" of my model \(\mathscr{N}''\). I do not fully specify the \(+\) and \(*\) functions for this model, and it's mere speculation that the parts of them that I do sketch are sustainable. It has been proven that up to isomorphism, there is only one countable non-standard models of arithmetic; and that its \(+\) and \(*\) functions are not "arithmetically definable", that is, expressible in the language of arithmetic. It is known, because of the upward Loewenheim-Skolem theorem, that there are also uncountable models of arithmetic (for every cardinality); whether any of them is a "filling in" of the countable model in the way my \(\mathscr{N}'''\) is a "filling in" of \(\mathscr{N}''\) is merely my speculation.

I hope that some of this might help you fix your ideas; but I hope also you won't take anything stated here for a known fact that isn't.