The three topics I had you read about this week were: induction on the complexity of formulas, substitution, and semantics for modal logic.

Here are some remarks about these topics.

Induction on the complexity of formulas

As Sider remarks on pp. 52--55, most of the inductive proofs you'll give or read when doing logic are either inductions on the complexity of formulas or inductions on the length of proofs. (Though at some points other kinds of inductive proof are necessary, as you'll see when we read his completeness proof on p. 64.)

We'll talk about induction on the length of proofs later.

For induction on the complexity of formulas, there is some variation in how it's officially described and deployed. To what extent we talk about numbers is one of the respects in which different presentations can vary. When we do talk about numbers, which number it is can also vary. Sider uses a number that represents the number of sentence letters --- which we can generalize to the number of atomic formulas, to permit us to do induction on formulas like \(Fa \vee Gb\) as well as formulas like \(P \vee Q\). Egré uses a number that represents the degree or complexity of a formula; I didn't give you his definition of that, but it corresponds to how deep the longest branch of a formula's parse tree is. So \((P \wedge (Q \supset \neg R)) \vee (Q \wedge P)\) has a degree of four, since the most deeply embedded sentence constant \(R\) sits inside four logical operations.
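Here's a minimal sketch in Python of degree as parse-tree depth, assuming a made-up representation of formulas as nested tuples (my convention, not anything from the readings):

```python
# Toy representation: a sentence constant is a string like "P"; a unary
# formula is ("not", sub); a binary formula is (op, left, right) with op
# among "and", "or", "implies".

def degree(phi):
    """Depth of the longest branch of phi's parse tree."""
    if isinstance(phi, str):      # atomic: embedded inside zero operations
        return 0
    if phi[0] == "not":
        return 1 + degree(phi[1])
    return 1 + max(degree(phi[1]), degree(phi[2]))

# (P and (Q implies not R)) or (Q and P) has degree 4: R sits under
# "not", "implies", "and", and "or".
example = ("or", ("and", "P", ("implies", "Q", ("not", "R"))), ("and", "Q", "P"))
assert degree(example) == 4
```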

Gamut describes this method of inductive argument without talking about numbers at all, which is also completely legitimate, as "inductive argument" doesn't by definition require any reference to numbers. It's just that the inductive arguments you're most familiar with have been inductions on the structure of the natural numbers. But another way to present an inductive argument is in terms of the set of all and only objects having some property. You can argue that the base cases belong to this set, and that whenever \(\phi\) and \(\psi\) and so on belong to the set, so too does any \(\chi\) built from them by the formation rules. If you make the right choices in doing this, you'll have proved that all the objects you're interested in --- say, all well-formed formulas --- belong to the set and thus have the property of interest.

When one does carry out an induction on formula complexity in a way that refers to some number, it's often useful to use a form of induction called strong induction. Regular or weak induction on the numbers is the form you're most familiar with:

Argue that \(0\) has property so-and-so. Then argue that when a number \(k\) has the property, \(k+1\) also has the property.

Strong induction is a bit different:

Argue that \(0\) has property so-and-so. Then argue that when all numbers \(\le k\) have the property, \(k+1\) also has the property.

In the case of natural numbers, you can show that anything provable using strong induction is also provable using weak induction. But often it's more convenient to use strong induction. For instance, if I'm proving that some formula \(\phi \vee \psi\) has some property, I can't rely on \(\psi\) having exactly one fewer sentence letter than \(\phi \vee \psi\) (that will be true only when \(\phi\) is a sentence constant), or on its having a degree that's exactly one less than that of \(\phi \vee \psi\) (that won't be true if \(\phi\) is more complex than \(\psi\)).

Later when we talk about ordinals we'll see a further form of induction (an extension of strong induction) which is stronger than either of these.

Substitution

There are three "snares" involving substitution that I'm hoping you'll learn how to avoid.

The first comes up when you're substituting \(\rho[x\leftarrow \alpha]\), for some possibly complex term \(\alpha\). (Parts of it may also come up when you're substituting \(\rho[F\leftarrow \psi]\) for some \(n\)-adic predicate \(F\) and formula \(\psi\) with \(n\) free variables, or \(\rho[P\leftarrow \psi]\), for some sentence constant \(P\) and any formula \(\psi\).) One key to remember here is that only free occurrences of \(x\) get replaced. A second key to remember is that any free variables in the replacing term \(\alpha\) should remain free after being inserted. Sometimes this will require renaming some of the bound variables in \(\rho\). For instance, homework exercise 66 said:

If \(\phi\) is the formula \(Fx \supset \forall y(Gyx \vee P \vee \exists x Hx)\), then (a) What is \(\phi[x\leftarrow w]\)? (b) What is \(\phi[x\leftarrow y]\)? (c) What is \(\phi[P\leftarrow \forall x Gxy]\)?

Doing (a) is easy, so long as you remember that only free occurrences of \(x\) get replaced. The answer is: \(F\boldsymbol w \supset \forall y(Gy\boldsymbol w \vee P \vee \exists x Hx)\).

Doing (b) is tricky, because of the second key constraint. You can't say \(F\boldsymbol y \supset \forall y(Gy\boldsymbol y \vee P \vee \exists x Hx)\), because then the second argument to \(G\) would be bound, whereas in the replacing term \(y\) it's not bound. So we'd have to do some bound-variable renaming, to get a correct result like: \(F\boldsymbol y \supset \forall w(Gw\boldsymbol y \vee P \vee \exists x Hx)\).

Doing (c) also requires us to rename the bound \(y\) in \(\phi\), to avoid "capturing" the free occurrence of \(y\) in the replacing term \(\forall x Gxy\).
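To see both keys at work, here is a sketch of capture-avoiding substitution in the same made-up nested-tuple style, now with predicates and quantifiers, simplified so that terms are bare variable or constant names. The fresh names w1, w2, ... are my own convention:

```python
import itertools

def free_vars(phi):
    """The free variables of phi (terms here are just bare names)."""
    op = phi[0]
    if op in ("forall", "exists"):
        return free_vars(phi[2]) - {phi[1]}
    if op in ("not", "and", "or", "implies"):
        return set().union(*(free_vars(p) for p in phi[1:]))
    return set(phi[1:])                     # atomic: its argument terms

def fresh_var(avoid):
    """A variable name that doesn't occur in `avoid`."""
    return next(f"w{i}" for i in itertools.count(1) if f"w{i}" not in avoid)

def subst(phi, x, a):
    """phi[x <- a]: replace free occurrences of x with the name a,
    renaming bound variables so that a remains free once inserted."""
    op = phi[0]
    if op in ("forall", "exists"):
        v, body = phi[1], phi[2]
        if v == x:            # first key: x is bound below here, leave it alone
            return phi
        if v == a:            # second key: inserting a here would capture it, so rename v
            w = fresh_var(free_vars(body) | {x, a})
            body, v = subst(body, v, w), w
        return (op, v, subst(body, x, a))
    if op in ("not", "and", "or", "implies"):
        return (op,) + tuple(subst(p, x, a) for p in phi[1:])
    return (op,) + tuple(a if t == x else t for t in phi[1:])   # atomic

# Exercise 66(b): phi[x <- y], where phi is Fx implies forall y(Gyx or P or exists x Hx).
phi = ("implies", ("F", "x"),
       ("forall", "y", ("or", ("G", "y", "x"),
                        ("or", ("P",), ("exists", "x", ("H", "x"))))))
assert subst(phi, "x", "y") == \
    ("implies", ("F", "y"),
     ("forall", "w1", ("or", ("G", "w1", "y"),
                       ("or", ("P",), ("exists", "x", ("H", "x"))))))
```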

The second "snare" concerns a different kind of substitution, when you replace subexpressions with other logically equivalent expressions. The "snare" here is to keep in mind that this can be relied on to preserve logical relations but only when the term being replaced and the replacing term are equivalent. Thus the following reasoning pattern is correct:

\(\underline{\phi \mathbin{{=}\!|{\models}}\psi}\)
\(\rho[\chi\leftarrow\phi] \pmb{\,\mathbin{{=}\!|{\models}}\,} \rho[\chi\leftarrow\psi]\)

And so too is:

\(\underline{\phi \mathbin{{=}\!|{\models}}\psi}\)
\(\rho[\chi\leftarrow\phi] \pmb{{}\models{}} \rho[\chi\leftarrow\psi]\)

You may wonder: why did I say \(\rho[\chi\leftarrow\phi] \models \rho[\chi\leftarrow\psi]\) rather than something simpler, like \(\rho \models \rho[\phi\leftarrow\psi]\)? I did this because I wanted to write something that permitted the replacement of only some occurrences of \(\phi\) in \(\rho\) with \(\psi\). If I write \(\rho[\phi\leftarrow\psi]\), that requires that all occurrences of \(\phi\) be replaced. That is, if \(\rho\) is:

\(Q \vee (P \supset (P \supset R))\)

then \(\rho[P\leftarrow S \supset T]\) has to be:

\(Q \vee ((S\supset T) \supset ((S\supset T) \supset R))\)

Our definition of substitution doesn't allow you to replace only some of the \(P\)s. On the other hand, if we say:

\(\rho[Q\leftarrow P] \models \rho[Q\leftarrow S \supset T]\)

That will mean:

\(P \vee (P \supset (P \supset R)) \models (S\supset T) \vee (P \supset (P \supset R))\)

Thus giving us a way to talk about the replacement of only some occurrences of \(P\) with \(S\supset T\).
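In the same toy style, with a hypothetical helper subst_const for substituting a formula for a sentence constant, the all-occurrences requirement and the trick just described look like this:

```python
def subst_const(rho, p, psi):
    """rho[p <- psi]: replace EVERY occurrence of sentence constant p with psi."""
    if isinstance(rho, str):
        return psi if rho == p else rho
    return (rho[0],) + tuple(subst_const(part, p, psi) for part in rho[1:])

rho = ("or", "Q", ("implies", "P", ("implies", "P", "R")))   # Q or (P implies (P implies R))
st = ("implies", "S", "T")

# rho[P <- S implies T] must replace both occurrences of P:
assert subst_const(rho, "P", st) == ("or", "Q", ("implies", st, ("implies", st, "R")))

# The trick above: substituting for Q targets just the one marked position.
assert subst_const(rho, "Q", "P") == ("or", "P", ("implies", "P", ("implies", "P", "R")))
assert subst_const(rho, "Q", st) == ("or", st, ("implies", "P", ("implies", "P", "R")))
```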

And this reasoning pattern is also correct:

\(\underline{\phi \pmb{{}\models{}} \psi}\)
\(\phi[\chi\leftarrow\rho] \pmb{{}\models{}} \psi[\chi\leftarrow\rho]\)

Because \(\rho\) is equivalent to itself. However, don't be fooled into thinking that this reasoning pattern is correct:

\(\underline{\phi \pmb{{}\models{}} \psi}\)
\(\rho[\chi\leftarrow\phi] \pmb{{}\models{}} \rho[\chi\leftarrow\psi]\)

It's easy to see that this is incorrect if you let \(\rho\) be \(\neg \chi\). Just because \(\phi \models \psi\), it does not follow that \(\neg\phi \models \neg\psi\). The reasoning patterns we are endorsing here include only cases where the expressions being exchanged are logically equivalent.

Now there will be some special contexts --- some special structures that \(\rho\) can have --- where the last reasoning pattern displayed above is OK, after all. But this will only be for certain restricted choices of \(\rho\), not for arbitrary formulas in which \(\phi\) and \(\psi\) may be embedded.
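These claims can be checked mechanically with brute-force truth tables. Here's a sketch in the same toy style, confirming the counterexample where \(\rho\) is \(\neg\chi\):

```python
from itertools import product

def atoms(phi):
    return {phi} if isinstance(phi, str) else set().union(*(atoms(p) for p in phi[1:]))

def evaluate(phi, row):
    if isinstance(phi, str):
        return row[phi]
    op, args = phi[0], phi[1:]
    if op == "not":
        return not evaluate(args[0], row)
    if op == "and":
        return evaluate(args[0], row) and evaluate(args[1], row)
    if op == "or":
        return evaluate(args[0], row) or evaluate(args[1], row)
    if op == "implies":
        return (not evaluate(args[0], row)) or evaluate(args[1], row)

def entails(phi, psi):
    """phi |= psi: every row of the truth table making phi true makes psi true."""
    letters = sorted(atoms(phi) | atoms(psi))
    return all(evaluate(psi, dict(zip(letters, vals)))
               for vals in product([True, False], repeat=len(letters))
               if evaluate(phi, dict(zip(letters, vals))))

# P |= P or Q, but it does not follow that not P |= not (P or Q):
assert entails("P", ("or", "P", "Q"))
assert not entails(("not", "P"), ("not", ("or", "P", "Q")))
```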

The third "snare" I want to talk about is that now we've seen a variety of different kinds of substitution. We haven't been so explicit about distinguishing and comparing them, but now I want to do that. I'll remind you of four varieties of substitutions you'll have encountered, and one further thing that isn't a variety of substitution.

First, we saw substitution in the case of formal grammars. This was special in that you only replaced one occurrence of a non-terminal expression with its expansion, and could then go on to replace other occurrences of the same non-terminal expression with a different expansion. Another special thing here was that the replacements could be specified "non-deterministically" --- meaning that any of the supplied choices was a legal replacement.

Second, there is the case of uniformly substituting any (possibly complex, possibly open) formula for a schematic sentence variable like \(\phi\). Under this heading we can also include substituting any (possibly complex, possibly open) term for a schematic term variable like \(\alpha\). In these cases, there's only a single replacing expression, so we don't need to worry about it being equivalent to anything else. Also, if we're substituting into a schematic argument rather than a schematic formula, then the substitution has to be performed uniformly across all formulas in the argument --- in all of the premises as well as the conclusion, every occurrence of the schematic variable in question has to be replaced with the substituting expression.

A third kind of substitution is where you replace expressions with other logically equivalent expressions, and expect the larger formulas of which they're part to keep the same semantic values. This is what you mostly read about for this week, and what we were discussing above under the heading of the second "snare." (We've primarily read about this in terms of substitutions into single formulas, but the idea could be generalized to apply to substitutions into whole argument patterns. I will prefer to talk about argument schemas, though.)

A fourth kind of substitution is where you do replacements like we discussed under the heading of the first "snare," and homework exercise 66. These are again replacements into single formulas. As with the second kind of substitution, there's only a single replacing expression, so we don't need to worry about it being equivalent to anything else. This kind of substitution comes up in a variety of places. It will come up in our proof systems, for example, when we introduce rules permitting you to derive formulas of the form \(\phi[x\leftarrow \alpha]\) from premises of the form \(\forall x\phi\). It has already come up in some of our semantic metalogical claims and proofs. It would also be part of a definition of what it means to "rename bound variables", which we talked about under the heading of the first "snare".

Finally, there is a fifth thing to talk about. This was homework exercise 69:

Explain the difference between \([\![\, \phi[x\leftarrow a] \,]\!]_{\mathscr{M}\,q}\) and \([\![\, \phi \,]\!]_{\mathscr{M}\,q[x:=a]}\).

We will talk about this in class today. What's going on in \([\![\, \phi \,]\!]_{\mathscr{M}\,q[x:=a]}\) is no form of substitution at all. (Though Bostock and some other authors prefer to give the semantics for quantifiers in terms of substitution, rather than by using the apparatus of variable assignments \(q\).)
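Without spoiling the exercise, here's a sketch of a tiny predicate-logic evaluator in the same toy style, just to display the contrast in kind: substitution operates on the formula, while \(q[x:=a]\) operates on the assignment, which is a semantic parameter. (The domain, predicate, and names below are invented for illustration.)

```python
D = {1, 2, 3}                        # an invented domain
I = {"F": {1, 2}, "a": 1}            # an invented atomic interpretation function

def denote(term, q):
    """A term denotes via the assignment q if it's a variable, else via I."""
    return q[term] if term in q else I[term]

def sat(phi, q):
    """Truth of phi in the model (D, I) on assignment q."""
    op = phi[0]
    if op == "forall":
        return all(sat(phi[2], {**q, phi[1]: d}) for d in D)
    if op == "not":
        return not sat(phi[1], q)
    return denote(phi[1], q) in I[op]    # atomic, e.g. ("F", "x")

phi, q = ("F", "x"), {"x": 3}
assert not sat(phi, q)               # on q itself, x denotes 3, which isn't in I("F")
# Operating on the FORMULA: phi[x <- a] is the different formula Fa, still evaluated on q:
assert sat(("F", "a"), q)
# Operating on the ASSIGNMENT: phi is untouched; the new assignment sends x to the object 2:
assert sat(phi, {**q, "x": 2})
```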

Semantics for modal logic

We've only started to look at modal logic. There are further complexities we will take up later. But there are some aspects of it that are just natural continuations of things we've done so far. I'll make a few comments about this.

First, in a modal logic our models have to be more complex, so that we can keep track of how many possible worlds there are, and which other possible worlds are "visible from" a given possible world \(w\). Don't worry too much about that last part yet. The idea here is that in some contexts --- using modal notions to represent some phenomena --- you may want it to be generally true that \(\lozenge \lozenge P\models \lozenge P\), but in others you may not want that. We control for those choices by saying that \(\lozenge \phi\) is understood as true at a world \(w\) iff \(\phi\) is true at some world visible from \(w\) --- which may or may not include all the other possible worlds. Hence \(\lozenge \lozenge \phi\) might fail to entail \(\lozenge \phi\) because the worlds that are visible from worlds visible to \(w\) might not themselves be visible from \(w\). The accessibility relation \(\mathscr{R}\) keeps track of which worlds are visible from which other worlds. As I said, we'll talk about this more later. For the time being, you can just assume that all worlds are visible from all other worlds.

In the case of a modal logic that also includes quantifiers and predicates, our models will no longer be pairs of domains \(\mathscr{D}\) and atomic interpretation functions \(\mathscr{I}\). They'll also include a domain of worlds \(\mathscr{W}\) and an accessibility relation \(\mathscr{R}\). The pair of these latter elements is sometimes called a frame. But I will primarily be talking about whole models like \((\mathscr{W}, \mathscr{R}, \mathscr{D}, \mathscr{I})\). Now, depending on your needs, you might simplify this or you might complicate it. One complication is that models like these are appropriate only when all the objects in \(\mathscr{D}\) exist in all the worlds in \(\mathscr{W}\). If you want to allow for variable domains --- that is, to allow the set of objects available to be quantified over to differ from world to world --- then you'll need to add more to the model to keep track of which objects exist in which worlds. We'll talk about that much later. One simplification is that you might start out thinking just about modal sentential logic, and leave all the tricky issues about quantification for a later day. In that case, you can leave the \(\mathscr{D}\) out of your models, and just talk about models of the form \((\mathscr{W}, \mathscr{R}, \mathscr{I})\). This is what Sider does in his chapter 6.
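Here's a minimal sketch of a sentential modal model \((\mathscr{W}, \mathscr{R}, \mathscr{I})\) in the same toy style, with an invented frame chosen so that \(\lozenge\lozenge P\) holds at a world where \(\lozenge P\) fails, as discussed above:

```python
W = {"w1", "w2", "w3"}
R = {("w1", "w2"), ("w2", "w3")}     # w3 is visible from w2 but NOT from w1
I = {"w1": {"P": False}, "w2": {"P": False}, "w3": {"P": True}}

def holds(phi, w):
    """Truth of phi at world w in the model (W, R, I)."""
    if isinstance(phi, str):
        return I[w][phi]
    op = phi[0]
    if op == "not":
        return not holds(phi[1], w)
    if op == "or":
        return holds(phi[1], w) or holds(phi[2], w)
    if op == "dia":                  # possibly: true at SOME world visible from w
        return any(holds(phi[1], v) for (u, v) in R if u == w)
    if op == "box":                  # necessarily: true at EVERY world visible from w
        return all(holds(phi[1], v) for (u, v) in R if u == w)

assert holds(("dia", ("dia", "P")), "w1")    # via w2 and then w3
assert not holds(("dia", "P"), "w1")         # P fails at w2, the only world visible from w1
```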

I hope that despite all these complexities, though, you can already see a lot of formal similarity between the way we handle modality and the way we handle quantification. These are not exactly the same, but there is a lot in common.

One similarity is that just as in the predicational case, where we further relativize \([\![\, \phi \,]\!]_{\mathscr{M}}\) to an assignment \(q\), so too in the modal case we will relativize interpretations to worlds \(w\). We could write:

\([\![\, \phi \,]\!]_{\mathscr{M}\,q\,w}\)

or, if we're just dealing with modal sentential logic, we can omit the assignments and have:

\([\![\, \phi \,]\!]_{\mathscr{M}\,w}\)

Sider writes this instead as:

\(\mathrm{V}_{\mathscr{M}}(\phi,w)\)

But that is just a notational variation.

Another similarity is that we define formula validity in terms of truth in all models and worlds (or models and assignments and worlds, if we have assignments). Similarly for the notion of logical (aka semantic) entailment (aka consequence).

One difference between worlds and assignments is that sometimes when dealing with the semantics of modal language, we're interested not just in whether some formula is true for all the parameters in question (true at all worlds), but also in whether it's true at one particular, special world \(w_{\mathrm{actual}}\). (We may even want to go on to introduce an actuality operator into the language, to interact in a special way with the specification of which world is \(w_{\mathrm{actual}}\).) There's nothing comparable in the case of assignments: in general we have no intrinsic interest in which formulas are true on specific assignments. (Bostock will, because he has assignments settle the interpretation of term constants as well as variables; but we follow standard practice and have that job performed by \(\mathscr{I}\) instead.)

A connected difference is that in the language of predicate logic there is such a thing as open formulas, whereas in the modal logics we're looking at there is not. Our modal operators act something like quantifiers do, but we're not introducing any terms designating worlds into the object language itself. If you press your studies further, you'll see that this point of difference gets to be less straightforward. Many linguists and philosophers now think that there are something like terms designating worlds in natural language, it's just that these terms aren't pronounced. (They're "phonologically null.") In the other direction, there are treatments of predicate logic where the appearance of free variables is just superficial or gets to be eliminated, in favor of other techniques for pairing quantifiers up with one or more argument slots --- techniques that parallel things that have sometimes been proposed for modal operators. So as I said, this gets to be complex. But in the mainstream logical paradigms, we can have formulas with explicit unbound term variables, and we don't have formulas with explicit unbound world variables.

What are our semantic values?

One way of thinking of semantic values in the case of modal language keeps the interpretations of formulas as truth-values; it's just that those interpretations are relativized not only to models (and perhaps assignments) but also to worlds, just as in the case of predicate logic we relativized interpretations not only to models but also to assignments. This goes together with talking about:

\([\![\, \phi \,]\!]_{\mathscr{M}\,q\,w}\)

or, if we're just dealing with modal sentential logic:

\([\![\, \phi \,]\!]_{\mathscr{M}\,w}\)

A different way of thinking of semantic values here is to treat them as unrelativized to worlds, with the results being not truth-values but rather functions from worlds to truth-values. That is, using the notation I introduced at the end of the first notes on logic and semantics, we could talk about:

\([\![\, \phi \,]\!]_{\mathscr{M}\,q\,\mathrm{fun}}\)

or, if we're just dealing with modal sentential logic:

\([\![\, \phi \,]\!]_{\mathscr{M}\,\mathrm{fun}}\)

Or, since it's easy to translate between talk of functions from worlds to truth-values and talk of sets of worlds, for which those are the characteristic functions, we could talk of:

\([\![\, \phi \,]\!]_{\mathscr{M}\,\mathrm{set}}\)

That is, we say that \(\phi\)'s interpretation relative to a model \(\mathscr{M}\) is a set of worlds, the set of worlds \(w\) where the original apparatus:

\([\![\, \phi \,]\!]_{\mathscr{M}\,w}\)

would have delivered the result true.
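In terms of the toy modal sketch above, translating between the world-relativized truth-value and the function- or set-valued semantic values is immediate:

```python
def sem_fun(phi):
    """[[phi]]_{M fun}: a function from worlds to truth-values."""
    return lambda w: holds(phi, w)

def sem_set(phi):
    """[[phi]]_{M set}: the set of worlds where the original apparatus says True."""
    return {w for w in W if holds(phi, w)}

assert sem_fun(("dia", "P"))("w2")
assert sem_set(("dia", "P")) == {"w2"}
```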

I expect you are familiar with some of these ways of talking. What might be a new realization for you is that we can do similar things with respect to our non-modal language, too.

So, for example, restricting ourselves just to predicate logic, instead of talking about:

\([\![\, \eta \,]\!]_{\mathscr{M}\,q}\)

we could instead say the semantic value of an expression \(\eta\) relative to a model \(\mathscr{M}\) is not further relativized, but is instead a function from assignment functions into something else:

\([\![\, \eta \,]\!]_{\mathscr{M}\,\mathrm{fun}}\)

Where we're talking about the interpretation of formulas rather than terms, we might instead talk of the semantic value being a set of assignment functions:

\([\![\, \phi \,]\!]_{\mathscr{M}\,\mathrm{set}}\)

That is, the set:

\(\{\, q\mid [\![\, \phi \,]\!]_{\mathscr{M}\,q} = \mathsf{True}\,\}\)

So long as we stick to classical semantics for predicate logic, there does not seem to be any deep theoretical reason to prefer one of these approaches over any of the others. However, the last approaches naturally generalize in certain ways that our original approach does not. So if you go on to study dynamic semantics in linguistics or contemporary philosophy of language, their treatment of pronouns will involve something that also can be naturally described in this last way, but is not equivalent to anything we've seen so far.

Turning from predicate logic to homely non-modal sentential logic, recall that what I'm writing as:

\([\![\, \phi \,]\!]_{\mathscr{M}}\)

could also be written as:

\([\![\, \phi \,]\!]_{\mathscr{I}}\)

since in that setting all there is to a model is just the atomic interpretation function \(\mathscr{I}\). Sider tends to write instead:

\(\mathrm{V}_{\mathscr{I}}(\phi)\)

Now here too we might instead talk about:

\([\![\, \phi \,]\!]_{\mathrm{set}}\)

That is, the set of atomic interpretation functions \(\mathscr{I}\) on which \(\mathrm{V}_{\mathscr{I}}(\phi)\) is true. Recall that in this setting an atomic interpretation function is just a mapping of sentence constants to truth-values. So we could make our semantic values in the case of non-modal sentential logic be sets of such mappings, rather than have them be relativized to such mappings.

One benefit we get by doing that is that the semantic values that result have nice algebraic relations to each other. We get a kind of algebraic structure called a Boolean algebra. We won't be able to define this properly until we talk about lattices later in the class. But it is inter-translatable with the notion of a Boolean ring that we saw at the end of our Notes on Algebraic Structures. So we get results like this:

\([\![\, \phi \vee \psi \,]\!] = [\![\, \phi \,]\!] \cup [\![\, \psi \,]\!]\)

and, for a suitable interpretation of \(\le\) that we haven't yet explained:

\(\models \phi \supset \psi \quad\text{iff}\quad [\![\, \phi \,]\!] \le [\![\, \psi \,]\!]\)

When \(\models \phi \subset\supset \psi\), we of course get:

\([\![\, \phi \,]\!] = [\![\, \psi \,]\!]\).
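Restricting to a fixed finite stock of sentence constants (my simplification, so that the interpretation functions can be enumerated), these relations can be checked by brute force; here an atomic interpretation function is represented by the set of constants it maps to true:

```python
from itertools import product

LETTERS = ("P", "Q", "R")            # an invented, fixed stock of sentence constants

ALL_I = [frozenset(c for c, v in zip(LETTERS, vals) if v)
         for vals in product([True, False], repeat=len(LETTERS))]

def true_on(phi, i):
    """Truth of phi on the atomic interpretation i (a set of true constants)."""
    if isinstance(phi, str):
        return phi in i
    op = phi[0]
    if op == "not":
        return not true_on(phi[1], i)
    if op == "and":
        return true_on(phi[1], i) and true_on(phi[2], i)
    if op == "or":
        return true_on(phi[1], i) or true_on(phi[2], i)
    if op == "implies":
        return (not true_on(phi[1], i)) or true_on(phi[2], i)

def sem(phi):
    """[[phi]]_set: the interpretation functions on which phi is true."""
    return {i for i in ALL_I if true_on(phi, i)}

phi, psi = ("and", "P", "Q"), ("not", "R")
assert sem(("or", phi, psi)) == sem(phi) | sem(psi)      # [[phi or psi]] = [[phi]] U [[psi]]
# phi implies psi is valid exactly when [[phi]] <= [[psi]], with <= read as subset:
assert (sem(("implies", phi, psi)) == set(ALL_I)) == (sem(phi) <= sem(psi))
```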

If one hasn't yet specified a semantics for a language, but does have a proof theory for it, then one can define \([\![\, \phi \,]\!]\) to be the set of formulas provably equivalent to \(\phi\), define \(\le\) in terms of derivability, use \(\le\) to define \(\cup\) and \(\cap\), and thereby get a different Boolean algebra. The elements of this new algebra will be different: they'll be sets of formulas rather than sets of mappings from sentence constants to truth-values. But when the proof theory one starts from corresponds nicely to a given semantics, the two algebras one gets in this way will be isomorphic.

Other odds and ends