% More Notes on Logic and Semantics
The three topics I had you read about this week were:
* the handouts reviewing induction on the complexity of formulas
* substitution of equivalent formulas
* some initial material on the semantics of modal logic
Here are some remarks about these topics.
### Induction on the complexity of formulas ###
As Sider remarks on pp. 52--55, most of the inductive proofs you'll give or read when doing logic are either inductions on the complexity of formulas or inductions on the length of proofs. (Though at some points other inductive proofs are necessary, as you'll see when we read his completeness proof on p. 64.)
We'll talk about induction on the length of proofs later.
For induction on the complexity of formulas, there is some variation in how it's officially described and deployed. To what extent we talk about numbers is one of the respects in which different presentations can vary. When we do talk about numbers, what number that is can also vary. Sider uses a number that represents the number of sentence letters --- which we can generalize to the number of atomic formulas, to permit us to do induction on formulas like $Fa \vee Gb$ as well as formulas like $P \vee Q$. Egré uses a number that represents the degree or complexity of a formula; I didn't give you his definition of that, but it corresponds to the depth of the longest branch of the parse tree of a formula. So $(P \wedge (Q \cond \neg R)) \vee (Q \wedge P)$ has a degree of four, since the most deeply embedded sentence constant $R$ sits inside four logical operations.
Gamut describes this method of inductive argument without talking about numbers at all, which is also completely legitimate, as "inductive argument" doesn't by definition require any reference to numbers. It's just that the inductive arguments you're most familiar with have been inductions on the structure of the natural numbers. But another way to present an inductive argument is in terms of the set of all and only objects having some property. You can argue that the base cases belong to this set, and that whenever $\phi$ and $\psi$ and so on belong to the set, so too does any formula $\chi$ built from them by one of the formula-building operations. If you make the right choices in doing this, you'll have proved that all the objects you're interested in --- say, all well-formed formulas --- belong to the set and thus have the property of interest.
When one does carry out an induction on formula complexity in a way that refers to some number, it's often useful to use a form of induction called **strong induction**. Regular or **weak induction** on the numbers is the form you're most familiar with:
> Argue that $0$ has property so-and-so. Then argue that when a number $k$ has the property, $k+1$ also has the property.
Strong induction is a bit different:
> Argue that $0$ has property so-and-so. Then argue that when *all numbers $\le k$* have the property, $k+1$ also has the property.
In the case of natural numbers, you can show that anything provable using strong induction is also provable using weak induction. But often it's more convenient to use strong induction. For instance, if I'm proving that some formula $\phi \vee \psi$ has some property, I can't rely on $\psi$ having exactly one fewer sentence letter than $\phi \vee \psi$ (that will be true only when $\phi$ is a sentence constant), or on its having a degree that's exactly one less than that of $\phi \vee \psi$ (that won't be true if $\phi$ is more complex than $\psi$).
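To make the structural picture concrete, here is a minimal sketch in Python (the tuple encoding of formulas is my own, just for illustration) of computing a formula's degree by recursion on its parse tree. Notice that a disjunct's degree can be much more than one less than the degree of the whole disjunction, which is why strong induction is the convenient form here.

```python
# A toy encoding of formulas as nested tuples: a sentence constant is a
# string; compounds are ("not", f), ("and", f, g), ("or", f, g), ("imp", f, g).

def degree(formula):
    """Depth of the longest branch of the formula's parse tree."""
    if isinstance(formula, str):    # sentence constant: degree 0
        return 0
    # one logical operation wrapped around the immediate subformulas
    return 1 + max(degree(sub) for sub in formula[1:])

# (P and (Q -> not R)) or (Q and P) --- degree four, as in the text
phi = ("or", ("and", "P", ("imp", "Q", ("not", "R"))), ("and", "Q", "P"))
print(degree(phi))                  # 4
print(degree(("and", "Q", "P")))    # 1, though it's an immediate subformula of phi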
Later when we talk about ordinals we'll see a further form of induction (an extension of strong induction) which is stronger than either of these.
### Substitution ###
There are three "snares" with respect to substitution I'm hoping you'll learn how to avoid.
The first comes up when you're substituting $\rho[x\leftarrow \alpha]$, for some possibly complex term $\alpha$. (Parts of it may also come up when you're substituting $\rho[F\leftarrow \psi]$ for some $n$-adic predicate $F$ and formula $\psi$ with $n$ free variables, or $\rho[P\leftarrow \psi]$, for some sentence constant $P$ and any formula $\psi$.) One key to remember here is that *only free occurrences of $x$ get replaced*. A second key to remember is that *any free variables in the replacing term $\alpha$ should remain free after being inserted*. Sometimes this will require renaming some of the bound variables in $\rho$. For instance, homework exercise 66 said:
> If $\phi$ is the formula $Fx \cond \forall y(Gyx \vee P \vee \exists x Hx)$, then (a) What is $\phi[x\leftarrow w]$? (b) What is $\phi[x\leftarrow y]$? (c) What is $\phi[P\leftarrow \forall x Gxy]$?
Doing (a) is easy, so long as you remember that only free occurrences of $x$ get replaced. The answer is: $F\boldsymbol w \cond \forall y(Gy\boldsymbol w \vee P \vee \exists x Hx)$.
Doing (b) is tricky, because of the second key constraint. You can't say $F\boldsymbol y \cond \forall y(Gy\boldsymbol y \vee P \vee \exists x Hx)$, because then the second argument to $G$ would be bound, whereas in the replacing term $y$ it's not bound. So we'd have to do some bound-variable renaming, to get a correct result like: $F\boldsymbol y \cond \forall w(Gw\boldsymbol y \vee P \vee \exists x Hx)$.
Doing (c) also requires us to rename the bound $y$ in $\phi$, to avoid "capturing" the free occurrence of $y$ in the replacing term $\forall x Gxy$.
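Here is a sketch of how such capture-avoiding substitution might be implemented, restricted to substituting one variable for another (the tuple encoding and helper names like `fresh` are mine; a fuller implementation would handle complex terms too):

```python
import itertools

# Formulas: ("atom", pred, (terms...)), ("not", f), ("or", f, g),
# ("imp", f, g), ("forall", var, f), ("exists", var, f).
# For simplicity every term here is just a variable name.

def free_vars(f):
    tag = f[0]
    if tag == "atom":
        return set(f[2])
    if tag == "not":
        return free_vars(f[1])
    if tag in ("or", "imp"):
        return free_vars(f[1]) | free_vars(f[2])
    return free_vars(f[2]) - {f[1]}         # forall/exists

def fresh(avoid):
    """A variable name not in the set avoid."""
    return next(f"v{i}" for i in itertools.count() if f"v{i}" not in avoid)

def subst(f, x, t):
    """f[x <- t]: replace free occurrences of x with the variable t,
    renaming bound variables so the inserted t stays free."""
    tag = f[0]
    if tag == "atom":
        return ("atom", f[1], tuple(t if u == x else u for u in f[2]))
    if tag == "not":
        return ("not", subst(f[1], x, t))
    if tag in ("or", "imp"):
        return (tag, subst(f[1], x, t), subst(f[2], x, t))
    y, body = f[1], f[2]                    # forall/exists
    if y == x:
        return f                            # occurrences of x below are bound
    if y == t and x in free_vars(body):     # inserting t here would capture it,
        z = fresh(free_vars(body) | {x, t})
        y, body = z, subst(body, f[1], z)   # so rename the bound variable first
    return (tag, y, subst(body, x, t))

# phi from exercise 66: Fx -> forall y (Gyx v P v exists x Hx)
phi = ("imp", ("atom", "F", ("x",)),
       ("forall", "y", ("or", ("or", ("atom", "G", ("y", "x")),
                               ("atom", "P", ())),
                        ("exists", "x", ("atom", "H", ("x",))))))
print(subst(phi, "x", "y"))   # the bound y gets renamed (here to v0, not w)
```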
The second "snare" concerns a different kind of substitution, when you replace subexpressions with other logically equivalent expressions. The "snare" here is to keep in mind that this can be relied on to preserve logical relations but only when the term being replaced and the replacing term *are equivalent*. Thus the following reasoning pattern is correct:
> $\underline{\phi \bimodels \psi}$
> $\rho[\chi\leftarrow\phi] \pmb{\,\bimodels\,} \rho[\chi\leftarrow\psi]$
And so too is:
> $\underline{\phi \bimodels \psi}$
> $\rho[\chi\leftarrow\phi] \pmb{{}\models{}} \rho[\chi\leftarrow\psi]$
You may wonder, why did I say $\rho[\chi\leftarrow\phi] \models \rho[\chi\leftarrow\psi]$ rather than something simpler, like $\rho \models \rho[\phi\leftarrow\psi]$? I did this because I wanted to write something that permitted the replacement of only *some* occurrences of $\phi$ in $\rho$ with $\psi$. If I write $\rho[\phi\leftarrow\psi]$, that requires that *all* occurrences of $\phi$ be replaced. That is, if $\rho$ is:
> $Q \vee (P \cond (P \cond R))$
then $\rho[P\leftarrow S \cond T]$ has to be:
> $Q \vee ((S\cond T) \cond ((S\cond T) \cond R))$
Our definition of substitution doesn't allow you to replace only some of the $P$s. On the other hand, if we say:
> $\rho[Q\leftarrow P] \models \rho[Q\leftarrow S \cond T]$
That will mean:
> $P \vee (P \cond (P \cond R)) \models (S\cond T) \vee (P \cond (P \cond R))$
Thus giving us a way to talk about the replacement of only some occurrences of $P$ with $S\cond T$.
And this reasoning pattern is also correct:
> $\underline{\phi \pmb{{}\models{}} \psi}$
> $\phi[\chi\leftarrow\rho] \pmb{{}\models{}} \psi[\chi\leftarrow\rho]$
Because $\rho$ is equivalent to itself. However, don't be fooled into thinking that this reasoning pattern is correct:
> $\underline{\phi \pmb{{}\models{}} \psi}$
> $\rho[\chi\leftarrow\phi] \pmb{{}\models{}} \rho[\chi\leftarrow\psi]$
It's easy to see that this pattern fails if you let $\rho$ be $\neg \chi$. Just because $\phi \models \psi$, it does not follow that $\neg\phi \models \neg\psi$. The reasoning patterns we are endorsing here include only cases where the expressions being exchanged are logically *equivalent*.
Now there will be some special contexts --- some special structures that $\rho$ can have --- where the last reasoning pattern displayed above *is* OK, after all. But this will only be for certain restricted choices of $\rho$, not for arbitrary formulas in which $\phi$ and $\psi$ may be embedded.
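You can verify the failure by a brute-force truth-table check. A quick sketch, letting $\phi$ be $P$, $\psi$ be $P \vee Q$, and $\rho$ be $\neg\chi$ (the encoding of formulas as Python predicates is mine):

```python
from itertools import product

def entails(f, g):
    """f |= g, checked over all valuations of the two letters P, Q."""
    return all(g(P, Q) for P, Q in product([False, True], repeat=2) if f(P, Q))

print(entails(lambda P, Q: P, lambda P, Q: P or Q))            # True:  P |= P v Q
print(entails(lambda P, Q: not P, lambda P, Q: not (P or Q)))  # False: not-P does not entail not-(P v Q)
```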
The third "snare" I want to talk about is that now we've seen a variety of different kinds of substitution. We haven't been so explicit about distinguishing and comparing them, but now I want to do that. I'll remind you of four varieties of substitutions you'll have encountered, and one further thing that isn't a variety of substitution.
First, we saw substitution in the case of formal grammars. This was special in that you only replaced *one* occurrence of a non-terminal expression with its expansion, and could then go on to replace *other* occurrences of the same non-terminal expression with a different expansion. Another special thing here was that the replacements could be specified "non-deterministically" --- meaning that *any* of the supplied choices was a legal replacement.
Second, there is the case of uniformly substituting any (possibly complex, possibly open) formula for a schematic sentence variable like $\phi$. Under this heading we can also include substituting any (possibly complex, possibly open) term for a schematic term variable like $\alpha$. In these cases, there's only a single replacing expression, so we don't need to worry about it being equivalent to anything else. Also, if we're substituting into a schematic *argument* rather than a schematic *formula*, then the substitution has to be performed uniformly across all formulas in the argument --- in all of the premises as well as the conclusion, every occurrence of the schematic variable in question has to be replaced with the substituting expression.
A third kind of substitution is where you replace expressions with other *logically equivalent* expressions, and expect the larger formulas of which they're part to keep the same semantic values. This is what you mostly read about for this week, and what we were discussing above under the heading of the second "snare." (We've primarily read about this in terms of substitutions into single formulas, but the idea could be generalized to apply to substitutions into whole argument patterns. I will prefer to talk about argument *schemas*, though.)
A fourth kind of substitution is where you do replacements like we discussed under the heading of the first "snare," and homework exercise 66. These are again replacements into single formulas. As with the *second* kind of substitution, there's only a *single* replacing expression, so we don't need to worry about it being equivalent to anything else. This kind of substitution comes up in a variety of places. It will come up in our proof systems, for example, when we introduce rules permitting you to derive formulas of the form $\phi[x\leftarrow \alpha]$ from premises of the form $\forall x\phi$. It has already come up in some of our semantic metalogical claims and proofs. It would also be part of a definition of what it means to "rename bound variables", which we talked about under the heading of the first "snare".
Finally, there is a fifth thing to talk about. This was homework exercise 69:
> Explain the difference between $\interp{\phi[x\leftarrow a]}_{\script M\,q}$ and $\interp{\phi}_{\script M\,q[x:=a]}$.
We will talk about this in class today. What's going on in $\interp{\phi}_{\script M\,q[x:=a]}$ is no form of substitution at all. (Though Bostock and some other authors prefer to give the semantics for quantifiers in terms of substitution, *rather than* by using the apparatus of variable assignments $q$.)
### Semantics for modal logic ###
We've only started to look at modal logic. There are further complexities we will take up later. But there are some aspects of it that are just natural continuations of things we've done so far. I'll make a few comments about this.
First, in a modal logic our *models* have to be more complex, so that we can keep track of how many possible worlds there are, and which other possible worlds are "visible from" a given possible world $w$. Don't worry too much about that last part yet. The idea here is that in some contexts --- using modal notions to represent some phenomena --- you may want it to be generally true that $\Diamond\Diamond P\models \Diamond P$, but in others you may not want that. We control for those choices by saying that $\Diamond \phi$ is understood as true at a world $w$ iff $\phi$ is true at some world *visible from $w$* --- which may or may not include all the other possible worlds. Hence $\Diamond\Diamond \phi$ might fail to entail $\Diamond \phi$ because the worlds that are *visible from worlds visible from $w$* might not themselves be *visible from $w$*. The **accessibility relation** $\script R$ keeps track of which worlds are visible from which other worlds. As I said, we'll talk about this more later. For the time being, you can just assume that all worlds are visible from all other worlds.
In the case of a modal logic that also includes quantifiers and predicates, our models will no longer be pairs of domains $\script D$ and atomic interpretation functions $\script I$. They'll also include a domain of worlds $\script W$ and an accessibility relation $\script R$. The pair of these latter elements is sometimes called a **frame**. But I will primarily be talking about *whole models* like $(\script W, \script R, \script D, \script I)$. Now, depending on your needs, you might simplify this or you might complicate it. One *complication* is that models like these are appropriate only when all the objects in $\script D$ exist in all the worlds in $\script W$. If you want to allow for **variable domains** --- that is, you want to allow the set of objects available to be quantified over to differ from world to world --- then you'll need to add *more* to the model to keep track of which objects exist in which worlds. We'll talk about that much later. One *simplification* is you might start out thinking just about modal *sentential* logic, and leave all the tricky issues about quantification for a later day. In that case, you can leave the $\script D$ out of your models, and just talk about models of the form $(\script W, \script R, \script I)$. This is what Sider does in his chapter 6.
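Here is a minimal sketch of these semantics for modal sentential logic, with a toy model $(\script W, \script R, \script I)$ in which visibility is deliberately *not* universal (the tuple encoding of formulas and the particular model are mine). It also illustrates the earlier point that $\Diamond\Diamond P$ can hold at a world where $\Diamond P$ fails:

```python
# A model (W, R, I): worlds, accessibility pairs, and an atomic
# interpretation sending each sentence constant to the worlds where it's true.
W = {"w1", "w2", "w3"}
R = {("w1", "w2"), ("w2", "w3")}
I = {"P": {"w3"}}

def true_at(f, w):
    tag = f[0]
    if tag == "atom":
        return w in I[f[1]]
    if tag == "not":
        return not true_at(f[1], w)
    if tag == "or":
        return true_at(f[1], w) or true_at(f[2], w)
    if tag == "dia":   # true at w iff the body is true at SOME world visible from w
        return any(true_at(f[1], v) for (u, v) in R if u == w)
    if tag == "box":   # true at w iff the body is true at EVERY world visible from w
        return all(true_at(f[1], v) for (u, v) in R if u == w)

# <><>P holds at w1 (w3 is visible from w2, which is visible from w1) ...
print(true_at(("dia", ("dia", ("atom", "P"))), "w1"))   # True
# ... but <>P fails at w1, since w3 is not itself visible from w1.
print(true_at(("dia", ("atom", "P")), "w1"))            # False
```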
I hope that despite all these complexities, though, you can already see a lot of formal similarity between the way we handle modality and the way we handle quantification. These are not exactly the same, but there is a lot in common.
One similarity is that just as in the predicational case we further relativize $\interp{\phi}_{\script M}$ to an assignment $q$, so too in the modal case we will relativize interpretations to worlds $w$. We could write:
> $\interp{\phi}_{\script M\,q\,w}$
or, if we're just dealing with modal sentential logic, we can omit the assignments and have:
> $\interp{\phi}_{\script M\,w}$
Sider writes this instead as:
> $\mathrm{V}_{\script M}(\phi,w)$
But that is just a notational variation.
Another similarity is that we define formula validity in terms of truth in all *models and worlds* (or models and assignments and worlds, if we have assignments). Similarly for the notion of logical (aka semantic) entailment (aka consequence).
One *difference* between worlds and assignments is that sometimes when dealing with the semantics of modal language, we're interested not just in whether some formula is true for all the parameters in question (true at all worlds), but in particular whether it's true at some particular, special world $w_{\mathrm{actual}}$. (We may even want to go on to introduce an **actuality operator** into the language to interact in a special way with the specification of which world is $w_{\mathrm{actual}}$.) There's nothing comparable in the case of assignments: in general we have no intrinsic interest in what formulas are true on specific assignments. (Bostock will, because he has assignments settle the interpretation of term constants as well as variables; but we follow standard practice and have that job performed by $\script I$ instead.)
A connected difference is that in the language of predicate logic there is such a thing as *open formulas*, whereas in the modal logics we're looking at there is not. Our modal operators act something like quantifiers do, but we're not introducing any terms designating worlds into the object language itself. If you press your studies further, you'll see that this point of difference gets to be less straightforward. Many linguists and philosophers now think that there *are* something like terms designating worlds in natural language, it's just that these terms aren't pronounced. (They're "phonologically null.") In the other direction, there are treatments of predicate logic where the appearance of free variables is just superficial or gets to be eliminated, in favor of other techniques for pairing quantifiers up with one or more argument slots --- techniques that parallel things that have sometimes been proposed for modal operators. So as I said, this gets to be complex. But in the mainstream logical paradigms, we can have formulas with explicit unbound term variables, and we don't have formulas with explicit unbound world variables.
### What are our semantic values ###
On one way of thinking of semantic values in the case of modal language, the interpretations of formulas are still truth-values; it's just that those interpretations are relativized not only to models (and perhaps assignments) but also to worlds, just as in the case of predicate logic we made interpretations relativized not only to models but also to assignments. This goes together with talking about:
> $\interp{\phi}_{\script M\,q\,w}$
or, if we're just dealing with modal sentential logic:
> $\interp{\phi}_{\script M\,w}$
A different way of thinking of semantic values here is to think of them as *unrelativized*, but as being not *truth-values* but rather *functions* from worlds to truth-values. That is, using the notation I introduced at the end of [the first notes on logic and semantics](semantics.htm), we could talk about:
> $\interp{\phi}_{\script M\,q\,\mathrm{fun}}$
or, if we're just dealing with modal sentential logic:
> $\interp{\phi}_{\script M\,\mathrm{fun}}$
Or, since it's easy to translate between talk of functions from worlds to truth-values and talk of sets of worlds, for which those are the characteristic functions, we could talk of:
> $\interp{\phi}_{\script M\,\mathrm{set}}$
That is, we say that $\phi$'s interpretation relative to a model $\script M$ is a set of worlds, the set of worlds $w$ where the original apparatus:
> $\interp{\phi}_{\script M\,w}$
would have delivered the result true.
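Continuing the toy modal sketch from earlier (same `W`, `R`, and `true_at`), the move from relativized truth-values to unrelativized set-of-worlds semantic values is just a set comprehension:

```python
def interp_set(f):
    """The set-of-worlds semantic value: all worlds where f is true."""
    return {w for w in W if true_at(f, w)}

print(interp_set(("dia", ("atom", "P"))))   # {'w2'}: <>P holds only at w2
```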
I expect you are familiar with some of these ways of talking. What might be a new realization for you is that we can do similar things with respect to our non-modal language, too.
So, for example, restricting ourselves just to predicate logic, instead of talking about:
> $\interp{\eta}_{\script M\,q}$
we could instead say the semantic value of an expression $\eta$ relative to a model $\script M$ is not *further* relativized, but is instead a function from assignment functions into something else:
> $\interp{\eta}_{\script M\,\mathrm{fun}}$
Where we're talking about the interpretation of formulas rather than terms, then we might talk instead of the semantic value being a set of assignment functions:
> $\interp{\phi}_{\script M\,\mathrm{set}}$
That is, the set:
> $\set{q\where \interp{\phi}_{\script M\,q} = \True}$
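Here is a small worked illustration of that set over a finite domain (the particular model, with a one-place $F$ and a two-place $L$, is my own toy example):

```python
from itertools import product

# Domain (0, 1, 2); F is true of the even numbers, L is less-than.
D = (0, 1, 2)
F = {0, 2}
L = {(a, b) for a in D for b in D if a < b}

# phi is the open formula (Fx and Lxy); an assignment q is a dict for x, y.
def phi_true(q):
    return q["x"] in F and (q["x"], q["y"]) in L

assignments = [dict(zip(("x", "y"), vals)) for vals in product(D, repeat=2)]
print([q for q in assignments if phi_true(q)])
# [{'x': 0, 'y': 1}, {'x': 0, 'y': 2}]
```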
So long as we stick to classical semantics for predicate logic, there does not seem to be any deep theoretical reason to prefer one of these approaches over any of the others. However, the last approaches naturally *generalize* in certain ways that our original approach does not. So if you go on to study **dynamic semantics** in linguistics or contemporary philosophy of language, their treatment of pronouns will involve something that also can be naturally described in this last way, but is not equivalent to anything we've seen so far.
Turning from predicate logic to homely non-modal sentential logic, recall that what I'm writing as:
> $\interp{\phi}_{\script M}$
could also be written as:
> $\interp{\phi}_{\script I}$
since in that setting *all there is* to a model is just the atomic interpretation function $\script I$. Sider tends to write instead:
> $\mathrm{V}_{\script I}(\phi)$
Now here too we might instead talk about:
> $\interp{\phi}_{\mathrm{set}}$
That is, the set of atomic interpretation functions $\script I$ on which $\mathrm{V}_{\script I}(\phi)$ is true. Recall that in this setting an atomic interpretation function is just a mapping of sentence constants to truth-values. So we could make our semantic values in the case of non-modal sentential logic be sets of such mappings, rather than have them be relativized to such mappings.
One benefit we get by doing that is that the semantic values that result have nice algebraic relations to each other. We get a kind of algebraic structure called a **Boolean algebra**. We won't be able to define this properly until we talk about lattices later in the class. But it is inter-translatable with the notion of a **Boolean ring** that we saw at the end of our [Notes on Algebraic Structures](algebra.htm#boolean-ring). So we get results like this:
> $\interp{\phi \vee \psi} = \interp{\phi} \union \interp{\psi}$
and:
> $\models \phi \cond \psi$ iff $\interp{\phi} \le \interp{\psi}$

for some suitable interpretation of $\le$, which we haven't yet properly explained (in the present setting it amounts to the subset relation). When $\models \phi \bicond \psi$, we of course get:
> $\interp{\phi} = \interp{\psi}$.
If one hasn't yet specified a semantics for a language, but does have a proof theory for it, then one can define $\interp{\phi}$ to be the *set of formulas* provably equivalent to $\phi$, define $\le$ in terms of derivability, and then use $\le$ to define $\union$ and $\intersect$. That yields a different Boolean algebra. The elements of this new algebra will be different: they'll be sets of formulas rather than sets of mappings from sentence constants to truth-values. But when the proof theory one's starting from corresponds nicely to a given semantics, the two algebras one gets in this way will be isomorphic.
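To see these algebraic relations concretely, here is a brute-force sketch over a two-letter sentential language (the encoding of valuations as frozensets and of formulas as Python predicates is mine). Note that Python's `<=` on sets is exactly the subset relation:

```python
from itertools import product

# A valuation assigns True/False to each sentence constant; freeze it so
# it can be a member of a set.
letters = ("P", "Q")
valuations = [frozenset(zip(letters, row))
              for row in product([False, True], repeat=len(letters))]

def interp(f):
    """The semantic value of f: the set of valuations making f true."""
    return {v for v in valuations if f(dict(v))}

phi = lambda v: v["P"]
psi = lambda v: v["P"] or v["Q"]

# [[phi v psi]] = [[phi]] union [[psi]]
print(interp(lambda v: phi(v) or psi(v)) == interp(phi) | interp(psi))  # True
# |= phi -> psi iff [[phi]] <= [[psi]], with <= read as subset
print(interp(phi) <= interp(psi))                                       # True
```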
### Other odds and ends ###
* When $f$ is a function whose arguments and results are negatable (what that means can vary), we say that the **dual** of $f$ is the function $f^{\mathrm{dual}}$, where $f^{\mathrm{dual}}(x_1,\dots,x_n) = \neg f(\neg x_1,\dots,\neg x_n)$. So the dual of $\vee$ is $\wedge$, because $\interp{\phi\wedge\psi}$ (for any $\script M$ and $q$)${} = \interp{\neg(\neg\phi \vee \neg\psi)}$. (And vice versa.) The dual of $\forall$ is $\exists$, because $\interp{\exists x\phi} = \interp{\neg\forall x\neg\phi}$. The dual of $\Box$ is $\Diamond$. And so on.
Question: what, if anything, is the dual of $\neg$?
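If you want to experiment with that question, here is a sketch that computes duals of arbitrary truth functions and tests proposed identities by brute force (the function names are mine):

```python
from itertools import product

def dual(f):
    """The dual of a truth function: negate all inputs and the output."""
    return lambda *xs: not f(*(not x for x in xs))

def same(f, g, n):
    """Do f and g agree on every n-tuple of truth values?"""
    return all(f(*xs) == g(*xs) for xs in product([False, True], repeat=n))

# the dual of 'or' is 'and':
print(same(dual(lambda p, q: p or q), lambda p, q: p and q, 2))   # True
# for the question above, compare dual(lambda p: not p) with your candidates.
```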
* In some argumentative contexts, it's useful to work with formulas that have a special canonical or **normal form**. I'll introduce three of these now; we may introduce a fourth (Skolem normal form) in later classes.
* **Conjunctive normal form (CNF)** is when the formula is a conjunction of 0 or more units, each of which is a disjunction of 0 or more units, each of which is a formula containing no conjunctions or disjunctions. In sentential logic, these inner units can be either sentence constants or negations of them.
* **Disjunctive normal form (DNF)** is when the formula is a disjunction of 0 or more units, each of which is a conjunction of 0 or more units, each of which is a formula containing no conjunctions or disjunctions. In sentential logic, these inner units can be either sentence constants or negations of them.
* **Prenex normal form (PNF)** is for formulas in the language of predicate logic. These will be formulas where all the quantifiers take widest scope; thus:
> $\forall x\exists y (Gxy \cond \neg Fx)$
is in prenex normal form, but:
> $\forall x(\exists y Gxy \cond \neg Fx)$
isn't, because here the $\exists$ is inside the scope of the $\cond$.
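For the sentential normal forms, one vivid way to see that a DNF equivalent always exists is to read it off the truth table: one conjunction of literals per satisfying row. A sketch for a two-letter language (the encoding of formulas as Python predicates and the string rendering are mine):

```python
from itertools import product

letters = ("P", "Q")

def dnf(f):
    """A DNF equivalent of f, read off f's truth table."""
    disjuncts = []
    for row in product([False, True], repeat=len(letters)):
        v = dict(zip(letters, row))
        if f(v):   # one conjunction of literals per satisfying row
            lits = [p if v[p] else "~" + p for p in letters]
            disjuncts.append("(" + " & ".join(lits) + ")")
    return " v ".join(disjuncts) if disjuncts else "(P & ~P)"   # unsatisfiable case

print(dnf(lambda v: v["P"] == v["Q"]))   # (~P & ~Q) v (P & Q)
```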
In fact any formula in the languages we've looked at can be shown to be equivalent to formulas in these different normal forms. You can rely on a number of equivalence patterns to effect this, some of which I'm sure you're familiar with:
> $\phi \vee \psi \bimodels \neg (\neg \phi \wedge \neg \psi)$
> $\phi \wedge \psi \bimodels \neg (\neg \phi \vee \neg \psi)$
> $\phi \cond \psi \bimodels \neg \phi \vee \psi$
> $\neg(\phi \cond \psi) \bimodels \phi \wedge \neg \psi$
> $\forall x\phi \bimodels \neg\exists x\neg\phi$
> $\exists x\phi \bimodels \neg\forall x\neg\phi$
> $\phi\wedge(\psi\vee\chi) \bimodels (\phi\wedge\psi)\vee(\phi\wedge\chi)$
> $(\psi\vee\chi)\wedge\phi \bimodels (\psi\wedge\phi)\vee(\chi\wedge\phi)$
> $\phi\vee(\psi\wedge\chi) \bimodels (\phi\vee\psi)\wedge(\phi\vee\chi)$
> $(\psi\wedge\chi)\vee\phi \bimodels (\psi\vee\phi)\wedge(\chi\vee\phi)$
Some other equivalence patterns useful for this turn out to be ones you (almost) established in earlier homeworks. You proved a specific case of this one in exercise 56c:
> $\phi\cond(\psi\vee\chi) \bimodels (\phi\cond\psi)\vee(\phi\cond\chi)$
(What you proved was the case where all the schemas are instantiated with sentence constants.) This is also true:
> $\phi\cond(\psi\wedge\chi) \bimodels (\phi\cond\psi)\wedge(\phi\cond\chi)$
You'll have noticed that exercise 56d was not possible to prove as stated. There is an entailment in one direction, but not in the other. You may have noticed that if you replaced one of the $\vee$s with a $\wedge$, then an equivalence did result. That is, (generalizing to operands of arbitrary complexity) both of the following equivalences are correct:
> $(\psi\vee\chi)\cond\phi \bimodels (\psi\cond\phi)\wedge(\chi\cond\phi)$
> $(\psi\wedge\chi)\cond\phi \bimodels (\psi\cond\phi)\vee(\chi\cond\phi)$
In exercises 57a and 57b, you proved specific cases of the following equivalences. When $x$ does not occur free in $\psi$:
> $\forall x(\phi\cond\psi) \bimodels \exists x\phi \cond \psi$
> $\forall x(\psi\cond\phi) \bimodels \psi \cond \forall x\phi$
Continuing to suppose that $x$ does not occur free in $\psi$, the following equivalences also hold:
> $\psi \wedge \forall x\phi \bimodels \forall x(\psi \wedge \phi)$
> $\psi \vee \forall x\phi \bimodels \forall x(\psi \vee \phi)$
> $\psi \wedge \exists x\phi \bimodels \exists x(\psi \wedge \phi)$
> $\psi \vee \exists x\phi \bimodels \exists x(\psi \vee \phi)$
In these equivalences, $x$ is allowed to occur free in any of the schemas:
> $\forall x(\phi \wedge \chi) \bimodels \forall x\phi \wedge \forall x\chi$
> $\exists x(\phi \vee \chi) \bimodels \exists x\phi \vee \exists x\chi$
But those equivalences don't hold in general when $\vee$s and $\wedge$s are interchanged. Here is one more useful equivalence:
> $\exists x(\phi \cond \chi) \bimodels \forall x\phi \cond \exists x\chi$
In using any of these, it's useful to remember that (at least in mainstream classical logic, where the domain is understood to always be non-empty), $\forall x\phi$ and $\exists x\phi$ are logically equivalent to $\phi$ when $x$ doesn't occur free in $\phi$.
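Since counterexamples to equivalences like these would already show up on small domains, you can sanity-check them by brute force over small finite models. A sketch for the last equivalence displayed above (the encoding of predicates as rows of truth values is mine; this is a spot check, not a proof):

```python
from itertools import product

# Check: exists x (Fx -> Gx) -||- (forall x Fx) -> (exists x Gx),
# over every pair of unary predicates F, G on domains of size 1 to 3.
def check():
    for n in (1, 2, 3):
        D = range(n)
        rows = list(product([False, True], repeat=n))
        for F, G in product(rows, repeat=2):
            lhs = any((not F[x]) or G[x] for x in D)
            rhs = (not all(F[x] for x in D)) or any(G[x] for x in D)
            if lhs != rhs:
                return False
    return True

print(check())   # True --- remember the domain is assumed non-empty
```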
* You may sometimes see quantifiers like this:
> $\exists!\,x \phi$
What that means is that there exists exactly one $x$ such that $\phi$. It can be expressed using the resources we've already introduced. Supposing $y$ is a variable not occurring free in $\phi$:
> $\exists x (\phi \wedge \forall y(\phi[x\leftarrow y] \cond y=x))$
(You may recognize this as part of Russell's proposal about how the description $the~\phi$ should be understood.) Or, more briefly:
> $\exists x\forall y(\phi[x\leftarrow y] \bicond y=x)$
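Over a finite domain you can check that the first expansion behaves as advertised; a sketch (predicates encoded as tuples of truth values, a convention of mine):

```python
from itertools import product

def exists_unique(F, D):
    """exists x (Fx and forall y (Fy -> y = x))"""
    return any(F[x] and all((not F[y]) or y == x for y in D) for x in D)

D = range(3)
for row in product([False, True], repeat=3):
    assert exists_unique(row, D) == (sum(row) == 1)
print("ok: the expansion holds iff exactly one object satisfies F")
```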
Here's a different notation you will also come across, which looks superficially similar:
> $\mathsf{E}!\,\alpha$
where $\alpha$ is any term. Note that the $\mathsf{E}$ is not flipped. This notation should be understood as saying $\alpha~exists$, and is important to free logics, where we don't assume in advance that every term is referring. When the language one is working with has an $=$ operation (not always the case), then you can rely on the equivalence (assuming $y$ does not occur free in $\alpha$):
> $\mathsf{E}!\,\alpha \bimodels \exists y(\alpha = y)$
because in free logics $\exists$ is still assumed to have its familiar existential commitment.
* We've looked a little bit at functors or function symbols, which permit us to form complex terms, like $successor~of~0$ or $x+y$. These do introduce some complexity into our grammars and semantics, and even more complexity into our proof theories and metalogic arguments about the proof theories. That's why logic books sometimes introduce them as an afterthought, rather than as part of the main story.
For the time being, I just want to observe about *these* complex terms that they're still relatively tame, since we are assuming functors to express total functions into our domain of quantification. So this device is suited to express the commonest variety of mathematical operations, like the examples I mentioned a moment ago. It's not suited to express relationships that aren't functional, such as $number~less~than~2$, since here there is no unique value or result: both $0$ and $1$ fit the bill. (If you have sets of numbers in your domain as well as numbers, then you *could* have a functor $set~of~numbers~less~than$. But I'll suppose we don't have sets in the domain, only numbers.) Neither is it suited to express relationships that may sometimes fail to have any value, such as $predecessor~of$. (In the structure of *natural* numbers, $0$ has no predecessor.)
To address the second of these lacks, sometimes a technique of **(singular) definite description** is added to the language. (The first lack may be addressed with a technique of **plural description**, but this involves more radical departures from mainstream logics than we'll be discussing.) In addition to being designed to handle cases where there is no suitable value, this technique is also designed to handle descriptive conditions of arbitrary complexity. Your language will just have a fixed stock of functors: perhaps $successor~of$ and $+$, or perhaps more, but there will be limits to what complex terms you can form using functors. With descriptions, on the other hand, for any given open formula in the language we assume there to be a corresponding description.
There is a variety of ways in which descriptions can be expressed and analyzed. One natural way to begin, grammatically, is to think of them as a new kind of quantifier:
> $\mathsf{The}\,x\colon \phi. \psi$
would be true if there is a unique $x$ satisfying $\phi$, and it also satisfies $\psi$. If there is a unique $x$ satisfying $\phi$ and it doesn't satisfy $\psi$, then the quantified formula will be false.
Now what if there are no $x$s satisfying $\phi$, or many such $x$s? A natural way to begin, semantically, is to think that in that case the quantified formula will also be false.
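On that natural starting point, the semantics is easy to state and to implement over a finite domain; a sketch (the helper name `the` and the toy domain are mine):

```python
def the(phi, psi, D):
    """[The x: phi](psi): true iff exactly one member of D satisfies phi
    and that member also satisfies psi; false otherwise."""
    sats = [x for x in D if phi(x)]
    return len(sats) == 1 and psi(sats[0])

D = range(4)
print(the(lambda x: x * x == 4, lambda x: x > 1, D))   # True: unique satisfier 2, and 2 > 1
print(the(lambda x: x % 2 == 0, lambda x: x > 1, D))   # False: no unique even number in D
```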
But there are different variations one might spin from that natural starting point. In the first place, grammatical variations. One might prefer to (additionally, or solely) treat descriptions as looking syntactically more like a term than a quantifier. In that case, one would write instead:
> $\psi[x\leftarrow \desc x\phi]$
That is, supposing $\psi$ to be $Fxy$, one would write:
> $F(\desc x\phi)y$
When we do this, we'll need to come to some arrangement to handle scope ambiguities. Consider:
> $\neg F(\desc x\phi)y$
Which of the following should that be equivalent to?
> $\mathsf{The}\,x\colon \phi. \neg Fxy$
> $\neg\mathsf{The}\,x\colon \phi. Fxy$
In the second place, semantic variations. If we keep the new form of quantification, that may prompt us to allow restricted quantification as a primitive notation also for $\exists$ and $\forall$:
> $\forall x\colon \phi. \psi$ (meaning *every $x$ where $\phi$ is such that $\psi$*)
And it may prompt us also to consider further new quantifiers (see Sider section 5.4). If we treat the descriptions as terms, that may prompt us to consider the possibility of other terms failing to have a reference, too, and so consider free logics, where this possibility is countenanced from the beginning, rather than just a special quirk with descriptions. Regardless of our grammatical approach, we might also want to rethink the semantics, and say that when there is no unique $x$ such that $\phi$, then formulas like:
> $F(\desc x\phi)y$
> $\mathsf{The}\,x\colon \phi. Fxy$
aren't *false* exactly but fail in some other way. We might then want to deny that:
> $\neg\mathsf{The}\,x\colon \phi. Fxy$
is true. These thoughts take us into multivalent logics. (Free logics also come in bivalent and multivalent varieties.) And if, like Strawson, we think of improper descriptions as manifesting a kind of *presupposition failure*, we may want to use the apparatus of multivalent logics to represent other cases of presupposition failure too.
So there are different ways this all can be developed. On some of the choices, like the ones that Russell made, descriptions can be "defined away": any formula with a description will be equivalent to some formula in the languages we've already introduced. On other choices, this won't be possible.