GA 1003. Notes on Logic and Semantics

The mainstream contemporary understanding of notions like logical truth and logical consequence defines them in semantic terms. We assume some identification of a language's semantics --- in the languages we study this will be formally specified --- and then we say that a sentence is logically true when it will remain true no matter how its non-logical expressions are interpreted, and no matter what the extra-linguistic world is like.
Notably, this last clause includes its not mattering how many objects there are in the world. Except: in mainstream logic we always assume that there are some objects, which our quantifiers always range over and which our term constants always designate. Arguably this is not true for natural language.
- Some terms seem to be "empty," or to fail to designate anything. Perhaps the chemical term \(phlogiston\) works like this, or the name \(Sherlock~Holmes\), in some of its uses. (These examples and others are controversial.)
- Sometimes we seem to use quantifiers without commiting ourselves to the reality of what we're talking about, as when we say:
  
  Some of the things posited by my colleagues don't exist.
  
  or:
  
  Some of the children you might have had will never in fact exist.
Arguments can and should be had about what's really going in such cases. I just observe that mainstream logic makes no attempt to accommodate them. Later in the term, we may consider universally free logics, which permit the domain of quantification to be empty and also permit term constants to be meaningful even when non-referring. (Those logics do still construe \(\exists\) as being existentially committal.)
This semantic conception is not the only way that logical notions can and have historically been analyzed. See Sider pp. 1--3, 6--top 11, and 28--29 for the semantic conception, and pp. 1--2 and 8--9 for some alternatives; see also Gamut pp. 114--16. Even today, some formalisms are more productively first thought of in non-semantic terms. But this is by far the dominant contemporary understanding, and it is what we will use.
We assume, then, that the semantic properties of a language mathematically settle what its logical properties are. You can raise questions about how we know what the language's semantic properties are, especially in the case where it's not a language whose syntax and semantics we're formally stipulating. You can raise questions about whether the language has enough semantic properties to settle certain logical issues. You can raise questions about how we know what mathematically settles what --- what patterns we're entitled to follow in our metalinguistic reasoning about the language whose logical properties we're studying. Those are all decent questions. Perhaps some of them even belong in part to the "philosophy of logic." But they aren't part of what one does when engaging in the kind of metalogical inquiry we're getting familiar with in this course. So if any of these questions grip you or keep you up at night, file them away to come back to another time.

Likewise, if you think the nature of logical consequence has to in some way be metaphysically prior to this notion of "mathematically settling" that I'm helping myself to, that too may be a good question, but isn't one we're going to pursue here.
So, as I said, we assume that the semantic properties of a language mathematically settle that it has certain logical properties --- for instance, settle that some sentence is logically true, or settle that some set of sentences logically entail another sentence, or settle that they are logically incompatible with each other. Now though the logician ignores much of the epistemology of this, she does concern herself with questions about what "effective methods" there are to establish that such properties are present or absent.

I'll call the combination of a language and a specified semantics for it a logical system. (Usually the term logic is reserved for more than this: some at-least-sound proof system is also expected.)

In the best kind of case, we'll have a decision procedure for a logical system: some effective recipe or algorithm that's guaranteed to answer our questions about logical properties, yes-or-no, after some finite number of steps. (Though there's no guarantee it will be very fast or efficient.) Proof or derivation systems for a logic are generally not of that sort, because generally such decision procedures are unavailable. In practice, proof or derivation systems are instead frameworks that let you creatively search for an answer. The proof system will ensure that if you find one, you can rely on its being correct. But it may take some ingenuity and artistry on your part to do this efficiently.

What I just said is a good picture to start with. But it needs to be filled in and qualified in some ways. First, as I hinted, in some logical settings we do have decision procedures. When we're dealing just with sentential logic, for example --- also called propositional logic and some other names --- then questions about what's logically true and what logically entails what are decidable. The familiar truth-tables are one effective decision procedure. Some fragments of first-order predicate logic are also decidable. (Monadic predicate logic is one such; there are others.) But first-order predicate logic in full generality is not decidable. Still, at least the logical truths of first-order predicate logic can at least be enumerated by a Turing machine. More on this in a moment. In some logical settings, we cannot even do that much. The logical truths of some fragments of second-order logic can also be enumerated; but not all of the logical truths of second-order logic.
To get a better handle on this, let's think more carefully about what we mean by a formal proof. A formal proof will just be some string that conforms to certain formally specified rules. (Remember that most of the arguments you're producing in this class aren't formal proofs; though they should be rigorous in the way that informal arguments in mathematics are standardly expected to be.) It should be effectively decidable whether some string does constitute a formal proof of sentence \(\phi\) in a given proof system. That's just part of what we mean by a "proof system."

Now generally a proof system is intended to track the logical properties of a specific logical system. (Though sometimes the proof system is devised first, and the corresponding semantics supplied later.) There are two important notions about how a proof system might relate to semantically-grounded logical properties of its intended logical system. We say that the proof system is sound when things that you can prove in it always do have the corresponding semantically-grounded logical properties. If a proof system is known to be unsound, we throw it away. No one wants that. In the other direction, we say that a proof system is complete when, if certain semantically-grounded logical properties are present, the corresponding results will be provable. This is harder to get. In the case of first-order predicate logic, the various proof systems we have are known to be both sound and complete. So here provability does entail semantically-grounded logical truth, and vice versa. But for some logical systems, such as second-order logic, we not only lack complete proof systems, we know it's not possible to get them. (If they're also sound. You can always get a complete proof system for cheap by making every sentence provable. But as I said, nobody wants a proof system that isn't sound.)

Proof systems that aren't complete can still be useful, because we may find proofs of sentences we didn't already know to be logical truths, and so finding a proof contributes to the advancement of our knowledge. But if the proof system isn't complete, there will always be some semantically-grounded logical truths that are "left out", that such proofs can never reach. (If there are multiple proof systems, the truths left out by one system may be different from those left out by another.)
Sometimes logicians concern themselves not just with what's logically true in a system, but with what's logically entailed by a given set of axioms. It's natural to think that questions about decidability, or provability, with respect to the latter will go hand-in-hand with questions about decidability, or provability, in the background logic. However that is not so. Some fragments of arithmetic, for example, are decidable even though the logical properties of their language are not decidable in general. And other fragments of arithmetic, famously, must fall short of being able to prove all their truths, though their background logic does not have the corresponding failing. We will discuss this all later. For now, I'm just trying to give you an initial orientation.
Let's return to the difference between proof systems and decision procedures. The fundamental difference here is that a decision procedure is guaranteed eventually to give you a yes-or-no answer. Whereas, with a proof system that happens not to be a decision procedure, you lack the guarantee of getting one of the answers. You can try and try to prove something, but the mere fact that you've failed so far doesn't mean that success isn't still out there waiting around the corner to be had. Now in some cases we can (informally) prove that particular claims will be (formally) unprovable. But in general we can't rely on knowing that. The mere existence of a proof system won't in general enable us to effectively decide whether some claim that's so far unproven is in principle unprovable.

I said earlier that "in practice," proof systems require ingenuity and artistry. By their nature, decision procedures will be "mindless and mechanical." Theoretically, there's nothing about a proof system which demands creativity, and no sense in which creativity can stretch the limits of what the proof system may achieve via more mindless, not-necessarily-efficient, strategies. (Creativity may help you to discover a different proof system.) After all, formal proofs are just finite strings, so we can enumerate all the candidates. Moreoever, we can effectively decide whether any given candidate really is a proof in a given system of a given sentence \(\phi\). So if there's a proof of \(\phi\) to be had, there's an effective enumerating procedure that's guaranteed eventually to yield it. It's just that, in the case of undecidable systems, we won't be able to effectively tell whether a proof we've so far failed to find will never arrive.

The place for "ingenuity" and "artistry" in the use of a proof system is in perhaps finding a proof of the desired conclusion faster than the effective enumerating procedure I just described would.
We'll come back to all of this. The important points for now are that: (a) facts about what's logically true, what logically entails what, and so on, are grounded in a language's semantics. (b) For some systems, we have effective procedures that are sound, complete, and moreover capable of deciding whether those properties are present. They'll not just enable us to prove the properties are present when and only when they are; they'll also enable us to tell that the properties are absent. (c) For other systems, like predicate logic, we only have the first two virtues. Our proof systems will enable us to prove the properties are present when and only when they are. (d) For other systems --- ones that we won't look at for a while and if we do we'll only look at briefly --- the gods are even less kind.

But since we're going to be talking largely about systems of the sort described in (b) and (c), these are cases where many semantically-grounded properties coincide with many proof-grounded properties. And it can be hard to keep track of which words goes with which notions. For instance, is a "tautology" defined as semantically-guaranteed-to-be-true, or as provable-from-no-premises? Since the semantic properties and the provability properties happen to coincide in the region we'll first and mainly be exploring, this can be very hard to keep track of. Indeed authors use some of these words differently, and some authors aren't even consistent with their own usage. Nonetheless, I think this is worth putting some effort into keeping track of. When the context of inquiry changes, you'll want to know whether you're talking about a notion that is stipulated to be semantic, and may or may not as a interesting logical result turn out to coincide with something proof-theoretic, or whether it's the other way around.
- Standard practice is to use the terms logical truth and logical consequence/entailment for the semantically-grounded notions. Some authors say "semantic entailment" to emphasize this. The core idea here is that \(\phi\) is logically true in system \(S\) (written \(\models_S\phi\), normally I will omit the subscript and leave it to context to settle what system is being discussed) when \(\phi\) will remain true no matter how its non-logical predicate, sentence, functor, and term constants are interpreted, and no matter what the extra-linguistic world is like (so long as \(S\)'s domain of quantification, if it has quantifiers, is not empty). And a set of sentences \(\Gamma\) logically entail \(\phi\) (written \(\Gamma\models\phi\)) when no matter what variations of the sort just described are allowed, whenever \(\Gamma\) are all true, so too is \(\phi\). We will make these ideas precise later.
Logical truths are also called valid or universally valid sentences. This is confusing given the use of "valid" in mainstream philosophy to mean an argument whose premises logically entail its conclusion. (Amongst logicians, "valid" and "sound" are seldom used with their mainstream philosophical meanings.) When something is a logical truth (valid sentence) of sentential logic, it's called a tautology. Some authors also apply the term "tautology" to logical truths in other systems that have the form of a sentential tautology, such as \(\forall x Fx \vee \neg\forall x Fx\). These count as logical truths because of structural properties they have that are represented by sentential logic.
- Standard practice is to say that two expressions (primarily, formulas) are logically equivalent (some say "semantically equivalent") when they come out having the same interpretation, no matter what variations of the sort previously described are allowed. We can write that \(\phi\) and \(\psi\) are logically equivalent as \(\phi\mathbin{{=}\!|{\models}}\psi\). We'll see another way to express this same idea later, and also how to express some other claims that in the context of the logical systems we're looking at turn out to be equivalent.
See Sider pp. 33--37, 95--98; Bostock pp. 24--25; Gamut pp. 46, 48--52, 99; and Burgess pp. 119--20, 148 for expositions of these notions.
- One natural correlative notion would be of a logical falsehood: a sentence guaranteed to come out false, no matter what variations of the sort previously described are allowed. Gamut calls this a "contradiction" (pp. 52--53, 99). I suspect some authors use that term for a provability notion instead, but I haven't yet caught any of the ones we're looking at doing it.
- Another natural correlative notion would be of a set of sentences \(\Gamma\) that are logically incompatible, that is, that are guaranteed never to come out simultaneously true, no matter what variations of the sort previously described are allowed. We can represent that as \(\Gamma \models \bot\). Many authors (Burgess is an example) call such sets unsatisfiable (pp. 120, 126, 148).
  
  Here "satisfying" is something that a model or interpretation does to a set of sentences. Earlier we used that verb in a different way: it was something that an object might do to an open formula like \(Fx \wedge x<z\), with respect to the free variable \(x\). See point 10, near the bottom of this page.
  
  Instead of "unsatisfiable," Bostock uses the term inconsistent (pp. 24, 89, 136--7). But other authors use that term for a provability notion, instead, namely, being premises from which one can prove \(\bot\). (See Sider p. 62; Gamut p. 150; Burgess pp. 148, 169.) I've read that historically, the term "inconsistent" was sometimes used in each of these ways --- even before it was established that in mainstream logic the two notions coincided. I'll try to explicitly use the terms \(satisfiable\) and \(consistent_\vdash \) for the semantic and provability notions, respectively, and avoid using the bare unqualified term \(consistent\).
  
  Note that \(\Gamma~is~inconsistent_\vdash \) is itself defined in a variety of ways:
  - \(\Gamma\vdash \bot\)
  - \(\Gamma\vdash \phi\) and \(\Gamma\vdash \neg\phi\), for some sentence \(\phi\)
  - \(\Gamma\vdash \phi\), for every sentence \(\phi\)
Some other provability notions:
- To say that a sentence \(\phi\) is provable in some proof system \(P\), from premises \(\Gamma\), we write \(\Gamma\vdash _P\phi\). (Here too we will omit the subscript if it's clear from context.) If \(\phi\) is provable from no premises, we write \(\vdash \phi\). Sometimes such \(\phi\)s are called theorems (see Sider pp. 47, 50). But that term is also sometimes used for sentences provable from some specified set of extra-logical axioms (see Burgess p. 191, and below).
- Sometimes the word theory is used for any set of sentences in the language of a given logical system; other times such sets are assumed to be closed under \(\vdash \) (see for example, Burgess p. 191). In either case, we say that a sentence is a theorem of some theory (rather than just a theorem simpliciter, or a theorem of logic) if it's provable from the theory-set. When the theory-set is closed under \(\vdash \), being a theorem will be the same as being a member of the theory-set. Sometimes these theory-sets are, and other times they aren't, assumed to be effectively enumerable.

There are several notions that are provably equivalent in the systems we'll be looking at, but which have different meanings. Try not to confuse them.

Semantic	Proof-theoretic
(a) \(\Gamma,\phi\models\psi\)	(c) \(\Gamma,\phi\vdash \psi\)
(b) \(\Gamma\models\phi\supset \psi\)	(d) \(\Gamma\vdash \phi\supset \psi\)

(e) \(\phi\mathbin{{=}\!\|{\models}}\psi\)	(g) \(\phi\dashv\,\vdash \psi\)
(f) \(\models\phi\subset\supset \psi\)	(h) \(\vdash \phi\subset\supset \psi\)

Try to explain to yourself in English what each of these eight cells says. Because some of the symbols may be unfamiliar, I remind you that \(\subset\supset \) is my symbol for the biconditional, and \(\mathbin{{=}\!|{\models}}\) means logical equivalence. \(\dashv\,\vdash \) means that each argument suffices to prove the other.

It's quick work to prove that (a) is true iff (b) is, and likewise (e) and (f). This will follow just from the meanings of \(\models\) and the semantics we assign to \(\supset \) and \(\subset\supset \). However, though these equivalences are easily established, they are susbtantive results. Potentially they could fail to hold in some other logical system.

(d) will imply (c) in any proof-system that has a rule of modus ponens; and similarly for (h) and (g). But the other direction is trickier. Going from (c) to (d) requires that a deduction theorem holds for the proof system in question, and this is not always trivial to establish. See Sider pp. 58--59 for a proof that the deduction theorem holds for his axiomatic proof system for sentential logic; see also Bostock pp. 203--6, 222--24.

Now when a proof system is sound in the sense explained before, then we can rely on each of the right-hand columns to entail the corresponding left-hand columns. (Sider calls this strong soundness, see his Exercise 2.9 on p. 55.) If on the other hand, the proof system is complete, then we can rely on the left-hand columns to entail the corresponding right-hand columns. And in fact the proof systems we'll be looking at are sound and complete, and deduction theorems for them (of some sort) do hold. So cells (a)--(d) do turn out to be equivalent in our systems, and so too cells (e)--(h).

As I said, though, they say different things. When working with other logical systems you can't automatically assume they are equivalent. That needs to be verified.

Let's work through the way I'd formulate a semantics for a predicate logic. As I do this, I'll comment on some of the variation in notation you see among different authors, and also some slightly different theoretical choices they make.

Our system will have the familiar logical operators \(\supset \), \(\subset\supset \), \(\wedge\), \(\vee\), and \(\neg\), as well as \(\bot\), \(\top\), \(=\), \(\forall\), and \(\exists\).

I'll assume a stock of variables \(x\), \(x'\), \(x''\), \(\dots\).

Our system's signature will include a stock of term constants \(a\), \(a'\), \(a''\), \(\dots\); a stock of unary functors \(f\), \(f'\), \(f''\), \(\dots\); a stock of binary functors \(g\), \(g'\), \(g''\), \(\dots\); a stock of sentence constants \(P\), \(P'\), \(P''\), \(\dots\); a stock of unary predicates \(F\), \(F'\), \(F''\), \(\dots\); and a stock of binary predicates \(G\), \(G'\), \(G''\), \(\dots\). You can add more if you like, or less if you like, but this should be enough to give you the central ideas.

See if you can write a grammar for the language.

To give the semantics, we need a domain of quantification \(\mathscr{D}\), a function \(\mathscr{I}\) that interprets the system's atomic vocabulary, and some way to temporarily associate variables with elements from the domain. I'll do this using an assignment function \(q\), like Sider and most authors nowadays do. (See Sider pp. 93--5; Bostock describes this strategy on pp. 86--89 but prefers a more substitutional approach; see also Gamut pp. 89--99.) It's standard to pack all the stuff you need to do the semantics, apart from the assignment, into a single structure we call a model (sometimes it's just called a structure). In this case the model will just be a pair of the domain \(\mathscr{D}\) and the atomic interpretation function \(\mathscr{I}\). When we move to modal logic, the model will also contain information about what possible worlds there are, and possibly some apparatus to keep track of which elements from the domain exist in which world. When we're dealing with sentential logic, all there is to the model is the atomic interpretation function \(\mathscr{I}\), and in that case, some authors no longer use the word \(model\) but I will. I will represent our model as \(\mathscr{M} = (\mathscr{D}, \mathscr{I})\). When I need to talk about multiple models, I'll call the second one \(\mathscr{M}' = (\mathscr{D}', \mathscr{I}')\) and so on.

Now what assignment we use has to be tailored to our choice of model (in particular to what the domain is), but I will just use the symbol \(q\) to represent an arbitrary assignment, and take it as given that this function is appropriate to the model we're working with. The role of the assignment is to keep track of which elements from the domain we're temporarily associating each variable \(x\), \(x'\), \(x''\), \(\dots\) with. Bostock idiosyncratically has the assignment handle the interpretation of term constants (\(a\), \(a'\), \(a''\), \(\dots\)) too (p. 88), and this has some merits but on balance I think it's too much trouble.

When talking about the semantics of programming languages, theorists call assignment functions environments instead.

Given a particular assignment \(q\), we'll need to talk about variant assignments that match \(q\) at every value excepting that they assign a specified element \(d\in \mathscr{D}\) to a specified variable \(x\). Sider would write this as \(q^x_d\) (see his p. 93). Bostock would write \(q_x\) (see his p. 88). I will write it as \(q[x\mathbin{:=}d]\), which I think is clearer.

The role of \(\mathscr{I}\) is to associate with each \({\color{green}{a}}\) some single element from the domain, with each \({\color{green}{f}}\) some total function \(\mathscr{D}\to \mathscr{D}\), with each \({\color{green}{g}}\) some total function \(\mathscr{D}^2\to \mathscr{D}\), with each \({\color{green}{P}}\) some truth value, with each \({\color{green}{F}}\) some set of elements from the domain (or as I'll assume, some function \(\mathscr{D}\to 2\)), and with each \({\color{green}{G}}\) some set of pairs of elements from the domain (or some function \(\mathscr{D}^2\to 2\)). I hope this will all seem quite natural.

If we were working with a simpler signature, we might not need all of this: perhaps we wouldn't have any functors, for example. Or we might just be working with a sentential logic, then \(\mathscr{I}\) would only need to handle the interpretation of the sentence constants. In the latter case, some authors talk about \(\mathscr{I}\) as an \(assignment\) of truth-values to sentence constants. But I think it may be confusing to use that word both for what \(\mathscr{I}\) does (in sentential logic) and for what \(q\) does (in predicate logic). So I'll try to use the word \(assignment\) only in connection with \(q\).

Some authors proceed by taking \(\mathscr{I}\), defined over the atomic vocabulary, and extending it until it gives an interpretation to all the complex formulas in the language, too. Other authors define a separate function, which pulls values from \(\mathscr{I}\) in the atomic cases. I prefer this second strategy; but not much hangs on it.

As I commented at the end of the notes on strings, authors use a variety of notations for the semantic interpretation of terms and formulas. I will use \([\![\, \dots \,]\!]\) to express the semantic interpretation of both formulas and terms. This always has to be relativized to a model, which I'll write with a subscript. In the case of predicate logic, it also needs to be relativized to an assignment, which I'll write with another subscript.

Now we can define the interpretation of arbitrary terms and formulas, relative to a model \(\mathscr{M}=(\mathscr{D},\mathscr{I})\) and assignment \(q\):
1. For any term constant \({\color{green}{a}}\), \([\![\, {\color{green}{a}} \,]\!]_{\mathscr{M}\,q}\) is \(\mathscr{I}({\color{green}{a}})\) (an element from \(\mathscr{D}\)).
2. For any variable \({\color{green}{x}}\), \([\![\, {\color{green}{x}} \,]\!]_{\mathscr{M}\,q}\) is \(q({\color{green}{x}})\) (an element from \(\mathscr{D}\)).
3. For any monadic functor \({\color{green}{f}}\) and term \(\alpha\), \([\![\, {\color{green}{f}} \alpha \,]\!]_{\mathscr{M}\,q}\) is \(\mathscr{I}({\color{green}{f}})\) (a total function \(\mathscr{D}\to \mathscr{D}\)) applied to \([\![\, \alpha \,]\!]_{\mathscr{M}\,q}\).
4. For any binary functor \({\color{green}{g}}\) and terms \(\alpha\), \(\beta\), \([\![\, {\color{green}{g}} \alpha\beta \,]\!]_{\mathscr{M}\,q}\) is \(\mathscr{I}({\color{green}{g}})\) (a total function \(\mathscr{D}^2\to \mathscr{D}\)) applied to \(([\![\, \alpha \,]\!]_{\mathscr{M}\,q},[\![\, \beta \,]\!]_{\mathscr{M}\,q})\).
5. For any sentence constant \({\color{green}{P}}\), \([\![\, {\color{green}{P}} \,]\!]_{\mathscr{M}\,q}\) is \(\mathscr{I}({\color{green}{P}})\) (a truth value).
6. For any monadic predicate \({\color{green}{F}}\) and term \(\alpha\), \([\![\, {\color{green}{F}} \alpha \,]\!]_{\mathscr{M}\,q}\) is \(\mathscr{I}({\color{green}{F}})\) (a function \(\mathscr{D}\to 2\)) applied to \([\![\, \alpha \,]\!]_{\mathscr{M}\,q}\).
7. For any dyadic predicate \({\color{green}{G}}\) and terms \(\alpha\), \(\beta\), \([\![\, {\color{green}{G}} \alpha\beta \,]\!]_{\mathscr{M}\,q}\) is \(\mathscr{I}({\color{green}{G}})\) (a function \(\mathscr{D}^2\to 2\)) applied to \(([\![\, \alpha \,]\!]_{\mathscr{M}\,q},[\![\, \beta \,]\!]_{\mathscr{M}\,q})\).
8. For any terms \(\alpha\), \(\beta\), \([\![\, \alpha = \beta \,]\!]_{\mathscr{M}\,q}\) is true when \([\![\, \alpha \,]\!]_{\mathscr{M}\,q}\) is identical to \([\![\, \beta \,]\!]_{\mathscr{M}\,q}\), else false.
9. \([\![\, \bot \,]\!]_{\mathscr{M}\,q}\) is always false.
10. \([\![\, \top \,]\!]_{\mathscr{M}\,q}\) is always true.
11. For any formula \(\phi\), \([\![\, \neg\phi \,]\!]_{\mathscr{M}\,q}\) is true when \([\![\, \phi \,]\!]_{\mathscr{M}\,q}\) is false, else false.
12. For any formulas \(\phi\) and \(\psi\), \([\![\, \phi\vee\psi \,]\!]_{\mathscr{M}\,q}\) is true when either \([\![\, \phi \,]\!]_{\mathscr{M}\,q}\) or \([\![\, \psi \,]\!]_{\mathscr{M}\,q}\) is true, else false.
13. For any formulas \(\phi\) and \(\psi\), \([\![\, \phi\wedge\psi \,]\!]_{\mathscr{M}\,q}\) is true when both \([\![\, \phi \,]\!]_{\mathscr{M}\,q}\) and \([\![\, \psi \,]\!]_{\mathscr{M}\,q}\) are true, else false.
14. For any formulas \(\phi\) and \(\psi\), \([\![\, \phi\supset \psi \,]\!]_{\mathscr{M}\,q}\) is true when \([\![\, \phi \,]\!]_{\mathscr{M}\,q} \le [\![\, \psi \,]\!]_{\mathscr{M}\,q}\), where false is assumed to be \(<\) true, else false.
15. For any formulas \(\phi\) and \(\psi\), \([\![\, \phi\subset\supset \psi \,]\!]_{\mathscr{M}\,q}\) is true when \([\![\, \phi \,]\!]_{\mathscr{M}\,q} = [\![\, \psi \,]\!]_{\mathscr{M}\,q}\), else false.
16. For any formula \(\phi\) and variable \({\color{green}{x}}\), \([\![\, \forall {\color{green}{x}}\phi \,]\!]_{\mathscr{M}\,q}\) is true when for every \(d\in \mathscr{D}\), \([\![\, \phi \,]\!]_{\mathscr{M}\,q[{\color{green}{x}}\mathbin{:=}d]}\) is true.
17. For any formula \(\phi\) and variable \({\color{green}{x}}\), \([\![\, \exists {\color{green}{x}}\phi \,]\!]_{\mathscr{M}\,q}\) is true when for some \(d\in \mathscr{D}\), \([\![\, \phi \,]\!]_{\mathscr{M}\,q[{\color{green}{x}}\mathbin{:=}d]}\) is true.
Given this definition, we can say that \(\phi\) and \(\psi\) are logically equivalent (written earlier as \(\phi\mathbin{{=}\!|{\models}}\psi\)) when \([\![\, \phi \,]\!]_{\mathscr{M}\,q} = [\![\, \psi \,]\!]_{\mathscr{M}\,q}\) for every choice of \(\mathscr{M}\) and (appropriate) \(q\). This is not some separate fact that turns out to be equivalent to \(\phi\mathbin{{=}\!|{\models}}\psi\); rather it is what that notation means.

As I said, when dealing with sentential logic this can all be simpler and you can omit the assignments and just write \([\![\, \phi \,]\!]_{\mathscr{M}}\), and even then all there is to the model is the interpretation \(\mathscr{I}\) of atomic vocabulary. Some authors would also write \([\![\, \phi \,]\!]_{\mathscr{M}}\) in the case of predicate logic, when \(\phi\) is a closed formula and so is known to have the same interpretation on every assignment. I think this can be misleading, so I will instead write like this:
- \([\![\, \phi \,]\!]_{\mathscr{M}\,\mathrm{fun}}\) is the function from assignments \(q\) (appropriate to \(\mathscr{M}\)) to \([\![\, \phi \,]\!]_{\mathscr{M}\,q}\).
- \([\![\, \phi \,]\!]_{\mathscr{M}\,\mathrm{set}}\) is the set of assignments \(q\) (appropriate to \(\mathscr{M}\)) where \([\![\, \phi \,]\!]_{\mathscr{M}\,q}\) is true.
- \([\![\, \phi \,]\!]_{\mathscr{M}\,\mathrm{all}}\) is defined only when \([\![\, \phi \,]\!]_{\mathscr{M}\,q}\) is invariant across all (appropriate) choices of \(q\), and is identical to their value.
Using this notation, a closed formula \(\phi\) is counted as true in a model \(\mathscr{M}\) when \([\![\, \phi \,]\!]_{\mathscr{M}\,\mathrm{all}}\) is true. An open or closed formula \(\phi\) is counted as a logical truth when \([\![\, \phi \,]\!]_{\mathscr{M}\,q}\) is true in every model \(\mathscr{M}\) and (appropriate) assignment \(q\). (That is, when \([\![\, \phi \,]\!]_{\mathscr{M}\,\mathrm{all}}\) is true in every model \(\mathscr{M}\).) It's counted as a logical consequence of \(\Gamma\) when, whenever \([\![\, \gamma \,]\!]_{\mathscr{M}\,q}\) are true for all \(\gamma\in\Gamma\), \([\![\, \phi \,]\!]_{\mathscr{M}\,q}\) is also true, for every model \(\mathscr{M}\) and (appropriate) assignment \(q\). And so on.

There are some tricky issues when we allow \(\models\) to take open formulas as arguments; see this passage from Bostock. The definition of logical consequence just given accommodates those issues; but they can crop up again elsewhere (see Sider pp. 99--100).
When a formula \(\phi\) has exactly one free variable \({\color{green}{x}}\) (it may have multiple occurrences), authors sometimes talk about an object \(d\)'s satisfying \(\phi\). This assumes some intended model \(\mathscr{M}\) and means that \([\![\, \phi \,]\!]_{\mathscr{M}\,q[{\color{green}{x}}\mathbin{:=}d]}\) is true, for any (and all) appropriate choices of \(q\).

Bostock also talks about assignments satisfying arbitrary open formulas (pp. 87--88). I hadn't seen this way of talking before. There is a different strategy for handling free variables, introduced by Tarski, where instead of assignment functions we work with sequences of elements from the domain, and there we do talk about arbitrary formulas being satisfied by a sequence.
Why do we give \(=\) a privileged role in the semantics, distinct from other binary predicates?

Well, it's possible to have other binary predicates in a system that express equivalence relations --- that is, that are reflexive, symmetric, and transitive. A notion of equality requires more than that, though. It requires congruence, that is, that no predicates in the language should discriminate objects that are equal. (Keep your worries about Frege's Problem to yourself.) Now consider a model whose domain is \(\{\, a_1,a_2,b,c_1,c_2 \,\}\) and whose atomic interpretation function \(\mathscr{I}\) gives binary predicate \(H\) the extension \(\{\, (a_1,a_1),(a_1,a_2),(a_2,a_1),(a_2,a_2),(b,b),(c_1,c_1),(c_1,c_2),(c_2,c_1),(c_2,c_2) \,\}\), and gives unary predicate \(F\) the extension \(\{\, a_1,a_2,b \,\}\). Suppose the system has no other predicates. Then in this model, \(H\) could function as an equality relation. Objects that it classes together are indiscernible by other predicates in the language. However, this is all sensitive to the contingencies of the model. If \(F\) gets a different extension, some objects that \(H\) classes together may be discerned. The point of having a special relation \(=\) whose behavior is fixed by the semantics in a model-independent way is to enable us to reason about equality in a way that isn't hostage to those accidents.

It is possible to say in the language of first-order predicate logic that some relation is reflexive, symmetric, and transitive. Some theories include such claims among their axioms. However it is not possible to say that a relation is a congruence for all the predicates of the language. That is, there's no sentence we can write in a first-order system that on every model will be true iff that condition obtains. In second-order logic, as standardly intepreted, it becomes possible to say this; and so in such systems, it is possible to stipulate as an axiom that some relation \(H\) functions as an equality in a given theory. This is sometimes put by saying that the identity relation is definable in second-order logic.