% Notes on Syntax

1. As we've mentioned before, authors will often want to write things like $P \vee Q \vee (R \horseshoe S)$, leaving off some parentheses, instead of $((P \vee Q) \vee (R \horseshoe S))$, which their official grammars may require. No ambiguity threatens just from leaving off the outermost parentheses. In a long disjunction, there will be a *syntactic* difference between different ways of parsing the expression, but we know this won't make any semantic difference, so it's an ambiguity we can ignore. Alternatively, we could just adopt a background convention that disjunction will always associate to one side, say to the left. The same goes for conjunction. And for long negations, such as $\neg\neg\neg P$, there is just one parsing that makes this expression well-formed.

But now what about when multiple logical operators occur in an expression, and there aren't enough parentheses to force a single disambiguation? Well, we could also adopt background conventions about which operators consume their arguments first. Such conventions are standardly understood, and relied on, for how negation should interact with other operators. We assume that negation is very "greedy": it gets to consume arguments before other operators do. Thus we parse $\neg P \vee Q$ as $(\neg P) \vee Q$, not as $\neg (P \vee Q)$; and we parse $\neg P \horseshoe Q$ as $(\neg P) \horseshoe Q$, not as $\neg (P \horseshoe Q)$. Quantifiers are usually treated as being as greedy as negation. It's also common, though less safely taken for granted, that $\horseshoe$ should be a very "non-greedy" operator. Thus if you see $P \vee Q \horseshoe R$, the author probably meant you to understand $(P \vee Q) \horseshoe R$, not $P \vee (Q \horseshoe R)$. The same goes for the biconditional, which you may see written as $\equiv$ or $\leftrightarrow$ but which I will write as $\bicond$. (The other symbols are used for too many different things.)

You should never write things like $P \horseshoe Q \horseshoe R$ or $P \vee Q \wedge R$. There is no bar to our adopting background conventions that would disambiguate these in a unique way. We could even "bake" some such conventions into our official grammar. But no such conventions are now broadly enough in place. The potential for confusion here is just too great. Use parentheses.
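If it helps to see what these conventions come to in practice, here's a rough sketch in Haskell (the datatype, the names, and the ASCII connectives `~`, `v`, `->` are all just made up for illustration; none of this is part of the official grammars we're discussing). The conventions amount to selecting one parse tree for an under-parenthesized string over its rivals:

```haskell
-- A sketch (my own names, ASCII connectives) of a datatype for sentential
-- formulas, used to show which parse tree the precedence conventions above
-- pick out for an under-parenthesized string.
data Formula
  = Atom String             -- sentence letters: P, Q, R, ...
  | Not Formula             -- negation, the "greediest" operator
  | And Formula Formula
  | Or  Formula Formula
  | If  Formula Formula     -- the horseshoe, the least greedy operator
  deriving (Eq, Show)

-- Fully parenthesized display, so we can see exactly which parse we mean.
display :: Formula -> String
display (Atom s)  = s
display (Not p)   = "~" ++ display p
display (And p q) = "(" ++ display p ++ " & " ++ display q ++ ")"
display (Or  p q) = "(" ++ display p ++ " v " ++ display q ++ ")"
display (If  p q) = "(" ++ display p ++ " -> " ++ display q ++ ")"

-- Under the conventions above, the string "~P v Q -> R" is read this way ...
conventionalReading :: Formula
conventionalReading = If (Or (Not (Atom "P")) (Atom "Q")) (Atom "R")

-- ... and not, for instance, this way.
rejectedReading :: Formula
rejectedReading = Not (Or (Atom "P") (If (Atom "Q") (Atom "R")))

main :: IO ()
main = do
  putStrLn (display conventionalReading)  -- ((~P v Q) -> R)
  putStrLn (display rejectedReading)      -- ~(P v (Q -> R))
```

A precedence-aware parser would construct the first tree automatically; the conventions are what license it to do so.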
2. Or if you really don't want to use parentheses, there is another convention you could adopt. This involves placing a period in the expression. The period is understood to work like a left-parenthesis, with the corresponding right-parenthesis being invisible, but located in the sentence as far rightwards as is possible, without leaving the current syntactic constituent. Thus although:

> $(P \vee Q \horseshoe R) \wedge S$

is understood as:

> $((P \vee Q) \horseshoe R) \wedge S$

we understand:

> $(P \vee \pmb{\color{purple}.} \, Q \horseshoe R) \wedge S$

as:

> $(P \vee \pmb{\color{purple}(}Q \horseshoe R\pmb{\color{purple})}) \wedge S$

with the invisible right-parenthesis coming just before the right-parenthesis that closes the subexpression $(P \vee {.}\, Q \horseshoe R)$. Note that we *don't* understand it as:

> $(P \vee \pmb{\color{purple}(}Q \horseshoe R) \wedge S\pmb{\color{purple})}$

because that would involve "jumping" past a right-parenthesis whose matching left-parenthesis was opened before we came to the period.

This convention is especially handy when dealing with quantifiers. Thus on our current conventions:

> $(\forall x Fx \vee Gx) \wedge S$

means:

> $((\forall x Fx) \vee Gx) \wedge S$

But:

> $(\forall x . Fx \vee Gx) \wedge S$

means:

> $(\forall x (Fx \vee Gx)) \wedge S$

This convention is a hold-over from a more complicated set of conventions Russell and Whitehead used in *Principia Mathematica*. In the programming language Haskell, $\$$ is used the way I've described $.$ as working here. (Haskell then uses $.$ to express function composition.)

3. You may sometimes see colons used in combination with quantifiers. This could mean a variety of things. The colon may just be being used in the way I described periods as working above. Or you may be seeing a **restricted quantifier**, as in the following expressions:

> $[\forall x\colon Fx] Gx$
> $[\forall x\colon Fx \wedge x \lt \dots] Gx$

These restrict the quantifier to the $x$s satisfying the formula after the colon. With unrestricted quantifiers, they'd be expressed as:

> $\forall x. Fx \horseshoe Gx$
> $\forall x. Fx \wedge x \lt \dots \horseshoe Gx$

Notice the combination of colon and period in:

> $\forall x\colon Fx \wedge x \lt \dots . Gx$

You may also see colons after a $\lambda$, but there what follows the colon is usually not an arbitrary restricting formula like $Fx \wedge x \lt \dots$; rather you'll see things like:

> $\lambda x\colon \tau . Gx$

Here $\tau$ would have to be an expression designating a type, not an arbitrary open formula.

4. I call the well-formed strings of the languages we're working with **expressions**. In some of these languages, the only expressions we have are sentences. In others, we have formulas more generally (which may contain free variables), of which sentences are a proper subset. We also have terms, like $x$ and $a$ and in some languages $a + x$. If $E$ is an expression (or formula or term), a **subexpression** (subformula, subterm) of $E$ is some segment of $E$ that is also an expression (formula, term) and is a syntactic constituent of $E$. This becomes tricky with all the background conventions we're permitting. Thus in the formula:

> $Fx \vee \neg P \horseshoe Gx$

I trust it will be clear to everyone that $Gx$ is one subformula, and that $\neg$ is not a subformula. But what about $\neg P \horseshoe Gx$? Superficially, it might be tempting to count that as a subformula. But when you keep track of our background conventions, you'll realize that no part of the parse tree for the original expression corresponds to just that string segment. $Fx \vee \neg P$, on the other hand, *is* a syntactic constituent of the original formula, and so does count as a subformula.

For some of what we'll do, it's best to be thinking about *occurrences* of expressions and what other expression-occurrences are their subexpressions, rather than about expression-types. Thus the first occurrence of $x$ in $Fx \vee \neg P \horseshoe Gx$, but not its second occurrence, is a subexpression of (the single occurrence of) $Fx \vee \neg P$.

Given a specific grammar, we can inductively define notions like subformula relative to that grammar. Thus, using Sider's grammar for sentential logic:

> (i) Every sentence letter is a PL-wff.
> (ii) If $\phi$ and $\psi$ are PL-wffs, then $(\phi \horseshoe \psi)$ and $\neg\phi$ are also PL-wffs.
> (iii) Only strings that can be shown to be PL-wffs using (i) and (ii) are PL-wffs.

We can say:

> (a) Every occurrence of a sentence letter has exactly itself as a subformula.
> (b) If $\phi$ and $\psi$ are wff-occurrences whose sets of subformula-occurrences are $\Phi$ and $\Psi$, respectively, then an occurrence of $\phi \horseshoe \psi$ of which they are the immediate constituents will have $\{\,$itself$\,\} \union \Phi \union \Psi$ as its subformula-occurrences.

and so on. But we'd have to do this again differently for every new grammar.
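If it helps, here's the shape that inductive definition takes in Haskell, done for formula-*types* rather than occurrences (a sketch with made-up names; tracking occurrences would mean also recording each subformula's position in the parse tree):

```haskell
-- A sketch of the inductive idea behind clauses (a) and (b), for a rendering
-- of Sider's PL grammar. Subformulas are collected here as formula-types;
-- occurrences would additionally need positions in the parse tree.
data PL
  = Letter String   -- sentence letters
  | Neg PL          -- negations
  | Cond PL PL      -- conditionals
  deriving (Eq, Show)

subformulas :: PL -> [PL]
subformulas f@(Letter _) = [f]
subformulas f@(Neg p)    = f : subformulas p
subformulas f@(Cond p q) = f : subformulas p ++ subformulas q

main :: IO ()
main = mapM_ print (subformulas (Neg (Cond (Letter "P") (Letter "Q"))))
-- lists four subformulas: the whole formula, the conditional, P, and Q
```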
See Bostock p. 22 and Gamut pp. 37, 76 for more discussion. (They just define the notion of subformula for expression-*types*; but they then go on to talk about *occurrences* of expressions within other expressions.) The important thing is for you to understand the general, unifying idea. When one expression has another as a subexpression, I'll say that the first expression contains the second.

5. The **scope** of a quantifier or operator (occurrence) consists of its proper subformulas. Thus in:

> $\neg\forall x Fx \vee Q \horseshoe Gx$

which we understand as:

> $((\neg\forall x Fx) \vee Q) \horseshoe Gx$

the scope of the $\vee$ consists of the subexpressions $\neg\forall x Fx$ and $Q$ (so here the $\vee$ has "wider scope" than the $\neg$). The scope of the $\forall x$ is just the subexpression $Fx$. Note that the final $Gx$ is not "inside the scope of" the quantifier $\forall x$.

We say that an occurrence of a variable $x$ is **bound** when it's inside the scope of any $\forall x$ or $\exists x$ (or other binding-devices we may introduce later, such as $\lambda x$ or $\desc x$). If there are multiple such, it's **bound by** the one with smallest scope. So the final $x$ in $\forall x(\exists x Fx)$ is bound by the $\exists x$, not by the $\forall x$. When an occurrence of a variable isn't bound, it's **free**. A formula or term that contains free variables is called **open**; one that contains none is called **closed**. Closed formulas are called **sentences**.

Some philosophers will encourage you to think of open formulas as representing predicates. Others will encourage you to think of them as representing propositional functions, that is, as representing functions from objects to propositions. These are subtly different ideas, and in my view both are subtly mistaken. Let's not get into the subtleties right now. For the time being, I'd just encourage you to regard those claims as rough heuristics that you might later be persuaded to refine or abandon, not as definitional stipulations.

Most logical formalisms permit combining a binding-device with a formula where the variable being bound has no free occurrences. For example: $\forall x Fy$. Semantically, this will come out just the same as $Fy$.

There might be a difference if we permitted the domain of quantification to be empty; then it might be that every $x$ --- all none of them --- is such that $Fy$ even though $Fy$ itself isn't true. If we discuss "free logics" later in the semester, this will be something we return to. In mainstream orthodox logics, though, we assume that the domain of quantification isn't empty, and that every term constant or variable can be associated with some object in the domain.

Similarly, in my example $\forall x(\exists x Fx)$, the $\forall x$ doesn't bind any occurrences of $x$. Quantifiers that don't bind any free variables are called **vacuous**. For more discussion of these various notions in this week's readings, see Sider pp. 90--91; Bostock pp. 78--79; and Gamut pp. 74, 76--77.

What do we say about the *first* $x$ in the string $\exists x Fx$, the one immediately after the $\exists$? People are all over the map about this. Some authors say that it is another occurrence of the variable $x$, and that it's bound. Others say it's neither bound nor free. Others say it's not even an occurrence of the variable $x$. I prefer this last way of talking, but as I said there is lots of variation here. Perhaps best just not to speak of it.

In expressions like:

> $\forall x(Gx \wedge \exists xFx \wedge Hx)$

and:

> $Gx \wedge \exists xFx \wedge Hx$

we say that the binding or assignment had by $x$ in the outer expression gets **shadowed by** the quantifier $\exists x$, which rebinds $x$ for its own use in the subformula $Fx$. The interpretation $x$ has in the outer scope is no longer visible inside the scope of the $\exists x$. This is perfectly legitimate, but as we'll see below, it does introduce some complexity when we're thinking about what it is to substitute one expression for another within a larger expression. There are some alternate notations for quantification where this kind of phenomenon just can't arise. I may mention them later in the term.
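Here's a sketch in Haskell (again with made-up names) of how the free/bound distinction is typically computed when formulas are built as data structures. The quantifier clauses delete the bound variable, which is also what produces the shadowing behavior just described:

```haskell
import Data.List (delete, nub)

-- A sketch (my own names) of computing the free variables of a formula in a
-- toy quantified language. The quantifier clauses remove the bound variable,
-- which is also what makes an inner quantifier shadow an outer one.
type Var = String

data Term = V Var | C String            -- variables and term constants
  deriving (Eq, Show)

data Formula
  = Pred String [Term]                  -- e.g. Pred "F" [V "x"]
  | Not Formula
  | And Formula Formula
  | Forall Var Formula
  | Exists Var Formula
  deriving (Eq, Show)

freeInTerm :: Term -> [Var]
freeInTerm (V x) = [x]
freeInTerm (C _) = []

freeVars :: Formula -> [Var]
freeVars (Pred _ ts)  = nub (concatMap freeInTerm ts)
freeVars (Not p)      = freeVars p
freeVars (And p q)    = nub (freeVars p ++ freeVars q)
freeVars (Forall x p) = delete x (freeVars p)   -- occurrences of x in p are bound here
freeVars (Exists x p) = delete x (freeVars p)

-- Ax(Gx & ExFx & Hx): the inner Ex shadows the outer Ax inside Fx,
-- but every occurrence of x is still bound, so the formula is closed.
shadowed :: Formula
shadowed = Forall "x" (And (Pred "G" [V "x"])
                           (And (Exists "x" (Pred "F" [V "x"]))
                                (Pred "H" [V "x"])))

main :: IO ()
main = print (freeVars shadowed)   -- []
```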
6. Most everyone thinks $\horseshoe$ is a bit weird when they first encounter it. If you still feel that way, and you have restricted quantification as a primitive in your language, you might think of $\phi\horseshoe \psi$ as defined by $\forall x\colon \phi. \psi$, where $x$ doesn't occur free in $\phi$ or $\psi$. That will be true if all the $x$s such that $\phi$ (which will be all the objects if $\phi$ is true, or no objects if $\phi$ is false) are such that $\psi$. In other words, it will be true just in case $\phi$ is false or $\psi$ is true, which is just the meaning of $\phi \horseshoe \psi$.

There are some interesting contexts in logics *weaker* than "classical" logic where you want a truth-functional conditional that's *not* equivalent to the one just explained. But here we're getting to the limits of what I understand well.

7. We'll sometimes have to talk about the result of **substituting** some expression $\tau$ in place of another expression $\sigma$ in a larger expression $E$. For discussion, see Sider p. 100; Bostock pp. 80--81; and Gamut pp. 78, 99--100. A variety of notations are used to express this. Commonly authors will say something like $E[\tau/\sigma]$, with the idea being that you "divide out" the free occurrences of $\sigma$ from $E$ and replace them with $\tau$. But sometimes the $E$ comes after the $[\tau/\sigma]$, sometimes the $\sigma$ comes on top rather than the bottom(!), and sometimes the notation looks rather different than this. I find this very frustrating. I've adopted a notation that, while not already in widespread use, at least has the advantage of being very difficult to misunderstand. I will say:

> $E[\sigma \leftarrow \tau]$

to mean the result of substituting $\tau$ for all free occurrences of $\sigma$ in $E$.

You have to be careful when doing this. Let $E$ be the expression:

> $Fxy \wedge \exists y Gxy$

Now suppose we want to generate $E[y\leftarrow x+1]$. Only the first occurrence of $y$ should then get replaced, because the last occurrence is bound by the quantifier $\exists y$.

Suppose instead we want to generate $E[x\leftarrow y+1]$. Here the last occurrence of $x$ is *free*, so it *should* be replaced. But if we simply replace the $x$s with $y+1$, we'd get:

> $F(y+1)y \wedge \exists y G(y+1)y$

which isn't correct. For there we'd have introduced a new variable binding (the $\exists y$ will bind the $y$ in $y+1$) that we shouldn't have. Instead, we want to perform substitution in a way that doesn't introduce any new "free variable captures".

We have two choices about how to handle this. First, we could simply declare the result of a substitution

> $\dots[x\leftarrow \tau]$

into a subexpression

> $\exists y \dots$

where $x$ occurs freely to be undefined when $\tau$ contains free occurrences of $y$. Alternatively, we could say that the substitution *is* defined, but introduces a renaming of the bound variable in $\exists y \dots$ to some new variable not occurring in $\dots$ and not occurring free in $\tau$, like so:

> $F(y+1)y \wedge \exists z G(y+1)z$

Some authors take a different strategy. They say that $(\dots\exists y Gxy)[x\leftarrow y+1]$ *is* defined to be $\dots\exists y G(y+1)y$; it's just that $y+1$ doesn't count as being **substitutable for $x$** in that original expression. These authors then have to qualify various rules and claims they make with the proviso that the term being substituted is substitutable in the relevant ways. It's much more straightforward to just define the substitution operation so that these provisos aren't needed.
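Here's a rough Haskell sketch (my own names, not anything from the readings) of a substitution operation defined along the lines of the second choice above, where a quantifier that would capture a variable free in the incoming term first gets its bound variable renamed to a fresh one:

```haskell
import Data.List (delete, nub)

-- A sketch (my own names) of a substitution operation along the lines of the
-- second choice above: when a quantifier would capture a variable free in the
-- incoming term, its bound variable is first renamed to a fresh one.
type Var = String

data Term = V Var | Fn String [Term]   -- e.g. y+1 as Fn "+" [V "y", Fn "1" []]
  deriving (Eq, Show)

data Formula
  = Pred String [Term]
  | And Formula Formula
  | Exists Var Formula
  deriving (Eq, Show)

freeInTerm :: Term -> [Var]
freeInTerm (V x)     = [x]
freeInTerm (Fn _ ts) = nub (concatMap freeInTerm ts)

free :: Formula -> [Var]
free (Pred _ ts)  = nub (concatMap freeInTerm ts)
free (And p q)    = nub (free p ++ free q)
free (Exists y p) = delete y (free p)

substInTerm :: Var -> Term -> Term -> Term
substInTerm x t (V y) | y == x    = t
                      | otherwise = V y
substInTerm x t (Fn f ts)         = Fn f (map (substInTerm x t) ts)

-- subst x t e computes e[x <- t], replacing only free occurrences of x.
subst :: Var -> Term -> Formula -> Formula
subst x t (Pred p ts) = Pred p (map (substInTerm x t) ts)
subst x t (And p q)   = And (subst x t p) (subst x t q)
subst x t (Exists y p)
  | y == x                = Exists y p                -- no free x inside
  | y `elem` freeInTerm t = subst x t (Exists y' p')  -- rename to avoid capture
  | otherwise             = Exists y (subst x t p)
  where
    y' = freshFrom y (x : freeInTerm t ++ free p)
    p' = subst y (V y') p

-- Keep priming a variable until it clashes with nothing we care about.
freshFrom :: Var -> [Var] -> Var
freshFrom v used | v' `elem` used = freshFrom v' used
                 | otherwise      = v'
  where v' = v ++ "'"

-- E = Fxy & Ey Gxy.  E[x <- y+1] renames the bound y (here to y'),
-- so the free y in y+1 doesn't get captured.
e :: Formula
e = And (Pred "F" [V "x", V "y"]) (Exists "y" (Pred "G" [V "x", V "y"]))

main :: IO ()
main = print (subst "x" (Fn "+" [V "y", Fn "1" []]) e)
```

All of the care is in the `Exists` clause; the other clauses just push the substitution through.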

As I said at the end of point 5, there are some alternate notations for quantification where these kinds of difficulties can't arise.

8. Bostock and Sider both work with a minimal set of logical operators, rather than the full range you may be familiar with. There are multiple ways to choose a minimal set (but not every choice is equally expressive). It's even possible to have only a single operator. If I had to choose, I'd choose $\horseshoe$ and $\neg$ (or $\horseshoe$ and $\bot$) as primitives, because these make it easiest to compare "classical" logic with some weaker systems. Sider already tells you (p. 27) how to define some other operators in terms of these:

> $\phi \wedge \psi$ is short for $\neg(\phi\horseshoe\neg\psi)$
> $\phi \vee \psi$ is short for $\neg\phi\horseshoe\psi$

Some other definitions one can use:

> $\phi \vee \psi$ is short for $(\phi\horseshoe\psi)\horseshoe\psi$
> $\neg\phi$ is short for $\phi\horseshoe\bot$

If you're ever working in a "non-classical" logic, you'll have to take special care about which of these "definitions" still give intuitively correct results.
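In classical, truth-functional terms it's easy to check that these definitions come out right. Here's a quick sanity check in Haskell (my own little script, not something from the readings):

```haskell
-- A quick classical sanity check (my own script, not from the readings) that
-- the definitions above match the intended truth tables.
imp :: Bool -> Bool -> Bool
imp p q = not p || q            -- the classical horseshoe

bools :: [Bool]
bools = [False, True]

conjOK, disjOK, disjOK', negOK :: Bool
conjOK  = and [ not (p `imp` not q)   == (p && q) | p <- bools, q <- bools ]
disjOK  = and [ (not p `imp` q)       == (p || q) | p <- bools, q <- bools ]
disjOK' = and [ ((p `imp` q) `imp` q) == (p || q) | p <- bools, q <- bools ]
negOK   = and [ (p `imp` False)       == not p    | p <- bools ]   -- bottom as False

main :: IO ()
main = print (conjOK, disjOK, disjOK', negOK)   -- (True,True,True,True)
```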
For the most part, I won't be choosing a minimal set of operators. We'll take the whole familiar range of operators as primitives, even the term literals $\bot$ and $\top$, which always get interpreted as being false and true, respectively. (I won't talk about $\mathrm{xor}$, though, as there's too much variation in what notation to use for it.)

Also, when we talk about *predicate* logic, I'll assume we nonetheless still have *sentence* constants like $P$ and $Q$ in the language (as well as the literals $\bot$ and $\top$), not just predicate constants like $F$ and $G$. As I've said in our meetings, you can think of the sentence constants as though they were 0-adic predicate constants (see Bostock p. 75). And you can think of term constants like $a$ and $b$ as though they were 0-adic functors. (A **functor** is a symbol like $+$ that designates a function, in the way that a predicate designates a property or relation.)

When we shift to talking about *logical* properties of our languages, we will consider what is invariant across changes of meaning to all the non-logical constants in the language, and also changes to the domain of quantification when the language has quantifiers. But we don't change the meaning of logical operators like $\vee$ and $\horseshoe$. Neither do we change the meaning of $=$ or $\bot$ or $\top$. That is why I tend to call the latter predicate and term **literals**, rather than just **constants**.

You may ask: on what grounds is it decided whether to count something as a *logical* symbol (or literal) rather than a non-logical one? That's an *excellent* question. We will make no effort to answer it. Instead, for this class, we'll just be uncritically following the existing conventions. Those existing conventions include treating symbols like $0$, $+$, $\lt$, $\in$, and so on as *non*-logical constants, rather than as logical symbols or literals like $\vee$, $\horseshoe$, $\bot$, and $=$. If you want to constrain the meanings of $0$, $+$, $\lt$, or $\in$, you need to do so by adding *extra-logical axioms* to your system, and talking about your **theory** of so-and-so, rather than thinking these constraints will be imposed by your *logic*. As we'll discuss later, a theory in the sense of this class is just a set of sentences, closed under provability in some background logic. (These will generally be *terrible* "theories" by the lights of philosophers of science.)

9. When one is specifying what logic one wants to talk about, it should be clear whether it has a "classical" semantics or something weaker or stronger (or just different). It should also be clear what logical symbols are present, in particular whether quantifiers are present and whether $=$ is present. (If other binding-devices like $\lambda$ or $\desc$ are present, that should also be clear.)

The term **signature** is used to describe what range of *non*-logical predicates (of whatever adicity), functors (of whatever adicity), and term constants (or functors of adicity 0) are available in a logic's language. Signatures that have no non-logical predicates of adicity greater than 1 are sometimes called **algebraic**; these signatures may include $=$ and functors of whatever adicity. If you review our discussions of algebra, you'll see we talked in terms of belonging to this-or-that set (which can be treated as a monadic predicate), operations (functors), and equalities, rather than in terms of any other relations.

It sometimes turns out to be important what range of non-logical constants is available. For example, in *monadic* first-order predicate logic, besides $=$ there are no other predicates whose adicity is greater than 1, and no functors. Such logics turn out to have interestingly different properties than first-order predicate logics more generally. (Similarly, monadic forms of second-order logic turn out to be interestingly different from second-order logics more generally.)
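If it helps, here's how one might record a signature as a simple data structure, in the same sketchy spirit as the earlier Haskell examples (the field names are mine, and the sample vocabulary is just the arithmetical symbols mentioned in point 8):

```haskell
-- A sketch of how a signature might be recorded as data: just the
-- non-logical vocabulary together with its adicities. Names are mine.
data Signature = Signature
  { predicates :: [(String, Int)]   -- non-logical predicates and their adicities
  , functors   :: [(String, Int)]   -- functors; adicity 0 covers term constants
  } deriving Show

-- Using the arithmetical symbols mentioned in point 8 (all non-logical):
-- < as a 2-place predicate, 0 as a 0-adic functor, + as a 2-adic functor.
arithmetic :: Signature
arithmetic = Signature
  { predicates = [("<", 2)]
  , functors   = [("0", 0), ("+", 2)]
  }

-- Algebraic: no non-logical predicate of adicity greater than 1.
isAlgebraic :: Signature -> Bool
isAlgebraic sig = all (\(_, adicity) -> adicity <= 1) (predicates sig)

main :: IO ()
main = print (isAlgebraic arithmetic)   -- False, because of the 2-place <
```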