Substitutions and Models, Part 3: Boolos (and Gödel)

Last time, we saw Quine’s argument that, in languages rich enough for elementary number theory,

(EQ-R) |\!\!\models P iff \models P

And in Part 1, we saw a compelling looking argument — relying on deduction and compactness theorems for substitutional and model-theoretic consequence — that (EQ-R) will entail

(EQ) \Delta |\!\!\models P iff \Delta \models P

Today, we’ll see Boolos’s argument that (EQ) is false in languages rich enough for elementary number theory.

Section 1: Consequence and Consistency

Boolos’s argument is not direct. It goes, rather, via the notion of consistency. Let’s remind ourselves of the connection between the two.

(Boolos in fact talks of satisfiability; like many logicians (and for decent reasons)  he reserves the term “consistency” for a purely syntactic notion. I’m not going to follow suit, though. “Consistency” here will be a catch-all term for concepts in the general neighborhood, and we’ll define “satisfiability” in a moment. This shouldn’t cause any confusion, since the syntactic notion Boolos would use “consistency” won’t show up in today’s post.)

Let’s first remind ourselves of the connection between consistency and consequence.

The basic idea is this: if a set of sentences is consistent, all members of the set can be true together — “can” in the same sense that, if an argument is valid, the conclusion can’t be false if the premises are true.

We say that a set \Delta is substitutionally consistent if and only if there is a substitution scheme S where S(\Delta) is true, and we say that it is model-theoretically consistent, or satisfiable, if and only if there is a model on which all the members of \Delta are true.

The crucial connection we want to preserve with these definitions is that, if \Delta \Rightarrow P, then the set \Delta \cup \{\neg P\} is not consistent. Let’s verify that this holds for our two different notions.

First, model-theoretically: if \Delta \models P, then every model of \Delta is a model of P; since no model is a model of both P and \neg P, no model is a model of \Delta \cup \{\neg P\}, and so that set is unsatisfiable. Conversely, if \Delta \not\models P, then there is some model M of \Delta that is not also a model of P; it will be a model of \neg P, and so \Delta \cup \{\neg P\} will be satisfiable.

Now, substitutionally: if \Delta |\!\!\models P, then every substitution scheme that makes \Delta true makes P true, too. Since no substitution scheme makes both P and \neg P true, every substitution scheme that makes \Delta true does not make \neg P true, and so \Delta \cup \{\neg P\} is not substitutionally consistent. Conversely, suppose \Delta |\!\!\not\models P. Then some substitution scheme S makes all of \Delta true but does not make P true. Since S(P) has to be a sentence, and every sentence is either true or false, S(P) must be false. So \neg S(P) is true; but \neg S(P) = S(\neg P) (since S has to preserve truth-functional connectives), so S is a substitution scheme that makes all of \Delta \cup \{\neg P\} true.

This gets us two biconditionals:

(TransM) \Delta \models P iff \Delta \cup \{\neg P\} is not satisfiable.

(TransS) \Delta \!\!\models P iff \Delta \cup \{\neg P\} is not substitutionally consistent.

Putting these biconditionals together with EQ and negating both sides gets us

(ConEQ*) For any set \Delta and sentence P, \Delta \cup \{\neg P\} is satisfiable iff it is substitutionally consistent.

But notice that, if ConEQ* holds, so does

(ConEQ) For any set \Gamma, \Gamma is satisfiable iff it is substitutionally consistent.

Why is that? Well, either one of the sentences in \Gamma is of the form \neg P, or not. If so, then \Gamma = \Gamma \cup \{\neg P\}, and so we get ConEQ by plugging this identity into ConEQ*. So suppose no sentence in \Gamma is of the form \neg P; then they will all be of the form P. Choose one such sentence, and notice that (a) every model of P is also a model of \neg \neg P, and (b) for every substitution scheme S, S(P) is true iff S(\neg\neg P) is true. So \Gamma will be satisfiable/substitutionally consistent if and only if \Gamma \cup \{\neg\neg P\} is. But by ConEQ*, \Gamma \cup \{\neg\neg P\} is satisfiable iff it is substitutionally consistent, and the result follows.

(The limiting case not covered here is the empty set. We can make an arbitrary choice as to whether to count the empty set as consistent or not; so long as we choose the same way on both the model-theoretic and substitutional side of things, ConEQ will hold.)

So if EQ is true, so is ConEQ. Or, more precisely (since these theses have to be implicitly indexed to languages), if EQ holds of a language, so does ConEQ. Boolos, however, will show us that ConEQ fails for first-order languages rich enough for arithmetic. Thus, so does EQ.

Section 2: Gödel and Tarski: An Aside

In the next section, I’ll give Boolos’s counterexample to ConEQ. But first we’ll briefly touch on some much-needed background on Gödel’s famous incompleteness theorem and a related result by Tarski. (I brushed up against Gödel’s results in the last post, but they’ll be more central to Boolos’s argument here so I want to go through them in a bit more detail.)

Gödel proved that first-order arithmetic was incomplete, in (roughly) the following sense: every finitely specifiable axiomatization of arithmetic will fail to prove some true arithmetical statement. The way he proved it was very clever. He noticed that syntactic entities, such as finite strings of symbols, etc., could each be “coded” by a unique natural number. He then showed it was possible to define arithmetical predicates that applied to natural numbers if and only if those numbers coded syntactic strings with various properties (i.e., properties like being a sentence, being a proof, and so on). And using these resources he showed how to build a sentence that essentially said of itself that it was unprovable. If it was false, it would be provable (and so true); so it must be true, and therefore unprovable.

Let’s get a bit more precise. We suppose we have a language LA strong enough for simple arithmetic, and PA is a true axiomatization of Peano Arithmetic in LA. Then using a Gödel numbering scheme, we can assign each finite string of LA symbols a unique code number (or “Gödel number”). There are actually many different, equally good Gödel numbering scheme; but let’s suppose we’ve fixed on one in particular.

If A is a string of symbols of LA, let \ulcorner A\urcorner denote its Gödel code number. Gödel also showed it is possible to define predicates SENT and PRF_{T} where:

(1a) If A is a sentence, PA \vdash SENT(\ulcorner A \urcorner), and
(1b) If A is not a sentence, PA \vdash \neg SENT(\ulcorner A \urcorner).

(2a) If Pr is a proof (from the axioms of a well-behaved theory T) of A, and Pr’s Gödel code is n, then PA \vdash PRF_{T}(n,\ulcorner A\urcorner); otherwise
(2b) PA \vdash \neg PRF_{T}(n,\ulcorner A\urcorner)

Now, there are a number of ways to proceed from here. But one way of proving Gödel’s result (which is not quite the way Gödel himself originally did it) is via the diagonalization lemma:

DIAG: If a theory T includes PA, and if F(x) is a formula open in x, then there is a sentence A where T \vdash A \leftrightarrow F(\ulcorner A\urcorner).*

Now take F(x) as \neg\exists y(PRF_{PA}(y,x)). By DIAG, we have that there is a sentence A_{G} where

PA \vdash A_{G} \leftrightarrow \neg\exists y(PRF_{PA}(y,\ulcorner A_{G}\urcorner))

(PRF_{PA}(n,m), recall, means that there is a proof from the axioms of PA with Gödel number n, and the theorem it proves has Gödel number m.) Call this sentence DIAG_{G}. Since PA proves it, PA is true, and our proof theory is sound, DIAG_{G} must be true.

Now suppose A_{G} is false. But since DIAG_{G} is true, this means that \exists y(PRF_{PA}(y,\ulcorner A_{G}\urcorner). So there is a PA-proof of A_{G}. But PA is true and its proof system is sound, so whatever it proves is true, so A_{G} is true. Contradiction.

This means A_{G} must be true. But if it is true, DIAG_{G} tells us it can’t be proven in PA. So there is a true arithmetical statement — namely, A_{G} — that cannot be proven in PA. This is known as PA’s Gödel sentence.

(Could you add A_{G} to PA’s axioms to get a new theory, PA+ that could prove A_{G}? Yes. But now you could re-run the argument from DIAG using the open sentence \neg\exists y(PRF_{PA+}(y,x)), where PRF_{PA+} codes for provability-in-PA+. This gives us a new sentence, A_{G+}, which by the above reasoning must be true but not provable in PA+. And so it goes.)

As interesting as all this is, what we need to understand Boolos’s counterxample isn’t Gödel’s proof itself, but rather a related corollary, proven by Tarski, that no sufficiently rich language can define its own truth predicate. More accurately: if L is a language with all the arithmetical vocabulary needed for arithmetic, and if PA (as formulated in L) is true, then there is no arithmetical formula T of L such that T(n) is true if and only if n is the Gödel number of a true sentence of L.

This proof also goes via the diagonalization lemma. Suppose we have a language L, and the axioms of PA are among the truths of L. Suppose L also has a formula \neg T(x) which is true only of Gödel numbers of sentences that are not truths of L. Then the diagonalization lemma tells us that there is a sentence LIAR such that

PA \vdash \textrm{LIAR} \leftrightarrow \neg T(\ulcorner\textrm{LIAR}\urcorner)

Since PA is true and has a sound proof system, this sentence must be true as well. But if LIAR is true, then \neg T(\ulcorner\textrm{LIAR}\urcorner) is false, in which case LIAR isn’t a true sentence of L after all. And if LIAR is false, then \neg T(\ulcorner\textrm{LIAR}\urcorner) is false, in which case \ulcorner\textrm{LIAR}\urcorner numbers a true sentence of L, in which case LIAR is true. Either way leads to contradiction; this is a reductio of the premise that L contains a predicate T true of all and only the Gödel numbers of truths of L.

With Gödel’s proof, we could enrich PA to be able to prove A_{G} (but were then faced with a new unprovable truth, A_{G+}). Similarly, with Tarski’s proof, we can enrich L to a new language, L+, and in L+ it may be possible to define an arithmetical formula T such that T(n) is true if and only if n numbers a true sentence of L. But T cannot get things right for all the sentences that are in L+ but not L, too, in just the same way that PA+ had a new Gödel sentence it could not prove.

Section 3: The Counterexample

The Setup

OK, so here’s the deal. Let LA be a language that is rich enough for simple arithmetic, in the very strongest sense: it has expressions for the various numerical operators, and these mean their associated numerical operations. To fix ideas, let’s suppose LA has no constants at all, and the following predicates:

Predicate Means
Z(x) x = 0
S(x,y) y is the successor of x (y = x + 1)
A(x,y,z) x + y = z
P(x,y,z) x \times y = z

We’ll also suppose LA has one further predicate, G(x), which means `x is a number’. (We also have the quantifiers of LA implicitly restricted to natural numbers, so \forall x G(x) is a truth of LA.) Let’s denote the set of all truths of LA, “Tr(LA)”.

Before going on, let’s notice something about how this language will talk about numbers. Suppose we wanted to say that 1+2=3. We usually do this using a functor “+”, numerals (“1”, “2”, etc.), and an equality sign “=”. Many standard formulations of PA work like this, except that they only have one (constant) numeral, “0”, and then a “successor function” “S”, where “Sx” is the number one greater than x. Here, we would represent “1+2=3” as something like “S0+SS0=SSS0”. In such formulations of PA, the expressions consisting of “S” applied n times to “0” is called a standard numeral, and (more precisely) the standard numeral for n. Sometimes it can be useful to use some sort of typographical shorthand, e.g. a boldfaced convention where “3” is understood to stand for 3’s standard numeral, “SSS0”. Here, we can take an ordinary numerical statement, say “1+2=3”, bold each of its terms, to get “1+2=3“, and then unpack the boldfaced convention to get “S0+SS0=SSS0”.

But our language LA isn’t like this at all: we don’t have identity, and we don’t have functors. So if we use boldfaced numerals to stand for some conventional expression of a certain number in a certain language, we can’t trade out a boldfaced numeral for one cooked up out of “0” and a successor function. We must do something else.

What we do instead is describe the number. For instance, we know that three is something which succeeds something which succeeds something which succeeds zero. (And “succeeding zero” is the same as “succeeding something which zeroes” — that is, succeeds an x where Z(x).) So, formally, our boldfaced convention will work as follows: when we get a sentence, say

3 …,

we understand that as shorthand for

\exists x_{0}\exists x_{1}\exists x_{2}\exists x_{3}(Z(x_{0}) \wedge S(x_{0}, x_{1}) \wedge S(x_{1}, x_{2}) \wedge S(x_{2}, x_{3}) \wedge \ldots x_{3} \ldots)

Of course, that’s pretty long, but it does the trick. (And it should be clear how to extend this for arbitrary “numerals”.) There’s another, equivalent way to describe what we’re doing. Instead of having standard numerals, we’ll have standard numeral-predicates. Thus, for instance, 3(x) will be a shorthand for the open sentence

\exists x_{0}\exists x_{1}\exists x_{2}\exists x_{3}(Z(x_{0}) \wedge S(x_{0}, x_{1}) \wedge S(x_{1}, x_{2}) \wedge S(x_{2}, x))

Then we can read

3

as

\existsx(3(x) \wedge … x … )

This predicate-numeral convention will be useful later.

(Why the convoluted strategy? Boolos does this to stack the deck in Quine’s favor; Quine’s favored language was free of functors, which he replaced with a description trick like this.)

OK, enough of LA. Now consider a language LB. Syntactically, it is exactly the same. Semantically, it is almost the same — but not quite. The only difference is that, in LB, G(x) is true of all and only the numbers that are Gödel numbers of sentences in Tr(LA). Following earlier usage, we’ll denote the set of all truths of LB, “Tr(LB)”. And we’ll use the same boldfaced numeral (and numeral-predicate) convention for LB as we used for LA.

Notice that the terms of the each languages are syntactically the same. This means that they will share Gödel coding schemes. More precisely, since Gödel schemes simply assign numbers to strings on the basis of their syntactic properties — semantic ones never enter into it — then any sensible Gödel coding scheme will assign syntactically identical strings of LA and LB (whether they be sentences, proofs, whatever) the very same Gödel codes.

Despite this syntactic similarity, it will be useful for us to keep tabs on which language we’re talkig about, though. We’ll want to distinguish, e.g., `G‘ as it appears as a term of LA, and as it appears as a term of LB. So we’ll add subscripts to each of the terms as appropriate, writing the (non-logical) terms of LA as

Z_{A}, S_{A}, A_{A}, P_{A}, and G_{A},

and the (non-logical) terms of LB as

Z_{B}, S_{B}, A_{B}, P_{B}, and G_{B}.

And we’ll adopt the further convention that for any syntactic string S (belonging to both languages), we’ll use S_{A} to denote that string as a string of LA, and S_{B} to denote it as one of LB. Notice that these subscripts are not an official part of the languages LA and LB; rather, they’re just our own metalinguistic commentary on elements of those languages. So these subscripts don’t figure into the Gödel codes strings get assigned. Notice, however, that we can now rewrite our earlier observation as follows: if \ulcorner\cdot\urcorner is any well-defined system of Gödel coding, then for any string S, \ulcorner S_{A}\urcorner = \ulcorner S_{B}\urcorner.

Notice also: G_{B} is a truth-predicate for LA. By definition, for any number n, G_{B}(n) is true if and only if n is the Gödel number of a truth of LA. And so it must also be a truth-predicate for the fragment of LA that doesn’t have `G_{A}‘ in it.

3.a: A First Pass

OK, now let’s look at Boolos’s argument. Consider the set Tr(LB)_{A} — the set of all the sentences of LA that are syntactically just like the ones in Tr(LB). This set is satisfiable: we know full well it has a model. We know this because the intended interpretation of LB provides just such a model. More precisely, let M be a model of LA that assigns each predicate P_{A} the extension that P_{B} in fact has; since Tr(LB) Is in fact true, and M makes LA’s terms mean just what LB’s already do, M must be a model of Tr(LB)_{A}. The question, then, is whether it is substitutionally consistent — that is, whether there is a substitution scheme T where T(Tr(LB)_{A}) is true, too.

Let’s start by noting that, if there is, then T(Tr(LB)_{A}) \subseteq Tr(LA). This is because we’re working in the language LA here; if T(Tr(LB)_{A}) are all true, then there some (if not all) of the truths of LA, and so some of the members of Tr(LA).

Now one thing we may have noticed is that Tr(LB)_{A} and Tr(LA) have a lot of overlap. In fact, any sentence that does not have “G” in it will be in one of these if and only if it is in the other. So we might have thought that an easy way to cook up a translation scheme T that makes T(Tr(LB)_{A}) all true would be to have it give an interesting substitution for “G” and leave the other predicates alone. But this won’t work, and it’s worth taking a minute to see why.

Suppose n is a Gödel number of a true sentence of LA. Then G_{B}(n) is in Tr(LB), and so G_{A}(n) is in Tr(A). And this, remember, just is the sentence

\exists x_{0}, x_{1}, \ldots, x_{n}(Z_{A}(x_{0}) \wedge S_{A}(x_{0},x_{1}) \wedge \ldots \wedge S_{A}(x_{n-1}, x_{n}) \wedge G_{A}(x_{n}))

Now, when we take the substitution instance of this that T gives us, that simply swaps out all the predicates here for their values under T. But T, by hypothesis, leaves everything but “G” alone. So the result of applying T to this sentence gives us

\exists x_{0}, x_{1}, \ldots, x_{n}(Z_{A}(x_{0}) \wedge S_{A}(x_{0},x_{1}) \wedge \ldots \wedge S_{A}(x_{n-1}, x_{n}) \wedge T(G_{A})(x_{n}))

In other words, it gives us the sentence T(G_{A})(n). Likewise, if n is not a Gödel number of a true sentence of LA, \neg G_{B}(n) is in Tr(LB), and a similar argument shows that \neg T(G_{A})(n).

We’re supposing that T takes us to all and only truths. But (since the satisfier of the numeral-predicate n just is the number n!) this means that T(G_{A}) is a formula of LA that is true of all and only the Gödel numbers of truths of LA. And this is what Tarski’s theorem told us we couldn’t have. Reductio.

3.b: For More General Cases

But a reductio of what? Of the assumption that there was a substitution scheme T that made Tr(LB)_{A} true and didn’t fiddle with any of the predicates other than G_{A}. What happens if we relax that second assumption? Basically, the same thing. Only it takes more effort to see why. That’s what I’ll argue in this section.

OK, so suppose we have a substitution scheme T, where T[Tr(LB)_{A}] are all true.

Here’s the first thing to notice. G_{B}(\textbf{n}) will be true if and only if n is the Gödel number of a true sentence of LA. So T[G_{A}(\textbf{n})] will be true if and only if n is the Gödel number of a true sentence of LA, too.

But what is G_{A}(\textbf{n})? Recall our numeral-predicate strategy: it is the sentence

\exists x(\textbf{n}(x) \wedge G_{A}(x))

So the substitution instance of this is

(1) \exists x(T[\textbf{n}](x) \wedge T[G_{A}(x)])

And notice: (1) is true if and only if n is the Gödel number of a true sentence of LA. But this doesn’t give us quite what we want: an open formula that is true of and only of the true Gödel numbers of LA sentences. So we’re not quite done.

What we need is a formula of LA, R, where is true of n and m if and only if m satisfies T[n(x)]. If we had that, then (1) would be equivalent to

(2) \exists x(R(\textbf{n},x) \wedge T[G_{A}(x)])

and would be true if and only if n was the Gödel number of a truth of LA. Moreover, the open formula

(3) \exists xR(y,x) \wedge T[G_{A}(x)]

would be a ‘truth-formula’ for LA: it would be true of and only of the Gödel numbers of LA truths. Since Tarski’s theorem tells us there can’t be any such formula, our reductio would be complete.

But how do we show that there is such a formula R? Our answer comes in two parts. The first requires a bit of background. Start with the notion of a (total) recursive function. A function f is recursive if and only if (more or less, assuming the Church-Turing thesis) there is an effective algorithm for determining, for each \vec{x} and y, whether f(\vec{x}) = y. From this, we can also define the notion of a recursive relation: an n-place relation R’ is recursive if and only if there is an n-place recursive function f where f(\vec{x}) = 1 iff \vec{x} stand in R’, and 0 otherwise. And finally, we have the following useful result: if R’ is any n-placed recursive relation, then there is a formula R* of LA where

PA \vdash R*(\vec{\textbf{x}}) if \vec{x} stand in R’, and
PA \vdash \neg R*(\vec{\textbf{x}}) otherwise.

OK, so far, so good. Now notice that, given any number n, there is a recursive function that will deliver the Gödel number of the formula “T[n](x)”. We can see this by imagining a purely mechanical process for taking n and generating “T[n](x)”‘s Gödel number.

The first thing we need to do is figure out just what formula T[n](x) is. But this will clearly be a simple algorithmic process: it’s purely mechanical to write down Z_{A}, n instances of S_{A}, and then conjoin them and fill in and bind the right variables in the right places. This gets us the formula n(x). And it’s also purely mechanical to get from this to T[n](x): just look up each predicate’s assignment in T’s table and write it in the right spot. But (given a fixed Gödel coding) it’s also purely mechanical to take any formula and find its Gödel number. So there is a binary recursive relation GN* that n bears to x if and only if m is the Gödel number of “T[n](x)”. And since there is such a relation, there is also a predicate GN where GN(n,x) is satisfied when and only when x is the Gödel number of “T[n](x)”.

Second, there will be a satisfaction predicate SAT for PA that applies to “T[n](x)” — that is, a predicate SAT where SAT(x,m) is true if and only if m satisfies the formula with Gödel code x.** So now consider the formula

\exists x(GN*(n,x) \wedge SAT(x,m))

This is a formula that is satisfied by n and m if and only if m satisfies T[n]. But (putting this all together), this means that the open formula

\exists y[\exists x(GN*(n,x) \wedge SAT(x,y)) \wedge T[G_{A}](y)]

is a truth-formula for LA. For n satisfies this formula if and only if there is a number m that satisfies both T[G_{A}] and T[n] if and only if the sentence T[G_{A}(\textbf{n})] is true if and only if the sentence G_{B}(\textbf{n}) is true if and only if n is the Gödel number of a truth of LA. Boolos’s reductio is complete.


*Really for this theorem, T need not include PA but only include the weaker Robinson arithmetic; but since PA includes that, and since we’re going to be dealing explicitly with cases where our theories include PA, we can ignore this slight weakening.

**This isn’t quite right, and for reasons that have to do with Tarski’s theorem. Just as there can be no perfectly general truth predicate for LA, there can also be no perfectly general satisfaction predicate for LA. What we do have, though, is a hierarchy for satisfaction predicates that follows the arithmetical hierarchy. That is, for every complexity \Pi_{n} or \Sigma_{n} in the arithmetical hierarchy, there is a satisfaction predicate SAT_{\Pi_{n}} or SAT_{\Sigma_{n}} that applies to all formula of that complexity. That is, if P is (say) a \Sigma_{n} formula, then SAT_{\Sigma_{n}}(\vec{x},\ulcorner P\urcorner) if and only if \vec{x} satisfies P. However, if P is a formula of greater complexity than \Sigma_{n}, SAT_{\Sigma_{n}} can’t be trusted to ‘get the facts right’ about what satisfies it.

However, for any translation scheme T, there will be some particular level of complexity that all formulae of the form T[n(x)] will share. This is because all such formulae will be of the form

\exists y_{0},\ldots,y_{n-1}(T(Z_{A})(y_{0}) \wedge T(S_{A})(y_{0},y_{1}) \wedge \ldots T(S_{A})(y_{n-1},y))

That is, it will be a block of quantifiers followed by a conjunction of translations of primitive formulae of LA. We can think of this conjunction as having two main conjuncts:

(T(Z_{A})(y_{0})) \wedge (T(S_{A})(y_{0},y_{1}) \wedge \ldots T(S_{A})(y_{n-1},y))

Here’s an argument for one of the possible cases; it will be easy to reconstruct the other cases from it. Suppose T[Z_{A}] is of complexity \Delta_{i}, and T[S_{A}] is of complexity \Delta_{j}. Notice that, in this case, any conjunction of formulae all of the form of T[S_{A}] is also \Delta_{j}. If i = j, then the complexity of this entire formula is \Delta_{j}. If i j, the complexity of this formula is \Delta_{i}. So no matter what there is a fixed k such that, no matter how many iterations of T(S_{A}) are added, this formula is of complexity \Delta_{k}, in which case the complexity of T[n(x)] is \Sigma_{k+1}, and we can use SAT_{\Sigma_{k+1}} for the argument in the text.

One response to “Substitutions and Models, Part 3: Boolos (and Gödel)

  1. Pingback: Philosophers’ Carnival #154 | Nick Byrd

Leave a comment