FOM: conservation results

Tue Nov 2 11:20:25 EST 1999

This message is a follow-up to Harvey Friedman's 9/29 posting, 

  FOM: 61: Finitist proofs of conservation

Harvey's posting contains a smooth an elegant way of turning a number of
model-theoretic proofs of conservation results into finitary, syntactic
arguments. In response to a personal request from Harvey, I wrote a note
discussing the relationship between model-theoretic and proof-theoretic
proofs of conservation results. At Steve Simpson's prompting, I have
reworked my response into this posting, in the hope that others may find
it interesting or useful. 

When I say "conservation result," I have in mind a setup that involves two
theories, T_1 and T_2, and a class of sentences Gamma in the language of
T_1. The theorem then states that whenever T_1 proves some sentence phi in
Gamma, then T_2 proves it as well, or, possibly, a related "translation" 
phi'. 

Results like these can be interesting for a number of reasons. Typically,
the theory T_1 formalizes a type of mathematical reasoning that is, prima
facie, stronger than that of T_2, in which case the result gives a
"reduction" of the stronger theory to the weaker one. For example, one can
reduce various kinds of second-order theories (that is, theories in a
second-order language) to first-order ones, first-order theories to
quantifier-free ones, infinitary theories to finitary ones, impredicative
theories to predicative ones, classical theories to constructive ones,
nonstandard theories to standard ones, and so on. In short, insofar as T_1
"captures" a certain type of mathematical argumentation (in the sense that
ordinary mathematical arguments of a certain kind can be formalized in
T_1, in a natural way), a corresponding conservation result shows that
this type of reasoning can be 

  reduced to / 
  understood in terms of / 
  interpreted in terms of / 
  justified relative to 

certain others. Results like these are often surprising and unexpected,
since in many cases T_1 and T_2 look very different.

I have argued in a talk called "Semantic methods in proof theory," (on my
web page, http://www.andrew.cmu.edu/~avigad), that most results in proof
theory, when stated in their strongest form, can be viewed as conservation
results. For example, one can construe the results of an ordinal analysis
as determining, finitarily, that a theory T_1 is conservative over a very
weak theory together with principles involving induction or recursion on
ordinal notations. (See the discussion in the paper "Ordinal analysis
without proofs," also on my web page; I should note that Lev Beklemishev
has approached these issues from a similar point of view, but has come up
with characterizations of ordinal analysis that are different from mine.)
For another example, one obtains combinatorial independences by finding
finitary proofs that the theory T_1 is conservative over weak theories
together with certain combinatorial principles.

Here I will focus on conservation results in the ordinary sense,
restricting my attention mainly to cases where T_1 and T_2 are classical
theories. One can find much more information on reductions like these (as
well as reductions of classical theories to constructive ones) in a trio
of articles found together in JSL 53 (2), 1988, 337-384:

  Sieg, Wilfried, "Hilbert's program sixty years later"
  Simpson, Stephen, "Partial realizations of Hilbert's program"
  Feferman, Solomon, "Hilbert's program relativized: proof-theoretical
    and foundational reductions"

An interesting thing about the conservation results described in these
articles is that some of them were first discovered by model-theoretic
methods, others by proof-theoretic methods. Neither camp can claim clear
superiority. Of course, model-theoretic proofs rely on more semantic
intuitions. Advantages to the proof-theoretic arguments are that (1) they
can be carried out in a weak metatheory, and (2) they involve explicit
translations of the proofs in T_1 to T_2, with bounds on the increase in
length of proof.

As far as (2) is concerned, what typically happens is that either there is
an interpretation of T_1 in T_2, in which there is a low-degree polynomial
bound on the increase in length of proof, or there is an "essential" use
of cut-elimination/normalization, in which case the best bound involves an
iterated stack-of twos.

The brunt of my message to Harvey was basically this: I don't know of any
model-theoretic proof of a conservation result that hasn't been duplicated
using proof-theoretic methods as well; and where there isn't a direct
interpretation, the superexponential increase in length of proof can be
shown to be necessary.

To illustrate, I've listed some of my favorite conservation results, and
indicated whether or not the first proofs were model-theoretic or
proof-theoretic. Keep in mind that I am leaving out a lot -- I haven't
even mentioned reductions of classical theories to constructive ones,
ordinal analysis, functional interpretations, conservation results for
fragments of arithmetic, nonstandard theories, etc. The following theories
are mostly concerned with standard, classical representations of
mathematics.

Here is the list:

    WKL_0 (and hence RCA_0 and ISigma_1) over PRA

    Sigma^1_1-AC_0 (and hence ACA_0) over PA

    Sigma^1_1-AC over (Pi^0_1-CA)^<epsilon_0

    ATR_0 over IR or ID^_<omega

    Pi^1_1-CA_0 over ID_<omega

    NBG (von-Neumann, Bernays, Goedel set theory) over ZFC

The conservation results hold for formulas in the common language, plus a
little more if you read them the right way. For example, the second result
holds for Pi^1_2-sentences if you say what it means for such a sentence to
be provable in PA (roughly, add free set variables, and ask for an
explicit formula witnessing the existential quantifier).

All these results can be obtained by model-theoretic methods, but all of
them can also be obtained using cut-elimination arguments, in which case
the proofs are easily shown to be finitary (taking place in EFA*, aka
IDelta_0 + superexponentiation). 

To the best of my knowledge, the credits are as follows:

Mints, Takeuti, and Parsons got ISigma_1 over PRA independently (Parsons
used a Dialectica interpretation plus normalization, the other two cut
elimination or normalization).

RCA_0 is easily interpreted in ISigma_1 (hence without a big increase
in length of proofs). 

WKL_0 over PRA is due to Friedman, using a model-theoretic argument.
Later, Sieg did this using cut elimination, and Kohlenbach did it with a
Dialectica interpretation (and normalization).

Also after Friedman's proof, Harrington gave a forcing argument for the
Pi^1_1 conservation of WKL_0 over RCA_0 (yielding Friedman's result as a
corollary). Independently, Hajek and I turned this into an interpretation
of WKL_0 in RCA_0, so there is no big increase in the length of proofs.

ACA_0 over PA and NBG over ZF are trivial model-theoretic arguments, and
historically, the latter came first. A number of people seem to have come
up with the model-theoretic argument independently -- see the references
in Fraenkel, Bar-Hillel, and L\'evy's *Foundations of Set Theory*. I
believe Schoenfield had the first syntactic proof, using a method of
eliminating special constants. Arguments using cut elimination are also
easy, probably first noted by Feferman and Sieg.

Sigma^1_1-AC_0 over PA was obtained by Barwise and Schlipf using 
recursive saturation, but the result was also implicit in Friedman's
work (see the discussion of Sigma^1_1-AC). Sieg duplicated the result
using cut-elimination, Feferman with a Dialectica interpretation (and
normalization).

Sigma^1_1-AC over (Pi^0_1-CA)^<epsilon0 is due to Friedman. Sieg and
Feferman later got it with cut-elimination, and Feferman with a 
Dialectica interpretation. Here there is no big increase in lengths of
proofs. (Note that Sigma^1_1-AC is not finitely axiomatizable, so there
are short "local" interpretations.)

ATR_0 over IR is in a paper by Friedman, McAloon, and Simpson, and the
proof is model-theoretic. Jaeger got a proof-theoretic version using
cut-elimination and other methods. ATR_0 over ID^<omega then follows from
other known reductions. I gave a direct proof of this last conservation
result by noting that (ATR) is equivalent to a second-order version of the
fixed-point axioms, after which an easy model-theoretic or cut-elimination
argument finishes it off.

Pi^1_1-CA_0 over ID<omega similarly follows by an easy model-theoretic
or cut-elimination argument, given the Kleene's analysis of Pi^1_1 sets in
terms of inductive definitions. Kreisel was probably the first to consider
formal theories of inductive definitions; this particular result is due to
Feferman.

Unless otherwise noted, in all cases there is necessarily a 
superexponential increase in the length of proofs. The easiest way to show
this is due to Solovay: use his method of "shortening of cuts" to get
short proofs of the consistency of big pieces of the second theory in the
first theory. For example, ACA_0 has proofs of the consistency of
ISigma_{2_n} that are polynomial in n, where 2_n is a stack of n 2's. One
can be more specific regarding the measure of length one is using, and get
the upper and lower bounds to coincide asymptotically -- see Pudlak's
article in the Handbook of Proof Theory. (Incidentally, I like to think of
Herbrand's theorem as being a kind of conservation result for logic with
quantifiers over logic without. Lower bounds here (I think) are due to
Statman and Orevkov independently; see Schwichtenberg and Troelstra's book
on Proof Theory for necessary increases in length of proofs in pure
logic.)