FOM: generic absoluteness

John Steel steel at
Tue Jan 25 14:56:50 EST 2000

   This is a continuation of a previous note (from Jan. 98).  

II. Large cardinals as a preferred way of climbing the consistency
strength hierarchy.

    This section contains what purport to be rational arguments for
adopting large cardinal axioms such as: there are inaccessible (Mahlo,
measurable, Woodin, supercompact, ...) cardinals. Much of what I have to say
here has been said many times before; it is known in the trade as 
"The Speech". I will give an abbreviated version here, one pointing toward
the generic absoluteness theorems and conjectures. The writings of
P. Maddy contain much more extensive discussions of the various arguments
which have been made for adopting large cardinal axioms. (See e.g.
"Believing the Axioms", in the JSL, or her recent book "Naturalism in
Mathematics".) There are also shorter versions of The Speech in the
introductions to "Projective Determinacy" (PNAS v. 85, 6582-6586) and
"A Proof of Projective Determinacy" (JAMS Jan. 1989, 71-125) by D. Martin
and me.

A. Instrumentalism

    What does it mean to adopt large cardinal axioms--just what is the
 behavior being advocated? One ambition of set theory is to be a useful
universal framework in which all our mathematical theories can be
interpreted.  (Of course, one can always amalgamate theories by
interpreting them as speaking of disjoint universes; this is the paradigm
for a useless framework.)  To believe that there are measurable cardinals
is to believe "there are measurable cardinals" should be added to this
framework; that is, to seek to naturally interpret all theories of sets,
to the extent that they have a natural interpretation, in extensions of
ZFC + "there is a measurable cardinal". Let me expand by returning to
Hilbert's program and the consistency strength hierarchy.

   By Godel's 1st incompleteness theorem, we can never have the perfect
tool for deriving true Pi^0_1 ("real") statements that Hilbert envisaged.
We can, however, hope to keep improving in significant ways the tool we
have. If we improve T by moving to S such that P_T is a proper subset of
P_S, then in practice S proves Con(T), so Con(T) implies Con(S)  is not
provable by elementary means (by Godel's 2nd). Thus we are at least a bit
less certain of our tool S than we were of T, and we may be much less
certain if the consistency strength jump was substantial. We may, however,
still have rational grounds for believing that P_S consists only of truths
(i.e., believing Con(S)). Taking these facts into account, one might
retreat from Hilbertism to "Hilbertism without the consistency proof".  In
this view, our goal in developing a universal framework is to
construct/discover ever more powerful tools for deriving true Pi^0_1
statements, while giving rational grounds for believing these tools yield
no false Pi^0_1 statements. This view is a pretty pale shadow of
Hilbertism--the most interesting and useful thing about Hilbertism was the
definite way in which it was false. What's left is Hilbert's
instrumentalism, and his use of Pi^0_1 statements as a touchstone. 
(Actually, I think "short" proofs of Delta^0_1 statements may be more to
the point here.) 
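
The step from "S proves Con(T)" to "Con(T) implies Con(S) is not provable by elementary means" can be spelled out as follows. This is only a sketch, under the assumptions that S is consistent and recursively axiomatized, and that "elementary means" refers to some weak base theory B contained in S (the name B is introduced here for illustration and does not appear in the note).

```latex
% Assumptions (labels introduced here, not in the note):
%   B \subseteq S is the elementary base theory,
%   S is consistent and recursively axiomatized.
\begin{align*}
&\text{Suppose } B \vdash \mathrm{Con}(T) \rightarrow \mathrm{Con}(S). \\
&\text{Since } B \subseteq S:\quad S \vdash \mathrm{Con}(T) \rightarrow \mathrm{Con}(S). \\
&\text{In practice } S \vdash \mathrm{Con}(T),\ \text{hence } S \vdash \mathrm{Con}(S). \\
&\text{This contradicts G\"odel's second incompleteness theorem for } S.
\end{align*}
```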
  There are some cautions needed here, however, that seem to me to make
this instrumentalist philosophy indistinguishable from realism. For we
cannot simply say that if P_T = P_S then T and S are equally good, or that
if P_T is a proper subset of P_S, then S is strictly better than T (as
candidates for inclusion in our universal framework theory). For example,
ZFC + "V=L" and ZFC + "not V=L" are equiconsistent, but one is better than
the other. ZFC + "there is an inaccessible cardinal" has smaller
consistency strength than ZFC + "there are no inaccessibles" + Con(ZFC +
"there is an inaccessible"), but the weaker theory is better than the
stronger. The key here is that we are trying to build ONE universal
theory, ONE tool to be used by all. We don't want everyone to have his own
private mathematics, because then we can't use each other's work. ZFC +
"there is an inaccessible" and ZFC + "not V=L" are good candidates for
inclusion in this framework theory, while the other statements are not. We
want whatever approximation to this theory we have at the moment to be
extendible in a natural way to arbitrarily high consistency strengths, for
our framework theory must interpret naturally all natural theories.  One
cannot so extend ZFC + "V=L", for example, without essentially turning
it into ZFC + "not V=L".
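
The consistency strength comparison in the inaccessible example can be checked along the following lines. This is a sketch, assuming that P_X denotes the set of Pi^0_1 sentences provable in X (as the usage in this note suggests) and that ZFC + "there is an inaccessible" is consistent; the abbreviations I, T and U are introduced here for illustration only.

```latex
% Abbreviations introduced here for illustration (not in the note):
%   I = "there is an inaccessible cardinal",  T = ZFC + I,
%   U = ZFC + \neg I + Con(T).
% Assumptions: P_X is the set of \Pi^0_1 sentences provable in X; T is consistent.
% Step (1) uses provable \Sigma^0_1-completeness: if \neg\varphi held, it would be a
% true \Sigma^0_1 sentence, so T would prove it, contradicting Con(T).
\begin{align*}
&\text{(1) } \varphi \in P_T \ \Longrightarrow\ \mathrm{ZFC} \vdash \mathrm{Con}(T) \rightarrow \varphi \\
&\text{(2) } U \vdash \mathrm{Con}(T), \text{ hence by (1) } P_T \subseteq P_U \\
&\text{(3) } \mathrm{Con}(T) \in P_U \setminus P_T \quad \text{(G\"odel's second theorem applied to } T\text{)} \\
&\text{So } P_T \subsetneq P_U .
\end{align*}
```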

    Let me expand this last statement, because there is a trap here that
people sometimes fall into. It is sometimes argued that one can have
all the benefits of the consistency strength of large cardinal hypotheses
while still maintaining that V=L, by adding to ZFC + V=L statements like
"there is a transitive model of ZFC + 'there is a measurable cardinal'"
and the like. The new theory has all the Pi^0_1 consequences (in fact all
the Sigma^1_2 consequences) of ZFC + "there is a measurable cardinal", but
also contains "V=L", and therefore settles the GCH and many other
questions of general set theory. It is strong in both the realm of the
concrete and the realm of sets of arbitrary type.

   One might as well take the view above to its logical conclusion: Why
not adopt Peano (or primitive recursive) Arithmetic plus Con(there is a
measurable), Con(there is a supercompact), Con(...),... as our official
theory? We get all the Pi^0_1 consequences we had before, and we decide
every question about sets of high type by simply declaring there are no
such sets. Isn't this progress?  

    One can see what is wrong with this theory by noting that it is
parallel to the theory that there is no past, that the world popped into
existence on, say, Jan. 1, 1998, complete with fossils, microwave background
radiation, and all the other evidence of a past. The theory is that there
is no past, but the world behaves exactly as if it had one.  The problem
with this theory is that in using it, one immediately asks "what would the
world be like if it had a past", and then one is back to using the
standard theory. The assertion that the world began Jan. 1 cannot be used
in that context, and since that is the context we're always in, the 
assertion is just clutter. We've taken the standard theory and added
meaningless window dressing.

   The parallel with PA + Con(ZFC + measurables) is pretty clear: one is
meant to use this theory by dropping into ZFC + measurables. (Of course,
one may not then use all of that theory in a given context; mostly,
2nd or 3rd order arithmetic is enough. But then, big-bang theory isn't
used too much either.) The assertion that there are no sets of infinite
rank, but the sets of finite rank behave as if there were, is analogous to
the assertion that there is no past, but the world behaves as if there
were. (The fact that we could have chosen any "birthday" for the universe
corresponds to the fact that we could have taken 2nd order arithmetic,
Zermelo, ZFC + V=L, or something else as the window dressing for our
mathematical theory.)

   In sum, we can PRETEND to adopt V=L as part of our universal framework
theory, while reaching consistency strengths at the level of a
measurable and beyond. But as it is imagined above, this is only a
pretense: our mathematical behavior, which is what gives meaning to our
theory, is that of one who believes that there are measurable cardinals.
The framework theory where the action takes place contains
"there is a measurable cardinal", and "V=L" is just an incantation which
does no work in the important arena. Put another way: we have just found a
very strange way of saying that there are measurable cardinals.

B. Large cardinal hypotheses

    Large cardinal hypotheses, aka "strong axioms of infinity", are
expressions of the idea that the cumulative hierarchy of sets goes on as
long as possible. This leads to an informal reflection principle: suitable
properties of the full universe V are shared by, or "reflect to", some
level V_alpha of the cumulative hierarchy. The axioms of infinity and
replacement of ZFC are equivalent, modulo the other axioms, to the
reflection schema for first-order properties. With some leaps, one can see
the assertion that there are alpha<beta and an elementary embedding from
(V_(alpha+1), \in) to (V_(beta+1), \in) as a second-order reflection principle.
The assertion that there is such an embedding is a very strong large cardinal
hypothesis, just weaker than asserting the existence of supercompact
cardinals, and strong enough to prove all the determinacy we know how to
derive from large cardinal hypotheses.

  Adding large cardinal hypotheses to ZFC gives a natural, open-ended
sequence of theories capable, it seems, of naturally interpreting
every natural extension of ZFC. There may be other such sequences of
theories, but then they should naturally interpret the large cardinal
theories and vice-versa, so what we would have would be two
intertranslatable ways of using the language of set theory. Developing
one sequence of theories would be the same as developing the other, so
there would be no need to choose between them.

   In practice, the way a theory S is interpreted in an extension T of the
form ZFC + large cardinal hypothesis H is: S is the theory of a certain inner
model of a certain generic extension of an arbitrary model of T. Thus the
notion of interpretation is not the standard relative interpretation
notion of elementary logic. It is a "Boolean valued" interpretation. 

   Because there are such interpretations, theories of the form
ZFC + H, where H is a large cardinal hypothesis, are cofinal in the
(reasonably large) initial segment of the consistency strength hierarchy
we know about. But more is true: underlying the apparent wellordering of
natural consistency strengths is another remarkable phenomenon, namely,
that for any natural S there is a T axiomatized by large
cardinal hypotheses such that S is equiconsistent with T. The consistency
strengths of large cardinal assertions are usually easy to compare, so
this gives us a way to compare arbitrary consistency strengths. Indeed,
this is generally how it's done. As indicated above, one shows
Con(T) implies Con(S) by forcing. One shows Con(S) implies Con(T) by
constructing inside an arbitrary model of S a canonical inner model
(some kind of generalization of Godel's L) satisfying T. There is a
powerful body of theory built up around each of the two methods. By
finding large cardinal equiconsistencies, one can relate theories S
and S' which seem to have nothing to do with one another. 

     There is more to this pattern. Let S_n^T be the set of Sigma^1_n
sentences which are provable in T. For natural S and T extending ZFC,
P_T \subseteq P_S iff S_2^T \subseteq S_2^S. (This reflects the fact
that our proof of Con(S)-->Con(T) produces a wellfounded model of T. It is
of course not true for arbitrary S and T; one needs "natural".) So the
Sigma^1_2 consequences of our natural theories converge as we climb the
consistency strength hierarchy.
     If we take S = ZFC + V=L and T = ZFC + "there is a nonconstructible
real", then P_S = P_T but S_3^S \not= S_3^T. In fact, the natural
extensions of ZFC are far from linearly ordered by inclusion on their
S_3's. However, there is a large cardinal hypothesis H_1 such that
for natural S and T with consistency strength at least that of H_1,
P_S \subseteq P_T iff S_3^S \subseteq S_3^T. Here H_1 is the assertion
that there is a Woodin cardinal with a measurable above it. (In
particular, all the Sigma^1_3 consequences of H_1 are provable in any
natural theory which is consistency-wise stronger than H_1.) Again,
the Sigma^1_3 consequences of our natural theories converge as we
climb the consistency strength hierarchy.
     In fact, letting H_n be "there are n Woodin cardinals with a
measurable above them all", we have that for natural S and T which
are consistency-wise as strong as H_n, S_(n+2)^H_n \subseteq 
S_(n+2)^T, and P_S \subseteq P_T iff S_(n+2)^S \subseteq S_(n+2)^T.
So the consequences of our natural theories in the language of second
order arithmetic converge (or so it seems) as we climb the consistency
strength hierarchy.
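
For reference, the pattern described in the last three paragraphs can be collected in one display. This is only a restatement in LaTeX notation, assuming P_T denotes the Pi^0_1 theorems of T (as the usage in this note suggests), reading H_n in a superscript as the theory ZFC + H_n, and with S and T ranging over "natural" theories throughout.

```latex
% Notation: P_T = \Pi^0_1 theorems of T (assumed), S_n^T = \Sigma^1_n theorems of T.
% S and T range over natural theories; H_n in a superscript stands for ZFC + H_n.
\begin{align*}
&\text{extending ZFC:}
  && P_T \subseteq P_S \iff S_2^T \subseteq S_2^S \\
&\text{at least as strong as } H_n\ (n \ge 1)\text{:}
  && S_{n+2}^{H_n} \subseteq S_{n+2}^{T} \\
& && P_S \subseteq P_T \iff S_{n+2}^{S} \subseteq S_{n+2}^{T}
\end{align*}
```

Taking n = 1 in the last two lines gives the Sigma^1_3 statement for H_1.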
    What is this set of consequences converging to? The set of truths in
the language of second order arithmetic (LSA), or at least a subset of it.
    The consequences of the H_n's in LSA can be axiomatized in LSA. One
way is to use H_n^* = "for any real x, there is a countable Sigma^1_n
correct model M of H_n s.t. x \in M". More useful in most contexts is
PD, the assertion that all projective games are determined. Letting
PD^* be the associated schema in LSA, we have that ACA + PD^* =
ACA + {H_n^* | n in omega}, and this theory is equiconsistent with--
in fact has the same LSA consequences as--ZFC + {H_n | n in omega}.

    In a nutshell, all roads upward seem to lead to PD. Any natural
theory which is consistency-wise as strong as PD actually proves PD.

C. Evidence through consequences.

    One should add at this point that the theory of projective sets one
gets from PD, and thus it seems from any sufficiently strong natural
theory, is coherent and sensible. It is not some mysterious mish-mash.
Indeed, much of the classical theory of the low-level projective sets
(Borel, Sigma^1_1, Sigma^1_2) can be seen as based on open determinacy,
which is provable in ZFC. (ZFC proves Borel determinacy, a result of
Martin. Some of the large cardinal strength of ZFC is used here; by
results of Friedman, one needs aleph_1 cardinals, so that e.g. Zermelo
set theory does not prove Borel determinacy.) The theory of projective
sets one gets from PD generalizes the theory of Sigma^1_2 sets one gets
from open determinacy in a natural way. (Although, contrary to a
remark Steve Simpson made on FOM some time ago, the generalization is
often far from routine, and entirely new phenomena do show up.) It includes
the "right" answers to questions such as whether all projective sets are
Lebesgue measurable or have the property of Baire.

   In this realm of evidence for PD through its consequences, I would like to
mention a class of examples pointed out by Martin (in a paper called
"Mathematical Evidence"; I don't have a reference). Namely, some theorems
are first proved under the hypothesis of PD or large cardinals, and then
later the proof is refined or a new proof given in such a way as to use
only ZFC. Borel determinacy is a prime example of this: originally it was
proved assuming there are measurable cardinals, and then later a more
complicated proof which uses only ZFC was found. Martin points out other
instances of this phenomenon, for example some cases of his cone theorem
for the Turing degrees which can be proved by direct constructions. Another
nice example down lower is Borel Wadge determinacy. This is an immediate
consequence of Borel determinacy, but by work of Louveau and St. Raymond,
it is much weaker--in fact, provable in 2nd order arithmetic.
There is a nice potential instance of this phenomenon in the following.
It is open now whether the class of Borel linear orders is
well-quasi-ordered under embeddability. Louveau and St. Raymond have shown
this for a subclass of the Borel linear orders, those embeddable in some
(R^n, lex). However, their proof uses PD! Louveau
conjectures that the result can be proved in ZFC. The point here is that
the use of large cardinals and PD to provide proofs of statements which can
then be proved true more laboriously by elementary means constitutes
evidence for large cardinals/PD. 

III. Generic Absoluteness

    We want our universal framework theory to be as complete as possible.
We have seen above that all roads toward completeness at the Pi^0_1 level
we know about seem to lead through PD. (That is, if one rules out
instrumentalist dodges like PA + Con(supercompacts).) Of course, this is just a
"phenomenon"; are there any theorems which encapsulate or explain it?
It also appears that the theory we get from PD is complete in a practical
sense, insofar as natural non-Godel-like statements in LSA go. Is there
a theorem explaining this?

    Both these phenomena are explained to some extent by some theorems
about generic absoluteness. Recall that our most powerful tool
for proving independence results in set theory is forcing. Indeed, all the
non-Godel-like independences we know of can be proved by forcing. Perhaps
a theory to which this method does not apply is apt to be "complete" in a
practical sense. In this connection one has the following theorem.

Theorem. The following are equivalent:

a) If G and H are set-generic over V, then L(R)^V[G] is elementarily
equivalent to L(R)^V[H] for statements about real parameters in
V[G] intersect V[H],

b) Over any set X, there is an iterable proper class inner model M_X
satisfying "there are omega Woodin cardinals",

c) If G is set generic over V, then L(R)^V[G] satisfies AD.

(Much of the theorem is due to Woodin; other parts are due to me.
It relies heavily on earlier work of Martin, Mitchell, and others.)
Statement (b) follows from the existence of arbitrarily
large Woodin cardinals, and so (b)-->(a) tells us that no statement in
the first order theory of L(R) can be proved independent of the existence
of arbitrarily large Woodin cardinals by forcing. This is a partial
explanation of the practical "completeness" of this large cardinal
hypothesis in the realm of statements about L(R). The direction (a)-->(b),(c)
is a partial explanation of the uniqueness of such "complete" theories.

Remark: I moved from statements in LSA to statements about L(R) in the
theorem above because the analog of (a)-->(b),(c) does not hold in the more
constricted setting of projective sets. This is a technical detail; 
(a)-->(b),(c) is a general phenomenon, but one needs to work with classes of
statements having better closure properties than the class of those
expressible in LSA.

    The next installment of this note, due Jan. 2002, will discuss
generic absoluteness and CH.

John Steel
Prof. of Math.
UC Berkeley
