FOM: 107:Automated Proof Checking

Harvey Friedman
Fri May 25 04:32:24 EDT 2001

We address the problem of setting up a uniform procedure for creating
strictly rigorous mathematics. By strictly rigorous, we mean machine
checkable.

We are thinking of well known, elementary mathematics - the sort of
mathematics that is needed to support verification of hardware/software. We
distinguish this from any attempt to create serious new mathematics, or to
deal with mathematics at the outer fringes of what mathematicians can
handle. The first of these two goals is a major goal of the automated
deduction, or theorem proving, community. But this is not what we have in
mind here.

There are many reasons to create strictly rigorous mathematics. One is
alluded to above - to support verification of hardware/software.

There are systems that currently create strictly rigorous mathematics -
e.g., Mizar. But as far as I have seen, they do not have the level of
general user friendliness that we have in mind, as well as the readability
of proofs that we are looking for. Much more user friendliness is required
in order to unleash the gigantic unused mathematical intuitions of
mathematically gifted people, who generally are allergic to computerese.

A second is to begin a serious theory of the structure of actual
mathematical proofs. It is difficult to know how to start this field in any
serious way without a considerable amount of data to ponder and analyze,
and the best way to obtain such data is to create an appropriate user
friendly transparent system.


Our approach differs from others in the following respects:

a) there is a new, formalized, division of responsibilities between the
mathematician and the computer;
b) there is a clear definition of exactly what the computer is doing, and
this definition is attractive proof theoretically, mathematically
meaningful, and efficiently implementable;
c) it is easy to do many examples by hand, long before one has a working
implementation.

The plan is to first set up a general purpose language for presenting
mathematical statements. This will be a sugared version of Zermelo-Fraenkel
set theory with the axiom of choice. Later, we can be concerned with using
only weak fragments of ZFC.

We use the key concept of SELF PROVING SEQUENT. This is discussed in detail
in the next section.

The mathematical development consists of the presentation by the
mathematician of the axioms, the definitions, and a series of bite sized
claims. Each claim is given a proof as indicated below, using previous
claims, the axioms, and previous definitions. The definitions can be viewed
as claims that require no proof, because of their special form.

The expectation is that the mathematician can readily develop an intuition
for the breaking up of the mathematical development into appropriate bite
sized claims that facilitate the creation of computer aided proofs under
this setup. This is supposed to be interesting and relatively painless for
the mathematician.

The mathematician's input into the proofs is documented easily, naturally,
and understandably, so that the resulting text document is easily read and
readily informative.

The computer's job is solely to verify that certain sequents are in fact
self proving. A recipient of the text document is also supplied the
software for doing the self proving. One can simply select from among the
myriad implicit claims of self proving in the text document, and simply run
the "self proving software" on the selections.

The self proving even generates text that is totally sensible, although
full of mundane manipulations. Nevertheless, the self proving software
can also be made to generate all sorts of quantitative information
concerning the self proving.

Every claim is labeled as such, and is in the following form:

Assume A1,...,Ak. Then B.

Sometimes k = 0, in which case we just write a formula.

Generally, claims have free variables, which are interpreted universally.

The body of the proof contains the following optional information supplied
by the mathematician:

i) one or more instantiations of various previous axioms, previous
definitions, or previous claims;
ii) a single claim (generally a sharper, more detailed form of the claim
being proved).

There are absolutely no other items in the proof. No reasoning, no
inferences, etcetera.

Each of these items is clearly marked so that the machine can act on them.
The machine verifies that i) is as indicated, and that ii) is well formed.

These items are optional. In fact, it may be that none are present.

Now consider the simplest case where we do not have ii). Let C1,...,Cp be a
list of i). Consider the new sequent

A1,...,Ak,C1,...,Cp arrows B.

Then this new sequent is supposed to be SELF PROVING.
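As a toy sketch of this construction (all names here are hypothetical, not
from any actual implementation), the augmented sequent can be formed as
follows, with formulas represented simply as strings:

```python
# Hypothetical sketch of forming the augmented sequent
# A1,...,Ak,C1,...,Cp arrows B from a claim and the imported
# instantiations supplied under i).

from dataclasses import dataclass
from typing import List

@dataclass
class Sequent:
    hypotheses: List[str]   # A1,...,Ak
    conclusion: str         # B

def augment(seq: Sequent, imports: List[str]) -> Sequent:
    """Append the instantiations C1,...,Cp supplied under i); the
    result is the sequent the computer must check to be self proving."""
    return Sequent(seq.hypotheses + imports, seq.conclusion)

claim = Sequent(["x in A"], "x in B")
augmented = augment(claim, ["A subseteq B"])
```

The computer never sees the mathematician's reasoning, only the augmented
sequent that results from it.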

The idea is clear. First of all, we cannot expect the original sequent
A1,...,Ak arrows B that we are trying to prove, to be valid in predicate
calculus. One needs to use previous facts, including of course definitions.
Thus we need to import previous axioms, previous definitions, and previous
claims. But not, generally, in their original form - rather we want
appropriate instantiations which take into account the original sequent
that we are trying to prove, and in particular the free variables there.

When we import previous claims, we put them into a single formula, rather
than as a sequent (this is not a major point).

Item i), in various guises, serves some very useful purposes. Sometimes the
clearest way to prove something is by cases defined by a statement phi. To
force the computer to do this in the self proving algorithm, we would use
the formula (phi or not(phi)). This obviously comes under i) as an
instantiation of an axiom. This is also a natural choice.

In particular, when we discuss self proving in the next section, the effect
of adding (phi or not(phi)) on the left is to split into two sequents, one
with phi on the left and one with not(phi) on the left. The computer
without the mathematician is fully capable of doing such splits, except
that the computer would not generally know any good phi to do this
splitting. If the mathematician feels that the proof is clearest under such
a split with phi, then the mathematician will include (phi or not(phi))
under i).
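The effect of such a split can be pictured with a small hypothetical sketch:
a disjunction on the left turns one refutation task into one task per
disjunct.

```python
# Hypothetical toy illustration of the case split induced by importing
# (phi or not(phi)): a set containing a disjunction splits into one new
# set per disjunct.

def split_on(formulas, disjunction, disjuncts):
    """Replace one disjunction in the set by each of its disjuncts,
    returning the new sets, each of which must be refuted."""
    rest = [f for f in formulas if f != disjunction]
    return [rest + [d] for d in disjuncts]

phi = "x >= 0"
excluded_middle = "(%s) or not(%s)" % (phi, phi)
cases = split_on(["P(x)", excluded_middle], excluded_middle,
                 [phi, "not(%s)" % phi])
# one case with phi on the left, one with not(phi)
```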

Another important use of i) is to tell the computer about important terms
that it might not come up with in the self proving algorithm on its own.
This is easily and naturally done by writing down t = t under i), if the
mathematician wants the computer to be incorporating the term t in its self
proving algorithm.

We now come to the deepest part of the discussion, where ii) is present.
This claim is generally a modified form of the original sequent, but
containing more detailed information - and (less importantly) put into a
single formula rather than a sequent.

The kind of extra information is typically in the form of terms replacing
quantifiers. In this way, we might even eliminate all quantifiers in the
original sequent in favor of new terms.

But there is another - perhaps additional - move that we may wish to make,
particularly if there are AE or AEA statements on the left. And that is to
introduce new function symbols - or Skolem functions. We may want to at
least partially Skolemize the original sequent, and also use terms to
replace quantifiers. We can think of this as a mathematician controlled
form of resolution theorem proving.

We wish to avoid use of any resolution theorem proving. There are important
ideas there, but we believe that partial forms of resolution ideas should
be implemented only under strict, but totally natural and effective,
control of the mathematician - never the computer.

We now come to the crux of the matter of how we use the single claim under
ii) which we call D. Recall that C1,...,Cp is the list of i), and A1,...,Ak
arrows B is the sequent being proved.

Consider the two sequents

C1,...,Cp arrows D

A1,...,Ak,C1,...,Cp,D arrows B

We demand that both of these sequents are self proving.
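The proof obligations generated by a claim can be summarized in a short
hypothetical sketch (names are illustrative only):

```python
# Hypothetical sketch of the sequents the computer must verify to be
# self proving, given hypotheses A1..Ak, imports C1..Cp under i), and
# an optional claim D under ii).

def proof_obligations(hyps, imports, conclusion, claim=None):
    if claim is None:
        # simplest case: A1,...,Ak,C1,...,Cp arrows B
        return [(hyps + imports, conclusion)]
    # with ii): C1,...,Cp arrows D  and  A1,...,Ak,C1,...,Cp,D arrows B
    return [(imports, claim),
            (hyps + imports + [claim], conclusion)]

obligations = proof_obligations(["A1"], ["C1"], "B", claim="D")
```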

The idea is that the first sequent above being self proving indicates that
we have replaced the original sequent by a sharper more detailed claim that
rescues us from the likely disappointment that A1,...,Ak,C1,...,Cp arrows B
is not self proving. Frequently, it is self proving, in which case we do
not need D.

The second sequent above being self proving tells us this: it is easily
verified that the sharper, more detailed claim D does indeed imply the
original claim.

The principal thesis is of course that it is relatively painless and fun to
construct proofs like this for a mathematician.

Note that under this approach, the mathematician is always at a clearly
meaningful semantic level.

Very often, a natural choice of i) may be sufficient, without resorting to
ii). Thus i) alone should be tried with successively more and more entries
in i), before moving to ii).


We now discuss the crucial notion of self proving.

Let A1,...,Ak arrows B be a sequent that is trying to be self proved by the
computer.

We are going to be using a somewhat sugared form of predicate calculus, so
we first need to expand A1,...,Ak,B into ordinary predicate calculus. This
step is completely routine, and does not involve unraveling definitions.
Terms will be preserved, and we will use a free form of the predicate
calculus where terms may or may not be defined. In any case, the expansion
of A1,...,Ak,B is not much of an increase in length or a major change of
form.

We will simplify the discussion by assuming that A1,...,Ak,B are in
ordinary first order predicate calculus with equality.

Let B' be the dual of B. We pass to the list A1,...,Ak,B'. The idea is that
we wish to refute this set.

Do the obvious manipulations so that there are no implications, and
negations apply only to atomic formulas. Split the conjunctions. Replace
outermost existential quantifiers by distinct new variables.

We then have a set E1,...,Eq of formulas that are either atomic, negations
of atomic, disjunctions, or universally quantified. Delete repetitions,
using change of bound variables if needed. Check for tautological
inconsistency (using the equality axioms).

Now look at the disjunctions. Say there are m of them. Split on them. This
results in 2^m new sets of formulas. More if the disjunctions are multiple
(more than binary).

Now repeat the process with these new sets. We may also compare the various
sets, looking for inclusion relations, again using change of bound
variables if useful. If one set is included in another, we delete the
larger one, since a refutation of the smaller set is automatically a
refutation of every superset.

Eventually we will get a set of pairwise incomparable sets of formulas,
with elimination of repetition, each of which is either atomic, negated
atomic, or universal. We seek to refute each of these sets of formulas. We
assume that we have already checked for tautological inconsistency (using
the equality axioms) of each set of formulas.

At this point we pause to call the above the FIRST PREPARATION STAGE.
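The steps of the FIRST PREPARATION STAGE can be sketched on a toy formula
representation. This is purely hypothetical and illustrative, not an actual
implementation; disjunction splitting would then proceed on the resulting
lists as described above.

```python
# Hypothetical toy sketch of the FIRST PREPARATION STAGE. Formulas are
# tuples: ('atom', P, v1, ...), ('not', f), ('and', a, b), ('or', a, b),
# ('implies', a, b), ('forall', v, f), ('exists', v, f).

import itertools

_fresh = itertools.count()

def subst(f, old, new):
    """Replace free variable old by new (new is fresh here, so
    variable capture cannot occur)."""
    if f[0] == 'atom':
        return ('atom', f[1]) + tuple(new if a == old else a for a in f[2:])
    if f[0] in ('forall', 'exists'):
        return f if f[1] == old else (f[0], f[1], subst(f[2], old, new))
    return (f[0],) + tuple(subst(g, old, new) for g in f[1:])

def nnf(f):
    """Remove implications; push negations down to atomic formulas."""
    if f[0] == 'atom':
        return f
    if f[0] == 'implies':
        return nnf(('or', ('not', f[1]), f[2]))
    if f[0] == 'not':
        g = f[1]
        if g[0] == 'atom':
            return f
        if g[0] == 'not':
            return nnf(g[1])
        if g[0] == 'and':
            return ('or', nnf(('not', g[1])), nnf(('not', g[2])))
        if g[0] == 'or':
            return ('and', nnf(('not', g[1])), nnf(('not', g[2])))
        if g[0] == 'implies':
            return nnf(('not', ('or', ('not', g[1]), g[2])))
        if g[0] == 'forall':
            return ('exists', g[1], nnf(('not', g[2])))
        return ('forall', g[1], nnf(('not', g[2])))   # not-exists
    if f[0] in ('and', 'or'):
        return (f[0], nnf(f[1]), nnf(f[2]))
    return (f[0], f[1], nnf(f[2]))                    # forall / exists

def prepare(f):
    """Split top-level conjunctions; replace outermost existential
    quantifiers by distinct new variables."""
    f = nnf(f)
    if f[0] == 'and':
        return prepare(f[1]) + prepare(f[2])
    if f[0] == 'exists':
        return prepare(subst(f[2], f[1], 'w%d' % next(_fresh)))
    return [f]

# example: not (all x)(P(x) -> Q(x)) yields  P(w0), not Q(w0)
result = prepare(('not', ('forall', 'x',
                          ('implies', ('atom', 'P', 'x'),
                                      ('atom', 'Q', 'x')))))
```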

We now want to deal with the crux of the matter - which is those universal
formulas.

Let phi be a formula. A critical term t of phi is a term t such that either
t is a free variable of phi, or there is an occurrence of t in phi for which

i) no variable in this occurrence of t is bound by a quantifier of phi;
ii) this occurrence of t is not properly part of an occurrence of a term t'
in phi that obeys i).

A subcritical term of phi is a subterm of a critical term of phi. Every
critical term of phi is a subcritical term of phi.

The (sub)critical terms of a set of formulas are the (sub)critical terms of
its elements.
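A hypothetical sketch of computing critical and subcritical terms follows,
on the same toy representation as before; clause i) is read here as "no
variable of the occurrence is bound by a quantifier of phi".

```python
# Hypothetical sketch of critical and subcritical terms. Terms: a
# variable is a string; a compound term is a tuple ('f', t1, ..., tn).
# Formulas: ('atom', P, t1, ...), ('not', f), ('and'/'or', a, b),
# ('forall'/'exists', v, f).

def term_vars(t):
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(a) for a in t[1:])) if len(t) > 1 else set()

def maximal_free(t, bound):
    """Return {t} if t contains no bound variable (clauses i and ii);
    otherwise recurse into the immediate subterms."""
    if not (term_vars(t) & bound):
        return {t}
    if isinstance(t, str):
        return set()
    out = set()
    for a in t[1:]:
        out |= maximal_free(a, bound)
    return out

def critical_terms(f, bound=frozenset()):
    if f[0] == 'atom':
        out = set()
        for t in f[2:]:
            out |= maximal_free(t, bound)
        return out
    if f[0] in ('forall', 'exists'):
        return critical_terms(f[2], bound | {f[1]})
    out = set()
    for g in f[1:]:
        out |= critical_terms(g, bound)
    return out

def subterms(t):
    out = {t}
    if not isinstance(t, str):
        for a in t[1:]:
            out |= subterms(a)
    return out

def subcritical_terms(f):
    out = set()
    for t in critical_terms(f):
        out |= subterms(t)
    return out

# (all y) R(f(x,y), g(x)): g(x) and x are critical; f(x,y) is not.
phi = ('forall', 'y', ('atom', 'R', ('f', 'x', 'y'), ('g', 'x')))
crit = critical_terms(phi)
```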

In the FIRST INSTANTIATION STAGE, we take each set and each universally
quantified statement in each set, and insert the result of instantiating
the universal quantifier with all of the critical terms of that set. This
term substitution must be legal, and for this purpose we may have to change
the bound variables in some of the formulas.

A variant of the FIRST INSTANTIATION STAGE is to use all of the subcritical
terms of each set for substitution in that set, rather than just the
critical terms. Experimentation is needed to assess the significance of
this variant. It generally creates larger sets, but also may allow for more
self proving. One needs to gather experience and data here.
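One instantiation pass can be pictured with a deliberately simple
hypothetical sketch, in which a universally quantified formula is
represented as a function from a term to its instance:

```python
# Hypothetical sketch of one instantiation pass: each universally
# quantified formula, represented here as a function from a term to
# its instance, is applied to every critical term of the set.

def instantiate(universals, critical):
    """Return all instances of the universal formulas at the critical
    (or, in the variant, subcritical) terms of the set."""
    return [u(t) for u in universals for t in critical]

# (all y)(y in A -> y in B), instantiated at critical terms x and f(x)
subset_ab = lambda t: "%s in A -> %s in B" % (t, t)
instances = instantiate([subset_ab], ["x", "f(x)"])
```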

After the FIRST INSTANTIATION STAGE, the computer moves into the SECOND
PREPARATION STAGE, which proceeds as before, except that we work on all of
the sets of formulas. Of course, we eliminate those sets of formulas that
are tautologically inconsistent.

This is followed by the SECOND INSTANTIATION STAGE. Of course, the FIRST
INSTANTIATION STAGE may create new (sub)critical terms, which are then used
in the SECOND INSTANTIATION STAGE.
This is followed by the THIRD PREPARATION STAGE, etcetera.

The mathematician may want to control which is the last stage to be
performed, and in the case of the instantiation stages, control whether one
is using critical terms or the broader subcritical terms.

I am under the impression that there are modern ways to handle a
proliferation of terms by using, e.g., pointers to share parts of terms, so
that the sizes are under reasonable control. Thus the use of subcritical
terms may be generally feasible.
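The sharing technique alluded to here is commonly called hash consing; a
minimal hypothetical sketch:

```python
# Hypothetical sketch of controlling term proliferation by structural
# sharing ("hash consing"): each distinct term is stored exactly once,
# and repeated occurrences are pointers to the same stored object.

class TermPool:
    def __init__(self):
        self._pool = {}

    def make(self, head, *args):
        """Return the unique stored copy of the term head(args)."""
        node = (head,) + args
        return self._pool.setdefault(node, node)

pool = TermPool()
s = pool.make('f', pool.make('x'), pool.make('y'))
t = pool.make('f', pool.make('x'), pool.make('y'))
shared = s is t   # the second build reuses the first stored copy
```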

In any case, I believe that these processes in real examples from
mathematics, in the hands of a mathematician who has a feel for bite sized
lemmas, are sufficiently controllable that a large number of clear examples
can be sketched by hand in a convincing manner.


We are not trying to create a particularly powerful automated proof checker
or theorem prover, but rather one that is based on a very clear division of
labor between mathematician and computer. We want to facilitate the
construction of completely verified mathematics, with particularly friendly
output, particularly friendly input, maximum leverage of mathematical
intuition, and where what is under the hood - i.e., what is done by the
computer - is completely transparent. The documentation is such that this
transparency is plainly visible. We hope that this work will:

i) lead to a new understanding and theory of actual mathematical proofs;
ii) lead to a new kind of tool for members of the mathematical community to
verify mathematics in a particularly friendly and efficient way.

This verification in ii) is not only of basic mathematics, but also,
ultimately, of hardware/software systems. I am working on a functional
programming language that is intended to have a suitably efficient
implementation, and where verification of programs is intended to be
particularly easy using the kind of automated proof checking system we are
talking about here.


 This is the 107th in a series of self contained postings to FOM covering
 a wide range of topics in f.o.m. Previous ones are:

 1:Foundational Completeness   11/3/97, 10:13AM, 10:26AM.
 2:Axioms  11/6/97.
 3:Simplicity  11/14/97 10:10AM.
 4:Simplicity  11/14/97  4:25PM
 5:Constructions  11/15/97  5:24PM
 6:Undefinability/Nonstandard Models   11/16/97  12:04AM
 7.Undefinability/Nonstandard Models   11/17/97  12:31AM
 8.Schemes 11/17/97    12:30AM
 9:Nonstandard Arithmetic 11/18/97  11:53AM
 10:Pathology   12/8/97   12:37AM
 11:F.O.M. & Math Logic  12/14/97 5:47AM
 12:Finite trees/large cardinals  3/11/98  11:36AM
 13:Min recursion/Provably recursive functions  3/20/98  4:45AM
 14:New characterizations of the provable ordinals  4/8/98  2:09AM
 14':Errata  4/8/98  9:48AM
 15:Structural Independence results and provable ordinals  4/16/98
 16:Logical Equations, etc.  4/17/98  1:25PM
 16':Errata  4/28/98  10:28AM
 17:Very Strong Borel statements  4/26/98  8:06PM
 18:Binary Functions and Large Cardinals  4/30/98  12:03PM
 19:Long Sequences  7/31/98  9:42AM
 20:Proof Theoretic Degrees  8/2/98  9:37PM
 21:Long Sequences/Update  10/13/98  3:18AM
 22:Finite Trees/Impredicativity  10/20/98  10:13AM
 23:Q-Systems and Proof Theoretic Ordinals  11/6/98  3:01AM
 24:Predicatively Unfeasible Integers  11/10/98  10:44PM
 25:Long Walks  11/16/98  7:05AM
 26:Optimized functions/Large Cardinals  1/13/99  12:53PM
 27:Finite Trees/Impredicativity:Sketches  1/13/99  12:54PM
 28:Optimized Functions/Large Cardinals:more  1/27/99  4:37AM
 28':Restatement  1/28/99  5:49AM
 29:Large Cardinals/where are we? I  2/22/99  6:11AM
 30:Large Cardinals/where are we? II  2/23/99  6:15AM
 31:First Free Sets/Large Cardinals  2/27/99  1:43AM
 32:Greedy Constructions/Large Cardinals  3/2/99  11:21PM
 33:A Variant  3/4/99  1:52PM
 34:Walks in N^k  3/7/99  1:43PM
 35:Special AE Sentences  3/18/99  4:56AM
 35':Restatement  3/21/99  2:20PM
 36:Adjacent Ramsey Theory  3/23/99  1:00AM
 37:Adjacent Ramsey Theory/more  5:45AM  3/25/99
 38:Existential Properties of Numerical Functions  3/26/99  2:21PM
 39:Large Cardinals/synthesis  4/7/99  11:43AM
 40:Enormous Integers in Algebraic Geometry  5/17/99 11:07AM
 41:Strong Philosophical Indiscernibles
 42:Mythical Trees  5/25/99  5:11PM
 43:More Enormous Integers/AlgGeom  5/25/99  6:00PM
 44:Indiscernible Primes  5/27/99  12:53 PM
 45:Result #1/Program A  7/14/99  11:07AM
 46:Tamism  7/14/99  11:25AM
 47:Subalgebras/Reverse Math  7/14/99  11:36AM
 48:Continuous Embeddings/Reverse Mathematics  7/15/99  12:24PM
 49:Ulm Theory/Reverse Mathematics  7/17/99  3:21PM
 50:Enormous Integers/Number Theory  7/17/99  11:39PM
 51:Enormous Integers/Plane Geometry  7/18/99  3:16PM
 52:Cardinals and Cones  7/18/99  3:33PM
 53:Free Sets/Reverse Math  7/19/99  2:11PM
 54:Recursion Theory/Dynamics 7/22/99 9:28PM
 55:Term Rewriting/Proof Theory 8/27/99 3:00PM
 56:Consistency of Algebra/Geometry  8/27/99  3:01PM
 57:Fixpoints/Summation/Large Cardinals  9/10/99  3:47AM
 57':Restatement  9/11/99  7:06AM
 58:Program A/Conjectures  9/12/99  1:03AM
 59:Restricted summation:Pi-0-1 sentences  9/17/99  10:41AM
 60:Program A/Results  9/17/99  1:32PM
 61:Finitist proofs of conservation  9/29/99  11:52AM
 62:Approximate fixed points revisited  10/11/99  1:35AM
 63:Disjoint Covers/Large Cardinals  10/11/99  1:36AM
 64:Finite Posets/Large Cardinals  10/11/99  1:37AM
 65:Simplicity of Axioms/Conjectures  10/19/99  9:54AM
 66:PA/an approach  10/21/99  8:02PM
 67:Nested Min Recursion/Large Cardinals  10/25/99  8:00AM
 68:Bad to Worse/Conjectures  10/28/99  10:00PM
 69:Baby Real Analysis  11/1/99  6:59AM
 70:Efficient Formulas and Schemes  11/1/99  1:46PM
 71:Ackerman/Algebraic Geometry/1  12/10/99  1:52PM
 72:New finite forms/large cardinals  12/12/99  6:11AM
 73:Hilbert's program wide open?  12/20/99  8:28PM
 74:Reverse arithmetic beginnings  12/22/99  8:33AM
 75:Finite Reverse Mathematics  12/28/99  1:21PM
 76: Finite set theories  12/28/99  1:28PM
 77:Missing axiom/atonement  1/4/00  3:51PM
 78:Quadratic Axioms/Literature Conjectures  1/7/00  11:51AM
 79:Axioms for geometry  1/10/00  12:08PM
 80.Boolean Relation Theory  3/10/00  9:41AM
 81:Finite Distribution  3/13/00  1:44AM
 82:Simplified Boolean Relation Theory  3/15/00  9:23AM
 83:Tame Boolean Relation Theory  3/20/00  2:19AM
 84:BRT/First Major Classification  3/27/00  4:04AM
 85:General Framework/BRT   3/29/00  12:58AM
 86:Invariant Subspace Problem/fA not= U  3/29/00  9:37AM
 87:Programs in Naturalism  5/15/00  2:57AM
 88:Boolean Relation Theory  6/8/00  10:40AM
 89:Model Theoretic Interpretations of Set Theory  6/14/00 10:28AM
 90:Two Universes  6/23/00  1:34PM
 91:Counting Theorems  6/24/00  8:22PM
 92:Thin Set Theorem  6/25/00  5:42AM
 93:Orderings on Formulas  9/18/00  3:46AM
 94:Relative Completeness  9/19/00  4:20AM
 95:Boolean Relation Theory III  12/19/00  7:29PM
 96:Comments on BRT  12/20/00  9:20AM
 97.Classification of Set Theories  12/22/00  7:55AM
 98:Model Theoretic Interpretation of Large Cardinals  3/5/01  3:08PM
 99:Boolean Relation Theory IV  3/8/01  6:08PM
100:Boolean Relation Theory IV corrected  3/21/01  11:29AM
101:Turing Degrees/1  4/2/01  3:32AM
102:Turing Degrees/2  4/8/01  5:20PM
103:Hilbert's Program for Consistency Proofs/1 4/11/01  11:10AM
104:Turing Degrees/3   4/12/01  3:19PM
105:Turing Degrees/4   4/26/01  7:44PM
106.Degenerative Cloning  5/4/01  10:57AM
