[FOM] 185:Ideas in Proof Checking 4

Harvey Friedman friedman at math.ohio-state.edu
Wed Jun 25 03:25:01 EDT 2003


Here we continue the posting #184. We move on to predicate calculus 
with equality.

The notion of finished module needs to be expanded. It's if there is 
an absurdity of the form

i) T(A), F(A'), where A' is an alphabetic variant of A (change of 
bound variables);
ii) F(t = t).

1. TOWARDS A THEORY OF TRIVIALITIES.

We would like to have a useful notion of trivial logical implication 
with a theoretical basis, so that it is reasonably natural from a 
logical point of view.

Our approach is to use the same Beth tableaux type proof construction 
that is employed by the user in section 2. The computer works in the 
background, and is required to use a forced set of priorities. The 
user works in the foreground, on the active module, and is in 
complete control.

The computer will maintain a list of unfinished modules, each of 
which consist of a nonempty finite set of signed formulas.

The free terms in the module are those terms that have an occurrence 
in a formula in the module, in which every occurrence of every 
variable is free. By an occurrence of a term, we mean a maximal 
occurrence - i.e., there is no bigger term occurrence.

The computer starts with a given module - a nonempty finite set of 
signed formulas.

When the computer is working on the given module, it first applies 
these reduction rules the highest priority, which simply expands the 
module, but does not create any new unfinished modules. These are 
applied until they cannot be applied any more.

R1. T(notA). Create F(A).
R2. F(notA). Create T(A).
R3. T(A1 and ... and An). Create T(A1),...,T(An).
R4. F(A1 or ... or An). Create F(A1),...,F(An).
R5. F(A implies B). Create T(A), F(B).
R6. T(A iff B). Create T(A implies B), T(B implies A).
R7. T(A), T(A implies B). Create T(B).
R8. F(B), T(A implies B). Create F(A).
R9. T(A1 or ... or An), F(Ai). Create T(A1 or ... or An) with Ai removed.
R10. F(A1 and ... and An), T(Ai). Create F(A1 and ... and An) with Ai removed.
R11. T(A implies notB). Create F(A and B).
R12. T(notA implies B). Create T(A or B),.
R13. T(notA implies notB). Create T(B implies A).
R14. T(s = t). Create T(t = s).
R15. T(s = t). Consider the formulas in the module obtained by taking 
any formula in the module, and replacing every free occurrence of s 
in the module by t, provided this occurrence of t is also free. 
Otherwise, create an alphabetic variant of the formula so that this 
is the case. Add T of all of these new formulas to the module.

Next apply the following splitting rules.

S1. T(A implies B). Create T(B). Create new module by using F(A) 
instead of T(B).
S2. T(A1 or ... or An). Create T(A1). Create n-1 new modules by using 
T(A2),...,T(An), respectively,  instead of T(A1).
S3. F(A1 and ... and An). Create F(A1). Create n-1 new modules by 
using F(A2),...,F(An), respectively,  instead of F(A1).
S4. F(A iff B). Create T(A),F(B). Create a new module by using 
T(B),F(A) instead of T(A),F(B).

These new modules are put in the list of modules right after the 
active module. They are considered WITNESS FREE. This is because the 
witness rules have not yet been applied.

Next apply these instantiation rules. (All occurrence of terms are maximal).

I1. T((forall x)(A)). Create T(A[x/t]) for t = x, and for every free 
term t in the module. If t is not substitutable for x in A, then use 
T(A'[x/t]) where A' is an alphabetic variant chosen so that t is 
substitutable for x in A'.
I2. F((therexists x)(A)). Create F(A[x/t]) for t = x, and for every 
free term t in the module. If t is not substitutable for x in A, then 
use F(A'[x/t]) where A' is an alphabetic variant chosen so that t is 
substitutable for x in A'.

Now apply the following two witness rules.

W1. T((therexists x)(A)). Create T(A[x/v]), where v is a variable not 
free in the module, and v is substitutable for x in A. Use v = x if 
possible.
W2. F((forall x)(A)). Create F(A[x/v]), where v is a variable not 
free in the module, and v is substitutable for x in A. Use v = x if 
possible.

W1 and W2 are applied exactly once for each such formula.

Now repeat the process indefinitely, BUT WITHOUT THE WITNESS RULES 
EVER BEING APPLIED AGAIN. I.e., reduction, splitting, instantiation, 
reduction, splitting, instantiation, etcetera. The new modules 
created by the splittings are NOT considered witness free. This must 
terminate.

If an absurdity in the sense described just before this section, does 
not occur, then we consider the given module nontrivial.

Suppose an absurdity is generated. Then we remove the initial module 
from memory, and start the same process on the next unfinished module 
in the computer's list of unfinished modules.

The witness rule can only be applied in unfinished modules that have 
been declared WITNESS FREE. This is important, because after the 
witness rules have been applied, we continue to generate splittings, 
and we should not apply witness rules to the results of these 
splittings.

The given module is considered TRIVIAL if and only if eventually all 
modules produce an absurdity, and there are no modules left to work 
on.

Note that we have not been very sophisticated regarding which terms 
to use in the instantiation rules. If the computer has time left 
over, it should probably try subterms of free terms. But clever 
selection of terms is something that the USER is particularly good 
at, and it may be just fine that the entire burden of clever 
selection of terms is on the user.

Thus I am confident that enough genuine trivialities are recognizable 
by this procedure in order to be good enough to be a useful SLAVE to 
the user in the user/computer interaction described in the next 
section.

On the other hand, one could use a highly developed theorem prover 
like Otter as the slave, if need be.

But bear in mind that all of this is in the context where the user is 
expected to be using LEMMA ENRICHMENT in a strong way, in the context 
of a very friendly language for the formalization of mathematics.

2. USER/COMPUTER INTERACTION.

Here are the rules for the user. Assume an active unfinished module.

1. "I don't feel like finishing this module now". The user selects 
another unfinished module. The present state of the closed unfinished 
module is saved so that when it is next activated, the primary window 
comes up just as it was left. However, the secondary window may be 
enlarged because of the background work by the computer.

2. Select a signed formula in the primary window and select move. 
This removes the signed formula from the primary window and enters it 
in the secondary window.

3. Select a signed formula in the secondary window and select move. 
This removes the signed formula from the secondary window and enters 
it in the primary window.

4. The user makes an import from the official record. Say it is A. 
The user selects A from the record and selects move. The signed 
formula T(A) remains in the official record (that's a permanent 
record), and it appears in the primary window. If the previously 
established result from the official record is in implicational form, 
then the computer converts it to a single signed formula A, and then 
T(A) appears in the primary window.

5. The user makes a local definition via a dialog box. The dialog box 
is called up by selecting local. Specifically, the user introduces a 
new symbol by explicit definition, locally to the active module. This 
appears in the primary window as a signed formula T(A) which is the 
introduction of a new symbol by explicit definition. Here we are 
making a definition that is in force only for the active module. 
Consequently, we allow free variables in the definition, A, that are 
free in the active module. We caution that this T(A) cannot be 
imported to other modules.

6. The user makes a global definition via a dialog box. The dialog 
box is called up by selecting global. Specifically, the user 
introduces a new symbol by explicit definition, which is in the 
proper form to be entered into the record of the UMD3, but is not 
sufficiently interesting or fundamental to be submitted to the UMD 
for this purpose. However, it will appear in the UMD3 record while 
this proof is being done, and will be just as accessible as other 
statements in the record. However, it will never appear on the UMD3 
record, and so will not be importable to other proofs. This appears 
in the primary window as a signed formula T(A#) which is the 
universal closure of the introduction of a new symbol by explicit 
definition. This is necessarily more restrictive than in 5, where 
parameters are allowed, since it is useable by every module in this 
proof.

7. Implication splitting. The user clicks on T(A implies B) in the 
primary window and selects split. T(B) appears in the primary window, 
and T(A implies B) remains. The computer creates a new unfinished 
module with the same primary and secondary windows, with T(B) 
replaced by F(A). (This new unfinished module is hidden from view, 
until it gets called up by the user, or by default).

8. Disjunction splitting. The user clicks on T(A1 or ... or An) in 
the primary window and selects split. T(A1) appears in the primary 
window, and T(A1 or ... or An) remains. The computer creates n-1 
unfinished modules with the same primary and secondary windows, with 
T(A1) replaced by T(A2),...,T(An) respectively.

9. Conjunction splitting. The user clicks on F(A1 and ... and An) in 
the primary window and selects split. F(A1) appears in the primary 
window, and F(A1 or ... or An) remains. The computer creates n-1 
unfinished modules with the same primary and secondary windows, with 
F(A1) replaced by F(A2),...,F(An) respectively.

10. Equivalence splitting. The user clicks on F(A iff B) in the 
primary window and selects split. T(A),F(B) appear in the primary 
window, and F(A iff B) remains. The computer creates an unfinished 
module with the same primary and secondary windows, with T(A),F(B) 
replaced by T(B),F(A).

11. Wild card splitting. The user types wild, which calls up the wild 
dialog box. The user enters any signed formula in the dialog box, 
using only symbols previously introduced, including those introduced 
through local and global definitions. The signed formula is entered 
into the primary window, and the computer creates an unfinished 
module with the same primary and secondary windows, with the dual of 
the signed formula replacing the signed formula.

A very typical use of the WILD CARD is when the user is in hot 
pursuit of getting the desired absurdity to finish the module, but 
needs some obvious facts that have been proved previously, or have at 
least nearly been proved previously, or are trivial. Maybe it's a 
matter of some algebra. It's difficult to predict exactly all the 
algebra needed before the user starts on a given module, and 
especially before the user starts on the proof. The user can then use 
any algebra he wants to, knowing that the computer will later ask for 
a justification of these algebraic manipulations in the form of 
additional unfinished modules. The user can pick a time when he is in 
the mood for dealing with such boring matters, choosing to finish 
more interesting modules in the meantime.

It is this use of WILD CARD that prompts us to keep the computer busy 
with the process of looking for trivialities in section 1.

In this use of WILD CARD, the user hopes that the signed formula he 
is entering turns out to be TRIVIAL according to the computer. The 
hope is that the computer will have finished the unfinished module 
created in WILD CARD. See section 3 on background computing. The 
worst that can happen is that the computer doesn't finish it, and the 
user will eventually have to. A friendly computer will notify the 
user that it has "finished the WILD CARD split", which hopefully will 
be during the time the user is working with the WILD CARD entry.

Of course, sometimes WILD CARD is used in a more critical way, 
arguing from the truth and separately the falsity of a crucial 
assertion that only the user would naturally come up with.

3. BACKGROUND COMPUTING.

The computer will do two kinds of computing in the background.

a, Every time a module gets finished - i.e., an absurdity is 
generated, there is an origin module that is sufficient to generate 
that absurdity. The computer keeps track of all minimum generating 
modules that generate absurdity. These are available to the user for 
browsing. To avoid duplication of work, the computer checks to see if 
any of these minimum generating modules are contained in the entries 
in an unfinished module. If this is the case, then that unfinished 
module is considered finished.

The browsing by the user of these minimum generating modules can give 
the user ideas as to good uses of WILD CARD.

b. The computer runs the process spelled out in section 1 on each and 
every unfinished module, including the active one. The computer 
notifies the user in a friendly way whenever it has finished a module 
in this way.

c. The computer should not waste time. It can rev up the process in 
section 1 by allowing every module to use the witness rules twice 
instead of once. Also expand the terms used in other systematic ways.

d. Also, an idea is to use an existing proof checker as a slave. This 
is a dirty solution, but perhaps highly effective, taking advantage 
of existing technology. E.g., OTTER. This can be combined with 
section 1 and this section 3, or it could replace it. The computing 
involved in section 2 is essentially trivial, and so there is plenty 
of computer time available to run a slave like OTTER.

e. Other, more special slaves can also be hooked up. The interfaces 
to them need to be established. Of course these interfaces cannot be 
discussed until the language for the formalization of mathematics is 
specified. These more special slaves include MAPLE, and various 
packages for certain classes of statements. E.g., Presburger 
arithmetic and real algebra. Also, there is major room for various 
other kinds of fragments that may NOT be decidable.

4. CERTAIN IMPLEMENTATIONAL DETAILS.

Let us review the overall architecture of the system. We have been 
somewhat vague about the relationship between the modules created by 
the user doing a splitting, and the modules created by the computer 
doing a splitting. We clear this up, as well as other issues.

The interaction begins when a specific statement is made by the user 
and the user initiates a proof of it. The initial module is created, 
which is a list of one or more signed formulas corresponding to the 
specific statement.

The specific statement lives in the context of the record of all 
accepted statements. These accepted statements include definitions, 
axioms, lemmas, theorems, corollaries, and remarks. The user has 
access to the record for import. The specific statement must be in 
the language of the record.

At the moment the interaction begins, the computer initiates and 
maintains a numbered list of unfinished user modules, with just the 
one entry. Exactly one of the unfinished user modules is considered 
to be the active unfinished user module.

Each unfinished user module is identified with a set of signed 
formulas. These signed formulas are divided into two lists. The first 
list is the primary list, and the second list is the secondary list. 
Upon activation of an unfinished user module, the primary list is 
displayed in the primary window, and the secondary list is displayed 
in the secondary window.

At various times, the primary and secondary lists of the unfinished 
user modules are adjusted as a response to the user, and also as a 
response to the computer. The combined lists for any unfinished user 
module always remain the same or expands. There are never any 
deletions in the combined list for any unfinished user module. The 
computer can only cause changes in the secondary lists (by an 
expansion).

At various times, new modules are created in response to the user and 
to the computer. The computer maintains a list of all unfinished user 
created modules. These unfinished user created modules, at any given 
point in time, can be thought of as the terminal nodes in a tree. At 
various times, some modules become absurd, and are then removed. 
These modules only expand, as directed by both the user and the 
computer.

Also, the computer maintains a list of all unfinished modules, 
including user created modules and computer created modules. These 
unfinished user created modules, at any given point in time, can be 
thought of as the terminal nodes in a tree. At various times, some 
modules become absurd, and are then removed. These modules only 
expand. The user created modules expand as directed by BOTH the user 
and the computer. The computer created modules expand only as 
directed by the computer, and NOT by the user.

Note that an unfinished module, whether created by the user or the 
computer, can get finished - i.e., get an absurdity - and thereby be 
removed, without the user being responsible. This can be a rather 
pleasant surprise to the user.

We now discuss UNDO.

The very best way to implement the most powerful UNDO is too early to 
specify at this time. However, it seems clear already that with some 
care, one can have a very useful UNDO.

*********************************************

I use http://www.mathpreprints.com/math/Preprint/show/ for manuscripts with
proofs. Type Harvey Friedman in the window.
This is the 185th in a series of self contained numbered postings to 
FOM covering a wide range of topics in f.o.m. The list of previous 
numbered postings #1-149 can be found at 
http://www.cs.nyu.edu/pipermail/fom/2003-May/006563.html  in the FOM 
archives, 5/8/03 8:46AM. Previous ones counting from #150 are:

150:Finite obstruction/statistics  8:55AM  6/1/02
151:Finite forms by bounding  4:35AM  6/5/02
152:sin  10:35PM  6/8/02
153:Large cardinals as general algebra  1:21PM  6/17/02
154:Orderings on theories  5:28AM  6/25/02
155:A way out  8/13/02  6:56PM
156:Societies  8/13/02  6:56PM
157:Finite Societies  8/13/02  6:56PM
158:Sentential Reflection  3/31/03  12:17AM
159.Elemental Sentential Reflection  3/31/03  12:17AM
160.Similar Subclasses  3/31/03  12:17AM
161:Restrictions and Extensions  3/31/03  12:18AM
162:Two Quantifier Blocks  3/31/03  12:28PM
163:Ouch!  4/20/03  3:08AM
164:Foundations with (almost) no axioms, 4/22/0  5:31PM
165:Incompleteness Reformulated  4/29/03  1:42PM
166:Clean Godel Incompleteness  5/6/03  11:06AM
167:Incompleteness Reformulated/More  5/6/03  11:57AM
168:Incompleteness Reformulated/Again 5/8/03  12:30PM
169:New PA Independence  5:11PM  8:35PM
170:New Borel Independence  5/18/03  11:53PM
171:Coordinate Free Borel Statements  5/22/03  2:27PM
172:Ordered Fields/Countable DST/PD/Large Cardinals  5/34/03  1:55AM
173:Borel/DST/PD  5/25/03  2:11AM
174:Directly Honest Second Incompleteness  6/3/03  1:39PM
175:Maximal Principle/Hilbert's Program  6/8/03  11:59PM
176:Count Arithmetic  6/10/03  8:54AM
177:Strict Reverse Mathematics 1  6/10/03  8:27PM
178:Diophantine Shift Sequences  6/14/03  6:34PM
179:Polynomial Shift Sequences/Correction  6/15/03  2:24PM
180:Provable Functions of PA  6/16/03  12:42AM
181:Strict Reverse Mathematics 2:06/19/03  2:06AM
182:Ideas in Proof Checking 1 10:50PM  6/21/03
183:Ideas in Proof Checking 2 5:48PM  6/22/03
184:Ideas in Proof Checking 3 5:58PM  6/23/03

Harvey Friedman


More information about the FOM mailing list