[FOM] 185:Ideas in Proof Checking 4
Harvey Friedman
friedman at math.ohio-state.edu
Wed Jun 25 03:25:01 EDT 2003
Here we continue the posting #184. We move on to predicate calculus
with equality.
The notion of finished module needs to be expanded. It's if there is
an absurdity of the form
i) T(A), F(A'), where A' is an alphabetic variant of A (change of
bound variables);
ii) F(t = t).
1. TOWARDS A THEORY OF TRIVIALITIES.
We would like to have a useful notion of trivial logical implication
with a theoretical basis, so that it is reasonably natural from a
logical point of view.
Our approach is to use the same Beth tableaux type proof construction
that is employed by the user in section 2. The computer works in the
background, and is required to use a forced set of priorities. The
user works in the foreground, on the active module, and is in
complete control.
The computer will maintain a list of unfinished modules, each of
which consist of a nonempty finite set of signed formulas.
The free terms in the module are those terms that have an occurrence
in a formula in the module, in which every occurrence of every
variable is free. By an occurrence of a term, we mean a maximal
occurrence - i.e., there is no bigger term occurrence.
The computer starts with a given module - a nonempty finite set of
signed formulas.
When the computer is working on the given module, it first applies
these reduction rules the highest priority, which simply expands the
module, but does not create any new unfinished modules. These are
applied until they cannot be applied any more.
R1. T(notA). Create F(A).
R2. F(notA). Create T(A).
R3. T(A1 and ... and An). Create T(A1),...,T(An).
R4. F(A1 or ... or An). Create F(A1),...,F(An).
R5. F(A implies B). Create T(A), F(B).
R6. T(A iff B). Create T(A implies B), T(B implies A).
R7. T(A), T(A implies B). Create T(B).
R8. F(B), T(A implies B). Create F(A).
R9. T(A1 or ... or An), F(Ai). Create T(A1 or ... or An) with Ai removed.
R10. F(A1 and ... and An), T(Ai). Create F(A1 and ... and An) with Ai removed.
R11. T(A implies notB). Create F(A and B).
R12. T(notA implies B). Create T(A or B),.
R13. T(notA implies notB). Create T(B implies A).
R14. T(s = t). Create T(t = s).
R15. T(s = t). Consider the formulas in the module obtained by taking
any formula in the module, and replacing every free occurrence of s
in the module by t, provided this occurrence of t is also free.
Otherwise, create an alphabetic variant of the formula so that this
is the case. Add T of all of these new formulas to the module.
Next apply the following splitting rules.
S1. T(A implies B). Create T(B). Create new module by using F(A)
instead of T(B).
S2. T(A1 or ... or An). Create T(A1). Create n-1 new modules by using
T(A2),...,T(An), respectively, instead of T(A1).
S3. F(A1 and ... and An). Create F(A1). Create n-1 new modules by
using F(A2),...,F(An), respectively, instead of F(A1).
S4. F(A iff B). Create T(A),F(B). Create a new module by using
T(B),F(A) instead of T(A),F(B).
These new modules are put in the list of modules right after the
active module. They are considered WITNESS FREE. This is because the
witness rules have not yet been applied.
Next apply these instantiation rules. (All occurrence of terms are maximal).
I1. T((forall x)(A)). Create T(A[x/t]) for t = x, and for every free
term t in the module. If t is not substitutable for x in A, then use
T(A'[x/t]) where A' is an alphabetic variant chosen so that t is
substitutable for x in A'.
I2. F((therexists x)(A)). Create F(A[x/t]) for t = x, and for every
free term t in the module. If t is not substitutable for x in A, then
use F(A'[x/t]) where A' is an alphabetic variant chosen so that t is
substitutable for x in A'.
Now apply the following two witness rules.
W1. T((therexists x)(A)). Create T(A[x/v]), where v is a variable not
free in the module, and v is substitutable for x in A. Use v = x if
possible.
W2. F((forall x)(A)). Create F(A[x/v]), where v is a variable not
free in the module, and v is substitutable for x in A. Use v = x if
possible.
W1 and W2 are applied exactly once for each such formula.
Now repeat the process indefinitely, BUT WITHOUT THE WITNESS RULES
EVER BEING APPLIED AGAIN. I.e., reduction, splitting, instantiation,
reduction, splitting, instantiation, etcetera. The new modules
created by the splittings are NOT considered witness free. This must
terminate.
If an absurdity in the sense described just before this section, does
not occur, then we consider the given module nontrivial.
Suppose an absurdity is generated. Then we remove the initial module
from memory, and start the same process on the next unfinished module
in the computer's list of unfinished modules.
The witness rule can only be applied in unfinished modules that have
been declared WITNESS FREE. This is important, because after the
witness rules have been applied, we continue to generate splittings,
and we should not apply witness rules to the results of these
splittings.
The given module is considered TRIVIAL if and only if eventually all
modules produce an absurdity, and there are no modules left to work
on.
Note that we have not been very sophisticated regarding which terms
to use in the instantiation rules. If the computer has time left
over, it should probably try subterms of free terms. But clever
selection of terms is something that the USER is particularly good
at, and it may be just fine that the entire burden of clever
selection of terms is on the user.
Thus I am confident that enough genuine trivialities are recognizable
by this procedure in order to be good enough to be a useful SLAVE to
the user in the user/computer interaction described in the next
section.
On the other hand, one could use a highly developed theorem prover
like Otter as the slave, if need be.
But bear in mind that all of this is in the context where the user is
expected to be using LEMMA ENRICHMENT in a strong way, in the context
of a very friendly language for the formalization of mathematics.
2. USER/COMPUTER INTERACTION.
Here are the rules for the user. Assume an active unfinished module.
1. "I don't feel like finishing this module now". The user selects
another unfinished module. The present state of the closed unfinished
module is saved so that when it is next activated, the primary window
comes up just as it was left. However, the secondary window may be
enlarged because of the background work by the computer.
2. Select a signed formula in the primary window and select move.
This removes the signed formula from the primary window and enters it
in the secondary window.
3. Select a signed formula in the secondary window and select move.
This removes the signed formula from the secondary window and enters
it in the primary window.
4. The user makes an import from the official record. Say it is A.
The user selects A from the record and selects move. The signed
formula T(A) remains in the official record (that's a permanent
record), and it appears in the primary window. If the previously
established result from the official record is in implicational form,
then the computer converts it to a single signed formula A, and then
T(A) appears in the primary window.
5. The user makes a local definition via a dialog box. The dialog box
is called up by selecting local. Specifically, the user introduces a
new symbol by explicit definition, locally to the active module. This
appears in the primary window as a signed formula T(A) which is the
introduction of a new symbol by explicit definition. Here we are
making a definition that is in force only for the active module.
Consequently, we allow free variables in the definition, A, that are
free in the active module. We caution that this T(A) cannot be
imported to other modules.
6. The user makes a global definition via a dialog box. The dialog
box is called up by selecting global. Specifically, the user
introduces a new symbol by explicit definition, which is in the
proper form to be entered into the record of the UMD3, but is not
sufficiently interesting or fundamental to be submitted to the UMD
for this purpose. However, it will appear in the UMD3 record while
this proof is being done, and will be just as accessible as other
statements in the record. However, it will never appear on the UMD3
record, and so will not be importable to other proofs. This appears
in the primary window as a signed formula T(A#) which is the
universal closure of the introduction of a new symbol by explicit
definition. This is necessarily more restrictive than in 5, where
parameters are allowed, since it is useable by every module in this
proof.
7. Implication splitting. The user clicks on T(A implies B) in the
primary window and selects split. T(B) appears in the primary window,
and T(A implies B) remains. The computer creates a new unfinished
module with the same primary and secondary windows, with T(B)
replaced by F(A). (This new unfinished module is hidden from view,
until it gets called up by the user, or by default).
8. Disjunction splitting. The user clicks on T(A1 or ... or An) in
the primary window and selects split. T(A1) appears in the primary
window, and T(A1 or ... or An) remains. The computer creates n-1
unfinished modules with the same primary and secondary windows, with
T(A1) replaced by T(A2),...,T(An) respectively.
9. Conjunction splitting. The user clicks on F(A1 and ... and An) in
the primary window and selects split. F(A1) appears in the primary
window, and F(A1 or ... or An) remains. The computer creates n-1
unfinished modules with the same primary and secondary windows, with
F(A1) replaced by F(A2),...,F(An) respectively.
10. Equivalence splitting. The user clicks on F(A iff B) in the
primary window and selects split. T(A),F(B) appear in the primary
window, and F(A iff B) remains. The computer creates an unfinished
module with the same primary and secondary windows, with T(A),F(B)
replaced by T(B),F(A).
11. Wild card splitting. The user types wild, which calls up the wild
dialog box. The user enters any signed formula in the dialog box,
using only symbols previously introduced, including those introduced
through local and global definitions. The signed formula is entered
into the primary window, and the computer creates an unfinished
module with the same primary and secondary windows, with the dual of
the signed formula replacing the signed formula.
A very typical use of the WILD CARD is when the user is in hot
pursuit of getting the desired absurdity to finish the module, but
needs some obvious facts that have been proved previously, or have at
least nearly been proved previously, or are trivial. Maybe it's a
matter of some algebra. It's difficult to predict exactly all the
algebra needed before the user starts on a given module, and
especially before the user starts on the proof. The user can then use
any algebra he wants to, knowing that the computer will later ask for
a justification of these algebraic manipulations in the form of
additional unfinished modules. The user can pick a time when he is in
the mood for dealing with such boring matters, choosing to finish
more interesting modules in the meantime.
It is this use of WILD CARD that prompts us to keep the computer busy
with the process of looking for trivialities in section 1.
In this use of WILD CARD, the user hopes that the signed formula he
is entering turns out to be TRIVIAL according to the computer. The
hope is that the computer will have finished the unfinished module
created in WILD CARD. See section 3 on background computing. The
worst that can happen is that the computer doesn't finish it, and the
user will eventually have to. A friendly computer will notify the
user that it has "finished the WILD CARD split", which hopefully will
be during the time the user is working with the WILD CARD entry.
Of course, sometimes WILD CARD is used in a more critical way,
arguing from the truth and separately the falsity of a crucial
assertion that only the user would naturally come up with.
3. BACKGROUND COMPUTING.
The computer will do two kinds of computing in the background.
a, Every time a module gets finished - i.e., an absurdity is
generated, there is an origin module that is sufficient to generate
that absurdity. The computer keeps track of all minimum generating
modules that generate absurdity. These are available to the user for
browsing. To avoid duplication of work, the computer checks to see if
any of these minimum generating modules are contained in the entries
in an unfinished module. If this is the case, then that unfinished
module is considered finished.
The browsing by the user of these minimum generating modules can give
the user ideas as to good uses of WILD CARD.
b. The computer runs the process spelled out in section 1 on each and
every unfinished module, including the active one. The computer
notifies the user in a friendly way whenever it has finished a module
in this way.
c. The computer should not waste time. It can rev up the process in
section 1 by allowing every module to use the witness rules twice
instead of once. Also expand the terms used in other systematic ways.
d. Also, an idea is to use an existing proof checker as a slave. This
is a dirty solution, but perhaps highly effective, taking advantage
of existing technology. E.g., OTTER. This can be combined with
section 1 and this section 3, or it could replace it. The computing
involved in section 2 is essentially trivial, and so there is plenty
of computer time available to run a slave like OTTER.
e. Other, more special slaves can also be hooked up. The interfaces
to them need to be established. Of course these interfaces cannot be
discussed until the language for the formalization of mathematics is
specified. These more special slaves include MAPLE, and various
packages for certain classes of statements. E.g., Presburger
arithmetic and real algebra. Also, there is major room for various
other kinds of fragments that may NOT be decidable.
4. CERTAIN IMPLEMENTATIONAL DETAILS.
Let us review the overall architecture of the system. We have been
somewhat vague about the relationship between the modules created by
the user doing a splitting, and the modules created by the computer
doing a splitting. We clear this up, as well as other issues.
The interaction begins when a specific statement is made by the user
and the user initiates a proof of it. The initial module is created,
which is a list of one or more signed formulas corresponding to the
specific statement.
The specific statement lives in the context of the record of all
accepted statements. These accepted statements include definitions,
axioms, lemmas, theorems, corollaries, and remarks. The user has
access to the record for import. The specific statement must be in
the language of the record.
At the moment the interaction begins, the computer initiates and
maintains a numbered list of unfinished user modules, with just the
one entry. Exactly one of the unfinished user modules is considered
to be the active unfinished user module.
Each unfinished user module is identified with a set of signed
formulas. These signed formulas are divided into two lists. The first
list is the primary list, and the second list is the secondary list.
Upon activation of an unfinished user module, the primary list is
displayed in the primary window, and the secondary list is displayed
in the secondary window.
At various times, the primary and secondary lists of the unfinished
user modules are adjusted as a response to the user, and also as a
response to the computer. The combined lists for any unfinished user
module always remain the same or expands. There are never any
deletions in the combined list for any unfinished user module. The
computer can only cause changes in the secondary lists (by an
expansion).
At various times, new modules are created in response to the user and
to the computer. The computer maintains a list of all unfinished user
created modules. These unfinished user created modules, at any given
point in time, can be thought of as the terminal nodes in a tree. At
various times, some modules become absurd, and are then removed.
These modules only expand, as directed by both the user and the
computer.
Also, the computer maintains a list of all unfinished modules,
including user created modules and computer created modules. These
unfinished user created modules, at any given point in time, can be
thought of as the terminal nodes in a tree. At various times, some
modules become absurd, and are then removed. These modules only
expand. The user created modules expand as directed by BOTH the user
and the computer. The computer created modules expand only as
directed by the computer, and NOT by the user.
Note that an unfinished module, whether created by the user or the
computer, can get finished - i.e., get an absurdity - and thereby be
removed, without the user being responsible. This can be a rather
pleasant surprise to the user.
We now discuss UNDO.
The very best way to implement the most powerful UNDO is too early to
specify at this time. However, it seems clear already that with some
care, one can have a very useful UNDO.
*********************************************
I use http://www.mathpreprints.com/math/Preprint/show/ for manuscripts with
proofs. Type Harvey Friedman in the window.
This is the 185th in a series of self contained numbered postings to
FOM covering a wide range of topics in f.o.m. The list of previous
numbered postings #1-149 can be found at
http://www.cs.nyu.edu/pipermail/fom/2003-May/006563.html in the FOM
archives, 5/8/03 8:46AM. Previous ones counting from #150 are:
150:Finite obstruction/statistics 8:55AM 6/1/02
151:Finite forms by bounding 4:35AM 6/5/02
152:sin 10:35PM 6/8/02
153:Large cardinals as general algebra 1:21PM 6/17/02
154:Orderings on theories 5:28AM 6/25/02
155:A way out 8/13/02 6:56PM
156:Societies 8/13/02 6:56PM
157:Finite Societies 8/13/02 6:56PM
158:Sentential Reflection 3/31/03 12:17AM
159.Elemental Sentential Reflection 3/31/03 12:17AM
160.Similar Subclasses 3/31/03 12:17AM
161:Restrictions and Extensions 3/31/03 12:18AM
162:Two Quantifier Blocks 3/31/03 12:28PM
163:Ouch! 4/20/03 3:08AM
164:Foundations with (almost) no axioms, 4/22/0 5:31PM
165:Incompleteness Reformulated 4/29/03 1:42PM
166:Clean Godel Incompleteness 5/6/03 11:06AM
167:Incompleteness Reformulated/More 5/6/03 11:57AM
168:Incompleteness Reformulated/Again 5/8/03 12:30PM
169:New PA Independence 5:11PM 8:35PM
170:New Borel Independence 5/18/03 11:53PM
171:Coordinate Free Borel Statements 5/22/03 2:27PM
172:Ordered Fields/Countable DST/PD/Large Cardinals 5/34/03 1:55AM
173:Borel/DST/PD 5/25/03 2:11AM
174:Directly Honest Second Incompleteness 6/3/03 1:39PM
175:Maximal Principle/Hilbert's Program 6/8/03 11:59PM
176:Count Arithmetic 6/10/03 8:54AM
177:Strict Reverse Mathematics 1 6/10/03 8:27PM
178:Diophantine Shift Sequences 6/14/03 6:34PM
179:Polynomial Shift Sequences/Correction 6/15/03 2:24PM
180:Provable Functions of PA 6/16/03 12:42AM
181:Strict Reverse Mathematics 2:06/19/03 2:06AM
182:Ideas in Proof Checking 1 10:50PM 6/21/03
183:Ideas in Proof Checking 2 5:48PM 6/22/03
184:Ideas in Proof Checking 3 5:58PM 6/23/03
Harvey Friedman
More information about the FOM
mailing list