[FOM] 184:Ideas in Proof Checking 3
Harvey Friedman
friedman at math.ohio-state.edu
Mon Jun 23 17:58:11 EDT 2003
Before moving on to Free Predicate Calculus, I am going to revisit
the material in #183 to give the user much more powerful options, and
also make the computer work harder.
But please recall that I strongly recommend LEMMA ENRICHMENT, so that
all proofs are as trivial as possible.
I have also changed terminology, using module instead of task.
Recall the general philosophy of the user/computer interaction. It is
a unidirectional effort to get ABSURDITIES. Therefore no mistakes can
be made, and there are no difficulties involved in excessive
pointless generation of formulas by the computer - or even by the
user. It is completely modularized, and unidirectional, with no side
effects. The only danger is that things get too cluttered. This is
solved by a double window system. Also there is an UNDO. The user
cannot get lost in the proof. Importation is made unexpectedly
friendly as we shall demonstrate.
1. IMPLEMENTATION FOR TOY LANGUAGE OF PROPOSITIONAL CALCULUS.
The computer keeps a list of unfinished modules for the human.
At any time, a module is a nonempty finite set of signed formulas in
propositional calculus. A module is said to be finished if and only
if it contains some T(A),F(A). Otherwise, it is said to be unfinished.
Any module that is finished is removed from the list of unfinished
modules. The goal is for the user to get all unfinished modules
finished. This process is initiated by an initial module, which
corresponds exactly to the statement being proved.
I.e., if the statement being proved is A, then the initial module
consists of F(A). If the statement being proved takes the form
Assume A1,...,An. Then B.
then the initial module consists of
T(A1),...,T(An),F(B).
At any point in the user/computer interaction, there is exactly one
active module, which is of course unfinished. The unfinished modules
are kept in a numbered inventory of unfinished modules by the
computer, and can be called by name or number through standard human
interfaces for such purposes.
The active module presents a primary and a secondary window. (There
may be a point to having more windows, but this is not yet clear).
Both windows merely contain a list of signed formulas.
The contents of the primary window are entirely user controlled,
whereas the contents of the secondary window are primarily computer
controlled.
The entire active module - which is just a nonempty finite set of
signed formulas - is always partitioned into the signed formulas on
the primary window and the signed formulas on the secondary window.
As time proceeds, the active module expands. It never contracts.
However, the user can shift any signed formula from one window to
another. Generally, the user will shift signed formulas from the
primary window to the secondary window in order to keep the primary
window uncluttered for the main work.
The computer works in the background, continually enlarging the
secondary window. Much of what is generated is likely to be junk. The
secondary window is displayed in order of increasing complexity, with
the low complexity signed formulas first.
If the user browses the secondary window, he may find useful signed
formulas, in which case he may wish to shift them from the secondary
window to the primary window.
The background algorithm that the computer continually runs makes
immediate use of the moves that the human makes on the primary
window. Frequently, the computer will simply announced that the
active module is finished. The module is then removed from the list
of unfinished modules, and a new unfinished module is activated. New
ones are chosen by name or number, or by default.
The heart of the user/computer interaction occurs when the user makes
certain moves in the primary window of the active module.
These moves create zero or more new signed formulas in the primary
window, and create zero or more new modules. These new modules are
hidden from view, as they are not active. The unfinished among these
are placed in the numbered list of unfinished modules right after the
active module being worked on.
1. "I don't feel like finishing this task now". The user selects
another unfinished module. The present state of the closed unfinished
module is saved so that when it is next activated, the primary window
comes up just as it was left. However, the secondary window may be
enlarged because of the background work by the computer.
2. Select a signed formula in the primary window and hit m (for
move). This moves the signed formula to the secondary window.
3. Select a signed formula in the secondary window and hit m. This
moves the signed formula to the primary window.
4. The user looks up a previously established formula in the official
record of the UMD3 (universal mathematical database, part 3). See
items 1-8 in #182, 10:50PM 6/21/03. Say it is A. The user selects A
and hits =. The signed formula T(A) appears in the primary window. If
the previously established result is in implicational form, then it
must be converted into a single formula A, and imported as T(A).
We now list a number of very useful REDUCTION RULES that create new
signed formulas in the primary window, upon the selection of signed
formulas in the primary window. These rules do not create any new
unfinished modules. They are preferable to the creation of new
unfinished modules.
The rules are activated by selecting the indicated signed formula(s)
from the primary window, and hitting r (reduce). The indicated signed
formula(s) appears in the primary window. The selected signed formula
remains. Of course, the user can then activate 2 if so desired.
R1. T(notA). F(A) created.
R2. F(notA). T(A) created.
R3. T(A1 and ... and An). T(A1),...,T(An) created.
R4. F(A1 or ... or An). F(A1),...,F(An) created.
R5. F(A implies B). T(A),F(B) created.
R6. T(A iff B). T(A implies B), T(B implies A) created.
R7. T(A implies B), T(A). T(B) created.
R8. T(A implies B), F(B). F(A) created.
R9. F(A iff B). T(A implies notB), T(B implies notA) created.
R14. T(A implies notB). T(notA implies B) created.
R15. T(notA implies B). T(A implies notB) created.
R16. T(notA implies notB). T(B implies A) created.
R17. T(A1 or ... or An) and one or more F(Ai). T(B1 or ... or Bk)
created, where the B's are the remaining A's. If there are no
remaining A's, then the present module is declared finished.
R18. F(A1 and ... and An) and one or more T(Ai). F(B1 and ... and Bk)
is created, where the B's are the remaining A's. If there are no
remaining A's, then the present module is declared finished.
R19. T(A1 implies A2), T(A2 implies A3), ..., T(An implies An+1),
n >= 2. T(A1 implies An+1) created.
R20. T(A1 iff A2), T(A2 iff A3), .., T(An iff An+1), n >= 2. T(A1 iff
An+1) created.
R21. T(A implies B), T(B implies C), T(C implies A). T(B implies A),
T(C implies B), T(C implies A) created.
We also have a few buildup rules.
B1. Select T(A1),...,T(An), n >= 2. Hit c (conjunction). T(A1 and ...
and An) created.
B2. Select T(A) and hit d (disjunction). Dialog box comes up which
allows user to enter a disjunction, alpha, that includes A as a
disjunct. T(alpha) created.
B3. Select T(A) and hit n (negation). F(notA) created.
B4. Select F(A) and hit n (negation). T(notA) created.
B5. Select T(A) and hit i (implication). Dialog box comes up which
allows user to enter a formula B. T(B implies A) created.
B6. Select F(A) and hit i (implication). Dialog box comes up which
allows user to enter a formula B. T(A implies B) created.
We now come to the rules that create new unfinished modules. In each
case, make the selection indicated and hit s (split). "Create" always
means placing it on the primary window. All new unfinished modules
are given by the module before the split instruction ogether with the
signed formula(s) indicated.
S1. T(A1 or ... or An). Create T(A1) for this unfinished module. Also
create n-1 new unfinished modules for each of T(A2),...,T(An).
S2. F(A1 and ... and An). Create F(A1) for this unfinished module.
Also create n-1 new unfinished modules for each of F(A2),...,F(An).
S3. T(A implies B). Create T(B) for this unfinished module. Also
create a single new unfinished module for F(A).
S4. F(A iff B). Create T(A), F(B) for this unfinished module. Also
create a single new unfinished module for T(B), F(A).
We now come to the WILD CARD rule. Again, create means placing it on
the primary window.
W. Hit w. A dialog box comes up asking for a signed formula, say
T(A). Create T(A). Also create a new unfinished module for F(A). If
the signed formula entered is F(A), then create F(A), and also create
a new unfinished module for T(A).
The point of rule W is the following. The user knows exactly how to
proceed if only he knew T(A). But there is no immediate way to get
T(A). So the user enters T(A) anyways, and happily proceeds, while
postponing the difficultly until later, in the form of a new
unfinished module.
A very typical use of the WILD CARD is when the user is in hot
pursuit of getting the desired absurdity to finish the module, but
needs some obvious facts that have been proved previously. Maybe it's
a matter of some algebra. It's difficult to predict exactly all the
algebra needed before the user starts on a given module, and
especially before the user starts on the proof. The user can then use
any algebra he wants to, knowing that the computer will later ask for
a justification of these algebraic manipulations in the form of
additional unfinished modules. The user can pick a time when he is in
the mood for dealing with such boring matters, choosing to finish
more interesting modules in the meantime.
Of course, sometimes the wild card rule is used in a more critical
way, arguing from the truth and separately the falsity of a crucial
assertion that only the user would naturally come up with.
2. BACKGROUND COMPUTING.
The computer will do two kinds of computing in the background.
a, Every time a module gets finished - i.e., an absurdity is
generated, there is an origin module that is sufficient to generate
that absurdity. The computer keeps track of all minimum generating
modules that generate absurdity. These are available to the user for
browsing. To avoid duplication of work, the computer checks to see if
any of these minimum generating modules are contained in the entries
in an unfinished module. If this is the case, then that unfinished
module is considered finished.
The browsing by the user of these minimum generating modules can give
the user ideas as to good uses of the wild card rule.
b. The computer does a restricted proof search on each and every
unfinished module, including the active one. This restricted proof
search is limited to rules R1-R21, S1-S4, and also rules B1-B6, W,
but only if using the results of applying B1-B6, W which are
subformulas of the formulas that are present in the unfinished module
being worked on by the computer (not necessarily just the active one
being worked on by the user). As the user adds new formulas to active
unfinished modules, the computer can more liberally apply these rules
since now some results of B1-B6, W are subformulas whereas they
weren't before.
The combinatorial explosion inherent in b needs to be managed by some
protocols that are a bit unclear at this point. However, recall that
the user is following a LEMMA RICH development, so that proofs are
generally fairly trivial, else a new Lemma should be inserted. Also,
the user has a very powerful set of unidirectional, completely
modular, intuitively clear, set of commands at his disposal.
3. UNDO.
The very best way to implement the most powerful UNDO is too early to
specify at this time. We generally may want to save the results of
the high powered background computation that is going on, and which
may be generating a lot of data, and would perhaps take too much time
to reassemble. On the other hand, there is a crude UNDO that might be
useful to have, if the user is willing to wait a bit. Up in the air.
#############################
This posting is long enough, with enough ideas to chew on. In the
next posting, I will take up the ordinary predicate calculus with
equality. The ideas here extend naturally. The additional ideas
concern terms and equality.
*********************************************
I use http://www.mathpreprints.com/math/Preprint/show/ for manuscripts with
proofs. Type Harvey Friedman in the window.
This is the 184th in a series of self contained numbered postings to
FOM covering a wide range of topics in f.o.m. The list of previous
numbered postings #1-149 can be found at
http://www.cs.nyu.edu/pipermail/fom/2003-May/006563.html in the FOM
archives, 5/8/03 8:46AM. Previous ones counting from #150 are:
150:Finite obstruction/statistics 8:55AM 6/1/02
151:Finite forms by bounding 4:35AM 6/5/02
152:sin 10:35PM 6/8/02
153:Large cardinals as general algebra 1:21PM 6/17/02
154:Orderings on theories 5:28AM 6/25/02
155:A way out 8/13/02 6:56PM
156:Societies 8/13/02 6:56PM
157:Finite Societies 8/13/02 6:56PM
158:Sentential Reflection 3/31/03 12:17AM
159.Elemental Sentential Reflection 3/31/03 12:17AM
160.Similar Subclasses 3/31/03 12:17AM
161:Restrictions and Extensions 3/31/03 12:18AM
162:Two Quantifier Blocks 3/31/03 12:28PM
163:Ouch! 4/20/03 3:08AM
164:Foundations with (almost) no axioms, 4/22/0 5:31PM
165:Incompleteness Reformulated 4/29/03 1:42PM
166:Clean Godel Incompleteness 5/6/03 11:06AM
167:Incompleteness Reformulated/More 5/6/03 11:57AM
168:Incompleteness Reformulated/Again 5/8/03 12:30PM
169:New PA Independence 5:11PM 8:35PM
170:New Borel Independence 5/18/03 11:53PM
171:Coordinate Free Borel Statements 5/22/03 2:27PM
172:Ordered Fields/Countable DST/PD/Large Cardinals 5/34/03 1:55AM
173:Borel/DST/PD 5/25/03 2:11AM
174:Directly Honest Second Incompleteness 6/3/03 1:39PM
175:Maximal Principle/Hilbert's Program 6/8/03 11:59PM
176:Count Arithmetic 6/10/03 8:54AM
177:Strict Reverse Mathematics 1 6/10/03 8:27PM
178:Diophantine Shift Sequences 6/14/03 6:34PM
179:Polynomial Shift Sequences/Correction 6/15/03 2:24PM
180:Provable Functions of PA 6/16/03 12:42AM
181:Strict Reverse Mathematics 2:06/19/03 2:06AM
182:Ideas in Proof Checking 1 10:50PM 6/21/03
183:Ideas in Proof Checking 2 5:48PM 6/22/03
Harvey Friedman
More information about the FOM
mailing list