[FOM] 184:Ideas in Proof Checking 3

Harvey Friedman friedman at math.ohio-state.edu
Mon Jun 23 17:58:11 EDT 2003

Before moving on to Free Predicate Calculus, I am going to revisit 
the material in #183 to give the user much more powerful options, and 
also make the computer work harder.

But please recall that I strongly recommend LEMMA ENRICHMENT, so that 
all proofs are as trivial as possible.

I have also changed terminology, using module instead of task.

Recall the general philosophy of the user/computer interaction. It is 
a unidirectional effort to get ABSURDITIES. Therefore no mistakes can 
be made, and there are no difficulties involved in excessive 
pointless generation of formulas by the computer - or even by the 
user. It is completely modularized, and unidirectional, with no side 
effects. The only danger is that things get too cluttered. This is 
solved by a double window system. Also there is an UNDO. The user 
cannot get lost in the proof. Importation is made unexpectedly 
friendly as we shall demonstrate.


The computer keeps a list of unfinished modules for the human.

At any time, a module is a nonempty finite set of signed formulas in 
propositional calculus. A module is said to be finished if and only 
if it contains some T(A),F(A). Otherwise, it is said to be unfinished.

Any module that is finished is removed from the list of unfinished 
modules. The goal is for the user to get all unfinished modules 
finished. This process is initiated by an initial module, which 
corresponds exactly to the statement being proved.

I.e., if the statement being proved is A, then the initial module 
consists of F(A). If the statement being proved takes the form

Assume A1,...,An. Then B.

then the initial module consists of


At any point in the user/computer interaction, there is exactly one 
active module, which is of course unfinished. The unfinished modules 
are kept in a numbered inventory of unfinished modules by the 
computer, and can be called by name or number through standard human 
interfaces for such purposes.

The active module presents a primary and a secondary window. (There 
may be a point to having more windows, but this is not yet clear). 
Both windows merely contain a list of signed formulas.

The contents of the primary window are entirely user controlled, 
whereas the contents of the secondary window are primarily computer 

The entire active module - which is just a nonempty finite set of 
signed formulas - is always partitioned into the signed formulas on 
the primary window and the signed formulas on the secondary window.

As time proceeds, the active module expands. It never contracts. 
However, the user can shift any signed formula from one window to 
another. Generally, the user will shift signed formulas from the 
primary window to the secondary window in order to keep the primary 
window uncluttered for the main work.

The computer works in the background, continually enlarging the 
secondary window. Much of what is generated is likely to be junk. The 
secondary window is displayed in order of increasing complexity, with 
the low complexity signed formulas first.

If the user browses the secondary window, he may find useful signed 
formulas, in which case he may wish to shift them from the secondary 
window to the primary window.

The background algorithm that the computer continually runs makes 
immediate use of the moves that the human makes on the primary 
window. Frequently, the computer will simply announced that the 
active module is finished. The module is then removed from the list 
of unfinished modules, and a new unfinished module is activated. New 
ones are chosen by name or number, or by default.

The heart of the user/computer interaction occurs when the user makes 
certain moves in the primary window of the active module.

These moves create zero or more new signed formulas in the primary 
window, and create zero or more new modules. These new modules are 
hidden from view, as they are not active. The unfinished among these 
are placed in the numbered list of unfinished modules right after the 
active module being worked on.

1. "I don't feel like finishing this task now". The user selects 
another unfinished module. The present state of the closed unfinished 
module is saved so that when it is next activated, the primary window 
comes up just as it was left. However, the secondary window may be 
enlarged because of the background work by the computer.

2. Select a signed formula in the primary window and hit m (for 
move). This moves the signed formula to the secondary window.

3. Select a signed formula in the secondary window and hit m. This 
moves the signed formula to the primary window.

4. The user looks up a previously established formula in the official 
record of the UMD3 (universal mathematical database, part 3). See 
items 1-8 in #182, 10:50PM 6/21/03. Say it is A. The user selects A 
and hits =. The signed formula T(A) appears in the primary window. If 
the previously established result is in implicational form, then it 
must be converted into a single formula A, and imported as T(A).

We now list a number of very useful REDUCTION RULES that create new 
signed formulas in the primary window, upon the selection of signed 
formulas in the primary window. These rules do not create any new 
unfinished modules. They are preferable to the creation of new 
unfinished modules.

The rules are activated by selecting the indicated signed formula(s) 
from the primary window, and hitting r (reduce). The indicated signed 
formula(s) appears in the primary window. The selected signed formula 
remains. Of course, the user can then activate 2 if so desired.

R1. T(notA). F(A) created.

R2. F(notA). T(A) created.

R3. T(A1 and ... and An). T(A1),...,T(An) created.

R4. F(A1 or ... or An). F(A1),...,F(An) created.

R5. F(A implies B). T(A),F(B) created.

R6. T(A iff B). T(A implies B), T(B implies A) created.

R7. T(A implies B), T(A). T(B) created.

R8. T(A implies B), F(B). F(A) created.

R9. F(A iff B). T(A implies notB), T(B implies notA) created.

R14. T(A implies notB). T(notA implies B) created.

R15. T(notA implies B). T(A implies notB) created.

R16. T(notA implies notB). T(B implies A) created.

R17. T(A1 or ... or An) and one or more F(Ai). T(B1 or ... or Bk) 
created, where the B's are the remaining A's. If there are no 
remaining A's, then the present module is declared finished.

R18. F(A1 and ... and An) and one or more T(Ai). F(B1 and ... and Bk) 
is created, where the B's are the remaining A's. If there are no 
remaining A's, then the present module is declared finished.

R19. T(A1 implies A2), T(A2 implies A3), ..., T(An implies An+1), 
n >= 2. T(A1 implies An+1) created.

R20. T(A1 iff A2), T(A2 iff A3), .., T(An iff An+1), n >= 2. T(A1 iff 
An+1) created.

R21. T(A implies B), T(B implies C), T(C implies A). T(B implies A), 
T(C implies B), T(C implies A) created.

We also have a few buildup rules.

B1. Select T(A1),...,T(An), n >= 2. Hit c (conjunction). T(A1 and ... 
and An) created.

B2.  Select T(A) and hit d (disjunction). Dialog box comes up which 
allows user to enter a disjunction, alpha, that includes A as a 
disjunct. T(alpha) created.

B3. Select T(A) and hit n (negation). F(notA) created.

B4. Select F(A) and hit n (negation). T(notA) created.

B5. Select T(A) and hit i (implication). Dialog box comes up which 
allows user to enter a formula B. T(B implies A) created.

B6. Select F(A) and hit i (implication). Dialog box comes up which 
allows user to enter a formula B. T(A implies B) created.

We now come to the rules that create new unfinished modules. In each 
case, make the selection indicated and hit s (split). "Create" always 
means placing it on the primary window. All new unfinished modules 
are given by the module before the split instruction ogether with the 
signed formula(s) indicated.

S1. T(A1 or ... or An). Create T(A1) for this unfinished module. Also 
create n-1 new unfinished modules for each of T(A2),...,T(An).

S2. F(A1 and ... and An). Create F(A1) for this unfinished module. 
Also create n-1 new unfinished modules for each of F(A2),...,F(An).

S3. T(A implies B). Create T(B) for this unfinished module. Also 
create a single new unfinished module for F(A).

S4. F(A iff B). Create T(A), F(B) for this unfinished module. Also 
create a single new unfinished module for T(B), F(A).

We now come to the WILD CARD rule. Again, create means placing it on 
the primary window.

W. Hit w. A dialog box comes up asking for a signed formula, say 
T(A). Create T(A). Also create a new unfinished module for F(A). If 
the signed formula entered is F(A), then create F(A), and also create 
a new unfinished module for T(A).

The point of rule W is the following. The user knows exactly how to 
proceed if only he knew T(A). But there is no immediate way to get 
T(A). So the user enters T(A) anyways, and happily proceeds, while 
postponing the difficultly until later, in the form of a new 
unfinished module.

A very typical use of the WILD CARD is when the user is in hot 
pursuit of getting the desired absurdity to finish the module, but 
needs some obvious facts that have been proved previously. Maybe it's 
a matter of some algebra. It's difficult to predict exactly all the 
algebra needed before the user starts on a given module, and 
especially before the user starts on the proof. The user can then use 
any algebra he wants to, knowing that the computer will later ask for 
a justification of these algebraic manipulations in the form of 
additional unfinished modules. The user can pick a time when he is in 
the mood for dealing with such boring matters, choosing to finish 
more interesting modules in the meantime.

Of course, sometimes the wild card rule is used in a more critical 
way, arguing from the truth and separately the falsity of a crucial 
assertion that only the user would naturally come up with.


The computer will do two kinds of computing in the background.

a, Every time a module gets finished - i.e., an absurdity is 
generated, there is an origin module that is sufficient to generate 
that absurdity. The computer keeps track of all minimum generating 
modules that generate absurdity. These are available to the user for 
browsing. To avoid duplication of work, the computer checks to see if 
any of these minimum generating modules are contained in the entries 
in an unfinished module. If this is the case, then that unfinished 
module is considered finished.

The browsing by the user of these minimum generating modules can give 
the user ideas as to good uses of the wild card rule.

b. The computer does a restricted proof search on each and every 
unfinished module, including the active one. This restricted proof 
search is limited to rules R1-R21, S1-S4, and also rules B1-B6, W, 
but only if using the results of applying B1-B6, W which are 
subformulas of the formulas that are present in the unfinished module 
being worked on by the computer (not necessarily just the active one 
being worked on by the user). As the user adds new formulas to active 
unfinished modules, the computer can more liberally apply these rules 
since now some results of B1-B6, W are subformulas whereas they 
weren't before.

The combinatorial explosion inherent in b needs to be managed by some 
protocols that are a bit unclear at this point. However, recall that 
the user is following a LEMMA RICH development, so that proofs are 
generally fairly trivial, else a new Lemma should be inserted. Also, 
the user has a very powerful set of unidirectional, completely 
modular, intuitively clear, set of commands at his disposal.

3. UNDO.

The very best way to implement the most powerful UNDO is too early to 
specify at this time. We generally may want to save the results of 
the high powered background computation that is going on, and which 
may be generating a lot of data, and would perhaps take too much time 
to reassemble. On the other hand, there is a crude UNDO that might be 
useful to have, if the user is willing to wait a bit. Up in the air.


This posting is long enough, with enough ideas to chew on. In the 
next posting, I will take up the ordinary predicate calculus with 
equality. The ideas here extend naturally. The additional ideas 
concern terms and equality.


I use http://www.mathpreprints.com/math/Preprint/show/ for manuscripts with
proofs. Type Harvey Friedman in the window.
This is the 184th in a series of self contained numbered postings to 
FOM covering a wide range of topics in f.o.m. The list of previous 
numbered postings #1-149 can be found at 
http://www.cs.nyu.edu/pipermail/fom/2003-May/006563.html  in the FOM 
archives, 5/8/03 8:46AM. Previous ones counting from #150 are:

150:Finite obstruction/statistics  8:55AM  6/1/02
151:Finite forms by bounding  4:35AM  6/5/02
152:sin  10:35PM  6/8/02
153:Large cardinals as general algebra  1:21PM  6/17/02
154:Orderings on theories  5:28AM  6/25/02
155:A way out  8/13/02  6:56PM
156:Societies  8/13/02  6:56PM
157:Finite Societies  8/13/02  6:56PM
158:Sentential Reflection  3/31/03  12:17AM
159.Elemental Sentential Reflection  3/31/03  12:17AM
160.Similar Subclasses  3/31/03  12:17AM
161:Restrictions and Extensions  3/31/03  12:18AM
162:Two Quantifier Blocks  3/31/03  12:28PM
163:Ouch!  4/20/03  3:08AM
164:Foundations with (almost) no axioms, 4/22/0  5:31PM
165:Incompleteness Reformulated  4/29/03  1:42PM
166:Clean Godel Incompleteness  5/6/03  11:06AM
167:Incompleteness Reformulated/More  5/6/03  11:57AM
168:Incompleteness Reformulated/Again 5/8/03  12:30PM
169:New PA Independence  5:11PM  8:35PM
170:New Borel Independence  5/18/03  11:53PM
171:Coordinate Free Borel Statements  5/22/03  2:27PM
172:Ordered Fields/Countable DST/PD/Large Cardinals  5/34/03  1:55AM
173:Borel/DST/PD  5/25/03  2:11AM
174:Directly Honest Second Incompleteness  6/3/03  1:39PM
175:Maximal Principle/Hilbert's Program  6/8/03  11:59PM
176:Count Arithmetic  6/10/03  8:54AM
177:Strict Reverse Mathematics 1  6/10/03  8:27PM
178:Diophantine Shift Sequences  6/14/03  6:34PM
179:Polynomial Shift Sequences/Correction  6/15/03  2:24PM
180:Provable Functions of PA  6/16/03  12:42AM
181:Strict Reverse Mathematics 2:06/19/03  2:06AM
182:Ideas in Proof Checking 1 10:50PM  6/21/03
183:Ideas in Proof Checking 2 5:48PM  6/22/03

Harvey Friedman

More information about the FOM mailing list