# [FOM] 183:Ideas in Proof Checking 2

Harvey Friedman friedman at math.ohio-state.edu
Sun Jun 22 17:48:58 EDT 2003

```In #182, 10:50PM 6/21/03, I promised the following. Our IPC
(interactive proof system) will be driven by

1) a truly appropriate language for the formalization of mathematics;
2) model trees (Beth tableaux);
3) trivial proofs;
4) interaction protocols for constructing nontrivial proofs.

In this posting I will talk about 2 and 4, at least to the extent
that I can before I take up 1.

PS: This posting got too long, and some of 2,4 is spilling over into
posting #184.

I think that the right way of dealing with 3 will only be apparent
after 1 is specified, and I work out some examples. I have a lot of
confidence that a good notion of 3 will emerge. I.e.,

THESIS. A naturally occurring mathematical statement either has a
trivial proof, or there are appropriate naturally associated
mathematical statements in the form of Lemmas that naturally precede
it that cause it to have a trivial proof. Furthermore, trivial proofs
can be trivially obtained from the computer, relative to obvious
imports from the human.

Lemma rich exposition is not the norm among mathematicians. But some
of us do practice this. It is a rather effective form of exposition,
and leads to more error free publications. It does require extra
effort, but it is well worth it for other humans. Now we have another
use of it - the interactive creation of absolute certainty.

I'm not sure that an analysis of trivial proofs is so critical. My
plan is to make the human/machine interaction in the proof system
being designed so easy and simple, that it is actually a lot of fun,
especially on theorems with trivial proofs.

I am well aware that the necessary presence of a lot of possibly
undefined terms can make things more complicated than otherwise.
However, I believe that I have enough user interface tricks to make
this painless.

1. MODEL TREES (BETH TABLEAUX) - TOYS.

In this section, we give the theory behind the IPC proposal.

To make the ideas clear, I will confine the discussion to the toy
system of ordinary propositional calculus, with multiple conjunction
and disjunction. I will also continue with this toy case also in the
next section, which is design o the IPC.

I will then redo everything for the semitoy system of ordinary
predicate calculus with equality. Then I will address the semiserious
system of free predicate calculus. Any further discussion will be
postponed until 1) above is addressed: i.e., the truly appropriate
language for the formalization of mathematics.

We have atoms pn, n >= 1. We use connectives not, and, or, implies, iff.

We define formulas as follows.

i. pn is a formula.
ii. if A is a formula then (notA) is a formula.
iii. if A1,...,An are formulas, n >= 2, then (A1 or ... or An) and
(A1 and ... and An) are formulas.
iv. if A,B are formulas then (A implies B) and (A iff B) are formulas.

We will take up the matter of precedence and parentheses when we get
to the real thing, later.

The extended formulas are

T(A)
F(A).

The first means that "we want A to be true". The second means that
"we want A to be false".

Suppose that we want to prove A. We build a model tree for F(A).

This is an attempt to find a model (truth assignment) in which A is false.

If A is provable, then we are going to fail. This corresponds to the
model tree closing off, in the sense that all of the leaves are
absurd.

We are going to nondeterministically create a tree going upward from the root.

At any stage, we will have a nonempty tree in which every node is
labeled with a finite set of extended formulas.

There are four kinds of moves that can be made at any stage.

i. Two or more new nodes are created which lie just above some
existing terminal node. The previously terminal node is not terminal
any more. One or more formulas are placed in each of these new nodes.
This is called a splitting.

ii. No new nodes are created. Instead, for one of the nodes, some

iii. Certain terminal nodes are marked as absurd.

The model tree is declared ABSURD if and only if all terminal nodes
are marked as absurd.

Generally, the strategy is to attempt to get halting with as few
splits as possible. Generally, this will cut down the size of the
tree.

Here are the basic rules.

For iii, a terminal node is marked absurd if and only if when looking
along the path from the root up to and including the terminal node.
one finds some T(B) and F(B). They may be from different nodes.

For ii. Any one of the following is a legal move.

a. Add F(B) if T(notB) is present.
b. Add T(B) if F(notB) is present.
c. Add any of T(A1),...,T(An) if T(A1 and ... and An) is present.
d. Add any of F(A1),...,F(An) if F(A1 or ... or An) is present.
e. Add any of T(A),F(B) if F(A implies B) is present.
f. Add  any of T(A implies B),T(B implies A) if T(A iff B) is present.

For i. We first identify a terminal node. Then we identify a nonempty
list from the forms not covered in ii above, that lie on the path
from the root to the terminal node:

T(A1 or ... or An)
F(A1 and ... and An)
T(A implies B)
F(A iff B).

If the list has k entries, then we will have a split whose size is
the product of the arities of the k entries. The arity of the above
entries are, respectively, n,n,2,2.

We will be content to give the details in a simple case, leaving the

Suppose

T(A or B)
F(C iff D)

have been identified on the path from the root to the terminal node.

Then we split into four new terminal nodes, which are respectively labeled with

T(A),T(C),F(D)
T(A),T(D),F(C)
T(B),T(C),F(D)
T(B),T(D),F(C).

THEOREM 1.1. If there is an absurd model tree starting with the root
labeled F(A), then A is provable (is a tautology). Conversely, if A
is provable (is a tautology), then there is an absurd model tree
starting with the root labeled F(A).

It is also useful to have the following sharper form.

THEOREM 1.2. A1,...,An tautologically implies B if and only if there
is an absurd model tree starting with T(A1),...,T(An),F(B).

When giving examples of model trees, it is helpful to delete entries
so the tree looks less cluttered.

EXAMPLE. ((p implies q) implies p) implies p.

1. F(((p implies q) implies p) implies p).
1. T((p implies q) implies p), F(p).
1.1. F(p implies q)).
1.2. T(p). ABSURD.
1.1. T(p), F(q). ABSURD.

EXPLANATION. Each line represents a move. The nodes are 1, 1.1, 1.2.
The first line is the initialization of the problem. The second line
adds a formula, and then deletes the original formula. (Any deletions
are OK). The third and fourth lines are a binary split. The node 1.2
is recognized as absurd. The fifth line adds a formula to node 1.1,
and deletes the formula in line 3. Then node 1.1 is recognized as
absurd. Finally, all terminal nodes are recognized as absurd.

There is an important additional optional move we add to the mix.

vi. Split at any time in any legal way on any terminal node.

Here the splitting is given by a list of distinct formulas,
A1,...,An. There are 2^n new terminal nodes placed above the given
terminal node. Each new node is labeled with a different set of n
formulas, which are just the A's with T's and F's put in front.

THEOREM 1.3. Theorems 1.1 and 1.2 still hold with this new move vi allowed.

Rule vi is very useful - even ESSENTIAL - in the full blown real
thing we are designing. Here is the reason.

The reason is to create manageable imports. There are also other
important reasons.

I used to think that at the outset of the proof of a statement, one
should first do all of the needed imports.

This is awkward, unfriendly, and difficult to do completely.

What instead you want to do is to go ahead and make these model tree
moves. Then at some point you say

"gee, if only I knew this, I could do what I want now."

So you just simply enter what you need. Don't worry at all about the
fact that what you need has not been established. Since the proof you
are constructing interactively is supposed to be nearly trivial, you
know that it is going to be easy to justify this entry when you feel
like it at a later time. The order in which things are done is of no
consequence, as long as they all get done.

And you don't have to remember that there is something left over. The
computer keeps an inventory of model trees that you later need to
close off. At some point, you just complete all these little tasks
that the computer asks you to do, in the form of ABSURDITY CREATION.
All organization of tasks is done by the computer.

The computer is even good at looking up whether you have entered
something that really is just a legal import. Then the computer won't

In fact, all the time you spend at the computer slowly making your
mouse clicks and entering some terms, the computer is in the
background trying to close off the model tree all by itself, and also
justifying the stuff you have left over for later - also, possibly by
closing off other model trees. You may be pleasantly surprised that
the computer is working very effectively as you are working. Even
many nodes - maybe not all nodes - that you thought you would have to
work on, will have already been branded as absurd by the computer.

NOTE: Now it becomes clear that a proper workable notion of TRIVIAL
PROOF is, after all, important. This is what the computer will look
for emanating from any newly created terminal node.

2. IMPLEMENTATION OF MODEL TREES - TOYS.

It is nasty to look at a tree displayed on the screen. That is why,
in my implementation of the previous section, I will protect the
human entirely from actual model trees.

But at what cost? No cost!

The computer keeps a list of tasks for the human. While the human is
working on these tasks, the computer tries to solve its inventory of
tasks not yet solved on its own. Any task that it solves on its own
is removed from the inventory. Between the human and the computer,
hopefully the inventory is reduced to nothing.

All tasks have the following form. A list of signed formulas are
given. The human is supposed to read them and see that they lead to
absurdity. Of course, if the list contains a T(B),F(B), then the
computer won't even bother to present this task (list). It will have

So the human is presented with a nonempty list of signed formulas.
The human can take any of the following actions.

1. I don't feel like working on this task now. Give me another one.

2. Click on T(A1 and ... and An). Then T(A1),...,T(An) appear on the list.

3. Click on F(A1 or ... or An). Then F(A1),...,F(An) appear on the list.

4. Click on F(A implies B). Then T(A), F(B) appear on the list.

5. Click on T(A iff B). Then T(A implies B), T(B implies A) appear on the list.

6. Click on T(notA). Then F(A) appears on the list.

7. Click on F(notA). Then T(A) appears on the list.

8. Click on T(A1 or ... or An). Then T(A1) appears on the list. Also
the computer stores the present list appended with, separately,
T(A2),...,T(An), thereby hiding from view n-1 tasks, all of which
will be eventually sprung on the human at later times.

9. Click on F(A1 and ... and An).  Then T(A1) appears on the list.
Also the computer stores the present list appended with, separately,
F(A2),...,F(An), thereby hiding from view n-1 tasks, all of which
will be eventually sprung on the human at later times.

10. Click on T(A implies B).  Then T(B) appears on the list. Also the
computer stores the present list appended with F(A), separately,
thereby hiding from view 1 task, which will be eventually sprung on
the human at a later time.

11. Click on F(A iff B).  Then T(A),F(B) appears on the list. Also
the computer stores the present list appended with F(A),T(B)
separately, thereby hiding from view 1 task, which will be eventually
sprung on the human at a later time.

12. Doubleclick on any entry. The entry disappears.

13. You can restore the disappearing entries from 8, one at a time.

The idea behind 12 is to make things less cluttered. The computer
always considers the disappeared entries as part of the list.
However, they will always remain marked to disappear from view,
unless 13, 14 instructs them to be put back into view.

Tasks are considered completed when and only when an absurdity,
T(B),F(B) appears.

There is one more move of essential importance.

14. The user can simply pick any extended formula out of the air, and
put it on the list. Presumably this makes the task easier - that is
the point of putting it on the list. The cost is that the computer
then stores the following task to be sprung later on the human: the
same list but with the new extended formula dualized.

I.e., the human wishes to add the new formula T(A) to the list L.
Then the computer stores the task consisting of: L together with F(A)
for later. Or the human adds the new formula F(A) to L. Then the
computer stores the task: L together with T(A).

Of course this is an obvious waste of time if T(A) or F(A) is already
on the list. The user is prevented from such foolishness.

This is also subject to the undo operation.

This last move occurs in the following typical situation. The human
realizes that if only they had T(A) then they would know just how to
proceed, and they have confidence that they can get T(A) when asked
to. They will get asked to in the form of being presented with the
associated task with F(A), but a later time when they are ready to

That's it. If the Lemma is A, then the original task is

F(A).

More generally, suppose that the Lemma takes the form:

Assume A1,...,An. Then B.

T(A1),...,T(An),F(B).

I want to indicate how the computer stores the not yet completed tasks.

At any stage, the computer has a list of one or more not yet
completed tasks. Recall that a task gets completed when and only when
an absurdity becomes present in the form of some T(B),F(B).

Each task is merely a nonempty finite set of signed formulas.
However, we want more data. We mark some of them as disappearing from
view, and we assign a number to them, so that we can implement 14
above. This extra data is maintained as the list changes, so that
when the list is revisited because of, say, 1, we have maintained the
instructions on how to implement 14 above.

We indicate what the computer does according to moves 1-14.

First of all, the computer keeps a count of the tasks not yet
completed, and displays this to the human. It also displays the
position of the present task being displayed, in this list. E.g., the
computer might have 18 incompleted tasks, and the one on display is
number 7.

1. The present task is put at the end of the list of incomplete tasks
(default), or at a position specified by the human by number.

8-11. The new hidden tasks become the very next task (default), or is
put into the list of incomplete tasks at the position specified by
the human.

12. The entry is put at the end of the list of disappearing entries

13. Make them appear, one at a time, from the back of the list of
disappearing entries.

14. The new hidden task becomes the very next task (default), or is
put at a position specified by the human by number.

If a task becomes completed - i.e., gets an absurdity of the form
T(B),F(B), then it is deleted from the list of incomplete tasks.

The object of the human is to get all tasks deleted.

3. MODEL TREES - PREDICATE CALCULUS.

We use variables xn, n >= 1. We use the usual connectives not, and,
or, implies, iff. We use the quantifiers forall, therexists. We use
constant symbols cn, n >= 1, n-ary relation symbols Rnm, n,m >= 1,
and n-ary function symbols Fnm, n,m >= 1. No equality in this version.

We are going to nondeterministically create a tree going upward from the root.

At any stage, we will have a nonempty tree in which every node is
labeled with a finite set of extended formulas.

There are four kinds of moves that can be made at any stage.

i. Two or more new nodes are created which lie just above some
existing terminal node. The previously terminal node is not terminal
any more. One or more formulas are placed in each of these new nodes.
This is called a splitting.

ii. No new nodes are created. Instead, for one of the nodes, some

iii. Certain terminal nodes are marked as absurd.

The model tree is declared ABSURD if and only if all terminal nodes
are marked as absurd.

Generally, the strategy is to attempt to get halting with as few
splits as possible. Generally, this will cut down the size of the
tree.

Here are the basic rules.

For iii, a terminal node is marked absurd if and only if when looking
along the path from the root up to and including the terminal node.
one finds some T(B) and F(B'), where B' is an alphabetic variant of B
(change of bound variables). They may be from different nodes.

For ii. Any one of the following is a legal move.

a. Add F(B) if T(notB) is present.
b. Add T(B) if F(notB) is present.
c. Add T(A1),...,T(An) if T(A1 and ... and An) is present.
d. Add F(A1),...,F(An) if F(A1 or ... or An) is present.
e. Add T(A),F(B) if F(A implies B) is present.
f. Add T(A implies B),T(B implies A) if T(A iff B) is present.
g. Add T(A[x/t]) if T((forall x)(A)) is present and t is
substitutable for x in A.
h. Add F(A[x/t]) if F((therexists x)(A)) is present and t is
substitutable for x in A.
i. Add T(A[x/v]) if T((therexists x)(A)) is present and v is
substitutable for x in A and v is not free in the formulas from the
root up through the node.
j. Add F(A[x/v]) if F((forall x)(A)) is present and v is
substitutable for x in A and v is not free in the formulas from the
root up through the node.

For i. we first identify a terminal node. Then we identify a nonempty
list from the forms not covered in ii above, that lie on the path
from the root to the terminal node:

T(A1 or ... or An)
F(A1 and ... and An)
T(A implies B)
F(A iff B).

If the list has k entries, then we will have a split whose size is
the product of the arities of the k entries. The arity of the above
entries are, respectively, n,n,2,2.

THEOREM 3.1. A1,...,An logically implies B if and only if there is an
absurd model tree starting with T(A1),...,T(An),F(B).

We can also allow splitting on an arbitrary formula at any stage, and
things are still OK.

4. IMPLEMENTATION OF MODEL TREES - PREDICATE CALCULUS.

The computer keeps a list of tasks for the human. While the human is
working on these tasks, the computer tries to solve its inventory of
tasks not yet solved on its own. Any task that it solves on its own
is removed from the inventory. Between the human and the computer,
hopefully the inventory is reduced to nothing.

All tasks have the following form. A list of signed formulas are
given. The human is supposed to read them and see that they lead to
absurdity. Of course, if the list contains a T(B),F(B), then the
computer won't even bother to present this task (list). It will have

So the human is presented with a nonempty list of signed formulas.
The human can take any of the following actions.

1. I don't feel like working on this task now. Give me another one.

2. Click on T(A1 and ... and An). Then T(A1),...,T(An) appear on the list.

3. Click on F(A1 or ... or An). Then F(A1),...,F(An) appear on the list.

4. Click on F(A implies B). Then T(A), F(B) appear on the list.

5. Click on T(A iff B). Then T(A implies B), T(B implies A) appear on the list.

6. Click on T(notA). Then F(A) appears on the list.

7. Click on F(notA). Then T(A) appears on the list.

8. Click on T(A1 or ... or An). Then T(A1) appears on the list. Also
the computer stores the present list appended with, separately,
T(A2),...,T(An), thereby hiding from view n-1 tasks, all of which
will be eventually sprung on the human at later times.

9. Click on F(A1 and ... and An).  Then T(A1) appears on the list.
Also the computer stores the present list appended with, separately,
F(A2),...,F(An), thereby hiding from view n-1 tasks, all of which
will be eventually sprung on the human at later times.

10. Click on T(A implies B).  Then T(B) appears on the list. Also the
computer stores the present list appended with F(A), separately,
thereby hiding from view 1 task, which will be eventually sprung on
the human at a later time.

11. Click on F(A iff B).  Then T(A),F(B) appears on the list. Also
the computer stores the present list appended with F(A),T(B)
separately, thereby hiding from view 1 task, which will be eventually
sprung on the human at a later time.

12. Click on T((forall x)(A)). Then the computer asks which term
should be used. The human replies with t. The computer determines
whether t is substitutable for x in A. If so, then T(A[x/t]) appears
in the list. If not, the human is notified, and is asked to pick
another term.

13. Click on F((therexists x)(A)). Then the computer asks which term
should be used for x. The human replies with t. The computer
determines whether t is substitutable for x in A. If so, then
F(A[x/t]) appears in the list. If not, the human is notified, and is
asked to pick another term. The default choice is x.

14. Click on T((therexists x)(A)). Then the computer asks for a
variable v to be used for x. The human replies with v. The computer
determines whether v is substitutable for x in A and v is not free in
the other formulas in the list. If so, then T(A[x/v]) appears on the
list. If not, the human is notified, and is asked to pick another
variable. The default choice is x.

15. Click on F((forall x)(A)). Then the computer asks for a variable
v to be used for x. The human replies with v. The computer determines
whether v is substitutable for x in A and v is not free in the other
formulas in the list. If so, then F(A[x/v]) appears on the list. If
not, the human is notified, and is asked to pick another variable.
The default choice is x.

16. Doubleclick on any entry. The entry disappears.

17. You can restore the disappearing entries from 16, one at a time.

Tasks are considered completed when and only when an absurdity,
T(B),F(B') appears, where B,B' are alphabetic variants.

There is one more move of essential importance.

18. The user can simply pick any extended formula out of the air, and
put it on the list. Presumably this makes the task easier - that is
the point of putting it on the list. The cost is that the computer
then stores the following task to be sprung later on the human: the
same list but with the new extended formula dualized.

I.e., the human wishes to add the new formula T(A) to the list L.
Then the computer stores the task consisting of: L together with F(A)
for later. Or the human adds the new formula F(A) to L. Then the
computer stores the task: L together with T(A).

Of course this is an obvious waste of time if T(A) or F(A) is already
on the list. The user is prevented from such foolishness.

This is also subject to the undo operation.

This last move occurs in the following typical situation. The human
realizes that if only they had T(A) then they would know just how to
proceed, and they have confidence that they can get T(A) when asked
to. They will get asked to in the form of being presented with the
associated task with F(A), but a later time when they are ready to

That's it. If the Lemma is A, then the original task is

F(A).

More generally, suppose that the Lemma takes the form:

Assume A1,...,An. Then B.

T(A1),...,T(An),F(B).

I want to indicate how the computer stores the not yet completed tasks.

At any stage, the computer has a list of one or more not yet
completed tasks. Recall that a task gets completed when and only when
an absurdity becomes present in the form of some T(B),F(B).

Each task is merely a nonempty finite set of signed formulas.
However, we want more data. We mark some of them as disappearing from
view, and we assign a number to them, so that we can implement 14
above. This extra data is maintained as the list changes, so that
when the list is revisited because of, say, 1, we have maintained the
instructions on how to implement 14 above.

We indicate what the computer does according to moves 1-18.

First of all, the computer keeps a count of the tasks not yet
completed, and displays this to the human. It also displays the
position of the present task being displayed, in this list. E.g., the
computer might have 18 incompleted tasks, and the one on display is
number 7.

1. The present task is put at the end of the list of incomplete tasks
(default), or at a position specified by the human by number.

8-11. The new hidden tasks become the very next task (default), or is
put into the list of incomplete tasks at the position specified by
the human.

16. The entry is put at the end of the list of disappearing entries

17. Make them appear, one at a time, from the back of the list of
disappearing entries.

18. The new hidden task becomes the very next task (default), or is
put at a position specified by the human by number.

If a task becomes completed - i.e., gets an absurdity of the form
T(B),F(B'), where B' is an alphabetic variant of B, then it is
deleted from the list of incomplete tasks.

The object of the human is to get all tasks deleted.

In the next installment, we will consider free predicate calculus,
which is predicate calculus with equality and undefined terms.

*********************************************

I use http://www.mathpreprints.com/math/Preprint/show/ for manuscripts with
proofs. Type Harvey Friedman in the window.
This is the 183rd in a series of self contained numbered postings to
FOM covering a wide range of topics in f.o.m. The list of previous
numbered postings #1-149 can be found at
http://www.cs.nyu.edu/pipermail/fom/2003-May/006563.html  in the FOM
archives, 5/8/03 8:46AM. Previous ones counting from #150 are:

150:Finite obstruction/statistics  8:55AM  6/1/02
151:Finite forms by bounding  4:35AM  6/5/02
152:sin  10:35PM  6/8/02
153:Large cardinals as general algebra  1:21PM  6/17/02
154:Orderings on theories  5:28AM  6/25/02
155:A way out  8/13/02  6:56PM
156:Societies  8/13/02  6:56PM
157:Finite Societies  8/13/02  6:56PM
158:Sentential Reflection  3/31/03  12:17AM
159.Elemental Sentential Reflection  3/31/03  12:17AM
160.Similar Subclasses  3/31/03  12:17AM
161:Restrictions and Extensions  3/31/03  12:18AM
162:Two Quantifier Blocks  3/31/03  12:28PM
163:Ouch!  4/20/03  3:08AM
164:Foundations with (almost) no axioms, 4/22/0  5:31PM
165:Incompleteness Reformulated  4/29/03  1:42PM
166:Clean Godel Incompleteness  5/6/03  11:06AM
167:Incompleteness Reformulated/More  5/6/03  11:57AM
168:Incompleteness Reformulated/Again 5/8/03  12:30PM
169:New PA Independence  5:11PM  8:35PM
170:New Borel Independence  5/18/03  11:53PM
171:Coordinate Free Borel Statements  5/22/03  2:27PM
172:Ordered Fields/Countable DST/PD/Large Cardinals  5/34/03  1:55AM
173:Borel/DST/PD  5/25/03  2:11AM
174:Directly Honest Second Incompleteness  6/3/03  1:39PM
175:Maximal Principle/Hilbert's Program  6/8/03  11:59PM
176:Count Arithmetic  6/10/03  8:54AM
177:Strict Reverse Mathematics 1  6/10/03  8:27PM
178:Diophantine Shift Sequences  6/14/03  6:34PM
179:Polynomial Shift Sequences/Correction  6/15/03  2:24PM
180:Provable Functions of PA  6/16/03  12:42AM
181:Strict Reverse Mathematics 2:06/19/03  2:06AM
182:Ideas in Proof Checking 1 10:50PM  6/21/03

Harvey Friedman
```