[FOM] FTGI1:Classical Propositional Calculus

Harvey Friedman friedman at math.ohio-state.edu
Mon Apr 21 01:06:38 EDT 2003


As some of you have noticed, I have been making two kinds of postings:

1. Numbered postings, currently numbered 1 through 163. These are, 
generally, technical abstracts regarding my recent research in f.o.m.

2. Other postings.

I want to start a new series of numbered postings with a particular 
purpose. The series will be consecutively numbered with the prefix 
FTGI.

FTGI stands for "fundamental topics of general interest".

The idea is to *systematically* choose fundamental topics of general 
interest in the foundations of mathematics, mention some background 
and basic results, and

***ASK THE READERSHIP TO CONTRIBUTE THEIR OWN EXPLANATIONS, COMMENTS, 
REFERENCES, AND/OR POINTERS TO THE LITERATURE REGARDING THEM. (It 
would be best to use the reply option in order to make the thread 
form of the archives most useful for readers of the archives).***

As opposed to the numbered postings 1 - 163, I am not choosing these 
topics on the basis of my research on them. They are chosen for their 
fundamental nature and their general interest. Many people on the FOM 
know a lot more than I do about many aspects of these topics.

The hope is that over time, this becomes a particularly valuable 
resource concerning f.o.m. for anyone who is open to learning or 
thinking more about f.o.m. Hopefully this will become of great value 
as a resource for professionals and students alike - as a guide to 
literature they may not be (adequately) familiar with - and as a 
source of possible research topics for themselves and/or students. 
This includes possibly students at the undergraduate level.

I will endeavor to make sure that this does not degenerate into an 
unstructured collection of references to the technical literature. 
The key words are "fundamental" and "general interest".

Each FTGI topic will likely occupy several different FTGI postings. 
In particular, I will likely comment in subsequent postings regarding 
some of the postings by the subscribers that are made on these FTGI 
topics.

#####################

The first FTGI topic is Classical Propositional Calculus, abbreviated 
CPC. When we get to classical quantificational calculus some time 
from now, we will use the abbreviation CQC.

CPC is based on the classical logical connectives not, and, or, if 
then, if and only if.

An examination of mathematics reveals that these five connectives are 
used over and over again. And no other connectives are used.

Some comments.

1. This is not strictly true. In mathematics, one sometimes sees 
things like "the following are equivalent", or "neither nor", or 
"exactly one of the following holds". But one sees these five 
overwhelmingly, and so, for f.o.m., it appears that we might as 
well go ahead and ignore these exceptions.

2. Mathematics appears to be crippled if we don't allow all five of 
these. The removal of some of these is more crippling than others.

3. If we remove not, then the challenge is to formalize mathematics 
without negation. There is a sense in which this can be 
straightforwardly achieved. We cannot discuss this in any detail 
until we have come to discuss at least the quantificational calculus.

4. Nevertheless, there seems to be a cost in doing this, as 
mathematicians find it very convenient to use "not". Is there 
anything to be gained by removing "not"? Again, we can't go deeply 
into this until we have at least the quantificational calculus.

5. What are the statistics on the observed distribution of uses of 
these five connectives in the mathematical literature? Does it vary 
from field of mathematics to field of mathematics? Does it vary in a 
given field as time passes? Does it vary in mathematics as a whole as 
time passes? Can one get a reasonably robust count? Should one 
concentrate on the distribution in the statements of theorems only? 
Does it differ significantly from author to author expounding the 
same material?

6. If we remove "if then", then it appears that this is so crippling 
as to make mathematics undoable in the practical sense. Of course, we 
can replace "if A then B" with "notA or B", or with "not(A and 
notB)". But this is too horrible to contemplate using. The following 
question naturally arises.

7. Is there any formal reason why removing "if then" is 
particularly crippling? One immediate "answer" is that any equivalent 
of if then in terms of not, or, and, iff, must involve the use of two 
distinct connectives from these remaining four. But can this answer 
be improved?

We now come to the question as to just what not, and, or, if then, iff, mean.

The usual explanation of "not" goes something like this. not(B) 
counts as true if and only if B does not count as true. However, 
notice that I have used "not" in this explanation of "not".

Is this apparent circularity a real problem? Does the display of the 
truth table for negation resolve any problem that arises?

In order for the truth table for "not" to help, one seems to have to 
explain just what the truth table means. The idea is that it displays 
the truth value of not(B) in terms of the truth value of B.

But one has to first explain truth values. The usual explanation begins with

*every sentence has a truth value, which is either T or F, but not both.*

But does this explanation have a circularity in that it uses "or" and 
uses "not"?

The standard view is that one must start somewhere, and the idea of 
truth tables looks like a good place.
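
To fix ideas, here is a minimal sketch, in Python, of the truth 
tables just discussed. The names are mine, purely for illustration; 
note that the circularity discussed above is on full display, since 
Python's own not, and, or are used to define the connectives.

from itertools import product

# The five classical connectives as truth functions over the two
# truth values True (T) and False (F).  Note the circularity: the
# metalanguage's own "not", "and", "or" appear on the right.
CONNECTIVES = {
    "not":     lambda a: not a,
    "and":     lambda a, b: a and b,
    "or":      lambda a, b: a or b,
    "implies": lambda a, b: (not a) or b,
    "iff":     lambda a, b: a == b,
}

def print_table(name):
    """Display the truth table for one connective."""
    f = CONNECTIVES[name]
    arity = 1 if name == "not" else 2
    for args in product((True, False), repeat=arity):
        print(name, args, "->", f(*args))

for name in CONNECTIVES:
    print_table(name)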

If one is to accept the universal applicability of classical 
propositional calculus, one seems pushed into the view that

*sentences have a well determined truth value - either T or F but not 
both - independently of any means we have for determining what the 
truth value is, or even independently of any description of how we 
might go about determining what the truth value is, or even how we 
might go about getting evidence for what that truth value is.*

On the other hand, there does seem to be room for some hybrid views. E.g.,

a. Classical propositional calculus is perhaps not universally 
applicable, but at least it is applicable in many contexts.

b. Classical propositional calculus is equally applicable in all 
contexts, but we somehow avoid being committed to the asterisked 
statement just above.

Most people would want to leave such knotty philosophical issues to 
at least the time where we discuss nonclassical propositional 
calculus, some time from now. I will take this approach, and so we 
will take for granted the idea of truth values, and truth tables for 
CPC.

The formulas of CPC are defined inductively as follows.

i. Each propositional "atom" p_n, n >= 0, is a formula.
ii. If A,B are formulas then so are (notA), (A and B), (A or B), (A 
implies B), (A iff B).
iii. Formulas are constructed only by means of i, ii.
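
For illustration, here is a minimal sketch of clauses i-iii as an 
inductively defined datatype (Python 3.10+; the names Atom, Not, 
Binary are my own labels, not standard notation).

from dataclasses import dataclass

# Clause i: the atoms p_n, n >= 0.
@dataclass(frozen=True)
class Atom:
    n: int                     # the subscript of p_n

# Clause ii: the five ways of building new formulas from old.
@dataclass(frozen=True)
class Not:
    a: "Formula"

@dataclass(frozen=True)
class Binary:
    op: str                    # "and", "or", "implies", or "iff"
    a: "Formula"
    b: "Formula"

# Clause iii: nothing else is a formula.
Formula = Atom | Not | Binary

# Example: (p0 implies (not p1)).
example = Binary("implies", Atom(0), Not(Atom(1)))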

So the formulas of CPC are given by a kind of inductive definition. 
But if CPC is supposed to be the starting point for f.o.m., then it 
seems that we have made a huge jump by moving into the realm of 
inductive definitions, which themselves require an elaborate 
explanation, where beginning propositional logic hardly even 
scratches the surface.

There seem to be a few responses to this.

1. Give up on the idea that we are building foundations for 
mathematics from almost nothing. Live with some circularity, but seek 
to minimize it. Ultimately, try to prove that a certain dose of 
circularity is absolutely necessary for f.o.m. Relish the fact that 
it can be pretty small.

2. Assert that we do not need to consider arbitrary formulas of CPC, 
but only formulas of CPC of quite limited size. Then we just have the 
problem of defining the formulas of CPC of small length. Also, small 
subscripts on the atoms.

Of course there are some problems with both 1 and 2.

For 1, I have never seen it really carried out. One immensely 
interesting and challenging task is to define what exactly one means 
to have a foundation for mathematics. Normally, one just develops the 
conventional foundation for mathematics (through formal set theory), 
and makes comments about just what is accomplished. One relishes 
the fantastic discoveries starting with Godel. It seems obvious that, 
say, the apparatus needed to set up formal set theory and recognize 
the routine translation of mathematics into it, is a tiny minimal 
fragment of a fragment of a fragment of what has been developed. 
Nevertheless, at least something apparently more than mere ordinary 
commonsense thinking is required to set this up. I have certainly not 
seen any in depth discussion as to just what is needed, together with 
a demonstration that one cannot get away with less. The closest I 
have seen to this would be the development of some theories of finite 
strings with enough induction to support the facts that one needs to 
understand syntax and related matters. Of course, such minimalistic 
theories of syntax are themselves presented as formal systems, and 
also need something like themselves to support them. What is one to 
make of this circularity?

For 2, there is the option of considering only an entirely finite 
form of formal set theory, with finitely many formulas, finitely many 
proofs, etc. That is obviously comparatively far more minimalistic, 
and yet just what kind of foundations of mathematics does this really 
amount to?

We will now move on past such issues. We still want to make some 
points about the syntax of CPC.

It is best to reformulate the syntax of CPC in terms of finite 
strings in a finite alphabet, for a number of reasons including 1 and 
2 just above. This is normally done more or less as follows.

Use the finite alphabet of letters

not
and
or
arrows
doublearrows
p
0
1
(
)

The atoms are then strings of the form p followed by zero or more 0's and 1's.
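
In this setup, deciding whether a string is an atom is a trivial 
pattern check; a sketch (treating the strings of 0's and 1's as the 
rendered subscripts):

import re

# An atom is the letter p followed by zero or more 0's and 1's,
# e.g. "p", "p0", "p101"; the 0-1 strings keep the alphabet finite.
def is_atom(s: str) -> bool:
    return re.fullmatch(r"p[01]*", s) is not None

assert is_atom("p101")
assert not is_atom("q0") and not is_atom("p2")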

There are various setups that allow for formulas of CPC to have fewer 
parentheses, but still have unique parsing.

In the purely syntactic approach, one does not rely on any semantic 
properties of the connectives (such as associativity), and does not 
rely on precedence tables (all connectives are treated equally). For 
example, there is no need for any parentheses in

p or not p

and no need for any parentheses in

not not not not p.

A pseudo formula is a formula generated in the usual way, but where 
sometimes parentheses are not inserted. Presumably the general theory 
of parsing tells us which pseudo formulas are inherently ambiguous 
and which are inherently unambiguous, regardless of the meaning of 
the connectives. This is presumably a purely syntactic matter when 
set up properly, and the general theory of parsing also gives 
efficient algorithms for determining this.

There are some straightforward grammars involving categories of 
formulas, for the syntax of CPC, and the question arises as to 
whether any of them is optimal in the sense of generating just the 
formulas that do not have any unnecessary parentheses. Presumably it 
is well known whether any such simple grammars are optimal in this 
sense.

In practice, it has become conventional to use the following 
precedence among the connectives:

not has the highest precedence;
and, or have the next highest, but equal precedence;
arrows and doublearrows are last, and equal.
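
As a sketch of how such a precedence table eliminates parentheses, 
here is a printer over the Atom/Not/Binary trees from earlier that 
inserts only the parentheses the table fails to make redundant (the 
treatment of chains of not's and of equal precedences reflects my 
reading of the conventions above, not a canonical choice):

# Precedence table: larger number = binds tighter.
PREC = {"iff": 1, "implies": 1, "or": 2, "and": 2, "not": 3}

def show(f, parent_prec=0):
    """Render a formula keeping only the parentheses that the
    precedence table fails to make redundant."""
    if isinstance(f, Atom):
        return f"p{f.n}"
    if isinstance(f, Not):
        prec = PREC["not"]
        # A chain of not's needs no parentheses, so pass prec - 1.
        s = "not " + show(f.a, prec - 1)
    else:
        prec = PREC[f.op]
        # Equal precedences are left unresolved, so a child at the
        # same level as its parent is still parenthesized here.
        s = f"{show(f.a, prec)} {f.op} {show(f.b, prec)}"
    return f"({s})" if prec <= parent_prec else s

# No parentheses needed in either of these, as noted above:
print(show(Binary("or", Atom(0), Not(Atom(0)))))   # p0 or not p0
print(show(Not(Not(Not(Not(Atom(0)))))))           # not not not not p0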

We can revisit all of the above issues regarding optimality in these 
terms. One can also allow the use of the semantic associativity of 
and, of or, and of doublearrows to further eliminate parentheses. 
This is, in practice, a very good idea for ease of use. Again, one 
can revisit all such syntactic issues.

A purely combinatorial counting problem arises. What can be said 
about the number of formulas of a given size? What notion of size 
should be used for such a study? One can take the total number of 
atoms that appear, or the total number of connectives that appear, 
or the total number of symbols that appear, or the depth of the 
formula. In the latter case, one has to introduce an appropriate 
equivalence relation in order to keep the number of formulas of a 
given depth finite.

Obviously, we can only expect recurrence relations. But one can ask 
that the recurrence relation be of reasonable form; e.g., allow 
nested recurrence relations - i.e., recurrence relations involving 
recurrence relations, etcetera.
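
To illustrate, take size to be the number of connective occurrences, 
over a fixed stock of k atoms; the recurrence below is elementary, 
and the code is a sketch of how such counts are computed efficiently.

def count_formulas(n_max, k):
    """F[n] = number of formulas with exactly n connective occurrences
    built from k distinct atoms.  Such a formula is either not(A) with
    A of size n-1, or (A op B) for one of the 4 binary connectives,
    with the remaining n-1 connectives split between A and B."""
    F = [0] * (n_max + 1)
    F[0] = k                   # the atoms themselves
    for n in range(1, n_max + 1):
        F[n] = F[n-1] + 4 * sum(F[i] * F[n-1-i] for i in range(n))
    return F

# With k = 1 this gives 1, 5, 45, ...: e.g. the 5 formulas of size 1
# are not p, (p and p), (p or p), (p implies p), (p iff p).
print(count_formulas(10, 1))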

Also we can ask that the count of the number of formulas be 
efficiently computed. This is closely related to the existence of 
recurrence relations. Recurrence relations, even nested, tend to give 
efficient algorithms for counts, whereas if we have to actually count 
the number, then that will generally take exponential time, or 
perhaps polynomial space - but not obviously polynomial time.

We now come to the matter of truth assignments and tautologies and 
semantic implication and semantic equivalence.

It is well known (due to Cook, who is an FOM subscriber) that testing 
for tautologies is co-NP complete. So P = NP is equivalent to 
"tautologies can be tested in polynomial time".

One should be able to prove that no simple-minded polynomial time 
algorithm for testing tautologies works. For instance, one might do 
the following. It is more convenient to think about satisfiability. Take 
an atom that appears, and split into two formulas, the first obtained 
by replacing the atom by T and simplifying, and the second obtained 
by replacing the atom by F and simplifying. Continue in this way, 
with the second atom, getting four nodes. Continue through all nodes. 
In theory one may have an exponential tree. However, generally some 
of the nodes that are not too high off of the root (root at the 
bottom) are going to have formulas that have been fully resolved as T 
or F. Of course, if any are resolved as T then the original formula 
is satisfiable. So in general, we hope that either that happens, or 
instead we get a lot of F's, so that the tree, while it may well have 
paths of length equal to the number of distinct atoms in the original 
formula, still has far fewer than exponentially many nodes.
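
Here is a sketch of this splitting procedure, reusing atoms_of and 
the formula trees above; the simplification performed is the obvious 
constant propagation, and nothing cleverer.

def subst(f, n, val):
    """Replace atom p_n by the truth value val and simplify, so the
    result is either a bool (fully resolved) or a constant-free formula."""
    if isinstance(f, Atom):
        return val if f.n == n else f
    if isinstance(f, Not):
        a = subst(f.a, n, val)
        return (not a) if isinstance(a, bool) else Not(a)
    return simplify(f.op, subst(f.a, n, val), subst(f.b, n, val))

def simplify(op, a, b):
    """Constant propagation for the four binary connectives."""
    if op == "and":
        if a is False or b is False: return False
        if a is True:  return b
        if b is True:  return a
    elif op == "or":
        if a is True or b is True: return True
        if a is False: return b
        if b is False: return a
    elif op == "implies":
        if a is False or b is True: return True
        if a is True:  return b
        if b is False: return Not(a)       # a is not a constant here
    elif op == "iff":
        if a is True:  return b
        if b is True:  return a
        if a is False: return (not b) if isinstance(b, bool) else Not(b)
        if b is False: return Not(a)       # a is not a constant here
    return Binary(op, a, b)

def satisfiable(f):
    """Split on the lowest-numbered atom; branches that resolve to T
    or F early prune the tree, which is the hope expressed above."""
    if isinstance(f, bool):
        return f
    n = min(atoms_of(f))
    return satisfiable(subst(f, n, True)) or satisfiable(subst(f, n, False))

assert not satisfiable(Binary("and", Atom(0), Not(Atom(0))))  # p0 and not p0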

Presumably such schemes have been analyzed for SAT in the literature, 
and shown to fail, in that sometimes the tree is exponential. Has the 
literature also discussed anything about the distribution of the 
sizes of these trees over the input formulas?

With careful programming, and a big machine, perhaps this procedure 
is known to work for formulas of size n, where n is a reasonable 
number. What is the state of the art on this?

For that matter, what results are there concerning the number of 
formulas of a given size that are tautologies? Is it conceivable that 
we could get an exact count of this efficiently without butting up 
against P = NP and the like?

From some points of view, it is more attractive to work with 3-SAT, 
another NP complete problem. What can be said about the relationship 
between 3-SAT and SAT with regard to these issues? Of course, there 
is a reduction of SAT to 3-SAT, but it makes a certain kind of 
distortion. What if we ask for a very tight reduction? If the 
tightness requirement is very demanding, does the question then 
become equivalent to P = NP again?
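
For reference, here is a sketch of the textbook clause-splitting 
reduction, stated for CNF inputs in DIMACS-style integer notation 
(getting from an arbitrary formula to CNF is a separate preliminary 
step); the fresh atoms it introduces are exactly the kind of 
distortion just mentioned.

def to_3sat(clauses, next_var):
    """Clauses are lists of nonzero ints: n means p_n, -n means not p_n.
    A long clause (l1 or ... or lk) becomes the chain
    (l1 or l2 or x1), (not x1 or l3 or x2), ..., which is satisfiable
    iff the original clause set is, at the cost of fresh variables."""
    out = []
    for c in clauses:
        c = list(c)
        while len(c) > 3:
            x = next_var          # a fresh atom
            next_var += 1
            out.append(c[:2] + [x])
            c = [-x] + c[2:]
        out.append(c)
    return out, next_var

# (p1 or p2 or p3 or p4 or p5) becomes three 3-clauses using p6, p7.
print(to_3sat([[1, 2, 3, 4, 5]], next_var=6))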

In particular, when do 3-SAT and SAT become too hard for existing 
technology, in terms of input size?

This is enough for now. I'm sure that many people on the FOM know a 
lot more about such issues than I do, and can make some much 
appreciated postings of research and educational value. Next time I 
will take up issues of axiomatization in CPC.
