[FOM] FTGI 1: Classical Propositional Calculus
Harvey Friedman
friedman at math.ohio-state.edu
Mon Apr 21 01:06:38 EDT 2003
As some of you have noticed, I have been making two kinds of postings:
1. Numbered postings, currently numbered 1 through 163. These are,
generally, technical abstracts regarding my recent research in f.o.m.
2. Other postings.
I want to start a new series of numbered postings with a particular
purpose. The series will be consecutively numbered with the prefix
FTGI.
FTGI stands for "fundamental topics of general interest".
The idea is to *systematically* choose fundamental topics of general
interest in the foundations of mathematics, mention some background
and basic results, and
***ASK THE READERSHIP TO CONTRIBUTE THEIR OWN EXPLANATIONS, COMMENTS,
REFERENCES, AND/OR POINTERS TO THE LITERATURE REGARDING THEM. (It
would be best to use the reply option in order to make the thread
form of the archives most useful for readers of the archives).***
As opposed to the numbered postings 1 - 163, I am not choosing these
topics on the basis of my research on them. They are chosen for their
fundamental nature and their general interest. Many people on the FOM
know a lot more than I do about many aspects of these topics.
The hope is that over time, this becomes a particularly valuable
resource concerning f.o.m. for anyone who is open to learning or
thinking more about f.o.m. Hopefully this will become of great value
as a resource for professionals and students alike - as a guide to
literature they may not be (adequately) familiar with - and as a
source of possible research topics for themselves and/or students.
This includes possibly students at the undergraduate level.
I will endeavor to make sure that this does not degenerate into an
unstructured collection of references to the technical literature.
The key words are "fundamental" and "general interest".
Each FTGI topic will likely occupy several different FTGI postings.
In particular, I will likely comment in subsequent postings regarding
some of the postings by the subscribers that are made on these FTGI
topics.
#####################
The first FTGI topic is Classical Propositional Calculus, abbreviated
CPC. When we get to classical quantificational calculus some time
from now, we will use the abbreviation CQC.
CPC is based on the classical logical connectives not, and, or, if
then, if and only if.
An examination of mathematics reveals that these five connectives are
used over and over again. And no other connectives are used.
Some comments.
1. This is not strictly true. In mathematics, one sometimes sees
things like "the following are equivalent", or "neither nor", or
"exactly one of the following holds". But one sees these five
overwhelmingly, and so for f.o.m. it appears that we might as well
go ahead and ignore these exceptions.
2. Mathematics appears to be crippled if we don't allow all five of
these. The removal of some of these is more crippling than others.
3. If we remove not, then the challenge is to formalize mathematics
without negation. There is a sense in which this can be
straightforwardly achieved. We cannot discuss this in any detail
until we have come to discuss at least the quantificational calculus.
4. Nevertheless, there seems to be a cost in doing this, as
mathematicians find it very convenient to use "not". Is there
anything to be gained by removing "not"? Again, we can't go deeply
into this until we have at least the quantificational calculus.
5. What are the statistics on the observed distribution of uses of
these five connectives in the mathematical literature? Does it vary
from field of mathematics to field of mathematics? Does it vary in a
given field as time passes? Does it vary in mathematics as a whole as
time passes? Can one get a reasonably robust count? Should one
concentrate on the distribution in the statements of theorems only?
Does it differ significantly from author to author expositing the
same material?
6. If we remove "if then", then it appears that this is so crippling
as to make mathematics undoable in the practical sense. Of course, we
can replace "if A then B" with "notA or B", or with "not(A and
notB)". But this is too horrible to contemplate using. The following
question naturally arises.
7. Is there any formal reason why removing if then is
particularly crippling? One immediate "answer" is that any equivalent
of if then in terms of not, or, and, iff, must involve the use of two
distinct connectives from these remaining four. But can this answer
be improved?
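As a quick sanity check on this immediate "answer", one can compute,
for each single connective among not, and, or, iff, the closure of
{A, B} under that connective alone, with formulas viewed as truth
tables over the four assignments to A, B. Since there are only 16
truth tables in two variables, the closure computation terminates,
and it completely verifies the claim, at least for equivalents built
from the atoms A and B themselves. A minimal sketch in Python (the
representation is one arbitrary choice among many):

    # A Boolean function of A, B as its values at (A,B) = TT, TF, FT, FF.
    A   = (1, 1, 0, 0)
    B   = (1, 0, 1, 0)
    IMP = (1, 0, 1, 1)   # the truth table of "if A then B"

    ops = {
        "not": lambda x: tuple(1 - a for a in x),
        "and": lambda x, y: tuple(a & b for a, b in zip(x, y)),
        "or":  lambda x, y: tuple(a | b for a, b in zip(x, y)),
        "iff": lambda x, y: tuple(1 - (a ^ b) for a, b in zip(x, y)),
    }

    for name, op in ops.items():
        closure = {A, B}
        while True:
            if name == "not":
                new = {op(x) for x in closure}
            else:
                new = {op(x, y) for x in closure for y in closure}
            if new <= closure:
                break
            closure |= new
        print(name, "alone can define if-then:", IMP in closure)

Each of the four runs prints False, in agreement with the immediate
"answer" above.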
We now come to the question as to just what not, and, or, if then, iff, mean.
The usual explanation of "not" goes something like this. not(B)
counts as true if and only if B does not count as true. However,
notice that I have used "not" in this explanation of "not".
Is this apparent circularity a real problem? Does the display of the
truth table for negation resolve any problem that arises?
In order for the truth table for "not" to help, one seems to have to
explain just what the truth table means. The idea is that it displays
the truth value of not(B) in terms of the truth value of B.
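For the record, the table itself:

    B | not(B)
    --+-------
    T |   F
    F |   T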
But one has to first explain truth values. The usual explanation begins with
*every sentence has a truth value, which is either T or F, but not both.*
But does this explanation have a circularity in that it uses "or" and
uses "not"?
The standard view is that one must start somewhere, and the idea of
truth tables looks like a good place.
If one is to accept the universal applicability of classical
propositional calculus, one seems pushed into the view that
*sentences have a well determined truth value - either T or F but not
both - independently of any means we have for determining what the
truth value is, or even independently of any description of how we
might go about determining what the truth value is, or even how we
might go about getting evidence for what that truth value is.*
On the other hand, there does seem to be room for some hybrid views. E.g.,
a. Classical propositional calculus is perhaps not universally
applicable, but at least it is applicable in many contexts.
b. Classical propositional calculus is equally applicable in all
contexts, but we somehow avoid being committed to the asterisked
statement just above.
Most people would want to leave such knotty philosophical issues
until at least the time when we discuss nonclassical propositional
calculus, some time from now. I will take this approach, and so we
will take for granted the idea of truth values, and truth tables for
CPC.
The formulas of CPC are defined inductively as follows.
i. Each propositional "atom" p_n, n >= 0, is a formula.
ii. If A,B are formulas then so are (notA), (A and B), (A or B), (A
implies B), (A iff B).
iii. Formulas are constructed only by means of i and ii.
So the formulas of CPC are given by a kind of inductive definition.
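Rendered as a program, clauses i-iii look roughly as follows (a
sketch in Python; atoms as strings, compound formulas as nested
tuples - one representation among many):

    from itertools import product

    def formulas(depth, num_atoms=2):
        # Clause i: the atoms p0, p1, ... are formulas.
        atoms = [f"p{n}" for n in range(num_atoms)]
        if depth == 0:
            return atoms
        smaller = formulas(depth - 1, num_atoms)
        out = list(smaller)
        # Clause ii: negations and the four binary compounds.
        out += [("not", a) for a in smaller]
        for con in ("and", "or", "implies", "iff"):
            out += [(con, a, b) for a, b in product(smaller, smaller)]
        # Clause iii: nothing else is a formula.
        return out

    print(len(formulas(1)))   # 2 atoms + 2 negations + 16 binary = 20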
But if CPC is supposed to be the starting point for f.o.m., then it
seems that we have made a huge jump by moving into the realm of
inductive definitions, which themselves require an elaborate
explanation, where beginning propositional logic hardly even
scratches the surface.
There seem to be a few responses to this.
1. Give up on the idea that we are building foundations for
mathematics from almost nothing. Live with some circularity, but seek
to minimize it. Ultimately, try to prove that a certain dose of
circularity is absolutely necessary for f.o.m. Relish the fact that
it can be pretty small.
2. Assert that we do not need to consider arbitrary formulas of CPC,
but only formulas of CPC of quite limited size. Then we just have the
problem of defining the formulas of CPC of small length. Also, small
subscripts on the atoms.
Of course there are some problems with both 1 and 2.
For 1, I have never seen it really carried out. One immensely
interesting and challenging task is to define what exactly one means
to have a foundation for mathematics. Normally, one just develops the
conventional foundation for mathematics (through formal set theory),
and makes comments about just what is accomplished. One relishes
the fantastic discoveries starting with Godel. It seems obvious that,
say, the apparatus needed to set up formal set theory and recognize
the routine translation of mathematics into it, is a tiny minimal
fragment of a fragment of a fragment of what has been developed.
Nevertheless, at least something apparently more than mere ordinary
commonsense thinking is required to set this up. I have certainly not
seen any in depth discussion as to just what is needed, together with
a demonstration that one cannot get away with less. The closest I
have seen to this would be the development of some theories of finite
strings with enough induction to support the facts that one needs to
understand syntax and related matters. Of course, such minimalistic
theories of syntax are themselves presented as formal systems, and
also need something like themselves to support them. What is one to
make of this circularity?
For 2, there is the option of considering only an entirely finite
form of formal set theory, with finitely many formulas, finitely many
proofs, etc. That is obviously comparatively far more minimalistic,
and yet just what kind of foundations of mathematics does this really
amount to?
We will now move on past such issues. We still want to make some
points about the syntax of CPC.
It is best to reformulate the syntax of CPC in terms of finite
strings in a finite alphabet, for a number of reasons including 1 and
2 just above. This is normally done more or less as follows.
Use the finite alphabet of letters
not
and
or
arrows
doublearrows
p
0
1
(
)
The atoms are then strings of the form p followed by zero or more 0's and 1's.
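In such a setup, the atoms and the other letters can be recognized
by a simple tokenizer. A sketch (the regular expression is one
arbitrary way to do it):

    import re

    # An atom is "p" followed by zero or more 0's and 1's.
    TOKEN = re.compile(r"p[01]*|not|and|or|arrows|doublearrows|[()]")

    def tokenize(s):
        return TOKEN.findall(s.replace(" ", ""))

    print(tokenize("(p0 arrows (not p1))"))
    # ['(', 'p0', 'arrows', '(', 'not', 'p1', ')', ')']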
There are various setups that allow for formulas of CPC to have fewer
parentheses, but still have unique parsing.
In the purely syntactic approach, one does not rely on any semantic
properties of the connectives (such as associativity), and does not
rely on precedence tables (all connectives are treated equally). For
example, there is no need for any parentheses in
p or not p
and no need for any parentheses in
not not not not p.
A pseudo formula is a formula generated in the usual way, but where
sometimes parentheses are not inserted. Presumably the general theory
of parsing tells us which pseudo formulas are inherently ambiguous
and which are inherently unambiguous, regardless of the meaning of
the connectives. This is presumably a purely syntactic matter when
set up properly, and the general theory of parsing also gives
efficient algorithms for determining this.
There are some straightforward grammars involving categories of
formulas, for the syntax of CPC, and the question arises as to
whether any of them is optimal in the sense of generating just the
formulas that do not have any unnecessary parentheses. Presumably it
is well known whether any such simple grammars are optimal in this
sense.
In practice, it has become conventional to use the following
precedence among the connectives:
not has the highest precedence;
and, or have the next highest, but equal precedence;
arrows and doublearrows are last, and equal.
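A sketch of a parser honoring this convention (precedence climbing,
in Python, on the token lists from the tokenizer sketch above; for
simplicity it lets operators of equal precedence associate to the
left, whereas under the stricter reading of the convention one would
instead demand parentheses in mixed chains at equal precedence):

    PREC = {"doublearrows": 1, "arrows": 1, "or": 2, "and": 2}
    # "not" binds tightest and is handled in parse_atomic.

    def parse(tokens):
        tree, rest = parse_expr(tokens, 0)
        assert not rest, "trailing tokens"
        return tree

    def parse_expr(toks, min_prec):
        lhs, toks = parse_atomic(toks)
        while toks and PREC.get(toks[0], 0) >= max(min_prec, 1):
            op = toks[0]
            # Recurse at higher minimum precedence: left associative.
            rhs, toks = parse_expr(toks[1:], PREC[op] + 1)
            lhs = (op, lhs, rhs)
        return lhs, toks

    def parse_atomic(toks):
        if toks[0] == "(":
            tree, rest = parse_expr(toks[1:], 0)
            assert rest and rest[0] == ")", "missing )"
            return tree, rest[1:]
        if toks[0] == "not":
            tree, rest = parse_atomic(toks[1:])
            return ("not", tree), rest
        return toks[0], toks[1:]      # an atom p, p0, p1, ...

    print(parse(["not", "p0", "or", "p1", "arrows", "p0"]))
    # ('arrows', ('or', ('not', 'p0'), 'p1'), 'p0')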
We can revisit all of the above issues regarding optimality in these
terms. One can also allow the use of the semantic associativity of
and, of or, and of doublearrows to further eliminate parentheses.
This is, in practice, a very good idea for ease of use. We can again
revisit all such syntactic issues.
A purely combinatorial counting problem arises. What can be said about
the number of formulas of a given size? What notion of size should be
used for such a study? One can take the total number of atoms that
appear, or the total number of connectives that appear, or the total
number of symbols that appear, or the depth of the formula. In the
latter case, one has to introduce an appropriate equivalence relation in
order to keep the number of formulas of a given depth finite.
Obviously, we can only expect recurrence relations. But one can ask
that the recurrence relation be of reasonable form; e.g., allow
nested recurrence relations - i.e., recurrence relations involving
recurrence relations, etcetera.
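For a concrete instance: counting by the total number of connective
occurrences, with a fixed supply of k atoms, the count C(n) satisfies
C(0) = k and C(n) = C(n-1) + 4*(sum over i+j = n-1 of C(i)*C(j)),
an ordinary (un-nested) recurrence that is trivially computed. A
sketch:

    from functools import lru_cache

    K = 2   # number of available atoms; an arbitrary choice

    @lru_cache(maxsize=None)
    def count(n):
        # Number of formulas with exactly n connective occurrences.
        if n == 0:
            return K                        # just the atoms
        total = count(n - 1)                # (not A)
        for i in range(n):                  # (A con B), 4 binary cons
            total += 4 * count(i) * count(n - 1 - i)
        return total

    print([count(n) for n in range(5)])     # [2, 18, 306, 6498, 154530]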
Also we can ask that the count on the number of formulas be
efficiently computed. This is closely related to the existence of
recurrence relations. Recurrence relations, even nested, tend to give
efficient algorithms for counts, whereas if we have to actually count
the number, then that will generally take exponential time, or
perhaps polynomial space - but not obviously polynomial time.
We now come to the matter of truth assignments and tautologies and
semantic implication and semantic equivalence.
It is well known (due to Cook, who is an FOM subscriber) that testing
for tautologies is co-NP complete. So P = NP is equivalent to
"tautologies can be tested in polynomial time".
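For orientation, the naive test just evaluates the formula over all
2^n truth assignments, and so takes exponential time. A sketch (on
the tuple representation from the parser sketch above):

    from itertools import product

    def atoms_of(f):
        if isinstance(f, str):
            return {f}
        return set().union(*(atoms_of(sub) for sub in f[1:]))

    def evaluate(f, v):
        if isinstance(f, str):
            return v[f]
        if f[0] == "not":
            return not evaluate(f[1], v)
        a, b = evaluate(f[1], v), evaluate(f[2], v)
        return {"and": a and b, "or": a or b,
                "arrows": (not a) or b, "doublearrows": a == b}[f[0]]

    def is_tautology(f):
        atoms = sorted(atoms_of(f))
        return all(evaluate(f, dict(zip(atoms, bits)))
                   for bits in product([True, False], repeat=len(atoms)))

    print(is_tautology(("or", "p0", ("not", "p0"))))   # True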
One should be able to prove that no simple minded polynomial time
algorithm for testing tautologies works. For instance, one might do the
following. It is more convenient to think about satisfiability. Take
an atom that appears, and split into two formulas, the first obtained
by replacing the atom by T and simplifying, and the second obtained
by replacing the atom by F and simplifying. Continue in this way,
with the second atom, getting four nodes. Continue through all nodes.
In theory one may have an exponential tree. However, generally some
of the nodes that are not too high off of the root (root at the
bottom) are going to have formulas that have been fully resolved as T
or F. Of course, if any are resolved as T then the original formula
is satisfiable. So in general, we hope that either that happens, or
instead we get a lot of F's; the tree may well have paths of length
equal to the number of distinct atoms in the original formula, but
still have far fewer than exponentially many nodes.
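A sketch of this splitting procedure, reusing atoms_of and the
representation from the sketches above (real SAT solvers add unit
propagation, clause learning, and careful choice of the splitting
atom; for brevity the simplifier handles not, and, or only, the
remaining two connectives following the same pattern):

    def simplify(f, atom, val):
        # Substitute the truth value val for atom; resolve whatever
        # subformulas thereby become constant.
        if isinstance(f, bool):
            return f
        if isinstance(f, str):                    # an atom
            return val if f == atom else f
        if f[0] == "not":
            a = simplify(f[1], atom, val)
            return (not a) if isinstance(a, bool) else ("not", a)
        a, b = simplify(f[1], atom, val), simplify(f[2], atom, val)
        short, unit = (False, True) if f[0] == "and" else (True, False)
        if short in (a, b):                       # F and _ / T or _
            return short
        if a is unit:                             # T and b / F or b
            return b
        if b is unit:
            return a
        return (f[0], a, b)

    def satisfiable(f):
        if isinstance(f, bool):
            return f
        atom = next(iter(atoms_of(f)))            # split on some atom
        return (satisfiable(simplify(f, atom, True))
                or satisfiable(simplify(f, atom, False)))

    print(satisfiable(("and", "p0", ("not", "p0"))))   # False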
Presumably such schemes have been analyzed for SAT in the literature,
and shown to fail, in that sometimes the tree is exponential. Has the
literature also discussed anything about the distribution of the
sizes of these trees over the input formulas?
With careful programming, and a big machine, perhaps this procedure
is known to work for formulas of size n, where n is a reasonable
number. What is the state of the art on this?
For that matter, what results are there concerning the number of
formulas of a given size that are tautologies? Is it conceivable that
we could get an exact count of this efficiently without butting up
against P = NP and the like?
From some points of view, it is more attractive to work with 3-SAT,
another NP complete problem. What can be said about the relationship
between 3-SAT and SAT with regard to these issues? Of course, there
is a reduction of SAT to 3-SAT, but it makes a certain kind of
distortion. What if we ask for a very tight reduction? Does the
existence of such a reduction then become equivalent to P = NP
again, if the tightness requirement is very demanding?
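For reference, the textbook reduction works clause by clause on a
CNF: a clause with more than 3 literals is split using fresh atoms,
preserving satisfiability but changing the number of atoms and
clauses - one concrete form of the distortion just mentioned. A
sketch (literals as sign/atom pairs, an arbitrary representation):

    def clause_to_3sat(clause, fresh):
        # (l1 v l2 v ... v lk) becomes
        # (l1 v l2 v y1), (-y1 v l3 v y2), ..., (-y_{k-3} v l_{k-1} v lk)
        if len(clause) <= 3:
            return [clause]
        y = next(fresh)
        return ([clause[:2] + [("+", y)]]
                + clause_to_3sat([("-", y)] + clause[2:], fresh))

    fresh = (f"y{i}" for i in range(10**6))   # names assumed unused
    print(clause_to_3sat(
        [("+", "p0"), ("-", "p1"), ("+", "p2"), ("-", "p3")], fresh))
    # [[('+', 'p0'), ('-', 'p1'), ('+', 'y0')],
    #  [('-', 'y0'), ('+', 'p2'), ('-', 'p3')]]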
In particular, when do 3-SAT and SAT become too hard for existing
technology, in terms of input size?
This is enough for now. I'm sure that many people on the FOM know a
lot more about such issues than I do, and can make some much
appreciated postings of research and educational value. Next time I
will take up issues of axiomatization in CPC.