[FOM] Concerning definition of formulas

Vaughan Pratt pratt at cs.stanford.edu
Fri Oct 5 20:57:19 EDT 2007

Saurav's question (on how to break the vicious circle between the 
definition of wff and the definition of ZFC which seemed interdependent) 
was responded to initially by Andrej Bauer in terms of three 
methodologies for syntax, namely trees, inductive types, and initial 
algebras, and then by Richard Heck on Sept. 30 who proposed breaking the 
circle using "ordinary" set theory understood as being prior to ZFC.

In response to Heck's post I proposed on Sunday taking his non-ZFC 
(naive) set theory approach a step further to eliminate any dependency 
at all on recursion.  Regrettably this step was rejected by the 
moderator as "confusing the entirely syntactic notion of formula with 
semantic notions."  During the week spent defending the purely syntactic 
content of my post to the board, Hazen, Avron, Forster, Aitken, Fugard, 
McCarthy, Jones, and Blum also responded, making it clear that this 
issue bothered many people, not least students who were puzzled by it 
when first exposed to it.

The board having approved my post as of today, here's my proposal, 
rather longer now than on Sunday as a result of my attempts to clarify 
it to the board's satisfaction, but it is surely reasonable that clarity 
trump brevity on FOM.

Answering Feferman's "what rests on what", the following assumes only 
pages 1-33 of Halmos, Naive Set Theory, namely up to the chapter on 
functions, material that every college mathematics student should 
possess.  No recursion or induction, either explicit, or implicit in the 
use of numbers, is involved in the definition.  The wffs accommodated 
are those in prenex normal form with a full DNF matrix, sufficient for 
all definitional purposes in bootstrapping up from naive set theory. 
The definition permits some flexibility regarding variants as per the 
notes below.


All sets are assumed finite.

1.  Language.  Let S (Sigma) be any alphabet.  A (relation) *symbol* is 
a pair (s,A) where s \in S and A is a set constituting the formal 
parameters or *arity* (as a set) of the symbol.  A *language* L is a set 
of symbols.

2.  Wff.  A *wff* W = ((E <= Q <= V), P, m, <) over a language L is a 
4-tuple such that

(i) V is the set of variables used in W, a subset Q of which are the 
quantified variables, with a subset E of Q being the existentially 
quantified variables.  V\Q is the set of free variables, while Q\E is 
the set of universally quantified variables.

An *atom* (atomic formula) is a pair ((s,A), b: A --> V) consisting of a 
symbol (s,A) of L and a binding b of its parameters to variables. 
Example: R(x,y,...,z) where x,y,...,z \in V.

(ii) P is a set of atoms, namely those appearing in W.

A *disjunct* is a subset D of P.  The intended interpretation of D is as 
the conjunction of the elements of D conjoined with the conjunction of 
the logically negated elements of P\D.  (So every atom appears exactly 
once as a literal in every disjunct D, bearing a sign, + if in D, - if 
in P\D.  There are therefore 2^|P| possible disjuncts.)

(iii)  m is a set of disjuncts (disjunctive normal form).  These are the 
disjuncts whose disjunction forms the matrix of W.

(iv)   < linearly orders Q from left to right (prenex normal form).  So 
if x < y and y is in E but not x, corresponding to the quantification 
...Ax...Ey...m, then y depends on x.


1.  A refinement of (iv) is to allow < to be a partial order.  This 
permits Henkin-style quantification in which w can depend on x and y on 
z without entailing any other dependencies the way the linear order must.

2.  With the exception of Q (and hence E), the finiteness restriction is 
inessential.  This makes essential use of the existence of the free 
complete atomic Boolean algebra on an arbitrary set P, namely 2^(2^P), 
in striking contrast to the 1964 Gaifman-Hales result that infinite free 
complete Boolean algebras do not exist.

3.  Dropping finiteness for Q raises the question of the meaning of 
quantification when Q is say the reals and E the rationals as a subset 
of the reals, with < as their standard numerical order.  An intermediate 
generalization would be to allow Q to be infinite but require < to be a 
partial order of finite height (no infinite chains), so that each 
variable in E depends on only finitely many variables in Q\E.

4.  The definition is "framework-neutral" in the sense that it is 
equally applicable to conventional formulations of first order logic as 
encountered in any introduction to logic; to Tarski's cylindric algebra 
(CA) formulation; and to Lawvere's conception of existential and 
universal quantifiers as respectively left and right adjoints to 
substitution.  In CA the set V is fixed for all the elements of a given 
algebra, whereas in Lawvere's approach V increases with substitution and 
decreases with quantification.  This definition of wff then provides 
considerable flexibility in answering Sid Smith's question yesterday of 
how to understand quantifiers: if none of those three work for you, at 
least you have a reasonably clean syntactic foundation on which to erect 
your fourth solution.

5.  A student picking up Paul Cohen's "Decision Procedures for Real and 
p-Adic Fields," 1969 and reading in the first paragraph, after a listing 
of the logical and relation symbols and variables, "Of course, these 
symbols must be used in the grammatically correct fashion with which we 
are all familiar," might wonder why anything more is needed.  One 
response would be to ask the student to make sense of Notes 1-4 on the 
basis of that account.

Vaughan Pratt

More information about the FOM mailing list