[FOM] Foundations through categories, part II

Michael Makkai makkai at math.mcgill.ca
Fri Jul 18 18:58:00 EDT 2003

A new foundation for abstract mathematics II.

1. This continues my posting of June 22, 2003. Unfortunately, I forgot to 
write anything in the "subject" heading at the time. Still, I had three,
very valuable, responses; I will respond to them in this posting.

The story is a long(ish) one. I have decided to take the time and follow a
"natural" course in telling this story, possibly resulting in several more
postings. As the subject heading says, this is a "categorical"  
(category-theoretical) foundational system (called the System in what
follows). It is one of my tasks to motivate the entry of (Eilenberg/Mac
Lane) categories into the System; another one is to show the need to 
generalize and modify category theory to suit the System.

The main thing in the first part was a sketch of a particular
formalization of a theory of *abstract sets*: sets with urelements
(alone). This was done in something I called FOLDS: First Order Logic with
Dependent Sorts. The latter, "all purpose", logical system was also
briefly described in general (section 3), but, possibly, the full
understanding of FOLDS has to await further examples.

Let me say here, just as an indication of what will come later, that FOLDS
is not just another formal language: the main thing about FOLDS will be a
"semantics of identity". The usual equality will be completely abandoned,
and there will be a signature dependent concept of identity of structures
as well as of elements of structures. An example is that of two groups one
cannot say in the System that they are *equal*. *Isomorphism* of groups
will be their "true" notion of identity, and in fact, the System cannot
say about a group but statements that are invariant under isomorphism.

2. Abstract sets and functions, again.

I thank three contributors: Harvey Friedman (June 22), A. P. Hazen (June
23) and Bill Taylor (June 24) for their responses to my (first) posting.  
Their comments will now help me to (re-)state the theory of abstract sets,
which was given in part I in a very sketchy way. 

Before I get to details, let me say (in fact, repeat) that the present
view of abstract sets is essentially identical to the one given by F. W.
Lawvere as quoted in part I. Moreover, (Elementary) Topos Theory of
Lawvere (a well-established subject in present-day category theory) can be
viewed as a formal exposition of the theory of abstract sets. This makes
clear part of the System's indebtedness to Category Theory, and Topos
Theory in particular. I must say here that the novelty of the System lies
in the way it goes *beyond* Topos Theory, mainly by having "many" new
types of "higher" totalities beyond sets and categories. A secondary 
novelty is the formalization, within FOLDS, of the theory of abstract sets 
-- and in fact, of anything and everything in the System. 

What I will do here is give the beginnings of a formal theory of abstract
sets *from scratch*. Nothing but ordinary logical notation will be used --
although, as a matter of fact, I will follow closely what I learned from
Topos Theory.

>From now on, "set" means "abstract set": all elements of it are

The first point is about *equality*: each set A is equipped with its own
(primitive) equality relation =_A (=-sub-A), and the expression a(=_A)b is
meaningful only if a and b are already (declared to be) elements of A.
There is no "global" equality of urelements or other things more
generally; in fact,

	*there is no (primitive) equality at all other than that for
elements of single individual sets*.

Moreover, each set functions as a *type*: each variable of urelements has
to be typed by a set. Writing a:A is a type declaration: it declares the
variable a as being of type A (A itself being a variable); it is not a
proposition; it cannot be negated, for instance: you cannot write
(not)(a:A), or: (not)(a is-in A).

Because of these decisions, one cannot, for instance, write down the 
condition for sets A and B that they have the same elements: I cannot 

	(for-all x)(x is-in A iff x is-in B)

because this uses an untyped variable x . We may try to replace it by 
typed variables:

	(for-all a:A)(there-is b:B)(b=a) & vice versa

but this suffers from an (unremovable) untyped use of equality. 

Note that, under this view of sets, expressions such as A-union-B, or 
A-intersection-B cannot be given any meaning: one wonders what one can do 
with sets at all ... 

Reflecting on what Harvey wrote: his (first) system is different mainly
because of his axiom 3: "Two sets have the same elements if and only if
they are equal". There are two points here: one is the one stated above,
namely that we cannot now say that "two (given) sets have the same
elements"; the second is that, *therefore*, we do not *want* to say of two
sets that they are the same: there is (to be) no equality *of* sets!

So far, each set is a Leibnizian self-contained monad. However, we have a 
new primitive concept of "function". For sets A and B, we have the notion 
of "f is a function from A to B", abbreviated f:A--->B . A-union-B , and 
A-intersection-B are still meaningless; but, as you will see, substitutes 
are available ... 

Functions are "variable identifications of elements of one set with those 
of another". 

Of course, in this move (of adopting "functions" as primitives) we follow 
the hint of Category Theory. But you don't have to *know* any category 
theory to appreciate what one can do with sets and functions -- of which I 
will give some illustrations below. 

But before doing that, let me attempt at a justification of this way of
dealing with sets and functions -- I call it "radical abstraction" -- , in
terms of mathematical practice. I claim that in abstract algebra, such as
group theory, radical abstraction is present on a basic level. Given two
"arbitrary" groups G and H, we do not want to know anything about what
possible common elements they have, for instance, before we make a
construction of a third group G#H out of G and H. The group-theoretical
instinct will reject a possible definition of G#H that takes the
underlying set of it to be G-union-H , or (G-intersection-H)-union-{0},
even though one can give perfectly correct usual-set-theoretical
definitions of a group G#H with either of these prescribed underlying

One might "object" that it is not "instinct" that works here; rather, the
mathematical fact that the attempted definitions of G#H will not (cannot
possibly) satisfy a basic requirement: invariance under isomorphism:

	G iso-to G'  & H iso-to H'  implies  G#H iso-to G'#H'

(it is obvious that G#H defined to have either said underlying set will
fail to satisfy invariance under isomorphism). All right: this is very
much what we are heading towards to: a System in which one cannot do
anything to groups that fails to be invariant under isomorphism!. In fact,
we will have metamathematical results to the effect that what we do is
*precisely* what is necessary to maintain the right invariance properties
of all statements and constructions. 

On the other hand, note that "isomorphism" is a relatively sophisticated
notion; one may wonder how *it* comes into the picture in the first place.
In fact, the (theory around the) System contains a general concept that
comes down to the notion of isomorphism in the case of groups; see later.

In the section after the next, I will spend some time developing
"radically abstract set theory", in fact, in a formalized manner. I hope
this will help. Concerning the first posting, Bill Taylor said that "the
more nitty-gritty matters you started on, involving Set El Eq and so on,
seem a little bit cumbersome somehow". You were right; in fact, you were
very gentle in your criticism; I have to do a better job of explaining.

3. Replies to A. P. Hazen's questions.

(i) A.P.H. says: "The description of abstract sets given (in part by
quoting Lawvere) sounds very much like what mathematicians uninfluenced by
Frege and Russell would have found it natural to say in the late 19th C.  
(Cantor, and arguably perhaps Plato, seem to have IDENTIFIED cardinal
numbers with "abstract: sets of the proper cardinality!)"

That is quite right as far as it goes: the notion of abstract set
*originates* in naive set theory. In my part I, I do say: "I think, this 
notion of "set" is the "original" idea of set". Lawvere also is quite 
aware of the fact. Just before the quotation from his paper I gave in Part 
I, he says: "...to formulate the following "purified" concept of 
(constant) *abstract set* [italics his] as the one actually used in naive 
set-theretical practice of modern mathematics". 

However, Lawvere's notion is more specific than the naive notion of
abstract set; and I follow his non-formal hints to give an even more
specific, indeed formalized, concept of abstract set. The main additional
elements with respect to the naive idea are the (linguistic and
conceptual) *constraints* that are part of the concept. I think, these are
clear from what I said above in section 2: I am referring to the absence
of global equality, and the fact that membership is not a predicate, but a
(linguitically more constrained) type declaration. What typically, almost
by definition, is absent from a *naive* (set-theoretical, or other)  
attitude is exactly *constraints*:  the naive attitude brings together all
the available ideas, regardless of possible incompatibilities, and uses
"common sense" to decide which one to use at any given occasion.

The "First question" of A. P. Hazen posting is: "how seriously should we
take this language, and how similar are Kit Fine's constructions to
Makkai's (more category-theoretical) ones?"

(ii) First, concerning the second part of the question: 

I took a look at Kit Fine's book that A. P. Hazen mentions: "The limits of
abstraction". So far, I have studied the book only superficially. My
impression so far is that, beyond a very basic commonality of interests
in abstract objects and concepts, there is little in common in Fine's 
approach and mine. In fact, in a sense, the focus of Fine's interest: the 
abstraction principle, is something that is, one might even say, rejected 
by the System, and in fact, by Category Theory. Let me explain. 

In Fine's formulation: "An abstraction principle associates objects with
the items from a given domain, the objects associated with two items being
the same when the items are suitably related and the objects being
distinct whem the items are not so related". Of course, this is
exemplified by the common mathematical practice of creating new objects as
suitable equivalence classes of old ones. The procedure turns an
equivalence into  equality: this is successful because of the
extensional equality of sets. 

But consider what happens when you want to turn isomorphism of groups, a
nice equivalence relation, into equality in the same way. Aside from a
trivial and unimportant size problem, one could define GROUP as an
isomorphism class of groups; two GROUPs are the same just in case the
(any) corresponding representative groups are isomorphic. This is
useless, however: one wants to retain the homomorphisms of groups, and it
is not possible to define them on the isomorphism classes! (Try!).

This is in great contrast to the situation when one defines, successfully,
the usual order of rational numbers, the latter construed as equivalence
classes of pairs (p,q) of integers (q not=0). And indeed, there is an
underlying (partial) order on groups: G <= H iff there is f:G-->H, which
is inherited to GROUPS; but this order is not enough: we want actual
mappings, or something that works as they do -- and this is not possible

This is a good point to see why the identity concepts such as isomorphism,
and later, equivalence of categories, and then, higher dimensional
versions of them, are seriously non-reducible to equality, or to each
other in fact!

(iii) Next: "how seriously should we take this language?" I take that
"this language" is my language. I am not sure if it is really up to me, at
this point, to fully answer the question. I think the Reader will have to
make up his/her mind how seriously he/she can take this language, once
there is enough information available. Still, I can't resist saying this

As I tried to emphasized in section 1 of part I, the fully explicit nature
of the language, indeed the formality of it, is a paramount requirement (I
referred to it as Frege's imperative). And indeed, the System is --
potentially -- fully formalized. I say "potentially" because the System is
expandable; however, it has definite and significantly large completely
formalized parts of it. In this sense of being formal, "this language" is 

The language is also "serious" in the sense that it is accessible to
concepual and mathematical inquiries: it has a good metatheory. I will
present evidence on this as we go along.

Finally, the language is "serious" in the sense that it is *expressive*. I
will now mention a relevant fact, which is of interest also because of the
connections to Cantorian set theory.

There is a "good" way of talking about Cantorian sets: members of the
cumulative hierarchy, using the framework of abstract sets. The idea is
due to W. Mitchell and J. C. Cole, and it is described, for instance, in
P. T. Johnstone: Topos Theory; see section 9.2, p. 303. In fact, it is a
commonplace idea, from our present-day point of view. The idea is that a
Cantorian set a is determined uniquely by the isomorphism type of the
following structure:

	 (tr({a}); epsilon-restricted-to tr({a}), a)

(here, tr(b) is the transitive closure of b). Thus, we can *define* a 
Cantorian set as a structure 


where x is an abstract set, E is a well-founded and extensional binary
relation on x, and a is an element of x. I leave it to the reader to
define membership of Cantorian sets so construed; it will involve
functions between abstract sets. Of course, one should be able to define
"extensional" and "well-founded" in abstract-set theory;  but this is done
in Topos Theory in standard ways: see Johnstone's book.

I must admit that I had an ulterior motive with this example: it shows the
"silliness", from the point of view of abstract set theory to be sure, of
using a concept of "group" (say) that uses an underlying set which is a
Cantorian set rather than just an abstract set. It carries a superfluous
structure (the x and the E) that is no use from the point of view of group
theory. I have to correct myself though: for any given group, the 
*existence* of a Cantorian structure on it could be significant; but one 
still should not carry that structure around as part of the group 

(iv) To Question 2: "is my suggested formalization of brotherhood a good
example of dependent sorts?", I will return at the end of the next
section, after some more experience with dependent sorts.

4. Dependent sorts in action.

Let us return to abstract set theory. 

The grammar of functions uses dependent sorts in an essential way. "We can
talk about a morphism (in particular, a function) only after its domain
and codomain have been fixed": this is an adage that a category theorist
would probably find natural; it is similar in spirit to the more
well-known adage: "one should not talk about equality of objects"  
(instead, one should talk about isomorphism). 

What this means is that a function variable f is always of type Ar(A,B),
where A and B are some (any) set-variables. Ar(A,B) is a dependent type
(sort): Ar is a *kind* (an extralogical symbol given in the signature that
we choose for our theory of sets and functions); and the only context it
appears in is the forming of the type Ar(A,B)  where A and B are arbitrary
set-variables, which may be the *same*. 

The complete signature is as follows: 

Kinds: Set, El, Ar, Eq, Ap. The typing of variables is as follows:

	A:Set, a:El(A);
	A:Set, B:Set, f:Ar(A,B);
	A:Set, a:El(A), b:El(A), e:Eq(A,a,b);
	A:Set, B:Set, a:El(A), b:El(B), f:Ar(A,B), t:Ap(A,B,a,b,f). 

Each line (called "context") exhibits the use of the kind that appears at
the end of the line;  Set is (also) a type in itself, not depending on any
variable; any type headed by Ap must be of the form Ap(A,B,a,b,f), where,
in turn, the variables involved must be declared in the way they appear in
the last line in the display (this means, on the semnatic level, that a
must be an element of A, b an element of B, and f a function from A to B).

Note that there are no relation or operation symbols in the usual senses.  
For instance, the kind Ap is used to handle the operation, which is
application of a function to an argument. In fact, we may use the

	f'a=b 	for  	(There-is-t:Ap(A,B,a,b,f)).True

and write down, as an axiom, that

	(For-all A:Set)(For-all B:Set)(For-all a:El(A))(For-all f:Ar(A,B))
		(There is b:El(B)).[f'a=b]

expressing that "application of a function to an element of the 
domain results in at least one value". We can also write down (using Eq) 
that the value f'a is unique; this will be another axiom. 

I have deliberately chosen a formulation which is different from the 
"usual" one, namely, the one that follows Topos Theory more closely, which 
does not use elements (El) at all, and instead of Ap (application), uses 
composition of functions (and  makes the system of sets and functions 
into a category). I wanted to point to a flexibility of the ways 
of expression in the System. 

A definite, although only a beginning, part of the System will consist of
the axioms, formulated in the language specified, that are translations of
the topos axioms. I will not repeat them here now; let me just say that
Cartesian products and power-sets will be defined, and they will exist. 
This posying is already way too long; I may come back to some 
illustrations of "topos theory written in FOLDS" later.  

Turning "Question 2" above: I would write your suggestion in the form of 
the type discipline

	m:Man, b:Brother(m), j:J(m,b)

and your statement as

	(For-all m:Man)(There-is b:Brother(m))(There is j:J(m,b))

(here, j is a "piece of jealousy between in m towards b").  

However, maybe, there is something unnatural in this in the sense that
writing b:Brother(m)  somehow presupposes that there is no sense of
talking about b before we have fixed m as a reference point. Remember that
this is the kind of intuition when one writes a:El(A) ("a is an element of
A"), or f:Ar(A,B)  ("f is a function from A to B"). One would tend to say
that brotherhood is more like a relation between men, just like jealousy. 
I must admit that reformulated in this way, your example will become less 

A possibly better example would be to talk about families, and members of 
families -- but of course, this is just an application of the language for 
set-theory sketched above.  

More information about the FOM mailing list