Can Proof Technology Even Do this?

Buzzard, Kevin M k.buzzard at
Sun Mar 21 05:48:53 EDT 2021

Sorry for the delay in replying.

So what Harvey calls "the tech" seems to me to be actually perhaps four distinct things.

First there are the ATPs, the ones where you give them the axioms of a group and then ask them to prove some theorems about groups and they do it by flinging axioms around incredibly quickly. These are the things which originally proved the Robbins conjecture and have since then come up with proofs of results in quasi-group theory which are literally thousands of lines long.

Then there are the ITPs, which right now are mostly being used by humans to type in proofs of things -- formerly undergraduate things but in recent years there has been a solid accumulation of results which suggest that now in theory the sky is the limit (e.g. I now have no doubt that a proof of FLT could be formalised by humans, the only problem now is finding enough humans; the point is that the technical issues have been resolved).

Then there are the hammers, which try and automatically prove results in ITPs by passing them off to ATPs and then attempting to translate the results back.

And finally there are the machine learning people, who take databases of theorems and proofs (perhaps from an ITP) and then try and make more proofs. This is where people like Josef Urban come in. All I can say about this area is that it seems to me that these systems seem to be getting very good at finding short proofs -- surprisingly good, in my opinion, especially given that some of them are not even taught the language they should be writing the proofs in, they are just shown hundreds of thousands of lines of code and told "it looks like that".

Because of this confusion about what "the tech" is, these questions become perhaps harder to answer than one might first expect.

As has been pointed out already, the question about automatically proving that every natural is even or odd is unfortunately an unnatural one. The ATP's might have a go at this but my understanding is that right now they are lousy at knowing when to apply induction (however I was recently on Yutaka Nagashima's thesis committee and he has been trying to come up with heuristics to spot when induction is an appropriate tactic -- for an ITP though). However all the ITP's know these results already, and so all the machine learning systems will be able to prove them too because they'll have seen the proofs before. In particular, whilst this does look like a natural question at first, I guess it's like saying to Deep Blue (which according to Wikipedia was trained with 700,000 grandmaster games) "Ok so what do you think the best opening move is?", to which a response might be "well e4 and d4 seem pretty popular amongst humans". To give another analogy -- consider the systems which can tell a picture of a cat from a picture of a dog. These things do not work by reading the Wikipedia pages for cats and dogs and then they can distinguish a jpeg of a cat from one of a dog -- these things are trained with thousands of pictures of cats and dogs first. Telling a system the axioms of ZFC and then asking them to prove that every number is even or odd is a question of a similar nature. The idea is that we should bombard the system with lots of proofs of induction *first*, and then unfortunately the easy examples become trivial because the system has seen them before.

Of course we should remind ourselves here that as mathematicians, the vast majority of what we do is just using techniques which we've seen many times before. The hard part is having the intellectual leap which is not already in our library, but you can get an extremely long way just regurgitating techniques which are already well-known, in the right order.

In particular, I think the question is *not* "how do we get these things to have what look like new ideas?". Right now the question is how we get the tech to understand the currently known ideas so it is as good as an unimaginative graduate student who can just about chase through the algebra. And one approach to this is by making a huge database of known theorems. This is what we're doing with things like algebra, number theory, analysis, topology and geometry in Lean's maths library.


PS a huge amount is known about solutions to "small" Diophantine equations :-) Number theorists have got quite good at them. Right now we're stuck on cubics in two variables, but if the coefficients are small then there are plenty of techniques. In the general case we run into BSD and related stuff.

From: FOM <fom-bounces at> on behalf of Harvey Friedman <hmflogic at>
Sent: 01 March 2021 02:57
To: dennis at <dennis at>; Foundations of Mathematics <fom at>
Subject: Re: Can Proof Technology Even Do this?

This email originates from outside Imperial. Do not click on links and attachments unless you recognise the sender.
If you trust the sender, add them to your safe senders list to disable email stamping for this address.
On Sun, Feb 28, 2021 at 9:08 PM <dennis at> wrote:
> > a formal system S and a sentence phi, and the problem is to prove phi from S


> You seem to equate "proof assistants" (ITPs) with "automated theorem
> provers" (ATPs). Proof assistants assist with proving, they don't do the
> proving (entirely) for you. Almost every proof assistant contains some
> or several ATPs for convenience, but some don't (e.g. metamath).
> I think you're making a mistake by measuring the success of ITPs merely
> by how powerful they are as ATPs.


I tried to make clear what I was trying to do by seeing what can be
done with the proof technology AS PURE AUTOMATED UNAIDED DEDUCTION.

The reason I am trying to do this is to UNDERSTAND THE JUMPS THAT THE

In a real situation, one has vast libraries and special decision
procedures and the kitchen sink to use in order for the tech to prove
something new.

BUT if the tech is going to do ANYTHING LIKE what people who think
(wrongly I think) that Proof Assistants are really going to make a
difference with Garden Variety Professional State of the Art
Mathematical Advances - other than LIMITED SPECIAL THINGS WE ALL SEE
THE TECH DO, the Tech is going to have to make SERIOUS JUMPS.

So I was trying to find UNCLUTTERED situations where it is a matter of

Of course I am aware that any Library has for decades had those PA
examples as an integral part. AND the most elemental of what I was
looking at is in the language of PRESBURGER and so decision packages
are available for them. Also there are extended decision packages
which certainly cover NO SQRT(2). BUT what is the most general kind of
statement that they cover?

DIGRESSION. What is the most powerful decision procedure we have for
arithmetic statements that goes into ring statements (i.e., beyond
Presburger) and covers things like NO SQRT(2)? I.e., what are some of
the really powerful and understandable decision procedures out there
for fragments of arithmetic?
For my own work, I pretty much know when Tech is going to be useful.

I have from time to time made the following move: IS THE FOLLOWING
TRUE FOR ALL *SMALL* CASES?, whatever I am looking at. However, that
is not the typical thing that I do although I plan to do more of it.

FOR EXAMPLE, I had a project which got some traction, but has since


FIrst of all, some simple TECH is useful to organize and compile these
small equations. But it would be useful for the TECH to automatically
search for solutions to get rid of the easy ones so I can concentrate
on the interesting ones.

But how about this problem?:


I submit to you that TECH is not going to help with this. Of course
TECH can be used to formally verify my work on this, but WAY WAY WAY
WAY after I have very polished complete manuscripts with my personal

Harvey Friedman
-------------- next part --------------
An HTML attachment was scrubbed...
URL: </pipermail/fom/attachments/20210321/460f3a5d/attachment-0001.html>

More information about the FOM mailing list