[FOM] Talk on Formal Verification

Harvey Friedman hmflogic at gmail.com
Sun Jan 31 23:49:12 EST 2016


Here are my comments on the very interesting posting of Larry Paulson,
 http://www.cs.nyu.edu/pipermail/fom/2016-January/019449.html.

 I have yet to process Avigad,
http://www.cs.nyu.edu/pipermail/fom/2016-January/019441.html,
Constable, http://www.cs.nyu.edu/pipermail/fom/2016-January/019459.html,
and some others.

On Thu, Jan 28, 2016 at 10:02 AM, Lawrence Paulson <lp15 at cam.ac.uk> wrote:
>> On 22 Jan 2016, at 06:18, Harvey Friedman <hmflogic at gmail.com> wrote:
>>
>> I gave a general talk on formal verification, at
>>
>> https://u.osu.edu/friedman.8/
>> downloadable lecture notes
>
> Here is my attempt at answering some of the questions posed by this keynote lecture.
>
> IS THE SIZE OF PURELY FORMAL PROOFS OBTAINED FROM
> SEMIFORMAL PROOFS OF MAJOR THEOREMS REASONABLE? E.G., AT
> MOST TEN THOUSAND PRINTED PAGES? OR IS THERE AN EXPONENTIAL
> TYPE BLOWUP INVOLVED WHEN WE MOVE FROM THE USUAL SEMIFORMAL
> PROOFS CREATED BY MATHEMATICIANS TO A FORMAL PROOF?
>
> This is sometimes called the "de Bruijn factor", after N. G. de Bruijn, who posed a similar question in relation to his AUTOMATH system. Generally the blowup is regarded as tolerable. For a specific example, I recently formalised the Stone-Weierstrass proof given here:
>
> http://www.jstor.org/stable/2043993
>
> The original is 3 ½ pages, including some nice explicit calculations. Even as someone who last studied such material 40 years ago, I didn’t have too many difficulties. The corresponding formal text in Isabelle is 579 lines or about 10 pages.

I see extensive discussion on the FOM of the possibility - and reality
- that a semiformal proof relies on a large computer computation. Then
depending on the precise interpretation of my question, there would be
a big blowup when making the proof formal.

Of course, this is not quite what I had in mind. I was thinking of
looking only at proofs without computers, or looking only at the parts
of proofs not including the computer components.

So Larry is definitely addressing what I had in mind here. But I think
I raised the issue that this factor might be much higher for really
deep proofs, such as those of FLT or the Poincaré Conjecture. Although I
have some confidence that in these situations, the factor is going to
be tolerable, 10/3.5 seems rather optimistic. Of course, there is the
problem of comparing apples to apples here.

I remember mentioning computer assisted proofs in the talk.
Specifically, is there a real world example where we use substantial
computation for a proof, and can establish that, without such
computation, every proof is too long to be humanly feasible?

This is very close to simply asking for a real world example where a
theorem has a proof of quite high but not absurd length, and no proof
whose length isn't also quite high. One can use numbers like 10^15 or 10^20 for
this.

The classical work on blowup of proofs, and my work on blowups of
proofs with regard to (variants of) Kruskal's theorem, is not refined
enough to properly deal with this. It would seem much more difficult
yet to address this issue with the kind of examples that Tim Chow has
been discussing recently on the FOM. E.g.,
http://www.cs.nyu.edu/pipermail/fom/2016-January/019455.html

> IS THE MENTAL EFFORT INVOLVED IN GOING FROM A FULLY
> UNDERSTOOD SEMIFORMAL PROOF TO A FORMAL PROOF REASONABLE?
>
> It very much depends on the subject matter. Where the intuitions are strongest, the effort of formalisation is the most frustrating. The Stone-Weierstrass proof above is a precisely laid out technical argument. But many obvious observations about finite sets, continuity, limits, etc need to be spelt out, sometimes at unreasonable length. In some cases, specialised automation could help.

I appreciate very much the second sentence above. The question is:
why? Are we missing some powerful fundamental principles that are
still absolutely certain - e.g., derivable from ZFC (and fragments) -
that we are using implicitly? Or are we humans really operating on a
fundamental intuition that defies any kind of direct formalization?
And so is the frustrating thing that we keep doing the following:
ignoring these underlying unstated fundamental principles, and
repeatedly proving, in different forms, that these unstated principles
are in fact derivable from ZFC?

This situation may be behind what I think is the general skepticism,
among mathematicians totally uncomfortable with formalization, that
formal verification is even feasibly possible.

However, the amazing fact that is being established is that formal
verification is feasibly possible.

Now as you see from my FOM postings and to some extent my talk on
formal verification, I am someone who gravitates toward starting all
over. This looks very dubious, to say the least, to people who are
full time (or heavily part time) experts in a rather impressive
scientific/engineering field involving tens of thousands of man-years
(maybe much more?), and accelerating?

So it has just occurred to me that one possible rationale for me to
even think of "starting over on paper" is the prospects of uncovering
some new powerful fundamental principles of mathematical reasoning
that - although they are "totally" captured in the usual ZFC
formalization - that are so much closer to ordinary mathematical
thinking that they save a lot of time and effort in giving formal
verifications.

In a way, you can look at what Voevodsky et al are proposing as along
these lines. But I am looking for something seriously general purpose.
E.g., I would think that something like this should already rear its
head at least in some interesting way in the proof of Zermelo's well
ordering theorem (ZWOT).

In fact, here are some questions about ZWOT; a formal sketch of the
target statement follows the list:

1. Let's say you start with some modestly sugared version of ZFC. No
significant library. Under current known methods for automated
reasoning with no human interaction, how long would it take to come up
with a proof of ZWOT (after posing ZWOT)?

2. What can we say about how long it would take for a human to come up
with a proof or "proof" of ZWOT? Why the gap? What is going on here?

3. What about the year 2100 or 10,000 under Deep Learning?

4. It would appear that I am comparing apples with oranges at several
levels. Explain.

> CREATE FORMALLY VERIFIED TREATMENTS OF VARIOUS BASIC AREAS
> OF MATHEMATICS WHICH ARE SO READABLE THAT THEY CAN BE
> EFFECTIVELY USED AS TEXTBOOKS WITH REAL WORLD NON COMPUTER
> SCIENCE STUDENTS.
>
> Readability of formalised mathematics remains a big problem. Much progress has been made on better notations for proofs, building on the structured proofs of the Mizar system
> (http://www.mizar.org). Mathematical formulas themselves are much more readable than before, but the need for an absolutely unambiguous syntax precludes innumerable well-known notational conventions.

There is a relevant meeting here:
https://www.fields.utoronto.ca/programs/scientific/15-16/semantic/
Once again, I am offering up some ideas as a highly interested
observer. I like to separate out the problem of the formal
representation of mathematical statements from the much more ambitious
formal verification and formal representation of mathematical proofs.
Even for this much more limited goal, there are some very serious
issues that need to be handled creatively.

In my humble opinion, a lot of mathematical notation needs to be
simultaneously overhauled and improved in order to make major progress
in the formal representation of mathematical statements.
>
> THE MATHEMATICIANS WANT FREE LOGIC.
>
> A number of approaches to undefined terms have been tried, but as a rule they have not been popular with users. A very lightweight approach will have to be invented.

Here I am aiming at the vast majority of "users" in the mathematical
community who don't want to get involved in formalizing mathematics at
all! My conjecture is that for these total non-users, one of the major
pieces of the puzzle to get them comfortable is a very friendly free
logic discipline.
>
> I WOULD LIKE TO SUGGEST A RELATED IDEA THAT SOUNDS VERY BOLD. WHY NOT TRY TO DECIDE ABSOLUTELY ANY INTERESTING CLASS OF MATHEMATICAL STATEMENTS, EVEN IF IT IS WELL KNOWN TO BE ALGORITHMICALLY UNDECIDABLE? HERE BY "DECIDE", I SIMPLY MEAN: PARAMETERIZE THE STATEMENTS IN QUESTION, AND START WITH VERY SMALL CHOICES OF PARAMETERS, CREATING MODEST SIZED FINITE SETS OF TARGET STATEMENTS. DEVELOP TOOLS TO HANDLE THESE EFFICIENTLY, AND THEN SLOWLY RAISE THE PARAMETERS, DEVELOPING MORE TOOLS, ETC.
>
> This has been the approach to automation in Isabelle for 25 years, based on the common student observation that many problems can be solved simply by doing all obvious steps, combined
> with a small amount of trial and error. Isabelle’s automation can be supplied with theorems along with hints as to whether they should be regarded as obvious or non-obvious (the latter allowing
> search), and how they should be used. Compared with decision procedures, this approach has the great merit that it is almost always applicable and almost always makes progress.

I am glad that this has been the usual approach in Isabelle - at least
under one interpretation of what I am suggesting.

What I would like to see is perhaps more in the way of science. E.g.,
I would like to see, say, 100 hard-nosed, cleanly stated, fundamentally
interesting, large finite decision problems of obvious general purpose
significance (for formal verification), where the theorem is that
"there is a decision procedure that works for all problem instances
with at most such and such resources under a fundamental computational
model".

We already have a fair amount of pure science like this for decision
procedures with infinitely many cases, in the usual sense, in math
logic, starting with Turing. But this is not what I am talking about,
of course.
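
Just to fix ideas about what "start with very small parameters and
slowly raise them" might look like in the crudest terms, here is a toy
sketch in Python. It is purely illustrative and not taken from any
existing system: a two-parameter family of Diophantine-style questions
(standing in for a genuinely undecidable class) is attacked instance by
instance with two cheap tools, and the instances left undecided are the
ones waiting for better tools or larger bounds.

# Toy sketch only: decide "does x^2 - a*y^2 = b have a solution in
# nonnegative integers?" for small (a, b), one instance at a time.
def decide(a, b, search_bound):
    """True/False if settled by the current tools, None if not yet."""
    # Tool 1: bounded search for a witness.
    for x in range(search_bound):
        for y in range(search_bound):
            if x * x - a * y * y == b:
                return True
    # Tool 2: a modular obstruction; if the congruence has no solution
    # mod m, the equation has no integer solution at all.
    for m in (3, 4, 5, 8, 9):
        if all((x * x - a * y * y - b) % m != 0
               for x in range(m) for y in range(m)):
            return False
    return None  # undecided at this level; raise the bound or add tools

undecided = [(a, b) for a in range(1, 6) for b in range(-5, 6)
             if decide(a, b, 50) is None]
print(len(undecided), "small instances still open for better tools")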
>
> WHAT DOES IT MEAN TO SAY THAT WE HAVE GIVEN A CORRECT STATEMENT OF A THEOREM?
>
> That’s an excellent question, and my main worry when formalising anything. The system can check the proof, but it can’t check the statement, apart from types, which give only a crude well-formedness test.

I wonder if an extreme modular discipline would be interesting here.
In an agreed-upon formal language for mathematical propositions, there
is a notion of tiny component. One can ask that all formalizations of
mathematical propositions consist of a hierarchy of tiny components,
each separated from the others. What this disallows is writing down,
in one step, anything of any significant complexity. It must be broken down. I
wonder how far this can or should be pushed in formulating
mathematical statements. All tiny components must be instantly
readable.

Years ago, I formulated projects to the effect of actually taking
complex mathematical notions and casting them in such a granular
hierarchical way, according to what mathematicians would be normally
most comfortable with, and actually examining the hierarchical
structures that result. I called these "concept trees". I conjectured
that the shapes of the naturally constructed "concept trees" in the
various areas and kinds of mathematics would be quite different.
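
Purely to illustrate what I mean, here is a small sketch in Python; the
class name and the example decomposition are my own illustrative
choices, not an existing format. The point is that every node is a
tiny, instantly readable component, and that one can then compute crude
shape statistics (depth, size) of the resulting trees and compare them
across areas of mathematics.

# Illustrative only: a "concept tree" whose nodes are tiny components,
# with complex notions arising only by composition.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Concept:
    name: str                # short label for the component
    gloss: str               # one-line reading, kept deliberately tiny
    parts: List["Concept"] = field(default_factory=list)

    def depth(self) -> int:
        return 1 + max((p.depth() for p in self.parts), default=0)

    def size(self) -> int:
        return 1 + sum(p.size() for p in self.parts)

# "f is continuous at x", broken into tiny components
close_outputs = Concept("close outputs", "|f(y) - f(x)| < eps")
close_inputs  = Concept("close inputs", "|y - x| < delta")
eps_delta     = Concept("eps-delta clause",
                        "for all eps > 0 there is delta > 0 such that ...",
                        [close_inputs, close_outputs])
continuity_at = Concept("continuity of f at x",
                        "the eps-delta clause holds for f at x",
                        [eps_delta])

print(continuity_at.depth(), continuity_at.size())  # crude shape data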
>
> HOW CAN WE DESIGN THE ULTIMATE VERIFIER IN ORDER TO BE BEST SUBJECT TO VERIFICATION?
>
> In recent years, there has been much promising research in the use of proof assistants (especially HOL Light and HOL4) to verify their own kernels, and even to verify their translation into machine code. This is as high a standard as one could wish for. On the other hand, if we are demanding that these systems should be verified, it seems perverse to trust them for their own verification. Therefore we need the proofs to be human-legible as well. They aren’t at all just now. But as somebody who uses such systems almost daily, I’d like to remark that confidence in such a system can be confirmed by its stubborn refusal to prove something in the presence of even the tiniest error in the set up.
>
I am very comfortable with this perspective. I want to emphasize that
my strong interest in "certainty" and the like is principally
foundational and philosophical, and not practical. Of course, if your
community actually does certify something that later turns out to be
flawed, and top researchers were involved, then that is a whole new
story. I assume that that hasn't happened, at least for a long time?
What can we learn from when this did happen (if ever)?

Harvey Friedman

