FOM: Computer-verified theorems [was Re: The meaning of truth]
JoeShipman@aol.com
Sun Nov 5 07:42:09 EST 2000
I agree with practically everything Bauer says in his post about the
difficulty of software verification, and I have said similar things in the
past on FOM. My point to Kanovei was that the tiny chance of incredibly bad
luck in the random choice of witnesses to compositeness was dwarfed even by
the tiny chance of "machine error" from simple causes such as hardware
faults and cosmic rays, so that this residual chance was no obstacle to
attaining a confidence level of "scientifically proven" in the truth of a
mathematical theorem. (The confidence level of "mathematically proven" is
still not extremely close to 100%, because of the possibility of human error
in verifying the proof, but it is SUBJECTIVELY a different order of
confidence when you have verified a proof by actually checking it in the
traditional way.)
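To make the comparison concrete: a standard test of the kind at issue is
Miller-Rabin, sketched below in Python (my illustration only, not Kanovei's
test or anyone's production code). Each round draws a random witness
candidate, and a composite n survives all k rounds with probability at most
4^-k.

import random

def is_probable_prime(n, k=40):
    """Miller-Rabin: a composite n survives one random round with
    probability at most 1/4, so it survives all k rounds with
    probability at most 4**-k."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 = d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(k):
        a = random.randrange(2, n - 1)   # random witness candidate
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False                 # a is a witness: n is composite
    return True    # composite slips through with probability <= 4**-k

With k = 40 the failure bound is 4^-40 < 10^-24, already far smaller than
any plausible rate of undetected hardware error over the course of a run.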
Of course the other types of computational error Bauer refers to are much
more serious. I would like to repeat the recommendation I made on FOM on
August 30, 1998 (excerpted below):
**********************************************************************
4CT was "verified" quickly because the algorithm was straightforward enough,
and the input data manageable enough, that independent researchers could
write their own implementations and check the outputs.
But it is widely underappreciated how difficult and dangerous verification
is. For a hilarious horror story, read how Thomas Nicely of Lynchburg
College discovered the flaw in the Pentium chip while trying to calculate
some constants related to the distribution of twin primes ("Enumeration to
1e14 of the twin primes and Brun's constant", Virginia Journal of Science,
Volume 46, Number 3 (Fall 1995), pp. 195-204, available online at
http://www.lynchburg.edu/public/academic/math/nicely/twins/twins.htm).
The following quote from this paper summarizes the lessons Nicely learned:
<<<<<
After this tempest began to settle (mathematicians in general, and this one
in particular, are not accustomed to being besieged by inquiries and visits
from network news, the BBC, the Washington Post, the Wall Street Journal, and
the New York Times, as well as journalists from France, Australia, and
Malaysia), attention returned to the work at hand. With commendable haste,
Intel provided
a replacement chip for the errant Pentium (as well as one for a colleague's
system) and also supplied a 90-MHz Pentium system to help make up for lost
time. Naturally, all remaining calculations were carried out in duplicate,
and the wisdom of this caution was confirmed when other discrepancies
appeared between duplicate runs. Twice the errors were tracked to
intermittently defective memory (SIMM) chips; parity checking had failed to
report either error. On another occasion, a disk subsystem failure generated
a wholesale lot of incorrect, yet plausible data. In the most recent
instance, a soft memory error appears to be the culprit.
Although many may dismiss machine errors such as these as of no practical
consequence (except to mathematicians) or as evidence of the unreliability of
personal computers, I believe there is a more important conclusion to be
drawn. Any significant computation, for which there is no simple check
available, should be performed in duplicate, preferably on two different
systems, using different CPUs, hardware, operating systems, application
software, and algorithms. In practice, it is difficult to obtain this degree
of independence; but certainly, one must be suspicious of the result of any
single, lengthy machine computation, whether it is performed on a personal
computer or the world's finest supercomputer. As we tell our students---check
your work!
>>>>>
The five different levels Nicely recommends duplicating (CPU, hardware,
operating system, application software, and algorithms) all experienced
failures at different stages of the project, and of course only the last of
these five was his fault!
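To illustrate the most accessible of those five levels, the algorithmic one,
here is a hypothetical Python sketch of Nicely-style duplication: the same
twin-prime count computed by two genuinely different algorithms and
cross-checked. (This is my illustration, not Nicely's code, and real
independence would also require different CPUs, hardware, operating systems,
and application software.)

def twin_primes_sieve(limit):
    """Count twin-prime pairs (p, p+2) with p + 2 <= limit,
    via a Sieve of Eratosthenes."""
    is_prime = [False, False] + [True] * (limit - 1)
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    return sum(1 for p in range(2, limit - 1)
               if is_prime[p] and is_prime[p + 2])

def twin_primes_trial(limit):
    """The same count by naive trial division: slow, but an
    algorithmically independent second opinion."""
    def is_prime(n):
        if n < 2:
            return False
        i = 2
        while i * i <= n:
            if n % i == 0:
                return False
            i += 1
        return True
    return sum(1 for p in range(2, limit - 1)
               if is_prime(p) and is_prime(p + 2))

# Cross-check the two independent computations, Nicely-style.
assert twin_primes_sieve(10000) == twin_primes_trial(10000)

If the two counts ever disagree, at least one level of the system has
failed, which is exactly the kind of signal that exposed Nicely's defective
hardware.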
In the absence of rigorous proofs of the correctness of computer systems
(including chip design, programming language, compiler, and operating
system), we cannot have the same type of certainty about the correctness of
a computer-aided proof as we can have about a humanly surveyable proof. But
we can attain "moral" certainty if we follow Nicely's recipe with sufficient
thoroughness (for an important result like 4CT or the Kepler conjecture, I
would demand more than two verifications in which all the components of the
system were independent).
*******************************************************************
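To put a number on why I demand more than two verifications: if one assumes
(heroically) that fully independent runs are each silently wrong with
probability p and fail independently, then k agreeing runs are all wrong
with probability p^k. A back-of-the-envelope sketch, with p = 10^-6 chosen
purely for illustration:

# Illustrative arithmetic only; the independence assumption is the
# very thing that shared chips, compilers, and algorithms can break.
p = 1e-6
for k in (1, 2, 3):
    print(f"{k} independent run(s): undetected-error probability <= {p ** k:.0e}")

Each additional fully independent verification multiplies the residual doubt
by p, which is why the hard part is the independence, not the repetition.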
-- Joe Shipman