[FOM] Formally verifying "checkers is a draw"

Josef Urban josef.urban at gmail.com
Sun Aug 3 18:35:56 EDT 2014


On Aug 3, 2014 8:17 PM, "Timothy Y. Chow" <tchow at alum.mit.edu> wrote:
>
> On Sun, 3 Aug 2014, Josef Urban wrote:
>>
>> Both the Four Color theorem and Flyspeck involved at some point
enumeration of a large number of cases. If I understand correctly, the
technique used in both cases was (essentially) to verify the
case-enumerating algorithms and run them, instead of verifying the
certificates. It would be interesting to compare the orders of magnitude
dealt with with in those cases with Schaeffer's.
>
>
> Thanks for the pointers.  I found this relevant paper:
>
> http://arxiv.org/abs/1301.1702
>
> The abstract says, in part, "The Flyspeck project includes about 1000
nonlinear inequalities. We successfully tested our method on more than 100
Flyspeck inequalities and estimated that the formal verification procedure
is about 3000 times slower than an informal verification method implemented
in C++."

The latest factor for this was taken down to about 800. But the case
enumeration  I meant (done by generating verified ML code) is the "Tame
graph classification" by Nipkow and Bauer . It should be of the order of
10000 cases.

>
> As for the four-colour theorem, the Robertson et al. proof involved only
633 configurations, so I assume that the Coq proof involved a similar
number of configurations.

I think Gonthier distinguishes two orders of magnitude, one about 10^4 (the
main case split?) and one several orders of magnitude more. It should be in
some of his papers. But his Coq code runs under a day I think.

>
> For comparison, the Schaeffer et al. paper says, "The stored proof tree
is only 10^7 positions. Saving the entire proof tree, from the start of the
game so that every line ends in an endgame database position, would require
many tens of terabytes, resources that were not available. Instead, only
the top of the proof tree, the information maintained by the manager, is
stored on disk.  When a user queries the proof, if the end of a line of
play in the proof is reached, then the solver is used to continue the line
into the databases. This substantially reduces the storage needs, at the
cost of recomputing (roughly 2 min per search)." Later on the paper also
says, "How much computation was done in the proof? Roughly speaking, there
are 10^7 positions in the stored proof tree, each representing a search of
10^7 positions (relatively small because of the extensive disk operations).
Hence, 10^14 is a good ballpark estimate of the forward search effort."

10^7 each with 2 minutes search does not sound so bad today - 14 days on a
1000-CPU cluster. But for a comparison, Hales's original (Java?) 1990s code
for the tame enumeration runs under a day I think (this is all in the
papers - I don't have a computer with me).

>
> I hesitate to draw firm conclusions without understanding the details
better, but it sounds to me like the checkers case is many orders of
magnitude larger than the four-color case or the Flyspeck case (even
granting that the Flyspeck project involves additional tedious computations
beyond the nonlinear equations).  It sounds like one might either have to
write down 10^14 cases, or write down a database of 10^7 cases and deal
with a lot of computation---perhaps 10^7 * (2 min) * 3000?

... * 800 (and only relatively straightforward tricks were used to get the
factor this low in HOL Light). But this assumes the verification model when
all the computation is done inside the proof assistant and pushed through
its kernel. The tame enumeration by Nipkow & Bauer is different: it does
not go through the LCF kernel but runs in "correct-by-construction" ML
(exported from the Isabelle definitions and theorems) -  it might be faster
than the original Java.

I agree that it looks bigger CPU-wise than Flyspeck and 4 color, but it
might not be so big for today's clusters if a good verification approach is
used, and the amount of human verification work might be quite a bit lower
than in Flyspeck and perhaps even 4-color. Looks like an interesting
project.

Josef

>
>
> Tim
> _______________________________________________
> FOM mailing list
> FOM at cs.nyu.edu
> http://www.cs.nyu.edu/mailman/listinfo/fom
-------------- next part --------------
An HTML attachment was scrubbed...
URL: </pipermail/fom/attachments/20140804/a1909bbb/attachment.html>


More information about the FOM mailing list