[FOM] Formally verifying "checkers is a draw"

Timothy Y. Chow tchow at alum.mit.edu
Sat Aug 2 20:31:01 EDT 2014

```On Sun, 3 Aug 2014, Josef Urban wrote:
> Both the Four Color theorem and Flyspeck involved at some point
> enumeration of a large number of cases. If I understand correctly, the
> technique used in both cases was (essentially) to verify the
> case-enumerating algorithms and run them, instead of verifying the
> certificates. It would be interesting to compare the orders of magnitude
> dealt with with in those cases with Schaeffer's.

Thanks for the pointers.  I found this relevant paper:

http://arxiv.org/abs/1301.1702

The abstract says, in part, "The Flyspeck project includes about 1000
nonlinear inequalities. We successfully tested our method on more than 100
Flyspeck inequalities and estimated that the formal verification procedure
is about 3000 times slower than an informal verification method
implemented in C++."

As for the four-colour theorem, the Robertson et al. proof involved only
633 configurations, so I assume that the Coq proof involved a similar
number of configurations.

For comparison, the Schaeffer et al. paper says, "The stored proof tree is
only 10^7 positions. Saving the entire proof tree, from the start of the
game so that every line ends in an endgame database position, would
require many tens of terabytes, resources that were not available.
Instead, only the top of the proof tree, the information maintained by the
manager, is stored on disk.  When a user queries the proof, if the end of
a line of play in the proof is reached, then the solver is used to
continue the line into the databases. This substantially reduces the
storage needs, at the cost of recomputing (roughly 2 min per search)."
Later on the paper also says, "How much computation was done in the proof?
Roughly speaking, there are 10^7 positions in the stored proof tree, each
representing a search of 10^7 positions (relatively small because of the
extensive disk operations). Hence, 10^14 is a good ballpark estimate of
the forward search effort."

I hesitate to draw firm conclusions without understanding the details
better, but it sounds to me like the checkers case is many orders of
magnitude larger than the four-color case or the Flyspeck case (even
granting that the Flyspeck project involves additional tedious
computations beyond the nonlinear equations).  It sounds like one might
either have to write down 10^14 cases, or write down a database of 10^7
cases and deal with a lot of computation---perhaps 10^7 * (2 min) * 3000?

Tim
```