[FOM] Formally verifying "checkers is a draw"

Sat Aug 2 20:31:01 EDT 2014

On Sun, 3 Aug 2014, Josef Urban wrote:
> Both the Four Color theorem and Flyspeck involved at some point 
> enumeration of a large number of cases. If I understand correctly, the 
> technique used in both cases was (essentially) to verify the 
> case-enumerating algorithms and run them, instead of verifying the 
> certificates. It would be interesting to compare the orders of magnitude 
> dealt with in those cases with Schaeffer's.

Thanks for the pointers.  I found this relevant paper:

http://arxiv.org/abs/1301.1702

The abstract says, in part, "The Flyspeck project includes about 1000 
nonlinear inequalities. We successfully tested our method on more than 100 
Flyspeck inequalities and estimated that the formal verification procedure 
is about 3000 times slower than an informal verification method 
implemented in C++."

As for the four-colour theorem, the Robertson et al. proof involved only 
633 configurations, so I assume that the Coq proof involved a similar 
number of configurations.

For comparison, the Schaeffer et al. paper says, "The stored proof tree is 
only 10^7 positions. Saving the entire proof tree, from the start of the 
game so that every line ends in an endgame database position, would 
require many tens of terabytes, resources that were not available. 
Instead, only the top of the proof tree, the information maintained by the 
manager, is stored on disk.  When a user queries the proof, if the end of 
a line of play in the proof is reached, then the solver is used to 
continue the line into the databases. This substantially reduces the 
storage needs, at the cost of recomputing (roughly 2 min per search)." 
Later on the paper also says, "How much computation was done in the proof? 
Roughly speaking, there are 10^7 positions in the stored proof tree, each 
representing a search of 10^7 positions (relatively small because of the 
extensive disk operations). Hence, 10^14 is a good ballpark estimate of 
the forward search effort."

I hesitate to draw firm conclusions without understanding the details 
better, but it sounds to me like the checkers case is many orders of 
magnitude larger than the four-color case or the Flyspeck case (even 
granting that the Flyspeck project involves additional tedious 
computations beyond the nonlinear inequalities).  It sounds like one might 
either have to write down 10^14 cases, or write down a database of 10^7 
cases and deal with a lot of computation---perhaps 10^7 * (2 min) * 3000?
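
To make that last figure concrete, here is a rough sketch of the arithmetic 
in Python.  The constants are the ones quoted above; applying the Flyspeck 
slowdown factor of 3000 to checkers searches is purely an assumption for 
the sake of a ballpark, and the function names are only illustrative.

```python
# Back-of-envelope estimate only; the constants come from the figures quoted above.
STORED_POSITIONS = 10**7    # positions in the stored proof tree (Schaeffer et al.)
MINUTES_PER_SEARCH = 2      # "roughly 2 min per search"
FORMAL_SLOWDOWN = 3000      # Flyspeck figure; ASSUMED to carry over to checkers


def formal_recheck_minutes(positions=STORED_POSITIONS,
                           minutes=MINUTES_PER_SEARCH,
                           slowdown=FORMAL_SLOWDOWN):
    """Total minutes to redo every stored search at the assumed slowdown."""
    return positions * minutes * slowdown


def minutes_to_years(minutes):
    """Convert minutes to years, using 365.25 days per year."""
    return minutes / (60 * 24 * 365.25)


total_minutes = formal_recheck_minutes()
total_years = minutes_to_years(total_minutes)
print(f"{total_minutes:.1e} minutes, about {total_years:,.0f} years of sequential computation")
```

Under these assumptions the total is 6 * 10^10 minutes, on the order of 
10^5 years of sequential computation, so the work would need to be heavily 
parallelized or the slowdown factor greatly reduced.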

Tim