[SMT-COMP] QF_FP, min and zeros

Mon Apr 24 04:57:59 EDT 2017

Hi,

Le 21/04/2017 à 17:03, Christoph Wintersteiger a écrit :
> I think that's a good idea and this is exactly what those benchmarks have been designed to do. I
> agree that we can't make it mandatory for all participants to run regression tests for all
> logics/theories beforehand, but in the case of FP we can definitely send an extra email to all
> participants making them aware of the latest version of these test cases, just to make sure they
> have a "reference" set to look at when they are unsure.
>

We really appreciate them for that, we are actually trying to understand why we doesn't have the 
same result for fma :).

Do you mean an email with the results for 1s timeout?

>
> On Fri, 2017-04-21 at 12:04 +0200, François Bobot wrote:
>> A similar stage can be done in our case, running all the Wintersteiger's unit test with a
>> timeout of 1s and eliminating the provers that give a wrong result. But these benchmarks would
>> not be used for the provers comparison because the real examples are lost among the unit
>> tests.
>>
>> What do you think of that?
>
> Note that the competition comprises 3 tracks and over 40 divisions (logics). To keep this
> manageable, we shouldn't overcomplicate its design.

The design is just to add that a division can define unit tests that will be run before the division 
itself with a very small timeout. So prover developers doesn't have to run them themselves.

>
> At this point, I am inclined to believe that unit tests (i) can serve an important purpose in
> identifying incorrect solver implementations, and (ii) do not necessarily cause "real" examples
> to be "lost": if solving unit tests is easy, state-of-the-art solvers will only differ in how
> many of the real examples they can solve, so the (relative) ranking will depend on those.

They are easy if you support them, but for example some provers doesn't support fma or doesn't 
support not RNE (roundNearestTiesToEven) rounding mode. But since the industrial example doesn't use 
fma or not RNE rounding mode (except toy example with a parametric rounding mode) it is not worth 
adding support for them.

I think it is not a problem to answer unknown to a unit test.

Cheers,

-- 
François