[SMT-COMP] SMT-COMP 2019: Call for Comments, Benchmarks, Solvers

Mohamed Iguernlala iguer.auto at gmail.com
Wed Jan 30 06:15:45 EST 2019


Hi,

I agree with Tjark that it would be better to run the solvers on
all the benchmarks.

on the other hand, to limit the impact of the numerous benchmarks
that have become easy over the years, and that almost all the
solvers prove, one could imagine an altered score of each proved
goal. For instance, the new score can be obtained as follows:

new_score = "obtained score" / (2 ^ ("number of competitive solvers that 
proved the goal" - 1))

Best regards,

- Mohamed.

On 30/01/2019 11:39, Tjark Weber wrote:
> Aina,
>
> You made a number of good points, with which I generally agree. Thus,
> just briefly:
>
> On Tue, 2019-01-29 at 09:09 -0800, Aina Niemetz wrote:
>> The main reason for the decision to use a subset rather than the whole
>> set of benchmarks in SMT-LIB is not to increase the time limit. It is to
>> make the results less predictable and to discourage solver developers to
>> overfit to a fixed problem set.
> Re predictability: it is certainly good if multiple solvers receive
> recognition. A good way to achieve this is to use multiple tracks,
> divisions, and special awards. (Your plans to recognize solvers based
> on different timeouts, and based on sat/unsat performance only, are
> great examples.) However, for the results within each division, I would
> encourage you to focus on validity (i.e., is the competition measuring
> the right thing?) and reliability (i.e., would repeated measurements
> produce similar results?). May the best solvers win!
>
> Re overfitting: when you select a random subset of SMT-LIB benchmarks,
> isn't it still the best strategy for a solver to (over-)fit to the set
> of (all) SMT-LIB benchmarks? To discourage overfitting, it would help
> to have a significant proportion of new, previously unknown benchmarks
> each year (but that may not be very practical, and not publishing these
> benchmarks beforehand would of course be a break with tradition).
>
> Best,
> Tjark
>
>
>
>
>
>
>
>
>
> När du har kontakt med oss på Uppsala universitet med e-post så innebär det att vi behandlar dina personuppgifter. För att läsa mer om hur vi gör det kan du läsa här: http://www.uu.se/om-uu/dataskydd-personuppgifter/
>
> E-mailing Uppsala University means that we will process your personal data. For more information on how this is performed, please read here: http://www.uu.se/en/about-uu/data-protection-policy
> _______________________________________________
> SMT-COMP mailing list
> SMT-COMP at cs.nyu.edu
> https://cs.nyu.edu/mailman/listinfo/smt-comp


More information about the SMT-COMP mailing list