[SMT-COMP] SMT-COMP 2019: Call for Comments, Benchmarks, Solvers

Tjark Weber tjark.weber at it.uu.se
Thu Jan 31 16:29:53 EST 2019


Giles,

I agree with much of what you wrote. Just a few brief comments:

On Thu, 2019-01-31 at 13:39 +0000, Giles Reger wrote:

> Aina pointed out that part of our reasoning was “to make results less
> predictable” but I would perhaps extend this and make it “to make
> results less predictable and more meaningful”. I would argue that the
> benchmarks we are removing are not useful in distinguishing solvers,

Note that this argument applies also to benchmarks that are "too hard."
Running a hard benchmark only to find that every solver timed out is,
in a sense, not very useful either. (In fact, it is worse: it probably
takes as long as running hundreds of trivial benchmarks.)

> they just make the gap between solvers (in terms of % of problems
> solved) smaller e.g. if 90% of benchmarks are ‘uninteresting’ then the
> gap between solvers is about an order of magnitude smaller than it
> would be if those benchmarks were removed. Solving an extra 50
> problems
> out of a set of 500 is a more clear success than out of a set of
> 50000.

True, but it is not immediately clear to me that this gives a more
accurate picture. Of course, absolute numbers mean little without
context, but I feel vaguely reminded of
https://en.wikipedia.org/wiki/Misleading_graph#Truncated_graph

Perhaps it would be useful to understand why there are relatively many
easy benchmarks in SMT-LIB, and to what extent they actually appear in
applications.

Best,
Tjark









När du har kontakt med oss på Uppsala universitet med e-post så innebär det att vi behandlar dina personuppgifter. För att läsa mer om hur vi gör det kan du läsa här: http://www.uu.se/om-uu/dataskydd-personuppgifter/

E-mailing Uppsala University means that we will process your personal data. For more information on how this is performed, please read here: http://www.uu.se/en/about-uu/data-protection-policy


More information about the SMT-COMP mailing list