[SMT-LIB] An alternative to SMT-COMP?
Pascal Fontaine
Pascal.Fontaine at inria.fr
Mon Dec 31 04:10:53 EST 2012
Dear all,
sorry for the long delay.
Having a festive event like SMT-COMP (with T-shirts and live screen) is
a nice thing: it brings together people around the screen and generate
discussions. Attesting participation can be good when applying for
projects: it gives a hint to reviewers that the software reaches the
level of maturity required for the competition. Participating requires
some work (just like releasing a new stable version), and that is why we
did not participate last year. This work is unavoidable, and necessary
from time to time. SMT-COMP is not to blame for the fact we did not
participate, but rather ourselves (I apologize). Indeed, having more
time between successive SMT-COMPs, would be more convenient.
I think that the lost of interest in the SMT-COMP may be attributed to
predictable results. Leaving more time between competitions may help.
Also, running the competitions only on the categories for which solver
developers claim improvement would allow to concentrate on the
categories where some changes would occur; but in practice I admit this
would be difficult to do. We need more and better benchmarks (and
gathering new and good benchmarks is a very difficult task): this would
definitely increase the suspense. The numerous practical applications
(for example in my direct environment: Rodin, TLAPS) of SMT could
provide many new benchmarks; perhaps people are afraid to fill the
SMT-LIB with formulas that are redundant or trivial. Also
non-disclosure agreements may be a problem.
A final point about the competition, I think that there may be a way to
present the results such that everybody gets some credit, and such that
it does not put too much pressure on the everlasting winner. E.g.,
maybe highlight for each solver the number of files it was first to
solve. Just presenting a total order per category is not very
satisfactory. It would be nice also to highlight the solvers that solve
the highest number of benchmarks, regardless of the categories.
Now about the SMT-EVAL: it is a good thing for users to have a view of
the state of the art. And SMT-COMP only partly helps. The wiki page
http://en.wikipedia.org/wiki/Satisfiability_Modulo_Theories already
gives some summary on the capabilities of the various SMT solvers, and
enriching this with efficiency information would be valuable. For
SMT-EVAL, it would be possible to check the solvers against all formulas
in the SMT-LIB. It would be nice if the SMT-EVAL committee cooperates
with SMT developers, i.e. advertise the "snapshot" date, and interact
with developers if needed, e.g. in case of strange results or
difficulties to compile. Cesare highlighted the fact that it would be
good for the SMT-EVAL committee not to contain any developer, for
independence; my feeling is that one SMT developer in the committee
would perhaps ease interaction between the committee and the developers.
Best Regards, and Happy New Year,
Pascal
On 11/21/2012 09:48 AM, Tinelli, Cesare wrote:
> Dear all,
>
> As a follow up to my previous message about the future of SMT-COMP, there have been a few suggestions recently to have a different kind of event altogether: a comparative evaluation, let's call it SMT-EVAL, instead of a competition.
>
> This would be a non-live effort aimed at comparing solvers more thoroughly by running them on a much larger set of benchmarks and exploring several features and options for each solver, with the goal of getting a broader and more accurate picture of the state of the art.
> SMT-EVAL would perhaps not require official solver submissions and focus instead on publicly available solvers.
>
> The evaluation would clearly be a major endeavor and require a lot of work and a committed team of people. On the positive side, it would span a longer period of time, not have the sort of pressure and stress associated with a live competition, and for being more research oriented, could more easily lead to publications. Last but not least, the team in charge of it would benefit from the use of the StarExec execution service to run the experiments (seehttp://www.starexec.org/starexec/public/about.jsp). StarExec is going live soon and will be initially available only to a couple of logic solver communities, with SMT being one of them.
>
> The evaluation could be timed to end before SMT'13, which is expected to be in conjunction with SAT next year, so that its results could be presented there in lieu of the competition results.
>
> David Cok would be willing to be part of the evaluation team if SMT-EVAL replaced SMT-COMP next year. Aaron Stump has expressed a strong interest in being in the team too, both for its research aspects and because it would be a great stress test for StarExec. They both think that at least another person would be needed.
> I would add that it might be good if all the members of the team were not solver developers, to avoid any possible conflicts of interest.
>
> What do you think of the idea of SMT-EVAL as a replacement for SMT-COMP in 2013?
>
> Cheers,
>
>
> Cesare
>
>
>
>
>
>
>
>
>
>
>
> _______________________________________________
> SMT-LIB mailing list
> SMT-LIB at cs.nyu.edu
> http://www.cs.nyu.edu/mailman/listinfo/smt-lib
More information about the SMT-LIB
mailing list