[SMT-COMP] [SMT-LIB] An alternative to SMT-COMP?

Mon Jan 7 14:01:10 EST 2013

I'm forwarding this message from Pascal on the discussion about SMT-COMP.

---------- Forwarded message ----------
From: Pascal Fontaine <Pascal.Fontaine at inria.fr>
To: smt-lib at cs.nyu.edu, smt-comp at cs.nyu.edu
Cc:
Date: Mon, 31 Dec 2012 10:10:53 +0100
Subject: Re: [SMT-LIB] An alternative to SMT-COMP?
Dear all,

sorry for the long delay.

Having a festive event like SMT-COMP (with T-shirts and live screen) is a
nice thing: it brings together people around the screen and generate
discussions.  Attesting participation can be good when applying for
projects: it gives a hint to reviewers that the software reaches the level
of maturity required for the competition.  Participating requires some work
(just like releasing a new stable version), and that is why we did not
participate last year.  This work is unavoidable, and necessary from time
to time.  SMT-COMP is not to blame for the fact we did not participate, but
rather ourselves (I apologize).  Indeed, having more time between
successive SMT-COMPs, would be more convenient.

I think that the lost of interest in the SMT-COMP may be attributed to
predictable results.  Leaving more time between competitions may help.
 Also, running the competitions only on the categories for which solver
developers claim improvement would allow to concentrate on the categories
where some changes would occur; but in practice I admit this would be
difficult to do.  We need more and better benchmarks (and gathering new and
good benchmarks is a very difficult task): this would definitely increase
the suspense.  The numerous practical applications (for example in my
direct environment: Rodin, TLAPS) of SMT could provide many new benchmarks;
perhaps people are afraid to fill the SMT-LIB with formulas that are
redundant or trivial.  Also non-disclosure agreements may be a problem.

A final point about the competition, I think that there may be a way to
present the results such that everybody gets some credit, and such that it
does not put too much pressure on the everlasting winner.  E.g., maybe
highlight for each solver the number of files it was first to solve.  Just
presenting a total order per category is not very satisfactory.  It would
be nice also to highlight the solvers that solve the highest number of
benchmarks, regardless of the categories.

Now about the SMT-EVAL: it is a good thing for users to have a view of the
state of the art.  And SMT-COMP only partly helps. The wiki page
http://en.wikipedia.org/wiki/**Satisfiability_Modulo_Theories<http://en.wikipedia.org/wiki/Satisfiability_Modulo_Theories>already
gives some summary on the capabilities of the various SMT solvers,
and enriching this with efficiency information would be valuable.  For
SMT-EVAL, it would be possible to check the solvers against all formulas in
the SMT-LIB.  It would be nice if the SMT-EVAL committee cooperates with
SMT developers, i.e. advertise the "snapshot" date, and interact with
developers if needed, e.g. in case of strange results or difficulties to
compile.  Cesare highlighted the fact that it would be good for the
SMT-EVAL committee not to contain any developer, for independence; my
feeling is that one SMT developer in the committee would perhaps ease
interaction between the committee and the developers.

Best Regards, and Happy New Year,

  Pascal

On 11/21/2012 09:48 AM, Tinelli, Cesare wrote:

> Dear all,
>
> As a follow up to my previous message about the future of SMT-COMP, there
> have been a few suggestions recently to have a different kind of event
> altogether: a comparative evaluation, let's call it SMT-EVAL, instead of a
> competition.
>
> This would be a non-live effort aimed at comparing solvers more thoroughly
> by running them on a much larger set of benchmarks and exploring several
> features and options for each solver, with the goal of getting a broader
> and more accurate picture of the state of the art.
> SMT-EVAL would perhaps not require official solver submissions and focus
> instead on publicly available solvers.
>
> The evaluation would clearly be a major endeavor and require a lot of work
> and a committed team of people. On the positive side, it would span a
> longer period of time, not have the sort of pressure and stress associated
> with a live competition, and for being more research oriented, could more
> easily lead to publications. Last but not least, the team in charge of it
> would benefit from the use of the StarExec execution service to run the
> experiments (seehttp://www.starexec.org/**starexec/public/about.jsp<http://www.starexec.org/starexec/public/about.jsp>).
> StarExec is going live soon and will be initially available only to a
> couple of logic solver communities, with SMT being one of them.
>
> The evaluation could be timed to end before SMT'13, which is expected to
> be in conjunction with SAT next year, so that its results could be
> presented there in lieu of the competition results.
>
> David Cok would be willing to be part of the evaluation team if SMT-EVAL
> replaced SMT-COMP next year. Aaron Stump has expressed a strong interest in
> being in the team too, both for its research aspects and because it would
> be a great stress test for StarExec. They both think that at least another
> person would be needed.
> I would add that it might be good if all the members of the team were not
> solver developers, to avoid any possible conflicts of interest.
>
> What do you think of the idea of SMT-EVAL as a replacement for SMT-COMP in
> 2013?
>
> Cheers,
>
>
> Cesare
>
>
>
>
>
>
>
>
>
>
> ______________________________**_________________
> SMT-LIB mailing list
> SMT-LIB at cs.nyu.edu
> http://www.cs.nyu.edu/mailman/**listinfo/smt-lib<http://www.cs.nyu.edu/mailman/listinfo/smt-lib>
>