[SMT-COMP] IMPORTANT: QF_ABV at SMT-COMP'11

Sun Jul 17 14:14:29 EDT 2011

Hi Alberto,

I believe the winner will not change. Boolector is clearly the best solver for QF_ABV.
I talked with Christoph and we are ok with the current results.

Cheers,
Leo

Sent from my Windows Phone

-----Original Message-----
From: Alberto Griggio
Sent: Sunday, July 17, 2011 10:39 AM
To: smt-comp at cs.nyu.edu
Cc: Morgan Deters; Roberto Bruttomesso; Leonardo de Moura; Christoph Wintersteiger; Armin Biere; florian at informatik.uni-bremen.de; jp at informatik.uni-bremen.de; elenav at informatik.uni-bremen.de; Clark Barrett
Subject: IMPORTANT: QF_ABV at SMT-COMP'11

Dear list and SMT-COMP participants,

as you probably know, SMT-COMP 2011 is currently running, and as usual
it can be followed "live" from the SMT-EXEC webpage. As every year, we
are having surprises in the results, with solvers performing better or
worse than what their authors expected. This is generally good news,
otherwise there would be no point in running the competition :-)

However, we have just realized that there is an issue with the
QF_ABV category. This category was introduced last year, to refine the
QF_AUFBV classification that was used before. However, this
re-classification did not affect the actual logic tag in the benchmarks,
which was still QF_AUFBV. In fact, they are still tagged as QF_AUFBV in
the SMT-LIB (see http://smtexec.org/exec/smtlib-portal-benchmarks.php).
This year, in the process of updating the metadata for benchmark
selection in SMT-COMP, we have accidentally updated such logic tag to
QF_ABV, so that the actual benchmark files used for the competition
contain a "(set-logic QF_ABV)" command rather than
"(set-logic QF_AUFBV)". This change was unintentional, and happened
*after* the solvers had already been submitted. Although the formulas
in the benchmark itself were not touched, this change could cause some
problems to some solvers, e.g. if they use the logic tag to select which
configuration to activate. This is exactly what happened with at least
two of the competitors, namely Z3 and MathSAT. For instance, MathSAT
does recognize the "QF_AUFBV" logic for selecting its
configuration, but not "QF_ABV". Fortunately, this had no impact on
correctness, but it has an impact on efficiency.
As an example, in attachment you find the scrambled version of the
"platania/bellford/bf13.c.smt2" benchmark that was used in the
competition, and the same benchmark with the logic set to QF_AUFBV. Both
Z3 and MathSAT timed out during the competition, and both can solve the
problem in less than one second if the logic is set to QF_AUFBV.

For such reason, we think the competition on QF_ABV should rerun, with
the logic of the benchmarks set to QF_AUFBV like it was before the
deadline. This is definitely the case for Z3, which would otherwise be
penalized without any reason. The situation could be a bit more complex
for MathSAT, since it is a submission from one of the organizers.
Therefore, if there is anybody who feels that rerunning also MathSAT
would be not be ok, we are willing to disqualify the solver for QF_ABV,
and rerun it hors-concours for that category.

What do you think about this? Does anybody see a problem with the above
solution? And more importantly, is any other solver affected by this
change in the logic of QF_ABV benchmarks?

Please let us know ASAP. Should we receive no answer before the end of
the competition, we will assume that nobody disagrees and go ahead.

Sorry again for the inconvenience,

Alberto (also for Morgan and Roberto)