[SMT-COMP] Question about benchmarks with unknown status

Thu May 29 14:02:23 EDT 2008

Hi Alberto,

On Thu, May 29, 2008 at 9:19 AM, Alberto Griggio
<alberto.griggio at disi.unitn.it> wrote:
>> Currently, in QF_LIA/nec-smt/large, we know 888 to be SAT, 1117 to be
>> UNSAT, and 376 remain unknown.  In QF_IDL/schedulingIDL, we have 247
>> SAT, 31 UNSAT, and 2 remain unknown.  We may be able to determine a
>> few more in the next few days.
>
> I see. What is exactly the procedure to use for determining the status?
> Are you simply running one (or more) solver(s) with a very high timeout?
> Or is something more needed? (e.g. a model or an unsat core)

Our procedure has been to run solvers available to us, from previous
competitions, and look for agreement of all solvers that can solve a
benchmark with a very high timeout.  If the assigned status is
questionable, for any reason, it remains unknown in the absence of
stronger evidence (e.g., if this assigned status does not match with
our expectations given the source of the benchmarks).  In no case do
we feel it is enough for one solver to solve a benchmark.

Of course we prefer a proof of unsatisfiability or a model as
evidence, and for assigning additional statuses at this time, we will
probably require it.

Naturally any challenge from the community of the status of a
benchmark can be handled on a case-by-case basis.

Thanks,
Morgan (also for the other organizers)
-- 
Morgan Deters
mdeters at gmail.com