[SMT-LIB] simplify test suite and boogie benchmarks in AUFLIA
Michal Moskal
michal.moskal at gmail.com
Wed May 2 18:35:40 EDT 2007
On 5/2/07, Clark Barrett <barrett at cs.nyu.edu> wrote:
> > I've updated, as requested by Clark. The changes are:
> > - the symbols / and % are not used; uninterpreted divide and modulo
> > operators are used instead
> > - * is only used on constants; otherwise an uninterpreted multiply function is used
>
> That's great. I have to admit that I have somewhat mixed feelings about
> this--ideally we want to capture benchmarks in their "native" form, but at the
> same time, if non-linearity isn't really used, then it's not really a good
> non-linear benchmark either. So I think that forcing these benchmarks to fit
> into AUFLIA is the right thing to do.
Yes... The tools are not limited to linear arithmetic. The fact that
the benchmarks don't use non-linear constructs is probably only
because Simplify doesn't handle them.
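To make the rewrite concrete, here is a minimal sketch in SMT-LIB 1.x benchmark syntax; the names mul, x and y are illustrative placeholders, not taken from the actual benchmarks:

```smt
; Non-linear original (outside AUFLIA):  :assumption (>= (* x y) 0)
; After the rewrite, * on non-constant arguments becomes an
; uninterpreted function symbol declared in :extrafuns:
(benchmark example
  :logic AUFLIA
  :extrafuns ((mul Int Int Int) (x Int) (y Int))
  :assumption (>= (mul x y) 0)
  :formula true
)
```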
>
> > It seems there is only one Boogie benchmark affected by this, namely
> > Search_Search.Reverse_Div_System.Int32.array_notnull-noinfer.smt,
> > otherwise fx7 is still able to prove the benchmarks expected to unsat.
> > It also proves the ones that were previously in the non-linear
> > directory; it seems the non-linearity was not crucial.
>
> So, let me see if I understand:
>
> 1) the benchmarks in the "valid" subdirectory have indeed been verified by you to
> be unsat in their current form.
Yes.
> 2) the benchmarks in the "invalid" subdirectory are expected to be satisfiable
> (are they in fact *known* to be satisfiable?)
I wouldn't say they are known to be satisfiable. They are too big to be
verified by hand, and I know of no tool that could perform such a
verification.
They are intended to be sat, but in the past I've found bugs in both
ESC/Java and Boogie background predicates that made the queries
unsat...
> 3) the single benchmark in the "non-linear" subdirectory is of unknown
> satisfiability (though it is in fact now a linear benchmark--i.e. properly
> in the logic AUFLIA)
Yes.
>
> > As for the simplify test suite benchmarks, there are several that are
> > failing for me. I've put them in *_possibly_sat directories.
>
> So again, to clarify, we expect all the simplify benchmarks to be
> unsatisfiable. However, the ones in the possibly_sat directories have not been
> verified as unsatisfiable after the conversion to linear arithmetic, right?
Yes, that's correct.
>
> > cvc3 -lang smtlib +translate now reports the benchmarks to be in
> > AUFLIA logic. BTW, it seems to translate (distinct ...) into a
> > quadratic number of inequalities.
>
> Ah yes, cvc3 doesn't natively handle the distinct primitive yet. However, this
> brings up another point. The first occurrence of distinct is fine. It labels
> a set of symbolic constants as distinct. The second occurrence, however, is
> stating that a set of integers are distinct. This is implied by the underlying
> theory and as such, this is the sort of assumption we don't want to have to
> include in SMT-LIB. Do you have any objection to removing these instances of
> distinct? I can do it on my end unless you don't mind another round...
I can do that, that's not a problem.
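For reference, the quadratic blow-up comes from expanding distinct into pairwise disequalities; a sketch with illustrative identifiers a, b, c:

```smt
; (distinct a b c) is equivalent to the conjunction of all
; pairwise disequalities:
(and (not (= a b)) (not (= a c)) (not (= b c)))
; n constants yield n*(n-1)/2 disequalities, hence the quadratic growth
```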
>
> > I would like to bring up the problem of triggers in the benchmarks. I
> > was hoping for the triggers to be also present in benchmarks in the
> > competition:
> > - they are there in the original benchmarks,
> > - the SMT standard mentions triggers as a possible use for
> > annotations, it just doesn't define the syntax,
> > - changing any SMT solver that currently infers triggers, so that it
> > can also read the explicit ones, seems like a simple hack to me.
> >
> > I agree it might be controversial to have them introduced that late though...
> >
> > What do you think about adding them to the SMT standard anyway?
>
> For the purpose of the competition, it seems interesting both ways--with and
> without patterns. The first tests how well patterns can be exploited and the
> second tests the ability of tools to infer their own patterns.
So the idea here would be to see how well we can do completely
automatically, without a hint from the user?
In general this is a very good idea. But remember that these queries
have been prepared for Simplify, to be processed as fast as possible.
This means a whole bunch of tricks, hacks and voodoo magic has been
put into the axiomatizations so that the triggers can be selected
correctly. Selecting a different axiomatization for performance
reasons seems like an order of magnitude more complicated task than
merely stating the triggers.
So for me, using these benchmarks without triggers is like driving a
jet on a highway -- it will go faster than a car, but that's not
exactly the point.
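For illustration, an explicit trigger could be attached as an annotation on a quantifier. The :pat attribute shown here is one syntax used by Simplify-derived tools, not something the SMT-LIB standard defines, and f and g are illustrative symbols:

```smt
; Axiom with an explicit trigger: the solver should instantiate the
; quantifier only when a ground term matching (g ?x) appears.
:assumption (forall (?x Int) (= (f (g ?x)) ?x) :pat { (g ?x) })
```

Without the annotation, a solver must heuristically infer a trigger from the body, which is exactly the inference step the explicit form avoids.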
> I think it's too late for this year--for one thing, it seems likely that only
> your tool will be able to parse the patterns meaningfully--we typically want at
> least 2 or 3 solvers in a division to make it part of the competition.
>
> However, I like the idea of making it part of the standard and then trying both
> in next year's competition.
OK!
--
Michał