SMT-LIB: Inital proposal for a common format

Fri Sep 20 12:29:31 EDT 2002

Cesare, Silvio,

The Sparta group at SRC discussed the three proposals that you two made 
in your message of 16 September 2002. Thank you very much for taking the 
initiative on this issue. We at SRC surely will have benchmarks to 
contribute, once the formats and processes are settled. This message 
summarizes our comments about your three proposals. Some of our comments 
raise additional "fine points" on the topics you have raised.  We hope 
these fine points will move the discussion along and not bog the discussion 
down.  Surely you are right to treat major issues first, before getting 
to the details.  But you have made such a good beginning that it may already 
be appropriate to consider some of the major subdecisions.  Finally, at 
the end of this message, we make two proposals of our own.

P1. Your first proposal was to use unsorted FOL-EQ as the underlying logic.
We agree with this proposal. We have four fine points to raise.

P1.F1. The first fine point is that more needs to be said about what 
"unsorted" means. (Is 2+nil = nil+2 or not?  The answer that Simplify 
gives, and that we have always found satisfactory, is that terms that 
intuitively are mistyped, like 2+nil, may appear, but nothing non-trivial 
will be provable about them.  So 2+nil does equal nil+2. (Perhaps more 
than any of our other fine points, this one has the possibility 
of getting us bogged down, so having mentioned it, we move on as fast 
as possible.)

P1.F2. The second fine point is that although we agree with the use of 
an unsorted logic, we propose that function and relation symbols be required 
to be used with consistent argument count throughout any one benchmark 
in the suite. Thus no benchmark will contain anything like P(f(a),f(a,b)). 
(The only use this might have is overloading, where "f" denotes different 
functions depending on the argument count with which it is used.  If some 
benchmark contained overloading in this way, we would propose that the 
overloading be resolved before the benchmark is placed into the suite.) 
On the other hand, there seems to be no reason to require that arities 
be consistent across different benchmarks.  To do so would be an invitation 
to irritation.  Finally on this topic, in the spirit of unsortedness, 
we would propose that the arity of a function or relation symbol be inferred 
from the first use rather than be explicitly declared.

P1.F3. The third fine point is that we consider that FOL (even unsorted 
FOL) segregates formulas from terms: a formula is built in the usual way 
from atomic formulae, which are relation symbols applied to lists of terms, 
while a term is a function symbol applied to a list of terms. The set 
of function symbols is disjoint from the set of relation symbols (in any 
one benchmark). Sometimes this is inconvenient, and Simplify, for example, 
will sometimes consider that a term t used in a position where a formula 
is required is syntactic sugar for the formula (t=@true), which @true 
is a constant whose purpose is to facilitate this desugaring.  We propose 
that any desugaring of this sort necessary for the benchmarks in the suite 
be applied before they are included in the suite.  That is, the benchmarks 
will strictly segregate formulas from terms.

P1.F4. The fourth fine point is that FOL-EQ might or might not be construed 
to include an IF-THEN-ELSE (or ITE or something) operator that constructs 
a term from a formula and two terms.  Several existing provers include 
this feature.  Our prover, Simplify, does not, but nevertheless we would 
propose thatsome such operator be allowed in the conjectures in the 
test suite, since it is undeniably useful. Like P1.F1, the potential for 
getting bogged down in details is great, which is possibly a valid argument 
for leaving out this feature, but our judgement would be to take the risk.

P2. Your second proposal was that the input class of formulas include 
any formula of full FOL-EQ, with or without quantifiers, but that the 
quantifier-free benchmarks be separated from those with quantifiers.

We agree with this proposal.  (After the second proposal you wrote a remark 
about normal forms; you wrote "we are not sure" whether to convert the 
problems into any normal form (such as CNF). We interpret this remark 
as implying that your proposal is to leave the input problems in general 
form rather than impose a normal form, and it is with this interpretation 
that we agree.)

P3. Your third proposal was to specify theories algebraically, rather 
than axiomatically.  We do not agree with this. We propose instead that 
each theory be specified with whichever technique works best for that 
theory.  Probably most will be specified algebraically, but some might 
be specified axiomatically.  Since it will be human beings rather than 
programs that read these specifications, we don't see what is gained by 
ruling out an option that could be attractive in some cases.

So much for our comments on your three proposals.  We also have a two
proposals of our own, which I will label P4 and P5:

P4. The suite will contain invalid as well as valid conjectures, 
appropriately labelled. (Or satisfiable as well as unsatisfiable problems, 
if the conjectures appear in the suite in negated form.)

P5. Each conjecture in the suite will be labelled with the names of the
theories whose symbols appear in the conjecture.  The possible theories
include at least the following two:

Linear arithmetic, with relations <, >, <=, >=, the integer numbers, and
symbols + and -.  Whether multiplication by an integer numeral is allowed
as an abbreviation is TBD.

The theory of arrays, with function symbols for reading and writing elements.
(The names of these symbols is TBD, as is the semantic question of whether
it is or isn't possible for arrays with identical elements to be distinct.)

Sincerely,

Cormac Flanagan,
Rajeev Joshi
Greg Nelson
Shaz Qadeer
Serdar Tasiran

HP Systems Research Center