[SMT-LIB] bitvector division

Mon Jun 16 14:36:05 EDT 2008

The following issue was raised on the SMT-COMP mailing list.  After some
discussion with Cesare Tinelli, he encouraged me to move the discussion here so
as to be able to get feedback from the broader community.

On Tue, Jun 3, 2008 at 4:58 PM, Alberto Griggio
<alberto.griggio at disi.unitn.it> wrote:
> Second, on QF_BV there are some instances whose status seems to depend
> on the way division by zero is handled. These are the instances we have
> discovered that seem to display this problem:
>  QF_BV/spear/wget_v1.10.2/src_wget_vc18197.smt:  :status unsat
>  QF_BV/spear/wget_v1.10.2/src_wget_vc18755.smt:  :status unsat
>  QF_BV/spear/wget_v1.10.2/src_wget_vc18756.smt:  :status unsat
> Both z3 1.2 and yices 1.0.12 report "sat" for them, while spear 2.3
> reports unsat for the first two, and times out (1 hour on our machines)
> for the third. According to our experiments, we get different results
> with different semantics of division by zero which, as far as we
> understand, should be undefined, meaning that it should be ok to choose
> whatever semantics is liked. Is our interpretation correct? We got it
> from this passage of
> http://goedel.cs.uiowa.edu/smtlib/theories/BitVectors.smt:
>
>   Note that the semantic interpretation above is underspecified because
>   it does not specify the meaning of (bvudiv s t) or (bvurem s t) in
>   case bv2nat([[t]]) is 0.  Since the semantics of SMT-LIB's underlying
>   logic associates *total* functions to function symbols, we then
>   consider as models of this theory *any* interpretation conforming to
>   the specifications above (and defining bvudiv and bvurem arbitrarily
>   when the second argument evaluates to 0).  Benchmarks using this
>   theory should only include a :status sat or :status unsat attribute
>   if the status is independent of the particular choice of model for
>   the theory.

The problem is not that the definition of the theory is wrong or unclear (at
least in the view of the authors of the theory).  However, there is a real
problem in that checking whether a benchmark's status depends on division by 0
is non-trivial.

A couple of solutions have been proposed:

1.  One is to implement operators like bvsdiv using a special if-then-else
operator, essentially replacing any instance of (bvsdiv x y) with:

(ite (= y 0) (bvsdiv0 x) (fixed-bvsdiv x y))

where fixed-bvsdiv assumes a particular value for division by 0 and bvsdiv0 is
an uninterpreted function.

The advantage of this approach is that a benchmark will be unsatisfiable only
if there are no models that make it true.  However, if a benchmark is
satisfiable, it cannot be determined from this approach whether the
satisfiability depends on the interpretation of division by 0.  In particular,
a benchmark may be satisfiable only if division by 0 is interpreted in a
specific way and unsatisfiable otherwise.

Thus, this solution is insufficient to check whether the status is
*independent* of the choice of how division by 0 is interpreted.

2. Another proposed solution is to fix the value of division by 0 to be
   whatever is "standard" in hardware.  For instance, UINT_MAX for unsigned
   division, and INT_MAX or INT_MIN for signed division depending on whether
   the other operand is positive or negative.

   This solution has the advantage of being simple.  However, it would require
   changing the bitvector theory.  Also, it is not clear that this
   interpretation is appropriate for all applications.

Both of these proposals have advantages and disadvantages.  Unfortunately,
there is no easy fix here.  However, I would like to propose a third
alternative.  I have discussed this with Cesare Tinelli who supports it and I
would be interested in getting additional feedback from the community.

I believe that what we would really like is a technique for determining whether
the status of a benchmark depends on the definition of division by 0.  In some
sense such a benchmark is ill-formed, at least from the point of view of the
conservative definition we chose for SMT-LIB.  I would also argue that in some
applications, if the validity of a formula depends on the definition of
division by 0, the user of the application would want to know about it.

This requires some work but is not as hard as it may seem.  In particular, one
can generate a formula which, if valid, means the original formula does *not*
depend on the definition of division by 0, and if invalid means that (with some
conservative assumptions) the formula does depend on the definition of division
by 0.

The technique is described in the following PDPAR paper and is used within CVC3
to generate so-called "Type-Correctness Conditions (TCCs)":

@inproceedings{BBS+05,
  author    = "Sergey Berezin and Clark Barrett and Igor Shikanian and Marsha Chechik and Arie Gurfinkel and David L. Dill",
  title     = "A Practical Approach to Partial Functions in {CVC L}ite",
  booktitle = "Selected Papers from the Workshops on Disproving
               and the Second International Workshop on Pragmatics
               of Decision Procedures (PDPAR '04)",
  series    = "Electronic Notes in Theoretical Computer Science",
  volume    = "125(3)",
  publisher = "Elsevier",
  editor    = "Wolfgang Ahrendt and Peter Baumgartner and Hans de Nivelle and Silvio Ranise and Cesare Tinelli",
  pages     = "13--23",
  month     = jul,
  year      = 2005,
  url       = "http://www.cs.nyu.edu/~barrett/pubs/BBS+05.ps
}

I have not yet implemented this for the bitvector theory in CVC3, but I plan to
do so.  If people end up agreeing with this approach, then I would encourage
others to implement the check as well to ensure there is agreement on which
benchmarks are acceptable for SMT-LIB.

Please feel free to comment on the proposals so far or to submit additional
proposals for addressing this issue.

-Clark