[SMT-COMP] Floating-Point benchmarks for SMT-COMP

Wed Apr 4 15:45:38 EDT 2018

Hi Florian,

Sure, I can add them to the repository. I'm not familiar with this year's competition rules or deadlines though. I think status annotations are still mandatory, but in the past there were also some softer rules like at least N solvers should answer the same.

If they are functional correctness checks, I guess they won't be very interesting in the competition, because they're probably all very easy to solve, but that shouldn't stop us. 

For the ALL track... I guess that would be AUFBVDTLIRAFP now? ��

Let me know when you have the full set!

Cheers,
Christoph

Christoph M. Wintersteiger | Researcher | Tel: +44 1223 479724 | Fax: +44 1223 479999 | research.microsoft.com/~cwinter

Microsoft Research Limited (company number 03369488) is a company registered in England and Wales whose registered office is at 21 Station Road, Cambridge, CB1 2FB 

-----Original Message-----
From: Florian Schanda <florian.schanda at altran.com> 
Sent: 21 March 2018 08:51
To: Christoph Wintersteiger <cwinter at microsoft.com>
Cc: smt-comp at cs.nyu.edu; martin.brain at cs.ox.ac.uk; florian at schanda.org.uk
Subject: Floating-Point benchmarks for SMT-COMP

Hi,

We've got a large set of benchmarks to submit this year ;)

I am not sure this is the correct process to submit; but 2 years ago it was "email Christoph"... Please direct me to the correct procedure if this is wrong.

Attached is the first set (the others I will assemble after Easter since its a bit more tricky). If you're really impatient they are all public anyway on:
   https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fflorianschanda%2Fsmtlib_schanda&data=04%7C01%7Ccwinter%40microsoft.com%7C9b0e425ea5f74058dee108d58f08fb4f%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636572191140720086%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwifQ%3D%3D%7C-1&sdata=nMDwHabYwOo0QbPrlVNYnfrfwDBeEC%2FeZS10ZrQyFwY%3D&reserved=0

This set of 72,925 benchmarks is randomly generated using:
   https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fflorianschanda%2Fpympf&data=04%7C01%7Ccwinter%40microsoft.com%7C9b0e425ea5f74058dee108d58f08fb4f%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636572191140720086%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwifQ%3D%3D%7C-1&sdata=z4gyAf%2FoXvFgJL85FyU5NKsz3qBsWxfBpn76Gw1nJKk%3D&reserved=0

This is not really a performance benchmarks suite, more of a functional 
correctness suite. Every single benchmark should be correctly annotated with 
sat/unsat. It tests every operation the FP theory mentions, including all the 
fun real/float stuff. It has helped us find soundness bugs in all solvers; so 
we feel it is particularly helpful to include. The license of these is GPLv3. 
In terms of validating the correctness of the ground truth; we've checked 
(where applicable/possible) against hardware (a recent x86_64), software (GNU/
MPFR), and software (PyMPF itself). We've also looked at the results of 
MathSAT, Z3, CVC4, and Colibri and we're reasonably happy that all answers 
that differ from the ground truth are bugs in the respective solvers.

I will submit two more benchmark suites: one extracted from real-world 
industrial code; and the other one from the public spark 2014 testsuite. These 
are more tricky as they are not neatly QF_FP or QF_FPBV, these use arrays, 
functions, bitvectors, reals and ints, quantifiers and datatypes as well.

Perhaps there should be an ALL (or SPARK_VC_GENERATION_IS_SPECIAL) track? :)

	Florian