Jonathan Ullman

Fingerprinting Codes and the Price of Approximate Differential Privacy


We show new lower bounds on the sample complexity of (eps, delta)-differentially
 private algorithms that accurately answer large sets of counting queries. 
A counting query on a database D in ({0,1}^d)^n has the form 
"What fraction of the individual records in the database satisfy the property q?" 
We show that in order to answer an arbitrary set of >> nd counting queries, Q, 
on D to within error +/- alpha it is necessary that 
n > Omega~(d^{1/2} log|Q| / alpha^2 eps). This bound is optimal up to poly-logarithmic 
factors, as demonstrated by the Private Multiplicative Weights algorithm 
of Hardt and Rothblum (FOCS'10). In particular, our lower bound is the first 
to show that the sample complexity required for accuracy and 
(eps, delta)-differential privacy is asymptotically larger than what is required
merely for accuracy, which is O(log|Q| / alpha^2). In addition, we show that our 
lower bound holds for the specific case of k-way marginals 
(where |Q|~(2d)^k) when alpha is a constant.

Our results rely on the existence of short fingerprinting codes (Boneh-Shaw, CRYPTO'95), 
which we show are closely connected to the sample complexity of 
differentially private data release. We also give a new composition techniques 
for combining certain types of sample complexity lower bounds into stronger lower bounds.

Joint work with Mark Bun and Salil Vadhan.