Privacy Breaches in Privacy-Preserving Data Mining

Johannes Gehrke
Cornell University

Friday, November 19, 2004 11:30 A.M.
Room 1302 Warren Weaver Hall
251 Mercer Street
New York, NY 10012-1185

Dennis Shasha, (212) 998-3086


The exponential growth in the amount of digital data has resulted in the creation of databases of unprecedented scale. Simultaneously, concerns about the privacy of personal data have emerged globally. Data mining, with its promise to efficiently discover valuable patterns in large databases, has recently come under attack because of these privacy concerns. Can we develop accurate data mining models without access to precise information about individuals? I will describe an approach to privacy-preserving data mining based on randomization, including formal models of privacy and algorithms for enforcing privacy.

This is joint work with Rakesh Agrawal, Alexandre Evfimievski, and Ramakrishnan Srikant.

