
Differential Privacy

Presentation Transcript


  1. 18739A: Foundations of Security and Privacy. Differential Privacy. Anupam Datta, Fall 2009

  2. Sanitization of Databases • Diagram: Real Database (RDB) → sanitization (add noise, delete names, etc.) → Sanitized Database (SDB) • Examples: health records, census data • Goals: protect privacy while still providing useful information (utility)

  3. Recall Dalenius’ Desideratum [Dalenius 1977]: anything that can be learned about a respondent from the statistical database should be learnable without access to the database

  4. Impossibility Result [Dwork, Naor 2006] • Privacy: For some definition of “privacy breach,” ∀ distributions on databases, ∀ adversaries A, ∃ A′ such that Pr(A(San(DB)) = breach) − Pr(A′() = breach) ≤ ε • Result: For any reasonable notion of “breach,” if San(DB) contains any information about DB, then some adversary breaks this definition • Example • Terry Gross is two inches shorter than the average Lithuanian woman • DB allows computing the average height of a Lithuanian woman • This DB breaks Terry Gross’s privacy according to this definition… even if her record is not in the database!

  5. (Very Informal) Proof Sketch • Suppose DB is uniformly random • Mutual information I(DB; San(DB)) > 0 • “Breach” is predicting a predicate g(DB) • Adversary knows r and H(r; San(DB)) ⊕ g(DB) • H is a suitable hash function, r = H(DB) • By itself, this auxiliary information does not leak anything about DB • Together with San(DB), it reveals g(DB)
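A toy Python sketch of this argument (illustrative only: SHA-256 stands in for the “suitable hash function,” the database is a random bitstring, and r is drawn at random here rather than derived as H(DB)):

```python
import hashlib
import secrets

def H(*parts: bytes) -> int:
    """Hash the inputs down to a single bit."""
    h = hashlib.sha256()
    for p in parts:
        h.update(p)
    return h.digest()[0] & 1

db = secrets.token_bytes(32)    # uniformly random database
san = db[:4]                    # sanitized release with I(DB; San(DB)) > 0
g = H(b"predicate", db)         # the secret predicate g(DB)

r = secrets.token_bytes(16)
aux = H(r, san) ^ g             # auxiliary info: H(r, San(DB)) XOR g(DB)

# aux alone is an (essentially) uniform bit -- it leaks nothing about DB.
# Combined with San(DB), it reveals the predicate exactly:
assert H(r, san) ^ aux == g
```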

  6. Relaxed Semantic Security for Databases? • Example • Terry Gross is two inches shorter than the average Lithuanian woman • DB allows computing the average height of a Lithuanian woman • This DB breaks Terry Gross’s privacy according to this definition… even if her record is not in the database! • Suggests a new notion of privacy: the risk incurred by joining the database

  7. Differential Privacy (informal) • If there is already some risk of revealing a secret of C by combining auxiliary information with something learned from DB, then that risk is still there, but it is not increased by C’s participation in the database • Output is similar whether any single individual’s record is included in the database or not • C is no worse off because her record is included in the computation

  8. Differential Privacy [Dwork et al 2006] A randomized sanitization function κ has ε-differential privacy if, for all data sets D1 and D2 differing in at most one element and all subsets S of the range of κ: Pr[κ(D1) ∈ S] ≤ e^ε · Pr[κ(D2) ∈ S]
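A minimal sketch of the definition in action, using single-bit randomized response as the sanitizer κ (an illustrative mechanism, not one from these slides):

```python
import math
import random

def randomized_response(bit: int, eps: float) -> int:
    """Report the true bit with probability e^eps/(1+e^eps), else flip it."""
    keep = math.exp(eps) / (1 + math.exp(eps))
    return bit if random.random() < keep else 1 - bit

eps = math.log(3)                              # so e^eps = 3
report = randomized_response(1, eps)           # one sanitized answer

p_keep = math.exp(eps) / (1 + math.exp(eps))   # Pr[output = input]
p_flip = 1 - p_keep                            # Pr[output = flipped input]

# For databases D1, D2 differing in one bit, and for every output set S,
# Pr[kappa(D1) in S] / Pr[kappa(D2) in S] is at most p_keep / p_flip = e^eps:
assert p_keep / p_flip <= math.exp(eps) + 1e-9
```

Here the worst-case likelihood ratio between neighboring inputs is exactly e^ε, so the definition holds with equality.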

  9. Achieving Differential Privacy • How much noise should be added? • Intuition: f(D) can be released accurately when f is insensitive to individual entries x1, …, xn (the more sensitive f is, the higher the noise that must be added) • Diagram: the user asks the curator “Tell me f(D)”; the curator holds database D = (x1, …, xn) and returns f(D) + noise
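A small numeric sketch of this intuition (the salary figures are invented for illustration):

```python
# Why sensitive functions need more noise: compare count vs. max
# when one individual's record is added.
salaries_D1 = [50_000, 60_000, 55_000]
salaries_D2 = salaries_D1 + [1_000_000]   # D2 = D1 plus one individual

count_change = abs(len(salaries_D2) - len(salaries_D1))   # 1
max_change = abs(max(salaries_D2) - max(salaries_D1))     # 940_000

# count barely moves when one record changes, so a little noise hides the
# individual; max can swing enormously, so releasing it accurately would
# require noise large enough to destroy utility.
print(count_change, max_change)
```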

  10. Sensitivity of a Function • Definition: Δf = max ‖f(D1) − f(D2)‖₁, taken over all data sets D1, D2 differing in at most one element • Examples: count → 1, histogram → 1 • Note: Δf depends only on f, not on the actual database
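A quick check of the two examples on one pair of neighboring databases (illustrative records; the true Δf is a maximum over all such pairs):

```python
from collections import Counter

D1 = ["flu", "cold", "flu"]
D2 = ["flu", "cold"]          # D1 with one record removed

# count: |f(D1) - f(D2)| = 1
assert abs(len(D1) - len(D2)) == 1

# histogram: the L1 distance between the two histograms is also 1,
# since one record moves exactly one bucket count by one.
h1, h2 = Counter(D1), Counter(D2)
assert sum(abs(h1[k] - h2[k]) for k in set(h1) | set(h2)) == 1
```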

  11. Achieving Differential Privacy • Theorem [Dwork et al 2006]: if noise drawn from the Laplace distribution Lap(Δf/ε) is added to the true answer, i.e. κ(D) = f(D) + Lap(Δf/ε), then κ satisfies ε-differential privacy
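A minimal sketch of this mechanism using only the standard library (Python’s random module has no Laplace sampler, so the draw uses the fact that the difference of two i.i.d. exponentials is Laplace):

```python
import random

def laplace_noise(scale: float) -> float:
    # Difference of two i.i.d. Exponential(1/scale) draws is Laplace(0, scale).
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def laplace_mechanism(true_answer: float, sensitivity: float, eps: float) -> float:
    """Release f(D) + Lap(sensitivity/eps)."""
    return true_answer + laplace_noise(sensitivity / eps)

# Example: a count query has sensitivity 1; answer it with eps = 0.1.
noisy = laplace_mechanism(true_answer=42, sensitivity=1.0, eps=0.1)
print(noisy)
```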

  12. Proof of Theorem 4
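Assuming Theorem 4 refers to the Laplace-mechanism theorem above, the standard argument bounds the ratio of output densities at any point z:

```latex
\frac{\Pr[\kappa(D_1) = z]}{\Pr[\kappa(D_2) = z]}
  = \frac{\exp(-\varepsilon\,|z - f(D_1)| / \Delta f)}
         {\exp(-\varepsilon\,|z - f(D_2)| / \Delta f)}
  = \exp\!\left(\frac{\varepsilon\,(|z - f(D_2)| - |z - f(D_1)|)}{\Delta f}\right)
  \le \exp\!\left(\frac{\varepsilon\,|f(D_1) - f(D_2)|}{\Delta f}\right)
  \le e^{\varepsilon}
```

The first inequality is the triangle inequality and the second is the definition of Δf; integrating over any subset S of outputs gives Pr[κ(D1) ∈ S] ≤ e^ε · Pr[κ(D2) ∈ S].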

  13. Sensitivity with Laplace Noise
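A short sketch of the resulting noise/utility trade-off (parameter values invented for illustration; for Lap(b), the expected absolute noise equals b = Δf/ε):

```python
# How the Laplace scale grows with sensitivity and shrinks with epsilon.
for sensitivity, eps in [(1, 1.0), (1, 0.1), (100, 0.1)]:
    scale = sensitivity / eps
    print(f"Δf = {sensitivity}, ε = {eps}: Lap scale = E|noise| = {scale}")
```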
