k-Anonymity: A Model for Protecting Privacy

k-Anonymity: A Model for Protecting Privacy. Latanya Sweeney, Carnegie Mellon University. Intl. Journal on Uncertainty, 2002. Presented by Munawar Hafiz, March 14, 2006. Information Security: Confidentiality, Integrity, Availability.

Presentation Transcript


  1. k-Anonymity: A Model for Protecting Privacy. Latanya Sweeney, Carnegie Mellon University. Intl. Journal on Uncertainty, 2002. Presented by Munawar Hafiz, March 14, 2006.

  2. Information Security (diagram: Confidentiality, Integrity, Availability). Internet and networking technologies have made access to information much easier. Correlation of information compromises privacy. Information security is also related to confidentiality, integrity and availability. • User Preference • Usability • Proof of compliance

  3. Overview of the Presentation • Re-identification of Data • Terminology • Introduction to k-Anonymity • Attacks against k-Anonymity • l-Diversity

  4. Re-identification of Data. [Figure: attributes Disease, Birth Date, Zip, Sex, Name.] 87% of the population in the USA can be uniquely identified by ZIP code, sex and date of birth.
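
The re-identification idea on this slide can be illustrated with a small linkage sketch. Everything below is hypothetical: the two toy tables, the names, and the `link` helper are invented for illustration and are not taken from the presentation. The sketch joins a "de-identified" medical table with a public list on ZIP code, sex and date of birth, and reports the records that match exactly one named person.

```python
# Hypothetical toy data; all names and values are invented for illustration.
medical = [
    {"zip": "53715", "sex": "Male", "dob": "1/21/76", "disease": "flu"},
    {"zip": "53703", "sex": "Female", "dob": "2/28/76", "disease": "asthma"},
]
public_list = [
    {"name": "Alice Example", "zip": "53703", "sex": "Female", "dob": "2/28/76"},
    {"name": "Bob Example", "zip": "53715", "sex": "Male", "dob": "1/21/76"},
    {"name": "Carol Example", "zip": "53710", "sex": "Female", "dob": "4/13/86"},
]

QUASI_IDENTIFIER = ("zip", "sex", "dob")


def link(released, external, qi=QUASI_IDENTIFIER):
    """Join two tables on the quasi-identifier and keep only unique matches."""
    # Index the external table by its quasi-identifier values.
    index = {}
    for person in external:
        index.setdefault(tuple(person[a] for a in qi), []).append(person)

    reidentified = []
    for record in released:
        candidates = index.get(tuple(record[a] for a in qi), [])
        if len(candidates) == 1:  # exactly one candidate: record is re-identified
            reidentified.append((candidates[0]["name"], record["disease"]))
    return reidentified


matches = link(medical, public_list)
```

In this toy example both medical records are linked to a name even though the medical table itself contains no names.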

  5. Terminology • Tuple: a row of data • Attribute: a column, a semantic category, a domain • Inference: belief in a new fact based on some other information • Disclosure: explicit and inferable information about a person • Disclosure Control: an attempt to identify and limit disclosures • Quasi-identifier: a minimal set of attributes in a table that can be joined with external information to re-identify individual records

  6. Terminology (continued) • Frequency Set: the number of tuples for each combination of quasi-identifier values, e.g. SELECT COUNT(*) FROM patients GROUP BY sex, zipcode

  7. k-Anonymity: A relation is said to satisfy the k-Anonymity property if every count in the frequency set is greater than or equal to k. The relation is then called k-Anonymous. In plain English, a row in the table cannot be distinguished from at least k-1 other rows.
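
The frequency set and the k-Anonymity test can be sketched in a few lines of Python. This is a minimal sketch, not the presenter's code: the `patients` rows and the helper names `frequency_set` and `is_k_anonymous` are assumptions made for illustration.

```python
from collections import Counter


def frequency_set(rows, quasi_identifier):
    """Count rows per combination of quasi-identifier values
    (the Python analogue of SELECT COUNT(*) ... GROUP BY)."""
    return Counter(tuple(row[a] for a in quasi_identifier) for row in rows)


def is_k_anonymous(rows, quasi_identifier, k):
    """True if every count in the frequency set is at least k."""
    return all(count >= k for count in frequency_set(rows, quasi_identifier).values())


# Hypothetical, already generalized table (values invented for illustration).
patients = [
    {"sex": "Male", "zipcode": "5371*", "disease": "flu"},
    {"sex": "Male", "zipcode": "5371*", "disease": "rash"},
    {"sex": "Female", "zipcode": "5370*", "disease": "flu"},
    {"sex": "Female", "zipcode": "5370*", "disease": "headache"},
]

counts = frequency_set(patients, ("sex", "zipcode"))
two_anonymous = is_k_anonymous(patients, ("sex", "zipcode"), 2)
three_anonymous = is_k_anonymous(patients, ("sex", "zipcode"), 3)
```

Each group here has two rows, so the toy table passes the test for k = 2 and fails it for k = 3.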

  8. Generalization and Suppression. Generalization: a value is replaced by a less specific, more general value that is faithful to the original. Suppression: each value generalization hierarchy is given a new maximal element atop the old maximal element. Example domains: Z0 = {53715, 53710, 53706, 53703}, Z1 = {5371*, 5370*}, Z2 = {537**}; S0 = {Male, Female}, S1 = {Person}, S1 = {*}; B0 = {1/21/76, 2/28/76, 4/13/86}

  9. Generalization and Suppression (continued). Domain Generalization Relationship, ≤_D: D_i ≤_D D_k. Value Generalization Function, γ: D_i → D_k. Implied Domain Generalization: if D_i ≤_D D_k and D_k ≤_D D_m, then D_i ≤_D D_m. Composite Value Generalization Function, γ+: 5371* = γ(53715), 537** ∈ γ+(53715). Domain Generalization Hierarchy: Z0 = {53715, 53710, 53706, 53703}, Z1 = {5371*, 5370*}, Z2 = {537**}; S0 = {Male, Female}, S1 = {Person}

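  10. Generalization Hierarchy. [Figure: value generalization hierarchies. ZIP: 537** at the top, above 5371* (covering 53715, 53710) and 5370* (covering 53706, 53703). Sex: Person above Male and Female.] Domains: Z0 = {53715, 53710, 53706, 53703}, Z1 = {5371*, 5370*}, Z2 = {537**}; S0 = {Male, Female}, S1 = {Person}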
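
The ZIP and sex hierarchies on this slide can be written as a lookup table, which gives a small sketch of the value generalization function γ and of γ+. The dictionary `GAMMA` and the functions `gamma` and `gamma_plus` are names assumed for this sketch; γ+ is read here as the set of values reachable by applying γ one or more times, which matches the slide's example 537** ∈ γ+(53715).

```python
# One-step value generalization, copied from the hierarchies on the slide.
GAMMA = {
    "53715": "5371*", "53710": "5371*",
    "53706": "5370*", "53703": "5370*",
    "5371*": "537**", "5370*": "537**",
    "Male": "Person", "Female": "Person",
}


def gamma(value):
    """Value generalization function: map a value to its direct generalization."""
    return GAMMA[value]


def gamma_plus(value):
    """All values reachable from `value` by one or more applications of gamma."""
    reachable = []
    while value in GAMMA:
        value = GAMMA[value]
        reachable.append(value)
    return reachable


zip_once = gamma("53715")
zip_all = gamma_plus("53715")
sex_all = gamma_plus("Male")
```

A table-driven γ keeps the generalization "faithful to the original" by construction, because every value has exactly one parent in the hierarchy.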

  11. Generalization Lattice. [Figure: lattice of domain vectors with their distance vectors: <S0, Z0> [0, 0]; <S1, Z0> [1, 0]; <S0, Z1> [0, 1]; <S1, Z1> [1, 1]; <S0, Z2> [0, 2]; <S1, Z2> [1, 2].] Domains: Z0 = {53715, 53710, 53706, 53703}, Z1 = {5371*, 5370*}, Z2 = {537**}; S0 = {Male, Female}, S1 = {Person}
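
The lattice nodes and their distance vectors can be enumerated mechanically from the hierarchy heights. This is a minimal sketch under the assumption that sex has levels S0..S1 and ZIP has levels Z0..Z2, as on the slide; `HEIGHTS`, `lattice_nodes` and `direct_generalizations` are illustrative names.

```python
from itertools import product

# Height of each domain generalization hierarchy: S0..S1 and Z0..Z2.
HEIGHTS = {"S": 1, "Z": 2}


def lattice_nodes(heights):
    """Return (label, distance_vector) for every node of the lattice."""
    attrs = list(heights)
    nodes = []
    for levels in product(*(range(heights[a] + 1) for a in attrs)):
        label = "<" + ", ".join(f"{a}{lvl}" for a, lvl in zip(attrs, levels)) + ">"
        nodes.append((label, levels))
    return nodes


def direct_generalizations(levels, heights):
    """Nodes one step up: raise exactly one attribute by one level."""
    limits = list(heights.values())
    return [
        levels[:i] + (levels[i] + 1,) + levels[i + 1:]
        for i in range(len(levels))
        if levels[i] < limits[i]
    ]


nodes = lattice_nodes(HEIGHTS)
above_bottom = direct_generalizations((0, 0), HEIGHTS)
above_top = direct_generalizations((1, 2), HEIGHTS)
```

The distance vector of a node is simply the number of generalization steps taken per attribute from the bottom node <S0, Z0>.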

  12. Generalization Tables

  13. Taxonomy of k-Anonymization models • Generalization vs. Suppression Model: use intermediate steps of generalization, or only suppress data • Global vs. Local Recoding: work on the values in the domain (global) or consider individual data items (local) • Hierarchy based vs. Partition based: a fixed value generalization hierarchy vs. a partition into disjoint ranges

  14. Attacks against k-Anonymity: Unsorted Matching Attack. Solution: random shuffling of rows
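
The shuffling countermeasure is simple to sketch. The attack relies on the row order of released tables, so the sketch below returns a randomly reordered copy before release. The function name `shuffled_release` and the sample rows are assumptions for illustration; `random.SystemRandom` is used so the order is not reproducible from a seed.

```python
import random


def shuffled_release(rows):
    """Return a copy of the table with its rows in random order."""
    shuffled = list(rows)  # copy, so the private table keeps its order
    random.SystemRandom().shuffle(shuffled)
    return shuffled


# Hypothetical generalized rows (values invented for illustration).
table = [
    ("5371*", "Male", "flu"),
    ("5371*", "Male", "rash"),
    ("5370*", "Female", "flu"),
    ("5370*", "Female", "headache"),
]
released = shuffled_release(table)
```

Each release should be shuffled independently, so that row positions in one release say nothing about row positions in another.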

  15. Attacks against k-Anonymity: Complementary Release Attack

  16. Attacks against k-Anonymity: Temporal Attack. PT_t1: (black, 9/7/65, male, 02139, headache), (black, 11/4/65, male, 02139, rash). GT_t1: (black, 1965, male, 02139, headache), (black, 1965, male, 02139, rash)

  17. Attacks against k-Anonymity: Homogeneity Attack. k-Anonymity can create groups that leak information due to a lack of diversity in the sensitive attribute.
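
A homogeneity problem can be detected by counting the distinct sensitive values inside each quasi-identifier group. The sketch below is illustrative only: the rows and the helper names are assumptions. Requiring at least l distinct values per group is used here as the simplest reading of l-diversity, not as the full definition from the l-Diversity paper.

```python
from collections import defaultdict


def sensitive_values_by_group(rows, quasi_identifier, sensitive):
    """Map each quasi-identifier combination to its set of sensitive values."""
    groups = defaultdict(set)
    for row in rows:
        groups[tuple(row[a] for a in quasi_identifier)].add(row[sensitive])
    return dict(groups)


def homogeneous_groups(rows, quasi_identifier, sensitive):
    """Groups whose members all share one sensitive value."""
    groups = sensitive_values_by_group(rows, quasi_identifier, sensitive)
    return [key for key, values in groups.items() if len(values) == 1]


def has_l_distinct_values(rows, quasi_identifier, sensitive, l):
    """True if every group holds at least l distinct sensitive values."""
    groups = sensitive_values_by_group(rows, quasi_identifier, sensitive)
    return all(len(values) >= l for values in groups.values())


# Hypothetical 2-anonymous table (values invented for illustration).
rows = [
    {"zipcode": "5371*", "sex": "Person", "disease": "flu"},
    {"zipcode": "5371*", "sex": "Person", "disease": "flu"},
    {"zipcode": "5370*", "sex": "Person", "disease": "rash"},
    {"zipcode": "5370*", "sex": "Person", "disease": "headache"},
]
leaky = homogeneous_groups(rows, ("zipcode", "sex"), "disease")
diverse = has_l_distinct_values(rows, ("zipcode", "sex"), "disease", 2)
```

The toy table is 2-anonymous, yet anyone known to be in the 5371* group is revealed to have the flu.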

  18. Attacks against k-Anonymity: Background Knowledge Attack. k-Anonymity does not protect against attacks based on background knowledge.

  19. l-Diversity

  20. Discussion • k-Anonymity in other domains? • Complexity of k-Anonymity? • Trade-off between privacy guarantees and usefulness of collected data?

  21. References: 1. Achieving k-Anonymity Privacy Protection using Generalization and Suppression, Sweeney. 2. Anonymizing Tables, Aggarwal et al. 3. Incognito: Efficient Full-Domain k-Anonymity, LeFevre et al. 4. On the Complexity of Optimal k-Anonymity, Meyerson et al. 5. General k-Anonymization is Hard, Meyerson et al. 6. Approximation Algorithms for k-Anonymity, Aggarwal et al. 7. l-Diversity: Privacy beyond k-Anonymity, Machanavajjhala et al.