This paper discusses the information-theoretic quantification of integrity in relation to database privacy, drawing on foundational concepts from Denning (1982) and others. It examines the interplay between corruption, contamination, and suppression within information systems, exploring how untrusted inputs can affect trusted outputs. Notably, it investigates different integrity models, including the Biba model, taint analysis, and statistical database anonymization techniques like k-anonymity and L-diversity. The authors, Michael Clarkson and Fred B. Schneider, present insights into maintaining integrity while ensuring confidentiality.
Quantification of Integrity
Michael Clarkson and Fred B. Schneider, Cornell University
RADICAL, May 10, 2010
Goal
Information-theoretic quantification of programs' impact on the integrity of information (relationship to database privacy) [Denning 1982]
Clarkson: Quantification of Integrity
What is Integrity?
Databases:
• Constraints that relations must satisfy
• Provenance of data
• Utility of anonymized data
Common Criteria:
• Protection of assets from unauthorized modification
Biba (1977):
• Guarantee that a subsystem will perform as it was intended
• Isolation necessary for protection from subversion
• Dual to confidentiality
…no universal definition.
Our Notions of Integrity
• Corruption: damage to integrity
• Contamination: bad information present in output
• Suppression: good information lost from output
…distinct, but they interact.
Contamination
Goal: model taint analysis.
Untrusted input contaminates trusted output.
[Diagram: the attacker's untrusted input and the user's trusted input both flow through the program to the user's trusted output.]
Contamination
o := (t, u)
u contaminates o. (Can't u be filtered from o?)
Quantification of Contamination
Use information theory: information is surprise.
• X, Y, Z: distributions
• I(X, Y): mutual information between X and Y (in bits)
• I(X, Y | Z): conditional mutual information
Quantification of Contamination
Contamination = I(Uin, Tout | Tin)
[Diagram: the attacker supplies untrusted input Uin and the user supplies trusted input Tin to the program, which produces trusted output Tout.]
[Newsome et al. 2009] Dual of [Clark et al. 2005, 2007]
Example of Contamination
o := (t, u)
Contamination = I(U, O | T) = k bits, if U is uniform on [0, 2^k − 1]
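The example above can be checked numerically. The sketch below (a Python illustration; the helper `cond_mutual_info` is ours, not from the paper) enumerates the joint distribution of o := (t, u) with U uniform on [0, 2^k − 1] and T an independent uniform trusted input, and confirms that the contamination I(U, O | T) is k bits:

```python
import math
from collections import defaultdict

def cond_mutual_info(joint):
    """I(U; O | T) in bits, from a joint distribution {(u, o, t): prob}."""
    p_t, p_ut, p_ot = defaultdict(float), defaultdict(float), defaultdict(float)
    for (u, o, t), p in joint.items():
        p_t[t] += p
        p_ut[(u, t)] += p
        p_ot[(o, t)] += p
    return sum(p * math.log2(p_t[t] * p / (p_ut[(u, t)] * p_ot[(o, t)]))
               for (u, o, t), p in joint.items() if p > 0)

k = 3
joint = {}
for t in range(4):             # trusted input: uniform on [0, 3]
    for u in range(2 ** k):    # untrusted input: uniform on [0, 2^k - 1]
        o = (t, u)             # the program o := (t, u)
        joint[(u, o, t)] = (1 / 4) * (1 / 2 ** k)

print(cond_mutual_info(joint))  # 3.0 bits = k
```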
Our Notions of Integrity
• Corruption: damage to integrity
• Contamination: bad information present in output
• Suppression: good information lost from output
Program Suppression
Goal: model program (in)correctness.
[Diagram: the correct specification maps the sender's trusted input to the receiver's output; the real implementation also receives untrusted input from the attacker.]
Information about the correct output is suppressed from the real output.
Example of Program Suppression
Spec. (a[0..m-1]: trusted):
  for (i=0; i<m; i++) { s := s + a[i]; }
Impl. 1:
  for (i=1; i<m; i++) { s := s + a[i]; }   // suppression: a[0] missing; no contamination
Impl. 2:
  for (i=0; i<=m; i++) { s := s + a[i]; }  // suppression: a[m] added; contamination
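The two off-by-one bugs can be made concrete. This Python sketch (our illustration; the out-of-bounds read of a[m] in Impl. 2 is modeled as an attacker-chosen extra value) shows Impl. 1 losing a[0] and Impl. 2 mixing untrusted data into the sum:

```python
def spec_sum(a):
    # Specification: sum of a[0..m-1]
    return sum(a[i] for i in range(len(a)))

def impl1_sum(a):
    # Impl. 1: loop starts at 1, so a[0] is dropped (suppression, no contamination)
    return sum(a[i] for i in range(1, len(a)))

def impl2_sum(a, untrusted_extra):
    # Impl. 2: loop runs to i <= m, reading one element past the array;
    # we model that out-of-bounds value as attacker-influenced
    return sum(a) + untrusted_extra

a = [10, 20, 30]
print(spec_sum(a), impl1_sum(a), impl2_sum(a, 99))  # 60 50 159
```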
Suppression vs. Contamination
output := input
[Diagram: in output := input, the attacker's influence can contaminate the output, suppress trusted information from it, or both.]
Quantification of Program Suppression
Program transmission = I(Spec, Impl)
[Diagram: the specification maps the sender's trusted input Tin to output Spec; the real implementation takes Tin plus the attacker's untrusted input Uin and produces output Impl, read by the receiver.]
Quantification of Program Suppression
• H(X): entropy (uncertainty) of X
• H(X | Y): conditional entropy of X given Y
Program transmission = I(Spec, Impl) = H(Spec) − H(Spec | Impl)
• H(Spec): total info to learn about Spec
• I(Spec, Impl): info actually learned about Spec by observing Impl
• H(Spec | Impl): info NOT learned about Spec by observing Impl
Program suppression = H(Spec | Impl)
Example of Program Suppression
Spec.:
  for (i=0; i<m; i++) { s := s + a[i]; }
Impl. 1: suppression = H(A)
Impl. 2: suppression ≤ H(A)
where A is the distribution of individual array elements.
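For Impl. 1 the claim suppression = H(A) can be verified by enumeration. A minimal sketch, assuming m = 3 array elements that are independent uniform bits (so H(A) = 1 bit); the helper `cond_entropy` is our own:

```python
import itertools
import math
from collections import defaultdict

def cond_entropy(joint):
    """H(X | Y) in bits, from a joint distribution {(x, y): prob}."""
    p_y = defaultdict(float)
    for (x, y), p in joint.items():
        p_y[y] += p
    return -sum(p * math.log2(p / p_y[y]) for (x, y), p in joint.items() if p > 0)

m = 3
joint = defaultdict(float)
for a in itertools.product([0, 1], repeat=m):  # each a[i] a uniform bit
    spec = sum(a)        # Spec.: the correct sum
    impl = sum(a[1:])    # Impl. 1: a[0] dropped
    joint[(spec, impl)] += (1 / 2) ** m

print(cond_entropy(joint))  # 1.0 bit = H(A): exactly the missing a[0]
```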
Suppression and Confidentiality
Declassifier: a program that reveals (leaks) some information and suppresses the rest.
Leakage: [Denning 1982; Millen 1987; Gray 1991; Lowe 2002; Clark et al. 2005, 2007; Clarkson et al. 2005; McCamant & Ernst 2008; Backes et al. 2009]
Thm. Leakage + Suppression is a constant.
What isn't leaked is suppressed.
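The theorem can be sanity-checked on a toy declassifier. In this sketch (our example, not the paper's) the declassifier reveals only the low bit of a 2-bit uniform secret; leakage I(S, O) and suppression H(S | O) then sum to the constant H(S):

```python
import math
from collections import defaultdict

def entropy(dist):
    """H(X) in bits of a distribution {x: prob}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def cond_entropy(joint):
    """H(X | Y) in bits, from a joint distribution {(x, y): prob}."""
    p_y = defaultdict(float)
    for (x, y), p in joint.items():
        p_y[y] += p
    return -sum(p * math.log2(p / p_y[y]) for (x, y), p in joint.items() if p > 0)

# Hypothetical declassifier: output only the low bit of a 2-bit secret
p_s = {s: 1 / 4 for s in range(4)}
joint = defaultdict(float)
for s in range(4):
    joint[(s, s & 1)] += 1 / 4

h_s = entropy(p_s)                  # total info in the secret: 2.0 bits
suppression = cond_entropy(joint)   # H(S | O): what the output hides
leakage = h_s - suppression         # I(S, O): what the output reveals

print(leakage, suppression, h_s)    # 1.0 1.0 2.0
```

Whatever the declassifier does, leakage + suppression = H(S): making the output reveal less necessarily suppresses more.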
Database Privacy
A statistical database anonymizes query results:
…sacrifices utility for privacy's sake
…suppresses to avoid leakage
…sacrifices integrity for confidentiality's sake
[Diagram: the user's query goes through the anonymizer to the database; the response returns to the user as an anonymized response.]
k-anonymity
DB: every individual must be anonymous within a set of size k. [Sweeney 2002]
Programs: every output corresponds to k inputs.
But what about background knowledge?
…no bound on leakage
…no bound on suppression
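The program-level reading of k-anonymity ("every output corresponds to k inputs") is easy to check by computing preimages. A sketch under our own framing, using age generalization as the hypothetical anonymizer:

```python
from collections import defaultdict

def is_k_anonymous(program, inputs, k):
    # Program-level k-anonymity: every observed output must be
    # producible by at least k distinct inputs
    preimages = defaultdict(set)
    for x in inputs:
        preimages[program(x)].add(x)
    return all(len(s) >= k for s in preimages.values())

ages = range(100)
bucket = lambda age: age // 10 * 10   # generalize an age to its decade

print(is_k_anonymous(bucket, ages, 10))  # True: each decade has 10 ages
print(is_k_anonymous(bucket, ages, 11))  # False
```

Note the check says nothing about how likely each preimage is, which is why k-anonymity bounds neither leakage nor suppression.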
L-diversity
DB: every individual's sensitive information should appear to have L (roughly) equally likely values. [Machanavajjhala et al. 2007]
Entropy L-diversity: H(anon. block) ≥ log L [Øhrn and Ohno-Machado 1999; Machanavajjhala et al. 2007]
Programs: H(Tin | tout) ≥ log L (if Tin is uniform)
…implies suppression ≥ log L.
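The entropy L-diversity condition H(anon. block) ≥ log L is a direct computation on the empirical distribution of sensitive values in a block. A minimal sketch (our illustration, with made-up sensitive values):

```python
import math

def block_entropy(values):
    # Entropy (bits) of the empirical distribution of sensitive
    # values within one anonymized block
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

L = 2
block_ok = ["flu", "hiv", "flu", "hiv"]    # two equally likely values
block_bad = ["flu", "flu", "flu", "hiv"]   # skewed toward one value

print(block_entropy(block_ok) >= math.log2(L))   # True: entropy 2-diverse
print(block_entropy(block_bad) >= math.log2(L))  # False: fails the log L bound
```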
Summary
Measures of information corruption:
• Contamination (generalizes taint analysis)
• Suppression (generalizes program correctness)
Application: database privacy (model anonymizers; relate utility and privacy)
More Integrity Measures
• Channel suppression: same as the channel model from information theory, but with an attacker
• Attacker- and program-controlled suppression
• Belief-based measures [Clarkson et al. 2005]: generalize the information-theoretic measures
Granularity:
• Average over all executions
• Single executions
• Sequences of executions