
Privacy and anonymity


Presentation Transcript


  1. Privacy and anonymity Claudia Diaz Katholieke Universiteit Leuven Dept. Electrical Engineering – ESAT/COSIC UPC Seminar, Barcelona, November 24, 2008

  2. Introducing myself... • 2000: I.T.S. de Telecomunicaciones, Universidad de Vigo • Master thesis (PfC) partially done at KULeuven as Erasmus student • 2000-2001: Pre-doctoral student at COSIC/KULeuven • 2001-2005: PhD student at COSIC/KULeuven • 2005-now: Post-doc researcher at COSIC/KULeuven • My work: • Anonymity metrics • Anonymous communications: analysis and attacks • Traffic analysis • Stego file systems • Other PETs: private search, negative databases, ... • http://www.esat.kuleuven.be/~cdiaz

  3. Introducing COSIC (1/2) • Group at the Dept. of Electrical Engineering of the Katholieke Universiteit Leuven • COSIC: COmputer Security and Industrial Cryptography • Professors: Bart Preneel, Ingrid Verbauwhede and Vincent Rijmen • 5 post-docs (more arriving soon) • 40 PhD students (more arriving soon) • 5-15 visitors and external associated researchers at any given time • Very international: 15-25 nationalities, ~60% non-Belgians

  4. Introducing COSIC (2/2) • Research areas • Symmetric Key Crypto • Public Key Crypto • Secure Hardware Implementations • Identity Management • Privacy Technologies • Mobile Secure Networks • Software Security • Involved in 25-30 active projects • ~ 10 European Projects (FP6 and FP7) • Coordinator of FP6 NoE E-CRYPT, FP7 NoE E-CRYPT II, FP7 IP TAS3 • Biggest success: Rijndael encryption algorithm (AES)

  5. … a few words on the scope of the talk • A new field, still in development • Give an idea of the problems, concepts and solutions for anonymity

  6. Outline • Privacy and anonymity • Anonymous communications • Anonymity metrics • Social networks

  7. Solove on “I have nothing to hide” • “the problem with the ‘nothing to hide’ argument is its underlying assumption that privacy is about hiding bad things.” • “Society involves a great deal of friction, and we are constantly clashing with each other. Part of what makes a society a good place in which to live is the extent to which it allows people freedom from the intrusiveness of others. A society without privacy protection would be suffocating, and it might not be a place in which most would want to live.”

  8. Diffie and Landau on Internet eavesdropping • Governments are expanding their surveillance powers to protect against crime and terrorism. • BUT: • Secrecy, lack of transparency, and diminished safeguards may easily lead to abuses • “Will the government’s monitoring tools be any more secure than the network they are trying to protect? If not, we run the risk that the surveillance facilities will be subverted or actually used against the U.S.A.” • They conclude: “Communication is fundamental to our species; private communication is fundamental to both our national security and our democracy. Our challenge is to maintain this privacy in the midst of new communications technologies and serious national security threats. But it is critical to make choices that preserve privacy, communications security and the ability to innovate. Otherwise, all hope of having a free society will vanish.”

  9. Privacy properties from a technical point of view • Anonymity • Hiding link between identity and action • Unlinkability • Hiding link between two or more actions / identities • Unobservability • Hiding user activity • Pseudonymity • One-time pseudonyms / persistent pseudonyms • Plausible deniability • Not possible to prove user knows / has done something

  10. The concept of Privacy [Solove] • Privacy threats we are trying to protect against (out of 16 identified by Solove) • Surveillance: monitoring of electronic transactions • Preventive properties: anonymity, unobservability • Interrogation: forcing people to disclose information • Preventive property: plausible deniability • Aggregation: combining several sources of information • Preventive property: unlinkability • Identification: connecting data to individuals. • Preventive properties: anonymity and unlinkability

  11. Anonymity – Data and Communication Layers (diagram: Alice and Bob each run an application layer on top of a communication layer, connected at the IP layer)

  12. Classical Security Model • Confidentiality • Integrity • Authentication • Non-repudiation • Availability (diagram: Alice communicates with Bob while Eve, a passive or active attacker, sits on the channel)

  13. Anonymity – Concept and Model (diagram: a set of possible senders, “Alices”, communicating through the anonymity system with a set of possible recipients, “Bobs”)

  14. Anonymity Adversary • Passive/Active • Partial/Global • Internal/External • May be the recipient or third parties

  15. Anonymity Adversary • The adversary will: • Try to find who is sending messages to whom. • Observe • All links (Global Passive Adversary) • Some links • Modify, delay, delete or inject messages. • Control some nodes in the network. • The adversary's limitations • Cannot break cryptographic primitives. • Cannot see inside nodes he does not control.

  16. Soft privacy enhancing technologies • Hard privacy • Focus on data minimization • Adversarial data holder / service provider • Soft privacy • Policies, access control, liability, right to correct information • Adversary: 3rd parties, corrupt insider in honest SP, errors • BUT user has already lost control of her data

  17. Other privacy-enhancing technologies • Anonymous credentials / e-cash / ZK protocols • Steganography / covert communication • Censorship resistance techniques • Anonymous publication • Private information retrieval (PIR) • Private search • K-anonymity • Location privacy

  18. Outline • Privacy and anonymity • Anonymous communications • Anonymity metrics • Social networks

  19. Concept of Mix: collect inputs Router that hides correspondence between inputs and outputs

  20. Concept of Mix: mix and flush Router that hides correspondence between inputs and outputs

  21. Functionality of Mixes • Mixes modify • The appearance of messages • Encryption / Decryption • Sender → Mix1 : {Mix2, {Rec, msg}KMix2}KMix1 • Padding / Compression • Substitution of information (e.g., IP) • The flow of messages • Reordering • Delaying - Real-time requirements! • Dummy traffic - Cost of traffic!
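
A minimal sketch of the nested encryption on this slide, Sender → Mix1 : {Mix2, {Rec, msg}KMix2}KMix1. Symmetric Fernet keys from the Python cryptography package stand in for the mixes' keys (real remailers use hybrid public-key encryption), and the "|" framing is purely illustrative.

```python
from cryptography.fernet import Fernet

# Stand-in keys for the two mixes (illustrative; real mixes use public keys).
mix1 = Fernet(Fernet.generate_key())
mix2 = Fernet(Fernet.generate_key())

# Sender builds the layered message: inner layer for Mix2, outer layer for Mix1.
inner = mix2.encrypt(b"Rec|" + b"hello Bob")    # {Rec, msg}KMix2
onion = mix1.encrypt(b"Mix2|" + inner)          # {Mix2, {Rec, msg}KMix2}KMix1

# Mix1 peels one layer: it learns only the next hop (Mix2), not the recipient.
hop, payload = mix1.decrypt(onion).split(b"|", 1)
assert hop == b"Mix2"

# Mix2 peels the last layer: it learns the recipient, but never saw the sender's address.
rec, msg = mix2.decrypt(payload).split(b"|", 1)
print(rec, msg)  # b'Rec' b'hello Bob'
```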

  22. Pool Mixes • Based on the mix proposed by Chaum in 1981: • Collect N inputs • Shuffle • Flush (Forward) • Pool selection algorithm • No pool / Static pool / Dynamic pool • Influences the performance and anonymity provided by the mix • Flushing condition • Time / Threshold • Deterministic / Random Round

  23. Example of pool mix • Deterministic threshold static pool mix • Pool = 2, Threshold = 4 (diagram: each round four new messages arrive, they are mixed with the messages kept in the pool, and a random selection is flushed)
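
A toy simulation of the deterministic threshold static pool mix pictured on this slide (pool of 2, threshold of 4); the message labels and round structure are invented for illustration.

```python
import random

POOL, THRESHOLD = 2, 4  # parameters from the slide

def pool_mix(incoming_rounds):
    """Each round: receive THRESHOLD new messages, mix them with the pool,
    keep POOL randomly chosen messages back, flush the rest in random order."""
    pool = []
    for new_msgs in incoming_rounds:
        pool.extend(new_msgs)
        random.shuffle(pool)             # hides the input/output correspondence
        pool, flushed = pool[:POOL], pool[POOL:]
        yield flushed

rounds = [[f"m{r}.{i}" for i in range(THRESHOLD)] for r in range(3)]
for r, out in enumerate(pool_mix(rounds)):
    print(f"round {r}: flushed {out}")
```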

  24. Stop-and-Go Mix • Proposed by Kesdogan in 1998 • Reordering strategy based on delaying • M/M/∞ • Delays generated by the user from an Exponential distribution • Timestamping to prevent active attacks • Trusted Time Service • Anonymity estimates based on the assumption of Poisson incoming traffic
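
A small sketch of the sender-chosen delays in a Stop-and-Go mix: one exponentially distributed delay is drawn per hop (the 2-second mean below is an invented example value).

```python
import random

MEAN_DELAY = 2.0  # seconds per hop; illustrative, not taken from the slide

def stop_and_go_delays(num_hops, mean_delay=MEAN_DELAY):
    """The sender samples an exponential delay for every mix on the path;
    each mix holds the message for exactly the delay it was given."""
    return [random.expovariate(1.0 / mean_delay) for _ in range(num_hops)]

delays = stop_and_go_delays(3)
print([round(d, 2) for d in delays], "total:", round(sum(delays), 2), "s")
```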

  25. Network topology • Mixes are combined in networks in order to • Distribute trust • Improve availability • Topologies: cascade, fully connected network, restricted route network

  26. Cascades vs Free Route topologies • Flexibility of routing • Surface of attack • Advantage free routes • Availability • Advantage free routes • Intersection attacks • Advantage cascades (anonymity set smaller but no partitioning possible) • Trust • Advantage free routes (more choices available to user)

  27. Peer-to-peer vs client-server architectures • Symmetric vs asymmetric systems • Surface of attack • Advantage peer-to-peer • Liability issues • Advantage client-server • Resources / incentives / quality of service • Advantage client-server • Availability • Advantage peer-to-peer • Sybil attacks • Advantage? Depending on admission controls (for peers/servers)

  28. Deployed systems I • Anon.penet.fi (Helsingius 1993) • Simple proxy, substituted email headers • Kept table of correspondences nym-email • Brought down by legal attack in 1996 • Type I Cypherpunk remailers (Hughes, Finney 1996) • No tables (routing info in the messages themselves) • PGP encryption (no attacks based on content) – attacks based on size are possible • Chains of mixes (distribution of trust) • Reusable reply blocks (source of insecurity)

  29. Deployed systems II • Mixmaster (Cottrell, evolving since 1995) • Fixed size (padding / dividing large messages) • Integrity protection measures • Multiple paths for better reliability • No replies • Mixminion (Danezis, 2003) • SURBs (Single-Use Reply Blocks) • Packet format: detection of tagging attacks (all-or-nothing) • Forward security: trail of keys, updated with one-way functions

  30. Low-latency applications • Stream-based instead of message-based communication • Web browsing, interactive applications, voice over IP, etc. • Bi-directional circuits • Ephemeral session keys, onion-encrypted (forward secrecy) • Real-time requirements • Delaying not an option • Proposed systems • C-S: Onion Routing, ISDN, Web Mixes • P2P: Crowds, P5 (broadcast), Herbivore (DC-nets) • Implemented systems: • TOR, JAP

  31. Onion encryption (diagram: the message for destination D is wrapped in successive encryption layers for routers R1, R2 and R3)

  32. Onion Routing (diagram: the onion travels from the sender through R1, R2 and R3 to destination D, with each router peeling off one layer of encryption)

  33. TOR – adversary model

  34. Anonymizing HTTP traffic is not trivial • Difficult to conceal traffic patterns • Difficult to pad • Lots of padding: scalability / cost problem • Little padding: not enough to conceal the pattern • Vulnerable to strong adversaries (entry+exit) • Fingerprinting attacks • Adversary observes only the user side • Internet exchanges

  35. Dummy Traffic • Fake messages/traffic introduced to confuse the attacker • Indistinguishable from real traffic • Dummies improve anonymity by making traffic analysis more difficult • Necessary for unobservability • Expensive

  36. (Some) Attacks on mix-based systems • Passive attacks • Long-term intersection attacks (statistical disclosure) • Persistent communication patterns • Extract social network information • Traffic correlation / confirmation • Fingerprinting • Source separation • Epistemic attacks (route selection) • Active attacks • N-1 attacks • Sybil • Tagging • Replay • DoS

  37. Long-Term Intersection Attacks • Family of attacks with many variants: • Disclosure attack (Agrawal, Kesdogan) • Hitting set attack (Kesdogan) • Statistical disclosure attack (Danezis, Serjantov) • Extensions to SDA (Dingledine and Mathewson) • Two-Sided SDA (Danezis, Diaz, Troncoso) • Receiver-bound cover traffic (Mallesh, Wright) • Assumption: • Alice has persistent communication relationships (she communicates repeatedly with her friends) • Method: • Combine many observations (looking at who receives when Alice sends) • Intuition: • If we observe rounds in which Alice sends, her likely recipients will appear frequently • Result: • We can create a vector that expresses Alice’s sending probabilities (a sending profile)
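
A rough numerical sketch of the statistical disclosure intuition above: average who receives in rounds where Alice sends, subtract the background estimated from rounds where she does not, and her likely recipients stand out. The traffic model and all parameters are invented for illustration.

```python
import random
from collections import Counter

USERS = list(range(20))
ALICE_FRIENDS = {3, 7}        # ground truth the attacker tries to recover
ROUNDS, BATCH = 2000, 10      # illustrative traffic parameters

def run_round(alice_sends):
    """One mix round: BATCH background messages to random recipients,
    plus one message to a friend of Alice when she participates."""
    recipients = [random.choice(USERS) for _ in range(BATCH)]
    if alice_sends:
        recipients.append(random.choice(sorted(ALICE_FRIENDS)))
    return Counter(recipients)

with_alice, background = Counter(), Counter()
for _ in range(ROUNDS):
    with_alice += run_round(alice_sends=True)
    background += run_round(alice_sends=False)

# Alice's estimated sending profile: observed frequency minus background frequency.
profile = {u: with_alice[u] / (ROUNDS * (BATCH + 1))
              - background[u] / (ROUNDS * BATCH) for u in USERS}
print("estimated friends:", sorted(profile, key=profile.get, reverse=True)[:2])
```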

  38. Outline • Privacy and anonymity • Anonymous communications • Anonymity metrics • Social networks

  39. Definition [PfiHan2000] • First clear definition of anonymity (2000) • Anonymity is the state of being not identifiable within a set of subjects, the anonymity set. • The anonymity set is the set of all possible subjects who might cause an action or be addressed. • Anonymity depends on: • The number of subjects in the anonymity set • The probability distribution of each subject in the anonymity set being the target

  40. Example: computing anonymity metrics for a pool mix • Potentially infinite subjects in the recipient anonymity set • Probability of recipient Ri: pi = 2^-i (p1 = 2^-1, p2 = 2^-2, p3 = 2^-3, p4 = 2^-4, ...)

  41. Entropy: information-theoretic anonymity metrics [DSCP02, SD02] • Entropy H = -Σ pi log2 pi: measure of the amount of information required on average to describe the random variable • Measure of the uncertainty of a random variable • Increases with N and with the uniformity of the distribution • A distribution with entropy H is equivalent to a uniform distribution over 2^H subjects • Other information-theoretic metrics: min-entropy, max-entropy, Rényi entropy, relative entropy, mutual information, ...
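
A minimal computation of the entropy metric, applied to the recipient distribution pi = 2^-i from the pool-mix example two slides back (truncated to a finite set and renormalised for the sketch).

```python
from math import log2

def entropy(probs):
    """Shannon entropy H = -sum(p * log2(p)) of a probability distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Recipient probabilities pi = 2^-i, truncated at i = 20 and renormalised.
probs = [2 ** -i for i in range(1, 21)]
total = sum(probs)
probs = [p / total for p in probs]

H = entropy(probs)
print(f"H = {H:.3f} bits; equivalent uniform anonymity set: 2^H = {2 ** H:.2f} subjects")
```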

  42. Combining traffic analysis with social network knowledge • “On the Impact of Social Network Profiling on Anonymity” [DTS-PETS08] • Use of Bayesian inference to combine different sources of information (SN knowledge, text mining) • Results: • “Combined” anonymity decreases as the network grows (if only one source of information is considered, growing the network does not decrease anonymity) • Anonymity degrades as more profiles become known • Errors: poor-quality SN profile information • Small profile errors lead to large errors in the results • If the adversary is not completely confident of the SN information → it cannot have any confidence in the results
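
A toy illustration of the Bayesian combination idea above: a recipient distribution obtained from traffic analysis is multiplied with a likelihood derived from a social-network profile and renormalised; all numbers are invented.

```python
# Two independent sources of evidence about who Alice's recipient is (made-up values).
traffic_prior = {"Bob": 0.50, "Carol": 0.25, "Dave": 0.25}  # from traffic analysis
sn_likelihood = {"Bob": 0.70, "Carol": 0.20, "Dave": 0.10}  # from SN profiling

# Bayes: posterior proportional to prior * likelihood, then renormalise.
unnorm = {u: traffic_prior[u] * sn_likelihood[u] for u in traffic_prior}
norm = sum(unnorm.values())
posterior = {u: p / norm for u, p in unnorm.items()}
print(posterior)  # the combined evidence is more concentrated, i.e. lower anonymity
```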

  43. Combinatorial approaches • Edman et al. • Consider deanonymization for a system as a whole (instead of individual users) • Find perfect matching inputs/outputs • Perfect anonymity for t messages: t! equiprobable combinations • GTDPV-WPES08 • Edman et al.’s metric overestimates anonymity if users send/receive more than one message • Generalization to users sending/receiving more than one message • Divide and conquer algorithm to compute the metric • Upper and lower bounds on anonymity (easy to compute)
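
A brute-force sketch of the combinatorial idea above for a tiny system: count the perfect input/output matchings that the observations leave possible (the permanent of a 0/1 possibility matrix) and normalise against the t! matchings of perfect anonymity. The matrix is invented, and real systems need smarter algorithms than this enumeration.

```python
from itertools import permutations
from math import factorial, log

def permanent(M):
    """Permanent of a 0/1 matrix = number of perfect input/output matchings
    consistent with the observations (brute force, only viable for tiny t)."""
    t = len(M)
    return sum(all(M[i][p[i]] for i in range(t)) for p in permutations(range(t)))

# M[i][j] = 1 iff input message i could correspond to output message j (illustrative).
M = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]
t = len(M)
degree = log(permanent(M)) / log(factorial(t))  # 1.0 would mean all t! matchings possible
print(f"perm = {permanent(M)}, system-wide degree of anonymity = {degree:.2f}")
```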

  44. Outline • Privacy and anonymity • Anonymous communications • Anonymity metrics • Social networks

  45. A social network approach • Most anonymity techniques focus on channels and one-to-one relationships • Mailing lists, twikis, blogs, online social networks: • Many-to-many communication (communities) • Shared spaces and collective content (group photo albums, forums) • Moreover, information about third parties (not in the community) is also leaked (pictures, stories, references in posts) • This scenario presents new privacy challenges

  46. “The Economics of Mass Surveillance” [DanWit06] • Model: • Assumptions • If one of a club’s participants is under surveillance, all information shared in that club and the club’s membership list become known to Eve • Eve has a limited budget (number of people she can put under surveillance) (diagram: bipartite graph linking people to the clubs they participate in)

  47. Target selection • How best to choose those to be put under surveillance to maximize returns in terms of observed clubs, people and their relationships? • How does the lack of information, due to the use of an anonymizing network, affect the effectiveness of such target selection? • Data: mail archives of a political network, 2003-2006 • Clubs = mailing lists (373) • People = email addresses posting more than 5 times to those mailing lists (2338) • Links = number of posts of a person over 3 years (3879)

  48. With and without an anonymizing network • With • Eve can only observe the aggregated amount of traffic generated by people • Eve puts under surveillance the nodes that generate the most volume • Results: 50% of links known when surveilling 5% of people. To get 80% of the links the adversary needs to observe 30% of people (vs. 3%) • Without • Eve can observe all traffic flows (not the content) and thus construct the social graph (links people-clubs) • Eve chooses to put under surveillance the nodes with the highest degrees, excluding all spaces already under surveillance • Result: surveilling 8% of the people is enough to control 100% of the clubs
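
A toy sketch of the “without anonymizing network” target selection: on a made-up people/clubs membership graph, Eve greedily surveils the people who add the most not-yet-covered clubs until every club is observed. The graph and budget are invented; the paper's experiments use the real mailing-list data cited above.

```python
# Made-up bipartite membership graph: person -> clubs they participate in.
memberships = {
    "p1": {"c1", "c2", "c3"},
    "p2": {"c1", "c4"},
    "p3": {"c2"},
    "p4": {"c4", "c5"},
    "p5": {"c5"},
}

all_clubs = set().union(*memberships.values())
covered, targets = set(), []
while covered != all_clubs:
    # Greedy choice: the person whose clubs add the most uncovered clubs.
    best = max(memberships, key=lambda p: len(memberships[p] - covered))
    targets.append(best)
    covered |= memberships[best]

print(f"surveil {targets}: {len(targets)}/{len(memberships)} people cover all {len(all_clubs)} clubs")
```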

  49. The Economics of Covert Community Detection and Hiding [Nag08] • Adversary tries to uncover communities and learn community membership • Counter-surveillance techniques that can be deployed by covert groups (e.g., topological rewiring) • “Our study confirms results from previous work showing that invasion of privacy of most people is cheap and easy” • With certain counter-surveillance strategies: “70% of the covert community going undetected even when 99% of the network is under direct surveillance.”
