CS573 Data Privacy and Security Introduction Li Xiong Department of Mathematics and Computer Science Emory University
Introduction on Privacy • Definitions and aspects of privacy • Models of privacy protection • Current lack of privacy protection and privacy breach incidents • What we can do • Data protection techniques • Relevant security concepts/topics
Definitions of Privacy • Right to be let alone (1890, Brandeis, future US Supreme Court Justice) • a: The quality or state of being apart from company or observation; b: freedom from unauthorized intrusion (Merriam-Webster) • The right of the individual to be protected against intrusion into his personal life or affairs, or those of his family, by direct physical means or by publication of information (Calcutt committee, UK)
Aspects of Privacy • Information privacy • Bodily privacy • Privacy of communications • Territorial privacy
Privacy of the Person • Also referred to as 'bodily privacy'; concerned with the integrity of the individual's body • Examples: • imposed treatments such as lobotomy and sterilization, • blood transfusion without consent, • requirements for submission to biometric measurement.
Privacy of Personal Behavior • Sometimes referred to as 'media privacy' • Concerns sensitive matters, such as • sexual preferences and habits, • political activities, • religious practices.
Privacy of Personal Communications • Individuals desire the freedom to communicate among themselves • Issues include • use of directional microphones and 'bugs' with or without recording apparatus, • telephonic interception and recording, and • third-party access to email messages.
Privacy of Personal Data • Referred to as 'data privacy' and 'information privacy'. • Establishment of rules governing the collection and handling of personal data • Data about individuals should not be automatically available to other individuals and organizations • The individual must be able to exercise a substantial degree of control over that data and its use.
Models of privacy protection • Comprehensive laws • Adopted by European Union, Canada, Australia • Sectoral laws • Adopted by US • Financial privacy, protected health information • Lack of legal protection for individuals' privacy on the Internet • Self-regulation • Companies and industry bodies establish codes of practice • Privacy enhancing technologies
State of data privacy • The last five decades have seen the application of information technologies to a vast array of abuses of data privacy
A Race to the Bottom: privacy ranking of Internet service companies • A study by Privacy International of the privacy practices of key Internet-based companies • Amazon, AOL, Apple, BBC, eBay, Facebook, Friendster, Google, LinkedIn, LiveJournal, Microsoft, MySpace, Skype, Wikipedia, LiveSpace, Yahoo!, YouTube
A Race to the Bottom: Methodologies • Corporate administrative details • Data collection and processing • Data retention • Openness and transparency • Customer and user control • Privacy enhancing innovations and privacy invasive innovations
Why Google • Retains a large quantity of information about users, often for an unstated or indefinite length of time, without clear limitation on subsequent use or disclosure • Maintains records of all search strings with associated IP addresses and time stamps for at least 18-24 months • Collects additional personal information from user profiles in Orkut • Uses an advanced profiling system for ads
Remember, they are always watching … what can we do? Who cares? I have nothing to hide.
If you do care … • Use cash when you can. • Do not give your phone number, social-security number or address, unless you absolutely have to. • Do not fill in questionnaires or respond to telemarketers. • Demand that credit and data-marketing firms produce all information they have on you, correct errors and remove you from marketing lists. • Check your medical records often. • Block caller ID on your phone, and keep your number unlisted. • Never leave your mobile phone on; your movements can be traced. • Do not use store credit or discount cards. • If you must use the Internet, encrypt your e-mail, reject all “cookies” and never give your real name when registering at websites. • Better still, use somebody else’s computer.
Information need vs. privacy • The volume of data recorded about people will continue to expand • Medical records, finance records, … • Corporate surveillance of customers • The data are of great value for both the individuals and our society. • However, they also pose a significant threat to individuals’ privacy.
Privacy Protection • A process of finding appropriate balances between privacy and multiple competing interests: • the privacy interests of one person may conflict with some other interest of their own (e.g. privacy against access to credit, or quality of health care); • the privacy interest of one person may conflict with the privacy interests of another person (e.g. health care information that is relevant to multiple members of a family); • the privacy interest of one person or category of people may conflict with other interests of another person, category of people, organization, or society as a whole (e.g. creditors, an insurer, and protection of the public against serious diseases).
Data privacy - main topics • Models and algorithms for privacy protection while allowing society to collect and share person-specific data for worthy purposes. • Topics • Anonymization techniques for privacy preserving data publishing • Data perturbation techniques for privacy preserving data mining • Statistical databases • Cryptographic techniques for multi-party computation • Privacy issues in different domains: healthcare, social networks …
Privacy preserving data publishing • Also referred to as data anonymization, data de-identification • Involves methods for de-identifying data such that the results can be shared with assurances of anonymity while the data remain practically useful for worthy purposes. • Data anonymity is a compromise position.
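As a minimal sketch of what de-identification by generalization can look like: the snippet below drops direct identifiers and coarsens two quasi-identifiers, then reports the size of the smallest group of records that share the same coarsened values. The field names, the 10-year age bands, the 3-digit ZIP prefix, and the helper names `generalize` and `smallest_group` are all illustrative assumptions, not a method prescribed by these slides.

```python
from collections import Counter

def generalize(record):
    """Drop the name and coarsen quasi-identifiers (hypothetical scheme)."""
    low = (record["age"] // 10) * 10
    return {
        "age": f"{low}-{low + 9}",            # 10-year age band
        "zip": record["zip"][:3] + "**",      # keep only the 3-digit ZIP prefix
        "diagnosis": record["diagnosis"],     # sensitive value kept for analysis
    }

def smallest_group(records, keys=("age", "zip")):
    """Size of the smallest set of records sharing the same quasi-identifier values."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return min(counts.values())

# Made-up example records.
records = [
    {"name": "A", "age": 34, "zip": "30322", "diagnosis": "flu"},
    {"name": "B", "age": 37, "zip": "30329", "diagnosis": "asthma"},
    {"name": "C", "age": 62, "zip": "30047", "diagnosis": "flu"},
    {"name": "D", "age": 65, "zip": "30044", "diagnosis": "diabetes"},
]

released = [generalize(r) for r in records]
print(smallest_group(records), smallest_group(released))
```

The trade-off named on the slide shows up directly: coarser bands make each released record match more people, but they also make the data less precise for analysis.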
A Face Is Exposed for AOL Searcher No. 4417749 • Naïve anonymization may not be sufficient • 20 million Web search queries released by AOL • User 4417749 • “numb fingers”, • “60 single men” • “dog that urinates on everything” • “landscapers in Lilburn, Ga” • Several people's names with the last name Arnold • “homes sold in shadow lake subdivision gwinnett county georgia” • Thelma Arnold, a 62-year-old widow who lives in Lilburn, Ga., frequently researches her friends’ medical ailments and loves her dogs
Privacy-preserving data mining • Data perturbation: random noise, geometric rotation • Models/patterns of data not affected • [Figure: each private data source passes through data perturbation; the resulting perturbed data feed data mining]
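A rough sketch of the random-noise variant of data perturbation, under assumed parameters: each value gets independent zero-mean Gaussian noise before release, so individual values are masked while an aggregate such as the mean stays close to the original. The salary range, the noise scale `sigma`, and the function name `perturb` are hypothetical choices for illustration; geometric rotation is not shown.

```python
import random
import statistics

def perturb(values, sigma, rng):
    """Add independent zero-mean Gaussian noise to every value."""
    return [v + rng.gauss(0.0, sigma) for v in values]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
salaries = [rng.uniform(30_000, 120_000) for _ in range(10_000)]  # synthetic data
noisy = perturb(salaries, sigma=5_000, rng=rng)

true_mean = statistics.fmean(salaries)
noisy_mean = statistics.fmean(noisy)
print(f"true mean {true_mean:.0f}, perturbed mean {noisy_mean:.0f}")
```

Because the noise averages out over many records, a data miner working only on `noisy` can still estimate the mean, although any single released value may be off by thousands.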
Cryptographic techniques for distributed data sharing • Multi-party secure computation • Cryptographic protocols • Absolute security/privacy vs. approximation • [Figure: parties holding private inputs x1, x2, x3, …, xn jointly compute f(x1, x2, …, xn)]
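To make the idea of jointly computing f(x1, x2, …, xn) concrete, here is a small single-process simulation of one well-known special case: a secure sum built on additive secret sharing. It assumes honest-but-curious parties, non-negative inputs whose total is below `MODULUS`, and private channels between parties; the names `make_shares`, `secure_sum`, and `MODULUS` are illustrative, and this is a teaching sketch rather than a hardened protocol.

```python
import secrets

MODULUS = 2**61 - 1  # assumed public modulus, larger than any possible total

def make_shares(value, n_parties):
    """Split value into n random shares that add up to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]

def secure_sum(inputs):
    """Simulate n parties computing the sum of their private inputs."""
    n = len(inputs)
    # Row i holds the shares created by party i; share j is sent to party j.
    all_shares = [make_shares(x, n) for x in inputs]
    # Party j adds up the shares it received and publishes only that partial sum.
    partial = [sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)]
    # Combining the published partial sums reveals the total and nothing else.
    return sum(partial) % MODULUS

print(secure_sum([12, 7, 30]))
```

Each individual share is a uniformly random number, so no single party learns another party's input from the shares it receives; only the final total is revealed.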
Access control • Multi-level secure databases • Hippocratic databases • Statistical databases • [Figure: private data sources governed by policies, reached through data access and statistical queries]
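One simple control used in statistical databases is to answer aggregate queries only when enough records match. The sketch below assumes a hypothetical `StatisticalDB` class with a minimum query-set size; the rows, field names, and threshold are made up for illustration. A lower bound on query-set size by itself is known to be insufficient, since answers to overlapping queries can be combined, so treat this as a starting point only.

```python
class StatisticalDB:
    """Answers averages only over sufficiently large query sets (illustrative)."""

    def __init__(self, rows, min_set_size):
        self._rows = list(rows)
        self._min_set_size = min_set_size

    def average(self, field, predicate):
        matched = [row[field] for row in self._rows if predicate(row)]
        if len(matched) < self._min_set_size:
            # Refuse: the answer would describe too few individuals.
            raise PermissionError("query set too small")
        return sum(matched) / len(matched)

# Made-up rows for demonstration.
rows = [
    {"name": "ann", "dept": "A", "salary": 50},
    {"name": "bob", "dept": "A", "salary": 60},
    {"name": "cat", "dept": "A", "salary": 70},
    {"name": "dan", "dept": "B", "salary": 90},
]

db = StatisticalDB(rows, min_set_size=2)
print(db.average("salary", lambda r: r["dept"] == "A"))
```

A query such as `db.average("salary", lambda r: r["name"] == "dan")` matches a single row and is refused with `PermissionError`.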
Broader topics of information security • Information security - protecting information and information systems from unauthorized access and use. • Core principles (CIA triad) • Confidentiality – preventing disclosure of information to unauthorized individuals or systems • Integrity • Availability • Mechanisms • Access control • Cryptography
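As a minimal illustration of access control as a confidentiality mechanism, the sketch below checks a request against an access control list that maps each resource to the actions each subject may perform. The resource name, subjects, and the helper `is_allowed` are hypothetical; real systems add authentication, roles, and auditing on top of a check like this.

```python
# Hypothetical access control list: resource -> subject -> permitted actions.
ACL = {
    "medical_record_17": {
        "dr_lee": {"read", "write"},
        "nurse_kim": {"read"},
    },
}

def is_allowed(subject, action, resource, acl=ACL):
    """Return True only if the ACL explicitly grants the action (deny by default)."""
    return action in acl.get(resource, {}).get(subject, set())

print(is_allowed("nurse_kim", "read", "medical_record_17"))
print(is_allowed("nurse_kim", "write", "medical_record_17"))
```

Denying by default means an unknown subject or resource is refused without any special-case code.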
Other computer security topics • Network security. Firewalls, intrusion detection systems (IDS), DoS attacks and defense … • OS (Unix/Windows) security. Access control, administration … • Software security. Memory management, buffer overruns, race conditions, analysis of code for security errors, safe languages, and sandboxing techniques … • Malware analysis and defense. Worms, spyware …
References • Privacy International, overview of privacy: http://www.privacyinternational.org • Privacy International, A Race to the Bottom: Privacy Ranking of Internet Service Companies: http://www.privacyinternational.org • The Economist, The End of Privacy • Dieter Gollmann, Computer Security, 2nd edition
Further Readings - Privacy Protection Laws • A good international survey: http://www.gilc.org/privacy/survey/ • USA status: http://www.gilc.org/privacy/survey/surveylz.html#USA • Children's Online Privacy Protection Act: http://www.ftc.gov/bcp/conline/pubs/buspubs/coppa.htm • Health Insurance Portability and Accountability Act: http://www.hhs.gov/ocr/hipaa/ • Gramm-Leach-Bliley Act: http://www.ftc.gov/privacy/privacyinitiatives/glbact.html