Privacy Lecture Notes
What, if anything, is special about the privacy concerns that are associated with cybertechnology? • Technology has had a great impact upon: • The amount of personal information that can be gathered. • The speed at which personal information can be transmitted. • The duration of time that the information can be retained. • The kind of information that can be transferred.
Harvard Law professor Arthur Miller points out that trying to correct inaccurate information in cyberspace is like chasing a greased pig. • The days of “going West” and starting over are over. • Consider what happens when one makes a withdrawal from an ATM or purchases something with a credit card. • Information about these transactions is transmitted over the Internet and is stored in databases.
NEWS BULLETIN: The FBI has asked certain universities to turn over library information under the Patriot Act.
What is Personal Privacy? • Initially, under US law, privacy was understood in terms of freedom from physical intrusion (accessibility privacy). • Later it became associated with freedom from interference into one’s personal affairs (decisional privacy). • More recently, it has evolved to involve access to and control of personal information (informational privacy).
A landmark event in the history of privacy law in the US was a seminal article written by Samuel Warren and Louis Brandeis for the Harvard Law Review in 1890. • Their article voiced the opinion that the privacy of individuals should be protected against intrusions by the print media of the time. • Although there is no explicit mention of privacy in the US Constitution, the Fourth Amendment protects citizens against unreasonable searches and seizures of personal effects. • This kind of privacy right eventually became known as accessibility privacy.
Issues in informational privacy include: • Who should have access to one’s personal information? • To what extent can individuals control the ways in which information about them can be gathered, stored, mined, combined, recombined, exchanged, and sold?
Volonino and Robinson, in their book, Principles of Information Security, make the following observation: • “For any organization that maintains customer data, security breaches may result in liability for loss of confidential information, data corruption, or breach of privacy obligations. Liability of this sort may be imposed as a result of contractual obligation to maintain data security. In addition, there may be regulatory obligation imposed by legislation. …
“Examples [of relevant legislation] include the Health Insurance Portability and Accountability Act (HIPAA) of 1996, the Gramm-Leach-Bliley Act (GLB) of 1999 or the Children’s Online Privacy Protection Act (COPPA), which require the safeguarding of customer personal data.” • GLB covers the financial services industry and HIPAA is important in the health services industry.
Under HIPAA, breaches of medical privacy will have mandatory penalties. • Breaches include disclosure of patient records via e-mail or unauthorized network access. • HIPAA is forcing the health industry to take security seriously by mandating the protection of information assets from abuse, exposure, or unauthorized access.
E-mail and voice mail circulated within an organization that contain jokes with racial, gender, ethnic, or sexual content can have serious liability implications for that organization. • Communications that may be interpreted or construed as harassing or offensive create legal exposure because of broad interpretations of the Civil Rights Act of 1964. • This act requires employers to provide workplaces that are non-hostile and non-harassing and holds them legally responsible for failure to maintain such workplaces.
Case illustrations: Chevron was sued because of an e-mail that was forwarded among employees that listed 25 reasons why beer was better than women. • In February 1995 Chevron ended up paying $2.2 million to settle this suit. • The NY Times fired 23 employees in December 1999 for distributing pornographic images via e-mail. • The employees were fired for violating the Internet use policies at the NY Times.
Privacy and Democracy • Many have observed that countries with strong democratic institutions consider privacy more important than do less democratic ones. • On the other hand, privacy is not valued to the same degree in all nations and cultures. As a result, it may be difficult to get universal agreement on privacy laws and policies in cyberspace. • A basic goal of an important perspective on ethics is that ethical guidelines should be based on the need for human flourishing and democracy. Is privacy related to these goals?
Mechanisms for Collecting and Recording Personal Data---Dataveillance • Roger Clarke has been an influential author in the area of privacy in cyberspace. • In 1988 (before the Internet became as prevalent as it is today) he introduced the term “dataveillance” to refer to the data monitoring and recording techniques made possible by computer technology.
Individuals (e.g., private investigators and stalkers) as well as organizations (e.g., government agencies all over the world) were using both electronic and non-electronic devices to perform surveillance long before the advent of the Internet. • Examples include wire-tapping and hiring individuals to monitor the performance of employees in the workplace. • Now, surveillance-related privacy threats come not only from governments and their agencies, but also from those businesses and corporations that use on-line data gathering tools and techniques.
Don’t say I didn’t warn you! - George Orwell
Cookies • Cookies are files that Web sites send to and retrieve from the computer systems of Web users. • Cookies can be used to collect information about an individual’s on-line browsing preferences whenever a person visits a Web site. • This exchange of data between the user and the Web site typically occurs without the user’s knowledge or consent.
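To make the exchange concrete, here is a minimal sketch of the cookie round trip using Python's standard http.cookies module. The cookie name visitor_id, the value abc123, and the one-year lifetime are assumptions chosen for illustration; real sites choose their own names and lifetimes.

```python
from http.cookies import SimpleCookie
from typing import Optional


def server_set_cookie(visitor_id: str) -> str:
    """Build the Set-Cookie header value a site might send on a first visit."""
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id              # hypothetical cookie name
    cookie["visitor_id"]["path"] = "/"
    cookie["visitor_id"]["max-age"] = 60 * 60 * 24 * 365  # keep for one year
    return cookie["visitor_id"].OutputString()


def browser_cookie_header(set_cookie_value: str) -> str:
    """On later requests the browser returns only the name=value pair."""
    return set_cookie_value.split(";", 1)[0]


def server_read_cookie(cookie_header: str) -> Optional[str]:
    """Recover the visitor id from the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("visitor_id")
    return morsel.value if morsel is not None else None


first_response = server_set_cookie("abc123")
later_request = browser_cookie_header(first_response)
recognized = server_read_cookie(later_request)
```

Nothing in this exchange requires any action by the user: the browser stores the value and returns it on its own, which is why the site can recognize a repeat visitor.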
Defenders of cookies tend to be the owners and operators of on-line businesses and Web sites. • They maintain that they are performing a service for repeat users of a Web site by customizing the information the customer obtains from that site. • Privacy advocates say that this technology crosses the line. They point out that information gathered about a user via cookies can eventually be acquired by on-line advertising agencies who can then target the user to receive on-line ads. • Let’s look at the DoubleClick scenario and case illustration in our scenario handout.
A number of privacy-enhancing tools are available that enable users to identify and block cookies on a selective basis. • These include PGPcookie.cutter from Pretty Good Privacy (PGP). • We shall also see some new technologies that have evolved from P3P, like Privacy Bird. • Most web browsers allow users to disable cookies. The problem is user awareness of the issues and the implications. • Most Web browsers permit cookies by default.
Hal Berghel has written quite a few articles for the CACM about the security and privacy issues surrounding cookies. • In one article he discusses Web bugs. Web bugs are links to small images that are often 1 pixel by 1 pixel in size and thus not visible. • They are placed within host Web pages specifically to eavesdrop on user browsing patterns. • They are usually placed there by a “third party”, such as an advertising company.
The bug can allow the third party to place a cookie on the user’s computer. • As we shall see, those cookies might contain a global identifier which can allow multiple Web sites to monitor that user’s behavior. • Q: What is your reaction to this? Should privacy apply to your Web surfing behavior? Should different rules apply for when you go shopping at Chester County Books versus when you go shopping at Amazon.com?
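The following simulation is a rough sketch of how a global identifier lets one third party follow a user across unrelated sites. The AdServer and Browser classes, the domain ads.example, and the page names are all hypothetical; no real advertising network's protocol is being reproduced here.

```python
import uuid
from collections import defaultdict


class AdServer:
    """Hypothetical third party whose invisible image is embedded in many pages."""

    def __init__(self):
        self.log = defaultdict(list)  # global identifier -> pages where it was seen

    def serve_pixel(self, referring_page, cookie=None):
        # Reuse the identifier if the browser sent one; otherwise mint a new one.
        tracker_id = cookie or uuid.uuid4().hex
        self.log[tracker_id].append(referring_page)
        return tracker_id  # returned to the browser as a cookie


class Browser:
    """Stores cookies keyed by the domain that set them."""

    def __init__(self):
        self.cookies = {}

    def visit(self, page, ad_server):
        # Loading the page fetches the embedded image automatically,
        # sending along any cookie previously set by the ad server.
        sent = self.cookies.get("ads.example")
        self.cookies["ads.example"] = ad_server.serve_pixel(page, sent)


ads = AdServer()
browser = Browser()
for page in ["news.example/politics", "shop.example/books", "health.example/asthma"]:
    browser.visit(page, ads)

# The third party now holds one identifier linked to all three visits.
profiles = dict(ads.log)
```

The three sites never exchange data with one another; the linkage exists only in the third party's log.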
Q: What about spam? Is spam an invasion of privacy? • Let’s examine the Ken Hamidi case (described in a handout) in which an employee spammed the people working at his company. Was he trespassing against his employer’s private property or just exercising free speech? • Despite recent restrictions due to new legislation, the spam problem definitely seems to be getting worse.
Another issue is ISP monitoring. • ISPs log and retain the mail traffic generated by their customers. • Current law places few restrictions on the use of ISP log data in the United States. • The USA Patriot Act makes it legal for an ISP to release log data on an individual user to law enforcement without a court order.
Merging and Matching Electronic Records • Transactions involving the sale and exchange of personal data are a growing business. • Many now believe that professional information-gathering organizations such as Equifax, Experian, TransUnion, and MIB (Medical Information Bureau) violate the privacy of individuals because of the techniques they use to facilitate the exchange of personal information across and between databases. • These techniques are known as computer merging and matching.
Merging Computerized Records • Organizations have a legitimate need for information about individuals in order to make intelligent decisions concerning those individuals. • For example, if you apply for a credit card, it would be reasonable for the credit card company to request information about you. • A significant question is whether an individual can expect the personal information that he or she has provided to an organization in a specific context to remain within that organization (this is the law in Europe).
Computer merging is the technique of extracting information from two or more unrelated databases that contain information about some individual or group of individuals and then integrating that information into a composite file or database.
Consider a situation in which you give information about yourself to three different organizations: • A lending institution gets your income and credit history so that you can secure a loan. • A life insurance company gets information about your age and medical history. • A political organization that you wish to join gets information about your views on certain social issues. • In voluntarily providing this information to these three organizations, no breach of privacy has occurred.
But, what if one of these organizations shares your personal information with another? Then, you lose control over the way in which that information about you is managed and utilized. • Let’s consider the DoubleClick and Abacus database controversy in the scenario handout.
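The three-organization example above can be sketched in a few lines. The identifier p-001 and every field value below are invented for illustration, and the sketch assumes the three databases share a common key, which is what makes merging possible in the first place.

```python
# Three unrelated databases, each holding only what the person chose to give it.
lender = {"p-001": {"income": 52000, "credit_history": "good"}}
insurer = {"p-001": {"age": 34, "medical_history": "asthma"}}
political_org = {"p-001": {"views": "supports a local ballot measure"}}


def merge_records(*databases):
    """Combine per-person fields from several databases into composite profiles."""
    composite = {}
    for database in databases:
        for person_id, fields in database.items():
            composite.setdefault(person_id, {}).update(fields)
    return composite


profiles = merge_records(lender, insurer, political_org)
# profiles["p-001"] now holds income, credit history, age, medical history,
# and political views in a single record that no one organization was given.
```

Each source record is unremarkable on its own; the privacy concern arises from the composite, which the individual never provided to anyone.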
Matching Computerized Records • Computer matching is a variation on the data merging technologies. • It involves cross-checking information in two or more unrelated databases to produce matching records or “hits”. • This technique has been used by various federal and state agencies to create a list of potential law violators or even actual law violators.
For example, your property tax records (stored by your local government) can be matched against your federal tax records (held by the IRS) to see whether you own an expensive house but declared only a small income. • Information matching has been used to find deadbeat parents and welfare cheats. • Some opponents (e.g., civil liberties groups) see this kind of data matching as a new form of social control. • Defenders see it as a valuable means of tracking down people who are violating the law.
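A minimal sketch of the property-tax example follows. The record layouts, the taxpayer identifiers, and the two thresholds are assumptions made up for illustration; an actual agency would define its own criteria for a "hit".

```python
# Invented records from two unrelated databases.
property_records = [
    {"taxpayer_id": "t-1", "assessed_value": 950_000},
    {"taxpayer_id": "t-2", "assessed_value": 180_000},
    {"taxpayer_id": "t-3", "assessed_value": 1_200_000},
]
income_records = [
    {"taxpayer_id": "t-1", "declared_income": 22_000},
    {"taxpayer_id": "t-2", "declared_income": 41_000},
    {"taxpayer_id": "t-3", "declared_income": 310_000},
]


def match_records(property_records, income_records, min_value, max_income):
    """Cross-check the two databases and return the ids that look inconsistent."""
    income_by_id = {r["taxpayer_id"]: r["declared_income"] for r in income_records}
    hits = []
    for record in property_records:  # every record is examined, not only suspects
        income = income_by_id.get(record["taxpayer_id"])
        if income is None:
            continue
        if record["assessed_value"] >= min_value and income <= max_income:
            hits.append(record["taxpayer_id"])
    return hits


hits = match_records(property_records, income_records,
                     min_value=750_000, max_income=30_000)
```

Note that the loop runs over every record in both databases in order to produce its short list of hits.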
One popular line of reasoning frequently cited to defend computer matching is: If you have done nothing wrong, you have nothing to worry about. • Another line of reasoning that is sometimes used to defend computer matching runs like this: • Privacy is a legal right. • Legal rights are not absolute. • When one commits a crime, one forfeits one’s legal rights. • Therefore, criminals have forfeited their right to privacy.
One fallacy with this reasoning is that it’s not only the criminals who are losing their privacy. • The matching programs are being applied to everyone, whether they have broken the law or not. • Q: Does this violate the Fourth Amendment protection against unreasonable searches and seizures in your opinion? • A related topic is biometric matching. We will discuss this in greater depth when we discuss face recognition technology.
Mining Personal Data • Data mining uses techniques from research and development in artificial intelligence. • Data mining involves the indirect gathering of personal information through an analysis of implicit patterns discoverable in data. • Data-mining can generate new and sometimes non-obvious classifications or categories.
Individuals whose data is mined can become identified with or linked to certain newly created groups. • The individuals involved may know nothing about these groups. • Current privacy laws offer individuals no protection with respect to how information about them acquired through data-mining activities is subsequently used. • Data mining raises some interesting concerns relating to personal privacy.
For example: What if data mining reveals that all people who drive yellow cars die at a young age? • Your life insurance company finds out that you drive a yellow car using data mining techniques, so they deny you a life insurance policy. • Suppose that data mining research shows that people who drink their coffee black are more likely to cheat on their taxes than those who don’t. • Should the government audit the tax returns of people who drink their coffee black?
What if data mining research shows that people who drive yellow cars and drink their coffee black are likely terrorists? • Should someone who drives a yellow car and drinks their coffee black be sent to Guantanamo? • Virtually no legal or normative (relating to the social norm) protections apply to personal data manipulated in the data mining process where personal information is typically: • Implicit in the data • Non-confidential in nature • Not exchanged between databases
Data mining can suggest “new” facts, relationships, or associations about a person, placing that person in newly discovered categories or groups (e.g., yellow car drivers who drink their coffee black). • Often the kinds of data used in data mining are not considered confidential. It is no secret that you drive a yellow car and that you drink your coffee black. • Let’s go over the XYZ Bank data mining scenario in the scenarios handout.
In the XYZ Bank data mining scenario, Lee gave the bank pieces of information that he considered reasonable so that the bank could make a meaningful determination about his request for an automobile loan. • Lee did not explicitly authorize the bank to use disparate pieces of that information for more general data-mining analyses that would reveal patterns involving Lee that neither he nor the bank could have anticipated at the outset. • One privacy issue is that the inference that Lee was someone who would start his own business (leading eventually to his declaring bankruptcy) was not explicit in any of the data about Lee.
All of the data the bank used was internal to the bank. • It did not involve exchanging data with other institutions. • The bank did not transfer data about Lee to an external database without Lee’s consent. • Some of the data (like the fact that he took a vacation in Europe) might be considered public data. After all, people saw him on the plane and at the Louvre. • The bank used information about Lee in a way that he had not explicitly authorized. This is where data mining raises serious concerns for personal privacy.
One chain store did data mining and discovered that men who shopped for disposable diapers during evening hours for infants also purchased beer, so they relocated the beer supply next to the diapers. • One threat to privacy is posed by commercial Web sites that use data-mining to analyze data about Internet users. The results can then be sold to third parties. • Intelligent agents, or softbots, can scour the Web for information, including information from personal Web sites. This information can be fed into data-mining tools.
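A pattern like the diapers-and-beer finding can be expressed as a simple association rule. The sketch below computes two standard measures, support and confidence, over a handful of invented shopping baskets; the data and the function name rule_stats are illustrative assumptions, not the chain store's actual method.

```python
# Invented shopping baskets, one set of items per transaction.
baskets = [
    {"diapers", "beer"},
    {"diapers", "beer", "chips"},
    {"diapers", "milk"},
    {"beer"},
    {"milk", "bread"},
]


def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    total = len(transactions)
    with_antecedent = [t for t in transactions if antecedent in t]
    with_both = [t for t in with_antecedent if consequent in t]
    # Support: share of all baskets containing both items.
    support = len(with_both) / total if total else 0.0
    # Confidence: share of antecedent baskets that also contain the consequent.
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


support, confidence = rule_stats(baskets, "diapers", "beer")
```

No shopper ever stated a link between the two products; the association is implicit in the purchase records and surfaces only through analysis, which is the sense in which data mining produces "new" information about people.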
Privacy Issues Relating to Public Personal Information • Privacy analysts are concerned about information which is neither confidential nor intimate and which is also being gathered, exchanged, and mined using cybertechnology. • This information is called Public Personal Information (PPI). • PPI includes information such as where you work or attend school or what kind of car you drive.
PPI has not enjoyed the privacy protection that has been granted to confidential and very personal information (e.g., health records, academic transcripts, financial records, etc.). • Privacy advocates are now saying that PPI deserves greater legal and normative protection than it currently has. • Consider the difference between shopping at Nile.com (on the Web) versus shopping at West Chester Books (a fictitious bookstore on Paoli Pike in West Chester).
The information that Nile.com has about you after your shopping spree may not seem categorically different from the information that West Chester Books has about you (assuming that you used the store’s “courtesy card” in making your purchases). • There are significant differences in the ways that information about you can be gathered, recorded, and then used as a result of your shopping spree at each store. • At West Chester Books, only the things that you actually purchased are recorded. Your browsing behavior is not recorded (or is it?). • At Nile.com there is a record of virtually every move you make – every book you search, review, etc., as well as the ones you actually purchase.
This personal information that Nile.com has gathered about your browsing and shopping habits on-line is considered and treated as public information. • Nile can use this information as they see fit. • For example, they can combine this information about you with information about your on-line transactions at other Web sites to create a customer profile, which then can be sold to a third party (like DoubleCross). • The entrepreneurs might argue that once the user puts himself or herself on-line, that information is not private, it is public.
“You already have zero privacy – get over it!” - Scott McNealy
Accessing Public Records via the Internet • Public records (e.g., those held by municipal or county governments) have long been publicly available. • Now we have a situation in which entrepreneurs can manipulate and sell information mined from these public records. • For a long time it has been assumed that the availability of public records causes no harm to individuals and that communities are better served because those records are accessible for what seem like legitimate purposes.
Information-gathering companies now access those public records, manipulate them to discover patterns useful to business, and then sell that information to third parties. • Was that the original intent for making such information accessible to the public? • The Public Records scenario in the handout refers to two controversial uses of public information: • Housing layouts in New Hampshire were made available over the Internet. (Great for robbers!) • The state of Oregon sold driver license information for a fee, using it as a source of revenue.
Some entrepreneurs present the following argument for making public records available over the Internet: • Public records have always been available to the public. • Public records have always resided in public space. • The Internet is a public space. • Therefore, all public records ought to be made available on the Internet.
Privacy Enhancing Tools (PETs) • PETs can be used either to: • Protect the user’s personal identity while interacting with the Web, or • Protect the privacy of communications (such as e-mail) sent over the Internet (using encryption / decryption) • We will discuss the first type of technology in our discussion of Crowds and Tor. • We will discuss the second type of technology in our discussion of cryptography. • Ultimately, the widespread application and use of PETs (like Pretty Good Privacy) will require a massive educational effort.