
Investigating Privacy Complaints

Anand Sonkar (1), Jennifer King (2), Nick Doty (2), Prof. Deirdre Mulligan (2)
(1) Arizona State University, (2) University of California, Berkeley



Many Eyes Visualization Analysis

Many Eyes is a software tool for visualizing patterns within a text. It is used to explore information visually and to help analyze a data set holistically. Its Word Cloud Generator, a text analysis tool, shows how frequently words appear in a given text and how the words within that text relate to one another. We used the Word Cloud Generator to further analyze the textual data stored in the database. The word cloud figure shows the analysis of the text using this visualization tool.

Conclusion

This research demonstrates that Yahoo! Answers provides a useful dataset for further analysis with tools such as the Natural Language Toolkit and Many Eyes, because it reflects people's real-world concerns and questions about privacy issues.

Future Work

Future work could include expanding the list of privacy-related terms and phrases used to generate results, collecting more data, and sanitizing the code to obtain more relevant results from specified categories. Using natural language processing for linguistic analysis of the text and creating a taxonomy of privacy terms are also goals.

Acknowledgements

I would like to thank my graduate mentors, Jennifer King and Nick Doty, and Professor Deirdre Mulligan for giving me the opportunity to work on this project. I would also like to thank Christopher Castillo, Jennifer Felder, German Gomez and Rafael Negron. Special thanks to Dr. Kristen Gates, NSF and TRUST for providing me with the opportunity to conduct this research.

Overview

As technology has advanced, the ways in which privacy is both protected and violated have changed with it. In the case of the Internet, the improved ability to share information can create more ways for privacy to be breached. The Internet has brought new concerns about privacy in an age where computers can permanently store many kinds of records.

Goal of the Project

The overall goal of our project was to create a command-line tool, written in Python, that queries the Yahoo! Answers database and obtains relevant privacy complaint data for further analysis.

Preliminary Analysis

Our four-member team started by searching the Yahoo! Answers search engine for privacy complaints posted by Yahoo! users, as well as for the terms or phrases that produced relevant data. The idea was to look for relevant questions about individuals' privacy concerns.

Method

The initial process was to connect to Yahoo! Answers, query it for specific keywords, and store the results in a MySQL database. The flowchart figure summarizes how the script is executed.

Results

A database is an organized collection of data. A table with eight rows was created within the database to store the results collected from Yahoo! Answers. As of July 2010, our team has generated over 7,000 results, including the keywords related to privacy, using the previously described tool. In our preliminary research, our team derived a list of about 12 words and phrases related to privacy that were effective at collecting data.

Figure captions (images not included in this transcript)

Flowchart to create Python command line tool.
Analyzing the text using Word Cloud Generator.
Table of fields created in the database.
Yahoo! Answers search engine.
Database populated with the results.

This work was supported by the TRUST Center (NSF award number CCF-0424422).
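The store-and-analyze steps described under Method and Many Eyes Visualization Analysis can be illustrated with a minimal Python sketch. It is not the project's actual script, and several things in it are assumptions: SQLite (from the standard library) stands in for MySQL so the example runs without a server, the hard-coded `SAMPLE_RESULTS` records stand in for responses from a Yahoo! Answers query, the three entries in `PRIVACY_TERMS` are illustrative rather than the poster's list of about 12 terms, and the table layout is hypothetical. The word counts at the end only approximate the kind of frequency data a word cloud is built from; they do not reproduce Many Eyes.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical search terms; the poster does not list the terms it used.
PRIVACY_TERMS = ["privacy", "identity theft", "personal information"]

# Hypothetical records standing in for results returned by a keyword query.
SAMPLE_RESULTS = [
    {"question_id": "q1",
     "subject": "Is my privacy at risk online?",
     "content": "I worry about my personal information being shared."},
    {"question_id": "q2",
     "subject": "Best pizza in town?",
     "content": "Looking for recommendations."},
    {"question_id": "q3",
     "subject": "Someone stole my identity",
     "content": "Is identity theft a privacy violation?"},
]


def matching_terms(text, terms):
    """Return the terms that occur in the text (case-insensitive)."""
    lowered = text.lower()
    return [term for term in terms if term in lowered]


def store_results(conn, results, terms):
    """Store records that mention at least one term; return how many matched."""
    # Hypothetical table layout, not the poster's table of fields.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "question_id TEXT PRIMARY KEY, subject TEXT, content TEXT, keyword TEXT)"
    )
    matched = 0
    for record in results:
        hits = matching_terms(record["subject"] + " " + record["content"], terms)
        if not hits:
            continue  # skip records unrelated to privacy
        conn.execute(
            "INSERT OR REPLACE INTO results VALUES (?, ?, ?, ?)",
            (record["question_id"], record["subject"],
             record["content"], ", ".join(hits)),
        )
        matched += 1
    conn.commit()
    return matched


def word_frequencies(conn, min_length=4):
    """Count words of at least min_length letters across all stored text."""
    counts = Counter()
    for subject, content in conn.execute("SELECT subject, content FROM results"):
        words = re.findall(r"[a-z']+", f"{subject} {content}".lower())
        counts.update(word for word in words if len(word) >= min_length)
    return counts


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(store_results(connection, SAMPLE_RESULTS, PRIVACY_TERMS), "records stored")
    print(word_frequencies(connection).most_common(5))
```

Keeping the matched keywords in their own column makes it possible to see later which search terms produced each stored record, which is one way to judge which terms collect relevant data.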
