
Data Warehousing Data Mining Privacy


Presentation Transcript


  1. Data Warehousing, Data Mining, Privacy

  2. Reading CSCE 824 - Spring 2011

  3. Data Warehousing • Repository of data providing organized and cleaned enterprise-wide data (obtained from a variety of sources) in a standardized format • Data mart (single subject area) • Enterprise data warehouse (integrated data marts) • Metadata

  4. OLAP Analysis • Aggregation functions • Factual data access • Complex criteria • Visualization
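
The aggregation functions listed above can be illustrated with a small roll-up over a fact table. This is only a sketch: the fact rows, the dimension names, and the helpers `roll_up` and `slice_facts` are hypothetical and not part of the original slides.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, sales amount).
FACTS = [
    ("East", "Widget", "Q1", 100),
    ("East", "Widget", "Q2", 150),
    ("East", "Gadget", "Q1", 80),
    ("West", "Widget", "Q1", 120),
    ("West", "Gadget", "Q2", 60),
]

# Assumed dimension order; the last column of each row is the measure.
DIMENSIONS = ("region", "product", "quarter")


def roll_up(facts, keep):
    """Sum the measure over every dimension that is not listed in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]
    return dict(totals)


def slice_facts(facts, dimension, value):
    """Keep only the rows whose `dimension` equals `value` (an OLAP slice)."""
    i = DIMENSIONS.index(dimension)
    return [row for row in facts if row[i] == value]


# Aggregate sales by region, then by product within the first quarter only.
by_region = roll_up(FACTS, ["region"])
q1_by_product = roll_up(slice_facts(FACTS, "quarter", "Q1"), ["product"])
print(by_region)
print(q1_by_product)
```

Combining `slice_facts` with `roll_up` is one simple way to express the "complex criteria" bullet: filter first, then aggregate.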

  5. Warehouse Evaluation • Enterprise-wide support • Consistency and integration across diverse domains • Security support • Support for operational users • Flexible access for decision makers

  6. Data Integration • Data access • Data federation • Change capture • Need ETL (extraction, transformation, load)
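
A minimal ETL sketch may help make the three steps concrete. Everything here is assumed for illustration: the two source systems, their field names, and the `purchases` table do not come from the slides, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical source systems with inconsistent field names and formats.
SALES_SOURCE = [
    {"cust": " Alice ", "amount_usd": "10.50"},
    {"cust": "BOB", "amount_usd": "3"},
]
WEB_SOURCE = [{"customer_name": "alice", "cents": 250}]


def extract():
    """Extraction: pull raw records from each source, tagged with their origin."""
    for rec in SALES_SOURCE:
        yield "sales", rec
    for rec in WEB_SOURCE:
        yield "web", rec


def transform(source, rec):
    """Transformation: map each source's schema onto one standardized row."""
    if source == "sales":
        name, cents = rec["cust"], round(float(rec["amount_usd"]) * 100)
    else:
        name, cents = rec["customer_name"], rec["cents"]
    return name.strip().lower(), cents, source


def load(conn, rows):
    """Load: write the standardized rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS purchases "
        "(customer TEXT, cents INTEGER, source TEXT)"
    )
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, [transform(source, rec) for source, rec in extract()])
totals = dict(
    conn.execute("SELECT customer, SUM(cents) FROM purchases GROUP BY customer")
)
print(totals)
```

Keeping the `source` column in the loaded rows is a design choice that preserves provenance, which matters later when source data quality and derived data quality are compared.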

  7. Data Warehouse Users • Internal users • Employees • Managerial • External users • Reporting and auditing • Research

  8. Data Mining • Databases to be mined • Knowledge to be mined • Techniques Used • Applications supported

  9. Data Mining Task • Prediction Tasks • Use some variables to predict unknown or future values of other variables • Description Tasks • Find human-interpretable patterns that describe the data

  10. Common Tasks • Classification [Predictive] • Clustering [Descriptive] • Association Rule Mining [Descriptive] • Sequential Pattern Mining [Descriptive] • Regression [Predictive] • Deviation Detection [Predictive]
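
As one example of a descriptive task from the list above, association rule mining rests on two standard measures, support and confidence. The sketch below computes both for a toy set of market-basket transactions; the transactions and function names are made up for illustration.

```python
# Hypothetical market-basket transactions.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
    {"bread", "milk", "beer"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Evaluate the rule {bread} -> {milk}.
print(support({"bread", "milk"}, TRANSACTIONS))
print(confidence({"bread"}, {"milk"}, TRANSACTIONS))
```

A rule is usually reported only when both measures exceed thresholds chosen by the analyst, which keeps the output human-interpretable.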

  11. Security for Data Warehousing • Establish the organization's security policies and procedures • Implement logical access control • Restrict physical access • Establish internal control and auditing

  12. Security for Data Warehousing (cont.) • Security Issues in Data Warehousing and Data Mining: Panel Discussion • Panelists: Bhavani Thuraisingham (The MITRE Corporation), Linda Schlipper (The MITRE Corporation), Pierangela Samarati (SRI International), T. Y. Lin (San Jose State University), Sushil Jajodia (George Mason University), Chris Clifton (The MITRE Corporation), xanadu.cs.sjsu.edu/~tylin/publications/paperList/109_security.ps

  13. Integrity • Poor quality data: inaccurate, incomplete, missing meta-data • Source data quality vs. derived data quality

  14. Access Control • Layered defense: • Access to processes that extract operational data • Access to data and processes that transform operational data • Access to data and meta-data in the warehouse
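
The layered defense above can be sketched as a per-layer permission check. The role names and the policy table are assumptions made for this example only; a real warehouse would derive them from the organization's security policy.

```python
from enum import Enum


class Layer(Enum):
    """The three layers named on the slide."""
    EXTRACTION = "extraction"
    TRANSFORMATION = "transformation"
    WAREHOUSE = "warehouse"


# Hypothetical policy: which roles may touch which layer.
POLICY = {
    "etl_operator": {Layer.EXTRACTION, Layer.TRANSFORMATION},
    "analyst": {Layer.WAREHOUSE},
    "dba": {Layer.EXTRACTION, Layer.TRANSFORMATION, Layer.WAREHOUSE},
}


def is_allowed(role, layer):
    """Unknown roles are denied by default."""
    return layer in POLICY.get(role, set())


def require(role, layer):
    """Raise PermissionError unless `role` may access `layer`."""
    if not is_allowed(role, layer):
        raise PermissionError(f"{role!r} may not access the {layer.value} layer")


require("analyst", Layer.WAREHOUSE)  # passes silently
print(is_allowed("analyst", Layer.EXTRACTION))
```

Checking each layer separately means that an analyst who can query the warehouse still cannot reach the operational extraction processes.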

  15. Access Control Issues • Mapping from local to warehouse policies • How to handle “new” data • Scalability • Identity Management

  16. Inference Problem • Data Mining: discover “new knowledge” → how to evaluate security risks? • Example security risks: • Prediction of sensitive information • Misuse of information • Assurance of “discovery” • Interesting Read: C. C. Aggarwal and P.S. Yu, PRIVACY-PRESERVING DATA MINING: MODELS AND ALGORITHMS, http://charuaggarwal.net/toc.pdf

  17. Privacy • Large volume of private (personal) data • Need: • Proper acquisition, maintenance, usage, and retention policy • Integrity verification • Control of analysis methods (aggregation may reveal sensitive data)
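
To see why aggregation may reveal sensitive data, consider a differencing attack: two permitted aggregate queries whose difference isolates one individual. The salary table, the names, and the `MIN_QUERY_SET` threshold below are hypothetical and chosen only to illustrate the point.

```python
# Hypothetical sensitive data: individual salaries are private.
SALARIES = {"ann": 50000, "bob": 62000, "cho": 58000, "dee": 71000}


def total_salary(names):
    """An aggregate query that looks harmless on its own."""
    return sum(SALARIES[n] for n in names)


# Differencing attack: subtract two aggregates that differ by one person.
everyone = set(SALARIES)
leaked = total_salary(everyone) - total_salary(everyone - {"dee"})

# A simple control (assumed threshold): refuse very small query sets.
MIN_QUERY_SET = 3


def guarded_total(names):
    names = set(names)
    if len(names) < MIN_QUERY_SET:
        raise ValueError("query set too small")
    return total_salary(names)


# Both queries below satisfy the size threshold, yet the difference still leaks.
still_leaked = guarded_total(everyone) - guarded_total(everyone - {"dee"})
print(leaked, still_leaked)
```

The guarded version blocks a direct query on one person but not the differencing attack, which suggests that a size restriction alone is not a sufficient control over analysis methods.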

  18. Privacy • What is the difference between confidentiality and privacy? • Identity, location, activity, etc. • Anonymity vs. accountability

  19. Legislation • Privacy Act of 1974, U.S. Department of Justice (http://www.usdoj.gov/oip/04_7_1.html) • Family Educational Rights and Privacy Act (FERPA), U.S. Department of Education (http://www.ed.gov/policy/gen/guid/fpco/ferpa/index.html) • Health Insurance Portability and Accountability Act of 1996 (HIPAA) (http://en.wikipedia.org/wiki/Health_Insurance_Portability_and_Accountability_Act) • Telecommunications Consumer Privacy Act (http://www.answers.com/topic/electronic-communications-privacy-act)

  20. Online Social Network • Social Relationship • Communication context changes social relationships • Social relationships maintained through different media grow at different rates and to different depths • No clear consensus on which medium is best

  21. Internet and Social Relationships Internet • Bridges distance at a low cost • New participants tend to “like” each other more • Less stressful than face-to-face meeting • People focus on communicating their “selves” (except a few malicious users)

  22. Social Network • Description of the social structure between actors • Connections: various levels of social familiarity, e.g., from casual acquaintance to close familial bonds • Support online interaction and content sharing

  23. Social Network Analysis • The mapping and measuring of relationships and flows between people, groups, organizations, computers or other information processing entities • Behavioral Profiling • Note: Social Network Signatures • User names may change, family and friends are more difficult to change
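
The note on social network signatures can be made concrete with a similarity measure over friend sets. The sketch below uses Jaccard similarity, which is a choice made for this example and not something the slides prescribe; the account names and friend lists are invented.

```python
def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical data: a known account and two accounts under new user names.
OLD_ACCOUNT_FRIENDS = {"ann", "bob", "cho", "dee"}
CANDIDATES = {
    "night_owl": {"ann", "bob", "cho", "eve"},
    "traveler42": {"eve", "fay"},
}


def best_match(friends, candidates):
    """Return the candidate whose friend set overlaps most with `friends`."""
    return max(candidates, key=lambda name: jaccard(friends, candidates[name]))


print(best_match(OLD_ACCOUNT_FRIENDS, CANDIDATES))
```

Because the friend set changes slowly, a high overlap can link a renamed account back to its earlier identity even though the user name itself carries no clue.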

  24. Interesting Read: • M. Chew, D. Balfanz, B. Laurie, (Under)mining Privacy in Social Networks, http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.149.4468

  25. Next: Hippocratic Databases

  26. Next Class: Stream Data
