Data Mining
Models Created by Data Mining • Linear Equations • Rules • Clusters • Graphs • Tree Structures • Recurrent Patterns
Knowledge Discovery in Databases (KDD) • Select target data • Preprocess data • Transform (if necessary) • Data mine information • Interpret discovered structures
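The five KDD steps above can be sketched as a chain of small functions. This is only an illustration: the records, field names, and function names are hypothetical, and the "mining" step is a simple per-region total rather than a real mining algorithm.

```python
from collections import Counter

# Hypothetical raw records; None marks a missing value.
raw = [
    {"region": "north", "amount": "120.5", "year": 2009},
    {"region": "south", "amount": None, "year": 2009},
    {"region": "north", "amount": "80.0", "year": 2008},
    {"region": "south", "amount": "200.0", "year": 2009},
]

def select(records, year):
    """Select target data: keep only the records of interest."""
    return [r for r in records if r["year"] == year]

def preprocess(records):
    """Preprocess: drop records with missing values."""
    return [r for r in records if r["amount"] is not None]

def transform(records):
    """Transform: convert text amounts to numbers."""
    return [{**r, "amount": float(r["amount"])} for r in records]

def mine(records):
    """Data mine: total amount per region (a deliberately simple pattern)."""
    totals = Counter()
    for r in records:
        totals[r["region"]] += r["amount"]
    return dict(totals)

def interpret(totals):
    """Interpret: report the region with the largest total."""
    return max(totals, key=totals.get)

totals = mine(transform(preprocess(select(raw, 2009))))
top_region = interpret(totals)
print(totals, top_region)
```

Each step only consumes the output of the previous one, so any stage can be swapped for a more realistic implementation without touching the others.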
Dependent and Independent Variables • Dependent Variable - Attribute to be predicted. • Independent Variable - Attributes used for making the prediction.
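As a minimal sketch of the distinction, the example below fits a straight line by ordinary least squares, using one independent variable to predict one dependent variable. The data and the names `spend` and `sales` are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: spend is the independent variable,
# sales is the dependent variable we want to predict.
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(spend, sales)
predicted = slope * 5.0 + intercept  # prediction for an unseen spend value
print(slope, intercept, predicted)
```

The fitted line is also an example of a linear-equation model of the kind a data mining tool might produce.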
Fields Contributing to Data Mining • Database Technology • Statistics • Machine Learning • High Performance Computing • Pattern Recognition • Neural Networks • Data Visualization • Information Retrieval
Applications of Data Mining • Decision Making • Process Control • Information Management • Query Processing
Methods of Data Reduction • Drill-down analysis • Clustering • Aggregation • Simple Tabulation
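Two of the methods listed above, simple tabulation and aggregation, can be illustrated in a few lines. The transaction data is invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical (category, amount) transactions.
transactions = [
    ("books", 12.0),
    ("music", 8.0),
    ("books", 20.0),
    ("games", 30.0),
    ("music", 4.0),
]

# Simple tabulation: number of records per category.
counts = Counter(category for category, _ in transactions)

# Aggregation: reduce many rows to one total and one mean per category.
sums = defaultdict(float)
for category, amount in transactions:
    sums[category] += amount
means = {category: sums[category] / counts[category] for category in counts}

print(dict(counts))
print(dict(sums))
print(means)
```

Five rows are reduced to three summary rows; on a real data set the reduction is usually far larger.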
Exploratory Data Analysis (EDA) • Distributions of Variables • Correlation Matrices • Multi-way Frequency Tables • Cluster Analysis • Classification Trees • Other multivariate techniques
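A correlation matrix, one of the EDA tools listed above, can be sketched with the Pearson coefficient computed by hand. The column names and values are hypothetical and chosen so the correlations are easy to check.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical variables measured on four customers.
columns = {
    "age": [25, 35, 45, 55],
    "income": [30, 40, 50, 60],
    "visits": [8, 6, 4, 2],
}

names = list(columns)
matrix = {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

for a in names:
    print(a, [round(matrix[a][b], 2) for b in names])
```

Values near +1 or -1 flag pairs of variables that move together or in opposite directions, which is often the first thing an analyst looks for.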
Statistical Methods Used in Data Mining • Regression Analysis • Standard Distribution • Cluster Analysis
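Cluster analysis can be illustrated with a small one-dimensional k-means loop. K-means is one common clustering technique, chosen here as an assumption; the slides do not name a specific algorithm. The values and starting centers are invented.

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny k-means for 1-D data: assign to nearest center, then recompute."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters

values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
print(centers, clusters)
```

The algorithm needs the number of clusters and starting centers up front, and different starting points can give different results on less tidy data.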
Industries Using Data Mining • Banking • Insurance • Medicine • Retail • Security • Sciences
Financial Uses of Data Mining • Fraud Detection • Money Laundering Detection • Risk Management
Medical Uses of Data Mining • Chemical Compounds • Genetic Material • Predictive Treatment Models
Retail Uses of Data Mining • Direct Marketing • Store Design • Store Operations
Security Uses of Data Mining • Assess crime patterns • Homeland Security • Identification of suspicious activities • Pre-screening
Scientific Uses of Data Mining • Image analysis • Classification of large data sets
Other Novel Uses for Data Mining • NBA’s Advanced Scout Program • Firefly
Predictive Analytics • An advanced form of data mining that builds models to predict the behavior of variables in large data sets. • Highly specialized for each application
Uses of Predictive Analytics • Cost-Benefit Analysis • Predicting Customer Behavior • Reducing Costs
Financial Uses of Predictive Analytics • Credit Ratings • Economic Prediction Models • Federal Reserve
Text Mining • Extracts information from unstructured data sets • Allows data mining of large data sets that are not stored in databases
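A first step in many text mining tasks is turning free text into counts of terms. The sketch below assumes a tiny hand-picked stop-word list and two invented customer comments; real systems use much larger vocabularies and more careful tokenization.

```python
import re
from collections import Counter

# Hypothetical stop-word list: common words that carry little meaning.
STOPWORDS = {"the", "a", "and", "was", "is", "to", "of"}

def term_counts(documents):
    """Count how often each non-stop-word appears across the documents."""
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts

docs = [
    "The delivery was late and the box was damaged.",
    "Late delivery again, but the refund was quick.",
]

counts = term_counts(docs)
print(counts.most_common(3))
```

Once text is reduced to counts like these, the usual data mining methods (clustering, classification, and so on) can be applied to it.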
Sentiment Analysis • Uses semantic techniques and keywords to detect favorable and unfavorable opinions toward specific subjects.
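The keyword side of this idea can be sketched as follows. The word lists and the `sentiment` function are hypothetical and far too small for real use; the example covers only keyword matching, not the semantic techniques mentioned above.

```python
import re

# Hypothetical keyword lists.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}

def sentiment(text, subject):
    """Classify the opinion in text toward subject.

    Returns 'favorable', 'unfavorable', or 'neutral',
    or None when the subject is not mentioned at all.
    """
    words = re.findall(r"[a-z]+", text.lower())
    if subject.lower() not in words:
        return None
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "favorable"
    if score < 0:
        return "unfavorable"
    return "neutral"

print(sentiment("I love the new camera, great photos", "camera"))
print(sentiment("The battery is terrible", "battery"))
print(sentiment("The battery is terrible", "camera"))
```

Pure keyword counting misses negation and sarcasm ("not good"), which is one reason semantic techniques are needed as well.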
Privacy Concerns with Data Mining • Big Brother • Puts too much power into the hands of Governmental Security Forces
False Positives in Data Mining for Security • Impose costs on both the public and the Government • A source of controversy and civilian mistrust
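A short worked calculation shows why false positives matter so much when the thing being searched for is rare. All numbers below are hypothetical and chosen only to make the arithmetic clear.

```python
def screening_outcomes(population, true_cases, sensitivity, false_positive_rate):
    """Expected results of screening a population for a rare condition."""
    true_positives = true_cases * sensitivity
    false_positives = (population - true_cases) * false_positive_rate
    # Precision: the share of flagged people who are real cases.
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision

# Assumed scenario: 1,000,000 people screened, 100 real cases,
# 99% of real cases flagged, 1% of innocent people wrongly flagged.
tp, fp, precision = screening_outcomes(1_000_000, 100, 0.99, 0.01)
print(round(tp), round(fp), round(precision, 4))
```

Under these assumptions about 99 real cases are flagged alongside roughly 9,999 innocent people, so fewer than 1 in 100 flagged individuals is a real case, even though the system is "99% accurate" in both directions.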
Data Mining as Another Tool for Security • Government doesn’t wish to interfere in civilian life • Actual intrusions of privacy incur legal costs • Useful for correlating with other sources of data
Visual and Speech Processing • Examining large amounts of real-time input for specific data and relationships between data • Requires a certain amount of predictive modeling
Data Mining is an Essential Use of Computers • It makes the previously impossible possible • Powerful tool for progress and understanding • Lasting Impact