Data Mining. CS 157B Section 2 Keng Teng Lao. Overview. Definition of Data Mining Application of Data Mining. Data Mining . Refers to the mining or discovery of new information in terms of patterns or rules from vast amounts of data.
Data Mining CS 157B Section 2 Keng Teng Lao
Overview • Definition of Data Mining • Application of Data Mining
Data Mining • Refers to the mining or discovery of new information in terms of patterns or rules from vast amounts of data. • To be useful, data mining must be carried out efficiently on large files and databases.
KDD • Knowledge Discovery in Databases • [Figure: the KDD process. Databases → Data Cleaning and Data Integration → Data Warehouse → Selection → Task-relevant Data → Data Mining → Pattern Evaluation → Knowledge]
Data Mining Vs. Data Warehousing • The goal of a data warehouse is to support decision making with data. • Data Mining can be used in conjunction with a data warehouse to help with certain types of decisions
Goals of Data Mining and Knowledge Discovery • Prediction – Data mining can show how certain attributes within the data will behave in the future. • Identification – Data patterns can be used to identify the existence of an item, an event, or an activity.
Cont. • Classification – Data mining can partition the data so that different classes or categories can be identified based on combinations of parameters. • Optimization – One eventual goal of data mining may be to optimize the use of limited resources such as time, space, … to maximize output variables such as sales or profits under a given set of constraints.
Types of Knowledge Discovered During Data Mining • Association rules • Classification hierarchies • Sequential patterns • Patterns within time series • Clustering
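As a small illustration of the first item, an association rule such as "milk ⇒ bread" is usually scored by its support and confidence. The sketch below assumes a tiny, made-up set of market-basket transactions; the data and the helper names `support` and `confidence` are illustrative assumptions, not part of the original slides.

```python
# Hypothetical market-basket transactions (illustrative data only).
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "juice"},
    {"milk", "eggs"},
    {"bread", "juice"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(lhs, rhs, transactions):
    """Support of lhs and rhs together, divided by the support of lhs."""
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)

# Score the rule milk => bread.
rule_support = support({"milk", "bread"}, transactions)
rule_confidence = confidence({"milk"}, {"bread"}, transactions)
print(rule_support, rule_confidence)
```

Here two of the four baskets contain both items, and two of the three baskets with milk also contain bread. A real miner would enumerate candidate rules and keep only those above user-chosen thresholds.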
Classification hierarchies • Process of learning a model that describes different classes of data. • Decision Tree
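To make "learning a model that describes different classes" concrete, the sketch below learns the simplest possible decision tree, a single split on one numeric attribute. The training data, the labels, and the function names `learn_stump` and `classify` are hypothetical; a full decision-tree learner would split recursively and typically use a measure such as information gain rather than raw error counts.

```python
from collections import Counter

# Hypothetical training data: (income in thousands, class label).
training = [
    (15, "reject"), (22, "reject"), (28, "reject"),
    (35, "approve"), (48, "approve"), (60, "approve"),
]

def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def learn_stump(rows):
    """Return (threshold, left_label, right_label) with the fewest training errors."""
    values = sorted({x for x, _ in rows})
    best = None
    # Try a threshold halfway between each pair of adjacent attribute values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in rows if x <= threshold]
        right = [y for x, y in rows if x > threshold]
        left_label, right_label = majority(left), majority(right)
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return best[1:]

def classify(stump, x):
    """Apply the learned one-level tree to a new attribute value."""
    threshold, left_label, right_label = stump
    return left_label if x <= threshold else right_label

stump = learn_stump(training)
print(stump, classify(stump, 25), classify(stump, 50))
```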
Sequential Patterns • The discovery of sequential patterns is based on the concept of a sequence of itemsets. • The goal is to find all subsequences of the given set of sequences that meet a user-defined minimum support.
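The support test for one candidate pattern can be sketched as follows. A pattern counts as a subsequence of a customer sequence when each of its itemsets is contained, in order, in a distinct itemset of that sequence. The sequences, the pattern, and the threshold `min_support` are made-up values for illustration; a real miner would also generate the candidate patterns rather than check a single one.

```python
# Hypothetical customer sequences: each is an ordered list of itemsets.
sequences = [
    [{"milk", "bread"}, {"juice"}, {"eggs"}],
    [{"milk"}, {"bread", "juice"}],
    [{"juice"}, {"milk", "bread"}],
    [{"milk", "bread", "juice"}],
]

def is_subsequence(pattern, sequence):
    """True if the pattern's itemsets occur, in order, inside distinct itemsets of sequence."""
    position = 0
    for itemset in pattern:
        # Greedily advance to the first later itemset that contains this one.
        while position < len(sequence) and not itemset <= sequence[position]:
            position += 1
        if position == len(sequence):
            return False
        position += 1
    return True

def sequence_support(pattern, sequences):
    """Fraction of sequences that contain the pattern as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in sequences) / len(sequences)

min_support = 0.5  # assumed user-defined threshold
pattern = [{"milk"}, {"juice"}]  # milk bought first, juice in a later transaction
pattern_support = sequence_support(pattern, sequences)
is_frequent = pattern_support >= min_support
print(pattern_support, is_frequent)
```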
Patterns within Time Series • Time series are sequences of events. • Each event may be a given fixed type of transaction. • The closing price of a stock or a fund is an event that occurs every weekday for each stock and fund.
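One very simple pattern to look for in such a series is a stretch of consecutive days on which the closing price keeps rising. The sketch below finds the longest such stretch; the prices and the function name `longest_rising_run` are hypothetical, and this is only one of many kinds of time-series pattern.

```python
# Hypothetical daily closing prices for one stock (illustrative data only).
closing_prices = [101.0, 102.5, 103.1, 102.8, 103.0, 104.2, 105.0, 104.1]

def longest_rising_run(prices):
    """Return (start_index, length) of the longest run of strictly increasing prices."""
    if not prices:
        return (0, 0)
    best_start, best_len = 0, 1
    start = 0
    for i in range(1, len(prices)):
        # A price that fails to rise begins a new run at day i.
        if prices[i] <= prices[i - 1]:
            start = i
        if i - start + 1 > best_len:
            best_start, best_len = start, i - start + 1
    return best_start, best_len

run_start, run_length = longest_rising_run(closing_prices)
print(run_start, run_length)
```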
Application of Data Mining • Marketing – Applications include analysis of consumer behavior based on buying patterns. • Finance – Applications include analysis of creditworthiness of clients, segmentation of account receivables…
Cont. • Manufacturing – Applications involve optimization of resources like machines, manpower, and materials • Health Care – Applications include discovering patterns in radiological images, analyzing side effects of drugs…
Real Life Application • The Los Angeles Police Department's counterterrorism unit is using a new data-analysis system designed to identify and connect related pieces of intelligence to help officers deter and respond to terrorist attacks.
Reference • Elmasri, Ramez, and Navathe, Shamkant B. Fundamentals of Database Systems. Pearson. Singapore. 2004. • LAPD turns to data analysis to fight terrorism. <http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=107670>