
Data Mining: An Overview

By: Joseph Casabona


Presentation Transcript


  1. DATA MINING: AN OVERVIEW • By: Joseph Casabona

  2. OVERVIEW • What is Data Mining? • Introduction to KDD • Types of Data Found Using Data Mining • The Four Goals of Data Mining • Case Study: MetLife

  3. WHAT IS DATA MINING? • Definition: The mining or discovery of new information, in the form of patterns or rules, from vast amounts of data • Adds functionality beyond what a DBMS provides • Uncovers relationships within the data • One step in the KDD process

  4. KDD • Stands for "Knowledge Discovery in Databases" • A six-step process that helps us organize existing data and extract new information from it • The six steps are: data selection, cleansing, enrichment, transformation, mining, and report generation.

  5. KDD CONT. • Selection and cleansing gather and validate the data to make sure it is accurate, complete, and well-formed. • Enrichment adds to the data from other sources. • Transformation then reduces the data, for example by grouping values into broader categories.
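The six KDD steps can be sketched as a small pipeline. This is a minimal illustration, not a reference implementation: the records, field names, lookup tables, and function names are all hypothetical, and the "mining" step here is a simple count rather than a real mining algorithm.

```python
from collections import Counter

# Hypothetical raw records; every field name is an assumption for illustration.
RAW = [
    {"customer": "a", "zip": "10001", "item": "milk", "year": 2003},
    {"customer": "b", "zip": "", "item": "bread", "year": 2003},  # missing ZIP
    {"customer": "c", "zip": "02108", "item": "milk", "year": 2003},
    {"customer": "d", "zip": "10001", "item": "eggs", "year": 1999},
]

# Hypothetical external source used in the enrichment step.
ZIP_TO_STATE = {"10001": "NY", "02108": "MA"}

# Hypothetical grouping used in the transformation step.
ITEM_TO_CATEGORY = {"milk": "dairy", "eggs": "dairy", "bread": "bakery"}


def select(records, year):
    """Selection: keep only the records relevant to the question."""
    return [r for r in records if r["year"] == year]


def cleanse(records):
    """Cleansing: drop records that fail validation (here, an empty ZIP)."""
    return [r for r in records if r["zip"]]


def enrich(records):
    """Enrichment: add a field taken from another source."""
    return [{**r, "state": ZIP_TO_STATE.get(r["zip"], "unknown")} for r in records]


def transform(records):
    """Transformation: reduce each record to a coarser representation."""
    return [
        {"state": r["state"], "category": ITEM_TO_CATEGORY[r["item"]]}
        for r in records
    ]


def mine(records):
    """Mining (stand-in): count how often each state/category pair occurs."""
    return Counter((r["state"], r["category"]) for r in records)


def report(counts):
    """Report generation: turn the counts into readable lines."""
    return [f"{state}: {category} x{n}" for (state, category), n in sorted(counts.items())]


def run_kdd(records, year):
    """Run the six steps in order."""
    return report(mine(transform(enrich(cleanse(select(records, year))))))
```

Calling `run_kdd(RAW, 2003)` runs each step in the order listed on the slides; swapping in a real mining algorithm would only change `mine`.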

  6. DATA MINING • The result is new information that the user could not obtain through standard querying alone. • It can take the form of: • Association Rules • Sequential Patterns • Classification Trees
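As an illustration of the first form, an association rule X → Y is usually scored by its support (the fraction of transactions containing both X and Y) and its confidence (the fraction of transactions containing X that also contain Y). The sketch below computes both; the basket data and function names are made up for the example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing `antecedent`, the fraction that also contain `consequent`."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


# Hypothetical shopping baskets.
BASKETS = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"milk"},
    {"bread"},
]

# Rule "milk -> bread": support 2/4, confidence 2/3.
rule_support = support(BASKETS, {"milk", "bread"})
rule_confidence = confidence(BASKETS, {"milk"}, {"bread"})
```

A mining tool would search over many candidate rules and keep those above chosen support and confidence thresholds; this sketch only scores one rule.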

  7. THE FOUR GOALS OF DATA MINING • Prediction: Using current data to make predictions about future activity • Identification: "Data patterns can be used to identify the existence of an item, an event, or an activity"

  8. THE FOUR GOALS CONT. • Classification: Breaking the data down into categories based on certain attributes. • Optimization: Using the mined data to optimize the use of resources such as time and money.
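Classification can be pictured as walking a small decision tree over a record's attributes. The tree below is written by hand purely for illustration; the attribute names, thresholds, and category labels are assumptions, and a real data mining tool would learn the splits from data instead.

```python
def classify(customer):
    """Assign a category by testing attributes in order (a hand-written tree)."""
    if customer["claims_last_year"] >= 3:
        return "high risk"
    if customer["age"] < 25:
        return "medium risk"
    return "low risk"


def group_by_class(customers):
    """Break the data down into categories based on the classifier's output."""
    groups = {}
    for c in customers:
        groups.setdefault(classify(c), []).append(c["name"])
    return groups


# Hypothetical customer records.
CUSTOMERS = [
    {"name": "Ann", "age": 40, "claims_last_year": 0},
    {"name": "Bo", "age": 22, "claims_last_year": 1},
    {"name": "Cy", "age": 30, "claims_last_year": 4},
]
```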

  9. DATA MINING EXAMPLES • Most have been consumer-based • Applicable in most industries • Next: Case Study on MetLife

  10. CASE STUDY: METLIFE • Company Profile: MetLife, Inc. is a leading provider of insurance and other financial services to millions of individual and institutional customers throughout the United States. Established in 1863, MetLife now has offices across the US and around the world, and offers ten different types of insurance and financial services.

  11. CASE STUDY: METLIFE • Industry: Insurance and Financial Services • How they use data mining: Fraud detection

  12. CASE STUDY: METLIFE • The project started in 2001 • MetLife set out to build a $50 million relational database • This project would consolidate data from 30 businesses worldwide.

  13. CASE STUDY: METLIFE • Around the same time, it was reported that $30 million of insurance money had gone to fraudulent claims. • MetLife teamed up with Computer Sciences Corporation (CSC) to license CSC's data mining tool, Fraud Investigator, and to develop @First, "an early fraud detection system"

  14. CASE STUDY: METLIFE • By 2003, MetLife's data mining operation was in full swing. • They were able to detect fraud in a fraction of the time that manual review would take • One example is detecting rate evasion

  15. CASE STUDY: METLIFE • Rate evasion is lying about where you live in order to pay lower premiums. • MetLife used data mining to detect rate evasion by matching ZIP codes with phone numbers to see whether the cities matched. • In 2.5 hours, MetLife found 107 fraudulent claims, all linked to a rate-evasion ring in New York and Massachusetts.
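The ZIP-versus-phone check described above can be sketched as a simple mismatch filter. This is not MetLife's or CSC's actual implementation, which the slides do not describe; the lookup tables, record fields, and function name are hypothetical, and the sketch assumes phone numbers are stored as ten-digit strings whose first three digits are the area code.

```python
# Toy lookup tables, assumed for illustration only.
ZIP_TO_CITY = {"10001": "New York", "02108": "Boston", "12207": "Albany"}
AREA_CODE_TO_CITY = {"212": "New York", "617": "Boston", "518": "Albany"}


def flag_mismatches(policies):
    """Return IDs of policies whose ZIP city and phone city disagree."""
    flagged = []
    for p in policies:
        zip_city = ZIP_TO_CITY.get(p["zip"])
        phone_city = AREA_CODE_TO_CITY.get(p["phone"][:3])
        if zip_city is None or phone_city is None:
            continue  # unknown ZIP or area code: cannot compare, so skip
        if zip_city != phone_city:
            flagged.append(p["policy_id"])
    return flagged


# Hypothetical policy records.
POLICIES = [
    {"policy_id": "P1", "zip": "12207", "phone": "2125550100"},  # Albany ZIP, New York phone
    {"policy_id": "P2", "zip": "02108", "phone": "6175550101"},  # both Boston
    {"policy_id": "P3", "zip": "99999", "phone": "2125550102"},  # ZIP not in the table
]
```

A mismatch is only a lead for an investigator, not proof of fraud: people move, keep old phone numbers, or live near a city boundary.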

  16. QUESTIONS/COMMENTS?
