
Data Warehousing/Data Mining: A Crop Insurance Application

Presented by Ashley Lovell, Director of Agricultural Programs & Professor of Agricultural Economics, at the National Risk Management Conference, D-FW Airport, March 26, 2003.



Presentation Transcript


  1. Data Warehousing/Data Mining: A Crop Insurance Application • Presented by Ashley Lovell, Director of Agricultural Programs & Professor of Agricultural Economics • At the National Risk Management Conference, D-FW Airport, March 26, 2003

  2. Data Warehousing/Data Mining: A Crop Insurance Application • Presented by Ashley Lovell, Director of Agricultural Programs & Professor of Agricultural Economics • CAE Staff • Center for Agribusiness Excellence, Tarleton State University, The Texas A&M University System • Lovell@tarleton.edu

  3. Overview • ARPA 2000 • Data Warehouse • Data Mining Research • CAE Participants • Overview of Activities & Research • Cost-Benefit Analysis • Conclusion

  4. Agricultural Risk Protection Act of 2000 • Crop Insurance Coverage • Program Integrity • Research and Pilot Programs • Education and Risk Management Assistance

  5. RMA Customer Distribution 1999

  6. RMA Customer Distribution 2001

  7. Agricultural Risk Protection Act of 2000 • Improving Program Integrity – By Reducing Fraud, Waste and Abuse to Build a Stronger Crop Insurance Program and Lower Producer Costs • RMA and FSA to reconcile producer information* • RMA to establish methods to identify agents and adjusters who may be abusing the program • Center for Agribusiness Excellence (CAE) established *Spot Check Lists

  8. Agricultural Risk Protection Act of 2000 • Center for Agribusiness Excellence (CAE) • Mission: To conduct research using a single data warehouse and associated data mining tools to enhance the integrity of the Federal Crop Insurance Program • Began operating in January 2001

  9. Data Warehouse Description & Contents

  10. Data Warehouse • Massively Large Relational Database (Multi Gigabyte - Terabytes) • Generally Many Variables (Columns) • Usually > 1 Million Observations (Rows) • Multiple Tables (E.G., Data Tables) • Consistent Representation (Dates, Units, Etc.)

  11. CAE Data Warehouse Contents (Completed Tasks) • > 800 Million Records • Includes: • RMA Insurance Data 1991-2003 • NOAA Weather Data

  12. CAE Data Warehouse Contents (Tasks in Progress) • GIS Linkage of Weather Station Data • Integration of Soil Data

  13. CAE Data Warehouse (Other Databases to Be Loaded) • Remote Sensing Data: Collaboratively with Spatial Sciences Lab (SSL), Texas A&M University • Climatological Data: Collaboratively with University of Nebraska-Lincoln, USDA National Drought Mitigation Center (NDMC/UN-L)

  14. CAE Data Warehouse (Other Databases to Be Loaded) • Economic (e.g., Cash and Futures Market Data) • Soil Series Data: Collaboratively with USDA NRCS National Cartography Laboratory, SSL/TAMU & NDMC/UN-L

  15. Data Mining Research

  16. Overview of Data Mining Graphical Discovery Conditional Logic Affinities and Associations Data Mining Trends and Variations Predictive Modeling Outcome Prediction Forecasting Forensic Analysis Deviation Detection Link Analysis

  17. Modeling Methodology • Linear Regression • Logistic Regression • Neural Networks • Cluster Analysis • Classification Trees • Link Analysis • Genetic Algorithms

  18. Center for Agribusiness Excellence

  19. CAE’s Partners • Tarleton State University and Planning Systems Inc. (PSI) Are Partners in the Data Warehouse and Data Mining • USDA Risk Management Agency Research Project • Cooperative Agreement Signed on December 14, 2000 • Competitive Contract Awarded July 24, 2002, Effective September 1, 2002

  20. CAE’s Partner Contributions • PSI Has Expertise in Data Warehouse Development and Implementation • RMA Provides the Database and Program Operational Experience • Tarleton Has Expertise in Agriculture and CIS and Is the Project Contractor & Coordinator

  21. Overview of CAE Activities, January 2001-March 2003

  22. CAE Activities • University Personnel Assigned: Jan 2001 • Data Model Finished: Jun 2001 • RY 2000 Data “Readied”: Jun 2001 • Producer Watch List: Jun 2001 • Data Warehouse Loaded 1991-2000: Sep 2001 • ARPA 150% Delivered: Oct 2001 • Updated 1998-2000 Data, 2001 Data: Nov 2001 – Dec 2001 • Growing Season Spot Check Lists: Mar 2002 • NASS Data Integrated: May 2002 • Web Interface Operational: Aug 2002 • 49 Completed Projects: Sep 2002 – Jan 2003 • RMA Spot Check List 2002 Delivered: Feb 2003 • Last Delivery & Loading of Data: Mar 2003

  23. CAE Research Drivers • Legislation • Work Orders • Scenarios

  24. CAE Research Drivers Legislation, specifically • ARPA of 2000 “…The Secretary shall establish procedures under which the Corporation will be able to identify the following: …

  25. CAE Research Drivers • Any person performing loss adjustment services relative to coverage offered under this title where such loss adjustments performed by the person result in accepted or denied claims equal to or greater than 150 percent … of the mean for accepted or denied claims (as applicable) for all other persons performing loss adjustment services in the same area, as determined by the Corporation….” • In addition to crop adjusters, ARPA included crop insurance agents.
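The 150 percent test quoted above can be sketched in a few lines. This is only an illustration under stated assumptions, not CAE's or RMA's actual procedure: the record layout `(area, adjuster_id, claim_count)`, the function name `flag_150_percent`, and the choice of claim counts as the measured quantity are all hypothetical. The one detail taken from the statute's wording is that each person is compared with the mean for all *other* persons in the same area.

```python
from collections import defaultdict

# Hypothetical record layout: (area, adjuster_id, claim_count).
# Using claim counts as the measured quantity is an assumption for illustration.
def flag_150_percent(records, threshold=1.5):
    """Return (area, adjuster_id) pairs whose claim count is at least
    `threshold` times the mean of all other adjusters in the same area."""
    by_area = defaultdict(list)
    for area, adjuster, count in records:
        by_area[area].append((adjuster, count))

    flagged = []
    for area, rows in by_area.items():
        if len(rows) < 2:
            continue  # no other adjusters in the area to compare against
        total = sum(count for _, count in rows)
        for adjuster, count in rows:
            # Mean over everyone in the area except this adjuster.
            mean_others = (total - count) / (len(rows) - 1)
            if mean_others > 0 and count >= threshold * mean_others:
                flagged.append((area, adjuster))
    return flagged

sample = [
    ("A", "adj1", 10),
    ("A", "adj2", 12),
    ("A", "adj3", 30),  # mean of the others is 11, and 30 >= 16.5
    ("B", "adj4", 5),   # only adjuster in area B, so never flagged
]
flagged = flag_150_percent(sample)
print(flagged)  # [('A', 'adj3')]
```

Excluding the person being tested from the mean matters in small areas, where one extreme adjuster would otherwise pull the benchmark up and mask themselves.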

  26. CAE Research Drivers • Work Orders - RMA Personnel Routinely Submit Requests (That Result in Work Orders) Which Focus the Research Resources of CAE • Scenarios* • Over sixty scenarios/sub-scenarios • Initiated scenario development early in 2001 *Indicators of Fraud, Waste, and Abuse

  27. Spot Check List: 2002 Data for ARPA Requirement Scenarios for Spot Check: • Triplets • Frequent Filers • Yield Switching • Prevented Planting Frequent Filers • Producers Associated With All or Nothing Agents • Crop Units With Excessive Yields • Under Reported Harvested Production • Rare Big Losers

  28. Rare Big Losers • Identify Rare Multi-year Losers, Using the Probability of Loss • Local Yield Variability Considered • Cluster and Factor Analysis Show the Importance of Local Conditions • A Producer’s Loss Ratios Strongly Related to Insurance Plan and Coverage Level Spot Check 2002

  29. Iowa & Oklahoma Are Different! • Region, Insurance Plan, & Coverage Level

  30. Rare Big Losers • Average Indemnity of $98,664

  31. Rare Big Losers Results* • 350 Unique Producers Accounting for $34,532,565 of Indemnity in 2002 • They Were Flagged at the 0.0001 Level • Average Indemnity of $98,664 • 72.5 Percent of Their Policies Resulted in a Significant Loss *Indicators of Fraud, Waste, and Abuse
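The slides do not say which statistical test produced the 0.0001 flag, so the following is only a sketch of one plausible approach. It assumes a simple binomial model: if the local probability of a loss on any one policy is `p`, how likely is it that a producer has at least `k` losses in `n` policies purely by chance? The function names and the binomial assumption are hypothetical; the 0.0001 cutoff is the only value taken from the slide.

```python
from math import comb

def upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), computed by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def is_rare_big_loser(n_policies, n_losses, local_loss_prob, alpha=0.0001):
    """Flag a producer whose loss count would be this high by chance
    with probability below `alpha` (binomial model is an assumption)."""
    return upper_tail(n_policies, n_losses, local_loss_prob) < alpha

# With a 20% local loss probability, 8 losses in 10 policies is flagged,
# while 7 losses in 10 is not.
print(is_rare_big_loser(10, 8, 0.2))  # True
print(is_rare_big_loser(10, 7, 0.2))  # False
```

Using a local loss probability rather than a national one reflects the slides' point that local conditions, insurance plan, and coverage level strongly affect loss ratios.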

  32. All or Nothing • Producers Who Are Associated With All or Nothing Agents • All or Nothing Agents Are Those Agents Who Have Disproportionate Numbers of Crop Policies With Total Losses Compared to Other Agents Within the Same Area • Associated Producers Have Total Loss Claims • Associated Producers Who Were Indemnified in More Than One Year
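The producer side of this scenario amounts to a filter over policy records. The sketch below assumes the all-or-nothing agents have already been identified, and it uses a hypothetical record layout `(producer_id, agent_id, year, total_loss)`; none of these names come from the presentation. It keeps producers who had total-loss claims through a flagged agent in more than one year.

```python
from collections import defaultdict

# Hypothetical record layout: (producer_id, agent_id, year, total_loss).
def associated_producers(policies, flagged_agents):
    """Producers with total-loss claims through a flagged agent
    in more than one distinct year."""
    loss_years = defaultdict(set)
    for producer, agent, year, total_loss in policies:
        if agent in flagged_agents and total_loss:
            loss_years[producer].add(year)
    return sorted(p for p, years in loss_years.items() if len(years) > 1)

policies = [
    ("P1", "agentX", 2001, True),
    ("P1", "agentX", 2002, True),   # second total-loss year with a flagged agent
    ("P2", "agentX", 2002, True),   # only one year
    ("P3", "agentY", 2001, True),   # agentY is not flagged
    ("P3", "agentY", 2002, True),
    ("P4", "agentX", 2001, False),  # not a total loss
    ("P4", "agentX", 2002, True),
]
result = associated_producers(policies, {"agentX"})
print(result)  # ['P1']
```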

  33. All or Nothing Producers • $12,150,707 Indemnity for 236 Producers

  34. 2002 Spot Check List Summary

      Scenario*                               Indemnity    Producers
      Triplets                               $4,332,310           99
      Frequent Filers                       $21,718,632          328
      Yield Switching                       $15,486,631          285
      Prevented Planting Frequent Filers     $7,011,644           60
      All or Nothing                        $12,150,707          236
      Excessive Yield                       $36,201,574          389
      Under Reported Harvest Production     $23,502,812          225
      Rare Big Losers                       $32,817,867          323
      Unduplicated Totals                  $137,678,258        1,808

      *Indicators of Fraud, Waste, and Abuse
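The "Unduplicated Totals" row is smaller than the sum of the scenario rows because a producer can appear under more than one scenario. The sketch below first adds up the per-scenario figures from the summary, then shows the deduplication idea on a toy example. The producer IDs, the `unduplicated` helper, and the assumption that each producer carries a single indemnity figure across scenarios are hypothetical.

```python
# Per-scenario figures from the summary slide: (indemnity in dollars, producers).
SCENARIOS = {
    "Triplets": (4_332_310, 99),
    "Frequent Filers": (21_718_632, 328),
    "Yield Switching": (15_486_631, 285),
    "Prevented Planting Frequent Filers": (7_011_644, 60),
    "All or Nothing": (12_150_707, 236),
    "Excessive Yield": (36_201_574, 389),
    "Under Reported Harvest Production": (23_502_812, 225),
    "Rare Big Losers": (32_817_867, 323),
}

row_sum_indemnity = sum(ind for ind, _ in SCENARIOS.values())
row_sum_producers = sum(n for _, n in SCENARIOS.values())
print(row_sum_indemnity, row_sum_producers)  # 153222177 1945

def unduplicated(scenario_lists):
    """scenario_lists maps scenario -> {producer_id: indemnity}.
    Each producer is counted once, however many lists they appear on."""
    merged = {}
    for producers in scenario_lists.values():
        merged.update(producers)
    return sum(merged.values()), len(merged)

# Toy data: P2 appears on both lists but is counted once.
toy = {
    "Triplets": {"P1": 100, "P2": 50},
    "Yield Switching": {"P2": 50, "P3": 70},
}
print(unduplicated(toy))  # (220, 3)
```

On the slide's figures, the simple row sums ($153,222,177 and 1,945 producers) exceed the unduplicated totals ($137,678,258 and 1,808 producers), which is consistent with some producers being flagged under several scenarios.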

  35. Total 2002 Spot Check List: $137,678,258 Indemnity for 1,808 Insureds

  36. Data Mining Activities Publicized in Weekly Newsletter • Newsletter Volume 1, No. 1, Week of February 7, 2003 • This week, the development of the 2003 Spot Check List is a continuing major research activity and includes all CAE staff members. The following scenarios are the basis for the Spot Check List (SCL) that will be finalized for delivery to RMA early in March.

  37. Cost-Benefit Analysis: Data Mining Pays Off

  38. Cost-Benefit Analysis Examples • Data Mining in Texas, Similar to CAE’s, Identified Areas of Tax Underpayment • In FY 2000, the State of Texas Comptroller Collected an Additional $43 Million in Taxes From Areas of Underpayment Identified Through Data Mining

  39. Cost-Benefit Analysis • Texas Blue Cross-Blue Shield Developed a Medical Insurance Data Warehouse • In the First Three Months, Data Mining Identified Enough Medical Fraud to Pay for the Data Warehouse & Mining

  40. Conclusions • Data Mining Can Detect Patterns of Waste, Fraud, and Abuse • Millions of Taxpayer and Insurance Provider Dollars Can Be Saved Through Data Mining Using Forensic Analysis Techniques • This Research Provides USDA with Analysis Tools Previously Unavailable

  41. Conclusions • Crop Insurance Is Vulnerable to Multiple Methods of Fraud, Waste and Abuse • A Small Number of Agents, Adjusters and Producers Are Linked to Anomalous Behavior
