
Big Data and Predictive Analytics in Government November 6, 2013

Big Data and Predictive Analytics in Government November 6, 2013. James G. Sheehan Executive Deputy Commissioner, New York City Human Services Administration David Fitz, CPA, PMP, CGFM Partner KPMG LLP. Government Initiatives in Big Data Data and Analytics Overview


Presentation Transcript


  1. Big Data and Predictive Analytics in Government, November 6, 2013 • James G. Sheehan, Executive Deputy Commissioner, New York City Human Services Administration • David Fitz, CPA, PMP, CGFM, Partner, KPMG LLP

  2. Agenda • Government Initiatives in Big Data • Data and Analytics Overview • Value Proposition and Opportunities • Current State • Success Factors

  3. Government Initiative • “Can Anyone Catch New York?” 4/2/2013 Saul Sherry in Big Data Republic • Richness of City data available • Index crimes down 80% • 95 percent success rate in tracking down dumpers of cooking oil • 2x improvement in identification of illegal cigarette retailers • Use of inspection algorithm to identify and inspect firetraps • Boston and Chicago are trying

  4. Government Initiatives-IRS • IRS-Compliance Data Warehouse-1996 • Estimate US tax gap • Predict identity theft, fraud, other non-compliance • Optimize workload for enforcement • IRS lessons: • Interdisciplinary teams • Data quality focus-for quality and for trust • Governance to avoid policies and procedures that stifle change

  5. Government Initiatives • Patriot Act-NSA, FBI-2002 • $1.5 billion NSA digital data center-Bluffdale, Utah • FinCEN-“data reasonably necessary to identify illicit finance” (e.g., Bitcoin) • Part D Medicare-2006-insurance design and drug benefit information; 51 gigabytes of data (but most usable data left with plans)

  6. Government Initiatives • MA. May 2012 “Big Data Initiative” • “A Big Data Road map for Government”-NY Times 10/2012 • “Demystifying Big Data: A Practical Guide to Transforming the Business of Government”, TechAmerica Foundation, www.techamericafoundation.org/bigdata

  7. Government Initiatives • March 2012-Obama Administration announces $250 million • DOD, “Big Data Research and Development Initiative”-Office of Science and Technology Policy • National Science Foundation/National Institutes of Health solicitation • $10 million to UC/Berkeley • “Xdata,” “Earthcube”

  8. Government Initiatives • White House Press Release 2013 • NSF Request for information • Big Data Fact Sheet • “Data to Knowledge to Action” program event by NITRD on Tuesday, November 12 www.nitrd.gov available by webinar-no agenda yet posted

  9. What is “Big” Data? • A collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications • What is considered "big data" varies depending on the capabilities of the organization managing the set, and on the capabilities of the applications that are traditionally used to process and analyze the data set in its domain • The “V’s”: Volume, Variety, Velocity, Veracity • Source: Search on Big Data, www.wikipedia.org

  10. What is “Big” Data? • 15 of our 17 industry sectors in the United States have more data stored per company than the U.S. Library of Congress, which itself collected 235 terabytes of data in April 2011. • Wal-Mart Stores Inc. handles more than 1 million customer transactions every hour, feeding databases estimated at more than 2.5 petabytes – the equivalent of 167 times the books in the Library of Congress. • 30 billion pieces of content are shared on Facebook – monthly. • Source: “Big Data + Big Analytics = Big Opportunity”, by Jeanne Johnson, KPMG LLP; Financial Executive, July/August 2012.

  11. Data Analytics • Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making (includes hidden patterns, unknown correlations) • Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science, and social science domains • Data mining, computed matching, extending data • Source: Search on Data Analytics, www.wikipedia.org

  12. Future of Social Services Programs-Growth • Huge growth in eligible populations (about half of the USA population, with up to $90,000 annual income for a family of four, qualifies for Obamacare subsidies or Medicaid; over 60 million SNAP recipients) • Growth in cash equivalents for recipients and providers • Growth in means-tested eligibility programs • Earned Income tax credits-federal, state • Child care subsidies and vouchers based on income • Expanded higher education grants and direct government loans • Student loan reduced payment/forgiveness based on income • Expanded Social Security disability, SSI • Housing vouchers and income based eligibility, homeless services support

  13. Future of Social Services Programs-Growth (continued) • Automated applications • Limited or no face-to-face interaction with front line staff • Loss of practical expertise and local knowledge of front-line staff in assessing applicants • No original documents (or front-line staff copy of original documents presented) • Internet access for all-anonymous or public access devices • Limited preservation of electronic communications • Potential for capture of data about clients, client conduct, and interactions

  14. What risks can we predict for automated enrollment in social services programs? • Anonymity/reduced identification breeds fraud risk • What people will do in the dark (psych research) • Earned income tax credit (GAO reports-about 24% fraud rate) • Driver behavior vs. pedestrian behavior • FEMA applications after Katrina • Use of SNAP benefits at Walmart after SNAP meltdown by Xerox

  15. Data Mining Predictive Analysis Issues • Sensitivity-how likely is the test to identify improper claims? • Specificity-how likely is the test to identify only improper claims (that is, to leave proper claims unflagged)? • Problem-every predictive analysis will generate more work than there are people available to do it, and those people will ignore results from any analysis with many false positives • Rules-based vs. learning systems
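The sensitivity, specificity, and false-positive points on slide 15 can be made concrete with a small calculation. The following is a minimal sketch, not part of the presentation: the function name `screening_metrics` and all claim counts are hypothetical, chosen only to show how a screen with high specificity can still produce mostly false positives when improper claims are rare.

```python
def screening_metrics(true_pos, false_pos, true_neg, false_neg):
    """Return (sensitivity, specificity, precision) for a claims screen.

    Assumes every denominator is non-zero (hypothetical helper, illustration only).
    """
    # Sensitivity: share of improper claims that the screen flags.
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of proper claims that the screen leaves unflagged.
    specificity = true_neg / (true_neg + false_pos)
    # Precision: share of flagged claims that are actually improper.
    precision = true_pos / (true_pos + false_pos)
    return sensitivity, specificity, precision


# Hypothetical numbers: 10,000 claims, of which 200 are improper.
sens, spec, prec = screening_metrics(
    true_pos=160, false_pos=490, true_neg=9310, false_neg=40
)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} precision={prec:.2f}")
```

With these assumed counts the screen catches 80 percent of improper claims and clears 95 percent of proper ones, yet only about a quarter of the 650 flagged claims are actually improper, which is the reviewer-workload problem the slide describes.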

  16. What We Learn From Banks And Credit Cards-Data Mining needs to drive organization adaptation? • Identity, status, credentialing verification (new accounts) • Transaction tests ($ thresholds, patterns, locations) • Front end identity questions • Prompt telephone and IM contact on fraud risk identification • Transaction verification • Scripted interviews and answers (e.g., “there has been a security breach on your account.”) • Close and replace account promptly • Someone is watching • Reduced reliance on prosecution
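The "transaction tests" bullet on slide 16 describes a rules-based screen. A minimal sketch of that idea follows, assuming a hypothetical `Transaction` record, a hypothetical `flag_transaction` helper, and an arbitrary $500 threshold; none of these come from the presentation or from any real bank's system.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # Hypothetical record layout, for illustration only.
    account_id: str
    amount: float
    location: str


def flag_transaction(txn, usual_locations, amount_threshold=500.0):
    """Return the list of rule names the transaction trips (empty if none)."""
    reasons = []
    # Dollar-threshold test.
    if txn.amount > amount_threshold:
        reasons.append("amount over threshold")
    # Location test: the account has not been used in this place before.
    if txn.location not in usual_locations:
        reasons.append("unusual location")
    return reasons


suspicious = Transaction(account_id="A1", amount=750.0, location="Utah")
print(flag_transaction(suspicious, usual_locations={"New York"}))
```

A non-empty result would trigger the follow-up steps listed on the slide, such as prompt contact and transaction verification. A learning system would replace the fixed rules with a model fitted to past outcomes.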

  17. The Affordable Care Act • $350 million over 10 years to bolster anti-fraud efforts, including predictive modeling programs • Provides funding for the Health Care Fraud and Abuse Control (HCFAC) Program account, the Medicare Integrity Program, and the Medicaid Integrity Program • Strengthen cooperative efforts across the Federal government and with the private sector • Increased data sharing between Federal entities to monitor and assess high risk program areas and better identify potential sources of fraud • Expansion of the Integrated Data Repository (IDR), which is currently populated with years of historical Part A, Part B and Part D paid claims, to include near-real-time pre-payment stage claims data and indicators of aberrant activity throughout the claims processing cycle (e.g., time claim was submitted and modified) • State data set will be harmonized with Medicare claims data in the IDR to detect potential fraud, waste and abuse across multiple payers

  18. CMS Predictive Modeling • 2011-CMS contract with Northrop Grumman and IBM to lead teams to develop a predictive modeling system (Northrop Grumman) and models (Northrop Grumman and IBM) to identify high-risk claims • Northrop Grumman working with National Government Services (NGS) and Federal Network Systems (Verizon); IBM team includes Health Integrity • 4-year task order • How is the Northrop Grumman/ IBM project going? • See December 2012 CMS report at www.stopmedicare.gov/fraud-rtc12142012pdf

  19. Improper Payment Elimination and Recovery Act (IPERA), July 2010 • Defines “improper payment”: • Payments that should not have been made, or payments made in an incorrect amount (including overpayments and underpayments) • Payment to an ineligible recipient • Payment for an ineligible service • Any duplicate payment • Payment for services not received • Payments for an incorrect amount

  20. The Small Business Jobs Act of 2010 • Requires the Centers for Medicare & Medicaid Services (CMS) to “adopt predictive modeling and other analytics technologies to identify improper claims for reimbursement and to prevent the payment of such claims under the Medicare fee-for-service program.” • Two-year predictive modeling contest for hospital admissions. WSJ 3/16/11

  21. Small Business Jobs Act of 2010 • CMS Responsibilities • Contract with private companies to conduct predictive modeling and other analytics to identify and prevent improper payment of claims submitted under Parts A and B of Medicare • Identify 10 states that have the highest risk of waste, fraud and abuse in the Medicare program; for one year, use predictive modeling and other analytics technologies to stop fraudulent claims in these states • CMS to start using predictive analytics technologies on July 1, 2011 • After the initial year HHS OIG was required to report to Congress on actual savings to the Medicare FFS for the prior year, projected future savings from the use of these technologies, and the return on investments as a result of the predictive analytics technologies. • CMS was to expand the use of predictive analytics technologies on October 1, 2012, to apply to 10 more States as having the highest risk of waste, fraud, or abuse in the Medicare fee-for-service program

  22. Small Business Jobs Act of 2010 Data Mining Requirements-How Did CMS Do? • OIG Report • Fraud Control Report • CMS Integrity Strategy?

  23. Small Business Jobs Act of 2010 Data Mining Requirements-How Did CMS Do? HHS OIG report-A-17-12-53000 (September 2012) “The Department of Health and Human Services has implemented predictive analytics technologies but can improve its reporting on related savings and return on investment” oig.hhs.gov/oas/region1/171253000.pdf

  24. Small Business Jobs Act of 2010 Data Mining Requirements-How Did CMS Do? • “In its first report, the Department could not present actual savings with respect to improper payments recovered.” • “We could not determine whether the $68.2 million in projected savings from law enforcement referrals was an accurate projection of savings. This amount represents the total value of claims identified during the investigation of leads.” • “Because the Department used actual and projected savings to calculate returns on investment, it should have included actual and projected costs to ensure that all costs were included in the return on investment calculation.”
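The OIG's last point on slide 24 is that a return-on-investment figure built from actual plus projected savings should also count actual plus projected costs. A minimal sketch of that arithmetic follows. The function name, the net-return formula (savings minus costs, divided by costs), and every dollar figure are assumptions for illustration; they are not taken from the OIG report or the Department's calculation.

```python
def return_on_investment(actual_savings, projected_savings,
                         actual_costs, projected_costs=0.0):
    """Net return per dollar of cost (an assumed ROI definition)."""
    total_savings = actual_savings + projected_savings
    total_costs = actual_costs + projected_costs
    return (total_savings - total_costs) / total_costs


# Hypothetical figures, in millions of dollars.
matched = return_on_investment(30.0, 70.0, actual_costs=25.0, projected_costs=15.0)
mismatched = return_on_investment(30.0, 70.0, actual_costs=25.0)  # projected costs omitted
print(matched, mismatched)
```

Under these assumed figures, leaving out projected costs doubles the reported return (3.0 versus 1.5), which illustrates why mixing projected savings with actual-only costs overstates the result.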

  25. Computer World Study • 94% of IT projects in last ten years with budgets of over $10 million (in government and out) launched with major problems or simply failed. • “Companies should take small steps, via pilots and skunkworks, and invest in the ones that work.” MIT Sloan study 2013

  26. Facial Recognition Technology and More Data • Numerous public and private entities are incorporating FRT into their operations, as part of the larger biometric technology boom. • Systems that consummate online transactions only when the identity of the parties has been verified via webcam. • Commercial and government buildings with restricted access identify authorized persons by some biometric characteristic, with facial scanning expected to become more prevalent (think of how often your picture is taken for temporary building IDs) • Tagged photos for enhanced background checks on job applicants

  27. What Can We Do With All This Information? • The Supreme Court has held that the creation and dissemination of information are speech within the meaning of the First Amendment. See, e.g., Bartnicki, at 527, 121 S.Ct. 1753 (“[I]f the acts of ‘disclosing’ and ‘publishing’ information do not constitute speech, it is hard to imagine what does fall within that category, as distinct from the category of expressive conduct”) • Prescriber-identifying information is speech for First Amendment purposes. Sorrell v. IMS Health Inc., 131 S. Ct. 2653 - Supreme Court 2011

  28. Predictable Crises in Big Data • Privacy • Contracts-proprietary information • Snooping/unauthorized use • Need for lawyers to analyze and agree on disclosures and data sharing • Compliance with affiliation and authorized use agreements • We’ve got it-what do we do with it? • Disclosure events-do you know where your copier memory went?

  29. Current State • Just 31 percent say their agency has an adequate big data strategy*

  30. Current State • Source: “Transforming Internal Audit: A Maturity Model from Data Analytics to Continuous Assurance”, by Jim Littley, KPMG LLP; 2013.

  31. Success Factors • Additional “V”s – The Analytics • Viability – Understanding and testing the usefulness of the data variables, new variables, validating the hypothesis • Value – Confirm viability – add value/extend variables • Visualization – Making the data usable through maps, graphs, charts. Know the audience and potential. • Source: “The Missing V’s in Big Data: Viability and Value”, by Neil Biehn, PROS; Wired.com, Innovation Insights, May 6, 2013.

  32. Success Factors • Define the value • Tone at the top and senior leadership active involvement • Data strategy – include organizational design • Improved analytic capabilities of staff • Robust governance - Internal and external • Risk management - Data security and privacy • Change management and communication strategy

  33. Predictable Crises in Big Data • Cost • Technology change-Rosetta Stone problems • 4 Vs- • Accuracy/Reliability
