
Chapter 1: Overview


teleri

Presentation Transcript


  1. Chapter 1: Overview

  2. Chapter 1: Overview

  3. Objectives • Define analytics and data mining. • Explain the proliferation of data and how this impacts the need for good analytics. • Identify some of the key challenges of data mining. • Name some applications where analytics are helpful. • Name some applications where analytics are not helpful. • Explain some of the common pitfalls of analytical practice.

  4. Analytics • “The extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions.” • Davenport and Harris (2007), Competing on Analytics: The New Science of Winning

  5. Achieving Success with Analytics (figure: levels of analytics plotted against competitive advantage, grouped as Reporting, Basic Analytics, and Advanced Analytics) • Basic Reporting: What happened? • Ad Hoc Reporting: How many, how often, where? • Dynamic Reporting: Where exactly are the problems? • Reporting with Early Warning: What actions are needed? • Basic Statistical Analysis: Why is this happening? • Forecasting: What if these trends continue? • Predictive Modeling: What will happen next? • Decision Optimization: What is the best decision? • Other labels in the figure: Competitive Advantage, Data, Intelligence, Information, Decision Support, Decision Guidance

  6. Data Deluge • hospital patient registries • electronic point-of-sale data • Web comments • remote sensing images • tax returns • stock trades • OLTP • telephone calls • airline reservations • credit card charges • catalog orders • bank transactions

  7. Three Consequences of the Data Deluge • Every problem will generate data eventually. • Every company will need analytics eventually. • Everyone will need analytics eventually.

  8. Three Consequences of the Data Deluge • Every problem will generate data eventually. Proactively defining a data collection protocol will result in more useful information, leading to more useful analytics.

  9. Three Consequences of the Data Deluge • Every company will need analytics eventually. Proactively analytical companies will compete more effectively.

  10. Three Consequences of the Data Deluge • Everyone will need analytics eventually. Proactively analytical people will be more marketable and more successful in their work.

  11. The Business Analytics Challenge • Getting anything useful out of tons and tons of data

  12. Hope for the Data Deluge • Data (hospital patient registries, Web comments, electronic point-of-sale data, remote sensing images, tax returns, stock trades, OLTP, telephone calls, airline reservations, credit card charges, catalog orders, bank transactions) + analytical tools = actionable knowledge

  13. Changes in the Analytical Landscape: Historically… • Historically, analytics have typically been handled in the “back office,” and information was shared only by a few individuals. • Figure labels: Management, Models, Analytical Modelers

  14. Changes in the Analytical Landscape • Historical Changes • Executive Dashboarding – Static reports on business processes • Total Quality Management (TQM) – Customer focused • Six Sigma – Voice of the process, Voice of the customer • Customer Relationship Management (CRM) – The right offer to the right person at the right time • Forecasting and Predicting – 360-degree customer view

  15. Changes in the Analytical Landscape • Relational Databases • Enterprise Resource Planning (ERP) Systems • Point of Sale (POS) Systems • Data Warehousing • Decision Support Systems • Reporting and Ad Hoc Queries • Online Analytical Processing (OLAP) • Performance Management Systems • Executive Information Systems (EIS) • Balanced Scorecard • Dashboard • Business Intelligence

  16. CRM Evolution • Total Quality Management (TQM) • Product Centric • Quality: Six Sigma • Total Customer Satisfaction • Mass Marketing • One-to-One Marketing • Customer Relationship • Wallet Share of Customer • Customer Retention • Customer Relationship Management (CRM) • Customer Centric • Strategy • Process • Technology

  17. Changes in the Analytical Landscape: Now… • Now analytics are being pushed out to the “front office” and are directly impacting company performance. • There are clear, tangible benefits that management will track. • Data mining is a critical part of business analytics. • Figure labels: Operations, Customer Service, Retail, Logistics, Promotions, Target, Customers, Analytical Modelers, Proliferation of Models, Suppliers, Employees, Stockholders

  18. Idiosyncrasies of Business Analytics • 1. The Data • Massive, operational, and opportunistic • 2. The Users and Sponsors • Business decision support • 3. The Methodology • Computer-intensive adhockery • Multidisciplinary lineage

  19. The Data: Experimental vs. Opportunistic • Purpose: Research vs. Operational • Value: Scientific vs. Commercial • Generation: Actively controlled vs. Passively observed • Size: Small vs. Massive • Hygiene: Clean vs. Dirty • State: Static vs. Dynamic

  20. The Data: Disparate Business Units • Marketing • Invoicing • Risk • Acquisitions • Sales • Operations

  21. Opportunistic Data • Operational data is typically not collected with data analysis in mind. • Multiple business units produce a silo-based data system. • This makes business analytics different from experimental statistics and especially challenging.

  22. The Methodology: What We Learned Not to Do • Prediction is more important than inference. • Metrics are used “because they work,” not based on theory. • p-values are rough guides rather than firm decision cutoffs. • Interpretation of a model might be irrelevant. • The preliminary value of a model is determined by its ability to predict a holdout sample. • Long-term value of a model is determined by its ability to continue to perform well on new data over time. • Models are retired as customer behavior shifts, market trends emerge, and so on.
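The holdout idea on slide 22 can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not from the presentation: the data is synthetic, the "model" is a single usage cutoff, and the function names (`holdout_split`, `fit_threshold`, `accuracy`) are hypothetical. The point is only that the model is fit on one part of the data and judged on a part it never saw.

```python
import random

def holdout_split(rows, holdout_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and split it into (training, holdout)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]

def accuracy(rows, cut):
    """Fraction of rows where the rule 'usage >= cut' matches the churn label."""
    return sum((usage >= cut) == churned for usage, churned in rows) / len(rows)

def fit_threshold(train):
    """Choose the usage cutoff with the highest accuracy on the training rows."""
    candidates = sorted({usage for usage, _ in train})
    return max(candidates, key=lambda cut: accuracy(train, cut))

# Synthetic (usage, churned) pairs; the relationship is invented for the demo.
rng = random.Random(42)
data = []
for _ in range(200):
    usage = rng.uniform(0, 100)
    churned = usage + rng.gauss(0, 15) > 60
    data.append((usage, churned))

train, holdout = holdout_split(data)
cut = fit_threshold(train)
train_acc = accuracy(train, cut)
holdout_acc = accuracy(holdout, cut)
print(f"cutoff={cut:.1f} train accuracy={train_acc:.2f} holdout accuracy={holdout_acc:.2f}")
```

The holdout accuracy, not the training accuracy, is the preliminary measure of value described on the slide. Tracking the same figure on fresh data over time corresponds to the long-term value bullet.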

  23. Using Analytics Intelligently • Intelligent use of analytics results in the following: • Better understanding of how technological, economic, and marketplace shifts affect business performance • Ability to consistently and reliably distinguish between effective and ineffective interventions • Efficient use of assets, reduced waste in supplies, and better management of time and resources • Risk-reduction via measurable outcomes and reproducible findings • Early detection of market trends hidden in massive data • Continuous improvement in decision making over time

  24. Simple Reporting • Examples: OLAP, RFM, QC, descriptive statistics, extrapolation • Answer questions such as • Where are my targets now? • Where were my targets last week? • Is the current process behaving like normal? • What’s likely to happen tomorrow?
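One simple-reporting question on slide 24, "Is the current process behaving like normal?", can be sketched as a basic control-limit check. This is a simplified illustration under stated assumptions, not a method from the presentation: the numbers are invented, the limits are the baseline mean plus or minus three sample standard deviations, and `control_limits` and `out_of_control` are hypothetical helper names. Production QC charts often estimate the spread differently.

```python
from statistics import mean, stdev

def control_limits(baseline, n_sigma=3.0):
    """Return (lower, upper) limits from a baseline period of 'normal' values."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - n_sigma * spread, center + n_sigma * spread

def out_of_control(values, limits):
    """Return (index, value) pairs that fall outside the limits."""
    lower, upper = limits
    return [(i, v) for i, v in enumerate(values) if not lower <= v <= upper]

# Invented daily measurements: a baseline period, then the week under review.
baseline = [100, 102, 98, 101, 99, 100, 103, 97, 100, 100]
this_week = [101, 99, 104, 120, 98]

limits = control_limits(baseline)
flagged = out_of_control(this_week, limits)
print(f"limits=({limits[0]:.2f}, {limits[1]:.2f}) flagged={flagged}")
```

A report like this describes what happened and whether it looks unusual. It does not explain why, which is the job of the proactive investigation on the next slide.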

  25. Proactive Analytical Investigation • Examples: inferential statistics, experimentation, empirical validation, forecasting, optimization • Answer questions such as • What does a change in the market mean for my targets? • What do other factors tell me about what I can expect from my target? • What is the best combination of factors to give me the most efficient use of resources and maximum profitability? • What is the highest price the market will tolerate? • What will happen in six months if I do nothing? What if I implement an alternative strategy?

  26. Data Stalemate • Many companies have data that they do not use or that is used by third parties. These third parties might even resell the data and any derived metrics back to the original company! • Example: retail grocery POS card

  27. Every Little Bit… • Taking an analytical approach to only a few key business problems with reliable metrics → tangible benefit. • The benefits and savings derived from early analytical successes → managerial support for further analytical efforts. • Everyone has data. • Analytics can connect data to smart decisions. • Proactively analytical companies outpace competition.

  28. Areas Where Analytics Are Often Used • New customer acquisition • Customer loyalty • Cross-sell / Up-sell • Pricing tolerance • Supply optimization • Staffing optimization • Financial forecasting • Product placement • Churn • Insurance rate setting • Fraud detection • … Example (new customer acquisition): Which residents in a ZIP code should receive a coupon in the mail for a new store location?

  29. Areas Where Analytics Are Often Used • Example (customer loyalty): What advertising strategy best elicits positive sentiment toward the brand?

  30. Areas Where Analytics Are Often Used • Example (cross-sell / up-sell): What is the best next product for this customer?

  31. Areas Where Analytics Are Often Used • Example (pricing tolerance): What is the highest price that the market will bear without substantial loss of demand?

  32. Areas Where Analytics Are Often Used • Example (supply optimization): How many 60-inch HDTVs should be in stock? (Too many is expensive; too few is lost revenue.)

  33. Areas Where Analytics Are Often Used • Example (staffing optimization): What are the best times and best days to have technical experts on the showroom floor?

  34. Areas Where Analytics Are Often Used • Example (financial forecasting): What revenue increase can be expected after the Mother’s Day sale?

  35. Areas Where Analytics Are Often Used • Example (product placement): Will oatmeal sell better near granola bars or near baby food?

  36. Areas Where Analytics Are Often Used • Example (churn): Which customers are most likely to switch to a different wireless provider in the next six months?

  37. Areas Where Analytics Are Often Used • Example (insurance rate setting): How likely is it that this individual will have a claim?

  38. Areas Where Analytics Are Often Used • Example (fraud detection): How can I identify a fraudulent purchase?

  39. When Analytics Are Not Helpful • Snap decisions required • Novel approach (no previous data possible) • Most salient factors are rare (making decisions to work around unlikely obstacles or miracles) • Expert analysis suggests a particular path • Metrics are inappropriate • Naïve implementation of analytics • Confirming what you already know Example (snap decisions required): Deciding when to run from danger

  40. When Analytics Are Not Helpful • Example (novel approach): Predicting the adoption of a new technology

  41. When Analytics Are Not Helpful • Example (most salient factors are rare): Planning contingencies for employees winning the lottery

  42. When Analytics Are Not Helpful • Example (expert analysis suggests a particular path): The seasoned art critic can recognize a fake

  43. When Analytics Are Not Helpful • Example (metrics are inappropriate): Predicting athletes’ salaries or quantifying love

  44. When Analytics Are Not Helpful • Example (naïve implementation of analytics): Only looking at one variable at a time

  45. When Analytics Are Not Helpful • Example (confirming what you already know): Ignoring variables that might be important

  46. Naïve Analytics • Many companies implementing analytical programs such as Six Sigma demonstrate tremendous success. • However, it is important to use analytics in a meaningful way. • For example: • It might not be possible to establish a Six Sigma process with low production volume. • Producer-centric metrics might not give useful information about customer satisfaction, and the Six Sigma process might still fail to meet customer specifications. • Simplistic reporting on massive data might hide complex patterns and is generally unsuccessful.

  47. The Fallacy of Univariate Thinking • What is the most important cause of churn? • (Figure: Prob(churn) plotted against Daytime Usage and International Usage)
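A small numeric sketch shows how looking at one variable at a time can mislead. The counts below are hypothetical, chosen to make the point, and are not taken from the presentation's figure. `marginal_rate` is an illustrative helper name. Each variable on its own shows no relationship with churn, while the two together show a strong pattern.

```python
from collections import defaultdict

# Hypothetical counts: (daytime usage, international usage) -> (customers, churners)
counts = {
    ("low", "low"): (100, 10),
    ("low", "high"): (100, 40),
    ("high", "low"): (100, 40),
    ("high", "high"): (100, 10),
}

def marginal_rate(counts, axis):
    """Churn rate by level of one variable, ignoring the other variable."""
    totals = defaultdict(lambda: [0, 0])
    for key, (customers, churners) in counts.items():
        totals[key[axis]][0] += customers
        totals[key[axis]][1] += churners
    return {level: churners / customers
            for level, (customers, churners) in totals.items()}

by_daytime = marginal_rate(counts, 0)
by_international = marginal_rate(counts, 1)
joint = {key: churners / customers
         for key, (customers, churners) in counts.items()}

print("By daytime usage:      ", by_daytime)
print("By international usage:", by_international)
print("Jointly:               ", joint)
```

With these counts, both one-variable summaries report a flat 25% churn rate, so neither variable looks like a cause. The joint table shows churn at 40% for customers who are high on exactly one kind of usage and 10% otherwise, a pattern that only appears when both variables are considered together.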

  48. Expectations Leading the Analysis • Even sophisticated analytics are not immune to personal bias such as • selectively fitting models with variables because they place someone’s opinion or agenda in a positive light • ignoring information that might disprove a hypothesis. • Personal bias in model fitting, whether intentional or otherwise, can diminish the usefulness of your analytical efforts.

  49. Trustworthy Analytics • Let the data guide your conclusions. • Ask the following questions: • Are my assumptions about the causes of my data patterns warranted? • Should I be trying something different? • Assign a cynic to the analytical team whose purpose is to question the assumptions. • What would my critic say is the flaw with my analysis? • Investigate the data in such a way that a critic’s concerns can be ruled out.

  50. Chapter 1: Overview
