
Pushing the Frontiers of Analytics

Brenda Dietrich, IBM Fellow & VP CTO, Business Analytics.


Presentation Transcript


  1. Pushing the Frontiers of Analytics
     Brenda Dietrich, IBM Fellow & VP CTO, Business Analytics

  2. Global Technology Outlook Objectives
     GTO identifies significant technology trends early. It looks for high-impact disruptive technologies leading to game-changing products and services over a 3-10 year horizon. Technology thresholds identified in a GTO demonstrate their influence on clients, enterprises, and industries and have high potential to create new businesses.

  3. Global Technology Outlook 2012
     Uncertain data and analytics are major themes
     • Managing Uncertain Data at Scale
     • Systems of People
     • The Future Watson
     • Outcome Based Business
     • Future of Analytics
     • Resilient Business and Services

  4. Managing Uncertain Data at Scale
     Trend: Most of the world’s analyzed data will be uncertain
     • By 2015, 80% of the world’s data will be uncertain
     • Uncertain data management requires new techniques
     • These techniques are necessary for real-world Big Data Analytics
     Opportunity: Business leadership using Big Data Analytics
     • Robust, business-aware uncertain data management
     • Use analytics over uncertain web, sensor, and human-generated data
     • Enable good business decisions by understanding analysis confidence
     Challenge: Taking Big Data Analytics into an uncertain world
     • Analysis of text is highly nuanced; sensor-based data is imprecise
     • Timely business decisions require efficient large-scale analytics
     • It is more difficult to obtain insight about an individual than a group, especially if the source data is uncertain
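One way to see why, as slide 4 notes, insight about an individual is harder to obtain than insight about a group: if each reading carries independent noise, averaging over a group shrinks the noise by the square root of the group size. The sketch below is only an illustration under that independence assumption; the function name and the numbers are hypothetical and do not come from the presentation.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean of n independent readings,
    each with noise (standard deviation) sigma."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sigma / math.sqrt(n)


# Hypothetical numbers: every reading has noise sigma = 10 units.
individual = standard_error(10.0, 1)   # a single person, a single noisy reading
group = standard_error(10.0, 100)      # the average over 100 people

print(individual, group)  # prints: 10.0 1.0
```

With these assumed numbers the group average is ten times more certain than any single reading, which is the gap the slide points to.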

  5. The fourth dimension of Big Data: Veracity, handling data in doubt
     • Volume (Data at Rest): Terabytes to exabytes of existing data to process
     • Velocity (Data in Motion): Streaming data, milliseconds to seconds to respond
     • Variety (Data in Many Forms): Structured, unstructured, text, multimedia
     • Veracity* (Data in Doubt): Uncertainty due to data inconsistency & incompleteness, ambiguities, latency, deception, model approximations
     * Truthfulness, accuracy or precision, correctness

  6. Uncertainty arises from many sources
     • Process Uncertainty: Processes contain “randomness”
     • Data Uncertainty: Data input is uncertain
     • Model Uncertainty: All modeling is approximate
     Examples shown on the slide:
     • Text entry (actual spelling vs. intended spelling)
     • Fitting a curve to data
     • GPS uncertainty
     • Uncertain travel times
     • Testimony
     • Ambiguity: {Paris Airport}; {John Smith, Dallas} {John Smith, Kansas}
     • Semiconductor yield
     • Forecasting a hurricane (www.noaa.gov)
     • Contaminated?
     • Rumors
     • Conflicting data

  7. By 2015, 80% of all available data will be uncertain
     • By 2015 the number of networked devices will be double the entire global population. All sensor data has uncertainty.
     • The total number of social media accounts exceeds the entire global population. This data is highly uncertain in both its expression and content.
     • Data quality solutions exist for enterprise data like customer, product, and address data, but this is only a fraction of the total enterprise data.
     [Chart: Global Data Volume in Exabytes and Aggregate Uncertainty %, 2005 to 2015. Series: Sensors (Internet of Things), Social Media (video, audio and text), VoIP, Enterprise Data. Multiple sources: IDC, Cisco]

  8. Examples: Uncertainty management presents many opportunities
     System analytics predict maintenance
     • Downtime costs $M in income loss
     • Equipment maintenance needs unpredictable
     • Customer contracts impose penalties
     • 5% more oil platform production
     • 30% less maintenance cost
     • Improvements obtained using statistical modeling that combines equipment sensor data with performance history to predict corrective maintenance activities
     Creating profiles from many sources (360˚ customer view)
     • Many inconsistent data sources
     • Intent hidden within social media
     • Geospatial data is imprecise
     • 35% more satisfied customers by analyzing agent notes
     • 35% better churn prediction using customer SMS messages
     • Reduced time to determine lending risk from weeks to minutes
     Process and forecast uncertainty (Supply chain)
     • Modeling uncertainties: demand, sales, production, shipment
     • Shipping uncertainties: goods damaged, mistakes in shipped goods
     • 80% lower price protection costs
     • 30% less channel inventory
     • 50% fewer returns
     • Reductions obtained using inventory replenishment model that accounts for uncertain price protection
     More data from physician notes and tests (Healthcare)
     • Structured medical records are incomplete
     • “Golden” text notes must be interpreted: drug names, relationship types (mtr, sibs, m, paunt)
     • Uncertainty in images
     • Able to identify: 40% more smokers found; 15% more disease history
     • Mitral stenosis: 50% more diagnoses; 35% misdiagnoses
     Other labels on the slide: Smarter Planet, Auto, Energy, Telco, Research

  9. Condensing data reduces uncertainty by constructing context
     Maximum context for minimum uncertainty
     • Required: tight integration to maximize context discovery
     • Required: common practices followed by multiple standards for representing uncertain data and uncertainty of all types, provenance, lineage, and other metadata
     • Required: common APIs to enable sharing across the uncertainty management pipeline
     • No such common practices, standards or APIs exist today
     [Diagram: CONDENSE. Stages labeled Data finds Data, Fact Discovery, Spatial Reasoning, Temporal Reasoning, Sense Making, Correlation, Corroboration (Evidence Combination). Example labels: Loyalty; Credit; Michael, San Jose, CA; NY; Mother; Son; Date; Birthday; Influencers; Intent; “Buying a DSLR today!”; Customer at Mall; Customer in Store #42; In-Store Pricing and Discounts; $560; $999; etc.]
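Slide 9 names "Corroboration (Evidence Combination)" as a stage but does not say how evidence is combined. As a rough illustration only, the sketch below uses a noisy-OR rule, which is an assumption and not the method from the presentation: if several sources independently support the same fact, the combined confidence is one minus the probability that all of them are wrong. The function name and the confidence values are hypothetical.

```python
from typing import Iterable


def combine_independent(confidences: Iterable[float]) -> float:
    """Noisy-OR combination: probability that at least one of several
    independent sources is right about the same fact."""
    all_wrong = 1.0
    for c in confidences:
        if not 0.0 <= c <= 1.0:
            raise ValueError("each confidence must lie in [0, 1]")
        all_wrong *= 1.0 - c  # chance that this source is also wrong
    return 1.0 - all_wrong


# Hypothetical evidence that a customer intends to buy a camera today:
# a social media post (0.6) and a store-location signal (0.5).
combined = combine_independent([0.6, 0.5])
print(round(combined, 3))  # about 0.8, higher than either source alone
```

The rule only holds if the sources really are independent; two signals derived from the same underlying record would be counted twice, which is one reason the slide asks for provenance and lineage metadata.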

  10. Systems of People
      A shift in value from process optimization to people-centric processes
      • Organizations have extracted most of the efficiencies from traditional process automation
      • IT enablement opportunities are shifting to Line of Business
      • Social business drives new efficiencies and value from people-centric processes
      A new set of data is made possible by exploiting social business
      • An opportunity to instrument people-processes
      • Provides the basis for addressing a diverse set of problems
      A new IT market is emerging
      • Adaptive social platforms instrumented with knowledge capture, interconnected with enterprise data and processes, and made intelligent through differentiating analytics will transform business

  11. People-centric processes are at the core of a broad range of issues
      • Differentiate for Growth: Create winning products, fast, by having the best and most productive knowledge workers
      • Drive Sales Productivity: Create superior sales force, drive sales enablement and seller/client alignment
      • Grow in Emerging Markets: Re-create organizational footprint in global markets
      • Transform Service Delivery: Further grow productivity and enable new delivery models

  12. Optimizing people-centric processes is not the same as optimizing supply chains
      • Rich information (e.g. expertise, work patterns, response to incentives, digital reputation) is flowing through on-line collaboration and enterprise systems
      • Capturing this information enables analytics to be applied to people-centric processes
      • “Status updates alone on Facebook amount to more than ten times more words than on all blogs worldwide” - David Kirkpatrick, The Facebook Effect
      [Diagram: an example status update (“In the last couple of weeks, I’ve talked to ABC bank, XYZ and at a security conference.”) with the labels Status: Working, Expert: Security, Status: At conference. Sources shown: CRM, Claims, Delivery Records, Patents & Publications. Attributes shown: Influencer; Innovation; Products; Technical leadership; Clients served; Products sold; Sales patterns; Productivity; Work specs; Tasks accomplished; Engagements worked; Team info]

  13. Strength of Sales Force Index is an example of what is possible with a rich representation of people
      • SSFI mines sales force data to understand which attributes of a seller (e.g. skills, experiences), sales team (e.g. team composition, territories) or sales process (e.g. incentives, coverage model) are driving sales performance (quota attainment, win rates, productivity)
      • SSFI identifies reasons for performance disparities (at individual or group level), and the best set of actions to drive performance
      • Example questions: “Why is our sales force in Region X not performing at par with other regions or competition?” “What actions can we take to improve sales performance?” “What are the incentives that truly drive performance?”
      • TODAY: Years selling; Job change; Salary band; PBC
      • FUTURE: True skills and expertise; Disciplines; Clients served; Products sold; Team experiences; Connections; Incentives and responses; Career path; …

  14. Executing on SoP vision depends on three key capabilities
      • PEOPLE ENABLEMENT: Incorporate capabilities that adapt content for situations and needs, and enhance communication over many devices, across diverse pools of talent (context-aware cognitive load management, translation, transcription, text-to-speech, voice…)
      • PEOPLE CONTENT: Develop capabilities to create a representation of a person’s skills, experiences, preferences, digital reputation… in a structured and organized way, so it can be used for the purpose of running a business
      • PEOPLE ANALYTICS: Implement capabilities for people-centric process optimization within an analytics platform for rapid, on-demand deployment (matching, talent cloud, crowdsourcing, predictive markets, simulation of workforce trends, performance analytics, behavior modeling…)

  15. Future of Analytics
      Explosion of unstructured data
      • Creates new analytics opportunities
      • Addresses new enterprise needs
      Consistent, extensible, and consumable analytics platform
      • Reduces cost-to-value for enterprises
      • Increases analytics solution coverage with limited supply of skills
      Optimizing across the stack to deploy analytics at scale
      • Analytics becomes a dominant IT workload and drives HW design
      • Opportunity to seamlessly scale from terascale to exascale

  16. Analytics is broadly defined as the use of data and computation to make smart decisions
      • Data instances
      • Reports and queries on data aggregates
      • Predictive models
      • Answers and confidence
      • Feedback and learning
      [Diagram: Data (Historical; Simulated; Text; Video, Images; Audio), a Decision point, and Possible outcomes (Option 1, Option 2, Option 3)]

  17. The value of analytics grows by incorporating new sources of data, composing a variety of analytic techniques, spanning organizational silos, and enabling iterative, user-driven interaction
      [Chart: axes are “Sources and types of data” (from “Structured or standardized” to “New format or usage of data”) and “Scope of decision” (Low to High). Items plotted: Sales-based demand forecasting; Price-based demand forecasting (own & competitors); Segmentation-based market impact estimates; Intent-to-buy trends; Multi-modal demand forecasting]

  18. Analytics toolkits will be expanded to support ingestion and interpretation of unstructured data, and enable adaptation and learning
      Learn (in the context of the decision process)
      • Adaptive Analysis: Responding to context
      • Continual Analysis: Responding to local change/feedback
      Decide and Act
      • Optimization under Uncertainty: Quantifying or mitigating risk
      • Optimization: Decision complexity, solution speed
      Understand and Predict
      • Predictive Modeling: Causality, probabilistic, confidence levels
      • Simulation: High fidelity, games, data farming
      • Forecasting: Larger data sets, nonlinear regression
      Report
      • Alerts: Rules/triggers, context sensitive, complex events
      • Query/Drill Down: In memory data, fuzzy search, geo spatial
      • Ad hoc Reporting: Query by example, user defined reports
      • Standard Reporting: Real time, visualizations, user interaction
      Collect and Ingest/Interpret (decide what to count; enable accurate counting)
      • Entity Resolution: People, roles, locations, things
      • Relationship, Feature Extraction: Rules, semantic inferencing, matching
      • Annotation and Tokenization: Automated, crowd sourced
      Side labels on the slide: New Methods, Traditional, New Data
      Extended from: Competing on Analytics, Davenport and Harris, 2007

  19. Analytic solutions will apply multiple methods to multiple forms of data
      Example: Utility Vegetation Management
      • Effective Right of Way vegetation management is critical to streamlined utility operations
      • Traditional Right of Way programs are mainly static-scenario driven on a six-year cycle
      • Static and rigid models lead to predominantly reactive operations, which are expensive
      • Focus on narrow corridor widths fails to address severe weather impact
      • A multimodal analytics approach can overcome these shortcomings
      • Structured data (e.g. transmission line maps) and unstructured data (e.g. LIDAR sensor)
      • Advanced modeling to perform a dynamic scenario-driven analysis
      [Diagram: inputs SENSORS, UTILITY DATA, MAPS, WEATHER. Solution Framework components: Preprocessor, 3-Dimensional Model Recovery, Right-of-Way Dynamic Forecasting Model, Schedule Generator, Visualization. Sectors: ELECTRIC, TELECOMMUNICATIONS, RAIL, ROAD, OIL]

  20. Analytics solution development requires several interacting design steps
      • Steps and components shown (in slide order): Algorithm Composition and Invention; Data Evaluation and Fusion; Testing and Execution Optimization; Business Rules Engine; Core Analytics; Filtering and Extraction; Validation; Data Acquisition; Deployment; Composition and Packaging
      • Data types shown: Streaming data; Text data; Multi-dimensional; Time series; Geo spatial; Video & image; Relational; Social network
      • Methods shown: Data mining & statistics; Optimization & simulation; Semantic analysis; Fuzzy matching; Network algorithms; New algorithms

  21. An Analytics solution platform will increase enterprise value by supporting both the CxO solution and the CIO infrastructure
      CIO mandates
      • Leverage Mandate: Streamline operations and increase organizational effectiveness
      • Expand Mandate: Refine business processes and enhance collaboration
      • Transform Mandate: Change the industry value chain through improved relationships
      • Pioneer Mandate: Radically innovate products, markets, business models
      Easier consumption of Analytics solutions
      • Have consistent look and feel
      • Changes are easier to implement effectively
      • Trustworthy solutions are produced
      More efficient, less complex development
      • Reduces growth of development costs
      • Speeds delivery of new functionality
      • Expands analytics solution developer population
      Reduces client cost of operation
      • Seamless integration eases deployment of solutions
      • Establishes preferred development path for new solution
      • Consistent and coherent infrastructure eases managing solutions
      The CIO can reduce cost and add value to the use of analytics by supporting collaboration and data/analysis sharing
      [Chart: Revenue vs. Lines of code, with platform and without platform]

  22. Optimizing across the stack will enable the deployment of analytics at scale
      Systems supporting future analytics will be more data centric, composable and scalable
      • Systems will support increasingly complex data sets and workflows.
      • Different elements within these complex workflows will require different capabilities within systems.
      • Balanced, reliable, power efficient systems, with integrated software that scales seamlessly
      • Integrated analytics, modeling and simulation capabilities to address generation, management and analysis of Big Data for Business Advantage
      [Diagram labels: Predictive Analytics; Modeling, Simulation; Text Analytics; Hadoop Workloads; Optimization; Sensitivity Analysis. Future System: General Purpose, Integrated Network, Integrated Processing, Integrated Storage. Repeated building blocks: SCM, Cores, Network, Storage]

  23. The Future Watson
      Themes: Extend Watson technology; Lead in new domains; Enable efficient adaptation
      • Moves beyond “question-in & answer-out” to always “learning” evidence-based decision support
      • Addresses the enterprise need to convert growing volumes of information into actionable knowledge
      • Demonstrates business value in critical problem spaces, starting with Healthcare
      • Efficiently adapting and scaling Watson to new domains requires a novel blend of engineering and research

  24. Watson’s real value proposition: Efficient decision support over unstructured (and structured) content
      • Unstructured Data: Broad, rich in context; Rapidly growing, current; Invaluable yet under utilized
      • Structured Data: Precise, explicit; Narrow, expensive
      [Diagram: three regions labeled “Shallow Understanding, Low Precision, Broad Coverage”, “Deeper Understanding but Brittle, High Precision at High Cost, Narrow Limited Coverage”, and “Deeper Understanding, Higher Precision and Broader, Timely Coverage at lower costs”. Approaches shown: Key Word Search; Relevance Ranking; SQL/XQuery; Inference/Rules; Existing BI; Open-Domain Question-Answering; Jeopardy! Challenge]

  25. Taking Watson beyond Jeopardy!
      • Understanding: from Specific Questions to Rich Problem Scenarios. From specific questions to rich, incomplete problem scenarios (e.g. EHR)
      • Interacting: from Question-In/Answer-Out to Interactive Dialog. Evidence analysis and look-ahead, drive interactive dialog to refine answers and evidence
      • Explaining: from Precise Answers & Accurate Confidences to Comparative Evidence Profiles. Move from quality answers to quality answers and evidence
      • Learning: from Batch Training Process to Continuous Training & Learning Process. Scale domain learning and adaptation rate and efficiency
      Example question: The type of murmur associated with this condition is harsh, systolic, and increases in intensity with Valsalva
      [Diagram labels: Input, Responses; Answers, Corrections, Judgements; Entire Medical Record; Dialog; Refined Answers, Follow-up Questions; Responses, Learning Questions; Teach Watson]
