Prescriptive Analytics Part I

Prescriptive Analytics Part I. Nick Gonzalez, 2/10/14. “It is change, continuing change, inevitable change, that is the dominant factor in society today. No sensible decision can be made any longer without taking into account not only the world as it is, but the world as it will be.”

Presentation Transcript


  1. Prescriptive Analytics Part I • Nick Gonzalez, 2/10/14

  2. “It is change, continuing change, inevitable change, that is the dominant factor in society today. No sensible decision can be made any longer without taking into account not only the world as it is, but the world as it will be.” -Isaac Asimov

  3. Topics Covered • Reference automated prescriptive analytics system • Automated algorithm selection • Distributed algorithm development

  4. Covered in future presentations • Ontology creation and extraction • Representing solutions using ontologies • Business optimization • everything else…

  5. Today’s Data Landscape

  6. Tomorrow’s Data Landscape

  7. Data is outpacing us • Humans cannot keep up • Computers can, but…

  8. Prescriptive Analytics • Scalable • Automated understanding • Automated predictive analytics • Actionable • Closed loop

  9. Example: Video Games • [Diagram spanning user space and analytics space. Components: game server, metrics, rules, game simulations, learning process, predictive models. Arrow labels: write, modify, deploy, copy to production, generate, start understanding, build / update models]

  10. Problems • Scale • Speed • Adaptability

  11. Automated Learning

  12. “I do not fear computers. I fear the lack of them.” - Isaac Asimov

  13. Goals • Remove the human element from analysis phases • Generate accurate, actionable, predictive models • Combine predictive models and simulation to solve problems

  14. Guiding Principle • Big data with simple algorithms will outperform sampled data with complex algorithms.

  15. How is this possible? • Focus on a single problem • Limit scope • The goal must be measurable and actionable

  16. Process • [Diagram; labels as extracted: Data, Data Engineering & Understanding, Actionable Deployment, Prep, Modeling, Simulation]

  17. 1. Automated Understanding • Find the data representation best suited to the problem you are trying to solve.

  18. Automated Understanding • [Diagram; labels as extracted: Raw Data, Clean Data, Stats, meta, Initial Transform]

  19. Automated Understanding • [Diagram; labels as extracted: Clean Data, Stats, meta, Representation A (A.1, A.2, …), Representation B, Representation C, …]
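
The slides above give only diagram labels for automated understanding, so the following is a minimal Python sketch of one way the idea could work, not the author's system. It builds a few candidate representations of a single numeric column, records summary stats for each, and keeps the one that scores best. The candidate transforms, the `describe` helper, and the scoring rule (absolute Pearson correlation with the target) are all assumptions chosen for illustration.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either side is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


# Hypothetical candidate representations of one non-negative numeric column.
REPRESENTATIONS = {
    "raw": lambda x: x,
    "log": math.log1p,
    "sqrt": math.sqrt,
}


def describe(values):
    """Summary stats kept alongside each representation (assumed fields)."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "stdev": statistics.stdev(values),
    }


def best_representation(xs, ys):
    """Score every candidate against the target and return the best name."""
    results = {}
    for name, transform in REPRESENTATIONS.items():
        transformed = [transform(x) for x in xs]
        results[name] = {
            "stats": describe(transformed),
            "score": abs(pearson(transformed, ys)),  # assumed scoring rule
        }
    best = max(results, key=lambda n: results[n]["score"])
    return best, results


# Synthetic data in which the target is linear in log1p(x).
xs = [float(i) for i in range(1, 201)]
ys = [2.0 * math.log1p(x) + 0.5 for x in xs]
best, results = best_representation(xs, ys)
print(best, {name: round(r["score"], 3) for name, r in results.items()})
```

In practice the score would come from the measurable goal the deck asks for, such as the error of a downstream model, and not from a simple correlation.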

  20. 2. Automated Algorithm Selection • Find the algorithm that performs best on the problem you are trying to solve while meeting all criteria.

  21. Automated Algorithm Selection • Choose algorithms best suited for this type of problem. • Consider the data, types, sparsity, size, and desired outcome • Try multiple algorithms • Calculate the root mean squared error (RMSE) or some other appropriate measure. • Consider problem domain. • Use cross validation. • Do not just compare the average RMSE • Choose the algorithm(s) that perform the best
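
The selection steps on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the presenter's implementation: the two candidate algorithms (a mean predictor and a one-variable least-squares line), the synthetic data, and the tie-breaking rule are all hypothetical. Each candidate is scored with k-fold cross validation, and the report keeps the spread and the worst fold next to the average RMSE so that the average is not compared alone.

```python
import math
import random
import statistics


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Candidate: ordinary least squares on a single feature."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def cross_validate(fit, xs, ys, k=5, seed=0):
    """Return one RMSE per fold for a k-fold split of the data."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        predictions = [model(xs[i]) for i in held_out]
        scores.append(rmse([ys[i] for i in held_out], predictions))
    return scores


def select_algorithm(candidates, xs, ys, k=5):
    """Score every candidate; rank by mean RMSE, then by spread (assumed rule)."""
    report = {}
    for name, fit in candidates.items():
        scores = cross_validate(fit, xs, ys, k)
        report[name] = {
            "mean": statistics.fmean(scores),
            "stdev": statistics.stdev(scores),
            "worst": max(scores),
        }
    best = min(report, key=lambda n: (report[n]["mean"], report[n]["stdev"]))
    return best, report


# Synthetic data: a noisy straight line.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [3.0 * x + 1.0 + rng.gauss(0, 0.5) for x in xs]
best, report = select_algorithm({"mean": fit_mean, "line": fit_line}, xs, ys)
print(best, report[best])
```

A candidate with a slightly lower average but a much larger worst-fold error may be the riskier choice, which is one reading of the slide's warning about comparing averages only.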

  22. Distributed Processing • Learning to Scale

  23. Approaching the Problem • Two ways to approach a problem • Bottom up • Top down

  24. Bottom Up Approach • [Layer diagram; layers as listed: Programmer; Design Patterns, Algorithms; C++, Java; C, Pascal; Assembly Language; Hardware]

  25. Top Down • [Layer diagram; layers as listed: Problem Solver; Problem Representation; Distributed System Abstractions; Functional Languages; Hardware]

  26. Building Distributed Algorithms • Identify the simplest concepts that describe data processing • Collections • Collection processing • [Layer diagram repeated from slide 25: Problem Solver; Problem Representation; Distributed System Abstractions; Functional Languages; Hardware]

  27. Evolution of thought • [Diagram; labels as extracted: Collection, Collection Processing, Data, Algorithm, Single “Box”, No “Box”]

  28. Coming together • [Diagram; labels grouped by kind] • Platforms: Single PC, Hadoop, MPI, … • Algorithms: k-means, density, random forest, gradient boost, … • Collection operations: map, mapcat, reduce, filter, sort, group

  29. Distributed Processing Interface • Simple concept • Focus on building algorithms • Many ways to implement this concept • Works with both shared memory systems and distributed memory systems
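
To make the interface idea concrete, here is a minimal Python sketch of a collection that exposes the six operations named on slide 28 (map, mapcat, reduce, filter, sort, group). The deck's own implementation is in Clojure, so the class name `LocalCollection`, the method signatures, and the word-count example are assumptions for illustration. This version is a single-machine, in-memory backend only.

```python
from collections import defaultdict
from functools import reduce as fold
from itertools import chain


class LocalCollection:
    """Hypothetical in-memory backend for the collection interface."""

    def __init__(self, items):
        self._items = list(items)

    def map(self, fn):
        return LocalCollection(fn(x) for x in self._items)

    def mapcat(self, fn):
        # fn returns a sequence per item; the results are concatenated.
        return LocalCollection(chain.from_iterable(fn(x) for x in self._items))

    def filter(self, pred):
        return LocalCollection(x for x in self._items if pred(x))

    def sort(self, key=None):
        return LocalCollection(sorted(self._items, key=key))

    def group(self, key_fn):
        # Produces a collection of (key, [items]) pairs.
        groups = defaultdict(list)
        for x in self._items:
            groups[key_fn(x)].append(x)
        return LocalCollection(groups.items())

    def reduce(self, fn, initial):
        return fold(fn, self._items, initial)

    def to_list(self):
        return list(self._items)


def word_count(lines):
    """An algorithm written only against the interface, not the backend."""
    pairs = (
        lines.mapcat(str.split)
        .map(str.lower)
        .group(lambda word: word)
        .map(lambda kv: (kv[0], len(kv[1])))
        .to_list()
    )
    return dict(pairs)


counts = word_count(LocalCollection(["a b", "b c B"]))
print(counts)
```

Because `word_count` touches only the interface methods, a backend built on shared memory or on a cluster could be swapped in without changing the algorithm, which is the point the slide makes.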

  30. Implementation • Functional language - Clojure • Reusable functions as callbacks • Hadoop drivers written on top of Cascalog • Data location and type are abstracted as “collection”
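
The author's implementation uses Clojure with Hadoop drivers on top of Cascalog. The sketch below is written in Python instead and only illustrates the last two bullets: reusable functions passed as callbacks, and data location hidden behind a "collection". Both classes and the thread pool that stands in for a cluster are hypothetical. The partitioned `reduce` assumes the callback is associative and that `initial` is its identity value.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


class ListCollection:
    """All data in one list on one machine."""

    def __init__(self, items):
        self.items = list(items)

    def map(self, fn):
        return ListCollection(map(fn, self.items))

    def reduce(self, fn, initial):
        return reduce(fn, self.items, initial)


class PartitionedCollection:
    """Data split into chunks; worker threads stand in for cluster nodes."""

    def __init__(self, partitions):
        self.partitions = [list(p) for p in partitions]

    def map(self, fn):
        with ThreadPoolExecutor() as pool:
            mapped = pool.map(lambda part: [fn(x) for x in part], self.partitions)
            return PartitionedCollection(mapped)

    def reduce(self, fn, initial):
        # Reduce each partition on its own, then combine the partial results.
        with ThreadPoolExecutor() as pool:
            partials = list(pool.map(lambda part: reduce(fn, part, initial), self.partitions))
        return reduce(fn, partials, initial)


def add_pairs(a, b):
    """Reusable callback: element-wise sum of (total, count) pairs."""
    return (a[0] + b[0], a[1] + b[1])


def mean(collection):
    """Works on any object that offers map and reduce."""
    total, count = collection.map(lambda x: (x, 1)).reduce(add_pairs, (0.0, 0))
    return total / count


local_mean = mean(ListCollection([1, 2, 3, 4]))
partitioned_mean = mean(PartitionedCollection([[1, 2], [3], [4]]))
print(local_mean, partitioned_mean)
```

The same `mean` function and the same `add_pairs` callback run unchanged against both backends; only the collection object knows where the data lives.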

  31. “Part of the inhumanity of the computer is that once it is completely programmed and working smoothly, it is completely honest.” - Isaac Asimov
