Data Science 101 - PowerPoint PPT Presentation

maren
Presentation Transcript

  1. Data Science 101 A Love Story

  2. Agenda • Introduction to Data Science • Who’s who in Data Science? • That Data Science Life. • [Case Study] How Spotify manages their data. • [VM] The Data Science life at VaynerMedia. • Conclusions.

  3. “If you can measure it, you can hack it.” E -> A -> E

  4. We’re generating (and tracking) exponentially more data online than ever before.

  5. Big Data is big.

  6. 5,000,000,000 GB/2 Days

  7. We’re always playing catch-up.

  8. “Innovative Solutions” > “Industry Standards”

  9. Data Scientists are “Innovative Problem Solvers”

  10. I get it. “Big Data” is real, and Data Scientists are awesome.

  11. But what is a Data Scientist? Who are they, and how do they work with “Big Data”?

  12. VM

  13. DJ Patil is a huge influencer in this space.

  14. Why is DJ Patil so popular?

  15. LinkedIn and People You May Know

  16. Angel has 2 mutual friends with Vikash. Tim has 20 mutual friends with Vikash. If John is friends with Vikash, he might know Tim and his mutual friends.
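
The mutual-friend idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LinkedIn's actual algorithm: the toy graph, the `friend_N` names, and the helper functions `build_graph` and `suggest_connections` are all hypothetical. It ranks people a user is not yet connected to by how many friends they share.

```python
from collections import Counter


def build_graph(edges):
    """Build an undirected friendship graph as {person: set of friends}."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def suggest_connections(graph, user):
    """Rank people who are not yet friends with `user` by mutual-friend count."""
    friends = graph.get(user, set())
    counts = Counter()
    for friend in friends:
        for candidate in graph.get(friend, set()):
            # Skip the user themselves and anyone already connected.
            if candidate != user and candidate not in friends:
                counts[candidate] += 1
    return counts.most_common()


# Hypothetical toy data mirroring the slide: Tim shares 20 friends
# with Vikash, Angel shares only 2.
shared = [f"friend_{i}" for i in range(20)]
edges = [("Vikash", f) for f in shared]
edges += [("Tim", f) for f in shared]
edges += [("Angel", f) for f in shared[:2]]

graph = build_graph(edges)
suggestions = suggest_connections(graph, "Vikash")
print(suggestions)  # [('Tim', 20), ('Angel', 2)]
```

Tim ranks first because he shares far more friends with Vikash, which is the kind of signal the slide describes.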

  17. This increased platform usage, making the experience on LinkedIn more valuable.

  18. Active Users = selling point for LinkedIn when pitching to Brands.

  19. Leg up to users looking for employment in the informal job market.

  20. Big Data. Real Business objective. Simple Analysis. Valuable Data-driven Product.

  21. “Patil Effect”

  22. VM analysts do the same thing; we just don’t use the same tools.

  23. 10^100

  24. Google started downloading the entire internet in the late 90s and early 00s.

  25. “It’s not you, it’s me.” - Google

  26. Google created a better way to process Big Data. They created MapReduce.
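
To make the MapReduce idea concrete, here is a minimal single-machine sketch of its three stages (map, shuffle, reduce) applied to word counting. It is a teaching sketch only, not Google's implementation: the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and a real system would spread each stage across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the list of values for one key into a total."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data is big", "data science"])
print(counts)  # {'big': 2, 'data': 2, 'is': 1, 'science': 1}
```

Because each document is mapped independently and each key is reduced independently, both stages can run in parallel, which is what makes the approach suitable for very large datasets.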

  27. Yahoo! wanted to download the internet too.

  28. They liked MapReduce so much that they created Hadoop.

  29. Hadoop is an open-source framework for distributed storage and processing. It pairs a distributed file system (HDFS) with an implementation of MapReduce.

  30. Developed by the folks over at Facebook.

  31. Hive is a data “warehouse” tool built to query Hadoop systems.
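
Hive queries are written in a SQL-like language, so the flavor of such a query can be shown with ordinary SQL. The sketch below uses Python's built-in `sqlite3` module as a stand-in, since running Hive itself requires a Hadoop cluster. The `plays` table, its columns, and the sample rows are hypothetical and only illustrate the kind of aggregate question an analyst might ask.

```python
import sqlite3

# In-memory SQLite database standing in for a Hive table (assumption:
# the table name, columns, and rows are invented for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plays (user_id TEXT, track TEXT)")
conn.executemany(
    "INSERT INTO plays VALUES (?, ?)",
    [
        ("u1", "song_a"),
        ("u2", "song_a"),
        ("u1", "song_b"),
        ("u3", "song_a"),
        ("u2", "song_c"),
        ("u3", "song_b"),
    ],
)

# Count plays per track, most played first.
query = """
SELECT track, COUNT(*) AS play_count
FROM plays
GROUP BY track
ORDER BY play_count DESC, track
"""
top_tracks = conn.execute(query).fetchall()
print(top_tracks)  # [('song_a', 3), ('song_b', 2), ('song_c', 1)]
conn.close()
```

On a Hive deployment, a query of this shape would be turned into distributed jobs over data stored in Hadoop rather than run against a local file.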

  32. Querying this data also allows us to work on our data retrieval skills.

  33. Less time cleaning data. Less time “fishing”. Fewer spreadsheets. BOOM.

  34. Amazon Web Services makes computing data in the cloud easy and cheap.

  35. No need for huge data centers on site.

  36. Pay for what you use.

  37. Makes it easy to move data around in the cloud.

  38. How does a company actually use all of these cool tools?

  39. [Diagram components] Spotify Client • AWS EMR (Hadoop) • Ad Hoc MapReduce Jobs • Hive (data warehouse infrastructure; SQL-like syntax) • PostgreSQL

  40. How does all of this fit in to VaynerMedia?

  41. VM