Predictive modeling competitions: making data science a sport. Anthony Goldbloom, CEO, Kaggle. E-mail: anthony.goldbloom@kaggle.com. Twitter: @antgoldbloom. Photo by mikebaird, www.flickr.com/photos/mikebaird
Motivation • Why compete? • How it works • R on Kaggle • The Heritage Health Prize
Global competitions: Predicting HIV viral load. State of the art: 70%. After 1½ weeks: 70.8%. Competition closes: 77%.
Crowdsourcing: Mismatch between those with data and those with the skills to analyse it
Additional slides Not MIT, not SAS … UoL?
Tourism Forecasting Competition. Chart: forecast error (MASE) over time, showing the existing model and points at Aug 9, 2 weeks later, 1 month later, and competition end.
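MASE, the error measure named on the tourism forecasting slide, is the mean absolute scaled error: the forecast's mean absolute error divided by the in-sample mean absolute error of a naive forecast. The slides do not show how the competition computed it, so the following is only a minimal sketch of the standard definition. The function name `mase`, its parameters, and the sample numbers are illustrative assumptions, not taken from the competition.

```python
from typing import Sequence


def mase(train: Sequence[float], actual: Sequence[float],
         forecast: Sequence[float], season: int = 1) -> float:
    """Mean absolute scaled error (hypothetical helper, standard definition).

    train    -- historical values the model was fitted on
    actual   -- true values for the forecast horizon
    forecast -- predicted values for the same horizon
    season   -- lag of the naive benchmark (1 = "repeat the last value")
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    if len(train) <= season:
        raise ValueError("train must be longer than the seasonal lag")

    # In-sample error of the naive forecast: predict each point with the
    # value observed `season` steps earlier.
    naive_errors = [abs(train[i] - train[i - season])
                    for i in range(season, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("naive benchmark has zero error; MASE is undefined")

    # Out-of-sample mean absolute error of the submitted forecast.
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / scale


# Made-up numbers, for illustration only.
history = [10.0, 12.0, 14.0, 16.0]
truth = [18.0, 20.0]
prediction = [17.0, 22.0]
score = mase(history, truth, prediction)
print(score)
```

A value below 1 means the forecast beat the naive benchmark on average; a value above 1 means it did worse.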
Chess Ratings Competition. Chart: error rate (RMSE) over time, showing the existing model (Elo) and points at Aug 4, 1 month later, 2 months later, and today.
Users apply different techniques • neural networks • logistic regression • support vector machines • decision trees • ensemble methods • AdaBoost • Bayesian networks • genetic algorithms • random forests • Monte Carlo methods • principal component analysis • Kalman filters • evolutionary fuzzy modeling
Why Participants Compete • More fun than Sudoku • Clean, real-world data • Professional reputation & experience • Interactions with experts in related fields • Prizes
Competition Mechanics Competitions are judged on objective criteria
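Judging on objective criteria can be pictured as scoring every submission against held-back answers with a fixed error measure and ranking by that score. The slides do not describe the actual scoring code, so this is a hedged sketch only: it assumes RMSE (the measure named on the chess ratings slide) as the criterion, and the function names `rmse` and `rank_submissions`, the team names, and the numbers are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between true and predicted values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and equal in length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rank_submissions(answers: Sequence[float],
                     submissions: Dict[str, Sequence[float]]
                     ) -> List[Tuple[str, float]]:
    """Score each team's predictions and sort best (lowest error) first."""
    scores = {team: rmse(answers, preds) for team, preds in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Made-up answers and submissions, for illustration only.
hidden_answers = [1.0, 2.0, 3.0]
entries = {
    "team_a": [1.0, 2.0, 4.0],
    "team_b": [1.0, 2.0, 3.0],
}
leaderboard = rank_submissions(hidden_answers, entries)
for position, (team, score) in enumerate(leaderboard, start=1):
    print(position, team, round(score, 4))
```

Because the ranking depends only on the score, two entrants with the same predictions always receive the same position, whatever technique produced them.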
What could the world’s best analysts find in your data? E-mail: anthony.goldbloom@kaggle.com. Phone: +61438400053. Photo by gidzy, www.flickr.com/photos/gidzy