The intersection of big data and human cognition presents unique challenges for data scientists. While big data is abundant, the true bottleneck lies not in computational power but in our ability to interpret and understand the insights derived from this data. This discussion covers the importance of a people-centric approach to data interpretation, the trade-offs between accuracy and interpretability, and the impact of speed and interaction on user experience. Join the conversation on enhancing the interplay between big data and human intellect for deeper insights.
BIG DATA, We have a communication problem. GINORMOUS SYSTEMS, April 30–May 1, 2013, Washington, D.C. Daniel Tunkelang, Head of Query Understanding, LinkedIn
DATA SCIENTISTS WORRY ABOUT VOLUME, VELOCITY, VARIETY, …
BUT THE BOTTLENECK ISN’T COMPUTATIONAL. IT’S COGNITIVE.
BIG DATA IS A TOOL. TOOLS AUGMENT HUMAN INTELLECT. Doug Engelbart, inventor of the mouse, hypertext, etc.
NOT EVERYONE SUBSCRIBES TO THIS POINT OF VIEW… Claudia Perlich, Chief Scientist of media6degrees, speaking at TTI/Vanguard 2012 Conference on Understanding Understanding:
BUT PREDICTIVE MODELING IS NOT ENOUGH
TRAINING DATA? OBJECTIVE FUNCTION?
WE NEED A PEOPLE-CENTRIC APPROACH TO BIG DATA: INTERPRETABILITY, INTERACTION, INSIGHT
LET’S START WITH INTERPRETABILITY
EXAMPLE: SVM vs. DECISION TREE
DECISION TREES HAVE FLAWS… DISCRETE
BUT THEY COMMUNICATE (if they’re shallow) • Early splits provide the big picture, or reveal training data problems. • Fat leaves guide feature engineering.
WHICH SUPPORTS ITERATION
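To make the point about shallow trees concrete, here is a minimal sketch of a depth-limited decision tree printed as readable rules. It is an illustration only, not the models from the talk: the toy spam data, the feature names `links` and `exclamations`, and the helpers `build`, `to_rules`, and `predict` are all hypothetical, and the splitter is a plain Gini-impurity search written with the standard library.

```python
from collections import Counter

# Hypothetical toy data: (links, exclamations) -> label. Not from the talk.
ROWS = [(0, 0), (1, 1), (1, 4), (2, 0), (5, 1), (6, 3), (7, 0), (5, 5), (2, 5)]
LABELS = ["ham", "ham", "ham", "ham", "spam", "spam", "spam", "spam", "spam"]
NAMES = ["links", "exclamations"]

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between adjacent distinct values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score - 1e-12:  # keep only strict improvements
                best, best_score = (f, t), score
    return best

def build(rows, labels, max_depth, depth=0):
    """Grow a tree; leaves are Counters of training labels."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels)
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build([r for r, _ in left], [y for _, y in left], max_depth, depth + 1),
            build([r for r, _ in right], [y for _, y in right], max_depth, depth + 1))

def to_rules(node, names, indent=0):
    """Render the tree as nested if/else lines a person can read."""
    pad = "    " * indent
    if isinstance(node, Counter):
        label = node.most_common(1)[0][0]
        return [f"{pad}predict {label}  (training counts: {dict(node)})"]
    f, t, left, right = node
    return ([f"{pad}if {names[f]} <= {t}:"] + to_rules(left, names, indent + 1)
            + [f"{pad}else:"] + to_rules(right, names, indent + 1))

def predict(node, row):
    """Follow the splits down to a leaf and return its majority label."""
    while not isinstance(node, Counter):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node.most_common(1)[0][0]

tree = build(ROWS, LABELS, max_depth=2)
rules = to_rules(tree, NAMES)
print("\n".join(rules))
```

The first printed rule is the "big picture" split, and the label counts at each leaf show where a fat, mixed leaf might call for a new feature. Raising `max_depth` may improve fit on real data, but the printed rules quickly stop being something a person can read in one pass.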
INTERPRETABILITY DELIVERS • Key search leader favors rule-based approach for key scoring algorithms. • Replaced regression with decision tree in local search model: gained accuracy and insight. • Using trees to recognize spam, analyze search abandonment, model / quantify social proof.
GO DEEP vs. INTERPRETABILITY: A KEY DATA SCIENCE TRADE-OFF
ON TO INTERACTION
BE FAST, CHEAP, AND 98% RIGHT http://metamarkets.com/2012/fast-cheap-and-98-right-cardinality-estimation-for-big-data/
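As a rough illustration of trading a little accuracy for speed and memory, here is a minimal HyperLogLog-style distinct counter. It is a sketch under stated assumptions, not the implementation described in the linked post: the function name `hll_estimate`, the SHA-1 based 64-bit hash, and the default of 4096 registers are choices made for this example.

```python
import hashlib
import math

def hll_estimate(items, p=12):
    """Estimate the number of distinct items using 2**p small registers.

    Assumption for this sketch: each item is hashed via SHA-1 of str(item),
    and only the first 64 bits of the digest are used.
    """
    m = 1 << p
    registers = [0] * m
    for item in items:
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")      # 64-bit hash value
        idx = h >> (64 - p)                        # top p bits pick a register
        rest = h & ((1 << (64 - p)) - 1)           # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - p) - rest.bit_length() + 1
        if rank > registers[idx]:
            registers[idx] = rank

    alpha = 0.7213 / (1 + 1.079 / m)               # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)

    # Small-range correction: fall back to linear counting on empty registers.
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        return m * math.log(m / zeros)
    return raw

# Duplicates do not change the estimate: only the distinct values matter.
stream = [f"user-{i}" for i in range(50_000)] * 3
print(round(hll_estimate(stream)))
```

With 4096 registers the typical relative error is about 1.04 / sqrt(4096), or roughly 1.6%, while memory stays fixed no matter how many items stream through. An exact count would instead need a set that grows with the number of distinct values.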
ARE PEOPLE THAT IMPATIENT? [Chart: tolerable wait time for web users] A 0.1s increase in latency significantly reduces the number of searches and ad revenue. tl;dr: YES
IMPATIENCE IS GOOD. SPEED MATTERS.
SOLVE FOR INTERESTINGNESS: Sept. 11th, Abu Ghraib, Weapons Inspectors
COMPUTE POTENTIAL INSIGHTS. APPLY HUMAN INTUITION.
SUMMARY: Let’s have a conversation with Big Data. INTERPRETABILITY, INTERACTION, INSIGHT