Warding off Evil of Big Data

Warding off Evil of Big Data. Jun Yang. Minority Report, 2002. 39% of the experts agree…

Presentation Transcript


  1. Warding off Evil of Big Data • Jun Yang

  2. Minority Report, 2002

  3. 39% of the experts agree… • Thanks to many changes, including the building of “the Internet of Things,” human and machine analysis of Big Data will cause more problems than it solves by 2020. The existence of huge data sets for analysis will engender false confidence in our predictive powers and will lead many to make significant and hurtful mistakes. Moreover, analysis of Big Data will be misused by powerful people and institutions with selfish agendas who manipulate findings to make the case for what they want. And the advent of Big Data has a harmful impact because it serves the majority (at times inaccurately) while diminishing the minority and ignoring important outliers. Overall, the rise of Big Data is a big negative for society in nearly all respects. • Source: 2012 Pew Research Center Report, http://pewinternet.org/Reports/2012/Future-of-Big-Data/Overview.aspx

  4. Warding off evil—how? • “Democratize” data • Push transparency • Also make datasets easier to discover, share, cleanse, integrate… But that’s not enough… • Democratize data analysis

  5. Make analysis easier • Say you are developing a big-data analytics platform for social scientists • Don’t force them to code in SQL or Java • Don’t force them to tune execution plans, fiddle with configuration parameters, or pick clusters • Provide just two knobs: time & money • Focus on user’s experience & cost • E.g., Google “Duke Cumulon”

  6. Make analysis understandable • Say you want to expose “lies, d---ed lies, and statistics” in politics, ads, and news • What datasets are useful in checking a claim? • Can you convert a vague claim to a query? • It may be “correct,” but is it “cherry-picking”? • How do you convince your audience? • There are in fact plenty of “core” database problems • E.g., Google “Duke database computational journalism”

  7. So what’s “big” about big data? • Yeah, there are the big volume, big velocity, big variety, and big variability • But ultimately it’s about big value • Not just to big companies and governments • But to us all • Ward off evil by democratizing data & analysis!
