TOWARDS A BIG DATA COMMUNITY CHALLENGE

Tilmann Rabl, Florian Stegmaier, Michael Granitzer and Hans-Arno Jacobsen. 3rd Workshop on Big Data Benchmarking, July 16-17, Xi'an, China.

Presentation Transcript


  1. TOWARDS A BIG DATA COMMUNITY CHALLENGE
     Tilmann Rabl, Florian Stegmaier, Michael Granitzer and Hans-Arno Jacobsen
     3rd Workshop on Big Data Benchmarking, July 16-17, Xi'an, China

  2. BIG DATA – WHY COMMUNITY CHALLENGES MATTER
     • Big Data is a major buzzword in the scientific world
     • Conferences, workshops, tutorials, panels
     • Component benchmarks, end-to-end systems, etc.
     • Variety leads to incomparability of results
     • Research communities run challenges to
       • … enable comparability of results
       • … foster evolution of a research field
     “Kites rise highest against the wind, not with it.” (W. Churchill)

  3. WHAT SHOULD BE IN THE FOCUS? DATA! HOW SHOULD IT BE? INTERESTING!
     “[...] other communities, like information retrieval, natural language processing, or Web research, have a much richer and agile culture in creating, disseminating, and re-using interesting new data resources for scientific experimentation [...]” – G. Weikum, SIGMOD Blog

  4. HOW ARE “THE OTHERS” DOING?
     • Information retrieval community:
       • TREC, TRECVid (task-based, measurable scientific impact)
       • CLEF Initiative (task-based, benchmarking initiatives)
     • Multimedia community:
       • Multimedia Grand Challenge (tasks defined by “global players”, e.g., Yahoo! and Microsoft)
       • Open Source Software Comp. (foster community activities)
     • Semantic Web guys:
       • Linked Data Cup (data generation)
       • Semantic Web in-Use (mashup creation)

  5. SUCCESSFUL COMMUNITY CHALLENGES: TAKE-HOME MESSAGE
     • Challenges are not a single event
     • Ongoing process, running through different stages:
       • Data generation
       • Solving restricted, high-impact issues
       • Fostering open source frameworks
       • Assembling mashups
     • Accepted by the community

  6. BRAINSTORMING AREA: STRUCTURE OF THE CHALLENGE
     • Challenge needs to be focused on specific tasks:
       • Tasks assemble a “Big Data pipeline”
       • Specified by academia and industry
     • Hybrid approach to engage participants:
       • Utilize benchmark activities
       • Computing tasks on “Open Data”

  7. TIME TO BREAK OUT!
     • Discussions should focus on:
       • Where to find large-scale, interesting “open” data sets?
       • Which tasks could form a sophisticated Big Data pipeline ensuring a broad range of implementations?
     • BREAKOUT HOW-TO:
       • Breakout and student groups as yesterday
       • Prepare one slide for each question
