This presentation traces big data from the astronomical measurements of Tycho Brahe, and the laws Johannes Kepler derived from them, to the Large Hadron Collider (LHC), whose computing grid handles about 15 PB of data per year. It then introduces a large-scale analytics stack built on Dryad and DryadLINQ for distributed execution and cluster programming, with machine learning applications on top. Examples from fields ranging from genetics to environmental studies illustrate what large data sets make possible.
Crunching Big Data. Mihai Budiu, Microsoft Research, Silicon Valley. September 27, 2011. big DATA
500 YEARS AGO Tycho Brahe (1546 – 1601); Johannes Kepler (1571 – 1630)
The laws of PLANETARY MOTION Tycho’s measurements; Kepler’s laws
THE LARGE HADRON COLLIDER LHC Grid: 200K computing cores; 15 PB/year
genetic CODE [Figure: organism groups plotted on a logarithmic scale from 10^6 to 10^11] Mycoplasma; Gram-positive bacteria; Gram-negative bacteria; fungi/moulds; algae; worms; crustaceans; echinoderms; insects; mollusks; birds; bony fish; cartilaginous fish; reptiles; mammals; amphibians; flowering plants
some TRUE STORIES
Large-Scale Analytics Engine Distributed Execution: Dryad
Cluster Programming: DryadLINQ; Distributed Execution: Dryad; Distributed Storage: TidyFS
LEARN FROM DATA Application: Decision forests; Library: Machine learning; Programming: DryadLINQ; Distributed Execution: Dryad. Training examples → Machine learning → Classifier
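The flow on this slide (training examples in, classifier out) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: it is not DryadLINQ code and not the library used in the talk. Each "tree" is reduced to a one-split decision stump, each stump is trained on its own data partition to mimic independent work on cluster nodes, and every name (`train_stump`, `train_forest`, `classify`, the toy partitions) is hypothetical.

```python
from collections import Counter


def train_stump(examples):
    """Fit a one-split decision tree (a stump) on (features, label) pairs."""
    best = None  # (errors, feature index, threshold, left label, right label)
    n_features = len(examples[0][0])
    for f in range(n_features):
        for threshold in sorted({x[f] for x, _ in examples}):
            left = [y for x, y in examples if x[f] <= threshold]
            right = [y for x, y in examples if x[f] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = (
                Counter(right).most_common(1)[0][0] if right else left_label
            )
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    _, f, threshold, left_label, right_label = best
    return lambda x: left_label if x[f] <= threshold else right_label


def train_forest(partitions):
    """Train one stump per partition; each call is independent of the others."""
    return [train_stump(p) for p in partitions]


def classify(forest, x):
    """Majority vote over all stumps in the forest."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


# Toy training examples, already split into three partitions (made-up data).
partitions = [
    [((1.0, 5.0), "a"), ((2.0, 4.0), "a"), ((8.0, 5.0), "b"), ((9.0, 4.0), "b")],
    [((0.5, 1.0), "a"), ((7.5, 1.0), "b"), ((3.0, 2.0), "a"), ((9.5, 2.0), "b")],
    [((2.5, 3.0), "a"), ((6.5, 3.0), "b"), ((1.5, 3.0), "a")],
]

forest = train_forest(partitions)
print(classify(forest, (1.0, 0.0)))
print(classify(forest, (9.0, 0.0)))
```

Because each stump depends only on its own partition, the per-partition training step is the part a distributed execution layer could run in parallel; only the final vote needs results from all partitions.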