This presentation explores the challenges of managing the large volumes of data generated by global air traffic, using Heathrow Airport's cap on landings as a worked example: at 36 landings per hour, roughly 108 GB of raw engine data arrives every hour. It describes a distributed data architecture built around a single global Meta Data Catalogue (MCAT), with tools for efficient data retrieval and delivery across heterogeneous storage resources, aiming for a scalable and robust solution.
A Distributed Data Architecture • Mark Jessop, University of York
Grid Enabled Swans [map: London, Tokyo, Cape Town, Mexico City]
How Big is that Lake? • Heathrow is capped at 36 landings per hour. • If half the aircraft have 4 engines and half have 2, the average aircraft carries 3 engines. • Each engine generates around 1 GB of data per flight. • 36 x 3 x 1 = 108 GB of raw engine data per hour. • Factor in the working day and the rest of the world… • …Terabytes and up!
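The estimate on this slide can be reproduced with a few lines of Python. This is only a sketch of the slide's arithmetic: the function name and the 18-hour working day are assumptions made for illustration, not figures from the presentation.

```python
def engine_data_gb_per_hour(landings_per_hour, engine_mix, gb_per_engine_flight):
    """Raw engine data arriving per hour, in GB.

    engine_mix maps engines-per-aircraft to the share of landings with that count.
    """
    avg_engines = sum(engines * share for engines, share in engine_mix.items())
    return landings_per_hour * avg_engines * gb_per_engine_flight


# Figures from the slide: 36 landings/hour, half 4-engine and half 2-engine, 1 GB per engine.
HEATHROW_MIX = {4: 0.5, 2: 0.5}
hourly_gb = engine_data_gb_per_hour(36, HEATHROW_MIX, 1.0)

# ASSUMPTION: an 18-hour working day; the slide does not give a figure.
ASSUMED_WORKING_HOURS = 18
daily_gb = hourly_gb * ASSUMED_WORKING_HOURS

print(f"{hourly_gb:.0f} GB per hour")
print(f"{daily_gb / 1000:.2f} TB per assumed working day at one airport")
```

Under that assumed working day a single airport already approaches 2 TB per day, which is consistent with the slide's "Terabytes and up" once other airports are included.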
Managing the Flow of Water [map: London, Tokyo, Cape Town, Mexico City]
Plumbing Toolkit • Data Repository • Catalogue • Pattern Match Engine
Pattern Match Engine • Pattern Match Control • Data Extractor/Encoder • AURA Encoder • AURA-G • Back Check
Data Repository [diagram: DATA stores linked to MCAT nodes] • SDSC Storage Resource Broker (SRB). • Manages distributed storage resources. • Meta Data Catalogue (MCAT). • Many configurations. • Heterogeneous. • Efficient data delivery. • C++ and Java APIs.
A Distributed Architecture [diagram: MCAT] • One node per airport. • Single global MCAT. • Stream engine data. • Global Parallel Search. • Present Results. • Scalable. • Robust.
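The "one node per airport, global parallel search, present results" flow can be sketched as a fan-out and gather. The following is a minimal illustration, not the system described in the talk: the in-memory `NODES` dictionary stands in for per-airport repositories, the naive sub-sequence check stands in for the pattern match engine, and every name and data value is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# HYPOTHETICAL stand-in for per-airport data nodes: engine id -> recorded series.
NODES = {
    "London": {"engine-001": [1, 2, 3, 4], "engine-002": [5, 6, 7, 8]},
    "Tokyo": {"engine-003": [0, 2, 3, 9]},
    "Cape Town": {"engine-004": [7, 7, 7, 7]},
    "Mexico City": {"engine-005": [2, 3, 2, 3]},
}


def contains(series, pattern):
    """Naive check that pattern occurs as a contiguous run in series."""
    n = len(pattern)
    return any(series[i:i + n] == pattern for i in range(len(series) - n + 1))


def search_node(node_name, pattern):
    """Search a single airport node; the data stays where it was stored."""
    records = NODES[node_name]
    return [(node_name, engine_id)
            for engine_id, series in records.items()
            if contains(series, pattern)]


def global_parallel_search(pattern):
    """Send the query to every node at once, then merge the per-node hits."""
    with ThreadPoolExecutor(max_workers=len(NODES)) as pool:
        per_node = list(pool.map(lambda name: search_node(name, pattern), NODES))
    return sorted(hit for hits in per_node for hit in hits)


results = global_parallel_search([2, 3])
for node_name, engine_id in results:
    print(node_name, engine_id)
```

In this sketch only the query and the matching identifiers cross node boundaries, which is the property that lets the design scale as airports are added.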
Summary • Large quantities of data arriving globally. • Distributed architecture for data management and search. • Scalable and Robust.