Managing Global Data Flow: A Distributed Approach

This presentation explores the challenges of managing the data generated by global air traffic, using Heathrow Airport as a case study. With landings capped at 36 per hour, Heathrow alone yields an estimated 108 GB of raw engine data per hour, and worldwide volumes reach terabytes and beyond. The presentation describes a distributed data architecture built around a single global Meta Data Catalogue (MCAT), with one storage node per airport and a parallel search across all nodes. The approach is intended to manage heterogeneous storage resources in a scalable and robust way.

Presentation Transcript


  1. A Distributed Data Architecture Mark Jessop University of York

  2. Swans

  3. Grid Enabled Swans (diagram labels: London, Tokyo, Cape Town, Mexico City)

  4. How Big is that Lake? • Heathrow is capped at 36 landings per hour. • If half of the aircraft have 4 engines and half have 2, the average aircraft carries 3 engines. • Each engine generates around 1 GB of data per flight. • 36 × 3 × 1 = 108 GB of raw engine data per hour. • Factor in the working day and the rest of the world… • …Terabytes and up!
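
The arithmetic on this slide can be reproduced in a few lines. This is only an illustrative sketch: the function and variable names are assumptions made for the example, and the inputs are the round figures quoted on the slide.

```python
def raw_engine_data_gb_per_hour(landings_per_hour, engines_per_aircraft, gb_per_engine_flight):
    """Estimate the raw engine data arriving at one airport per hour, in GB."""
    return landings_per_hour * engines_per_aircraft * gb_per_engine_flight


# Half the aircraft have 4 engines and half have 2, so the mean is 3.
average_engines = 0.5 * 4 + 0.5 * 2

# Heathrow: 36 landings per hour, about 1 GB per engine per flight.
heathrow_gb_per_hour = raw_engine_data_gb_per_hour(36, average_engines, 1)
print(f"{heathrow_gb_per_hour:.0f} GB per hour")
```

Changing any of the three inputs shows how quickly the total grows once more airports or longer operating periods are taken into account.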

  5. Managing the Flow of Water (diagram labels: London, Tokyo, Cape Town, Mexico City)

  6. Plumbing Toolkit • Data Repository • Catalogue • Pattern Match Engine

  7. Pattern Match Engine • Pattern Match Control • Data Extractor/Encoder • AURA Encoder • AURA-G • Back Check

  8. Data Repository • SDSC Storage Resource Broker (SRB). • Manages distributed storage resources. • Meta Data Catalogue (MCAT). • Many configurations. • Heterogeneous. • Efficient data delivery. • C++ and Java APIs.
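
The role of a metadata catalogue can be illustrated with a small in-memory sketch: logical names are mapped to the storage resource that holds the data, and records can be found by their metadata. This is not the SRB or MCAT API; every class, method, path, and attribute name below is hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Replica:
    node: str           # storage resource holding the data (hypothetical name)
    physical_path: str  # location of the data on that resource


@dataclass
class Catalogue:
    """Toy stand-in for a metadata catalogue; not the real MCAT."""
    _entries: dict = field(default_factory=dict)

    def register(self, logical_name, replica, **metadata):
        # Record where the data lives and what is known about it.
        self._entries[logical_name] = (replica, dict(metadata))

    def resolve(self, logical_name):
        # Map a location-independent name to its physical location.
        return self._entries[logical_name][0]

    def query(self, **criteria):
        # Return the logical names whose metadata match every criterion.
        return sorted(
            name
            for name, (_, meta) in self._entries.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        )


catalogue = Catalogue()
catalogue.register("flight-0001/engine-1", Replica("London", "/store/0001-1.dat"),
                   airport="London", engine=1)
catalogue.register("flight-0002/engine-1", Replica("Tokyo", "/store/0002-1.dat"),
                   airport="Tokyo", engine=1)

print(catalogue.resolve("flight-0001/engine-1").node)
print(catalogue.query(airport="Tokyo"))
```

The point of the indirection is that a client only needs the logical name or a metadata query; where the bytes are actually stored can differ from resource to resource.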

  9. A Distributed Architecture • One node per airport. • Single global MCAT. • Stream engine data. • Global Parallel Search. • Present Results. • Scalable. • Robust.
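
A global parallel search over one node per airport might be sketched as follows. This is a minimal illustration under stated assumptions: the nodes are plain in-process dictionaries rather than remote services, the engine identifiers and signals are made up, and a simple contiguous-subsequence check stands in for the AURA pattern match engine, whose interface the slides do not describe.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-airport stores: engine id -> recorded signal.
NODES = {
    "London": {"eng-001": [1, 2, 3, 4], "eng-002": [5, 5, 5]},
    "Tokyo": {"eng-101": [0, 2, 3, 9]},
    "Cape Town": {"eng-201": [7, 8]},
    "Mexico City": {"eng-301": [2, 3]},
}


def contains(signal, pattern):
    """Return True if pattern occurs as a contiguous run inside signal."""
    n = len(pattern)
    return any(signal[i:i + n] == pattern for i in range(len(signal) - n + 1))


def search_node(node, records, pattern):
    """Search one airport's local data and return (node, engine id) hits."""
    return [(node, engine_id)
            for engine_id, signal in records.items()
            if contains(signal, pattern)]


def global_search(nodes, pattern):
    """Fan the query out to every node in parallel, then merge the results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        futures = [pool.submit(search_node, name, records, pattern)
                   for name, records in nodes.items()]
        hits = [hit for future in futures for hit in future.result()]
    return sorted(hits)


results = global_search(NODES, [2, 3])
for node, engine_id in results:
    print(node, engine_id)
```

Because each node searches only its own data, adding an airport adds a worker rather than lengthening any single search, which is the sense in which the slide calls the design scalable.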

  10. Summary • Large quantities of data arriving globally. • Distributed architecture for data management and search. • Scalable and Robust.
