
Overview

The LEAD Gateway. Dennis Gannon, Beth Plale, Suresh Marru, Marcus Christie, School of Informatics, Indiana University. Overview: The LEAD ITR Project; Science Objectives; Adaptive CyberInfrastructure for Mesoscale Storm Prediction; A tour of the LEAD project.


Presentation Transcript


  1. The LEAD Gateway • Dennis Gannon, Beth Plale, Suresh Marru, Marcus Christie • School of Informatics, Indiana University

  2. Overview • The LEAD ITR Project • Science Objectives • Adaptive CyberInfrastructure for Mesoscale Storm Prediction • A tour of the LEAD project • Components of our approach to Data and Data Driven Adaptive Workflow • Experience so far. • The Gateway Lifecycle

  3. Predicting Storms • Hurricanes and tornadoes cause massive loss of life and damage to property. • The underlying physical systems involve highly non-linear dynamics, so they are computationally intensive to simulate. • Data comes from multiple sources: • “Real time” data derived from streams of sensor data • Archived data in databases of past storms • Infrastructure challenges: • Data mine instrument radar data for storms. • Allocate supercomputer resources automatically to run forecast simulations. • Monitor results and retarget instruments. • Log provenance and metadata about experiments for auditing.

  4. The LEAD Project

  5. Traditional Methodology • STATIC OBSERVATIONS: Radar Data, Mobile Mesonets, Surface Observations, Upper-Air Balloons, Commercial Aircraft, Geostationary and Polar Orbiting Satellite, Wind Profilers, GPS Satellites • Analysis/Assimilation: Quality Control, Retrieval of Unobserved Quantities, Creation of Gridded Fields • Prediction/Detection: PCs to Teraflop Systems • Product Generation, Display, Dissemination • End Users: NWS, Private Companies, Students • The Process is Entirely Serial and Static (Pre-Scheduled): No Response to the Weather!

  6. The LEAD Vision: Adaptive Cyberinfrastructure • DYNAMIC OBSERVATIONS • Analysis/Assimilation: Quality Control, Retrieval of Unobserved Quantities, Creation of Gridded Fields • Prediction/Detection: PCs to Teraflop Systems • Product Generation, Display, Dissemination • End Users: NWS, Private Companies, Students • Models and Algorithms Driving Sensors • The CS challenge: Build cyberinfrastructure services that provide adaptability, scalability, availability, usability, and real-time response.

  7. Change the Paradigm • To make fundamental advances we need: • Adaptivity in the computational model. • But also cyberinfrastructure to: • Execute complex scenarios in response to weather events • Stream processing, triggers • Close the loop with the instruments. • Acquire computational resources on demand. • Need supercomputer-scale resources • Invoked in response to weather events • Deal with the data deluge • Users can no longer manage their own experiment products

  8. The LEAD Gateway Portal • Supports three classes of users: • Meteorology research scientists and grad students. • Undergrads in meteorology classes. • People who want easy access to weather data. • Go to: http://www.leadproject.org

  9. Gateway Components • A Framework for Discovery • Four basic components: • Data discovery: catalogs and index services • The experiment: computational workflow managing on-demand resources • Data analysis and visualization • Data product preservation: automatic metadata generation and experimental data provenance.

  10. Data Search • Select a region, a time range, and the desired attributes.
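
The search described on this slide can be pictured as a filter over catalog records. The following is only a minimal sketch under stated assumptions: the `CatalogEntry` record, its field names, and the sample entries are hypothetical and are not the LEAD catalog schema or API. It also assumes the region is a simple latitude/longitude box that does not cross the antimeridian.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CatalogEntry:
    # Hypothetical record shape for illustration; not the LEAD schema.
    name: str
    lat: float
    lon: float
    time: datetime
    attributes: frozenset


def search(entries, south, north, west, east, start, end, wanted):
    """Return entries inside the box and time range that carry every wanted attribute."""
    wanted = frozenset(wanted)
    return [
        e for e in entries
        if south <= e.lat <= north
        and west <= e.lon <= east
        and start <= e.time <= end
        and wanted <= e.attributes  # subset test: all wanted attributes present
    ]


# Made-up sample entries, used only to exercise the filter.
catalog = [
    CatalogEntry("scan_in_box", 35.2, -97.4, datetime(2007, 5, 4, 21),
                 frozenset({"reflectivity", "velocity"})),
    CatalogEntry("scan_outside_box", 39.2, -86.5, datetime(2007, 5, 4, 21),
                 frozenset({"reflectivity"})),
    CatalogEntry("scan_too_old", 35.3, -97.5, datetime(2006, 5, 4, 21),
                 frozenset({"reflectivity"})),
]

hits = search(catalog, 33.0, 37.0, -100.0, -94.0,
              datetime(2007, 5, 1), datetime(2007, 5, 31), {"reflectivity"})
print([e.name for e in hits])  # ['scan_in_box']
```

A real catalog would push these predicates into an index service rather than scan a list; the sketch only shows the three selection criteria working together.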

  11. Portal: Experimental Data & Metadata Space • CyberInfrastructure extends the user’s desktop to incorporate a vast data analysis space. • As users go about doing scientific experiments, the CI manages back-end storage and compute resources. • The portal provides ways to explore, search, and discover this data. • Metadata about experiments is largely automatically generated and highly searchable. • The metadata describes each data object (the file) in application-rich terms and provides a URI to a data service that can resolve an abstract unique identifier to a real, on-line data “file”.

  12. Workflow: Composing Computational Tools to Build New Tools • Workflow is a term that describes the process of moving data through a sequence of analysis and transformational steps to achieve a goal. • Another paradigm shift for the users. • Each activity a user initiates in LEAD is an Experiment, which consists of: • Data discovery and collection • Applied analysis and transformation • A graph of activities (workflow) • Curated data products and results • Each workflow activity is logged using an event system and stored as metadata in the user’s workspace. • This provides a complete provenance of the work.
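
As an illustration of "a graph of activities" whose steps are each logged as events, here is a minimal sketch. It is not the LEAD workflow engine or its event system: the `run_workflow` function, the step names, and the event fields are all assumptions made for this example. It uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+) to run steps in dependency order.

```python
from datetime import datetime, timezone
from graphlib import TopologicalSorter


def run_workflow(steps, depends_on, initial):
    """Run steps in dependency order and record one event per step.

    steps:      step name -> function taking a dict of upstream outputs
    depends_on: step name -> set of upstream names (steps or initial inputs)
    initial:    name -> value for data that exists before the workflow starts
    """
    outputs = dict(initial)
    events = []  # hypothetical stand-in for an event system
    for name in TopologicalSorter(depends_on).static_order():
        if name in outputs:  # an initial input, nothing to run
            continue
        inputs = {d: outputs[d] for d in depends_on.get(name, ())}
        outputs[name] = steps[name](inputs)
        events.append({
            "step": name,
            "inputs": sorted(inputs),
            "finished": datetime.now(timezone.utc).isoformat(),
        })
    return outputs, events


# Toy three-step chain; the step names and arithmetic are illustrative only.
steps = {
    "decode": lambda i: [v * 2 for v in i["raw"]],
    "analyze": lambda i: sum(i["decode"]),
    "forecast": lambda i: i["analyze"] + 1,
}
depends_on = {"decode": {"raw"}, "analyze": {"decode"}, "forecast": {"analyze"}}

outputs, events = run_workflow(steps, depends_on, {"raw": [1, 2, 3]})
print(outputs["forecast"])          # 13
print([e["step"] for e in events])  # ['decode', 'analyze', 'forecast']
```

The event list is what gives the provenance: replaying it shows which step consumed which inputs and when it finished.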

  13. The Experiment Builder • A Portal “wizard” that leads the user through the set-up of a workflow. • Asks the user: • “Which workflow do you want to run?” • Once this is known, it can prompt the user for the required input data sources. • Then it “launches” the workflow.

  14. Parameter Selection

  15. Selecting the forecast region

  16.

  17. Gateway Support for Adaptive Queries • LEAD requires the ability to construct workflows that are: • Data Driven • Weather data streams define the nature of the computation. • Persistent and Agile • Data mining of a data stream detects an “interesting” feature, and the event triggers a workflow scenario that has been waiting for months. • Adaptive • Workflows respond to the weather as it changes. • The nature of a workflow may have to change on-the-fly. • Resources and requirements change.
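
The "persistent and agile" pattern on this slide, where a waiting workflow is triggered by a feature detected in a data stream, can be sketched as follows. This is a simplified illustration, not LEAD's stream-mining or trigger service: the `watch` function, the site names, the readings, and the threshold are all made up for the example.

```python
def watch(stream, is_interesting, launch):
    """Scan a stream of observations; launch the waiting workflow for each interesting one."""
    launched = []
    for obs in stream:
        if is_interesting(obs):
            launched.append(launch(obs))
    return launched


# Hypothetical scans: (site name, peak reading). Values are invented.
scans = [("site_a", 32.0), ("site_a", 58.5), ("site_b", 41.0), ("site_c", 61.0)]
THRESHOLD = 55.0  # illustrative trigger level, not a meteorological recommendation

runs = watch(
    scans,
    is_interesting=lambda s: s[1] >= THRESHOLD,
    launch=lambda s: f"forecast centered on {s[0]}",
)
print(runs)  # ['forecast centered on site_a', 'forecast centered on site_c']
```

In a deployed system the stream would be unbounded and the detector far more elaborate, but the control flow is the same: the data, not a schedule, decides when a forecast runs.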

  18. Experience with on-demand computing • We use TeraGrid. • It is actually “best effort” and not yet “on demand”. • We use Grid technology for remote job execution and security. • Reliability is critical. • The workflow can automatically resubmit a failed task to another resource. • Urgent computing is handled by the Spruce Gateway.
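
Automatic resubmission of a failed task to another resource can be sketched as a simple failover loop. The code below is an assumption-laden illustration, not the Gateway's actual retry logic or any Grid API: `run_with_failover`, `fake_submit`, and the resource names are hypothetical, and a `RuntimeError` stands in for whatever failure a real job submission would report.

```python
def run_with_failover(task, resources, submit):
    """Try each resource in turn; return (resource, result) from the first success."""
    errors = {}
    for resource in resources:
        try:
            return resource, submit(task, resource)
        except RuntimeError as exc:  # assumed failure signal for this sketch
            errors[resource] = str(exc)
    raise RuntimeError(f"task {task!r} failed on all resources: {errors}")


def fake_submit(task, resource):
    # Stand-in for a remote job submission; "cluster_a" always fails here.
    if resource == "cluster_a":
        raise RuntimeError("node failure")
    return f"{task} done on {resource}"


result = run_with_failover("forecast_step", ["cluster_a", "cluster_b"], fake_submit)
print(result)  # ('cluster_b', 'forecast_step done on cluster_b')
```

Keeping the per-resource error messages means the final failure, if every resource is exhausted, still explains what went wrong where.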

  19. Validating Scientific Discovery • The Gateway is becoming part of the process of science by being an active repository of data provenance. • Disks are cheap, so why not record everything? • The system records each computational experiment that a user initiates. • This gives a complete audit trail of the experiment or computation. • Published results can include a link to provenance information for repeatability and transparency.

  20. Experience so far • First release to support “WxChallenge: the new collegiate weather forecast challenge” • The goal: “forecast the maximum and minimum temperatures, precipitation, and maximum sustained wind speeds for select U.S. cities” and “to provide students with an opportunity to compete against their peers and faculty meteorologists at 64 institutions for honors as the top weather forecaster in the nation.” • 79 “users” ran 1,232 forecast workflows, generating 2.6 TBytes of data. • Over 160 processors were reserved on Tungsten from 10am to 8pm EDT (EST), five days each week. • National Spring Forecast • First use of user-initiated 2 km forecasts as part of that program. Generated serious interest from the National Severe Storm Center. • Integration with the CASA project is scheduled for the final year of the LEAD ITR.

  21. The LEAD Gateway Lifecycle • Work began in 2003 with requirements analysis by the LEAD meteorology and CS teams. • The first 2 years of development were supported by the LEAD ITR and the NMI Portals project. • Years 3 and 4: support of 2 FTEs from TG. • Public release in March 2007. • Current status: • A new production release in July 2007. • Last year of the LEAD ITR: a hardened version of the Gateway to transition to community support. • UCAR - UNIDATA may be the host. • Extensive planning is underway.
