Learn about GEON's Cyberinfrastructure and how it enables seamless access, coordination, and integration of earth science data to facilitate research and discovery for EarthScope. Discover involvement opportunities for earth scientists.
Virginia Tech & GEON
Cyberinfrastructure and EarthScope Science goals: A GEON perspective
• What is Cyberinfrastructure?
• What is GEON?
• How will GEON research facilitate discovery and integration of earth science data?
• What are the benefits of such a research initiative for EarthScope?
• How can earth scientists participate in Cyberinfrastructure research opportunities?
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES A.K.Sinha, Virginia Tech, 2005

Cyberinfrastructure
• Cyberinfrastructure is the organized aggregate of technologies enabling access and coordination of information technology resources to facilitate science, engineering, and societal goals.
• Data access from distributed systems
• Data interoperability
• Computation: grid based and workflows
• Visualization
• Tools
• Integration: highlighted today
National Science Foundation’s Cyberinfrastructure: the NSF Blue Ribbon Panel (Atkins) Report provided a compelling and comprehensive vision of an integrated Cyberinfrastructure.
Modified from Berman, SDSC, 2005

A KEY OBSERVATION IN SUPPORT OF CYBERINFRASTRUCTURE RESEARCH IN GEOSCIENCES
“Large team** efforts are required to build a federation of data and tools; but smaller groups or individuals working independently and given access to these data and tools can (and likely will) make fundamental discoveries”
MODIFIED FROM BLUE RIBBON ADVISORY PANEL ON CYBERINFRASTRUCTURE REPORT, NSF
** such as GEON and other projects

Towards an Integrated Earth Science data and knowledge base to achieve EarthScope Science and education goals
• EarthScope Instrumentation and Data: SAFOD, PBO, USArray, InSAR
• Plus semantic integration of other earth science data
• Cyberinfrastructure Resources
• Science Investigators; Educators and the Public

Adapted from D. Seber, SDSC

Three-dimensional view of the lithosphere-asthenosphere boundary and surface topography of the northern Appalachians. Base of lithosphere interpolated from migrated Ps waveform images at 6 labeled stations. (From Rychert et al. 2005)
New knowledge about the evolution of continents requires complex integration of geophysical data with data on the ages of the sub-crustal lithosphere, its composition and physical properties (seismic, thermal, etc.), surface geology, and the chronology of associated events.

What is the geologic and geophysical record of Super-Continent assembly and dispersal?

EarthScope Science Targets: Examples from eastern North America
• What is the geologic and geophysical record of Super-Continent assembly and dispersal?
• What are the architectures of terrane boundaries at depth?
• How do composition, temperature and strain fabrics vary within the lithosphere and asthenosphere? Are lithospheric and asthenospheric strain coupled?
• How sharp is the lithosphere-asthenosphere boundary? What defines it?
DATA NEEDED TO ADDRESS THESE QUESTIONS ARE DISTRIBUTED ACROSS THE COUNTRY, ARE IN DIFFERENT FORMATS, AND CANNOT BE INTEGRATED IN A WEB ENVIRONMENT WITH EXISTING TECHNOLOGIES. Overcoming heterogeneity is a priority cyberinfrastructure challenge.

Outline
• Data integration problem and solutions
• GEON data integration solution: ontology-enabled semantic mediation
• What is an ontology?
• Registering data to ontologies
• Discovering data and using workflows in a web environment to go from queries to questions

GEON Architecture addresses problems of:
1. Variety of data sources and types
2. Discovery and relevance
3. Addressing needs of different communities
• Platform heterogeneity: different OS platforms
• DBMS heterogeneity: different database systems, e.g. SQLServer, mySQL, DB2
• Data type heterogeneity
• Schema heterogeneity
• Heterogeneity in units, accuracy, resolution
• Semantic heterogeneity
(Modified from Baru, SDSC, 2005)

What is GEON? How can GEON help integrate heterogeneous and distributed data?

GEON: The Geosciences Network (www.geongrid.org)
• GEON is an NSF-funded collaborative research project between IT and Earth Science researchers, with the goal of developing cyberinfrastructure to enable new integrative modes of geosciences research
• GEON is developing a pioneering system to use knowledge-based techniques to discover, query, and integrate data in the Geosciences
• Project participants include 14 PI institutions, as well as partners from other projects, agencies, and industry
• GEON has deployed a Web services-based, distributed computing infrastructure, called the GEONgrid, across the PI and partner sites
• GEONgrid provides access to distributed data collections, tools, and applications
Research and Education Products and Results:
• Technologies for “Smart Search”, On-the-fly Data Integration, GIS Map Integration, Distributed Portals, and 4D Visualization
• Earth Science Research within GEON on:
  • 3D Lithospheric Structure
  • Integrated Geoscience Modeling
  • Geologic Evolution of North America
  • Ontologic Framework for the Geo-sphere
• Cyberinfrastructure Summer Institute for Geoscientists and Graduate Courses in Geoinformatics

GEON and Cyberinfrastructure
• Develop cyberinfrastructure that enables interlinking and sharing of multidisciplinary Earth Science data resources, software and tools
• Create a scientist-friendly portal to access data and software for analysis, modeling, and visualization
• Create the GEONgrid to enable a seamless data integration and analysis environment

GEON: GEOsciences Network
• Portal (login, myGEON)
• Registration, GEONsearch, GEONworkbench, Modeling Environment
• Registration Services, Indexing Services, Data Integration Services, Workflow Services, Visualization & Mapping Services
• Core Grid Services: GT3, OGSA-DAI, GSI, CAS, gridFTP, SRB, PostGIS, mySQL, DB2
• Physical Grid: RedHat Linux, ROCKS, Internet, I2, OptIPuter (planned)
• Data, Physical model, HPCC, Model results

Discovering, sharing and using data in a web environment: GEON style
• Discovery of data resources (e.g., gravity, geologic maps, etc.) requires registration through use of high-level index terms. GEON has deployed an extension of the AGI Index terms, which will be cross-indexed to others such as GCMD and AGU
• Discovering item-level content of databases requires registration through data-level ontologies (e.g. the column in a geochemical database that represents the SiO2 measurement) and is a requirement for semantic integration
• Item-detail-level registration through ontologies reduces schema-based data heterogeneities
• Computation and modeling tools can be registered for use by the community
• Visualization capabilities
• Easy access to data through the GEON Portal
• Individual workbench built into the GEON Portal
• Scientific Workflow Systems provide computational and query capabilities in a web environment

GEON Index Ontology: AGI Index Terms
Index terms from AGI are used to identify the type of data.

Integration: a buzz word but with complex solutions
What is Integration?
• Relationships in information contained in heterogeneous and multi-disciplinary databases
What are our choices?
• Layering of data (commonly used)
• View-based techniques (create a virtual schema)
• Schema-based integration (merging of schemas, but the user must be knowledgeable about the organization, e.g. the semantics of the schema)
• Ontology-based semantic integration utilizing workflows: favored by GEON
Data Registration is Important for integration!

What is Ontology? Why use Ontology?
• Ontology: an explicit formal specification of the terms in a domain (e.g. Geology) and the relations among them (Gruber 1993)
• Why use ontology?
  • To share and reuse domain knowledge
  • To make domain assumptions explicit
  • To separate domain knowledge from operational knowledge
  • To analyze domain knowledge
• Ontology Languages:
  • RDF and RDFS
  • OIL
  • DAML+OIL
  • OWL: Web Ontology Language from W3C

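The definition above (terms plus relations among them) can be sketched in a few lines of Python. This is only an illustration, not GEON's ontology or any OWL tooling: the rock terms, the `IS_A` table, and the helper names are all hypothetical.

```python
# Hypothetical mini-ontology: each term points to its broader term ("is_a").
# The terms and the structure are illustrative assumptions, not GEON content.
IS_A = {
    "Granite": "PlutonicRock",
    "PlutonicRock": "IgneousRock",
    "Basalt": "VolcanicRock",
    "VolcanicRock": "IgneousRock",
    "IgneousRock": "Rock",
}


def ancestors(term):
    """Return every broader term reachable by following is_a links."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def is_a(term, broader):
    """True if `broader` is reachable from `term` through is_a links."""
    return broader in ancestors(term)


print(ancestors("Granite"))
print(is_a("Basalt", "IgneousRock"))
```

Because the relation is explicit, software can infer that a record labeled "Granite" also answers a request for "IgneousRock" without that fact being stored in the record itself.
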
Motivations for Using Ontologies in GEON
• A better way to discover and understand datasets: use the knowledge in ontologies to find datasets
• A better way to query datasets: query through ontologies without knowing the details of the schemas
• A better way to integrate multiple datasets: integrate multiple datasets on-the-fly if they are registered to ontologies
• A better way to segment large databases: transfer only the parts of databases required for integration
An emerging research frontier: Geo-Ontology
Modified from Kai Lin, SDSC, 2005

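Querying "through ontologies without knowing the details of the schemas" can be sketched as follows. This is a minimal illustration under stated assumptions: the two datasets, their column names, the concept names, and the `query` helper are hypothetical and do not reproduce GEON's mediator.

```python
# Two hypothetical datasets that name the same quantities differently.
dataset_a = [{"samp": "A1", "sio2_wt": 72.1}, {"samp": "A2", "sio2_wt": 48.3}]
dataset_b = [{"SampleID": "B7", "SiO2": 65.0}]
datasets = {"dataset_a": dataset_a, "dataset_b": dataset_b}

# Registration (assumed form): each dataset maps its own columns
# to shared ontology concepts.
registrations = {
    "dataset_a": {"samp": "SampleIdentifier", "sio2_wt": "SiO2_WeightPercent"},
    "dataset_b": {"SampleID": "SampleIdentifier", "SiO2": "SiO2_WeightPercent"},
}


def query(concepts):
    """Collect rows from every dataset registered for all requested concepts."""
    rows = []
    for name, records in datasets.items():
        # Invert the registration: concept -> local column name.
        column_for = {c: col for col, c in registrations[name].items()}
        if not all(c in column_for for c in concepts):
            continue  # this dataset cannot answer the query
        for record in records:
            rows.append({c: record[column_for[c]] for c in concepts})
    return rows


result = query(["SampleIdentifier", "SiO2_WeightPercent"])
for row in result:
    print(row)
```

The caller names concepts only; the local column names stay hidden behind the registration, which is what allows datasets to be combined on the fly.
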
Class Diagrams: The Basic Building Block for Semantic Integration
Earth Scientists create disciplinary ontologies!!!

Earth Science research: stages in developing ontologies
• Napkin Stage
• Concept Map Stage
• GEON formal ontology
• High Level Ontology: integrated GEON, SWEET and NADM stage

High Level Ontology Packages: representing relationships

State of Matter; Planetary Material; Element; Rocks; Minerals; Data Types

GEON Cyberinfrastructure … More than just about the data, GEON is about going from simple Queries to complex Questions

A Home Buyer’s Information Integration Problem
What houses for sale under $500k have at least 2 bathrooms, 2 bedrooms, a nearby school ranking in the upper third, in a neighborhood with below-average crime rate and diverse population?
Information Integration across: Realtor, School Rankings, Crime Stats, Demographics
“Multiple-Worlds” Mediation
Bertram Ludäscher, SDSC

A query example: use SQL to ask a database to show you all white wines from California of the 2003 vintage.
A question: "Tell me what wines I should buy to serve with each course of the following menu. And, by the way, I don't like Sauternes." … from W3C
The question requires two databases (e.g. food and wine) and prescribed relationships between them, which are defined for computers as Ontologies.
Bertram Ludäscher, SDSC

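The "query" half of this contrast is easy to make concrete. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the `wine` table, its columns, and the sample rows are hypothetical and exist only to show the shape of such a query.

```python
import sqlite3

# In-memory database with a hypothetical wine table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wine (name TEXT, color TEXT, region TEXT, vintage INTEGER)"
)
conn.executemany(
    "INSERT INTO wine VALUES (?, ?, ?, ?)",
    [
        ("Example Chardonnay", "white", "California", 2003),
        ("Example Zinfandel", "red", "California", 2003),
        ("Example Riesling", "white", "Oregon", 2003),
        ("Example Sauvignon Blanc", "white", "California", 2001),
    ],
)

# The query: all white wines from California of the 2003 vintage.
rows = conn.execute(
    "SELECT name FROM wine "
    "WHERE color = ? AND region = ? AND vintage = ? "
    "ORDER BY name",
    ("white", "California", 2003),
).fetchall()
names = [row[0] for row in rows]
print(names)
```

A query like this only filters one table on stored values. The menu question cannot be answered this way, because the link between courses and suitable wines is not a column in either database; it has to come from relationships defined outside the data.
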
The Problem: Scientific Data Integration, or: … from Queries to Questions

Data Registration: key to integration
• Input a data set name
• Click on Submission to register a dataset
• Choose an ontology class
• Select a zipped shapefile

Registration at the item detail level using data ontology: working with data

GEONsearch: building on data registration
• Choose subject (from a “base” ontology)
• Choose location (from a gazetteer Web service)
• Choose a time (numeric range or from a time ontology Web service)
• Choose concepts from ontologies
Kai Lin, SDSC, 2005

Ontology Enabled Map Integration: A Case Study
• Geologic data sets: Arizona, Idaho, Montana, Utah, Nevada, Colorado, Wyoming, New Mexico
• Ontologies:
  • Geologic Time Scale
  • Multihierarchical Rock Classification from the Geological Survey of Canada
  • British Rock Classification Scheme
Snapshot after querying “Paleozoic”

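A query for “Paleozoic” can match map units labeled with narrower ages only if a time-scale ontology supplies the hierarchy. The sketch below illustrates that expansion step. The period names follow the standard geologic time scale, but the `NARROWER` table, the map units, and the function names are hypothetical and are not GEON's implementation.

```python
# Fragment of a geologic time hierarchy: broader term -> narrower terms.
NARROWER = {
    "Phanerozoic": ["Paleozoic", "Mesozoic", "Cenozoic"],
    "Paleozoic": ["Cambrian", "Ordovician", "Silurian",
                  "Devonian", "Carboniferous", "Permian"],
    "Carboniferous": ["Mississippian", "Pennsylvanian"],
    "Mesozoic": ["Triassic", "Jurassic", "Cretaceous"],
}


def expand(term):
    """Return the term together with all of its narrower terms."""
    terms = {term}
    for child in NARROWER.get(term, []):
        terms |= expand(child)
    return terms


# Hypothetical map units from different state maps, each with its own age label.
map_units = [
    {"state": "Utah", "unit": "unit_1", "age": "Permian"},
    {"state": "Nevada", "unit": "unit_2", "age": "Jurassic"},
    {"state": "Idaho", "unit": "unit_3", "age": "Mississippian"},
    {"state": "Arizona", "unit": "unit_4", "age": "Paleozoic"},
]


def select_by_age(units, term):
    """Keep the units whose age label falls under the queried term."""
    wanted = expand(term)
    return [u for u in units if u["age"] in wanted]


selected = select_by_age(map_units, "Paleozoic")
print([u["unit"] for u in selected])
```

Each state map keeps its own age labels; the shared hierarchy is what lets one query span all of them.
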
Scientific Workflow Systems in GEON
• Adding computational capability in a web environment
• Promote “scientific discovery” by providing tools and methods to generate scientific workflows
• Support computational infrastructure for modeling, classification, computation
• Design frameworks which define efficient ways to connect to the existing data and integrate heterogeneous data from multiple resources

Workflow layout for rock classification, but can be used for any query that requires a classifier
• Find data on the basis of ontologic registration
• PointInPolygon algorithm

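The slide names a PointInPolygon step but does not say how it is implemented. One common approach is ray casting, sketched below as an assumption rather than as the workflow's actual code; the function name and the test polygon are hypothetical. In a classification workflow, a test like this can decide which field of a diagram a sample plots in, or which map polygon contains a sample location.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count how many polygon edges a horizontal ray
    from (x, y) toward +x crosses; an odd count means the point is inside.

    `polygon` is a list of (x, y) vertices in order; the last vertex is
    joined back to the first. Points exactly on an edge are not handled
    specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the ray's y value can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses the horizontal line at y.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(point_in_polygon(2.0, 2.0, square))
print(point_in_polygon(5.0, 2.0, square))
```
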
Integration Scenario: A-type pluton query
• Classifying A-types from an igneous rock database
• Integrating between relational and spatial (shapefile) databases to query and interactively display GIS results

Integration Scenario: Stages for access to data and tools in a workflow environment
The integration scenario: What is the distribution and U-Pb zircon ages of A-type plutons in Virginia?
• Ontology System
• Location
• States
• Virginia
• Classification System
• Rock Classifiers
• Igneous
• Pluton
• A-type
• Mineral
• Zircon
• Geologic Time
• Dating Methods
• U-Pb Zircon Methods
[Classification plot: Zr versus 10^4 Ga/Al, with fields labeled "A type" and "I & S type"; workflow stages numbered 1 to 6]

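The scenario chains three steps: classify plutons from geochemistry, restrict them by location, and attach ages from a separate source. The sketch below walks through that chain on made-up records. Everything in it is an assumption for illustration: the record layout, the pluton names, the ages, and especially the two cutoff values, which are placeholders and not thresholds taken from the slides.

```python
# Placeholder cutoffs for the Zr versus 10^4 Ga/Al plot (assumed values).
ZR_MIN_PPM = 250.0
GA_AL_X1E4_MIN = 2.6

# Hypothetical geochemical database (relational source).
geochem = [
    {"pluton": "P1", "zr_ppm": 480.0, "ga_al_x1e4": 3.4},
    {"pluton": "P2", "zr_ppm": 150.0, "ga_al_x1e4": 2.0},
    {"pluton": "P3", "zr_ppm": 390.0, "ga_al_x1e4": 3.0},
]

# Hypothetical spatial lookup (stands in for a shapefile query).
locations = {"P1": "Virginia", "P2": "Virginia", "P3": "Maryland"}

# Hypothetical U-Pb zircon ages in Ma (geochronology source).
ages_ma = {"P1": 730.0, "P2": 450.0, "P3": 600.0}


def is_a_type(record, zr_min=ZR_MIN_PPM, ga_al_min=GA_AL_X1E4_MIN):
    """Classify a record as A-type if it exceeds both placeholder cutoffs."""
    return record["zr_ppm"] > zr_min and record["ga_al_x1e4"] > ga_al_min


def a_type_plutons_in(state):
    """Join classification, location, and age for plutons in one state."""
    return [
        {"pluton": r["pluton"], "age_ma": ages_ma[r["pluton"]]}
        for r in geochem
        if is_a_type(r) and locations.get(r["pluton"]) == state
    ]


virginia_a_types = a_type_plutons_in("Virginia")
print(virginia_a_types)
```

In the scenario described on the slide, each of these lookups would be a separate registered data source reached through the workflow rather than an in-memory dictionary.
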
Distribution and ages of A-type plutons based on integration of multiple databases

How do earth scientists participate in Cyberinfrastructure research?
• Know your data: its content and definitions
• Think more broadly: integration is between databases that are different from yours
• Learn more about how to use IT through the summer workshop at SDSC, as well as others sponsored by societies
• Register your data using Index Terms through the GEON Portal to facilitate discovery of databases; use data ontology for discovery of data
• Build and share tools and services for use in a web environment
• Construct concept maps in your discipline; this leads to the formal ontologies required for semantic integration (remember Geo-Ontology)
• EarthScope requires integrative capabilities

From Objects to Processes: just the beginning of a new integrative world
Some processes and objects typically involved in crustal melting (diagram labels: heating, melting, "now where?"). From Cal Barnes, Texas Tech, 2005

Two important events at GSA
• GEON and EarthScope Reception
• DIVISION OF GEOINFORMATICS (Data to Knowledge): FIRST Business Meeting will take place during the upcoming National GSA meeting, Salt Lake City
• Tuesday, 18 October, Ballroom D, 5.45-7.45pm
• Monday, 17 October, Hilton Salt Lake City Center, Alpine West Ballroom, 5.00-7.00pm