370 likes | 389 Vues
SERVOGrid is a project supporting earthquake science by providing grid/web services and portals. It aims to answer questions about plate boundaries, tectonics and climate, ice masses and sea level change, magmatic systems and volcanoes, dynamics of the mantle and crust, and Earth's magnetic field.
E N D
SERVO Grid:Solid Earth Research Virtual ObservatoryGrid/Web Services and Portals Supporting Earthquake Science Current SERVOGrid is USA Project led by JPL (Jet Propulsion Laboratory) but next is iSERVO with International Collaboration between Australia, China, Japan and USA August 26 2004 Beijing China Geoffrey Fox, Marlon Pierce Community Grids Lab, Pervasive Technologies Laboratories Indiana University
What is the nature of deformation at plate boundaries and what are the implications for earthquake hazards? • How do tectonics and climate interact to shape the Earth’s surface and create natural hazards? • What are the interactions among ice masses, oceans, and the solid Earth and their implications for sea level change? • How do magmatic systems evolve and under what conditions do volcanoes erupt? • What are the dynamics of the mantle and crust and how does the Earth’s surface respond? • What are the dynamics of the Earth’s magnetic field and its interactions with the Earth system? Solid Earth Science Questions From NASA’s Solid Earth Science Working Group Report, Living on a Restless Planet, Nov. 2002
The Solid Earth is:Complex, Nonlinear, and Self-Organizing Relevant questions that Computational technologies can help answer: • How can the study of strongly correlated solid earth systems be enabled by space-based data sets? • What can numerical simulations reveal about the physical processes that characterize these systems? • How do interactions in these systems lead to space-time correlations and patterns? • What are the important feedback loops that mode-lock the system behavior? • How do processes on a multiplicity of different scales interact to produce the emergent structures that are observed? • Do the strong correlations allow the capability to forecast the system behavior in any sense?
Characteristics of Computing for Solid Earth Science • Widely distributed datasets in various formats • GPS, Fault data, Seismic data sets, InSAR satellite data • Many available in state of art tar files that can be FTP’d • Provenance problems: faults have controversial parameters like slip rates which have to be estimated. • Distributed models and expertise • Lots of codes with different regions of validity, ranging from cellular automata to finite element to data mining applications (HMM) • Simplest challenges are just making these codes useable for other researchers. • And hooking this codes to data sources • Some codes also have export or IP restrictions • Other codes are highly specialized to their deployment environments. • Decomposable problems requiring interoperability for linking full models • The fidelity of your fault modeling can vary considerably • Link codes (through data) to support multiple scales
SERVOGrid Requirements • Seamless Access to data repositories and computing resources • Integration of multiple data sources including databases, file systems, sensors, …, with simulation codes. • Core web servicesfor common tasks like command execution and file management. • Meta-data generation, archiving, and access with extending openGIS (Geography as a Web service) standards. • Portalswith component model (portlets) for user interfaces and web control of all capabilities • Basic Grid tools: complex job managementand notification • Collaboration to support world-wide work • “Collaboration” can range from data sharing to Audio-video conferencing
SERVOGrid Applications • Codes range from simple “rough estimate” codes to parallel, high performance applications. • Disloc: handles multiple arbitrarily dipping dislocations (faults) in an elastic half-space. • Simplex: inverts surface geodetic displacements for fault parameters using simulated annealing downhill residual minimization. • GeoFEST: Three-dimensional viscoelastic finite element model for calculating nodal displacements and tractions. Allows for realistic fault geometry and characteristics, material properties, and body forces. • Virtual California: Program to simulate interactions between vertical strike-slip faults using an elastic layer over a viscoelastic half-space • RDAHMM: Time series analysis program based on Hidden Markov Modeling. Produces feature vectors and probabilities for transitioning from one class to another. • Preprocessors, mesh generators: AKIRA suite • Visualization tools: RIVA, GMT,IDL
SERVOGrid Codes, Relationships Elastic Dislocation Inversion Viscoelastic FEM Viscoelastic Layered BEM Elastic Dislocation Pattern Recognizers Fault Model BEM This linkage called Workflow in Grid/Web Service parlance
Service-1 Service-3 Role of Workflow • Programming the Grid: Workflow describes linkage between services • As distributed, linkage must be by messages • Linkage is two-way and has both control and data • Apply to multi-scale (complexity) linkage, multi-program linkage, link visualization to simulation, GIS to simulations and viz filters to each other • Microsoft-IBM specification BPEL is current preferred Web Service XML specification of workflow • SERVOGrid uses ANT (well known XML build tool) to perform workflow and this works well in our relatively simple cases) Service-2
(i)SERVO Web (Grid) Services • Programs: All applications wrapped as Services using proxy strategy • Job Submission: support remote batch and shell invocations • Used to execute simulation codes (VC suite, GeoFEST, etc.), mesh generation (Akira/Apollo) and visualization packages (RIVA, GMT). • File management: • Uploading, downloading, backend crossloading (i.e. move files between remote machines) • Remote copies, renames, etc. • Job monitoring • Workflow: Apache Ant-based remote service orchestration • For coupling related sequences of remote actions, such as RIVA movie generation. • Data services: support remote data bases and query construction • XML data model being adopted for common formats with translation services to “legacy” formats. • Migrating to Geography Markup Language (GML) descriptions. • Metadata Services: for archiving user session information.
Security: Authentication and Authorization • Authentication describes who the user is • Authorization describes what a given user can do • What data and computers can be accessed • Basically a database • Current portal uses password accounts and provides services for free for demonstration. • iSERVO should decide on “charging for” services • We have (through Community portal effort OGCE) support for GSI and Kerberos authentication services. • These just plug in and replace the default login service. • Authorization is currently simple: you can only reach your files. • iSERVO should develop an authorization policy • Simultaneous Cross Administrative Domain access is a very hard Grid problem and no consensus as to good solution • Systematic use of Services helps security/privacy/IP issues as “danger of misuse” is lower for services (which have limited privileges) than for direct computer access
SERVO Data Sources • Fault Data • Developed as part of the project • QuakeTables: http://infogroup.usc.edu:8080 • Seismic data formats • Available from www.scec.org • SCSN, SCEDC, Dinger-Shearer, Haukkson • GPS data formats • Available from www.scign.org • JPL, SOPAC, USGS
Applications and Observational Data • Several SERVO codes work directly with observational data. • Scenarios include • GeoFEST, VirtualCalifornia, Simplex, and Disloc all depend upon fault models. • RDAHMM and Pattern Informatics codes use seismic catalogs. • RDAHMM primarily used with GPS data • Problem: We need to provide a way to integrate these codes with the online data repositories. • QuakeTables Fault Database was developed • What about GPS and Earthquake Catalogs? • Many formats, data available in tars or files, not searchable, not easy to integrate with applications • Solution: use databases to store catalog data; use XML (GML) as exchange data format; use Web Services for data exchanges, invoking queries, and filtering data.
Geographical Information Service (GIS) Data Formats and Services • OpenGISConsortium (OGC) is an international group for defining GIS data formats and services. • Main data format language is the XML-based GML. • Subdivided into schemas for drawing maps, representing features, observations, … • First Step: design GML schemas and build specialized Web Services for GPS and Earthquake data. • OGC also defines services. • Services include Web Features Services, Web Map Services, and similar. • These are currently pre-Web Service, based on HTTP Post, but they are being revised to comply with WS standards. • Next Step: Implement OGC compatible Web Services for this problem i.e. build a GIS Grid • Also build services to interact with QuakeTables Fault DB.
GML and Existing Data Formats • GPS or seismic data used in this project are retrieved from different URLs and have different text formats. • Seismic data formats • SCSN, SCEDC, Dinger-Shearer, Haukkson • GPS data formats • JPL, SOPAC, USGS • We defined 2 GML Schemas to unify these • http://grids.ucs.indiana.edu/~gaydin/servo • A summary of all supported formats and data sources can also be found there.
Prototype GML Service • First version of the system available • Tried XML databases but performance was awful • Currently database uses MySQL • Download results are in GML, but we can convert to appropriate text formats.
openGIS Grid Semantics • Note GIS (Geographical Information System) Grid at heart of all these Grids • Geography Markup Language (GML) is an XML encoding for the specification of the geometry and properties of geographic features. GML utilizes the OpenGIS Abstract Specification geometry model which has been harmonized with the ISO geospatial geometry model. • We are building CI specific ontologies in terms of GML to define faults, satellites etc. • http://ripvanwinkle.ucs.indiana.edu:4780/examples/download/schema/ • Styled Layer Descriptor (SLD) specifies the format of a map-styling language for portraying the output of Web Map Servers, Web Feature Servers and Web Coverage Servers etc. SLD will enable different communities in the Emergency Response area to develop a set of customized portrayal rules that best fit their mission requirements. • This becomes the specification of portals to different composite Grids • Sensor Markup Language (SensorML) defines the information model for discovering, querying and controlling Web-resident sensors. • Observations & Measurements (O&M) defines the information model for observations that are returned from the CrisisGrid sensors.
GIS Grid Services I • Web Feature Service (WFS) supports the query and discovery of geographic features delivering GML representations of simple geospatial features in response to queries from HTTP clients. WFS can access geographic features including critical infrastructure features, incident locations, and flood-related geographic features including inundation areas, watershed boundaries, and demographic feature. • Web Coverage Service (WCS) supports the query and discovery of digital geospatial information such as digital elevation models, imagery, orthophotography, weather coverages (such as predicted rainfall, air pressure, wind speed and direction), and any other space-varying flood-related phenomena. • Web Map Service (WMS) uses a SLD portrayal to generate "pictures" of georeferenced feature or coverage data. WMS will provide a means to portray geographic information independent of the underlying data model (WFS or WCS). • Coverage Portrayal Service (CPS) defines a standard interface for producing visual pictures from coverage data typically accessed via WCS with a SLD portrayal.
GIS Grid Services II • Web Terrain Service (WTS) augments WMS with advanced visualization including 3D terrains. • Catalog Service - Web Profile (CS-W) is a catalog service that will be built on a general Grid metadata service • Sensor Collection Service (SCS) fetches observations from a sensor or group of sensors and will be integrated with research on Grid sensor services • Sensor Planning Service (SPS) assists in 'collection feasibility plans' and to process collection requests for a sensor or group of sensors. • Web Notification Service (WNS) will be replaced by standard Grid notification service
QuakeTables+OGC Web Map Service Demo http://rio.ucs.indiana.edu:8080/wmsClient/
Streams and Workflow • NaradaBrokering can manage streams from • Audio/Video conferences • Sensors • Inter-service communication in workflow • http://www.hpsearch.org/demo/ describes scripting managementinterface to NaradaBrokering Grids involve streams as well as compute and data nodes Workflow and dataflow like BPEL imply streams
SERVOGrid Ontology • SERVOGrid has many types of metadata • We are designing RDFS descriptions for the following components: • Simulation codes, mesh generators, etc. • Visualization tools • Data types • Computing resources • … • These are easily expressed as RDFS (actually DAML) “nuggets” of information. • Create instances of these • Use properties to link instances.
Some Sample Relationships Danube installedOn installedOn Computer GMT Viz Appl Disloc visualizedBy Application createsOutput usesInput Stress Map storedIn DataFormat USC Fault DB Fault Data Storage DataType
More Information • SERVOGrid/QuakeSim: • http://quakesim.jpl.nasa.gov/ • Full Portal Demo: • http://complexity.ucs.indiana.edu:8080 • Request an account • Downloads available in November • Fault database • http://infogroup.usc.edu:8080 • GPS and Seismic Database Demo: • http://gf3.ucs.indiana.edu:6060/cce/sql/ • Setting up your own GPS or Seismic database • http://complexity.ucs.indiana.edu/~gaydin/cce/install/install.html • Publications: • http://grids.ucs.indiana.edu/ptliupages/publications/ • http://grids.ucs.indiana.edu/ptliupages/presentations/
Some Grid Controversies • 1) There are several proposals for the Web Service extensions needed for Grids – why do we ignore? • OGSI (GT3) • WSRF (GT4) • WS-GAF (Newcastle) • WS-I+ (Pure Web Services) • We use WS-I+ approach – can later add extensions when consensus clear • This approach adopted by next phase of UK e-Science Program • 2) Web Services are too slow as use HTTP with clumsy ASCII XML data (SOAP)? • Currently no problem but can use separate control channel from data channel if need high performance
iSERVO Strategy • Agree on what (type of) resources and capabilities need to put on the ISERVO Grid • Computers, instruments, databases, visualization, maps, job submittal …. • Agree on interfaces to resources from OGSA-DAI (databases) to particular data structures (GML/OpenGIS) – specify in XML • Implement Resources and Capabilities as Services • User Interface should be a portlet that can be integrated by the portal into web interface • Make certain overarching Grid capabilities such as workflow, federation and metadata are sufficient • SERVO Grid is a prototype of this strategy using several US sites rather than several countries • Can be naturally extended to iSERVO, education, emergency response by extending resources • Web Service Architecture ensures continued interoperability and extensibility
Lessons Learned • Web service performance is not an issue when used to invoke services that take hours to complete. • Later real-time sensors will probe performance • Reliability is a larger problem. • Need monitoring/heartbeat services. • Information systems still have a long way to go. • UDDI is part of WS-I but has/had some well known limitations. • WS-Discovery has some interesting concepts but is too specialized to ad-hoc networks. • Peer-to-peer systems provide many useful concepts like discovery and caching. • Semantic Web provides powerful resource descriptions that could be exploited. • XML Databases slow
Further iSERVO Challenges • Make everything a Service • Think about Data Curation • Set up policies for observational data and criteria for inclusion in iSERVO data repositories • Think about Data Provenance • Generate and maintain metadata describing ownership, origins and transformations • Applies to both “experimental data” and results from simulations (visualizations) • Curation and Provenance change in research methodologies and requires funding! • Education and Emergency Response/Planning interesting offshoots of iSERVO
QuakeSim Portal for SERVOGrid • The services need user interfaces • WSDL descriptions are all you need to create client stubs (if not client applications). • The QuakeSim portal effort aggregates these service interfaces into a portal. • Customizable displays, access controls to services, etc. • QuakeSim is just one of many, many such projects. • Challenge is to develop reusable portal components
Each Service has its own portlet Individual portlet for the Proxy Manager Use tabs or choose different portlets to navigate through interfaces to different services 2 Other Portlets
Computational Web Portal Stack • Web service dream is that core services, service aggregation, and user interface development decoupled. • How do I manage all those user interfaces? • Use portlets. Aggregate Portals Portlet User Interface Components Application Web Services and Workflow Core Web Services
Portal Architecture Clients (Pure HTML, Java Applet ..) Aggregation and Rendering Portlet Class:WebForm Gateway (IU) Web/Gridservice Computing Remoteor ProxyPortlets Web/Gridservice Data Stores Portlet Class GridPort etc. Portlet Class Web/Gridservice Instruments (Java) COG Kit Portlet Class Hierarchical arrangement Portal Internal Services LocalPortlets Clients Portal Portlets Libraries Services Resources
Why Are Portlets a Good Idea? • You don’t have to reinvent everything • Makes it easy (but not effortless) to share portal components between projects. • So you can pull in portlets from all the other earthquake grid projects. • You can easily combine a wide range of capabilities • Add document managers, collaboration tools, RSS news lists, etc for your portal users.
Lessons Learned: Portals • Developing good user interfaces is a lot of work. • Effort doesn’t scale: how do you simplify this for computational scientists to do it themselves without lots of background in XML, Java, portlets, etc? • Portal interfaces have advantages and disadvantages. • Everyone has a browser. • But it has a limited widget set, a limited event model, limited interactivity. • You can of course overcome a lot of this with applets. • Following the service model, you can in principal use any number of GUIs • Browsers are not the only possible clients. • Web service interoperability means that Java Swing apps, Python, Perl GUIs are all possible, but this has not been fully exploited.
Important Lessons/Principles • Use OGCE Portal Architecture and portal services • Can expect GGF activities like OGSA to define/refine interfaces and projects around the world to produce more powerful services • Obsolescing of implementations is a consequence of interoperability • Use Grids of Grids of Services Architecture • Interoperable Component Grids Built from interoperable services • Collaboration, Compute, Database, GIS, Sensor, Visualization Grids • Build a GIS (Geographical Information Systems) Grid spanning simulation/crisis management and different fields with openGIS compliance • openGIS has defined Web Service Interfaces • Visualization should build on these • Geoscience Education Grid by transformations on research grid • Emergency Response and Planning Grids by adding real-time control/collaboration and GIS tools • These additions common to all crises • Collaboration between Beihang University and Indiana University to produce Web Service based audio/video conferencing