InteroperabilityWhat does it mean and how do we improve it? Jebb Q Stewart Chapel Hill, NC Thursday, July 11th, 2013 NOAA Earth System Research Laboratory. Boulder , CO. Affiliated with Colorado State University Cooperative Institute for Research in the Atmosphere (CIRA)
Overview NOAA Earth Information System (NEIS) Climate.gov Data Interoperability Project Interoperability Challenges and Ideas to Improve Interoperability
NEIS-The Concept NOAA Earth Information System (NEIS) is a framework of layered services designed to help NOAA’s mission areas by facilitating the discovery, access, integration, and under-standing of all NOAA (past, present, and future). Framework provides the capability to answer questions that require data from different data sources regardless of format or location.
Applying existing concepts to NOAA data. Improving the User Experience How does industry make sense of massive amounts of information?
Standards Open Geospatial Consortium (OGC) WMS WFS WCS WCS SOS WFS WFS WCS WCS CSW • WFS – Web Feature Service WMS – Web Mapping Service • WCS – Web Coverage Service CSW – Catalog Service for the Web • SOS – Sensor Observation Service WPS – Web Processing Service
Searching and Metadata • Using Esri Geoportal (http://geoportal.sourceforge.net/) as metadata repository and to harvest metadata from other sources. • Using Apache Solr (http://lucene.apache.org/solr/) to provide additional functionality and improve discoverability. • Advanced Full-Text Search Capabilities: • Facets. • Full Document Storing (allows recreation). • Supports geospatial/temporal searching. • Optimized for High Volume Web Traffic • Solr instance indexes information from harvested metadata using Open Geospatial Consortium (OGC) Catalogue Service for the Web (CSW) standard, and directly from data access services.
Data Access • Currently support a variety of data formats: • KML • HTML • FIM native grids • Movies • ERDDAP (Environmental Research Division’s Data Access Program) • Open Geospatial Consortium (OGC) Web Mapping Service (WMS) • Built additional data access services to simplify requests for time sequences of rendered data making them available to NEIS/TerraViz. • Proxy and cache image requests to improve speed and reliability.
Visualization -- TerraViz 3D visualization tool for Earth datasets Developed by NOAA, with full control of source code Written in C#; similar to Java but runs natively ESRL Global Systems Division http://esrl.noaa.gov/neis
About Unity Unity is a commercial game engine that excels at rendering 3D (and 2D) scenes Develop once, then run on Windows, Mac, Linux, iOS, Android, consoles, and browsers Millions of dollars in research and development Good community support and documentation, tutorials >800,000 registered developers ESRL Global Systems Division http://esrl.noaa.gov/neis
ESRL Global Systems Division http://esrl.noaa.gov/neis
Partnerships • Climate.gov Data Interoperability Team • Groups • Open Geospatial Consortium (OGC) Standards Committees • NOAA Environmental Data Management Committee (EDMC) • NOAA Data Management Integration Team (DMIT) • NOAA Unified Access Framework (UAF) group • NSF Earth Cube • FIM/WRF Modeling • GIS Committee ESRL Global Systems Division
Vision “Interoperability is the ability for users to discover the available climate data, preview and interact with the data, and acquire the data in common digital formats through a simple web-based interface.”
How It Works • The client allows users to interact with the various services without knowledge of how the services work. • The client, and underlying libraries, handle formatting specifics for each service, managing requests against the service, and parsing the response for display to the user • Using the search functionality, user’s can discover and access with the data above • Once the data has been located, it can be loaded to preview a visualization of the data • Once data has been selected, a user has further controls to filter underlying data based on time, keywords, or geographic location • Once user has defined constraints, the user can extract this information and download the data to their local system for further analysis • The codes are available to download and fork out in github: https://github.com/ClimateData/interoperability
Acknowledgements • Sudhir Raj Shrestha(NOAA-CPC) • Steve Ansari (NOAA-NCDC) • Kevin O’Brien (NOAA-PMEL) • David Herring (NOAA-CPO) • Mark Phillips (UNCA) • Micah Wengren (NOAA) • Mike Halpert (NOAA-CPC)
Interoperability The ability to discover, access, view, interact, and integrate data regardless of format or physical location. Components of an Interoperability System Format Agnostic. Owner/Physical Location Agnostic. Platform Agnostic. Preview Capabilities. Semantics/Ontology/Vocabulary. Machine to Machine Communication.
Data Integration • Physical • Chemical • Biological Carbon Tracker FIM Biological, Chemical, and Physical data are all interrelated
Interoperability Why do we want it? Improve accessibility Foster data exploration and use. Decrease complexity Provide framework for new tools and applications. Possibilities are endless.
Interoperability is not It is nota web page or list of links like ftp Not easily parsed or understood by a machine No metadata It is not a service Some optional parameters help interoperability WMS maxWidth/maxHeight It is nota search engine Found my data, how do I get it? What is this format? It is not anything that requires either human intervention or the machine to guess and try and try again
Driving Factors • Whitehouse – Open Data Rules to Enhance Government Efficiency and Fuel Economic Growth • Order requires that, going forward, data generated by the government be made available in open, machine-readable formats, while appropriately safeguarding privacy, confidentiality, and security. • http://www.whitehouse.gov/sites/default/files/omb/memoranda/2013/m-13-13.pdf • http://project-open-data.github.io/policy-memo/ • NOAA Science Advisory Board (SAB) Environmental Information Services Working Group (EISWG) ‘Executive Summary’ • Recommends developing an Open Weather and Climate Services (WCS) in which both NOAA and the community share equal and full access to NOAA information and development • http://www.sab.noaa.gov/Doc/Towards-Open-Weather-and-Climate-Services-report-and-transmittal_12_23_11.pdf
Measureable indicators… Interoperability Readiness Levels Credit: NASA ESDS Technology Infusion Working Group
Impacts • Framework built towards • standards, not data. • Important Because: • NOAA data ready for action. Services model facilitates agile response to events. Services can be combined or reused quickly, upgraded or modified independently. • Any data available through framework can be operated on or combined with other data. Integrated standardized formats and access. • New and existing systems have access to wide variety of NOAA data. Any new data added, easy incorporated with minimal to no changes required. Chemical Physical Biological ESRL Global Systems Division
Finding and Discovering Data Challenges • Finding data often requires previous knowledge or understanding of what you are searching for • Having information readily available on data and services for searching. Our environment evolves rapidly • Harvesting versus Aggregated searching • Missing or Incomplete Data • Derived or generated products/data not well documented or discoverable
Finding and Discovering Data Recommendations • Metadata must contain information to allow machine to machine communication, such as: • Automated tools to generate Metadata • ncISO • Investigate how to document “possible” derived or generated data based on process.
Data Access and Display Challenges • Lack of adherence to specifications and consistency behind standards. • Incomplete or Work in Progress Standards. • Generation of graphical representations of data. How do I stylize point information? What color palette is the best for this data? • Server uptime and data availability. Metadata records say data is available for time but server may be down during access. • Derived or generated data, what process to use and how to use it?
Data Access and Display Recommendations • Metadata uniformity helps improve user experience – need best practices, examples, and/or automated ways to produce information. • OGC MetOcean DWG and Others • Dashboard/Automated Testing Systems to give feedback to data providers on outages, adherence to standards, and metadata matchup to capabilities. • Rubrics, ncISO, etc… • Need information on how data should be displayed, imaged, stylized. Open Geospatial Consortium (OGC) Styled Layer Descriptor (SLD) and Symbology Encoding (SE). • Use ISO OnlineResource to point to style information.
Metadata ChallengesTags and Keywords • Metadata keywords inconsistent. • Lack of meaningful tags. • Keyword repetition – same keywords are used on several data sets. • Semantics, Hierarchy, Taxonomy, Ontology, Relationships – All difficult to infer. • Missing Information • Examples: • “Sea Surface Temperature” “SST” “Sea” “Surface” “Temperature” • “NOAA” vs. “National Oceanic Atmospheric Administration”, “Wind” and “Winds” • Ocean -> Ocean Circulation -> Ocean Currents
MetadataRecommendations • Be Aware of existing ontologies and keywords • GCMD Keywords: http://gcmd.nasa.gov/Resources/valids/archives/keyword_list.html • NASA JPL SWEET Ontology http://sweet.jpl.nasa.gov/ • Allow user defined keywords (Crowdsourcing) • To what extent can we auto generate this information • ncISO, etc… • Metadata validation • Rubrics, Dashboards, etc…
Communication • Stay Involved and Connected: • Organization Groups • Within NOAA: Environmental Data Management Committee (EDM), Data Management and Integration Team (DMIT) • OGC Committees (Like OGC MetOcean DWG) • ESIP • NSF • AGU • Others? • Wiki’s: • NOAA EDM -- https://geo-ide.noaa.gov/wiki/ • ESIP • Others? • Prototypes or Proof of concepts
NEIS Development Activities Discovery • Improve discoverability and ease of finding data through: • Continued ontology development by linking and improving ontologies. • Improved analytics. Learn what is being used and when to provide better results for future searching. • Creating profiles for environmental awareness. • Providing capability to weight results depending on context.
NEIS Development Activities ‘Big Data’ • NOAA’s Big Data is different. • NPP data (~ 4 TB / day and growing) • GOES-R • New global forecast models, rapidly increasing in size. • Vast amounts of historical data • Data are ever increasing in size. Improve accessibility to ‘Big Data’ through: • Providing tools allowing seamless integration of data across time and space regardless of data size. • Minimize data we transfer and avoid data duplication. • Move processing close to data. • Allow users to collaborate on said data.
And Beyond • Continue development, integrating and leveraging new and emerging technologies to meet NEIS goal ‘any data, any location, any platform, now’ • Perform processing within cloud environment and with high speed connectivity to data sources, taking advantage of large processing power within cloud infrastructure. • Send graphics and server side processed/rendered/streamed data to GUI, improving bandwidth utilization. • Take advantage of fast networking to make remote requests and processing appear like local application. • Similar to how the concept of the Amazon Silk Browser. ESRL Global Systems Division
Questions? Jebb.Q.Stewart@noaa.gov http://www.esrl.noaa.gov/neis