Maintaining the momentum of OpenSearch in Earth Science data discovery ++
Doug Newman (NASA ECHO) & Dr Chris Lynnes (GES DISC)
ESIP Winter 2014
What is OpenSearch? A ‘refresher’
From www.opensearch.org: ‘OpenSearch is a collection of simple formats for the sharing of search results’
The Earth Data discovery use case
Components: OpenSearch Descriptor Document, HTTP GET Request, HTTP Response
HTTP GET Request parameters:
• keyword = air temperature
• bounding_box = 39.1 -96.6 39.1 -96.6
• start_date = 2013-11-13T00:00:00Z
HTTP Response fields:
• ID
• Spatial Extent
• Temporal Extent
• Metadata Link
• Search Link
• Data Link
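The HTTP GET request in the use case above can be sketched as filling in a URL template of the kind an OpenSearch Descriptor Document advertises. This is a minimal sketch, not the ECHO implementation: the host `example.org`, the template string, and the helper `fill_template` are hypothetical, and the parameter values are copied verbatim from the slide.

```python
import re
from urllib.parse import quote

# Hypothetical URL template in the style of an OpenSearch descriptor document.
# A trailing "?" inside the braces marks a parameter as optional.
TEMPLATE = (
    "https://example.org/opensearch/datasets"
    "?keyword={searchTerms}"
    "&bounding_box={geo:box?}"
    "&start_date={time:start?}"
)


def fill_template(template, values):
    """Replace each {name} or {name?} placeholder with a URL-encoded value."""

    def substitute(match):
        name = match.group(1)
        optional = match.group(2) == "?"
        if name in values:
            return quote(values[name], safe="")
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{([^{}?]+)(\??)\}", substitute, template)


# Values taken from the slide's example request.
url = fill_template(TEMPLATE, {
    "searchTerms": "air temperature",
    "geo:box": "39.1 -96.6 39.1 -96.6",
    "time:start": "2013-11-13T00:00:00Z",
})
print(url)
```

A client would issue an HTTP GET on the resulting URL and read the ID, extents, and links from each entry in the response.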
Is it successful?
• ESA’s ‘Next Generation User Services for Earth Observation’ will be using OpenSearch as an infrastructure standard*
• CEOS’s CWIC effort will support OpenSearch in a future iteration that will include ESA data providers**
• ESIP Federation continues to champion OpenSearch for earth science data discovery***
• NASA ECHO metrics: average number of queries per week
  • SOAP API (2011): 10k (243k queries that year)
  • REST APIs (2013): 87 of 115k (5 million queries so far this year)
* http://eomag.eu/articles/584/eo-user-service-next-generation-project-eo-usng
** http://www.ceos.org/index.php?option=com_content&view=category&layout=blog&id=348&Itemid=482
*** http://wiki.esipfed.org/index.php/Discovery_OpenSearch_Services
Why is it successful?
• Lightweight and simple
• Standards-based
• RESTful
• Low entry cost
• ‘Free text + spatial + temporal’ satisfies 90% of Earth Data discovery use cases*
* Based on Reverb metrics for the last year (80,420 registered users, 700k queries so far this year (11/08/13))
ECHO Reverb Statistics
• Caters to 90% of Earth Data discovery use cases*
* Based on Reverb metrics for the last year (80,420 registered users, 700k queries so far this year (11/08/13))
1. Converge where possible
CEOS / CWIC, ESIP discovery cluster, OGC
Attribution: http://imgs.xkcd.com/comics/standards.png
1. Converge where possible (for real)
CEOS / CWIC, ESIP discovery cluster, OGC
free_text bounding_box start_date end_date via relation uid place_name geometry described_by
2. Free text + spatial + temporal = success
• Pro: 90% !!!*
• Con: lack of free text precision compared with controlled vocabularies
• Can free text solve this?**
  • free text = ‘MODIS’ (693 hits) != instrument = ‘MODIS’ (543 hits)
  • free text = ‘ozone’ (348 hits) != science keyword = ‘ozone’ (81 hits)
* Based on Reverb metrics for the last year (80,420 registered users, 700k queries so far this year (11/08/13))
** Based on ECHO Catalog REST API queries and ‘fuzzy’ comparisons for ‘ozone’
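The gap between free-text hits and controlled-vocabulary hits can be illustrated with a toy catalog. This is only a sketch under stated assumptions: the records, field names, and both search functions are invented for illustration and do not reflect how ECHO indexes or matches metadata.

```python
# Toy catalog; every record and field name here is invented for illustration.
RECORDS = [
    {"id": "A", "instrument": "MODIS",
     "summary": "MODIS aerosol optical depth"},
    {"id": "B", "instrument": "AIRS",
     "summary": "AIRS profiles validated against MODIS cloud masks"},
    {"id": "C", "instrument": "OMI",
     "summary": "Total column ozone"},
]


def free_text_search(records, term):
    """Match the term anywhere in any field (case-insensitive substring)."""
    term = term.lower()
    return [r["id"] for r in records
            if any(term in str(v).lower() for v in r.values())]


def field_search(records, field, value):
    """Match only records whose controlled field equals the value."""
    return [r["id"] for r in records
            if r.get(field, "").lower() == value.lower()]


print(free_text_search(RECORDS, "MODIS"))            # records mentioning MODIS anywhere
print(field_search(RECORDS, "instrument", "MODIS"))  # records whose instrument is MODIS
```

Record B mentions MODIS only in its summary, so it appears in the free-text results but not in the fielded results, which is the same kind of mismatch the hit counts above show.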
3. Understanding the API
OpenSearch parameter extension* is good (once we update it)
• Defining expectations of ‘free text’ search. What does ‘air temperature’ mean?
• Defining subset of ‘geometry’ capabilities
* http://www.opensearch.org/Specifications/OpenSearch/Extensions/Parameter
4. Additional functionality
• Result ordering
  • Described in OSDD and implemented in results
• Result ranking
  • As per OpenSearch ‘Relevance’ extension*
  • For free text search results
  • Added to ECHO in Dec. 2013
• Faceted search
* http://www.opensearch.org/Specifications/OpenSearch/Extensions/Relevance/1.0
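Result ranking on the client side can be sketched as reading a per-entry relevance score from an Atom response and sorting on it. This is a hedged sketch: the sample feed and the helper `ranked_entries` are invented, and the `relevance:score` element and its namespace URI are assumptions based on the Relevance extension draft that should be checked against the specification linked above.

```python
import xml.etree.ElementTree as ET

# Namespace URIs; the relevance URI is assumed from the extension draft.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "relevance": "http://a9.com/-/opensearch/extensions/relevance/1.0/",
}

# Invented sample response; the third entry carries no score.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:relevance="http://a9.com/-/opensearch/extensions/relevance/1.0/">
  <entry><id>dataset-1</id><relevance:score>0.42</relevance:score></entry>
  <entry><id>dataset-2</id><relevance:score>0.91</relevance:score></entry>
  <entry><id>dataset-3</id></entry>
</feed>"""


def ranked_entries(feed_xml):
    """Return entry ids ordered by descending relevance score."""
    root = ET.fromstring(feed_xml)
    scored = []
    for entry in root.findall("atom:entry", NS):
        # Entries without a score are treated as score 0 (an assumption).
        score = float(entry.findtext("relevance:score", default="0", namespaces=NS))
        scored.append((score, entry.findtext("atom:id", namespaces=NS)))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry_id for _, entry_id in scored]


print(ranked_entries(FEED))
```

A server that already returns entries in ranked order makes this sort redundant; the sketch only shows where the score would live in the response.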
How do we achieve these goals?
CEOS, ESIP discovery cluster, OGC
Doug Newman, Jérôme Gaspari*, Yves Coene**
* CNES - Centre National d'Etudes Spatiales
** ESA