
Oceanographic Informatics in a Collaborative Environment.

Oceanographic Informatics in a Collaborative Environment. P.H. Wiebe, R.C. Groman, C. Chandler, M.D. Allison, and D. Glover Woods Hole Oceanographic Institution Woods Hole, MA, USA. Data Management Special Session N12: Strategies for Improved Marine and





Presentation Transcript


  1. Oceanographic Informatics in a Collaborative Environment. P.H. Wiebe, R.C. Groman, C. Chandler, M.D. Allison, and D. Glover. Woods Hole Oceanographic Institution, Woods Hole, MA, USA. Data Management Special Session N12: Strategies for Improved Marine and Synergistic Data Access and Interoperability. 19 December 2008, San Francisco, CA.

  2. A Context: Data and information in oceanography in general are expanding at a rapid pace, and there is a significant need for more and better management tools and techniques to preserve and serve them.

  3. Talk Objectives • To discuss current developments and new directions to enable better opportunities for data discovery, integration, and synthesis of oceanographic data regardless of origin. • To encourage comprehensive efforts to establish broadly based and accepted best practices in the quest to obtain new information about ocean physics, chemistry, biology, geology, and geophysics. • To highlight some of the changes I have observed during the past four decades and strongly endorse the New Age that is fast approaching in the way we gather, store, access, and analyze information and data.

  4. A Personal Context I have worked throughout my career as a biological oceanographer on multi-investigator and multi-disciplinary programs and projects. I realized early on that data and information management was an essential element in design, acquisition, and synthesis of data sets in the oceanographic scientific enterprise. But the technology (hardware/software), resources (funding), and mandates were not in place until recently to do it effectively. The effort now is on more than data and information management. It involves what is termed “Data informatics”.

  5. Informatics Defined “Informatics is the science and engineering that occupies the gap between information and communications technology (ICT) systems and cyberinfrastructure (computers, grids, Web services, etc.), and the use of digital data, information, and related services for research and knowledge generation.” From: Baker, D.N., C. E. Barton, W. K. Peterson, and P. Fox. 2008. Informatics and the 2007–2008 Electronic Geophysical Year. Eos. 89(48): 485-486.

  6. Evolution of MOCNESS Data Acquisition [Figure: program panels labeled 1976 CCR Program, 1982 WCR Program, and 1999 GLOBEC Program; computer labels HP2100, CBM 8032, and Windows PC]

  7. Sampling in the Cold-Core Ring Program, 1976-1977. 4 cruises total: PO, bio-process, & mapping.

  8. Sampling in the Warm-Core Ring Program, 1981-1982. Endeavor, Knorr, Oceanus. 15 cruises total: 6 PO, 3 bio-process, 3 bio-mapping, 2 bio-process & mapping.

  9. Sampling in the U.S. GLOBEC Georges Bank Program, 1994-1999. 122 cruises total: 31 broad-scale, 91 process and mooring.

  10. Data Storage • 1970s: Honeywell Sigma 7. Simple file storage plus the Sigma 7 Extended Database Management System; MOCNESS data only; terminal access. • 1980s: Digital VAX 11/780. Flat file storage; all data; terminal access. Micro-computers with floppies and small hard drives. • 1990s: Sun/Unix-Linux servers. GLOBEC Data & Information Management system; project specific; all data; web available. Micro-computers become the mainstay for labs. • 2000s: Unix/Linux servers. BCO-DMO Data & Information Management system; multiple projects; web available.

  11. The Biological and Chemical Oceanography Data Management Office (BCO-DMO) The BCO-DMO was created in late 2006 to serve investigators funded by the NSF Biological and Chemical Oceanography Sections to conduct marine chemical and ecological research. BCO-DMO provides open access to marine biogeochemical and ecological data and information, so that data and information developed in the course of scientific research can easily be disseminated, protected, and stored on short and intermediate time-frames. [www.bco-dmo.org]

  12. Groman’s Theorems Theorem 1: The probability that all the necessary data and information are collected and preserved to allow another researcher to properly use your data is inversely proportional to the time since the data were collected. Corollary: Unless data and information are collected and preserved during the experiment (e.g., cruise), subsequent researchers will have a difficult time using those data. Theorem 2: The longer the time since the data were collected the less likely the data will ever be considered “final” or available. Conclusion: It is essential that data and information management begin with the start of a project or program.

  13. The Informatics Imperative The rise in interdisciplinary oceanography and collaboration in ocean science have been emphasized by Powell (2008) and Briscoe (2008). Powell: “Ocean science has long been interdisciplinary… Today, one can scarcely conceive of an oceanographic question that does not cut across disciplines.” Briscoe: “Ocean science must head toward more collaboration, because many of the research and applications questions we face demand teams of scientists and engineers (and probably social scientists and economists)… Collaboration in the ocean sciences is critical to addressing emerging ocean problems, and is worth the effort.” It will take data informatics to make it possible! Powell, T.M. 2008. The rise of interdisciplinary oceanography. Oceanography. 21(3): 54-57. Briscoe, M.G. 2008. Collaboration in the ocean sciences. Oceanography. 21(3): 58-65.

  14. What has happened to cause a change? • Computers are more powerful and storage is much larger. • Software and software tools to handle data management are now widely available. • More multi-disciplinary research is building on the work of earlier programs, and the earlier data are needed for current and future work. • Programs have policies that require data sharing in reasonable time frames (~2 years). • Program Managers are requiring that data from previous grants be made publicly accessible on the web as a condition of funding the next grant.

  15. Still resistance to sharing data – Why? Structural Impediments • Scientist does not want others to use the data - fear of lost opportunities. • Scientist does not know how to do it. • Other reasons expressed: • I’m not done publishing my papers based on the data. • My graduate student is almost done analyzing the data. • It’s not final yet. • Lack of positive acknowledgment of data shared (give credit on par with papers? Need for DOIs).

  16. Reasons for sharing data There are real advantages to sharing. • A scientist’s data are not nearly as valuable by themselves as they are in the context of all the other data sets collected within a program. • Using others’ data within a program without sharing one’s own is not fair. • Data publishing with author-citable references is coming. Scientists will get credit for putting their data in public repositories.

  17. BASIN: an example of a prospective new program that will require all the data informatics and management techniques possible. [Figure: Data Informatics and the Semantic Web] Web Ontology Language (OWL); Resource Description Framework (RDF); SPARQL Query Language for RDF.
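To make the Semantic Web terms on this slide a little more concrete, here is a minimal sketch in plain Python of the idea behind RDF and SPARQL: facts are stored as subject-predicate-object triples and retrieved by matching patterns against them. This is an illustration only, not the BASIN or BCO-DMO implementation. The `ex:` identifiers, the sample triples, and the helpers `match` and `cruises_in_program_using` are all hypothetical, and a real system would use an RDF library and actual SPARQL syntax rather than this hand-rolled matcher.

```python
# Hypothetical illustration: RDF-style triples held in a plain Python list.
# Every identifier here (ex:cruise1, ex:sampledWith, ...) is invented for the sketch.
TRIPLES = [
    ("ex:cruise1", "ex:partOf", "ex:GLOBEC"),
    ("ex:cruise1", "ex:sampledWith", "ex:MOCNESS"),
    ("ex:cruise2", "ex:partOf", "ex:GLOBEC"),
    ("ex:cruise2", "ex:sampledWith", "ex:CTD"),
    ("ex:cruise3", "ex:partOf", "ex:WCR"),
    ("ex:cruise3", "ex:sampledWith", "ex:MOCNESS"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None is a wildcard,
    playing the role of a variable such as ?s in a SPARQL pattern."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def cruises_in_program_using(triples, program, instrument):
    """Join two patterns on their shared subject, roughly in the spirit of
    SELECT ?c WHERE { ?c ex:partOf program . ?c ex:sampledWith instrument }"""
    in_program = {s for (s, _, _) in match(triples, predicate="ex:partOf", obj=program)}
    with_gear = {s for (s, _, _) in match(triples, predicate="ex:sampledWith", obj=instrument)}
    return sorted(in_program & with_gear)


result = cruises_in_program_using(TRIPLES, "ex:GLOBEC", "ex:MOCNESS")
print(result)
```

The point of the triple form is that data sets from different programs can be queried together as long as they share identifiers for the things they describe, which is the role ontologies (OWL) are meant to play.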

  18. Summary [Figure: Past versus Future diagram; labels FO, EX, and MO appear in both panels, with DM&I added in the Future panel] • Research in oceanography proceeds along three major lines: field observation, field and laboratory experimentation, and modeling. Data management and informatics until now have been an after-thought. • Efforts like ecosystem-based management require the integration of oceanographic, biodiversity, fisheries, and other marine environmental data, as well as the development of analysis and assessment tools. • The exponential increase in data sources and the proliferation and distributed nature of databases have created a fourth new and important line of marine research. Data management and informatics is now on par with the other lines of oceanographic research (Baker et al. 2008). Baker, D.N., C. E. Barton, W. K. Peterson, and P. Fox. 2008. Informatics and the 2007–2008 Electronic Geophysical Year. Eos. 89(48): 485-486.

  19. Summary • Research priorities include: more rapid and efficient data acquisition; enhanced data management; more effective data utilization and reuse; improved data visualization; and development of ontologies. • The ultimate goal is to create a cyberinfrastructure for oceanography that enables open, transparent, interoperable access to data and information, regardless of their location.

  20. Acknowledgments Thanks to: • Charlton Galvarino for his excellent skill in implementing the MapServer interface. • Huan-Xiang Xu for his help during the metadata database design and in the initial loading of the database. • Xiaoyan Ye for her help in the initial attempts to develop comprehensive search options and geospatial displays of all the data, and for updating software to take advantage of the new database. • Julie Allen for her extensive help and support in implementing our BCO-DMO web site using Drupal and in using ColdFusion to provide web access to the database. • The National Science Foundation supported our work under grant numbers OCE-0646353 and ANT-0440777.
