Walt Warnick, Ph.D.
Director, Office of Scientific and Technical Information
U.S. Department of Energy
STI & Innovation: A Mission with a Vision
Annual STIP Meeting, April 26-27, 2006
Welcome back … and welcome, Mary Case. A big thanks to Kathy Macal and Argonne. Name this site!
Our Mission
To advance science and sustain technological creativity by making R&D findings available and useful to DOE researchers and the American people.
Two ways we fulfill our mission:
► STI
► Innovation
STI
Collect and preserve DOE-funded R&D results in a variety of document formats, housed in a central repository of physical documents or virtual inventory of electronic publications.
Energy Policy Act of 2005: "The Secretary, through the Office of Scientific and Technical Information, shall maintain within the Department publicly available collections of scientific and technical information resulting from research, development, demonstration, and commercial applications activities supported by the Department."
Exciting Times for Basic R&D and DOE Labs
• "Sustained scientific advancement and innovation are key to maintaining our competitive edge." President's American Competitiveness Initiative, February 2006
• DOE National Laboratories are key to the President's American Competitiveness Initiative
• The environment in Washington is vibrant with renewed and robust emphasis on basic research
• We value our partnership
Science Education
• DOE and its national laboratories have the best authentic science information in the Nation
• It's time to create a fully searchable, organized "gateway" to DOE's educational resources, stratified along logical educational levels
"To prepare our citizens to compete more effectively in the global marketplace, the American Competitiveness Initiative proposes $380 million in new Federal support to improve the quality of math, science, and technological education in our K-12 schools …"
"The American Competitiveness Initiative commits $5.9 billion in FY 2007, and more than $136 billion over 10 years, to increase investments in research and development, entrepreneurship and innovation."
Courtesy White House press releases
STIAB & STI Policy
► STIAB Charter: "Scientific knowledge embodied in textual and numeric data produced across the DOE complex is appropriately disseminated and preserved."
► A new draft STI Policy addresses data as scientific information and calls for data sharing, preservation, and access for re-use.
► This effort is being led by OSTI.
Agency Efforts
► NSF/NSB: Long-Lived Data Collection
► NIH: Journal articles
► DOE: STIAB
Digital Data Interagency Working Group
► Representing the Department on the Interagency Working Group on Digital Data, newly formed by the Committee on Science of the National Science and Technology Council.
► The working group is to develop and implement a roadmap for the federal government and coordinate policies and programs to help ensure preservation of and access to data.
(Draft Charter is under review)
Data-Driven Discovery
[Figure: data collections by size, spread across disciplines]
Collections: Japanese Art Images: 70.6 GB; TeraBridge: 800 GB; JCSG/SLAC: 15.7 TB; NVO: 93 TB; SCEC: 153 TB; Projected LHC Data: 10 PB/year
Disciplines: Life Sciences, Astronomy, Arts and Humanities, Engineering, Geosciences, Physics
Courtesy of Fran Berman, UCSD
10 DOE Data Centers & scores of other collections
• ARM Program Data Management Facility, PNNL
• BABAR Database, SLAC
• Collider Detector at Fermilab (CDF) Run Results
• ARM Program's External Data Center, BNL
• Brahms Event Data, BNL
• Carbon Dioxide Information Analysis Center, ORNL
• Thermodynamics data at Combustion Research Facility, SNL
• Controlled Fusion Atomic Data Center, ORNL
• Gammasphere Data, LBNL, ANL, ORNL
• Distributed Active Archive Center for Biogeochemical Dynamics, ORNL
• IPCC Climate Change Data, LLNL
• Joint Genome Institute, LBNL
• Homo sapiens protein database files, PNNL
• Measurement and Instrumentation Data Center, NREL
• Wind data for the Wind Powering America Program, INL
• National Nuclear Data Center, BNL
• Photographic Information eXchange, NREL
• Radiation Safety Information Computational Center, ORNL
• U.S. Life-Cycle Inventory database, Golden Field Office
• Renewable Resource Data Center, NREL
Agency's data management and re-use issues are as simple yet complicated as:
► Collection and preservation: What is going to be saved and for how long?
► Long-term care: Funding source for preservation after projects terminate.
► Identification and access: Documentation of the data and data collection processes.
► Science discovery: Significance of a data set or collection.
Innovation
Provide access to expanded sources of R&D information to the DOE research community and science-attentive public, using innovative tools such as dynamic databases, federated deep Web searching, and relevancy ranking, to advance awareness of a broad array of scientific information related to DOE missions.
Areas of active innovation include:
► Federation of distributed collections with simultaneous, ranked, full-text search
► Modeling scientific exchange in the research process
► Grid-based and other distributed computer processing techniques to support federation of collections
► Integration of numeric data and text
► Expanding digital access to legacy documents
OSTI is aggressively seeking ways to increase access and reduce the burden on labs.
A Vision for the Future: Accelerating Science Knowledge Diffusion
Science depends on the diffusion of knowledge. Isaac Newton expressed this thought most eloquently in 1676, when he wrote: "If I have seen further than others, it is by standing on the shoulders of giants."
OSTI Corollary: Accelerating science knowledge diffusion accelerates science progress
Often the knowledge scientists need resides in distant communities: Genomics, Spectroscopy, Materials Science, Condensed Matter Physics, Plasma Physics, Nanoscience, Oceanography, Cryogenics, Seismology, Cosmology, Cybernetics, Cognitive Neuroscience, Astrodynamics, Agricultural Engineering, Radiology, Acoustics, Geology, Meteorology, Limnology, Stereochemistry, Bioinformatics, Immunology
Diffusion from one area of discovery to remote communities may take months to years!
Metcalfe's Law … states that the value of a network is approximately proportional to the square of the number of users of the system.
OSTI's Law: The value of the network increases as the speed/relevancy of retrieval increases.
Courtesy Wikipedia
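As a worked illustration of Metcalfe's Law only (the constant k and the function V are notation assumed here, not taken from the slides), the quadratic relationship means that doubling the number of users roughly quadruples the value of the network:

```latex
V(n) \approx k\,n^{2}
\qquad\Longrightarrow\qquad
\frac{V(2n)}{V(n)} \approx \frac{k\,(2n)^{2}}{k\,n^{2}} = 4
```

OSTI's Law is stated qualitatively on the slide, so no formula is attempted for it here.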
A New Era of Innovation Accelerating the sharing of knowledge means science progresses faster, compressing decades to years, years to months, and months to days.
Metasearch
Searching multiple deep Web databases via a single query
• First government deployment made by OSTI in 1999 in EnergyFiles
• Patrons often overwhelmed by too many hits
• Introduced relevancy ranking of metasearch results in 2004 in Science.gov
• Goal: to approach the ultimate in precision searching
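The idea of one query fanned out to several databases, with the merged hits ranked by relevancy, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how EnergyFiles or Science.gov is implemented: the sources are hypothetical in-memory lists, the names `SOURCES`, `search_source` and `metasearch` are invented for the example, and the relevancy score is a simple count of query-term occurrences in each title.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for separate deep Web databases (assumption:
# real sources would be remote search interfaces, not in-memory lists).
SOURCES = {
    "source_a": ["Plasma physics in fusion reactors",
                 "Wind data for turbines"],
    "source_b": ["Fusion plasma confinement and plasma heating",
                 "Genome sequencing methods"],
}


def search_source(name, records, query):
    """Search one source; return (score, source, title) for each match."""
    terms = query.lower().split()
    hits = []
    for title in records:
        words = title.lower().split()
        # Toy relevancy score: total occurrences of the query terms.
        score = sum(words.count(term) for term in terms)
        if score > 0:
            hits.append((score, name, title))
    return hits


def metasearch(query, sources=SOURCES):
    """Send one query to every source concurrently, then merge and rank."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(search_source, name, records, query)
                   for name, records in sources.items()]
        merged = [hit for future in futures for hit in future.result()]
    # Highest score first; ties broken by source and title for stable output.
    return sorted(merged, key=lambda hit: (-hit[0], hit[1], hit[2]))


if __name__ == "__main__":
    for score, source, title in metasearch("fusion plasma"):
        print(score, source, title)
```

Ranking after the merge, rather than showing each source's hits separately, is what addresses the "too many hits" problem noted above: the patron sees a single list ordered by relevancy regardless of which database a record came from.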
The Deep Web Is HUGE! Challenge: Metasearch has scaling issues
Current systems provide examples for global discovery
Current working models for global discovery:
• Science.gov (www.science.gov)
• E-print Network (www.osti.gov/eprints)
• Science Conferences (www.osti.gov/scienceconferences)
Global discovery benefits scientists as they conduct deep Web research across all scientific communities simultaneously.
Deep Web
Only small parts of the deep Web are metasearchable: Science.gov, E-print Network, Science Conferences.
Thanks to you (and OSTI), a significant portion of databases are from DOE.
A global discovery gateway would advance science
Goal: Make all of science available and accessible in one place
Process: Expand content, enhance precision searching, and amplify computing power
Means: Deploy Innovations in Scientific Knowledge and Advancement (ISKA) to form a Global Discovery gateway, which would aggregate, search, and rank all of the important, Web-accessible science databases
We stand on the rim of a new era of global discovery
#1 Expand Content
• Increase searchable content within databases
• Incorporate databases from all scientific communities
• Beyond text: numeric data, audio, video, etc.
#2 Enhance Precision Searching
• Integrate analytical tools
• Improve sophistication and speed of relevancy ranking
• Next-generation algorithms
• Visualization techniques
#3 Amplify Computing Power
• Deploy emerging technologies to enable extraction of ever-increasing content
• Increase computer power, storage capacity
• Special architecture, grid technology
"The calculus of innovation is really quite simple:
• Knowledge drives innovation;
• Innovation drives productivity;
• Productivity drives our economic growth.
• That's all there is to it."
- William R. Brody, President, The Johns Hopkins University; Member, Council on Competitiveness
U.S. Competitiveness: The Innovation Challenge, Testimony to the House Committee on Science, July 21, 2005