Building a National Science Digital Library. Dean Krafft, Cornell University, dean@cs.cornell.edu
What is the NSDL? • An NSF-funded $20 million/year program in Science, Technology, Engineering and Mathematics (STEM) education • A digital library describing over a million carefully selected online STEM resources from over 100 collections (at http://nsdl.org) • A core integration team (Cornell, UCAR, Columbia) working with 9 “pathways” portals and over 200 NSF grantees • A large community of researchers, librarians, content providers, developers, students, and teachers
NSDL Pathways Projects • Middle School Portal for Math and Science: Ohio State • Applied Mathematics and Science Education Repository (AMSER): U of Wisconsin-Madison • The Computational Science Education Reference Desk (CSERD): Shodor Education Foundation • The Math Gateway: Mathematical Association of America • Teachers’ Domain Pathways to Science: Rich Media Resources for K-12 Teachers: WGBH - Boston • BioSciEdNet (BEN) Pathway: AAAS, et al. • ComPADRE Pathway: American Physical Society, et al. • A Comprehensive Pathway for K-Gray Engineering Education: NEEDS Coalition, UC Berkeley, et al. • Materials Digital Library (MatDL): Kent State, et al.
NSDL Publisher Partnerships • American Mathematical Society • American Physical Society • BioOne • Blackwell Publishing • Cambridge University Press • Elsevier Books • Houghton Mifflin Company • John Wiley and Sons • National Academy Press • Nature Publishing Group • Oxford University Press ― US Book Program • Scientific American • Tom Snyder Productions ― division of Scholastic • Tool Factory ― educational software
NSDL History • 1996-1999: Papers and workshops on creating a national STEM education digital library • Fall 2000: 6 Core Integration Pilots funded; 13 collection & 9 services grants • Fall 2001: Unified CI funded; 18 collection & 14 services grants • December 2002: NSDL.org launched; 35 collection & 11 services grants • Fall 2003: 22 collections & 11 services • Fall 2004: First 4 Pathways grants
NSDL 1.0 • Create a “union catalog” of Dublin Core metadata records for STEM resources • Harvest those records from collections using OAI-PMH (openarchives.org) • Store records in an Oracle DB and re-serve qualified DC through OAI-PMH • Build a search index using metadata plus full-text of available content pages • Create a web portal at nsdl.org for K-gray access to NSDL resources
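The harvest step above can be sketched in Python. This is a minimal illustration, not the NSDL implementation: it builds an OAI-PMH `ListRecords` request URL for unqualified Dublin Core (`oai_dc`) and parses a response into plain dictionaries. The function names, the `example.org` base URL, and the sample records are assumptions made up for the example; the verb, parameter names, and XML namespaces come from the OAI-PMH 2.0 and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and the Dublin Core element set.
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, from_date=None, set_spec=None):
    """Build a ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if from_date:
        params["from"] = from_date  # enables incremental harvesting
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Turn a ListRecords response into a list of dictionaries."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI + "record"):
        header = rec.find(OAI + "header")
        entry = {
            "identifier": header.findtext(OAI + "identifier"),
            "datestamp": header.findtext(OAI + "datestamp"),
            # Deleted records carry only a header with status="deleted".
            "deleted": header.get("status") == "deleted",
            "title": None,
            "subjects": [],
        }
        metadata = rec.find(OAI + "metadata")
        if metadata is not None:
            entry["title"] = next(
                (e.text for e in metadata.iter(DC + "title")), None
            )
            entry["subjects"] = [e.text for e in metadata.iter(DC + "subject")]
        records.append(entry)
    return records


# Made-up sample response for illustration only.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2006-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hurricane Basics</dc:title>
          <dc:subject>Earth science</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:2</identifier>
        <datestamp>2006-01-16</datestamp>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai", from_date="2006-01-01"))
    for record in parse_records(SAMPLE):
        print(record)
```

A real harvester would also follow `resumptionToken` elements to page through large result sets and would fetch the URL over HTTP; both are left out here to keep the sketch self-contained.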
NSDL 1.0 Lessons • Rather than one portal for everyone, support communities with common interests: Pathways now provide discipline and area-specific portals • Metadata is expensive: unlike traditional library catalogs (e.g. records shared through OCLC), digital collections have very “mixed quality” metadata, with unusual and inconsistent coding • On the good side: Oracle DB and OAI-PMH server scaled successfully to over 1 million catalog records
NSDL 1.0 Lessons continued • OAI-provided collections need 3 types of expertise: domain (resources & pedagogy), metadata (vocabulary & formatting), and technical (XML schema, UTF8, HTTP, OAI-PMH) • In many cases it took several months from first contact to successful OAI harvest, and the average harvest failure rate has stayed at 25%-50%, with only 23% of those failures being transient • Incremental harvesting is fundamental to efficient processing, but problematic: issues with persisting deleted records and recovering from partial harvests • Result: some automation, but high people cost
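The two incremental-harvesting problems named above (persisting deleted records and recovering from partial harvests) can be illustrated with a small sketch. It is one possible approach under stated assumptions, not a description of how NSDL handled them: changes are staged in a copy of the local store, deletions are kept as tombstones, and the checkpoint date only advances when the final batch arrives without a resumption token. The function name `apply_harvest` and the record layout are hypothetical.

```python
def apply_harvest(store, checkpoint, batches):
    """Apply one incremental harvest to a local record store.

    store:      dict mapping identifier -> record dict
    checkpoint: datestamp of the last complete harvest, or None
    batches:    iterable of (records, resumption_token) pairs, in order;
                a token of None marks the final batch

    Returns (new_store, new_checkpoint). If the harvest stops before the
    final batch, the original store and checkpoint are returned unchanged,
    so the next run repeats the same date range.
    """
    staged = dict(store)  # work on a copy; commit only on success
    newest = checkpoint
    complete = False
    for records, resumption_token in batches:
        for rec in records:
            if rec["deleted"]:
                # Keep a tombstone so the deletion can be re-served later.
                staged[rec["identifier"]] = {
                    "deleted": True,
                    "datestamp": rec["datestamp"],
                }
            else:
                staged[rec["identifier"]] = dict(rec)
            # ISO 8601 datestamps of equal granularity sort as strings.
            if newest is None or rec["datestamp"] > newest:
                newest = rec["datestamp"]
        complete = resumption_token is None
    if not complete:
        return store, checkpoint  # partial harvest: discard staged changes
    return staged, newest
```

Staging everything in memory is only workable for small collections; at the scale of a million records, the same idea would more plausibly be expressed as a database transaction.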
NSDL 1.0 Summary • Metadata Repository was quick to implement using known technologies, but • Limited model • Metadata-centric orientation • No content – only metadata • Limited relationships – collection/item • Limits on context, structure, and access • Severe limits on contribution and collaboration • One-way data flow: NSDL → Users
Going beyond the card catalog • Create an NSDL that guides not just resource discovery, but resource selection, use, and contribution • Supports creating “context” for resources • Presents resources in context: in a lesson plan; with ratings; correlated with education standards • Supports creating a permanent archive of resources • Enables community tools for structuring, evaluation, annotation, contribution, collaboration • Goal: Create a dynamic, living library
NSDL 2.0: NSDL Data Repository • Goals: • Architecture of participation: service-based, not a monolithic application/single user experience • Remixable data sources and data transformations • Harnessing (and capturing) collective intelligence • A free market of millions of inter-related resources (create the “long tail”) • Two-way data flow: NSDL ↔ users • Solution: Fedora-based NSDL Data Repository
Fedora: the NDR middleware • A Flexible, Extensible Digital Object Repository Architecture • Open source project with $2.2 million in Mellon funding 2002-2007 • Collaboration of Cornell and Univ. of Virginia • Key funded users include: • eSciDoc project (collaboration of the Max Planck Society and FIZ Karlsruhe) • VTLS Corp., Harris Corp., Library of Congress • Australian Research Repositories Online to the World (ARROW) • Royal Library Denmark, National Library, and DTU
What is Fedora? • An architecture, toolkit, and implementation: middleware, not a vertical application • DSpace in contrast: a vertical application with a fixed workflow targeted at users • Stores arbitrary internal and external digital objects, disseminations (transformations and combinations), relationships among objects • Entirely SOAP/REST based, disseminations are URLs • XML data store; RDBMS cache; RDF triplestore supports relationship queries
Implementing the NDR with Fedora • Multiple Object Types: Resources (with local or remote content), Metadata, Aggregations (collections), Metadata Providers (branding), Agents, and Relationships: Structural (part of), Equivalence, Annotation, with arbitrary graph queries • Web services: disseminations are arbitrary recombinations and transformations of content • Authentication/Authorization: Collections and services can manage their own repository content • Network overlay architecture: A lens for viewing science content on the net, whether content is local, remote, or archived – it all has a repository-based URL
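To make the object-and-relationship model above more concrete, here is a toy in-memory triple store in Python with one relationship query. It only illustrates the idea of querying a graph of resources, aggregations, and annotations; it does not use Fedora's API or its RDF triplestore, and the class, the predicate names (`memberOf`, `annotates`), and the identifiers are all assumptions invented for the example.

```python
class TripleStore:
    """Minimal subject-predicate-object store (illustrative only)."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


def annotated_members(store, aggregation):
    """Map each resource in an aggregation to the annotations about it."""
    members = {s for s, _, _ in store.match(predicate="memberOf", obj=aggregation)}
    return {
        resource: [s for s, _, _ in store.match(predicate="annotates", obj=resource)]
        for resource in sorted(members)
    }


# Hypothetical identifiers, loosely based on the examples in these slides.
ndr = TripleStore()
ndr.add("resource:hurricane-hunters", "memberOf", "aggregation:earth-science")
ndr.add("resource:usgs-flooding", "memberOf", "aggregation:earth-science")
ndr.add("resource:photobiology", "memberOf", "aggregation:biology")
ndr.add("annotation:blog-entry-1", "annotates", "resource:usgs-flooding")

if __name__ == "__main__":
    print(annotated_members(ndr, "aggregation:earth-science"))
```

The query joins two relationship types (membership and annotation), which is the kind of question a flat metadata catalog of collection/item records cannot answer directly.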
Network Overlay View [Diagram labels: User View; API/UI; Repository View with Relations & Annotations; Resources on the Web]
How should we use the NDR? • The NDR provides powerful capabilities for: • Creating context around resources • Enabling the NSDL community to directly contribute resources and context • Representing a web of relationships among science resources and information about those resources • How do we use it? Here’s one specific example …
Issues in STEM Education • Issue: Need to support scientific inquiry • Issue: Students need a better understanding of the processes of scientific research • Issue: Teachers are often under-prepared to teach science and math • Issue: Scientists need tools to make science and math research more available
Addressing the Needs • In Response: NSDL is building an educational tool that… • Models scientific inquiry and exposes the processes of scientific research • Promotes and facilitates conversations between research and education communities • Brings content expertise into the classroom to support under-prepared teachers • Allows scientists, teachers, and media specialists to collaboratively develop instructional context around NSDL resources
What is Expert Voices? • A system using blogging technology to: • Support STEM conversations among scientists, teachers and students • Tie NSDL resources to real-world science news • Create context for resources to enhance discovery, selection and use • Enable NSDL community members to become NSDL contributors: of resources, questions, reviews, annotations, and metadata • Expert Voices ≠ LiveJournal • Contributors are carefully selected, contributions are about science, the process of science, and education
Expert Voices As An Educational Tool • Topic-based discussion (e.g. tsunamis) with pointers to related resources • Research outreach (Criterion 2) – explaining and documenting NSF-funded research • Experts can add resources with topical context to the NSDL • Resources can be reviewed and annotated • Question/answer and discussion forum: scientist ↔ teacher ↔ student ↔ librarian
Broadening Participation: An Expert Voices Learning Scenario • “Hurricane Season Blog” run by a National Weather Service hurricane expert, an Earth Science teacher, and a school media specialist familiar with NSDL resources • Expert creates an entry for Hurricane Gertrude • “On track to hit Ft. Lauderdale in 72 hours” • “Currently undergoing eyewall replacement cycle” • “Expecting 15 foot storm surge” • Media specialist adds links to NSDL resources: Hurricane Hunters site, latest satellite photos, and USGS flooding and flood plain site (storm surge context) • Teacher makes connections to relevant standards and appropriate pedagogy for use by other teachers • Students experience engaging real-time, real-world applications of science lessons
Broadening Participation: An Expert Voices Outreach Scenario • NSF grantee: Bioluminescence researcher wants to make research K-12 accessible • Creates an Expert Voices conversation • Enables his students and researchers to document process and results – how science really works • Writes about publications and educational resources (e.g. www.photobiology.info) • Adds these to the NSDL, creating audience-level metadata • Entries serve as annotations that create K-12 context for the college-level research
Other applications in development • Educational Standards integration with Content Alignment Tool (Syracuse) and ASN standards database (JES & Co.) • OnRamp: an NDR-integrated multi-user, multi-project content management system • Instructional Architect: Create a lesson plan around NSDL resources (Utah State) • iVia-based Expert-Guided crawl: Tool for Pathways and others to turn websites into resource collections (UC Riverside) • MyNSDL: Bookmark and tag STEM education resources within and outside the NSDL
What does this mean for the user? • All these applications situate resources in context, aiding both discovery and use • Users become contributors, adding new resources, ratings, annotations, and organizational structure – frequently as a side effect of using the library • Specialized portals, tagging, and powerful relationship queries and filtering support user-specific “views” into the library
Summary • NSDL 1.0 created a large, production digital library of STEM resources for education. • NSDL 2.0 and its tools allow scientists, mathematicians, teachers, engineers, librarians, and students to create a unique web of context, contribution, and collaboration around the high-quality STEM education resources at the core of the NSDL.
Acknowledgements • NSDL NSF Program Officers • Lee Zia • David McArthur • NSDL Core Integration Team • UCAR: Kaye Howe, PI and Executive Director • Cornell: Dean Krafft, PI • Columbia: Kate Wittenberg, PI • Fedora Development Team • Cornell: Sandy Payette & Carl Lagoze • Univ. of Virginia: Thornton Staples
This work is licensed under the Creative Commons Attribution-NoDerivs 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/2.5/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.