Linked Library Data: Tuning Library Metadata for the [Semantic] Web. Presented 2011-03-16, ALCTS RDA Webinar Series. Corey A Harper
Topical Overview • Semantic Web & RDF Intro • Linked Open Data • [Linked] Library Data • Resource Description and Access (RDA) • Beyond MARC • As RDF Vocabularies • Broader Interoperability • Small steps forward… Harper - Linked Library Data - RDA Webinar Series Hosted by the Association for Library Collections and Technical Services
Semantic Web • Tim Berners-Lee’s (TBL’s) original vision • “Weaving the Web” – 1999 • Then: Focus on Machine Reasoning • Scientific American Article • Now: Focus on things & links • Reasoning & Inferencing less central
Semantic Web • Originally: • Metadata standard built on XML • Metadata about “Web” things (documents) • Eventually: • Metadata about all sorts of things • And about relationships between things • What are the “things”?
Semantic Web Terminology • Resource: Any “thing” • Class: Abstraction of a type of thing • Individual: An instance of a class • Property: An attribute of an individual • Statement/Triple: • A Resource (subject) • A Property (predicate / verb) • A Value (object) - Nodes • Graph: Visual Representation of statements • Ontology: A domain-specific collection of classes and properties
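The terminology above can be made concrete with a minimal sketch. It models a graph as a plain set of subject/predicate/object triples. The example.org URIs and the `properties_of` helper are hypothetical placeholders chosen for illustration, not part of any library vocabulary.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # a resource, named by a URI
    predicate: str  # a property, named by a URI
    obj: str        # a value: another resource or a literal string

EX = "http://example.org/"  # hypothetical namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# A graph is simply a collection of statements.
graph = {
    Triple(EX + "book/1", RDF_TYPE, EX + "Book"),           # individual -> class
    Triple(EX + "book/1", EX + "author", EX + "person/1"),  # thing -> thing
    Triple(EX + "person/1", EX + "name", "Jane Doe"),       # thing -> string
}

def properties_of(graph, subject):
    """Collect {predicate: [objects]} for every statement about one subject."""
    result = {}
    for t in sorted(graph):
        if t.subject == subject:
            result.setdefault(t.predicate, []).append(t.obj)
    return result

book = properties_of(graph, EX + "book/1")
```

Here `book` maps each property of the book to its values; the author value is itself a resource that has statements of its own.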
Semantic Web Terminology • Nodes: The Subjects and Objects in a Graph • Arcs: The Predicates in a Graph • Domains and Ranges: Constraints on Nodes • Domain: What things can be subjects • Range: What things (or strings) can be objects • Literals: Values as strings rather than things • Named Graphs: Graphs with URIs treated as nodes.
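A rough sketch of domains and ranges in the "constraint" sense used above: each property states what kind of node may appear as its subject and as its object. All names (`ex:author`, `SCHEMA`, `violations`) are hypothetical. Note that RDF Schema itself uses domain and range to infer the types of nodes rather than to reject statements, so this is an illustration of the idea, not of RDFS semantics.

```python
LITERAL = "Literal"

# Hypothetical schema: property -> allowed subject class (domain) and object class (range).
SCHEMA = {
    "ex:author": {"domain": "ex:Book", "range": "ex:Person"},
    "ex:title":  {"domain": "ex:Book", "range": LITERAL},
}

# Known class membership; any node not listed here is treated as a literal string.
TYPES = {"ex:book1": "ex:Book", "ex:person1": "ex:Person"}

def kind(node):
    """Return the class of a node, or LITERAL for plain strings."""
    return TYPES.get(node, LITERAL)

def violations(triples):
    """List statements whose subject or object falls outside the property's domain/range."""
    bad = []
    for s, p, o in triples:
        rule = SCHEMA.get(p)
        if rule is None:
            continue  # no declared domain/range: nothing to check
        if kind(s) != rule["domain"] or kind(o) != rule["range"]:
            bad.append((s, p, o))
    return bad

triples = [
    ("ex:book1", "ex:author", "ex:person1"),       # thing as object: fits the range
    ("ex:book1", "ex:title", "Weaving the Web"),   # literal where a literal is expected
    ("ex:book1", "ex:author", "Tim Berners-Lee"),  # literal where a Person is expected
]
problems = violations(triples)
```

Only the last statement is flagged, because its object is a string rather than a thing.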
Linked Open Data • Use URIs as names for things • Use HTTP URIs so that people can look up those names. • When someone looks up a URI, provide useful information. • Include links to other URIs, so that they can discover more things. http://www.w3.org/DesignIssues/LinkedData.html
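One common way to "look up" such a name is an HTTP request that asks for an RDF serialization through the Accept header. The sketch below only builds the request and does not send it. The URI is a hypothetical placeholder, `build_lookup` is an illustrative helper, and the choice of media types is an assumption about what a given server might offer.

```python
from urllib.request import Request

# Media types for two RDF serializations, with a preference order.
RDF_TYPES = "text/turtle, application/rdf+xml;q=0.9"

def build_lookup(uri):
    """Prepare an HTTP lookup for a thing's URI, asking for RDF rather than HTML."""
    if not uri.startswith(("http://", "https://")):
        raise ValueError("Linked Data names should be HTTP URIs")
    return Request(uri, headers={"Accept": RDF_TYPES})

req = build_lookup("http://example.org/id/person/1")  # hypothetical URI
# urllib.request.urlopen(req) would perform the actual lookup.
```

A non-HTTP identifier such as `urn:isbn:...` is rejected here because it names a thing but cannot be looked up on the Web.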
Data in the Cloud • Hubs in the May 2008 Version: • FOAF • DBpedia • Myriad Sources coming online: • Thomson Reuters • New York Times • British Broadcasting Corporation • Government Data (UK, US and more) • Google and Facebook • More and More Library, Archive and Museum Data • Geonames • MusicBrainz
DBpedia • Structured Wikipedia Data • Genres, Influences, External Links • Multi-lingual / Multi-script labels • Rich Semantics • Many linkages to other datasets
DBpedia Model • Partial basis in data entry conventions • Infoboxes and Infobox Templates • Metadata Entry Format • Partial source of Ontology • Class Structure • Vocabulary Design
DBpedia • 3.4 Million “things” described • Ontology based on “infoboxes” • 1.5 million things classified • http://wiki.dbpedia.org/Ontology • Approx. 50,000 “Properties” • Approx. 1,200 defined in ontology
What *things* are in our data???
…Library data is extremely complicated
Library Metadata • Rich stores of MARC, MODS, &c. • Robust Controlled Vocabularies • Subject Heading lists • Code lists • Thesauri • Emerging data model in FR*
Bibliographic Vocabs • Bibliographic Ontology (Bibo) • Zotero, Omeka, EPrints and Others • FRBR – unofficial • And now Official (Thank you IFLA!) • ISBD • Resource Description and Access (RDA)
Linked Library [Archive, Museum] Data • LIBRIS (Swedish Union Catalog) • Library of Congress (LCSH, OSI) • German National Library • Hungarian National Library • British Library • Europeana • Archives Hub & LOCAH
Library Authority Data • “Include links to other URIs, so that they can discover more things.” • Short of providing and linking to URIs, this *is* authority data. This is what our authority files are for.
Library Controlled Vocabularies: Benefits • Reputation - Trusted Tradition • Mature - Time tested and carefully developed • General & Comprehensive - Cover large knowledge spaces
SKOS • Simple Knowledge Organization System • Properties and Classes for describing Controlled Vocabularies • Heavily used in Linked Library Data • id.loc.gov • Virtual International Authority File (VIAF) [Diagram labels: skos:primaryTopic, bibo:book, skos:subject]
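A minimal sketch of how a controlled vocabulary entry might look when described with SKOS properties. The `skos:prefLabel`, `skos:altLabel` and `skos:broader` properties belong to the SKOS vocabulary; the concept URIs under example.org, the labels, and the `broader_chain` helper are hypothetical and do not reflect any actual id.loc.gov or VIAF data.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/subjects/"  # hypothetical concept scheme

# Each concept is a URI with a preferred label, optional variant labels,
# and optional links to broader concepts.
concepts = {
    EX + "cataloging": {
        SKOS + "prefLabel": ["Cataloging"],
        SKOS + "altLabel": ["Cataloguing"],
        SKOS + "broader": [EX + "library-science"],
    },
    EX + "library-science": {
        SKOS + "prefLabel": ["Library science"],
    },
}

def broader_chain(uri):
    """Follow skos:broader links upward and return the preferred labels met on the way."""
    labels = []
    seen = set()  # guards against accidental cycles in the hierarchy
    while uri in concepts and uri not in seen:
        seen.add(uri)
        labels.append(concepts[uri][SKOS + "prefLabel"][0])
        parents = concepts[uri].get(SKOS + "broader", [])
        uri = parents[0] if parents else None
    return labels

path = broader_chain(EX + "cataloging")
```

The variant spelling lives on the same concept URI as the preferred label, which is roughly what a "see" reference in an authority record expresses.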
Other Vocabularies • Thesaurus for Economics • French Subject Headings • Swedish Subject Headings • IconClass (not on web yet) • OCLC Terminology Services • Dewey Decimal Classification • Virtual International Authority File • Metadata Authority Description Schema (MADS)
Resource Description and Access • Current focus on MARC • Much criticism • Within MARC, not a tremendous change • Different problems outside of MARC • Possible focus outside of MARC • RDA as realization of FRBR • RDA as Metadata Vocabularies • RDA as related to Bibo
RDA as Metadata Vocabularies • RDA elements, roles and vocabularies have been provisionally registered • IFLA FRBRer and ISBD elements and vocabularies have been officially registered • Discussions about long-term maintenance of both RDA and the vocabularies • Effort to create multi-language RDA Vocabularies (Slide adapted from Diane Hillmann)
Metadata Registries • Formerly NSDL Registry • Now “Open Metadata Registry” • Managing Vocabularies • Providing Vocabulary Services • RDA – Now adding translations • IFLA Work • FRBR, FRAD, FRSAD, ISBD
RDA as realization of FRBR • What will this look like? • Probably *won’t* be stored in MARC • Overly constrained by FRBR? • Properties have FRBR domains & ranges • Unofficial “Generalized” properties • Non-FRBR metadata • Similar to DCMI’s range constraints…
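One way to picture the relationship between FRBR-constrained properties and "generalized" ones is a mapping from each constrained property to an unconstrained counterpart, so that data authored against the FRBR model can also be read by consumers who do not use it. This is a sketch under that assumption: the property names, the mapping and the `with_generalized` helper are hypothetical, and are not the registered RDA element URIs or the registry's actual mechanism.

```python
# Hypothetical mapping: FRBR-constrained property -> unconstrained "generalized" property.
GENERALIZES_TO = {
    "exrda:titleOfTheWork": "exgen:title",
    "exrda:creatorOfWork": "exgen:creator",
}

def with_generalized(triples):
    """Return the original statements plus a generalized copy of each mapped statement."""
    result = list(triples)
    for s, p, o in triples:
        general = GENERALIZES_TO.get(p)
        if general is not None and (s, general, o) not in result:
            result.append((s, general, o))
    return result

frbr_data = [("ex:work1", "exrda:titleOfTheWork", "Weaving the Web")]
open_data = with_generalized(frbr_data)
```

The constrained statement is kept, so nothing is lost for FRBR-aware applications; the extra statement simply carries no commitment about Work, Expression, Manifestation or Item.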
Support Free Range Metadata! Photo Credit: http://www.flickr.com/photos/ciwf/3217378769/
BIBO and RDAVocab • Open question re: alignment • Simplified view of Bib Data is useful • Interlinking with more general data • Interlinking with non-library domain data • FRBR as internal model for library domain • Examples
Why Does This Matter? • Our descriptions no longer stand alone! • Connect our data with the rest of the Web • Allow others to reuse more easily • FOAF, Geonames • DBpedia • MusicBrainz • New York Times, Thomson Reuters • Government Data - data.gov • British Broadcasting Corporation • Other Library, Archive and Museum Data
Conclusions • Distributed bibliographic control environment • Linking Data • Focus on identification over description • “In short, by treating values as non-literal resources and assigning URIs to them we give ourselves (and others) the hooks on which to hang further descriptions.” - Andy Powell
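The point of the quotation can be sketched in a few lines: a value recorded as a string is a dead end, while a value identified by a URI can be the subject of further statements. The `Literal` class, the `ex:` names and the `further_descriptions` helper are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Literal:
    """A plain string value, as opposed to a thing named by a URI."""
    value: str

triples = [
    ("ex:book1", "ex:creator", Literal("Doe, Jane")),  # description: a string value
    ("ex:book2", "ex:creator", "ex:person1"),          # identification: a URI value
    ("ex:person1", "ex:name", Literal("Doe, Jane")),
    ("ex:person1", "ex:birthPlace", "ex:place1"),
]

def further_descriptions(triples, value):
    """Return statements whose subject is the given value; a literal can never be a subject."""
    if isinstance(value, Literal):
        return []
    return [t for t in triples if t[0] == value]

from_string = further_descriptions(triples, Literal("Doe, Jane"))
from_uri = further_descriptions(triples, "ex:person1")
```

Following the creator of `ex:book2` leads to a name and a birthplace, and from there to whatever else is said about that place; the creator of `ex:book1` leads nowhere.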
Future Work • “Records” in Linked Library Data • Vocabulary Alignment and Interoperability • DCMI planning in this space • General Metadata Interoperability • Application Profiles? • Archival Data for *context* - (EAC-CPF)
W3C Linked Library Data Incubator • Collecting, Curating and Clustering over 50 Use Cases • Mining use cases for functional requirements and design patterns • Recommendations to W3C • Should lead to Working Groups • http://www.w3.org/2005/Incubator/lld/
Other Activities • ALCTS/LITA Linked Library Data IG • IFLA Semantic Web IG • https://wiki.d-nb.de/x/vA10Ag • Open Knowledge Foundation • http://okfn.org/ • CKAN Linked Library Data Group: • http://ckan.net/group/lld
Thanks! corey.harper@nyu.edu 212.998.2479 @chrpr Questions?