UNIMARC and linked data. Gordon Dunsire and Mirna Willer. Presented at Session 187 (Advancing UNIMARC: alignment and innovation) of the World Library and Information Congress: 77th IFLA General Conference and Assembly, 13-18 August 2011, San Juan, Puerto Rico.
Overview • Background • Linked data and the Semantic Web • Methods and issues in representing UNIMARC for the Semantic Web • Recommendations
Background • Representation of IFLA standards for use in the Semantic Web • Work of the FRBR Namespaces project and IFLA Namespaces Task Group • Work of the ISBD/XML Study Group • Included a feasibility study of representation of UNIMARC • Representations allow legacy catalogue records to be published as linked data using RDF • Branding IFLA standards for authority & trust • Semantic Web lets “Anyone say Anything about Any resource”
Linked data and RDF • Resource Description Framework (RDF) • Designed for machine-processing of metadata at global scale (Semantic Web) • 24/7/365 • Trillions of operations per second • Everything must be disambiguated • Machines are dumb • A simple approach helps! • Machine-readable identifiers
RDF triple • Metadata expressed as “atomic” statements • A simple, single, irreducible statement • The title of this book is “Cataloguing is fun!” • Constructed in 3 parts: a “triple” • Subject of the statement = Subject: This book • Nature of the statement = Predicate: has title • Value of the statement = Object: “Cataloguing is fun!” • This book - has title - “Cataloguing is fun!” • subject - predicate - object
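The three-part statement above can be sketched as a small data structure. This is only an illustration in Python: the `Triple` type and its field names are assumptions made for this example, not part of RDF or of the presentation, and plain strings stand in for real identifiers.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One atomic statement: subject - predicate - object."""
    subject: str    # the thing the statement is about
    predicate: str  # the nature of the statement
    obj: str        # the value of the statement


# The slide's example, with plain strings as placeholders for identifiers.
statement = Triple(
    subject="This book",
    predicate="has title",
    obj="Cataloguing is fun!",
)

# A triple always has exactly three parts and cannot be reduced further.
print(" - ".join(statement))
```

A named tuple is used so that each part has a readable name while the statement as a whole stays immutable.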
Machine-readable identifiers • Uniform Resource Identifier (URI) • Can be any unique combination of numbers and letters • No intrinsic meaning; it’s just an identifier • RDF requires the subject and predicate of a triple to be URIs • The object can be a URI or a literal string (“Cataloguing is fun!”) • URIs can be matched by machine to link triples together
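A minimal sketch of how matching URIs links triples together. Everything under `http://example.org/` is a hypothetical namespace, and the author triple and name are invented purely for illustration; none of these come from the presentation.

```python
from collections import deque

EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Subjects and predicates are URIs; objects are URIs or literal strings.
triples = [
    (EX + "book/1", EX + "hasTitle", "Cataloguing is fun!"),
    (EX + "book/1", EX + "hasAuthor", EX + "person/7"),
    (EX + "person/7", EX + "hasName", "A. Cataloguer"),
]


def is_uri(value: str) -> bool:
    """Crude check that treats http(s) strings as URIs and the rest as literals."""
    return value.startswith(("http://", "https://"))


def follow_links(triples, start):
    """Collect every triple reachable from `start` by matching an object URI
    in one triple to the subject URI of another."""
    seen, queue, found = set(), deque([start]), []
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        for s, p, o in triples:
            if s == node:
                found.append((s, p, o))
                if is_uri(o):
                    queue.append(o)  # the object may be the subject of other triples
    return found


linked = follow_links(triples, EX + "book/1")
```

Starting from the book's URI, the second triple's object matches the third triple's subject, so the author's name is reached without any human-readable label being involved in the match.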
UNIMARC element identifiers • Element: Number (ISBN) • Tag: 010 • 1st ind.: b • 2nd ind.: b • Subfield: a • (Unique in element set) • Element: Target audience code • Coded Information Block • Character position: 17-19 • 100bba (Unique in element set) • Vocabulary: Target audience • Term: children, ages 9-14 • Code: d (Unique in vocabulary)
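The identifier pattern on this slide can be sketched as a concatenation of tag, indicators, and subfield. The function name `element_id` is hypothetical, and the rule that a blank indicator is written as `b` is an assumption inferred from the identifiers shown in the slides (`100bba`, `P205bbb`); how character positions are encoded is not covered here.

```python
def element_id(tag: str, ind1: str, ind2: str, subfield: str) -> str:
    """Build an element identifier from a tag, two indicators, and a subfield.

    Assumption: a blank indicator (a single space) is written as "b",
    as the identifiers on the slides suggest.
    """
    if not (len(tag) == 3 and tag.isdigit()):
        raise ValueError("tag must be three digits")
    for part in (ind1, ind2, subfield):
        if len(part) != 1:
            raise ValueError("indicators and subfield must be one character each")

    def encode(indicator: str) -> str:
        return "b" if indicator == " " else indicator

    return tag + encode(ind1) + encode(ind2) + subfield


# Coded Information Block, blank indicators, subfield a.
print(element_id("100", " ", " ", "a"))        # 100bba
# Property local names in the slides carry a "P" prefix.
print("P" + element_id("205", " ", " ", "b"))  # P205bbb
```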
Vocabularies and Element sets • Controlled terminologies represented as vocabularies • UNIMARC entities, attributes, and relationships form an element set • Attributes and relationships represented as properties/predicates • Entities represented in RDF as classes • But only 1 entity in UNIMARC-B (Resource) • ISBD already has an equivalent class for Resource
UNIMARC and ISBD properties • Element identifier/URI: unimarcb:P205bbb • Label (English): (has) issue statement • Equivalent ISBD URI: isbd:P1011 • Label (English): has additional edition statement • The meaning is the same, but the identifiers and labels are different • unimarcb:P205bbb same as isbd:P1011 (in RDF) • Or use isbd:P1011 instead of unimarcb:P205bbb
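The second option above, using the ISBD property in place of its UNIMARC equivalent, can be sketched as a lookup applied to each triple's predicate. The resource identifier `ex:resource1` and the literal value are placeholders invented for this example. The slides do not name the RDF property that would state the equivalence itself; `owl:equivalentProperty` is one standard choice, mentioned here only as an assumption.

```python
# Equivalence taken from the slide: the two properties have the same meaning.
EQUIVALENT_PROPERTY = {
    "unimarcb:P205bbb": "isbd:P1011",
}


def prefer_isbd(triple):
    """Replace a UNIMARC predicate with its ISBD equivalent when one is known;
    leave every other triple unchanged."""
    subject, predicate, obj = triple
    return (subject, EQUIVALENT_PROPERTY.get(predicate, predicate), obj)


# Placeholder subject and literal, for illustration only.
original = ("ex:resource1", "unimarcb:P205bbb", "placeholder issue statement")
rewritten = prefer_isbd(original)
print(rewritten[1])  # isbd:P1011
```

Keeping the mapping as data rather than code means new equivalences can be added without touching the rewrite function.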
Translations • The same identifier is used for translated elements (captions, definitions, etc.) and vocabularies (preferred terms, definitions, etc.) • E.g. Vocabulary of 116bba0 = Coded data for graphics: Specific material designation
Graphics SMD translation example • Term identifier/URI: namespace/b • Notation: b • Preferred label (English): drawing • Preferred label (Italian): disegno • Preferred label (Portuguese): desenho • Definition (English): An original visual representation (other than a print or painting) ...
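A minimal sketch of one identifier carrying labels in several languages, using the values from the slide. The dictionary layout and the `preferred_label` helper are assumptions made for this example, and `namespace/b` is kept as the placeholder the slide uses, not a real URI.

```python
# One term, one identifier, several language-tagged preferred labels.
drawing = {
    "uri": "namespace/b",  # placeholder from the slide
    "notation": "b",
    "pref_label": {
        "en": "drawing",
        "it": "disegno",
        "pt": "desenho",
    },
}


def preferred_label(term: dict, lang: str, fallback: str = "en") -> str:
    """Return the preferred label in `lang`, falling back to another
    language when no translation exists yet."""
    labels = term["pref_label"]
    return labels.get(lang, labels[fallback])


print(preferred_label(drawing, "it"))  # disegno
print(preferred_label(drawing, "fr"))  # drawing (no French label, so English is used)
```

Because the identifier never changes, adding a translation only adds a label; data that already points to the term is unaffected.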
Triples from UNIMARC records • Create or obtain URI for the Resource described • Obtain URI for UNIMARC tag/subfield • Direct from tag/indicators/subfield encoding • Obtain URI of value of subfield, or use a literal value • URI from vocabulary or UNIMARC Authority • Publish triple
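The steps above can be sketched end to end as follows. This is an illustration under stated assumptions, not the authors' implementation: both `http://example.org/` namespaces are hypothetical, the ISBN is a dummy value, blank indicators are assumed to be written as `b`, and the coded field is simplified to a single one-character code rather than a full fixed-length string with character positions. Output is written as N-Triples-style lines with only backslashes and quotes escaped.

```python
ELEMENT_NS = "http://example.org/unimarcb/"           # hypothetical namespace
VOCAB_NS = "http://example.org/unimarcb/graphics/"    # hypothetical namespace


def property_uri(tag, ind1, ind2, subfield):
    """Step 2: derive the property URI directly from the field encoding."""
    def encode(ind):
        return "b" if ind == " " else ind
    return f"{ELEMENT_NS}P{tag}{encode(ind1)}{encode(ind2)}{subfield}"


def to_ntriple(subject, predicate, value, value_is_uri=False):
    """Step 4: serialise one triple as an N-Triples-style line."""
    if value_is_uri:
        obj = f"<{value}>"
    else:
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {obj} ."


def record_to_triples(resource_uri, fields, vocabularies=None):
    """Turn (tag, ind1, ind2, subfield, value) tuples into triple lines.

    Step 3: if a vocabulary is known for the tag and subfield and contains
    the value, use the term URI; otherwise keep the value as a literal.
    """
    vocabularies = vocabularies or {}
    lines = []
    for tag, ind1, ind2, subfield, value in fields:
        predicate = property_uri(tag, ind1, ind2, subfield)
        vocabulary = vocabularies.get((tag, subfield), {})
        if value in vocabulary:
            lines.append(to_ntriple(resource_uri, predicate, vocabulary[value], True))
        else:
            lines.append(to_ntriple(resource_uri, predicate, value))
    return lines


# Step 1: a URI for the resource described (hypothetical).
resource = "http://example.org/resource/1"
fields = [
    ("010", " ", " ", "a", "0-00-000000-0"),  # dummy ISBN, kept as a literal
    ("116", " ", " ", "a", "b"),              # simplified coded value
]
vocabularies = {("116", "a"): {"b": VOCAB_NS + "b"}}

lines = record_to_triples(resource, fields, vocabularies)
for line in lines:
    print(line)
```

Each line that results can be published independently, since every triple carries its own subject, predicate, and object.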
Recommendations: Foundation • Approve the method of identifying UNIMARC elements and vocabularies. • Approve the pattern for namespaces for UNIMARC/B and /A elements and vocabularies. • Decide on initial creation and maintenance of UNIMARC elements and vocabularies in the Open Metadata Registry (OMR). • Decide between re-using existing ISBD namespaces for UNIMARC/B, or representing all UNIMARC/B elements and linking them to existing ISBD classes and properties as appropriate. Dunsire & Willer. UNIMARC and Linked Data, IFLA 2011 San Juan, Puerto Rico
Recommendations: Foundation • Investigate further whether to re-use existing FRAD/FRBR and FRSAD namespaces, or to represent all UNIMARC/A elements and link them to existing FRAD/FRBR/FRSAD classes/subclasses and properties as appropriate. • Investigate further the appropriate classes for UNIMARC/A in relation to UNIMARC/B, FRAD/FRBR and FRSAD. • Support and promote the translation of UNIMARC classes and properties into national languages.
Recommendations: Application • Discuss and consider the requirements for Application Profiles for UNIMARC. • Check and verify the availability of SKOS representations of other external vocabularies used in UNIMARC. • Investigate and verify internal UNIMARC vocabularies for suitable SKOS representations; consider approaching the owners of external vocabularies to liaise on developing SKOS representations.
Recommendations: Application • Investigate further the “combinatorial explosion” of UNIMARC properties; determine if some combinations are invalid and do not require a separate property. • Consider and approve the re-use of aggregated ISBD elements which are represented in RDF using Syntax encoding schemes (SES), which will avoid the need for developing UNIMARC equivalents. • Monitor relevant MARC 21 developments, especially the Bibliographic Framework Transition Initiative recently announced by the Library of Congress.
Thank you • gordon@gordondunsire.com • mwiller@unizd.hr