XML Standards in Bibliographic IT: Current Usage and Future Trends
This document provides an overview of XML standards relevant to bibliographic IT, detailing their current use and future developments. It focuses on identifying various XML standards, describing existing applications such as MARC formats, Z39.50, and interlibrary loan transactions, and discussing their advantages. The paper is intended for professionals in bibliographic IT and not as an introductory course in XML standards. Key benefits include easy human readability, support from IT vendors, and efficient handling of hierarchical data.
Presentation Transcript
XML & Library Applications ELAG 2001 Poul Henrik Jørgensen, phj@dbc.dk Danish Bibliographic Centre, www.dbc.dk
Objective
• To identify the family of relevant XML standards
• To describe existing use of XML standards within important Bibliographic IT standards
• To identify future development of Bibliographic IT standards based on XML standards
• Not an introductory course in the XML standards themselves!
Content
• XML Standards Overview
• Major benefits of XML
• MARC formats and XML
• Z39.50 and XML formats
• Interlibrary Loan transactions in XML
• NISO Circulation Interchange Protocol and XML
• ZML: Z39.50 as XML Protocol
• RDF: Semantic Web and Metadata
XML Standards Overview
• Family of related standards from W3C
• XML: Representation of hierarchical data
• XML Schema: Specification of XML structure
• XHTML: Presentation/display of data
• DOM: Internal representation of XML
• XSLT/XPath: Transformation of XML
• RDF: Relationships between Objects and Classes
Major benefits of XML
• Easy for humans as well as computers to understand
• Supported by all mainstream IT vendors
• Handles hierarchical information well
• Can be edited with simple tools
• Many IT people know XML
OAI MARC XML Schema
• Developed for Open Archives Initiative
• Similar to limited ISO 2709 structure
• Single Field element type containing single Subfield element type
• Field and Subfield instances identified by attribute values
• Suitable for exchange and conversions
• http://www.openarchives.org/OAI/oai_marc.xsd
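The generic structure described above (one field element type, one subfield element type, instances told apart by attribute values) can be sketched in Python as follows. The element and attribute names `record`, `field`, `subfield`, `tag` and `code` are assumptions chosen for illustration; they are not copied from oai_marc.xsd.

```python
import xml.etree.ElementTree as ET

# Hypothetical element and attribute names, for illustration only;
# they are not taken from oai_marc.xsd.
SAMPLE = """
<record>
  <field tag="100">
    <subfield code="a">some author</subfield>
  </field>
  <field tag="245">
    <subfield code="a">some title</subfield>
    <subfield code="c">some statement of responsibility</subfield>
  </field>
</record>
"""


def read_fields(xml_text):
    """Return (tag, code, value) tuples from a generic MARC-like record."""
    root = ET.fromstring(xml_text)
    rows = []
    for field in root.findall("field"):
        for subfield in field.findall("subfield"):
            # The field and subfield are identified by attribute values,
            # not by their element names.
            rows.append((field.get("tag"), subfield.get("code"), subfield.text))
    return rows


rows = read_fields(SAMPLE)
```

Because every field shares one element type, the same few lines of code read any MARC dialect, which is one reason such a structure suits exchange and conversion.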
MARC XML Schemas
• Developed for VisualCat and ONE-2 project
• Similar to LC MARC DTD
• Each possible combination of Field and Subfield specified as separate XML Element Types
• danMARC2: 164 MARC Fields and 1189 Subfields
• Suitable for automatic syntax validation
• Schemas for MARC21 (British Library), UNIMARC (Italian SBN) and danMARC2 (DBC)
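To illustrate the idea of one element type per Field and Subfield combination, here is a minimal Python sketch that builds such a record from (tag, code, value) tuples. The naming pattern `fld245` / `fld245a` is hypothetical and is not the naming used in the VisualCat or ONE-2 schemas.

```python
import xml.etree.ElementTree as ET


def to_specific(rows):
    """Build a record in which each field/subfield pair has its own element name.

    rows is a list of (tag, code, value) tuples. The 'fld245' / 'fld245a'
    naming pattern is an assumption made for this sketch. Repeated fields
    with the same tag are merged into one element, a simplification.
    """
    record = ET.Element("record")
    fields = {}
    for tag, code, value in rows:
        if tag not in fields:
            fields[tag] = ET.SubElement(record, f"fld{tag}")
        ET.SubElement(fields[tag], f"fld{tag}{code}").text = value
    return record


record = to_specific([("245", "a", "some title"), ("245", "c", "some author")])
xml_text = ET.tostring(record, encoding="unicode")
```

With names like these, a schema can state exactly which subfield elements are permitted inside each field element, so a standard validating parser can check MARC syntax without any MARC-specific code.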
VisualCat MARC21 Schema (figure)
Z39.50 and XML formats
• CompSpec option may specify Record Syntax, Schema, and Element Specification:
  • recordSyntax identifies the format, e.g. XML (OID=1.2.840.10003.5.109.10)
  • Schema identifies the structure, e.g. Holdings (OID=1.2.840.10003.13.7.1)
  • elementSpec identifies a subset, e.g. Level B-1: Minimal Bibliographic Level Holdings (ESN="B1")
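The three selections can be pictured as a small value object. The Python sketch below uses the OIDs and element set name quoted on the slide; the class `RecordRequest` and its `describe` method are hypothetical and do not correspond to any Z39.50 toolkit API.

```python
from dataclasses import dataclass
from typing import Optional

# OIDs as quoted on the slide.
XML_RECORD_SYNTAX = "1.2.840.10003.5.109.10"
HOLDINGS_SCHEMA = "1.2.840.10003.13.7.1"


@dataclass(frozen=True)
class RecordRequest:
    """Hypothetical container for the three CompSpec selections."""

    record_syntax: str                  # which format, e.g. XML
    schema: Optional[str] = None        # which structure, e.g. Holdings
    element_spec: Optional[str] = None  # which subset, e.g. "B1"

    def describe(self):
        """Return a readable summary of the selections that are set."""
        parts = [f"recordSyntax={self.record_syntax}"]
        if self.schema:
            parts.append(f"schema={self.schema}")
        if self.element_spec:
            parts.append(f"elementSpec={self.element_spec}")
        return ", ".join(parts)


request = RecordRequest(XML_RECORD_SYNTAX, HOLDINGS_SCHEMA, "B1")
summary = request.describe()
```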
XML Schemas in Z39.50
• Dublin Core XML format
  • http://www.nlc-bnc.ca/bath/bp-app-d.htm
  • http://www.openarchives.org/OAI/dc.xsd
• ExplainLite XML DTD
  • http://www.one-2.org/technical/ONE-ICONE-DTD-0001.dtd
• Holdings XML Schema
  • http://www.portia.dk/zholdings/Holdings6a/HoldingsSchema6a_xsd/HoldingsSchema6a.htm
• ES Task Package XML format
Dublin Core (Bath Profile)
<record-list>
  <dc-record>
    <creator>some author</creator>
    <creator>some author</creator>
    <title>some title</title>
  </dc-record>
  <dc-record> ..... </dc-record>
</record-list>
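A record list of this shape can be read with a few lines of Python. This is a minimal sketch using the standard library; the sample data and the function name `read_dc_records` are illustrative assumptions, and only the element names come from the slide.

```python
import xml.etree.ElementTree as ET

# Sample data made up for illustration; element names follow the slide.
SAMPLE = """
<record-list>
  <dc-record>
    <creator>some author</creator>
    <creator>another author</creator>
    <title>some title</title>
  </dc-record>
</record-list>
"""


def read_dc_records(xml_text):
    """Return one dict per dc-record, mapping element name to a list of values."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("dc-record"):
        fields = {}
        for child in rec:
            # Dublin Core elements are repeatable, so collect values in lists.
            fields.setdefault(child.tag, []).append(child.text)
        records.append(fields)
    return records


records = read_dc_records(SAMPLE)
```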
ExplainLite (VisualCat) (figure)
Holdings XML (danZIG)
<?xml version="1.0" encoding="UTF-8" ?>
<!--ZIG XML Holdings B-2/A danZIG example -->
<!--Produced by Poul Henrik Jorgensen 2001-05-21 -->
<holdingsStructure xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="http://www.portia.dk/zholdings/Holdings6a/HoldingsSchema6a.xsd">
  <bibItemInfo-1 targetItemId-3="001 field in DanBib" />
  <!--holdingsStatement repeated for each location -->
  <holdingsStatement-4 unionCatShelfMark-9="Local Shelf Mark" holdingsNotes-25="Holdings of serials">
    <holdingsSiteLocation-6 targetLocationId-26="Identificator in local system"
        institutionOrSiteId-27="Library Number" networkAddress-33="Z39.50URL" siteNotes-34="Availability" />
    <unionCatLendingInfo-19 servicePolicy-109="1 or 2 cf. danZIG note" />
  </holdingsStatement-4>
</holdingsStructure>
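Since the holdings data is carried in attributes, a client mainly needs to walk the repeated holdingsStatement elements and pick out attribute values. The Python sketch below works on a cut-down copy of the example; the function name `list_holdings` and the dictionary keys are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the danZIG example, without the schema reference.
SAMPLE = """
<holdingsStructure>
  <bibItemInfo-1 targetItemId-3="001 field in DanBib" />
  <holdingsStatement-4 unionCatShelfMark-9="Local Shelf Mark">
    <holdingsSiteLocation-6 institutionOrSiteId-27="Library Number" />
  </holdingsStatement-4>
</holdingsStructure>
"""


def list_holdings(xml_text):
    """Return shelf mark and library number for each holdings statement."""
    root = ET.fromstring(xml_text)
    holdings = []
    for statement in root.iter("holdingsStatement-4"):
        # The site location is optional in this sketch.
        site = next(statement.iter("holdingsSiteLocation-6"), None)
        holdings.append({
            "shelf_mark": statement.get("unionCatShelfMark-9"),
            "library": site.get("institutionOrSiteId-27") if site is not None else None,
        })
    return holdings


holdings = list_holdings(SAMPLE)
```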
ILL transactions in XML
• ICCU/SBN system offers XML Item-Order request messages (APDU) via E-mail
• ONE-2 Profile of all ILL messages in XML
  • http://www.portia.dk/pubs/ill/schema/illv2/illv2.htm
• New danZIG Profile using XML for Z39.50/ILL Profile 1
Circulation Interchange Protocol
• NISO Circulation Interchange Protocol
  • http://www.niso.org/commitat.html
• NCIP Data specified by XML DTD/Schema
  • http://www.portia.dk/pubs/NCIP/NCIP_v0_1a.xsd
• Adapted to SOAP/WSDL
  • http://www.portia.dk/pubs/NCIP/PortTypes.wsdl
ZML Objectives
• Initiative by Library of Congress and others
• Leverage investments in existing Z39.50 Services and specifications
• Simplify Z39.50 implementation
• Facilitate interoperability with other relevant standards
• Foster migration of Z39.50 functionality to mainstream IT technologies
ZML: Z39.50 over SOAP
• Protocol elements encoded as XML Structures
• Relevant Z39.50 Services mapped to SOAP Request/Response functions over HTTP
• Search/Present and other services simplified
• Existing Web-to-Z gateways may be enhanced with SOAP-to-Z gateways
• Draft specifications to be presented at ZIG in October 2001 at British Library in York
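To make the mapping concrete, the Python sketch below wraps a search request in a SOAP 1.1 envelope. Only the envelope namespace is standard; the namespace `urn:example:zml`, the element names `searchRequest`, `databaseName` and `query`, and the sample values are hypothetical, since the ZML draft specifications are not part of this presentation.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
ZML_NS = "urn:example:zml"  # hypothetical namespace, not from any ZML draft


def build_search_request(database, query):
    """Return a SOAP envelope (as text) carrying a hypothetical search request."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The protocol elements are plain XML structures inside the SOAP body.
    request = ET.SubElement(body, f"{{{ZML_NS}}}searchRequest")
    ET.SubElement(request, f"{{{ZML_NS}}}databaseName").text = database
    ET.SubElement(request, f"{{{ZML_NS}}}query").text = query
    return ET.tostring(envelope, encoding="unicode")


# "danbib" and the query string are made-up sample values.
message = build_search_request("danbib", "title=xml")
```

A SOAP-to-Z gateway of the kind mentioned above would receive such a message over HTTP, translate it into a native Z39.50 request, and return the result in a matching response envelope.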
Semantic Web and RDF
• RDF is part of W3C Semantic Web Activity
  • http://www.w3.org/2001/sw/Activity
• Defines relationships and attributes of electronic resources
• Can represent any metadata schema, e.g. Dublin Core or IFLA FRBR metadata
• RDF is expressed by directed graphs or XML
• RDF is used to represent Authority Data and other metadata
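The two views of RDF, as a directed graph and as XML, can be shown with a short Python sketch that writes Dublin Core statements about one resource as RDF/XML and reads them back as (subject, property, value) triples. The RDF and Dublin Core namespaces are the published ones; the resource URI, the sample values and the function names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def describe_resource(uri, properties):
    """Serialize (property, value) pairs about one resource as RDF/XML."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": uri})
    for name, value in properties:
        ET.SubElement(desc, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(rdf, encoding="unicode")


def triples(rdf_xml):
    """Read the description elements back as (subject, property, value) edges."""
    root = ET.fromstring(rdf_xml)
    edges = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            edges.append((subject, prop.tag, prop.text))
    return edges


# Hypothetical resource URI and values.
rdf_xml = describe_resource(
    "http://example.org/book/1",
    [("title", "some title"), ("creator", "some author")],
)
edges = triples(rdf_xml)
```

Each triple is one labelled edge of the directed graph: the subject and the value are nodes, and the property is the label on the arrow between them.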
FRBR RDF graph (VisualCat) (figure)
Summary
• XML offers many inherent advantages as a data format
• XML standards are already implemented in relation to MARC, Z39.50, ILL and NCIP
• Next generation of Z39.50 (i.e. ZML) will most likely be based on XML standards
• "Digital Libraries may be the killer application for RDF"