Learn the basics of XML, a versatile file-formatting language used for structuring document content. Explore XML elements, attributes, and DTD declarations. Discover how XML can describe various document types and how it integrates with ISIS for managing semi-structured and fully structured data. Explore tools like XML2ISIS for importing XML data into ISIS databases and exporting data from ISIS to XML format.
ISIS and XML: an introduction, by E. de Smet, Univ. of Antwerp
What is XML? • eXtensible Markup Language • Language: a set of codes, HTML-like, i.e. written between angle brackets < and > • Markup: ‘in-stream’ codes that identify structural (not layout) parts of a document • eXtensible: codes are defined in a ‘DTD’ (Document Type Definition) and anybody can produce a DTD • W3C (World Wide Web Consortium) standard, now becoming the most important file-formatting language (e.g. MS Office 2003)
SGML, HTML and XML • SGML (80’s): Standard Generalized Markup Language, the founding ‘father’, very generic but too complicated, still in use (electronic publishing); introduced the concept of the ‘DTD’ • HTML (90’s): HyperText Markup Language (WWW), in fact based on one SGML DTD • XML (00’s): focusing on contents and structure, instead of layout
XML DTD • defines which elements are possible in a document • <!DOCTYPE books [ • !DOCTYPE is the tag for starting a document type declaration, which specifies the type of document you are validating against. It contains either the validation data or a reference to the location of the file with these data. • ‘books’ is the name you are giving this type of document • [ announces the beginning of DTD data : all ‘elements’ need to be declared
XML DTD (cont’d) • example of attribute ‘grammar’: <!ATTLIST element doctype (book|article|report) "book"> • element is the name of the element the attribute belongs to. • doctype is the name of the attribute. • (book|article|report) are the values it can contain. • "book" is the default value. If you don't include this attribute in the XML tag, the parser assumes the value is "book". • Free-text attributes: <!ATTLIST element attribute CDATA #IMPLIED>
XML DTD (cont’d) • example elements-definition: • <!ELEMENT book (author+, title, publisher)> • <!ATTLIST book year CDATA #IMPLIED> • <!ELEMENT article (author+, title, year?, (shortversion|longversion))> • <!ATTLIST article type CDATA #IMPLIED> • <!ELEMENT publisher (name, address)> • <!ELEMENT author (firstname?, lastname)> • ? = zero or one occurrence • * = zero or more occurrences • + = one or more occurrences
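The occurrence indicators behave much like regular-expression quantifiers. As a rough sketch only (this is not a DTD validator), the two content models above can be rewritten as Python regular expressions and checked against a sequence of child element names. The names `MODELS` and `fits_model` are made up for this illustration.

```python
import re

# DTD content models rewritten as regular expressions over child element names.
# Each name carries a trailing space so that names cannot run together.
MODELS = {
    # <!ELEMENT book (author+, title, publisher)>
    "book": re.compile(r"(author )+title publisher "),
    # <!ELEMENT article (author+, title, year?, (shortversion|longversion))>
    "article": re.compile(r"(author )+title (year )?(shortversion |longversion )"),
}

def fits_model(element, child_names):
    """Return True if the child names, in order, fit the element's content model."""
    sequence = "".join(name + " " for name in child_names)
    return MODELS[element].fullmatch(sequence) is not None

print(fits_model("book", ["author", "author", "title", "publisher"]))  # True
print(fits_model("book", ["title", "publisher"]))                      # False: author+ needs at least one
print(fits_model("article", ["author", "title", "longversion"]))       # True: year? may be absent
```

The order of the names matters: a comma in a content model fixes the sequence, so `title` before `author` does not fit the `book` model.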
XML DTD (cont’d) • example of a record using this DTD (children in the order the DTD requires): • <bib> • <book> • <!-- A good introductory text --> • <author> <firstname>Alan</firstname> <lastname>Hopkinson</lastname> </author> • <title> The CDS/ISIS for Windows Handbook </title> • <publisher> <name>BLA</name> <address>London</address> </publisher> • </book> • </bib>
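To show what such a record looks like to a program, here is a minimal sketch that reads a record of this shape with Python's standard `xml.etree.ElementTree` module. The children are written in the order the DTD requires. The parser used here does not validate against the DTD, and the helper `author_name` is a name chosen for this example.

```python
import xml.etree.ElementTree as ET

RECORD = """
<bib>
  <book>
    <!-- A good introductory text -->
    <author><firstname>Alan</firstname><lastname>Hopkinson</lastname></author>
    <title> The CDS/ISIS for Windows Handbook </title>
    <publisher><name>BLA</name><address>London</address></publisher>
  </book>
</bib>
"""

def author_name(author):
    """Format an <author> element as 'lastname, firstname' (firstname is optional)."""
    first = author.findtext("firstname", default="")
    last = author.findtext("lastname", default="")
    return f"{last}, {first}" if first else last

root = ET.fromstring(RECORD.strip())

books = []
for book in root.findall("book"):
    books.append({
        "title": book.findtext("title").strip(),
        "authors": [author_name(a) for a in book.findall("author")],
        "publisher": book.findtext("publisher/name"),
    })

print(books)
```

The comment inside `<book>` is skipped by the default parser, so only the structural elements reach the loop.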
ISIS and XML • XML can describe any type of document • ISIS deals with semi-structured data • XML is perfect to describe semi-structured data (esp. the ‘*’ indicator, zero or more occurrences, conforms with ISO-2709 field characteristics) • Remark: XML can also be used for fully structured databases (E-business!) • ISIS is confined to 3 levels, XML is not -> importing into ISIS requires level reduction
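Level reduction can be pictured as flattening: anything nested deeper than the levels the target can hold is merged into its nearest allowed ancestor. The sketch below is one possible way to do this in Python, assuming two levels below the record (roughly field and subfield). The function name `reduce_levels` and the extra `<city>` level in the sample record are invented for the illustration and say nothing about how any particular tool works.

```python
import xml.etree.ElementTree as ET

def reduce_levels(record, max_depth=2):
    """Flatten a record element into (path, text) pairs.

    Elements nested deeper than max_depth below the record are not kept as
    separate levels: their text is merged into the ancestor at max_depth.
    """
    pairs = []

    def walk(element, path):
        children = list(element)
        if not children or len(path) == max_depth:
            # Leaf, or deepest level allowed: collect all text underneath.
            text = " ".join(t.strip() for t in element.itertext() if t.strip())
            pairs.append(("/".join(path), text))
            return
        for child in children:
            walk(child, path + [child.tag])

    for child in record:
        walk(child, [child.tag])
    return pairs

# Sample record with one level too many: publisher/address/city.
DEEP_RECORD = """
<book>
  <title>The CDS/ISIS for Windows Handbook</title>
  <author><firstname>Alan</firstname><lastname>Hopkinson</lastname></author>
  <publisher>
    <name>BLA</name>
    <address><city>London</city></address>
  </publisher>
</book>
"""

for path, text in reduce_levels(ET.fromstring(DEEP_RECORD.strip())):
    print(path, "=", text)
```

Here `publisher/address/city` is reduced to `publisher/address`, so the result never goes deeper than two levels below the record.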
ISIS and XML: importing • XML2ISIS: a free, independent tool to import XML-formatted data into an ISIS database (via an ISO file) • requires a hierarchical ‘tree’ definition and a mapping that defines which XML tags go to which ISIS fields (with level reduction) • demo of the interface of XML2ISIS
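The idea of mapping XML tags to ISIS fields can be sketched in a few lines of Python. This is not the configuration format or the algorithm of XML2ISIS. The field tags (24, 70, 26), the subfield codes and the names `FIELD_MAP` and `book_to_isis` are all hypothetical, chosen only to show nested elements becoming `^`-delimited subfields and repeated elements becoming repeated field occurrences.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: XML path inside <book> -> (ISIS field tag, subfield code).
# A subfield code of None stores the text as plain field content.
FIELD_MAP = {
    "title": (24, None),
    "author/lastname": (70, "a"),
    "author/firstname": (70, "b"),
    "publisher/address": (26, "a"),
    "publisher/name": (26, "b"),
}

def book_to_isis(book):
    """Return a list of (tag, content) pairs, one pair per field occurrence."""
    fields = []
    for child in book:
        subelements = list(child)
        if not subelements:
            # Element without children: maps directly to a field.
            tag, _ = FIELD_MAP[child.tag]
            fields.append((tag, (child.text or "").strip()))
        else:
            # Element with children: the children become subfields of one occurrence.
            parts = []
            for sub in subelements:
                tag, code = FIELD_MAP[f"{child.tag}/{sub.tag}"]
                parts.append((code, (sub.text or "").strip()))
            parts.sort()  # subfields in code order: ^a, ^b, ...
            fields.append((tag, "".join(f"^{code}{text}" for code, text in parts)))
    return fields

BOOK = """
<book>
  <author><firstname>Alan</firstname><lastname>Hopkinson</lastname></author>
  <title> The CDS/ISIS for Windows Handbook </title>
  <publisher><name>BLA</name><address>London</address></publisher>
</book>
"""

for tag, content in book_to_isis(ET.fromstring(BOOK.strip())):
    print(tag, content)
```

A second `<author>` element would simply yield a second occurrence of field 70, which is how a repeatable field is represented in this sketch.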
ISIS and XML: exporting • built-in function since WinISIS 1.4 (build 19f): option in the ‘utils’ menu • MFN range or search result as source • can deal with subfields and repeatable fields • DTD embedded in the XML file or written as a separate file; element names taken from field tags or field names • field selection possible • demo on ASFA-database…
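The export direction can be sketched the same way. The Python below turns one ISIS-style record into XML, handling subfields and repeated occurrences and naming elements either by field tag or by field name. The output layout (`<record mfn="…">`, `<v24>`, `<subfield code="a">`) and the names `FIELD_NAMES` and `record_to_xml` are assumptions made for this example and are not the format WinISIS itself produces.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical field names for the tags used in this sketch.
FIELD_NAMES = {24: "title", 70: "author", 26: "imprint"}

def record_to_xml(mfn, fields, use_names=False):
    """Serialize {tag: [occurrence, ...]} to an XML string.

    Occurrences may contain ^x-delimited subfields, e.g. '^aHopkinson^bAlan'.
    """
    record = ET.Element("record", {"mfn": str(mfn)})
    for tag, occurrences in fields.items():
        name = FIELD_NAMES[tag] if use_names else f"v{tag}"
        for occurrence in occurrences:  # one element per repeated occurrence
            field = ET.SubElement(record, name)
            subfields = re.findall(r"\^(\w)([^^]*)", occurrence)
            if not subfields:
                field.text = occurrence
            for code, text in subfields:
                ET.SubElement(field, "subfield", {"code": code}).text = text
    return ET.tostring(record, encoding="unicode")

SAMPLE = {
    24: ["The CDS/ISIS for Windows Handbook"],
    70: ["^aHopkinson^bAlan"],
    26: ["^aLondon^bBLA"],
}

print(record_to_xml(1, SAMPLE))                  # elements named by tag: v24, v70, v26
print(record_to_xml(1, SAMPLE, use_names=True))  # elements named by field name
```

Field selection would amount to passing only the wanted tags in `fields`, and an MFN range would be a loop over records calling the same function.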