Learn about XML's syntax, motivation, and practical applications. Understand how XML enables data exchange, separation of content, and semantic web features. Explore examples and benefits of XML in database management systems.
Database Management Systems
XML Motivation & Syntax
Monica Farrow, G30
email: monica@macs.hw.ac.uk
XML Topics
• Motivation
• Syntax
• Describing the document
  • DTD, XML Schema
• Accessing the elements using XPath
• Using XML
• Transforming and querying XML
  • XSLT, XPath, XQuery
• XML & Databases
• Programming APIs (DOM, SAX) used with XML
XML - Motivation & Syntax
XML in One Slide
• Basically, XML is an annotated text file. The format is similar to HTML
• However, in XML you can use any tag names you want to describe the data
• Example:
  <person>
    <name>Lisa Simpson</name>
    <tel>0131-828-1234</tel>
    <tel>078-4701-7775</tel>
    <email>lisa@macs.hw.ac.uk</email>
  </person>
Motivation
• XML allows us to create machine-readable text files, enabling
  • Exchange of data over a network
  • Separation of content from presentation
  • “Write once, read anywhere”
• The Semantic Web
  • A machine-understandable Web
  • The meaning of data (i.e., the semantics of data) should be encoded together with the data
Newsfeeds
• News can be exported as RSS, an XML format that a program can easily read
• Browsers such as Firefox enable you to add RSS feeds to your webpage
RSS example
• RSS stands for Really Simple Syndication
• The latest news on topics you’ve subscribed to arrives at your RSS reader (here, the browser)
Business data exchange
• Solution: use XML at every step of the way for data exchange
Application data
• A standard method to access information, making it easier for applications and devices of all kinds to use, store, transmit, and display data
• For example, an application may store data in XML files to keep track of the updates used
  • Version number, file names, installation time, etc.
Write Once Use Everywhere
[Diagram: one XML document is transformed by XSL into WML (hand-held devices), HTML (web browser) and TEXT (Excel)]
Semantic integration: Doctor’s Appointment
Source: “The Semantic Web”, Scientific American, May 2001
[Diagram labels: Insurance Co.; Rating; Provider sites; Physician’s Agent; Mom; Lucy’s Agent; Pete’s Agent; required treatment; in-plan?; close-by?; Specialist?; Driving schedule; Needs treatment; Schedule appointment; Arranges treatment; Will drive her there if free (twice)]
Some existing XML languages
• XHTML
  • XML-compatible version of HTML
• DocBook
  • For any documentation. Tags such as title, chapter, para, etc.
• ODF
  • OpenDocument Format. For office documents such as word processing or spreadsheets. Used by OpenOffice.
• MathML
  • To describe mathematical formulae
XML Syntax
XML Overview
• XML is a ‘human-legible’ simplified subset of the Standard Generalized Markup Language (SGML), on which HTML is also based
• Data is divided into elements and attributes. Each element is surrounded by a start tag and an end tag.
  • <tel>0131-444 7777</tel>
• Tag names are chosen to reflect the meaning of the element content
  • (In HTML, tag names are chosen to indicate page structure)
[Diagram: SGML, XML, HTML]
Terminology
• The segment of an XML document between an opening and a corresponding closing tag is called an element
• Elements may contain text or other elements
  <person>
    <name>Bart Simpson</name>
    <tel>0131-444 7777</tel>
    <tel>078-4011 6022</tel>
    <email>bart@ed.ac.uk</email>
  </person>
• Slide annotations: <person> is an element that contains other elements; <name> is an element that contains text; there can be more than one element with the same tag name (here, <tel>)
XML Document is a Tree
[Tree diagram: root person with children name (Bart Simpson), tel (0131-444 7777), tel (078-4011 6022) and email (bart@ed.ac.uk)]
• XML documents are abstractly modeled as trees, as reflected by their nesting
• Sometimes, XML documents are graphs (by using IDs and IDREFs to link elements)
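The tree view can be made concrete with a short sketch. It assumes Python's standard `xml.etree.ElementTree` module, which is not part of the original slides, and reuses the Bart Simpson record from the Terminology slide. The helper name `outline` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The person record from the Terminology slide.
PERSON_XML = """
<person>
  <name>Bart Simpson</name>
  <tel>0131-444 7777</tel>
  <tel>078-4011 6022</tel>
  <email>bart@ed.ac.uk</email>
</person>
"""

def outline(element, depth=0):
    """Return one (depth, tag, text) tuple per node, in document order."""
    text = (element.text or "").strip()
    rows = [(depth, element.tag, text)]
    for child in element:  # iterating an Element yields its direct children
        rows.extend(outline(child, depth + 1))
    return rows

root = ET.fromstring(PERSON_XML)
for depth, tag, text in outline(root):
    # Indent each node by its depth in the tree.
    print("  " * depth + tag + (": " + text if text else ""))
```

The indentation of the printed outline mirrors the nesting of the tags, which is all the "document is a tree" claim amounts to for a document without ID references.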
Elements Can Be Nested
  <addresses>
    <person>
      <name>Donald Duck</name>
      <tel>0131-8281345</tel>
      <tel>0131-8281374</tel>
      <email>donald@macs.hw.ac.uk</email>
    </person>
    <person>
      <name>Mickey Mouse</name>
      <tel>0141-4261142</tel>
    </person>
  </addresses>
A Complete XML Document
  <?xml version="1.0" encoding="UTF-8"?>
  <!DOCTYPE addresses SYSTEM "http://www.addbook.com/addresses.dtd">
  <addresses>
    <person>
      <name>Lisa Simpson</name>
      <tel>0131-828 1234</tel>
      <tel>078-4701 7775</tel>
      <email>lisa@macs.hw.ac.uk</email>
    </person>
  </addresses>
• Slide annotations: Required, Optional (labels marking parts of the document)
Attributes
• An opening tag may contain attributes
• These are typically used to describe the contents of an element
  <entry>
    <word language="en">cheese</word>
    <word language="fr">fromage</word>
    <word language="ro">branza</word>
    <meaning>A food made …</meaning>
  </entry>
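As a minimal sketch of how a program reads attributes, the dictionary entry above can be parsed with Python's standard `xml.etree.ElementTree` (an assumption; the slides do not name a parser). The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The dictionary entry from the Attributes slide.
ENTRY_XML = """
<entry>
  <word language="en">cheese</word>
  <word language="fr">fromage</word>
  <word language="ro">branza</word>
  <meaning>A food made ...</meaning>
</entry>
"""

entry = ET.fromstring(ENTRY_XML)

# Map each language attribute value to the word it describes.
words = {w.get("language"): w.text for w in entry.findall("word")}
print(words)

# Elements without the attribute simply return None from get().
print(entry.find("meaning").get("language"))
```

Here the attribute tells the reader how to interpret the element content (which language the word is in), while the content itself stays in the element.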
When to Use Attributes
It’s not always clear when to use attributes. The same data can be written as a sub-element:
  <person>
    <ssno>123 4589</ssno>
    <name>L. Simpson</name>
    <email>lisa@macs.hw.ac.uk</email>
    ...
  </person>
or as an attribute:
  <person ssno="123 4589">
    <name>L. Simpson</name>
    <email>lisa@macs.hw.ac.uk</email>
    ...
  </person>
When to Use Attributes
It’s not always clear when to use attributes.
General rule:
• Use an attribute to describe how the data should be interpreted (e.g. language, currency)
• Use an attribute for “IDs”, i.e., identifying data (covered later)
Rules for XML (1)
• XML is order-sensitive: the order of elements matters, so the following are different:
  <entry>
    <word language="en">cheese</word>
    <word language="fr">fromage</word>
  </entry>
  <entry>
    <word language="fr">fromage</word>
    <word language="en">cheese</word>
  </entry>
• XML is case-sensitive, i.e., the following are different: <person>, <Person>, <PERSON>
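Both rules can be observed from a program. The sketch below assumes Python's standard `xml.etree.ElementTree`; the `<people>` wrapper element and the helper `word_sequence` are hypothetical names introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

en_first = ET.fromstring(
    '<entry><word language="en">cheese</word>'
    '<word language="fr">fromage</word></entry>'
)
fr_first = ET.fromstring(
    '<entry><word language="fr">fromage</word>'
    '<word language="en">cheese</word></entry>'
)

def word_sequence(entry):
    """List the word texts in document order."""
    return [w.text for w in entry.findall("word")]

# Same elements, different order: the two documents are not the same.
order_differs = word_sequence(en_first) != word_sequence(fr_first)

# Tag names are case-sensitive: only <person> matches the name "person".
doc = ET.fromstring("<people><person/><Person/><PERSON/></people>")
lowercase_matches = len(doc.findall("person"))

print(order_differs, lowercase_matches)  # prints: True 1
```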
Rules for XML (2)
• Tags come in pairs: <date> ... </date>
• They must be properly nested
  • Good: <date> ... <day> ... </day> ... </date>
  • Bad: <date> ... <day> ... </date> ... </day>
  • Bad: <date> ... </Date>
• There is a special shortcut for tags that have no text or sub-elements between them (empty elements, “bachelor tags”)
  • <img src="myPic.jpg" /> instead of
  • <img src="myPic.jpg"></img>
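That the shortcut and the long form mean the same thing can be checked with a small sketch, assuming Python's standard `xml.etree.ElementTree`. The helper `summary` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring('<img src="myPic.jpg" />')
long_form = ET.fromstring('<img src="myPic.jpg"></img>')

def summary(element):
    """Tag, attributes, text and number of children of an element."""
    return (element.tag, dict(element.attrib), element.text, len(element))

# Both spellings produce an element with the same tag and attributes,
# no text and no children.
print(summary(short_form) == summary(long_form))  # prints: True
```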
Rules for XML (3)
• There should be exactly one top-level element
  • This element is also called the root element
• Legal:
  <?xml version="1.0"?>
  <Question> This is legal </Question>
• Not legal (two top-level elements):
  <?xml version="1.0"?>
  <Question> Is this legal? </Question>
  <Answer> No. </Answer>
Well-Formed Documents
• A document is well-formed if
  • It has one top-level element
  • Tags come in properly nested, case-sensitive pairs
  • Empty elements may use the accepted shortcut />
  • Attribute values are enclosed in quotes
  • Attribute names are not repeated within a tag
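The well-formedness rules listed above can be tried out with any XML parser. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree`; the function name `is_well_formed` and the sample strings are illustrative and not from the slides. It checks well-formedness only, not whether the document contains the right elements.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return (True, None) if text parses, else (False, parser message)."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None

# One sample per rule; only the first is well-formed.
SAMPLES = {
    "one root, nested pairs": "<date><day>1</day></date>",
    "two top-level elements": "<Question>Is this legal?</Question><Answer>No.</Answer>",
    "badly nested": "<date><day>1</date></day>",
    "case mismatch": "<date>1</Date>",
    "unquoted attribute": "<word language=en>cheese</word>",
    "repeated attribute": '<person phone="1" phone="2"/>',
}

for label, sample in SAMPLES.items():
    ok, message = is_well_formed(sample)
    print(f"{label}: {'well-formed' if ok else 'NOT well-formed (' + message + ')'}")
```

The parser stops at the first violation it meets, so a document with several errors has to be fixed and re-checked one error at a time.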
Why is this not well-formed?
  <?xml version ="1.0" encoding="UTF-8" ?>
  <person phone= 0131-828 1234 phone=078-4701 7775 >
    <Name> <first>Homer <second>Simpson </first></second> </name>
  <person phone= 0131-828 1235 >
    <Name> <first>Lisa <second>Simpson </first></second> </name>
IDs and Referencing
• Unique elements can be identified with an id, and referred to from other elements
• In this way, relationships between elements can be shown without repetition
• E.g.
  • Each person has an ID. Each person can contain a reference to the ID of their mother, father or children
  • Books and authors can be listed, but each book may have more than one author and each author might write more than one book. So the book can contain references to its authors.
Referencing example
  <family>
    <person id="lisa" mother="marge" father="homer">
      <name> Lisa Simpson </name>
    </person>
    <person id="bart" mother="marge" father="homer">
      <name> Bart Simpson </name>
    </person>
    <person id="marge" children="bart lisa">
      <name> Marge Simpson </name>
    </person>
    <person id="homer" children="bart lisa">
      <name> Homer Simpson </name>
    </person>
  </family>
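A program can follow these references by first indexing the elements by id. The sketch below assumes Python's standard `xml.etree.ElementTree`; the helper names `name_of` and `children_of` are hypothetical. Without a DTD the parser treats `id`, `mother`, `father` and `children` as ordinary attributes, so uniqueness of ids and validity of references are not checked here.

```python
import xml.etree.ElementTree as ET

# The family document from the Referencing example slide.
FAMILY_XML = """
<family>
  <person id="lisa" mother="marge" father="homer"><name>Lisa Simpson</name></person>
  <person id="bart" mother="marge" father="homer"><name>Bart Simpson</name></person>
  <person id="marge" children="bart lisa"><name>Marge Simpson</name></person>
  <person id="homer" children="bart lisa"><name>Homer Simpson</name></person>
</family>
"""

family = ET.fromstring(FAMILY_XML)

# Index every person by id so that references can be followed.
by_id = {p.get("id"): p for p in family.findall("person")}

def name_of(person_id):
    """Look up a person by id and return their name."""
    return by_id[person_id].findtext("name").strip()

def children_of(person_id):
    """Resolve the space-separated 'children' references to names."""
    refs = by_id[person_id].get("children", "").split()
    return [name_of(ref) for ref in refs]

print(name_of(by_id["lisa"].get("mother")))  # prints: Marge Simpson
print(children_of("homer"))  # prints: ['Bart Simpson', 'Lisa Simpson']
```

Each name is stored once; the relationships are expressed purely through the id values, which is what makes the document a graph rather than a plain tree.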
XML Authoring
• There are many authoring tools available to facilitate the creation of XML documents
  • E.g., XMLSpy, XMetaL
• However, you may as well start off using a simple text editor, ideally an XML-aware one
  • XML is, after all, just a text file
  • You are then responsible for checking that the XML is correct!
Viewing and checking XML
• Perhaps the simplest way to check that XML is well-formed is to load it into your browser
• If well-formed XML is loaded into your browser, it will be displayed as a tree structure
Viewing and checking XML
• If incorrect XML is loaded into your browser, error messages will be displayed
Defining the structure of an XML file
• We can check if an XML file is well-formed
  • By looking at it, maybe
  • By loading it into a browser
    • If well-formed, it will be displayed
• However, how can we check that the well-formed file contains the correct elements in the correct quantities?
  • We need to write a specification for the XML file
  • See the next lecture
Exercise
• Write an example of an XML file containing 2 or 3 records which holds information about holiday homes for rent
  • Each home has an id, a name and a location
  • Additionally, each home has one or more sets of contact details. Contact details consist of a name and a phone number, and optionally an email address and website.
  • In your example, demonstrate optional or repeated elements