Learn about XML, an inline markup system with a tree structure consisting of elements and content. Explore how XML is used to add knowledge and content to data, its flexibility, and independence. Compare XML to HTML and practice creating XML structures in exercises.
XML – an introduction David Nathan ELDP training March 2010
XML • an in-line markup system • a single sequence of plain text only (but it can be Unicode) • equivalent to a tree structure • consists of elements and content • elements: tag syntax • entities: entity syntax • reserved characters: < > & " '
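A minimal sketch of how the reserved characters are handled in practice, using Python's standard library. The sample sentence is invented for illustration; `escape` always replaces `&`, `<` and `>`, and the extra mapping passed here is an assumption about how one might also want the two quote characters treated.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Invented sample text containing reserved characters
raw = 'Fish & chips cost < 5 "pounds"'

# escape() replaces &, < and > by default; the mapping adds the quote characters
safe = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(safe)

# Once escaped, the text can sit inside an element and the document still parses;
# the parser turns the entities back into the original characters
elem = ET.fromstring("<note>" + safe + "</note>")
print(elem.text)
```

The entities only exist in the written XML; a program reading the element gets the original characters back.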
XML syntax • structures are defined by tags in angle brackets, eg: <noun> • tags usually come in pairs: • a start/open tag and an end/close tag: the <noun> dog </noun> chased ... • but a tag can also be single and self-closing (an empty element): the dog <pause /> sat down
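The tag pairs and the single closed tag above can be read into a tree by a parser. This is a sketch only: the wrapping `<s>` element is an assumption added because an XML document needs exactly one root element, and `show` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# <s> is a hypothetical root element wrapped around the slide's example
doc = "<s>the <noun>dog</noun> <pause/> sat down</s>"
root = ET.fromstring(doc)

def show(elem, depth=0):
    """Return one indented line per element, with the text at its start."""
    text = (elem.text or "").strip()
    lines = ["  " * depth + elem.tag + (": " + text if text else "")]
    for child in elem:
        lines.extend(show(child, depth + 1))
    return lines

tree_lines = show(root)
print("\n".join(tree_lines))

# Text that follows a child element is stored on that child as its "tail"
print(repr(root.find("pause").tail))
```

Each element becomes a node, and the empty `<pause/>` element is a node with no content of its own.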
XML syntax • tags can have attributes with values: the <noun num="1"> dog </noun> sat down • you can name your tags, attributes or values (almost) anything • there are some restrictions: • you can have hierarchies, but not overlaps: • allowed (properly nested): <a>the <b><c>cat</c> sat</b> on the mat</a> • not allowed (overlapping): <a>the <b><c>cat</b> sat</c> on the mat</a>
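The difference between a hierarchy and an overlap can be checked mechanically, since a parser rejects overlapping tags. A minimal sketch using Python's standard library; `is_well_formed` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

nested = "<a>the <b><c>cat</c> sat</b> on the mat</a>"
overlapping = "<a>the <b><c>cat</b> sat</c> on the mat</a>"
print(is_well_formed(nested))       # properly nested
print(is_well_formed(overlapping))  # </b> arrives while <c> is still open

# Attributes are available as name/value pairs on the parsed element
noun = ET.fromstring('<noun num="1">dog</noun>')
print(noun.get("num"))
```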
XML is used to add knowledge ... • add knowledge to content: • usually structures and labels • add the knowledge that’s relevant to your domain or task • knowledge priorities: • what’s required • what’s visually represented (eg by format/layout) • what’s implicit
Compare to HTML • ... the man who really liked the book The Lawyer Who Lost, about habeas corpus ... • in HTML: ... the man who <i>really</i> liked the book <i>The Lawyer Who Lost</i>, about <i>habeas corpus</i> ... • in XML, we can define our own elements that focus on logical structure rather than visual format
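One way to picture the contrast: mark up the sentence with elements that say why each phrase is special, then derive the italic HTML from them. The element names `emph`, `bookTitle`, `foreign` and the root `s` are invented for this sketch and are not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical logical markup: three different reasons for the same italics
source = ("<s>the man who <emph>really</emph> liked the book "
          "<bookTitle>The Lawyer Who Lost</bookTitle>, about "
          "<foreign>habeas corpus</foreign></s>")
root = ET.fromstring(source)

# The logical markup can answer questions that <i> cannot, eg "which books?"
book_titles = [e.text for e in root.iter("bookTitle")]
print(book_titles)

# The visual format can still be produced: map every logical element to <i>
for elem in root.iter():
    if elem.tag in ("emph", "bookTitle", "foreign"):
        elem.tag = "i"
root.tag = "p"
html = ET.tostring(root, encoding="unicode")
print(html)
```

Going from logical markup to italics is easy; going back from three identical `<i>` elements to the three meanings is not.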
Compare to HTML • XML: • is flexible and extensible • must be well-formed • can be validated • is application-, platform-, and vendor- independent • is machine readable (ie parsable, or understandable by computer programs)
... in XML
<story>
  <metaDataField>The Guardian</metaDataField>
  <metaDataField>July 1, 1997</metaDataField>
  <metaDataField>Andrew Higgins in Hong Kong</metaDataField>
  <headLine>A last hurrah and an empire closes down</headLine>
  <p>With a clenched-jaw nod from the Prince of Wales, a last rendition of <title>God Save the Queen</title>, and a wind machine to keep the Union flag flying for a final 16 minutes of indoor pomp...</p>
</story>
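Because the story is marked up by role, a program can pull out individual pieces by name. A short sketch with Python's standard library, using the story text from the slide (line breaks inside the string are added for readability only).

```python
import xml.etree.ElementTree as ET

story_xml = """<story>
  <metaDataField>The Guardian</metaDataField>
  <metaDataField>July 1, 1997</metaDataField>
  <metaDataField>Andrew Higgins in Hong Kong</metaDataField>
  <headLine>A last hurrah and an empire closes down</headLine>
  <p>With a clenched-jaw nod from the Prince of Wales, a last rendition of
  <title>God Save the Queen</title>, and a wind machine to keep the Union flag
  flying for a final 16 minutes of indoor pomp...</p>
</story>"""

story = ET.fromstring(story_xml)

# Look up elements by tag name, at the top level or anywhere in the tree
headline = story.findtext("headLine").strip()
metadata = [m.text for m in story.findall("metaDataField")]
titles = [t.text for t in story.iter("title")]

print(headline)
print(metadata)
print(titles)
```

Note that the three `metaDataField` elements can only be told apart by position; more specific element names or attributes would make the markup carry more knowledge.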
Where does XML come from? • write “raw” XML (we will do this) • XML editors • generated, eg from databases, programs
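As an illustration of the third source, generated XML, here is a minimal sketch that builds XML from records held in a program. The records, the element names `lexicon` and `entry`, and the attribute names are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical records, standing in for rows from a database
records = [
    {"id": "w1", "form": "dog", "pos": "noun"},
    {"id": "w2", "form": "chased", "pos": "verb"},
]

lexicon = ET.Element("lexicon")
for rec in records:
    # Each record becomes one element; id and pos become attributes
    entry = ET.SubElement(lexicon, "entry", id=rec["id"], pos=rec["pos"])
    entry.text = rec["form"]

xml_text = ET.tostring(lexicon, encoding="unicode")
print(xml_text)
```

Generating XML through a library rather than by joining strings means the tags are always balanced and reserved characters are escaped automatically.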
What is XML used for? • any symbolic data • data exchange • data transformation • structure • format • content
Why do I need to know about it? • you already consume a lot of XML! • many linguistic software tools (eg ELAN) use XML as their data format • XML is very powerful and flexible, especially for certain tasks, and for archiving • XML is an open standard (a W3C Recommendation, derived from the ISO standard SGML) • XML is growing in use and support! • XML is easy!
XML exercise 1 James departed from Manilla on Wednesday 11 May and arrived in Boston on Thursday 12 May. • identify times and names, code these as XML • draw as a tree structure THEN • add more information to your XML as attributes • draw a tree structure again
XML exercise 2 • draw a simple linguistic tree structure • represent it as XML
Congratulations • you have now taken your first steps in XML