This guide introduces the Simple API for XML (SAX) and the Document Object Model (DOM), two technologies for parsing and manipulating XML data. SAX fires events during parsing and does not build a default object model of the document. In contrast, DOM represents the XML document as a tree of nodes, which makes elements and attributes easy to manipulate. Through practical examples, we explore creating custom object models, parsing XML, and reading and changing node values.
XML Technologies SAX and DOM
What is SAX? • Simple API for XML • Used to parse XML • Does not build a default object model of the document • It only fires events when it detects constructs such as • open or close tags • PCDATA or CDATA • comments • entities
Example • Imagine the following document: <?xml version = "1.0"?> <addressbook> <person> <lastname>Dingli</lastname> <firstname>Alexiei</firstname> <company>University of Malta</company> <email>alexiei.dingli@um.edu.mt</email> </person> </addressbook>
SAX in 3 steps • Creating a custom object model (like Person and AddressBook classes) • Creating a SAX parser • Creating a DocumentHandler (to turn your XML document into instances of your custom object model).
Custom Object Model (1) • Create both a Person and an AddressBook class • Give each class setters, getters, and a toXML method
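A minimal sketch of such an object model for the address book document shown earlier. The class names, field names, and the toXML method name are assumptions made for illustration; the slides only say that getters, setters, and an XML-producing method are needed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical object model for the address book example.
// None of these names come from the slides.
class AddressBookModel {

    static class Person {
        private String lastName;
        private String firstName;
        private String company;
        private String email;

        String getLastName() { return lastName; }
        void setLastName(String lastName) { this.lastName = lastName; }
        String getFirstName() { return firstName; }
        void setFirstName(String firstName) { this.firstName = firstName; }
        String getCompany() { return company; }
        void setCompany(String company) { this.company = company; }
        String getEmail() { return email; }
        void setEmail(String email) { this.email = email; }

        // Writes the person back out as XML. Values are copied as-is,
        // so this sketch does not escape characters such as '<' or '&'.
        String toXML() {
            return "<person>"
                + "<lastname>" + lastName + "</lastname>"
                + "<firstname>" + firstName + "</firstname>"
                + "<company>" + company + "</company>"
                + "<email>" + email + "</email>"
                + "</person>";
        }
    }

    static class AddressBook {
        private final List<Person> persons = new ArrayList<>();

        void addPerson(Person person) { persons.add(person); }
        List<Person> getPersons() { return persons; }

        // Wraps the XML of every person in the root element.
        String toXML() {
            StringBuilder xml = new StringBuilder("<addressbook>");
            for (Person person : persons) {
                xml.append(person.toXML());
            }
            return xml.append("</addressbook>").toString();
        }
    }

    public static void main(String[] args) {
        Person person = new Person();
        person.setLastName("Dingli");
        person.setFirstName("Alexiei");
        person.setCompany("University of Malta");
        person.setEmail("alexiei.dingli@um.edu.mt");

        AddressBook book = new AddressBook();
        book.addPerson(person);
        System.out.println(book.toXML());
    }
}
```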
Create a Document Handler (1) • Actually 4 Interfaces ... • The Document Handler • The Entity Resolver • The DTD Handler • The Error Handler
Setting the parser
parser.setDocumentHandler( ... )
parser.setDTDHandler( ... )
parser.setErrorHandler( ... )
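The calls above use the original SAX 1 Parser interface. In SAX 2 the same registration is done on an XMLReader, where setContentHandler takes the place of setDocumentHandler. The sketch below assumes the JAXP SAXParserFactory that ships with the JDK; the class name ParserSetup and the method newReader are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper that wires one handler object into a SAX 2 reader.
class ParserSetup {

    // DefaultHandler implements all four handler interfaces,
    // so the same object can be registered four times.
    static XMLReader newReader(DefaultHandler handler) {
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);  // SAX 2 successor of setDocumentHandler
            reader.setDTDHandler(handler);
            reader.setEntityResolver(handler);
            reader.setErrorHandler(handler);
            return reader;
        } catch (Exception e) {
            throw new IllegalStateException("could not create a SAX reader", e);
        }
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader = newReader(new DefaultHandler());
        // A DefaultHandler ignores every event, so this only checks the wiring.
        reader.parse(new InputSource(new StringReader("<addressbook/>")));
        System.out.println("parsed without errors");
    }
}
```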
Handler Class • Rather than implementing all the interfaces mentioned earlier • Make use of org.xml.sax.helpers.DefaultHandler • Which implements all the methods • And you simply override what you want to use • http://java.sun.com/j2se/1.4.2/docs/api/org/xml/sax/helpers/DefaultHandler.html
Example Handler SAX Handler
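A handler for the address book document could look roughly like the sketch below. It overrides only three DefaultHandler callbacks and, to stay self-contained, stores each person as a map from element name to text instead of a custom Person class. The class name AddressBookHandler and its parse helper are assumptions, not the handler from the original slide.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: turns <person> elements into maps of field name -> text.
class AddressBookHandler extends DefaultHandler {
    private final List<Map<String, String>> people = new ArrayList<>();
    private Map<String, String> current;                 // person being built, if any
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals("person")) {
            current = new LinkedHashMap<>();
        }
        text.setLength(0);                                // start collecting fresh text
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so append rather than assign.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("person")) {
            people.add(current);
            current = null;
        } else if (current != null) {
            current.put(qName, text.toString().trim());   // e.g. lastname -> Dingli
        }
    }

    // Parses the XML text and returns one map per <person> element.
    static List<Map<String, String>> parse(String xml) {
        try {
            AddressBookHandler handler = new AddressBookHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.people;
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version = \"1.0\"?><addressbook><person>"
            + "<lastname>Dingli</lastname><firstname>Alexiei</firstname>"
            + "<company>University of Malta</company>"
            + "<email>alexiei.dingli@um.edu.mt</email>"
            + "</person></addressbook>";
        System.out.println(parse(xml));
    }
}
```

Collecting text in a buffer and reading it in endElement is a common choice because a parser is free to deliver the character data of one element in several chunks.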
DOM • W3C standard • Standard way of accessing and manipulating documents • Divided into 3 parts • Core DOM (access any structured document) • XML DOM • HTML DOM • Presents the elements of a document as a tree structure
XML DOM • A standard object model for XML • A standard programming interface for XML • Platform- and language-independent • A W3C standard • The XML DOM is a standard for how to get, change, add, or delete XML elements
XML DOM rules • Everything in XML is a node • The entire document is a document node • Every XML element is an element node • The text in an XML element is a text node • Every attribute is an attribute node • Comments are comment nodes
Example <bookstore> <book category="web" cover="paperback"> <title lang="en">Learning XML</title> <year>2008</year> </book> </bookstore> • Bookstore is the root node • It contains one book node • A book node contains a title node and a year node • Title contains a text node “Learning XML” • 2008 is not the value of the year node but a text node inside the year node
The node tree • Every DOM document has a node tree • The top node of the tree is called the root • Every node, except the root, has exactly one parent node • A node can have any number of children • A leaf is a node with no children • Siblings are nodes with the same parent
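To make the tree concrete, the following sketch parses the bookstore example with the DOM implementation bundled with the JDK (org.w3c.dom) and prints one line per node, indented by depth. The class name NodeTreeDemo and its helper methods are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class that walks the node tree of the bookstore example.
class NodeTreeDemo {
    static final String XML =
        "<bookstore><book category=\"web\" cover=\"paperback\">"
        + "<title lang=\"en\">Learning XML</title><year>2008</year>"
        + "</book></bookstore>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    // Appends one line for this node, then recurses into its children.
    // Attributes are not children, so they do not show up in the walk.
    static void describe(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.getNodeName());
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.append(" = ").append(node.getNodeValue());
        }
        out.append('\n');
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            describe(child, depth + 1, out);
        }
    }

    static String tree(String xml) {
        StringBuilder out = new StringBuilder();
        describe(parse(xml), 0, out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(tree(XML));
    }
}
```

For the bookstore document this prints #document at the top, then bookstore, book, title, and year at increasing depth, with the two text nodes as leaves.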
Creating the XML
text="<bookstore>";
text=text+"<book>";
text=text+"<title>Everyday Italian</title>";
text=text+"<author>John Smith</author>";
text=text+"<year>2008</year>";
text=text+"</book>";
text=text+"</bookstore>";
Parsing the XML
try { // Internet Explorer
  xmlDoc=new ActiveXObject("Microsoft.XMLDOM");
  xmlDoc.async=false; // load synchronously
  xmlDoc.loadXML(text);
} catch(e) {
  try { // Firefox, Mozilla, Opera, etc.
    parser=new DOMParser();
    xmlDoc=parser.parseFromString(text,"text/xml");
  } catch(e) {
    alert(e.message);
  }
}
document.write("xmlDoc is loaded, ready for use");
XML DOM Methods • x.getElementsByTagName(name) - get all elements with a specified tag name • x.appendChild(node) - append a child node to x • x.removeChild(node) - remove a child node from x
XML DOM properties • x.nodeName - the name of x • x.nodeValue - the value of x • x.parentNode - the parent node of x • x.childNodes - the child nodes of x • x.attributes - the attribute nodes of x
Examples
document.write(xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue);
document.write("<br />");
document.write(xmlDoc.getElementsByTagName("author")[0].childNodes[0].nodeValue);
document.write("<br />");
document.write(xmlDoc.getElementsByTagName("year")[0].childNodes[0].nodeValue);
Accessing nodes • By using the getElementsByTagName() method • By looping through (traversing) the node tree • By navigating the node tree, using the node relationships
Example 1
xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue;
Example 2 x=xmlDoc.getElementsByTagName("title"); for ( i=0; i<x.length; i++) { document.write(x[i].childNodes[0].nodeValue); document.write("<br />"); }
Example 3
x=xmlDoc.getElementsByTagName("book")[0].childNodes;
y=xmlDoc.getElementsByTagName("book")[0].firstChild;
for (i=0;i<x.length;i++) {
  if (y.nodeType==1) { // Process only element nodes (type 1)
    document.write(y.nodeName + "<br />");
  }
  y=y.nextSibling;
}
Node properties • nodeName • nodeValue • nodeType
nodeName property • nodeName is read-only • nodeName of an element node is the same as the tag name • nodeName of an attribute node is the attribute name • nodeName of a text node is always #text • nodeName of the document node is always #document
nodeValue property • nodeValue for element nodes is null (an element has no value of its own; its text lives in a child text node) • nodeValue for text nodes is the text itself • nodeValue for attribute nodes is the attribute value
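The three node properties can be checked side by side. The sketch below uses the Java binding of the same W3C DOM interfaces, where the properties appear as getNodeType(), getNodeName(), and getNodeValue(). The class name NodePropertiesDemo and the describe helper are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class: prints type, name, and value for four kinds of node.
class NodePropertiesDemo {
    static final String XML =
        "<bookstore><book category=\"web\" cover=\"paperback\">"
        + "<title lang=\"en\">Learning XML</title><year>2008</year>"
        + "</book></bookstore>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    // Formats a node as "<nodeType> <nodeName> <nodeValue>".
    static String describe(Node node) {
        return node.getNodeType() + " " + node.getNodeName() + " " + node.getNodeValue();
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        Element title = (Element) doc.getElementsByTagName("title").item(0);

        System.out.println(describe(doc));                          // 9 #document null
        System.out.println(describe(title));                        // 1 title null
        System.out.println(describe(title.getAttributeNode("lang"))); // 2 lang en
        System.out.println(describe(title.getFirstChild()));        // 3 #text Learning XML
    }
}
```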
Accessing node attributes
x=xmlDoc.getElementsByTagName("book")[0].attributes;
document.write(x.getNamedItem("category").nodeValue);
Traversing Example
// documentElement always represents the root node
x=xmlDoc.documentElement.childNodes;
for (i=0;i<x.length;i++) {
  if (x[i].nodeType!=1) continue; // skip whitespace text nodes and comments
  document.write(x[i].nodeName);
  document.write(": ");
  document.write(x[i].childNodes[0].nodeValue);
  document.write("<br />");
}
Navigating Nodes (1) • parentNode • childNodes • firstChild • lastChild • nextSibling • previousSibling
Getting the node value
x=xmlDoc.getElementsByTagName("title")[0];
y=x.childNodes[0];
txt=y.nodeValue;
Result: txt holds the title of the book. The value is read from the text node inside the title node (title node > text node).
Setting the node value x=xmlDoc.getElementsByTagName("title")[0].childNodes[0]; x.nodeValue="Easy Cooking";
Removing Nodes
y=xmlDoc.getElementsByTagName("book")[0];
xmlDoc.documentElement.removeChild(y);
Or
y.parentNode.removeChild(y);
Creating nodes newel=xmlDoc.createElement("edition"); x=xmlDoc.getElementsByTagName("book")[0]; x.appendChild(newel);
Creating text nodes newel=xmlDoc.createElement("edition"); newtext=xmlDoc.createTextNode("first"); newel.appendChild(newtext); x=xmlDoc.getElementsByTagName("book")[0]; x.appendChild(newel);
Create CDATA nodes newCDATA=xmlDoc.createCDATASection("Special Offer & Book Sale"); x=xmlDoc.getElementsByTagName("book")[0]; x.appendChild(newCDATA);
Create Comment Node newComment=xmlDoc.createComment("Revised March 2008"); x=xmlDoc.getElementsByTagName("book")[0]; x.appendChild(newComment);
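The four creation slides can be combined into one runnable sketch. It uses the Java binding of the DOM (org.w3c.dom), whose factory methods carry the same names as in the JavaScript slides, and the bookstore text built earlier. The class name CreateNodesDemo and its helper methods are assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class: appends an element, a CDATA section, and a comment.
class CreateNodesDemo {
    static final String XML =
        "<bookstore><book><title>Everyday Italian</title>"
        + "<author>John Smith</author><year>2008</year></book></bookstore>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    // Adds <edition>first</edition>, a CDATA section, and a comment to the first book.
    static void addNodes(Document doc) {
        Node book = doc.getElementsByTagName("book").item(0);

        Element edition = doc.createElement("edition");
        edition.appendChild(doc.createTextNode("first"));  // the text is a child node
        book.appendChild(edition);

        book.appendChild(doc.createCDATASection("Special Offer & Book Sale"));
        book.appendChild(doc.createComment("Revised March 2008"));
    }

    // Joins the node names of all children with commas.
    static String childNames(Node parent) {
        StringBuilder names = new StringBuilder();
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (names.length() > 0) {
                names.append(',');
            }
            names.append(child.getNodeName());
        }
        return names.toString();
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        addNodes(doc);
        // title,author,year,edition,#cdata-section,#comment
        System.out.println(childNames(doc.getElementsByTagName("book").item(0)));
    }
}
```

Nodes are always created through the owning document and only become part of the tree once appendChild (or insertBefore) attaches them to a parent.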
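Additional methods
x.appendChild(newNode) // add newNode as the last child of x
x.insertBefore(newNode,y) // insert newNode before the existing child y
x.cloneNode(true) // copy x; true also copies all descendants (attributes are always copied)
x.insertData(offset,"Easy "); // insert text at offset (x must be a text node)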
Getting attribute value x=xmlDoc.getElementsByTagName("title")[0].getAttributeNode("lang"); txt=x.nodeValue;
Creating attributes newatt=xmlDoc.createAttribute("edition"); newatt.nodeValue="first"; x=xmlDoc.getElementsByTagName("title"); x[0].setAttributeNode(newatt);
Setting the attribute value
x=xmlDoc.getElementsByTagName("book");
x[0].setAttribute("category","food");
Or
x=xmlDoc.getElementsByTagName("book")[0];
y=x.getAttributeNode("category");
y.nodeValue="food";
Removing attributes x=xmlDoc.getElementsByTagName("book"); x[0].removeAttribute("category");
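The attribute operations from the last four slides, gathered into one sketch against the bookstore example. It uses the Java binding of the DOM (org.w3c.dom), where the method names match the JavaScript slides. The class name AttributeDemo and the run method are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class: reads, sets, creates, and removes attributes.
class AttributeDemo {
    static final String XML =
        "<bookstore><book category=\"web\" cover=\"paperback\">"
        + "<title lang=\"en\">Learning XML</title><year>2008</year>"
        + "</book></bookstore>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parsing failed", e);
        }
    }

    // Returns the observed values joined with '|' so they are easy to inspect.
    static String run() {
        Document doc = parse(XML);
        Element book = (Element) doc.getElementsByTagName("book").item(0);
        Element title = (Element) doc.getElementsByTagName("title").item(0);

        // Getting an attribute value through its attribute node.
        String lang = title.getAttributeNode("lang").getNodeValue();

        // Setting an existing attribute.
        book.setAttribute("category", "food");
        String category = book.getAttribute("category");

        // Creating a new attribute node and attaching it to an element.
        Attr edition = doc.createAttribute("edition");
        edition.setNodeValue("first");
        title.setAttributeNode(edition);

        // Removing an attribute.
        book.removeAttribute("category");

        return lang + "|" + category + "|" + title.getAttribute("edition")
            + "|" + book.hasAttribute("category");
    }

    public static void main(String[] args) {
        System.out.println(run());  // en|food|first|false
    }
}
```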
Exercise • Given the following XML file • How shall we display • Two buttons • “Get CD info” and display the Titles and the Composer • “Get CD info abridged” and display the Titles only
Answer (1) • The code • What’s the result?