XML Technologies: SAX and DOM
What is SAX?
• Simple API for XML
• Used to parse XML
• Does not build an object model of the document by default
• It just fires events when it detects items such as
  • open or close tags
  • PCDATA or CDATA
  • comments
  • entities
Example
• Imagine the following document:
<?xml version="1.0"?>
<addressbook>
  <person>
    <lastname>Dingli</lastname>
    <firstname>Alexiei</firstname>
    <company>University of Malta</company>
    <email>alexiei.dingli@um.edu.mt</email>
  </person>
</addressbook>
SAX in 3 steps
• Create a custom object model (such as Person and AddressBook classes)
• Create a SAX parser
• Create a DocumentHandler that turns your XML document into instances of your custom object model
Custom Object Model (1)
• Create both a Person class and an AddressBook class
• Give each class setters, getters, and a method that serializes it to XML
Create a Document Handler (1)
• Actually 4 interfaces:
  • The Document Handler
  • The Entity Resolver
  • The DTD Handler
  • The Error Handler
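Setting the parser
parser.setDocumentHandler( ... )
parser.setEntityResolver( ... )
parser.setDTDHandler( ... )
parser.setErrorHandler( ... )
• In SAX2, DocumentHandler is replaced by ContentHandler, so the first call becomes setContentHandler( ... ) on an XMLReader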
Handler Class
• Rather than implementing all the interfaces mentioned earlier yourself
• Make use of org.xml.sax.helpers.DefaultHandler
• It provides default (do-nothing) implementations of all their methods
• You simply override the ones you want to use
• http://java.sun.com/j2se/1.4.2/docs/api/org/xml/sax/helpers/DefaultHandler.html
Example Handler SAX Handler
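The three SAX steps above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's standard xml.sax module rather than the Java API the slides refer to, the class name AddressBookHandler is hypothetical, and each person is stored as a plain dictionary instead of a full Person class.

```python
import xml.sax

XML = """<?xml version="1.0"?>
<addressbook>
  <person>
    <lastname>Dingli</lastname>
    <firstname>Alexiei</firstname>
    <company>University of Malta</company>
    <email>alexiei.dingli@um.edu.mt</email>
  </person>
</addressbook>"""


class AddressBookHandler(xml.sax.ContentHandler):
    """Hypothetical handler: collects each <person> as a dict."""

    def __init__(self):
        super().__init__()
        self.people = []      # finished person records
        self._person = None   # record being built while inside <person>
        self._text = []       # character chunks of the current element

    def startElement(self, name, attrs):
        if name == "person":
            self._person = {}
        self._text = []

    def characters(self, content):
        # The parser may deliver one run of text in several calls.
        self._text.append(content)

    def endElement(self, name):
        if name == "person":
            self.people.append(self._person)
            self._person = None
        elif self._person is not None:
            self._person[name] = "".join(self._text).strip()
        self._text = []


handler = AddressBookHandler()
xml.sax.parseString(XML.encode("utf-8"), handler)
print(handler.people[0]["lastname"])
```

Only the three callbacks that matter here are overridden; the base class supplies empty versions of the rest, which is the same convenience DefaultHandler offers in Java.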
DOM
• W3C standard
• Standard way of accessing and manipulating documents
• Divided into 3 parts
  • Core DOM (access any structured document)
  • XML DOM
  • HTML DOM
• Presents the document's elements as a tree structure
XML DOM
• A standard object model for XML
• A standard programming interface for XML
• Platform- and language-independent
• A W3C standard
• The XML DOM is a standard for how to get, change, add, or delete XML elements
XML DOM rules
Everything in XML is a node
• The entire document is a document node
• Every XML element is an element node
• The text inside an XML element is a text node
• Every attribute is an attribute node
• Comments are comment nodes
Example
<bookstore>
  <book category="web" cover="paperback">
    <title lang="en">Learning XML</title>
    <year>2008</year>
  </book>
</bookstore>
• bookstore is the root element
• It contains one book node
• The book node contains a title node and a year node
• title contains a text node, “Learning XML”
• 2008 is not the value of the year node but a text node inside the year node
The node tree
• Any DOM document can be viewed as a node tree, where
  • The top node is called the root
  • Every node, except the root, has exactly one parent node
  • A node can have any number of children
  • A leaf is a node with no children
  • Siblings are nodes with the same parent
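To make the tree rules concrete, here is a small sketch that prints the bookstore example as an indented outline. It assumes Python's standard xml.dom.minidom module as a stand-in for a browser DOM, and the helper name outline is hypothetical.

```python
from xml.dom import minidom, Node

doc = minidom.parseString(
    '<bookstore><book category="web" cover="paperback">'
    '<title lang="en">Learning XML</title><year>2008</year>'
    '</book></bookstore>'
)


def outline(node, depth=0, lines=None):
    """Return one indented line per element or text node under `node`."""
    if lines is None:
        lines = []
    if node.nodeType == Node.ELEMENT_NODE:
        lines.append("  " * depth + node.nodeName)
    elif node.nodeType == Node.TEXT_NODE:
        lines.append("  " * depth + repr(node.nodeValue))
    for child in node.childNodes:
        outline(child, depth + 1, lines)
    return lines


tree = outline(doc.documentElement)
print("\n".join(tree))
```

The text "2008" appears one level below year, which is the point made earlier: it is a child text node, not the value of the year element.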
Creating the XML
text="<bookstore>";
text=text+"<book>";
text=text+"<title>Everyday Italian</title>";
text=text+"<author>John Smith</author>";
text=text+"<year>2008</year>";
text=text+"</book>";
text=text+"</bookstore>";
Parsing the XML
try { // Internet Explorer
  xmlDoc=new ActiveXObject("Microsoft.XMLDOM");
  xmlDoc.async=false; // load synchronously
  xmlDoc.loadXML(text);
} catch(e) {
  try { // Firefox, Mozilla, Opera, etc.
    parser=new DOMParser();
    xmlDoc=parser.parseFromString(text,"text/xml");
  } catch(e) {
    alert(e.message);
  }
}
document.write("xmlDoc is loaded, ready for use");
XML DOM Methods
• x.getElementsByTagName(name) - get all elements with a specified tag name
• x.appendChild(node) - append a child node to x
• x.removeChild(node) - remove a child node from x
XML DOM properties
• x.nodeName - the name of x
• x.nodeValue - the value of x
• x.parentNode - the parent node of x
• x.childNodes - the child nodes of x
• x.attributes - the attribute nodes of x
Examples
document.write(xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue);
document.write("<br />");
document.write(xmlDoc.getElementsByTagName("author")[0].childNodes[0].nodeValue);
document.write("<br />");
document.write(xmlDoc.getElementsByTagName("year")[0].childNodes[0].nodeValue);
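For readers without a browser at hand, the same parse-then-read sequence can be tried with Python's standard xml.dom.minidom, which exposes the same method and property names. This is a sketch, not the slides' own code; the helper first_text is hypothetical.

```python
from xml.dom import minidom

# The same document the slides build up as a string.
text = (
    "<bookstore><book>"
    "<title>Everyday Italian</title>"
    "<author>John Smith</author>"
    "<year>2008</year>"
    "</book></bookstore>"
)
xml_doc = minidom.parseString(text)


def first_text(doc, tag):
    """Text of the first <tag> element (hypothetical helper)."""
    return doc.getElementsByTagName(tag)[0].childNodes[0].nodeValue


values = [first_text(xml_doc, tag) for tag in ("title", "author", "year")]
print(values)
```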
Accessing nodes
• By using the getElementsByTagName() method
• By looping through (traversing) the node tree
• By navigating the node tree, using the node relationships
Example 1
xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue;
Example 2
x=xmlDoc.getElementsByTagName("title");
for (i=0;i<x.length;i++) {
  document.write(x[i].childNodes[0].nodeValue);
  document.write("<br />");
}
Example 3
x=xmlDoc.getElementsByTagName("book")[0].childNodes;
y=xmlDoc.getElementsByTagName("book")[0].firstChild;
for (i=0;i<x.length;i++) {
  if (y.nodeType==1) { // Process only element nodes (type 1)
    document.write(y.nodeName + "<br />");
  }
  y=y.nextSibling;
}
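The nodeType check in Example 3 matters as soon as the XML contains line breaks or indentation, because that whitespace shows up as extra text nodes between the elements. A minimal sketch, assuming Python's xml.dom.minidom and an indented copy of the bookstore document:

```python
from xml.dom import minidom, Node

text = """<bookstore>
  <book>
    <title>Everyday Italian</title>
    <author>John Smith</author>
    <year>2008</year>
  </book>
</bookstore>"""
xml_doc = minidom.parseString(text)
book = xml_doc.getElementsByTagName("book")[0]

# Walk the children through the sibling links, keeping element nodes only.
names = []
node = book.firstChild
while node is not None:
    if node.nodeType == Node.ELEMENT_NODE:
        names.append(node.nodeName)
    node = node.nextSibling

print(len(book.childNodes), names)
```

Here book has seven child nodes (four whitespace text nodes plus three elements), yet only the three element names are collected.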
Node properties
• nodeName
• nodeValue
• nodeType
nodeName property
• nodeName is read-only
• nodeName of an element node is the same as the tag name
• nodeName of an attribute node is the attribute name
• nodeName of a text node is always #text
• nodeName of the document node is always #document
nodeValue property
• nodeValue for element nodes is null
• nodeValue for text nodes is the text itself
• nodeValue for attribute nodes is the attribute value
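The nodeName and nodeValue rules can be checked side by side on a one-element document. This sketch assumes Python's xml.dom.minidom, where a missing value is reported as None rather than null.

```python
from xml.dom import minidom

doc = minidom.parseString('<title lang="en">Learning XML</title>')
element = doc.documentElement
text_node = element.firstChild
attr = element.getAttributeNode("lang")

# (nodeName, nodeValue) for each kind of node
rows = [
    (doc.nodeName, doc.nodeValue),
    (element.nodeName, element.nodeValue),
    (text_node.nodeName, text_node.nodeValue),
    (attr.nodeName, attr.nodeValue),
]
for name, value in rows:
    print(name, value)
```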
Accessing node attributes
x=xmlDoc.getElementsByTagName("book")[0].attributes;
document.write(x.getNamedItem("category").nodeValue);
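Traversing Example
// documentElement always represents the root element
// Assumes each child element of the root directly contains a text node
x=xmlDoc.documentElement.childNodes;
for (i=0;i<x.length;i++) {
  if (x[i].nodeType==1) { // skip whitespace text nodes between elements
    document.write(x[i].nodeName);
    document.write(": ");
    document.write(x[i].childNodes[0].nodeValue);
    document.write("<br />");
  }
}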
Navigating Nodes (1)
• parentNode
• childNodes
• firstChild
• lastChild
• nextSibling
• previousSibling
Getting the node value
x=xmlDoc.getElementsByTagName("title")[0];
y=x.childNodes[0];
txt=y.nodeValue;
• Result: txt holds the name of the book
• Path followed: title node > text node
Setting the node value
x=xmlDoc.getElementsByTagName("title")[0].childNodes[0];
x.nodeValue="Easy Cooking";
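Reading and then overwriting the text of the title element looks like this in a runnable form. The sketch assumes Python's xml.dom.minidom and the "Everyday Italian" document from the earlier slides.

```python
from xml.dom import minidom

xml_doc = minidom.parseString(
    "<bookstore><book><title>Everyday Italian</title></book></bookstore>"
)

# The value lives on the text node inside <title>, not on <title> itself.
title_text = xml_doc.getElementsByTagName("title")[0].childNodes[0]
before = title_text.nodeValue
title_text.nodeValue = "Easy Cooking"
after = xml_doc.getElementsByTagName("title")[0].toxml()

print(before, "->", after)
```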
Removing Nodes
y=xmlDoc.getElementsByTagName("book")[0];
xmlDoc.documentElement.removeChild(y);
Or, as an alternative to the last line, without naming the parent explicitly:
y.parentNode.removeChild(y);
Creating nodes
newel=xmlDoc.createElement("edition");
x=xmlDoc.getElementsByTagName("book")[0];
x.appendChild(newel);
Creating text nodes
newel=xmlDoc.createElement("edition");
newtext=xmlDoc.createTextNode("first");
newel.appendChild(newtext);
x=xmlDoc.getElementsByTagName("book")[0];
x.appendChild(newel);
Create CDATA nodes
newCDATA=xmlDoc.createCDATASection("Special Offer & Book Sale");
x=xmlDoc.getElementsByTagName("book")[0];
x.appendChild(newCDATA);
Create Comment Node
newComment=xmlDoc.createComment("Revised March 2008");
x=xmlDoc.getElementsByTagName("book")[0];
x.appendChild(newComment);
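The create and remove operations from the last few slides can be combined into one short run. This is a sketch using Python's xml.dom.minidom, whose Document and Node objects offer the same createElement, createTextNode, createComment, appendChild and removeChild calls.

```python
from xml.dom import minidom

xml_doc = minidom.parseString(
    "<bookstore><book><title>Everyday Italian</title></book></bookstore>"
)
book = xml_doc.getElementsByTagName("book")[0]

# Create <edition>first</edition> and attach it to the book.
edition = xml_doc.createElement("edition")
edition.appendChild(xml_doc.createTextNode("first"))
book.appendChild(edition)

# Attach a comment node as well.
book.appendChild(xml_doc.createComment("Revised March 2008"))

# Remove the title through its parent.
title = xml_doc.getElementsByTagName("title")[0]
title.parentNode.removeChild(title)

result = book.toxml()
print(result)
```

A new node belongs to the document as soon as it is created, but it only becomes part of the tree once appendChild (or a similar call) attaches it to a parent.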
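Additional methods
x.appendChild(newNode) // add newNode as the last child of x
x.insertBefore(newNode,y) // add newNode as a child of x, before child y
x.cloneNode(true) // copy x; also copies all attributes and children if true
x.insertData(offset,"Easy "); // insert text into a text node x at the given offset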
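A short sketch of insertBefore, insertData and cloneNode in action, again assuming Python's xml.dom.minidom as the DOM implementation and a shortened bookstore document made up for the illustration:

```python
from xml.dom import minidom

xml_doc = minidom.parseString(
    "<bookstore><book><title>Italian</title><year>2008</year></book></bookstore>"
)
root = xml_doc.documentElement
book = root.firstChild
year = book.lastChild

# insertBefore: place a new <author> element in front of <year>.
author = xml_doc.createElement("author")
book.insertBefore(author, year)

# insertData works on the text node, not on the element.
title_text = book.firstChild.firstChild
title_text.insertData(0, "Everyday ")

# cloneNode(True) makes a deep copy, children included.
copy = book.cloneNode(True)
root.appendChild(copy)

order = [n.nodeName for n in book.childNodes]
print(order, title_text.data, len(root.childNodes))
```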
Getting attribute value
x=xmlDoc.getElementsByTagName("title")[0].getAttributeNode("lang");
txt=x.nodeValue;
Creating attributes
newatt=xmlDoc.createAttribute("edition");
newatt.nodeValue="first";
x=xmlDoc.getElementsByTagName("title");
x[0].setAttributeNode(newatt);
Setting the attribute value
x=xmlDoc.getElementsByTagName("book");
x[0].setAttribute("category","food");
Or
x=xmlDoc.getElementsByTagName("book")[0];
y=x.getAttributeNode("category");
y.nodeValue="food";
Removing attributes
x=xmlDoc.getElementsByTagName("book");
x[0].removeAttribute("category");
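The four attribute operations (read, set, create, remove) can be exercised together. This is a minimal sketch with Python's xml.dom.minidom, using the category and lang attributes from the earlier bookstore example.

```python
from xml.dom import minidom

xml_doc = minidom.parseString(
    '<bookstore><book category="web">'
    '<title lang="en">Learning XML</title>'
    '</book></bookstore>'
)
book = xml_doc.getElementsByTagName("book")[0]
title = xml_doc.getElementsByTagName("title")[0]

lang = title.getAttributeNode("lang").nodeValue  # read through the attribute node
book.setAttribute("category", "food")            # overwrite an existing attribute
title.setAttribute("edition", "first")           # create a new attribute
book_category = book.getAttribute("category")
title.removeAttribute("lang")                    # remove an attribute
has_lang = title.hasAttribute("lang")

print(lang, book_category, title.getAttribute("edition"), has_lang)
```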
Exercise
• Given the following XML file, how shall we display two buttons:
  • “Get CD info”, which displays the titles and the composers
  • “Get CD info abridged”, which displays the titles only
Answer (1)
• The code
• What’s the result?