XML Technology in E-Commerce
Lecture 3: DOM and SAX
Lecture Outline
• General Model for XML Processing
• Document Object Model (DOM)
  • Logical Model;
  • DOM Interfaces;
  • Example;
• Simple API for XML (SAX)
  • Parser Architecture;
  • Events;
  • Java Classes and Interfaces;
  • Example. Error Handling;
• Summary
General Model for XML Processing
• DOM - Document Object Model
• SAX - Simple API for XML
DOM
• DOM defines:
  • A logical model for XML documents;
  • Platform- and language-independent application programming interfaces for manipulating the model;
• DOM allows:
  • accessing document content;
  • modifying document content;
  • creating new documents in memory;
• DOM homepage:
  • http://www.w3.org/DOM/
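The three capabilities listed above (creating, modifying and accessing content) can be sketched with the Java binding of DOM. This is a minimal illustration, not part of the lecture demo: the class name `CreateDocumentSketch` and its helper methods are hypothetical, and the element names are borrowed from the competition example used later in the lecture. The document is obtained through JAXP's `DocumentBuilder.newDocument()`, which is one common way to get an empty `Document` in Java.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class name; a sketch only.
class CreateDocumentSketch {

    /** Creates a small document in memory, without reading any file. */
    static Document build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element competition = doc.createElement("competition");
            Element results = doc.createElement("results");
            Element name = doc.createElement("name");
            name.appendChild(doc.createTextNode("John Smith"));
            results.appendChild(name);
            competition.appendChild(results);
            doc.appendChild(competition); // competition becomes the root element
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no DOM implementation available", e);
        }
    }

    /** Modifies content: overwrites the text of the first <name> element. */
    static void rename(Document doc, String newName) {
        doc.getElementsByTagName("name").item(0).getFirstChild().setNodeValue(newName);
    }

    /** Accesses content: reads the text of the first <name> element. */
    static String firstName(Document doc) {
        return doc.getElementsByTagName("name").item(0).getFirstChild().getNodeValue();
    }

    public static void main(String[] args) {
        Document doc = build();
        System.out.println(firstName(doc));
        rename(doc, "Derek Warwick");
        System.out.println(firstName(doc));
    }
}
```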
DOM Logical Model
• XML Document:
    <competition>
      <results>
        <name>John Smith</name>
        <name>Derek Warwick</name>
        <name>Mik Douglas</name>
      </results>
      <photos>
        <img src="img1.gif"/>
        <img src="img2.gif"/>
      </photos>
    </competition>
• DOM Tree Structure:
  • [Figure: DOM tree. The root competition has the children results and photos; results has three name children holding the text John Smith, D. Warwick and M. Douglas; photos has two img children.]
  • An XML document is a set of nodes that form a tree structure. There are different node types: for elements, attributes, text content, etc.
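To make the node types concrete, the sketch below parses the competition document and prints one line per node, indented by depth. It is an illustration under stated assumptions: the class name `DomTreeSketch` is hypothetical, the XML is written on one line so that no whitespace-only text nodes appear, and a JAXP parser is used to build the tree. Attributes are listed separately because in DOM they are not children of their element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical class name; a sketch only.
class DomTreeSketch {
    // The competition document from the slide, without indentation whitespace.
    static final String XML =
        "<competition><results><name>John Smith</name><name>Derek Warwick</name>"
        + "<name>Mik Douglas</name></results>"
        + "<photos><img src=\"img1.gif\"/><img src=\"img2.gif\"/></photos></competition>";

    /** Parses the XML and returns an indented listing of all nodes. */
    static String dump(String xml) {
        try {
            Node doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc, 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    private static void walk(Node node, int depth, StringBuilder out) {
        line(out, depth, describe(node));
        NamedNodeMap attrs = node.getAttributes(); // null unless node is an element
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                line(out, depth + 1, describe(attrs.item(i)));
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, depth + 1, out);
        }
    }

    private static String describe(Node node) {
        switch (node.getNodeType()) {
            case Node.DOCUMENT_NODE:  return "Document";
            case Node.ELEMENT_NODE:   return "Element " + node.getNodeName();
            case Node.ATTRIBUTE_NODE: return "Attr " + node.getNodeName() + "=" + node.getNodeValue();
            case Node.TEXT_NODE:      return "Text \"" + node.getNodeValue() + "\"";
            default:                  return "Other " + node.getNodeName();
        }
    }

    private static void line(StringBuilder out, int depth, String text) {
        for (int i = 0; i < depth; i++) out.append("  ");
        out.append(text).append('\n');
    }

    public static void main(String[] args) {
        System.out.print(dump(XML));
    }
}
```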
DOM Programming Interfaces
• DOM interfaces are defined in the Interface Definition Language (IDL);
• There are bindings from IDL to different languages: JavaScript, C++, Java, Python
DOM Interface Hierarchy
• [Figure: hierarchy of the more important interfaces defined in the Java package org.w3c.dom]
DOM Interface Details
• DOM provides two groups of interfaces:
  • Generic: Node, NodeList, NamedNodeMap;
  • Specialized: Node subinterfaces for elements, attributes, text nodes, etc.
• Interfaces:
  • Node - Deitel 8.5, Fig. 8.7, page 201;
  • Document - Deitel 8.5, Fig. 8.5, page 200;
  • Element - Deitel 8.5, Fig. 8.9, page 200;
  • Attr;
  • Text;
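The difference between the generic and the specialized interfaces can be seen in a short sketch that reads the src attribute of every img element. The class name `DomInterfacesSketch` is hypothetical and the XML is passed in as a string rather than read from a file. The loop works through `Node`, `NodeList` and `NamedNodeMap` only; the comment shows the shorter route through the specialized `Element` interface.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; a sketch only.
class DomInterfacesSketch {

    /** Returns the value of the src attribute of every img element, in document order. */
    static List<String> imageSources(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> sources = new ArrayList<>();

            // Generic interfaces: NodeList holds Nodes, NamedNodeMap holds attributes by name.
            NodeList images = document.getElementsByTagName("img");
            for (int i = 0; i < images.getLength(); i++) {
                Node img = images.item(i);
                NamedNodeMap attributes = img.getAttributes();
                Attr src = (Attr) attributes.getNamedItem("src"); // specialized: Attr
                if (src != null) {
                    sources.add(src.getValue());
                }
                // Specialized alternative: ((Element) img).getAttribute("src")
            }
            return sources;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<photos><img src=\"img1.gif\"/><img src=\"img2.gif\"/></photos>";
        System.out.println(imageSources(xml));
    }
}
```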
DOM Demo
• Demo - Example on fig. 8.10, Deitel 8.5, page 202;
• Tools:
  • Java 1.2.2;
    • http://java.sun.com/products/jdk/1.2/
  • Java API for XML Processing (JAXP) 1.0.1;
    • http://java.sun.com/xml/archive.html
    • Classes: jaxp.jar and parser.jar;
• Demo files:
  • ReplaceText.java;
  • MyErrorHandler.java;
  • intro.xml;
DOM Demo Explained (1)
• Importing packages (java.io is needed for the File and FileOutputStream classes used in the following steps):
    import java.io.*;
    import org.w3c.dom.*;
    import org.xml.sax.*;
    import javax.xml.parsers.*;
    import com.sun.xml.tree.XmlDocument;
• Instantiation of the parser. DOM does not specify parser instantiation, so this is an implementation-specific detail (the demo uses the JAXP factory classes):
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setValidating( true );
    DocumentBuilder builder = factory.newDocumentBuilder();
DOM Demo Explained (2)
• Loading and parsing the XML file:
    Document document = builder.parse( new File( "intro.xml" ) );
• Getting the root element (myMessage):
    Node root = document.getDocumentElement();
• Casting the root to the Element type:
    Element myMessageNode = ( Element ) root;
• Finding the message elements:
    NodeList messageNodes = myMessageNode.getElementsByTagName( "message" );
• Getting the first message element:
    Node message = messageNodes.item( 0 );
DOM Demo Explained (3)
• Creating a new text node and replacing the old one:
    Text newText = document.createTextNode( "New Changed Message!!" );
    Text oldText = ( Text ) message.getChildNodes().item( 0 );
    message.replaceChild( newText, oldText );
• Writing the changed document to a new file. DOM does not specify how to save the DOM structure, so this is an implementation-specific detail:
    ( ( XmlDocument ) document ).write( new FileOutputStream( "intro1.xml" ) );
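The demo steps can be assembled into one runnable sketch. It is not the book's ReplaceText.java: the class name `ReplaceTextSketch` is hypothetical, the input is passed as a string because the content of intro.xml is not shown here (the sample in `main` is an assumed stand-in), validation is left off because no DTD is available, and the implementation-specific `XmlDocument.write` call is swapped for the standard JAXP `Transformer`, which later JAXP versions provide for serializing a DOM tree.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical class name; a sketch only.
class ReplaceTextSketch {

    /** Replaces the text of the first message element and returns the serialized document. */
    static String replaceFirstMessage(String xml, String replacement) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.parse(new InputSource(new StringReader(xml)));

            // Same navigation as in the demo: root element, then the first message element.
            Element root = document.getDocumentElement();
            Node message = root.getElementsByTagName("message").item(0);

            // Assumes the message element starts with a text node, as in the demo.
            Text newText = document.createTextNode(replacement);
            Node oldText = message.getFirstChild();
            message.replaceChild(newText, oldText);

            // Serialize with an identity transformation instead of XmlDocument.write.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed stand-in for intro.xml.
        String xml = "<myMessage><message>Welcome to XML!</message></myMessage>";
        System.out.println(replaceFirstMessage(xml, "New Changed Message!!"));
    }
}
```

To write to a file as the demo does, the `StreamResult` can wrap a `FileOutputStream` instead of a `StringWriter`.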
DOM Specification Levels
• DOM level 1 (discussed here);
• DOM level 2:
  • Namespace support;
  • Stylesheets interface;
  • Model for events;
  • Views, Range and Traversal interfaces;
• DOM level 3 (work in progress):
  • Loading and saving documents;
  • Model for DTD and Schema;
SAX
• SAX - Simple API for XML;
• Developed by the members of the XML-DEV list in 1998;
• SAX is event-based:
  • The parser reports parsing events: start and end of the document, start and end of an element, errors, etc.
  • When an event occurs, the parser invokes a method on an event handler;
  • The application handles the events accordingly;
• SAX home page: http://www.megginson.com/SAX/
SAX Parser Architecture
• [Figure: the SAX Parser reads the XML Source and calls back into the Application through four handlers: Document Handler, Error Handler, DTD Handler and Entity Resolver.]
• DocumentHandler, ErrorHandler, DTDHandler and EntityResolver are interfaces that the Application can implement
SAX DocumentHandler Interface
• Java package org.xml.sax;
• DocumentHandler interface. The more important methods:
    public abstract void startDocument()
    public abstract void endDocument()
    public abstract void startElement(String name, AttributeList atts)
    public abstract void endElement(String name)
    public abstract void characters(char ch[], int start, int length)
    public abstract void processingInstruction(String target, String data)
SAX Demo
• Demo - Example on fig. 9.3, Deitel 9.6, page 235;
• Tools:
  • Java 1.2.2;
    • http://java.sun.com/products/jdk/1.2/
  • Java API for XML Processing (JAXP) 1.0.1;
    • http://java.sun.com/xml/archive.html
    • Classes: jaxp.jar and parser.jar;
• Demo files:
  • Tree.java;
  • Sample XML documents;
SAX Demo Explained (1)
• Importing packages:
    import org.xml.sax.*;
    import javax.xml.parsers.SAXParserFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.parsers.SAXParser;
• Class HandlerBase:
  • Provides default implementations of the four handler interfaces. Applications usually extend it and override some methods:
    public class Tree extends HandlerBase
  • The Tree class overrides the methods from the DocumentHandler interface;
  • A Tree instance is registered with the parser before parsing starts;
SAX Demo Explained (2)
• Factory instantiation:
    SAXParserFactory saxFactory = SAXParserFactory.newInstance();
    saxFactory.setValidating( validate );
• Obtaining the parser and starting the parse, with a Tree instance registered as the handler:
    SAXParser saxParser = saxFactory.newSAXParser();
    saxParser.parse( new File( args[1] ), new Tree() );
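The event flow can be observed with a small handler that records each callback. This sketch is not the book's Tree.java: the class name `SaxTraceSketch` is hypothetical, and it extends the SAX 2.0 class `DefaultHandler` rather than `HandlerBase`, since `HandlerBase` and `DocumentHandler` are deprecated in current Java releases; the factory and parser calls are the same as in the demo. Character data is buffered because a parser is allowed to deliver one run of text in several `characters` calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name; a sketch only. Uses the SAX 2.0 DefaultHandler.
class SaxTraceSketch extends DefaultHandler {
    private final List<String> events = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();

    @Override public void startDocument() { events.add("startDocument"); }

    @Override public void endDocument() { events.add("endDocument"); }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flushText();
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        events.add("end:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per text run
    }

    /** Records buffered text as one event, ignoring whitespace-only runs. */
    private void flushText() {
        String trimmed = text.toString().trim();
        if (!trimmed.isEmpty()) {
            events.add("text:" + trimmed);
        }
        text.setLength(0);
    }

    /** Parses the XML and returns the recorded events in the order they were reported. */
    static List<String> trace(String xml) {
        try {
            SAXParserFactory saxFactory = SAXParserFactory.newInstance();
            saxFactory.setValidating(false);
            SAXParser saxParser = saxFactory.newSAXParser();
            SaxTraceSketch handler = new SaxTraceSketch();
            saxParser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (String event : trace("<results><name>John Smith</name></results>")) {
            System.out.println(event);
        }
    }
}
```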
SAX Error Handling
• Three error types:
  • Fatal errors: usually violations of well-formedness constraints. The parser must stop processing;
  • Errors: usually violations of validity rules;
  • Warnings: related to the DTD;
• Errors are handled by implementing the ErrorHandler interface;
• The Tree class overrides the default implementation of the methods for warnings and errors;
• The same mechanism is used with DOM parsers;
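The sketch below shows one `ErrorHandler` implementation used with both a SAX parser and a DOM parser. It is an illustration, not the book's MyErrorHandler.java: the names `ErrorHandlingSketch` and `Collector` and the two sample documents are hypothetical. For SAX, the handler is registered on the SAX 2.0 `XMLReader` obtained from the JAXP parser; for DOM, it is registered with `DocumentBuilder.setErrorHandler`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

// Hypothetical class name; a sketch only.
class ErrorHandlingSketch {
    // Well-formed but invalid: the DTD requires a <b> child inside <a>.
    static final String INVALID = "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b EMPTY>]><a/>";
    // Not well-formed: <b> is never closed.
    static final String MALFORMED = "<a><b></a>";

    /** Records every report; rethrows fatal errors so that parsing stops. */
    static final class Collector implements ErrorHandler {
        final List<String> reports = new ArrayList<>();

        @Override public void warning(SAXParseException e) { reports.add("warning: " + e.getMessage()); }

        @Override public void error(SAXParseException e) { reports.add("error: " + e.getMessage()); }

        @Override public void fatalError(SAXParseException e) throws SAXException {
            reports.add("fatal: " + e.getMessage());
            throw e;
        }
    }

    static List<String> checkWithSax(String xml, boolean validate) {
        Collector collector = new Collector();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(validate);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setErrorHandler(collector);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // A fatal error aborts the parse; the collector has already recorded it.
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return collector.reports;
    }

    static List<String> checkWithDom(String xml, boolean validate) {
        Collector collector = new Collector();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validate);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(collector); // same handler type as for SAX
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // A fatal error aborts the parse; the collector has already recorded it.
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return collector.reports;
    }

    public static void main(String[] args) {
        System.out.println(checkWithSax(MALFORMED, false));
        System.out.println(checkWithDom(INVALID, true));
    }
}
```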
SAX 2.0
• Main changes:
  • Namespace support;
  • Introduction of a filter mechanism;
  • The DocumentHandler interface is replaced by ContentHandler;
  • New exception classes;
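The effect of namespace support on the handler interface can be sketched as follows. In SAX 2.0 the `ContentHandler.startElement` callback receives the namespace URI and local name in addition to the qualified name. The class name `NamespaceSketch` and the namespace URI `urn:example:competition` are hypothetical; the handler extends `DefaultHandler`, which implements `ContentHandler`, and the factory must be made namespace-aware explicitly.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name; a sketch only.
class NamespaceSketch extends DefaultHandler {
    private final List<String> names = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Record each element as {namespace-uri}local-name, independent of the prefix used.
        names.add("{" + uri + "}" + localName);
    }

    /** Returns the expanded name of every element, in document order. */
    static List<String> expandedNames(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in JAXP
            NamespaceSketch handler = new NamespaceSketch();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<c:competition xmlns:c=\"urn:example:competition\">"
                   + "<c:name>John Smith</c:name><note/></c:competition>";
        System.out.println(expandedNames(xml));
    }
}
```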
SAX and DOM Comparison
• DOM:
  • maintains an internal structure for the document;
  • possible high memory usage for large documents;
  • enables traversing;
• SAX:
  • doesn’t maintain an internal structure;
  • enables building of a custom structure;
  • low memory usage;
  • usually faster than DOM;
  • traversing is impossible without an internal structure;
• Usually a DOM implementation is built on top of a SAX parser;
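The trade-off can be illustrated by extracting the same data both ways. In the sketch below, `viaDom` builds the whole tree and then queries it, while `viaSax` keeps only a list of names (a custom structure) and discards everything else as it streams past. The class name `NamesBothWaysSketch` is hypothetical, and the SAX side uses the SAX 2.0 `DefaultHandler`. The sketch makes no claim about measured speed or memory.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name; a sketch only.
class NamesBothWaysSketch {

    /** DOM: the full tree is held in memory, then queried. */
    static List<String> viaDom(String xml) {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = document.getElementsByTagName("name");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getTextContent());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** SAX: only the list of names is kept; no tree is built. */
    static List<String> viaSax(String xml) {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder current; // non-null only while inside a name element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if ("name".equals(qName)) current = new StringBuilder();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (current != null) current.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("name".equals(qName)) {
                    names.add(current.toString());
                    current = null;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<results><name>John Smith</name><name>Derek Warwick</name>"
                   + "<name>Mik Douglas</name></results>";
        System.out.println(viaDom(xml));
        System.out.println(viaSax(xml));
    }
}
```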
Summary
• Two approaches for XML processing:
  • Tree-based (DOM);
  • Event-based (SAX);
• Tools:
  • JDK 1.2.2;
  • JAXP 1.0.1 (used in the book);
  • JAXP 1.1 is also available;
  • See also http://xml.apache.org;
• Read: Deitel 8, 9
• Assignment: Modify the case study in Deitel 8.8. In the new version, the query should be based only on year, month and day (time is excluded). Add new functionality for making a new appointment for a meeting on the found day at a specified time. For a more detailed explanation and some hints, see the course site.