
Processing XML


Presentation Transcript


  1. 5 Processing XML

  2. Overview • Parsing XML documents • Document Object Model (DOM) • Simple API for XML (SAX) • Class generation

  3. What's the Problem?

     <?xml version="1.0"?>
     <books>
       <book>
         <title>The XML Handbook</title>
         <author>Goldfarb</author>
         <author>Prescod</author>
         <publisher>Prentice Hall</publisher>
         <pages>655</pages>
         <isbn>0130811521</isbn>
         <price currency="USD">44.95</price>
       </book>
       <book>
         <title>XML Design</title>
         <author>Spencer</author>
         <publisher>Wrox Press</publisher>
         ...
       </book>
     </books>

     [Figure: XML document → ? → Book]

  4. Parsing XML Documents • The parser reads the document and its DTD/Schema. • DOM: the parser builds a document tree. • SAX: the parser calls the application, which implements DocumentHandler, with the events startDocument, startElement, ..., endElement, endDocument.

  5. Parsers • Project X (Sun Microsystems) • Ælfred (Microstar Software) • XML4J (IBM) • Lark (Tim Bray) • MSXML (Microsoft) • XJ (DataChannel) • Xerces (Apache) • ...

  6. The Document Object Model: XML Document Structure

     <?xml version="1.0"?>
     <books>
       <book>
         <title>The XML Handbook</title>
         <author>Goldfarb</author>
         <author>Prescod</author>
         <publisher>Prentice Hall</publisher>
         <pages>655</pages>
         <isbn>0130811521</isbn>
         <price currency="USD">44.95</price>
       </book>
       <book>
         <title>XML Design</title>
         <author>Spencer</author>
         <publisher>Wrox Press</publisher>
         ...
       </book>
     </books>

     Corresponding tree:

     books
       book
         title       The XML Handbook
         author      Goldfarb
         author      Prescod
         publisher   Prentice Hall
         pages       655
         isbn        ...
       book

  7. The Document Object Model • Provides a standard interface for access to and manipulation of XML structures. • Represents documents in the form of a hierarchy of nodes. • Is platform- and programming-language-neutral • Is a recommendation of the W3C (October 1, 1998) • Is implemented by many parsers
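
The "hierarchy of nodes" can be made visible with a few lines of code. The sketch below is an illustration only: it assumes Python and its standard-library `xml.dom.minidom` module (the slides themselves are parser-neutral), uses a shortened copy of the books document, and `outline` is a hypothetical helper name.

```python
from xml.dom import minidom

# Shortened books document from the slides, written without whitespace
# between tags so that no whitespace-only text nodes appear in the tree.
BOOKS_XML = (
    '<?xml version="1.0"?>'
    "<books>"
    "<book><title>The XML Handbook</title>"
    "<author>Goldfarb</author><author>Prescod</author></book>"
    "<book><title>XML Design</title><author>Spencer</author></book>"
    "</books>"
)


def outline(node, depth=0):
    """Return one indented line per node in the subtree rooted at node."""
    label = node.nodeName
    if node.nodeType == node.TEXT_NODE:
        label += " " + repr(node.data)
    lines = ["  " * depth + label]
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines


doc = minidom.parseString(BOOKS_XML)
tree_lines = outline(doc.documentElement)
print("\n".join(tree_lines))
```

Each element appears as a node, and the text inside an element is a separate `#text` child node rather than a property of the element.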

  8. DOM - Structure Model [Figure: the books tree from slide 6, annotated with the DOM interfaces Document, Node, Element, and NodeList.]

  9. The Document Interface

     Method                         Result
     doctype                        DocumentType
     implementation                 DOMImplementation
     documentElement                Element
     getElementsByTagName(String)   NodeList
     createTextNode(String)         Text
     createComment(String)          Comment
     createElement(String)          Element
     createCDATASection(String)     CDATASection
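
The factory methods of the Document interface can be tried out without parsing anything. This is a minimal sketch, assuming Python's `xml.dom.minidom` as the DOM implementation; the comment text is made up for the illustration.

```python
from xml.dom import minidom

# Create an empty document whose root element is <books>.
impl = minidom.getDOMImplementation()
doc = impl.createDocument(None, "books", None)
root = doc.documentElement

# Every new node is created by the document, then attached to the tree.
book = doc.createElement("book")
title = doc.createElement("title")
title.appendChild(doc.createTextNode("XML Design"))
book.appendChild(title)
book.appendChild(doc.createComment("authors still to be added"))
root.appendChild(book)

xml_text = root.toxml()
title_count = doc.getElementsByTagName("title").length
print(xml_text)
```

Creating a node and inserting it are separate steps: a node returned by `createElement` is not part of the tree until `appendChild` (or a similar method) places it there.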

  10. The Node Interface

      Method                                 Result
      nodeName                               String
      nodeValue                              String
      nodeType                               short
      parentNode                             Node
      childNodes                             NodeList
      firstChild                             Node
      lastChild                              Node
      previousSibling                        Node
      nextSibling                            Node
      attributes                             NamedNodeMap
      insertBefore(Node new, Node ref)       Node
      replaceChild(Node new, Node old)       Node
      removeChild(Node)                      Node
      hasChildNodes()                        boolean

  11. Node Types / Node Names

      nodeType constant              Value   nodeName
      ELEMENT_NODE                   1       tagName
      ATTRIBUTE_NODE                 2       name of attribute
      TEXT_NODE                      3       "#text"
      CDATA_SECTION_NODE             4       "#cdata-section"
      ENTITY_REFERENCE_NODE          5       name of entity referenced
      ENTITY_NODE                    6       entity name
      PROCESSING_INSTRUCTION_NODE    7       target
      COMMENT_NODE                   8       "#comment"
      DOCUMENT_NODE                  9       "#document"
      DOCUMENT_TYPE_NODE             10      document type name
      DOCUMENT_FRAGMENT_NODE         11      "#document-fragment"
      NOTATION_NODE                  12      notation name
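
A few of these type/name pairs can be checked directly. The sketch assumes Python's `xml.dom.minidom`; the comment inside the `price` element is added only so that a comment node is available.

```python
from xml.dom import minidom, Node

doc = minidom.parseString('<price currency="USD"><!--list price-->44.95</price>')
price = doc.documentElement
comment, text = price.childNodes          # the two children of <price>
attr = price.getAttributeNode("currency")

# (nodeType, nodeName) for a document, element, attribute, comment and text node.
kinds = [(n.nodeType, n.nodeName) for n in (doc, price, attr, comment, text)]
for node_type, node_name in kinds:
    print(node_type, node_name)
```

The numeric values are also available as named constants on `xml.dom.Node`, for example `Node.ELEMENT_NODE`, which reads better than a bare `1` in application code.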

  12. The NodeList Interface

      Method       Result
      length       int
      item(int)    Node

  13. The Element Interface

      Method                                      Result
      tagName                                     String
      getAttribute(String)                        String
      setAttribute(String name, String value)     (none)
      removeAttribute(String)                     (none)
      getAttributeNode(String)                    Attr
      setAttributeNode(Attr)                      Attr
      removeAttributeNode(Attr)                   Attr
      getElementsByTagName(String)                NodeList
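
The attribute methods of the Element interface can be exercised on the `price` element from the books document. This is a sketch assuming Python's `xml.dom.minidom`; the replacement value "EUR" is an arbitrary example.

```python
from xml.dom import minidom

doc = minidom.parseString('<price currency="USD">44.95</price>')
price = doc.documentElement

tag = price.tagName
before = price.getAttribute("currency")         # value as a string

price.setAttribute("currency", "EUR")           # overwrite the value
attr_node = price.getAttributeNode("currency")  # the Attr node itself
after = attr_node.value

price.removeAttribute("currency")
missing = price.getAttribute("currency")        # empty string when absent
```

`getAttribute` returns the plain string value, while `getAttributeNode` returns the attribute as a node of its own, which matters when the attribute is to be moved or inspected like any other node.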

  14. DOM Methods for Navigation • parentNode • previousSibling • nextSibling • firstChild • lastChild • childNodes (length, item()) • getElementsByTagName
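
The navigation properties all work relative to a node that is already in hand. The following sketch, assuming Python's `xml.dom.minidom` and a whitespace-free copy of the second book, starts at the `author` element and looks in every direction.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<book><title>XML Design</title><author>Spencer</author>"
    "<publisher>Wrox Press</publisher></book>"
)
book = doc.documentElement
author = book.firstChild.nextSibling      # title -> author

around_author = {
    "parentNode": author.parentNode.nodeName,
    "previousSibling": author.previousSibling.nodeName,
    "nextSibling": author.nextSibling.nodeName,
}

# Walk all children via firstChild / nextSibling instead of childNodes.
child_names = []
child = book.firstChild
while child is not None:
    child_names.append(child.nodeName)
    child = child.nextSibling
```

If the document contained line breaks between the tags, a parser that preserves whitespace would report additional `#text` siblings, so sibling-based navigation should check `nodeType` in real code.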

  15. DOM Methods for Manipulation • appendChild • insertBefore • replaceChild • removeChild • createElement • createAttribute • createTextNode
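
The manipulation methods can be combined to edit a tree in place. A minimal sketch, assuming Python's `xml.dom.minidom`; the starting document is a reduced version of the first book, and the edits themselves are arbitrary examples.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<book><title>The XML Handbook</title>"
    "<publisher>Prentice Hall</publisher><price>44.95</price></book>"
)
book = doc.documentElement
publisher = book.getElementsByTagName("publisher")[0]
price = book.getElementsByTagName("price")[0]

# createElement + createTextNode + insertBefore: add authors ahead of <publisher>.
for name in ("Goldfarb", "Prescod"):
    author = doc.createElement("author")
    author.appendChild(doc.createTextNode(name))
    book.insertBefore(author, publisher)

# createAttribute: attach currency="USD" to <price>.
currency = doc.createAttribute("currency")
currency.value = "USD"
price.setAttributeNode(currency)

# replaceChild: swap the text node inside <title>.
title = book.firstChild
title.replaceChild(doc.createTextNode("XML Handbook"), title.firstChild)

# removeChild: take <publisher> out of the tree.
book.removeChild(publisher)

result = book.toxml()
print(result)
```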

  16. Example

      Tree: books → book (author Goldfarb, author Prescod), book (author Spencer)

      doc.documentElement.childNodes.item(0).getElementsByTagName("author").item(1).childNodes.item(0).data

      doc                                 DOM object
      .documentElement                    root node
      .childNodes                         books
      .item(0)                            first book
      .getElementsByTagName("author")     authors
      .item(1)                            second author
      .childNodes                         text subnodes
      .item(0)                            first thereof
      .data                               text

  17. Script

      <HTML>
      <HEAD><TITLE>DOM Example</TITLE></HEAD>
      <BODY>
      <H1>DOM Example</H1>
      <SCRIPT LANGUAGE="JavaScript">
        // Runs only in Internet Explorer, which provides the MSXML ActiveX parser.
        var doc, root, book1, authors, author2;
        doc = new ActiveXObject("Microsoft.XMLDOM");
        doc.async = false;                 // load synchronously
        doc.load("books.xml");
        if (doc.parseError.errorCode != 0)
          alert(doc.parseError.reason);
        else {
          root = doc.documentElement;
          document.write("Name of Root node: " + root.nodeName + "<BR>");
          document.write("Type of Root node: " + root.nodeType + "<BR>");
          book1 = root.childNodes.item(0);
          authors = book1.getElementsByTagName("author");
          document.write("Number of authors: " + authors.length + "<BR>");
          author2 = authors.item(1);
          document.write("Name of second author: " + author2.childNodes.item(0).data);
        }
      </SCRIPT>
      </BODY></HTML>
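
For readers without Internet Explorer, the same steps can be reproduced outside the browser. The following is a rough Python equivalent, not part of the original deck: it assumes the standard-library `xml.dom.minidom` module and an inline, shortened copy of the document in place of the file books.xml.

```python
from xml.dom import minidom

BOOKS_XML = """<?xml version="1.0"?>
<books>
  <book>
    <title>The XML Handbook</title>
    <author>Goldfarb</author>
    <author>Prescod</author>
  </book>
  <book>
    <title>XML Design</title>
    <author>Spencer</author>
  </book>
</books>"""

doc = minidom.parseString(BOOKS_XML)
root = doc.documentElement

# minidom keeps the whitespace between tags as text nodes, so
# root.childNodes.item(0) would be a text node here. Selecting the
# first <book> by tag name avoids that difference.
book1 = root.getElementsByTagName("book").item(0)
authors = book1.getElementsByTagName("author")
author2 = authors.item(1)

report = [
    "Name of root node: " + root.nodeName,
    "Type of root node: " + str(root.nodeType),
    "Number of authors: " + str(authors.length),
    "Name of second author: " + author2.childNodes.item(0).data,
]
print("\n".join(report))
```

The one deliberate deviation from the script is the lookup of the first book: whether `childNodes.item(0)` is an element or a whitespace text node depends on how the parser treats whitespace.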

  18. SAX - Simple API for XML [Figure: the parser reads the document and its DTD and reports the events startDocument, startElement, ..., endElement, endDocument to the application.]

  19. SAX - Simple API for XML • Event-driven parsing model • "Don't call the DOM, the parser calls you." • Developed by the members of the XML-DEV Mailing List • Released on May 11, 1998 • Supported by many parsers ... • ... but Ælfred is the saxon king.
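
The event-driven model can be sketched with Python's standard-library `xml.sax` module, which is an assumption of this example. It follows SAX 2, so the handler base class is `ContentHandler` rather than the SAX 1 `DocumentHandler` named on slide 4; `AuthorCollector` is a hypothetical class name.

```python
import xml.sax


class AuthorCollector(xml.sax.ContentHandler):
    """Records every event and collects the text of <author> elements."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.authors = []
        self._in_author = False
        self._text = []

    def startDocument(self):
        self.events.append("startDocument")

    def startElement(self, name, attrs):
        self.events.append("startElement " + name)
        if name == "author":
            self._in_author = True
            self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect the pieces.
        if self._in_author:
            self._text.append(content)

    def endElement(self, name):
        self.events.append("endElement " + name)
        if name == "author":
            self.authors.append("".join(self._text))
            self._in_author = False

    def endDocument(self):
        self.events.append("endDocument")


handler = AuthorCollector()
xml.sax.parseString(
    b"<books><book><author>Goldfarb</author><author>Prescod</author></book></books>",
    handler,
)
print(handler.authors)
```

No tree is built at any point: the application keeps only what it chooses to store in the handler, which is why SAX suits large documents.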

  20. Procedure • DOM: creating a parser instance; parsing the whole document; processing the DOM tree • SAX: creating a parser instance; registering event handlers with the parser; the parser calls the event handlers during parsing

  21. Namespace Support

      <?xml version="1.0"?>
      <order xmlns="http://www.net-standard.com/namespaces/order"
             xmlns:bk="http://www.net-standard.com/namespaces/books"
             xmlns:cust="http://www.net-standard.com/namespaces/customer">
        ...
        <bk:book>
          <bk:title>XML Handbook</bk:title>
          <bk:isbn>0130811521</bk:isbn>
        </bk:book>
        ....
      </order>

  22. Access to Qualified Elements

                        DOM Level 2          SAX 2.0                 Value for node "book"
                        (interface "Node")   (method startElement)
      Qualified name    nodeName             qName                   bk:book
      Namespace URI     namespaceURI         uri                     http://www.net-standard.com/namespaces/books
      Prefix            prefix                                       bk
      Local name        localName            localName               book
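
Both access paths can be compared on a reduced version of the order document. This sketch assumes Python's `xml.dom.minidom` and `xml.sax`; in Python's SAX binding the namespace-aware callback is called `startElementNS` and receives the URI and local name as a pair, and `NamespaceLogger` is a hypothetical class name.

```python
import io
import xml.sax
from xml.dom import minidom
from xml.sax.handler import feature_namespaces

ORDER_XML = (
    '<order xmlns="http://www.net-standard.com/namespaces/order" '
    'xmlns:bk="http://www.net-standard.com/namespaces/books">'
    "<bk:book><bk:title>XML Handbook</bk:title></bk:book>"
    "</order>"
)

# DOM Level 2: namespace information is stored on each node.
doc = minidom.parseString(ORDER_XML)
book = doc.documentElement.firstChild
dom_view = (book.nodeName, book.namespaceURI, book.prefix, book.localName)


# SAX 2: namespace information is passed with each start event.
class NamespaceLogger(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.seen = []

    def startElementNS(self, name, qname, attrs):
        uri, local_name = name
        self.seen.append((uri, local_name))


parser = xml.sax.make_parser()
parser.setFeature(feature_namespaces, True)   # switch on namespace processing
handler = NamespaceLogger()
parser.setContentHandler(handler)
parser.parse(io.BytesIO(ORDER_XML.encode("utf-8")))

print(dom_view)
print(handler.seen)
```

Comparing elements by namespace URI and local name, rather than by the qualified name, keeps the code working when a document author picks a different prefix for the same namespace.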

  23. Generation of Data Structures

      DTD / Schema 'yacht' → Class Generation → Class:

        01 yacht
          05 name
          05 details
            10 type

      XML document → Processing → Object:

        <?xml?>
        <yacht yachtid='147'>
          <name>Mona Lisa</name>
          <image file='yacht147.jpg'/>
          <description>Any text describing this yacht 147</description>
          <details>
            <type>GULFSTAR 55</type>
            <length>1700</length>
            <width>480</width>
            <draft>170</draft>
            <sailsurface>112</sailsurface>
            <motor>84</motor>
            <headroom>202</headroom>
            <bunks>8</bunks>
          </details>
        </yacht>

        01 yacht
          05 VENTANA
          05 details
            10 GULFSTAR 55
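
The slide does not name a generator tool, so the following sketch only illustrates the target of class generation: typed classes that mirror the document structure, plus code that fills them from an XML instance. The classes are written by hand here as a stand-in for generated ones; the names `Yacht`, `Details`, and `yacht_from_xml`, the selection of fields, and the choice of `int` for the numeric fields are all assumptions. Python's `xml.dom.minidom` does the parsing.

```python
from dataclasses import dataclass
from xml.dom import minidom

# Reduced copy of the yacht document from the slide.
YACHT_XML = (
    "<yacht yachtid='147'><name>Mona Lisa</name>"
    "<details><type>GULFSTAR 55</type><length>1700</length>"
    "<width>480</width><bunks>8</bunks></details></yacht>"
)


@dataclass
class Details:
    type: str
    length: int
    width: int
    bunks: int


@dataclass
class Yacht:
    yachtid: str
    name: str
    details: Details


def _text(parent, tag):
    """Return the text content of the first <tag> element below parent."""
    return parent.getElementsByTagName(tag)[0].firstChild.data


def yacht_from_xml(xml_text):
    """Build a Yacht object from a yacht document."""
    root = minidom.parseString(xml_text).documentElement
    details_el = root.getElementsByTagName("details")[0]
    return Yacht(
        yachtid=root.getAttribute("yachtid"),
        name=_text(root, "name"),
        details=Details(
            type=_text(details_el, "type"),
            length=int(_text(details_el, "length")),
            width=int(_text(details_el, "width")),
            bunks=int(_text(details_el, "bunks")),
        ),
    )


yacht = yacht_from_xml(YACHT_XML)
print(yacht)
```

Application code then works with `yacht.details.length` instead of navigating nodes, which is the benefit the slide attributes to generated data structures.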

  24. Summary • To avoid expensive text processing, applications use an XML parser that creates a DOM tree of a document. • The DOM provides a standardized API to access the content of documents and to manipulate them. • Alternatively or additionally, applications can work event-based using the SAX interface, which is provided by many parsers.
