XML and Data Management XML Processors Hachim Haddouti Al Akhawayn University SSE H.Haddouti@alakhawayn.ma http://mail.alakhawayn.ma/~H.Haddouti
TOC
• Intro
• SAX – event-oriented processing
• DOM – tree structure
• Use of DOM and SAX
• Implementations
• Literature
XML Processors
• Make the content of an XML document available to an application
• Standardized interfaces for several programming languages (Java, Python, C, C++, ...)
• Embedded as libraries
• Expansion of entities
• Non-validating / validating processors with respect to a DTD or XML Schema
SAX – event-oriented Processing
• Actions are triggered by the components of the document
• Sequential processing
• Stateless

XML document (left) and the events fired by the parser (right); the attribute values are missing in the source:

<?xml version="1.0"?>          startDocument()
<hotel id=…>                   startElement("hotel", AttributeList(length=1, {name=…, value=…}))
  <hotelname>                  startElement("hotelname", null)
    Strand Hotel Hübner
  </hotelname>                 endElement("hotelname")
  <adresse>                    startElement("adresse", null)
    <ort>                      startElement("ort", null)
      Warnemünde               characters(char[], start, length)
    </ort>                     endElement("ort")
    <telefon>                  . . .
      0381/548230
    </telefon>
  </adresse>                   endElement("adresse")
</hotel>                       endElement("hotel")
                               endDocument()
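The event sequence above can be reproduced with a small handler. The following is a minimal sketch using Python's standard `xml.sax` module rather than the Java-style interface shown on the slide; the attribute value `h1` is a made-up placeholder, and `EventLogger` and `log_events` are illustrative names.

```python
import xml.sax

# Sample document modelled on the slide; the id value "h1" is a made-up placeholder.
HOTEL_XML = (
    '<?xml version="1.0"?>'
    '<hotel id="h1">'
    "<hotelname>Strand Hotel Hübner</hotelname>"
    "<adresse><ort>Warnemünde</ort><telefon>0381/548230</telefon></adresse>"
    "</hotel>"
)


class EventLogger(xml.sax.ContentHandler):
    """Records each SAX callback in the order the parser fires it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument()")

    def endDocument(self):
        self.events.append("endDocument()")

    def startElement(self, name, attrs):
        # attrs maps attribute names to values for this element only
        self.events.append(f"startElement({name!r}, {dict(attrs.items())})")

    def endElement(self, name):
        self.events.append(f"endElement({name!r})")

    def characters(self, content):
        # The parser may deliver one text run in several chunks
        if content.strip():
            self.events.append(f"characters({content!r})")


def log_events(xml_text):
    """Parse xml_text sequentially and return the list of recorded events."""
    handler = EventLogger()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


if __name__ == "__main__":
    for event in log_events(HOTEL_XML):
        print(event)
```

The parser keeps no memory of earlier events, so any state the application needs (here, the event list) has to live in the handler.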
DOM – Manipulation of the Tree
• W3C Recommendation
• Describes interfaces for accessing XML documents and for changing their structure and content
• Does not define the underlying implementation or storage of the XML documents!
Methods of Class Node
• All document components are based on this class
• The class contains
• Methods to identify the node type
• Methods to navigate through the document structure
• Methods to manipulate the document structure
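As an illustration of identifying node types, here is a small sketch using Python's `xml.dom.minidom`, which exposes the DOM getter `getNodeType()` as the attribute `nodeType`. The sample document and the helper names `node_kind` and `outline` are assumptions made for this example.

```python
from xml.dom import Node, minidom

# Human-readable labels for a few of the node type constants defined by the DOM
TYPE_NAMES = {
    Node.DOCUMENT_NODE: "document",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
}


def node_kind(node):
    """Return a label for the node's type, or 'other' if it is not listed."""
    return TYPE_NAMES.get(node.nodeType, "other")


def outline(node, depth=0):
    """Walk the tree depth-first and list (depth, kind, nodeName) per node."""
    rows = [(depth, node_kind(node), node.nodeName)]
    for child in node.childNodes:
        rows.extend(outline(child, depth + 1))
    return rows


if __name__ == "__main__":
    doc = minidom.parseString(
        '<hotel id="h1"><!-- note --><hotelname>Strand Hotel Hübner</hotelname></hotel>'
    )
    for depth, kind, name in outline(doc):
        print("  " * depth + f"{kind}: {name}")
```

Attributes do not appear in this walk because the DOM does not treat them as children of their element.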
DOM (Document Object Model) – Navigation through the Document
• XML document as a tree
• Access only by navigation
• Updating the document structure is also possible

[Figure: a tree in which the starting node's neighbours are labelled 1 to 6]

Starting from a given node, the following methods of class Node return a node or a node list:
1 - getParentNode()
2 - getFirstChild()
3 - getLastChild()
4 - getChildNodes()
5 - getPreviousSibling()
6 - getNextSibling()
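The six navigation steps can be tried out on a tiny tree. This sketch uses Python's `xml.dom.minidom`, where the getters are plain attributes (`parentNode`, `firstChild`, and so on); the element names in the sample document are hypothetical and chosen only to make the relationships obvious.

```python
from xml.dom import minidom

# Hypothetical tree: <start> has a parent, two siblings and two children
TREE_XML = "<parent><prev/><start><first/><last/></start><next/></parent>"


def neighbours(node):
    """Return the nodes reachable from `node` in one navigation step."""
    return {
        "parent": node.parentNode,             # 1 - getParentNode()
        "first_child": node.firstChild,        # 2 - getFirstChild()
        "last_child": node.lastChild,          # 3 - getLastChild()
        "children": node.childNodes,           # 4 - getChildNodes()
        "previous": node.previousSibling,      # 5 - getPreviousSibling()
        "next": node.nextSibling,              # 6 - getNextSibling()
    }


if __name__ == "__main__":
    doc = minidom.parseString(TREE_XML)
    start = doc.getElementsByTagName("start")[0]
    for label, result in neighbours(start).items():
        if label == "children":
            print(label, [child.nodeName for child in result])
        else:
            print(label, result.nodeName)
```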
DOM – Manipulation of the Structure
Delete, add, change, etc. document components:
• insertBefore(newChild, refChild) – inserts newChild in front of refChild
• appendChild(newChild) – adds newChild as the last child
• replaceChild(newChild, oldChild) – puts newChild in the place of oldChild
• removeChild(oldChild) – removes oldChild
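A short sketch of the four operations, again with Python's `xml.dom.minidom`. The elements `strasse` and `plz` are invented for the example, and `demo` and `child_names` are illustrative helper names.

```python
from xml.dom import minidom


def child_names(parent):
    """Names of the direct children, in document order."""
    return [child.nodeName for child in parent.childNodes]


def demo():
    doc = minidom.parseString("<adresse><ort>Warnemünde</ort></adresse>")
    adresse = doc.documentElement
    ort = adresse.firstChild
    steps = []

    # appendChild: new node becomes the last child
    telefon = doc.createElement("telefon")
    telefon.appendChild(doc.createTextNode("0381/548230"))
    adresse.appendChild(telefon)
    steps.append(child_names(adresse))

    # insertBefore: new node is placed in front of the reference child
    strasse = doc.createElement("strasse")  # hypothetical element
    adresse.insertBefore(strasse, ort)
    steps.append(child_names(adresse))

    # replaceChild: new node takes the position of the old one
    plz = doc.createElement("plz")  # hypothetical element
    adresse.replaceChild(plz, strasse)
    steps.append(child_names(adresse))

    # removeChild: the node is detached from its parent
    adresse.removeChild(plz)
    steps.append(child_names(adresse))

    return steps, adresse.toxml()


if __name__ == "__main__":
    steps, xml_text = demo()
    for names in steps:
        print(names)
    print(xml_text)
```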
Interface NodeList
• The NodeList interface allows processing of node lists.
• Methods of the NodeList interface are:
• item(index) accesses a single node; the index starts at 0.
• getLength() returns the number of nodes in the list.
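Iterating over a node list with these two methods might look as follows. This is a sketch with Python's `xml.dom.minidom`, where `getLength()` appears as the attribute `length`; the second hotel in the sample data is invented.

```python
from xml.dom import minidom

# The second hotel name is invented sample data
HOTELS_XML = (
    "<hotels>"
    "<hotelname>Strand Hotel Hübner</hotelname>"
    "<hotelname>Hotel Beispiel</hotelname>"
    "</hotels>"
)


def texts(node_list):
    """Collect the text of every element in a NodeList via item() and length."""
    result = []
    for index in range(node_list.length):  # indices run from 0 to length - 1
        element = node_list.item(index)
        result.append(element.firstChild.data)
    return result


if __name__ == "__main__":
    doc = minidom.parseString(HOTELS_XML)
    print(texts(doc.getElementsByTagName("hotelname")))
```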
Interface NamedNodeMap
• Methods of NamedNodeMap:
• getLength() – number of nodes in the map
• item(index) – accesses a single node in the map
• getNamedItem(name) and removeNamedItem(name) – read and delete a named node (for example an attribute), respectively
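The most common NamedNodeMap is the attribute map of an element. The sketch below uses Python's `xml.dom.minidom`; the attribute `kategorie` and both attribute values are assumptions made for the example.

```python
from xml.dom import minidom


def attribute_names(element):
    """List attribute names by stepping through the map with item()."""
    attrs = element.attributes  # a NamedNodeMap
    return sorted(attrs.item(i).name for i in range(attrs.length))


def demo():
    # Attribute values and the "kategorie" attribute are invented sample data
    doc = minidom.parseString('<hotel id="h1" kategorie="4"/>')
    hotel = doc.documentElement
    attrs = hotel.attributes

    before = attribute_names(hotel)
    id_value = attrs.getNamedItem("id").value   # read a named node
    removed = attrs.removeNamedItem("kategorie")  # delete a named node
    after = attribute_names(hotel)
    return before, id_value, removed.value, after


if __name__ == "__main__":
    print(demo())
```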
Class: Element
• The following methods allow reading and updating the information stored in an Element.
• getTagName() – tag name of an element
• getAttribute(name) – value of an attribute
• setAttribute(name, value) – update an attribute
• removeAttribute(name) – delete an attribute
• getElementsByTagName(tagname) – all descendant elements with the given tag name
• getElementsByTagName("*") – returns all descendant elements.
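The Element methods can be exercised as follows. This is a sketch with Python's `xml.dom.minidom`, where `getTagName()` is the attribute `tagName`; the id values are placeholders.

```python
from xml.dom import minidom

# The id value "h1" is a made-up placeholder
HOTEL_XML = (
    '<hotel id="h1">'
    "<hotelname>Strand Hotel Hübner</hotelname>"
    "<adresse><ort>Warnemünde</ort></adresse>"
    "</hotel>"
)


def demo():
    hotel = minidom.parseString(HOTEL_XML).documentElement
    result = {"tag": hotel.tagName, "id": hotel.getAttribute("id")}

    hotel.setAttribute("id", "h2")  # update an existing attribute
    result["id_after_set"] = hotel.getAttribute("id")

    hotel.removeAttribute("id")  # delete the attribute
    result["has_id"] = hotel.hasAttribute("id")

    # "*" matches every descendant element, in document order
    result["all"] = [e.tagName for e in hotel.getElementsByTagName("*")]
    result["orte"] = [e.firstChild.data for e in hotel.getElementsByTagName("ort")]
    return result


if __name__ == "__main__":
    for key, value in demo().items():
        print(key, value)
```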
Methods of Class Attribute
• Use these methods to read and update information about attributes:
• getName()
• getValue()
• setValue(value).
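An attribute can also be handled as a node of its own. A brief sketch with Python's `xml.dom.minidom`, where the getters and the setter appear as the `name` and `value` attributes of the node; the id values are placeholders.

```python
from xml.dom import minidom


def demo():
    # The id values "h1" and "h2" are made-up placeholders
    hotel = minidom.parseString('<hotel id="h1"/>').documentElement
    attr = hotel.getAttributeNode("id")  # the attribute as a node object

    name, old_value = attr.name, attr.value  # getName(), getValue()
    attr.value = "h2"                        # setValue(value)
    return name, old_value, hotel.getAttribute("id")


if __name__ == "__main__":
    print(demo())
```

Changing the value on the attribute node is visible through the owning element, since both refer to the same node.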
Methods of Class CharacterData
• Read, set and update text components
• Methods of CharacterData are:
• getLength() – length of the text component
• getData() – the whole text
• substringData(start, count) – part of the text that begins at start and runs for count characters
• appendData(text)
• replaceData(offset, count, text)
• insertData(offset, text)
• deleteData(offset, count)
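The text-editing methods can be followed step by step on a single text node. A sketch with Python's `xml.dom.minidom`, where `getLength()` and `getData()` are the attributes `length` and `data`; the replacement strings are arbitrary examples.

```python
from xml.dom import minidom


def demo():
    doc = minidom.parseString("<hotelname>Strand Hotel</hotelname>")
    text = doc.documentElement.firstChild  # the text node inside <hotelname>
    history = [(text.length, text.data)]

    history.append(text.substringData(0, 6))  # 6 characters from offset 0

    text.appendData(" Hübner")                # add at the end
    history.append(text.data)

    text.replaceData(7, 5, "Pension")         # replace 5 characters at offset 7
    history.append(text.data)

    text.insertData(0, "Neue ")               # insert at the front
    history.append(text.data)

    text.deleteData(0, 5)                     # delete the 5 inserted characters
    history.append(text.data)
    return history


if __name__ == "__main__":
    for entry in demo():
        print(entry)
```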
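DOM vs. SAX
[Figure, after Roland Böndgen: a parser reads the XML document and the DTD and offers two interfaces, DOM and SAX. On the SAX side it sends the events startDocument, startElement, startElement, endElement, ..., endDocument to the application, which implements DocumentHandler.]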
Use of SAX and DOM
SAX
• Simple access to simply and uniformly structured documents
• Suitable for large XML documents
• Access to a few parts of a document
DOM
• Navigation through the document structure, and therefore context-dependent access
• Manipulation of the structure
• Not well suited to very large XML documents
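The trade-off can be made concrete with two small tasks. The sketch below uses Python's standard `xml.sax` and `xml.dom.minidom`; the second hotel is invented sample data, and the names `OrtCollector`, `orte_with_sax` and `hotel_for_ort_with_dom` are illustrative. SAX picks out a few values without building a tree, while DOM answers a question that depends on where a node sits in the tree.

```python
import xml.sax
from xml.dom import minidom

# The second hotel is invented sample data
HOTELS_XML = (
    "<hotels>"
    "<hotel><hotelname>Strand Hotel Hübner</hotelname>"
    "<adresse><ort>Warnemünde</ort></adresse></hotel>"
    "<hotel><hotelname>Hotel Beispiel</hotelname>"
    "<adresse><ort>Rostock</ort></adresse></hotel>"
    "</hotels>"
)


class OrtCollector(xml.sax.ContentHandler):
    """Collects the text of every <ort> element; nothing else is stored."""

    def __init__(self):
        super().__init__()
        self.orte = []
        self._buffer = None  # None while the parser is outside an <ort>

    def startElement(self, name, attrs):
        if name == "ort":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)  # text may arrive in chunks

    def endElement(self, name):
        if name == "ort":
            self.orte.append("".join(self._buffer))
            self._buffer = None


def orte_with_sax(xml_text):
    """SAX: one sequential pass, memory use independent of document size."""
    handler = OrtCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.orte


def hotel_for_ort_with_dom(xml_text, ort):
    """DOM: navigate from a matching <ort> up to its hotel and across to the name."""
    doc = minidom.parseString(xml_text)  # the whole tree is held in memory
    for node in doc.getElementsByTagName("ort"):
        if node.firstChild.data == ort:
            hotel = node.parentNode.parentNode
            return hotel.getElementsByTagName("hotelname")[0].firstChild.data
    return None


if __name__ == "__main__":
    print(orte_with_sax(HOTELS_XML))
    print(hotel_for_ort_with_dom(HOTELS_XML, "Rostock"))
```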