
Introduction to XML



Presentation Transcript


  1. Introduction to XML Kostas Kontogiannis Evan Mamas

  2. Outline • Introduce XML, HTML and SGML • Compare and Contrast • XML vs. HTML • XML vs. SGML • XML • Components, Applications, Industry • Thoughts on XML

  3. What is XML? • eXtensible Markup Language • Proper subset of SGML for web use • Meta-language • Allows you to create your own markup languages • Compromise between HTML and SGML

  4. What is HTML? • HyperText Markup Language • Language to describe information for transmission over the web • Uses tags to mark up the information • Tags are just a formatting tool • Example: <H1> Hello, World </H1> is rendered as a top-level heading reading "Hello, World"

  5. Why isn’t HTML enough? • Good enough for presenting text on the web • Not accepted as an authoring or archival form • Extensibility • HTML standard changes continually • Uses tags for formatting • Structures • Has no defined or definable structural rules

  6. What is SGML? • Standard Generalized Markup Language • International Standard (ISO 8879:1986) for over 10 years • Language for specifying markup languages • Describes only the formal properties and inter-relations of the components of a document • Document, Entities, Elements, Attributes

  7. Uses of SGML • Formally structured documents • Technical Manuals • Exchange documents • Product documentation • Data encoding • Interchange specification • Provides long-term storage of information that is independent of suppliers and of changes in hardware and software

  8. SGML Example
      • Memo
      <to>All staff
      <from>Martin Bryan
      <date>5th November
      <subject>Cats and Dogs
      <text>Please remember to keep all cats and dogs indoors tonight.
      • DTD (Document Type Definition)
      <!DOCTYPE memo [
      <!ELEMENT memo O O ((to & from & date & subject?), text) >
      <!ELEMENT text - O (para+) >
      <!ELEMENT para O O (#PCDATA) >
      <!ELEMENT (to, from, date, subject) - O (#PCDATA) >
      ]>

  9. Why isn’t SGML enough? • Specification is very long • Contains many options not needed for Web applications • Time consuming and high cost • Expensive tools • Too much for small applications • Bad reputation

  10. XML vs. HTML • New tags and attributes definitions allowed • Document structures can be nested to any level of complexity • Structural validation is possible by describing the grammar
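The first two points on this slide can be illustrated with a minimal sketch, assuming Python's standard `xml.etree.ElementTree` parser. The tag names (`course`, `unit`, `topic`, `title`) and the `depth` helper are invented for illustration and do not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: none of these tags exist in HTML, and the
# nesting depth is chosen freely by the document author.
SOURCE = """
<course code="XML101">
  <unit>
    <topic>
      <title>What is XML?</title>
    </topic>
  </unit>
</course>
"""

def depth(element):
    """Return the number of element levels at and below `element`."""
    return 1 + max((depth(child) for child in element), default=0)

root = ET.fromstring(SOURCE)
print(root.tag, root.attrib["code"], depth(root))  # course XML101 4
```

A generic parser accepts the new vocabulary without knowing it in advance, which is the point the slide contrasts with HTML's fixed tag set.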

  11. XML vs. SGML • XML is the minimum required subset of SGML for web use • Easier to implement and to create tools for • A new attempt at structured markup languages with a new “face”
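One concrete simplification, sketched here under the assumption that Python's standard `xml.etree.ElementTree` stands in for "an XML tool": XML has no tag omission, so the memo from the SGML example needs explicit end tags before an XML parser will accept it. The `is_well_formed` helper is a hypothetical name introduced for this sketch.

```python
import xml.etree.ElementTree as ET

# The memo from the SGML slide, rewritten with explicit end tags.
# The <memo> and <para> wrappers follow the SGML DTD on that slide.
WELL_FORMED = (
    "<memo><to>All staff</to><from>Martin Bryan</from>"
    "<date>5th November</date><subject>Cats and Dogs</subject>"
    "<text><para>Please remember to keep all cats and dogs "
    "indoors tonight.</para></text></memo>"
)

# End tags left out the way the SGML memo leaves them out (shortened).
TAGS_OMITTED = "<memo><to>All staff<from>Martin Bryan</memo>"

def is_well_formed(text):
    """Return True if an XML parser accepts `text`."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(WELL_FORMED), is_well_formed(TAGS_OMITTED))  # True False
```

Because every element must be closed explicitly, a parser can check well-formedness without consulting a DTD, which is part of why XML tools are easier to write than SGML tools.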

  12. XML Components • XML Style Language (XSL) • Cascading Style Sheets, level 2 (CSS2) • XML Document Object Model (DOM) • XML Linking Language (XLL) • XML Pointer Language (XPL) • XML Name Spaces • Synchronized Multimedia Integration Language (SMIL) • Resource Description Framework (RDF) • Mathematical Markup Language (MathML)

  13. XML Components (cont.) • XML Style Language (XSL) • Defines a way to present the documents • Separates formatting from content • Has two steps: • Generate a result tree (associate patterns with templates) • Use XML Namespace (formatting vocabulary) to generate formatted output. • Similar to DSSSL for SGML

  14. XML Components (cont.) • Cascading Style Sheets, level 2 (CSS2) • Defines a way to present documents • Similar to XSL (not as strong) • Supported by most browsers • Example: <HTML> <TITLE>Bach's home page</TITLE> <STYLE type="text/css"> H1 { color: blue } </STYLE> <BODY> <H1>Bach's home page</H1> <P>Johann Sebastian Bach was a prolific composer. </BODY> </HTML>

  15. XML Components (cont.) • XML Document Object Model (DOM) • In-memory model for representing parsed XML documents • Designed to provide common structures in XML browsers • Intended to enable interoperable XML processing across browsers • Implemented by Internet Explorer and Netscape

  16. XML Components (cont.) • XML Linking Language (XLL) • Links by reference rather than exact location • Provides hyperlinking elements • Simple links like HTML links • Extended • Multi-directional links • Links with multiple destinations • Placing content inline from a linked document • Requires use of XML Pointer Language

  17. XML Components (cont.) • XML Name Spaces • Vocabulary of all elements and attribute types • Namespace prefix (mapped to Uniform Resource Identifier) • Local Part • Allows use of names defined in other documents • Modularity and reuse of a markup • Mechanisms to establish name scope

  18. XML Components (cont.) • Synchronized Multimedia Integration Language (SMIL) • Language for describing interactive synchronized multimedia distributed on the Web • Several components (images, video, audio) can be linked together to create a presentation on the web • Resource Description Framework (RDF) • Abstract mechanism for defining simple relationships among web resources • Mathematical Markup Language (MathML) • Language to describe mathematical expressions

  19. XML DTD • Defines the hierarchy of all user-defined elements (tags) in the XML document • Declares the attributes and behaviour of each XML element • An XML document references a specific DTD through its DOCTYPE declaration, and a validating parser checks the document's elements against it

  20. XML DTD
      <?xml version="1.0" encoding="UTF-8"?>
      <!-- DTD for a simple program beginning of element declarations-->
      <!--the root tag of Language-->
      <!ELEMENT Language (FileTag*,Declaration*,Function_Call*)>
      <!ELEMENT FileTag (IncludeTag*,SourceTag*)>
      <!ELEMENT IncludeTag (#PCDATA)*>
      <!ELEMENT SourceTag (#PCDATA)*>
      <!ELEMENT Declaration (Type_Name|Identifier)*>
      <!ELEMENT Type_Name (#PCDATA)*>
      <!ELEMENT Identifier (#PCDATA)*>
      <!ELEMENT Function_Call (Return_Type*,Function_Name*,Argument*)>
      <!ELEMENT Return_Type (Return_Var*)>
      <!ELEMENT Return_Var (#PCDATA)>
      <!ELEMENT Function_Name (#PCDATA)>
      <!ELEMENT Argument (parameterName*)>
      <!ELEMENT parameterName (#PCDATA)>
      <!--We may want to have external calls or graphics in our document. Currently there is none, but we still have to declare them-->
      <!ELEMENT External_Call EMPTY>
      <!ELEMENT Graphics EMPTY>
      <!--end of element declarations-->
      Callouts: the <!ELEMENT Language ...> line defines what other tags are within the <Language> tag; the <!ELEMENT IncludeTag (#PCDATA)*> line defines the data type for contents within the <IncludeTag> tag.
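To make the idea of a content model concrete, here is a rough sketch that rewrites three of the declarations above by hand as regular expressions over the sequence of child tag names. It is not a DTD validator: Python's standard library does not validate against DTDs, and the `CONTENT_MODELS` table and `check` function are hypothetical names introduced only for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# Content models copied by hand from the DTD slide and rewritten as
# regular expressions over a space-terminated list of child tag names.
CONTENT_MODELS = {
    "Language": r"(FileTag )*(Declaration )*(Function_Call )*",
    "FileTag": r"(IncludeTag )*(SourceTag )*",
    "Declaration": r"((Type_Name|Identifier) )*",
}

def check(element):
    """Return the tag names of elements whose children break their model."""
    problems = []
    model = CONTENT_MODELS.get(element.tag)
    if model is not None:
        children = "".join(child.tag + " " for child in element)
        if re.fullmatch(model, children) is None:
            problems.append(element.tag)
    for child in element:
        problems.extend(check(child))
    return problems

good = ET.fromstring(
    "<Language><FileTag><IncludeTag>include stdio.h</IncludeTag></FileTag>"
    "<Declaration><Type_Name>int</Type_Name></Declaration></Language>"
)
bad = ET.fromstring(
    "<Language><Declaration/><FileTag/></Language>"  # children in wrong order
)
print(check(good), check(bad))  # [] ['Language']
```

The second document fails because `(FileTag*,Declaration*,Function_Call*)` is a sequence: every `FileTag` must come before any `Declaration`.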

  21. XML Document (page 1 of 2)
      <?xml version="1.0"?>
      <?xml:stylesheet type="text/xsl" href="studentXSL1.xsl" ?>
      <!DOCTYPE Language SYSTEM "Student.dtd">
      <Language>
        <FileTag>
          <IncludeTag>include stdio.h:</IncludeTag>
        </FileTag>
        <FileTag>
          <IncludeTag>include math.h</IncludeTag>
        </FileTag>
        <FileTag>
          <SourceTag>code statement3:</SourceTag>
        </FileTag>
        <FileTag>
          <SourceTag>code statement2:</SourceTag>
        </FileTag>
        <Declaration>
          <Type_Name>char*</Type_Name>
          <Identifier>UW</Identifier>
        </Declaration>
      Callouts: the <?xml:stylesheet ...?> line calls an XSL style sheet; the <!DOCTYPE ...> line calls a DTD document.

  22. XML Document (page 2 of 2)
        <Declaration>
          <Type_Name>int</Type_Name>
          <Identifier>numOfstudents</Identifier>
        </Declaration>
        <Declaration>
          <Type_Name>char*</Type_Name>
          <Identifier>facultyName</Identifier>
        </Declaration>
        <Function_Call>
          <Return_Type>
            <Return_Var>student_profile</Return_Var>
          </Return_Type>
          <Function_Name>elec_eng</Function_Name>
          <Argument>
            <parameterName>name</parameterName>
          </Argument>
        </Function_Call>
      </Language>
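A minimal sketch of how a program might consume the document on slides 21 and 22, assuming Python's standard `xml.etree.ElementTree`. The excerpt leaves out the stylesheet and DOCTYPE lines because ElementTree does not validate against a DTD. Reading `Return_Var` as the variable that receives the call's result is an assumption made for this illustration; the slides do not say so.

```python
import xml.etree.ElementTree as ET

# Excerpt of the document from slides 21 and 22.
SOURCE = """
<Language>
  <FileTag><IncludeTag>include stdio.h:</IncludeTag></FileTag>
  <Declaration>
    <Type_Name>char*</Type_Name>
    <Identifier>UW</Identifier>
  </Declaration>
  <Declaration>
    <Type_Name>int</Type_Name>
    <Identifier>numOfstudents</Identifier>
  </Declaration>
  <Function_Call>
    <Return_Type><Return_Var>student_profile</Return_Var></Return_Type>
    <Function_Name>elec_eng</Function_Name>
    <Argument><parameterName>name</parameterName></Argument>
  </Function_Call>
</Language>
"""

root = ET.fromstring(SOURCE)

# Rebuild C-like text from the structured markup.
declarations = [
    f"{d.findtext('Type_Name')} {d.findtext('Identifier')};"
    for d in root.findall("Declaration")
]
# Assumption: Return_Var names the variable assigned from the call.
calls = [
    f"{c.findtext('Return_Type/Return_Var')} = "
    f"{c.findtext('Function_Name')}({c.findtext('Argument/parameterName')});"
    for c in root.findall("Function_Call")
]
print(declarations + calls)
```

Because the markup names each part of the program, the code selects parts by tag name instead of parsing C syntax.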

  23. XML Namespaces • Latest milestone for W3C's XML technology (14-January-1999) • W3C’s definition of XML Namespaces: • “XML namespaces provide a simple method for qualifying element and attribute names used in Extensible Markup Language documents by associating them with namespaces identified by URI references.” • Why use it? • Maintain tag meaningfulness and uniqueness • How does it solve the problem? • Add context to XML tags by using a prefix and a URI
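The uniqueness problem and its fix can be sketched as follows, assuming Python's standard `xml.etree.ElementTree`. The two `example.org` URIs and the `catalog` document are made up for the example.

```python
import xml.etree.ElementTree as ET

# Two vocabularies both define <title>; the URIs are made up for the example.
SOURCE = """
<catalog xmlns:bk="http://example.org/book"
         xmlns:hr="http://example.org/person">
  <bk:title>Introduction to XML</bk:title>
  <hr:title>Professor</hr:title>
</catalog>
"""

root = ET.fromstring(SOURCE)

# ElementTree expands each prefix to its URI: "{uri}local-part".
tags = [child.tag for child in root]

# Lookups go through the URI, so the prefix used in the query is arbitrary.
ns = {"book": "http://example.org/book"}
book_title = root.findtext("book:title", namespaces=ns)
print(tags, book_title)
```

The two `title` elements stay distinct because the parser compares the URI plus the local part, not the prefix that appears in the source text.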

  24. XSL Document (Page 1 of 3)
      <?xml version="1.0"?>
      <DIV xmlns:xsl="http://www.w3.org/TR/WD-xsl">
      <html:html xmlns:html="http://www.w3.org/TR/REC-html40">
      <i>This page consists of XML, XSL, Namespace, HTML, and Java Applet</i>
      <html:head><html:title><H1>Sample C Code (hidden XML tag)</H1></html:title></html:head>
      <xsl:for-each select="Language">
        <TD STYLE="padding-left:1em">
          <DIV><xsl:value-of select="/"/></DIV>
          <html:font color="red">The above command prints out all contents within tags without any formatting, ordering, linebreaks, etc.</html:font>
        </TD>
      </xsl:for-each>
      <xsl:for-each order-by="+ IncludeTag" select="Language/FileTag">
        <TD STYLE="padding-left:1em">
          <html:BR></html:BR>
          <DIV><html:BR><xsl:value-of select="IncludeTag"/></html:BR></DIV>
        </TD>
      </xsl:for-each>
      <html:font color="red">End of IncludeTag, ascending sort on Include Tag Content</html:font>
      Callouts: xmlns:xsl on the <DIV> line is the namespace for XSL; xmlns:html on the <html:html> line is the namespace for HTML.

  25. XSL Document (Page 2 of 3)
      <xsl:for-each order-by="+ SourceTag" select="Language/FileTag">
        <TD STYLE="padding-left:1em">
          <html:BR></html:BR>
          <DIV><xsl:value-of page-break-after="SourceTag" select="SourceTag"/></DIV>
        </TD>
      </xsl:for-each>
      <html:font color="red">End of SourceTag, ascending sort on SourceTag Content</html:font>
      <html:BR></html:BR>
      <xsl:for-each order-by="+ Type_Name" select="Language/Declaration">
        <TD STYLE="padding-left:1em">
          <html:BR></html:BR>
          <DIV><html:BR><xsl:value-of select="Type_Name"/></html:BR></DIV>
          <DIV><html:BR><xsl:value-of select="Identifier"/></html:BR></DIV>
        </TD>
      </xsl:for-each>
      <html:font color="red">End of Declaration, ascending sort on Type_Name</html:font>
      <DIV></DIV>

  26. XSL Document (Page 3 of 3)
      <xsl:for-each select="Language/Function_Call">
        <TD STYLE="padding-left:1em">
          <html:BR><DIV><xsl:value-of select="Return_Type"/></DIV></html:BR>
          <html:font color="red">End of Return_Type</html:font>
          <html:BR><DIV><xsl:value-of select="Function_Name"/></DIV></html:BR>
          <html:font color="red">End of Function_Name</html:font>
          <html:BR></html:BR>
          <html:BR><DIV><xsl:value-of select="Argument"/></DIV></html:BR>
          <html:font color="red">End of Argument</html:font>
          <html:BR></html:BR>
        </TD>
      </xsl:for-each>
      <html:BR></html:BR>
      <html:APPLET code="AgentAction.class" width="400" height="200"></html:APPLET>
      <html:BR></html:BR>
      </html:html>
      </DIV>
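For readers unfamiliar with the working-draft XSL syntax used in this style sheet, the sorted `for-each` over `Language/FileTag` can be approximated in ordinary code. The sketch below is Python, not XSL, and `render_includes` is a hypothetical helper. Unlike the style sheet, it skips `FileTag` elements that have no `IncludeTag` child instead of emitting an empty `<DIV>` for them.

```python
import xml.etree.ElementTree as ET
from html import escape

SOURCE = """
<Language>
  <FileTag><IncludeTag>include stdio.h:</IncludeTag></FileTag>
  <FileTag><IncludeTag>include math.h</IncludeTag></FileTag>
  <FileTag><SourceTag>code statement3:</SourceTag></FileTag>
</Language>
"""

def render_includes(root):
    """Emit one <DIV> per IncludeTag, sorted ascending by its text."""
    values = [
        file_tag.findtext("IncludeTag")
        for file_tag in root.findall("FileTag")
        if file_tag.find("IncludeTag") is not None
    ]
    return "".join(f"<DIV>{escape(v)}</DIV>" for v in sorted(values))

root = ET.fromstring(SOURCE)
print(render_includes(root))
```

The source document is never modified; only the generated HTML changes when the sort order or wrapper markup changes, which is the separation of formatting from content that XSL aims for.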

  27. Applications that require XML • Information exchange between heterogeneous databases • Health care example • Distributed processing • Semiconductor industry example • Multiple views of the same data • “Intelligent” information agents

  28. Using XML • XML for Storage • Compact syntax • Generalized and standardized • Product independent • XML for Searching • Use of content-specific markup enables robust searching • Search engines need to be XML aware • Can use current SGML search engines
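A small sketch of what content-specific searching buys, assuming Python's standard `xml.etree.ElementTree` and its limited path syntax. The data reuses the `Declaration` elements from the sample document.

```python
import xml.etree.ElementTree as ET

SOURCE = """
<Language>
  <Declaration><Type_Name>char*</Type_Name><Identifier>UW</Identifier></Declaration>
  <Declaration><Type_Name>int</Type_Name><Identifier>numOfstudents</Identifier></Declaration>
  <Declaration><Type_Name>char*</Type_Name><Identifier>facultyName</Identifier></Declaration>
</Language>
"""

root = ET.fromstring(SOURCE)

# Find declarations whose Type_Name child is exactly "char*".
matches = root.findall("./Declaration[Type_Name='char*']")
identifiers = [m.findtext("Identifier") for m in matches]
print(identifiers)  # ['UW', 'facultyName']
```

A plain-text search for `char*` could not tell a type name from the same characters appearing elsewhere; the markup restricts the match to the intended field.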

  29. What is DOM? • A programming API for XML • logical structure of document • Access and Manipulation of documents

  30. What is DOM? • As an object model, DOM identifies • Interface and Objects used for the doc. • Behaviours and Attributes • Relationships and Collaborations of Interfaces and Objects

  31. What is DOM? • 2 Major Components for DOM Level 1 • DOM Core = Basic functionalities for XML • DOM HTML = Objects and Methods specific to HTML • Level 2 • DOM CSS, DOM Event, DOM Filters and Iterators, DOM Range

  32. Advantages of using DOM • Easy to create, navigate, add, modify documents • DOM abstraction avoids implementation dependencies • DOM applications may use additional language bindings
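The create, navigate, add and modify operations can be sketched with `xml.dom.minidom`, Python's binding of the W3C DOM. The element names are borrowed from the sample document; the choice of minidom is an assumption of this sketch, not something the slides prescribe.

```python
from xml.dom.minidom import Document

doc = Document()

# Create: build <Declaration><Type_Name>int</Type_Name></Declaration>.
declaration = doc.createElement("Declaration")
doc.appendChild(declaration)
type_name = doc.createElement("Type_Name")
type_name.appendChild(doc.createTextNode("int"))
declaration.appendChild(type_name)

# Add: append a second child element.
identifier = doc.createElement("Identifier")
identifier.appendChild(doc.createTextNode("numOfstudents"))
declaration.appendChild(identifier)

# Modify: change the text held by the first child.
type_name.firstChild.data = "long"

# Navigate: walk from the root element to its children.
names = [child.tagName for child in declaration.childNodes]
print(names, declaration.toxml())
```

The same method names (`createElement`, `appendChild`, `childNodes`) appear in other DOM language bindings, which is the implementation independence the slide refers to.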

  33. A Typical DOM Structure <condition_statement> <if_statement> <if_tag> if </if_tag> <expression_tag> (b == c) </expression_tag> <statement_tag> {a += c} </statement_tag> </if_statement> </condition_statement>

  34. A Typical DOM Structure (2) • [Tree diagram: <condition_statement> at the root, with child <if_statement>, whose children are <if_tag> (text: if), <expression_tag> (text: (b==c)) and <statement_tag> (text: {a+=c})]

  35. A Typical DOM Structure (3) • DOM abstraction is a Tree or Forest Structure • Users have full flexibility to specify the structure • Structural Isomorphism
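The tree view of the `condition_statement` example can be reproduced with a short depth-first walk. This is a sketch using Python's `xml.dom.minidom`; the `outline` function is a hypothetical helper written for this illustration.

```python
from xml.dom.minidom import parseString

SOURCE = (
    "<condition_statement><if_statement>"
    "<if_tag>if</if_tag>"
    "<expression_tag>(b == c)</expression_tag>"
    "<statement_tag>{a += c}</statement_tag>"
    "</if_statement></condition_statement>"
)

def outline(node, level=0):
    """Return one indented line per element or text node, depth first."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * level + node.tagName)
    elif node.nodeType == node.TEXT_NODE:
        lines.append("  " * level + repr(node.data))
    for child in node.childNodes:
        lines.extend(outline(child, level + 1))
    return lines

document = parseString(SOURCE)
lines = outline(document.documentElement)
print("\n".join(lines))
```

Each element becomes a tree node and each run of character data becomes a text leaf beneath it, so the DOM tree mirrors the nesting of the markup.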

  36. Some Key Objects • Node • Tree node of the document • root node, parents and children • Element (is a Node object) • Elements of a document • Represents contents between the start tag and end tag • Attributes: defined by DTD

  37. Some Key Objects (2) • Document • root node of a document • NodeIterator • iterates over a set of nodes specified by a filter • AttributeList • collection of Attribute objects, indexed by attribute name

  38. Some Key Objects (3) • Attribute • attribute of an Element Object • DocumentContext • repository for metadata about a document • DOM • provides instance-independent document operations
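Several of these objects have direct counterparts in Python's `xml.dom.minidom`, sketched below. The names on the slides appear to follow an early DOM draft; minidom exposes an element's attributes through `NamedNodeMap` instead of `AttributeList`, and this sketch does not cover `NodeIterator` or `DocumentContext`. The `memo` document and its attributes are made up for the example.

```python
from xml.dom.minidom import parseString

document = parseString('<memo priority="high" lang="en"><to>All staff</to></memo>')

# Document: the root node of the tree.
root = document.documentElement            # Element: <memo>
to = root.firstChild                       # Element: <to>

# Attributes are reachable by name through a map on the element;
# minidom calls it NamedNodeMap rather than AttributeList.
attributes = root.attributes
names = sorted(attributes.keys())
priority = attributes["priority"].value    # an Attr node's value

kinds = (
    document.nodeType == document.DOCUMENT_NODE,
    root.nodeType == root.ELEMENT_NODE,
    attributes["priority"].nodeType == root.ATTRIBUTE_NODE,
    to.parentNode is root,
)
print(names, priority, kinds)
```

Document, Element and Attr all share the Node interface, which is why the same `nodeType` check works on each of them.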

  39. Memory Management for DOM • DOM APIs operate across a variety of memory implementation methods: • Language platforms that do not expose memory management to user • Language (Java) that provides constructors with Garbage collection capability • Language (C/C++) that requires explicit memory allocations

  40. Resources/Quirks • IE 5 and Navigator 5.0 implement different features: • IE 5.0: XML/XSL • Navigator: XML/CSS • Navigator to support RDF • XML Resources: • http://www.swen.uwaterloo.ca/~group1

  41. Using XML (cont.) • XML for Presentation • Convert to HTML at server • Use Java applications to render in browser (slow) • Use XSL or CSS to render in browser (fast)

  42. XML in the industry • Explosive growth of XML tools and specifications • Tools: JADE, MSXML, JUMBO, ... • Specifications: CDF, CFML, EDI • Browsers: IE, Netscape

  43. Thoughts on XML • Seems like a transition stage between HTML and SGML • Will we eventually end up using SGML? • XML follows basic principles of SE • Higher abstraction layer • Reuse • Modularity

  44. References • XML.COM - A guide to XML • http://www.xml.com/xml/pub/w3j/s3.walsh.html • XML.COM - The Road to XML: Adapting SGML to the Web • http://www.xml.com/xml/pub/w3j/s1.discussion.html • The Computer Bulletin - The XML Files • http://www.bcs.org.uk/publicat/ebull/may98/xml.htm • XML, Java, and the future of the Web • http://sunsite.unc.edu/pub/sun-info/standards/xml/why/xmlapps.htm • XML: What is it • http://iai.sgml.com/980106-01.asp • Why do we need XML? • http://info.admin.kth.se/SGML/Konferenser/xml98sve/seminar.html • An Introduction to the Standard Generalized Markup Language • http://www.personal.u-net.com/~sgml/sgml.htm • SGML101 • http://www.uslynx.com/sgml101.htm
