XML (eXtensible Markup Language) is a versatile markup language that blends flexibility, simplicity, and readability for both machines and humans. Unlike HTML, a presentation-oriented language with a set tag structure, XML does not have fixed tags, allowing for extensibility based on domain needs. This guide examines XML's structure, characteristics, and applications, including data exchange, B2B communications, and separation of content and presentation. It also highlights the relationship between XML, HTML, and CSS, providing an overview of their respective roles in web technology.
XML (I) COSC 643 Sungchul Hong
XML • eXtensible Markup Language • XML offers a unique combination of flexibility, simplicity, and readability by both humans and machines. • Format and converting formats • Standard Generalized Markup Language (SGML) predecessor
XML and HTML • HTML • A presentation-based markup language • Has a fixed set of tags • XML • A domain-based markup language • Has no fixed set of tags • Extensible
XML and HTML • SGML descendants • XML • HTML • HTML’s syntax has always been looser and more forgiving • Cascading Style Sheets Level 1 specification
MSXML3 • http://www.microsoft.com/msdownload • msxml3sp2Setup.exe • Install • Testing
test.xml <?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="test.xsl"?> <document> <message> It worked! </message> </document>
test.xsl <?xml version="1.0"?> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/"> <html> <body> <h1><xsl:value-of select="//message" /></h1> </body> </html> </xsl:template> </xsl:stylesheet>
CSS • <HTML> • <HEAD><TITLE>Formatting with CSS, modifying standard HTML</TITLE> • <STYLE TYPE="text/css"> • H1 {font-family: Arial, Helvetica; font-weight: bold; font-size: 24pt} • EM {font-weight: bold; font-style: normal} • CITE {font-style: italic} • VAR {font-family: Courier; font-weight: bold} • CODE {font-family: Courier} LI {font-family: Arial, Helvetica} • </STYLE> • </HEAD>
CSS • <BODY> • <H1>Introduction to HTML</H1> • <P>This page has been created purely with logical tags. No additional formatting has been specified by the designers.</P> • <P>While it might be nice to specify text like we could in QuarkXPress, we'll settle for applying <EM>emphasis</EM> where appropriate, <CITE>citations</CITE> when necessary, and maybe highlight a <VAR>variable</VAR> along the way. We can also indicate code listings:</P>
CSS • <CODE> 10 PRINT "HELLO WORLD"<BR> 20 END<BR> </CODE> <P>Bulleted lists are easy too:</P> • <UL> <LI>HTML Structures</LI> <LI>CSS Structures</LI> <LI>XML Structures</LI> • </UL> • <P>Numbered and lettered lists are also fun:</P> <OL> <LI>Item #1</LI> <LI>Item #2</LI> </OL> • </BODY> • </HTML>
XSL • eXtensible Stylesheet Language • XSL goes beyond CSS by creating formatting structures for documents as well as elements. • XSL allows developers to create styles that take into account (or even modify) an element’s position in a document, its ancestry (the elements that contain it), and its uniqueness.
XML Parsers • Used to extract or analyze the data in XML documents • Analyzes syntax of input XML documents • Passes results of analysis to applications using event callbacks • Reports errors and warnings discovered
XML Parsers • Simple API for XML (SAX) • No modification of the document • Fastest and least memory intensive • Sequential access • Document Object Model (DOM) • Memory intensive • Allows modification • Tree structure
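The contrast between the two parser styles can be sketched with Python's standard library, used here only as an illustration (the slides themselves use MSXML). The `MessageHandler` class and the variable names are made up for this sketch; the input is the test.xml document from the earlier slide.

```python
import xml.sax
from xml.dom import minidom

XML_TEXT = "<document><message>It worked!</message></document>"


class MessageHandler(xml.sax.ContentHandler):
    """Hypothetical handler: collects the text of <message> elements."""

    def __init__(self):
        super().__init__()
        self.in_message = False
        self.messages = []

    def startElement(self, name, attrs):
        # Called by the parser when a start-tag is read.
        if name == "message":
            self.in_message = True
            self.messages.append("")

    def characters(self, content):
        # Character data may arrive in several chunks, so accumulate it.
        if self.in_message:
            self.messages[-1] += content

    def endElement(self, name):
        if name == "message":
            self.in_message = False


# SAX: sequential, read-only access through event callbacks.
handler = MessageHandler()
xml.sax.parseString(XML_TEXT.encode("utf-8"), handler)
sax_messages = handler.messages

# DOM: the whole document is held in memory as a tree and can be modified.
dom = minidom.parseString(XML_TEXT)
message_node = dom.getElementsByTagName("message")[0]
message_node.firstChild.data = "Changed in memory"
dom_xml = dom.documentElement.toxml()
```

The SAX handler never holds more than the current event, which is why that style suits large documents; the DOM version trades memory for random access and in-place edits.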
What Can XML Be Used For? • Exchanging information between applications • Sharing data between distributed components • B2B communications (with XSL, XSLT) • Creating separation of presentation from content • Defining configuration information
XML Characteristics • XML documents contain a hierarchy of tags • The tag structure is like HTML • XML is case-sensitive • XML is not a superset of HTML; both descend from SGML • An HTML file is an XML document only if it is well-formed, as in XHTML.
XML Elements • Basic components of XML documents • Element names must start with a letter, underscore, or colon • Encapsulate element content, usually composed of: • Other elements • Character data • Entity references • Delimited using tags
XML Elements • All elements must have a start-tag and an end-tag. • Elements can optionally have attributes • Empty elements can use an abbreviated element form.
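As a small illustration of the abbreviated empty-element form, the sketch below parses both spellings with Python's standard `xml.etree.ElementTree` and compares the results. The element name `togo` and the variable names are chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Full form: explicit start-tag and end-tag with nothing in between.
full_form = ET.fromstring("<togo></togo>")
# Abbreviated form: a single empty-element tag.
short_form = ET.fromstring("<togo/>")

# Both parse to an element with the same name, no text, and no children.
same_element = (
    full_form.tag == short_form.tag
    and full_form.text == short_form.text
    and len(full_form) == len(short_form)
)
```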
<ITEM><PRODNAME>Jimbo's Super Clock</PRODNAME>: <PART>SC45-A</PART> <PRICE>$199.95</PRICE> (<AIRF>$19.95</AIRF> freight/air, <GROUNDF>$7.95</GROUNDF> ground) <WARRANTY>Twenty-five year</WARRANTY> Warranty. Made in <ORIGIN>Canada</ORIGIN></ITEM> <ITEM><PRODNAME>Lamp Controller</PRODNAME>: <PART>LC45-X</PART> <PRICE>$25.95</PRICE> (<AIRF>$9.95</AIRF> freight/air, <GROUNDF>$4.95</GROUNDF> ground) <WARRANTY>Ten-year</WARRANTY> Warranty. Made in <ORIGIN>Canada</ORIGIN></ITEM> <ITEM><PRODNAME>Electroshock Clips</PRODNAME>: <PART>ES45-L</PART> <PRICE>$59.95</PRICE> (<AIRF>$9.95</AIRF> freight/air, <GROUNDF>$4.95</GROUNDF> ground) <WARRANTY>One-year</WARRANTY> Warranty. Made in <ORIGIN>USA</ORIGIN></ITEM>
The Document Entity And Document Element • An XML document has one and only one document entity. • The document entity consists of • Processing instructions • Comments • The document element • The document element is the parent of all other elements in the XML document • The document element cannot be contained in any other element.
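A minimal sketch of the single-document-element rule, using Python's standard `xml.etree.ElementTree` as a stand-in parser. The sample strings and variable names are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<!-- a comment in the prolog -->
<document><message>It worked!</message></document>"""

# The parser returns the document element; every other element sits below it.
root = ET.fromstring(DOC)
root_tag = root.tag
child_tags = [child.tag for child in root]

# A second top-level element is rejected: only one document element is allowed.
try:
    ET.fromstring("<a/><b/>")
    second_root_accepted = True
except ET.ParseError:
    second_root_accepted = False
```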
Element Nesting • All elements must be nested properly • No overlapping of elements • HTML browsers tolerate overlapping tags; XML parsers do not. <ITEM><PRODNAME>Jimbo's Super Clock</PRODNAME> </ITEM>
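The nesting rule can be checked mechanically. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper `is_well_formed` is a hypothetical name introduced for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text, False on a parse error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Properly nested: PRODNAME closes before ITEM does.
nested = "<ITEM><PRODNAME>Jimbo's Super Clock</PRODNAME></ITEM>"
# Overlapping: ITEM is closed while PRODNAME is still open.
overlapping = "<ITEM><PRODNAME>Jimbo's Super Clock</ITEM></PRODNAME>"

nested_ok = is_well_formed(nested)
overlapping_ok = is_well_formed(overlapping)
```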
XML Namespaces • XML Namespaces allow a prefix to be associated with an element to avoid name collisions • XML Namespaces are a W3C specification • A unique URI must be bound to a prefix to distinguish elements in this namespace from those in other namespaces. • The URI only serves as a unique identifier for the namespace. It is not actually resolved • Namespaces are declared with the reserved attribute name xmlns
Example • <?xml version="1.0"?> • <JU:LunchMenu xmlns:JU="http://catering.com/JU"> • <JU:MainCourse>Hamburger</JU:MainCourse> • <JU:Sidedish>French Fries</JU:Sidedish> • <JU:togo/> • </JU:LunchMenu>
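To show how a parser treats the prefix and the URI, here is a sketch with Python's standard `xml.etree.ElementTree`, which expands each prefixed name to `{URI}localname`. The lunch menu is written out with matching tags; the lookup prefix `menu` is an arbitrary choice for this example, since only the URI has to match.

```python
import xml.etree.ElementTree as ET

MENU = """<JU:LunchMenu xmlns:JU="http://catering.com/JU">
  <JU:MainCourse>Hamburger</JU:MainCourse>
  <JU:Sidedish>French Fries</JU:Sidedish>
  <JU:togo/>
</JU:LunchMenu>"""

# Parsing never fetches the URI; it is used purely as an identifier.
root = ET.fromstring(MENU)
root_tag = root.tag

# The prefix used for lookup need not be "JU"; it is mapped to the same URI.
namespaces = {"menu": "http://catering.com/JU"}
main_course = root.find("menu:MainCourse", namespaces).text
side_dish = root.find("menu:Sidedish", namespaces).text
```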
Character Data • Character data is defined as any text that is not markup. • Character data • The textual content inside elements • The value of an attribute • A string literal • "&" and "<" cannot appear literally inside character data; use entity references instead. • <math> 1 &lt; 2 &amp; 2 &lt; 3</math> • represents "1 < 2 & 2 < 3"
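When generating XML from program data, the escaping can be left to a library. The sketch below uses `escape` from Python's standard `xml.sax.saxutils` and then parses the result back; the variable names are chosen for this example.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

raw = "1 < 2 & 2 < 3"

# escape() replaces &, < and > with the corresponding entity references.
escaped = escape(raw)
element_text = "<math>" + escaped + "</math>"

# Parsing turns the entity references back into the original characters.
round_trip = ET.fromstring(element_text).text
```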
Attributes • Elements can contain attributes to provide information about the element • Attributes are often used to convey information to an XML application • Attributes are not considered part of an element’s content • Attribute values must be quoted string literals • Attributes are not part of the presentation to an end user, though they may be used to affect the presentation.
Example • <Dessert type="Lowfat">Cheesecake</Dessert> • <Motor cylinders="6"/>
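A short sketch of how an application might read attributes, again with Python's standard `xml.etree.ElementTree`. It illustrates that attribute values are kept apart from element content and always arrive as strings; converting `cylinders` to an integer is the application's own decision, assumed here for illustration.

```python
import xml.etree.ElementTree as ET

dessert = ET.fromstring('<Dessert type="Lowfat">Cheesecake</Dessert>')
motor = ET.fromstring('<Motor cylinders="6"/>')

# The attribute is separate from the element's content.
dessert_type = dessert.get("type")
dessert_content = dessert.text

# Attribute values are strings; any numeric meaning is applied by the application.
cylinders_raw = motor.get("cylinders")
cylinders = int(cylinders_raw)
```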
White Space • XML defines white space • Horizontal tab • Line feed • Carriage return • Space • All end of line characters are converted to line feed characters by parsers.
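End-of-line normalization can be observed directly. This sketch feeds a string containing Windows-style and old Mac-style line endings to Python's standard `xml.etree.ElementTree` (which is built on the Expat parser); the sample text is made up for the example.

```python
import xml.etree.ElementTree as ET

# Input mixes CR LF and a lone CR as line endings.
source = "<note>line one\r\nline two\rline three</note>"

# The parser hands the application line feeds only.
text = ET.fromstring(source).text
lines = text.split("\n")
```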
Entity References • &lt; (<) • &amp; (&) • &gt; (>) • &apos; (') • &quot; (")
Processing Instructions • Processing instructions allow non-content information to be passed through the parser to an application • Processing instructions use the following syntax: <?target instructions ?> • Any PI whose target starts with xml- is designed to communicate with an XML-specific technology • PIs can be used to communicate information to an XSL processor, for example • <?xml-stylesheet href="MyXSL.xsl" type="text/xml"?>
Comments • <!-- comment text -->
CDATA Section • CDATA sections are useful when there is content that would require a lot of escape characters • CDATA sections can be used anywhere regular character data can be used • An XML parser will not attempt to process any data in a CDATA section • CDATA syntax: • <![CDATA[ 1 < 2 & 2 < 3 ]]>
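The five predefined entity references can be verified by letting a parser decode each one. The sketch uses Python's standard `xml.etree.ElementTree`; the wrapper element `m` and the dictionary name are arbitrary.

```python
import xml.etree.ElementTree as ET

# Predefined entity references and the characters they stand for.
PREDEFINED = {
    "&lt;": "<",
    "&amp;": "&",
    "&gt;": ">",
    "&apos;": "'",
    "&quot;": '"',
}

# Wrap each reference in an element and read back the decoded text.
decoded = {ref: ET.fromstring("<m>" + ref + "</m>").text for ref in PREDEFINED}
```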
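A sketch of how an application sees a processing instruction, using Python's standard `xml.dom.minidom`. The document string is an assumption built around the stylesheet PI from the slide; the parser passes the target and the rest of the instruction through without interpreting them.

```python
from xml.dom import Node, minidom

DOC = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet href="MyXSL.xsl" type="text/xml"?>'
    "<document/>"
)

dom = minidom.parseString(DOC)

# Collect the processing-instruction nodes that precede the document element.
# The XML declaration itself is not reported as a processing instruction.
instructions = [
    node for node in dom.childNodes
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]

pi_target = instructions[0].target  # the name right after "<?"
pi_data = instructions[0].data      # everything after the target, uninterpreted
```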
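To illustrate that a CDATA section is just another way of writing character data, the sketch below parses the same expression twice with Python's standard `xml.etree.ElementTree`: once inside CDATA and once with entity references. The element name `math` follows the earlier character-data slide.

```python
import xml.etree.ElementTree as ET

# Inside CDATA, < and & are taken literally and need no escaping.
cdata_text = ET.fromstring("<math><![CDATA[ 1 < 2 & 2 < 3 ]]></math>").text

# The same content written with entity references.
escaped_text = ET.fromstring("<math> 1 &lt; 2 &amp; 2 &lt; 3 </math>").text

# The application receives identical character data either way.
same_content = cdata_text == escaped_text
```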