Extensible Markup Language
Natawut Nupairoj, Ph.D.
Department of Computer Engineering
Chulalongkorn University

Outline
• Overview
• Basic XML Syntax
• User-Defined XML Structure
• Document Type Definition

Overview
• What is a markup language?
    • Old-style communication for editing.
    • Between writer and editor.
    • Example: in "This are is a mark-up.", the editor strikes out "are"; in "We can ^ more text.", a caret (^) marks where "add" is to be inserted.
    • Sometimes called a "metalanguage".

Overview
• Family of computer markup languages
    • Standard Generalized Markup Language (SGML)
        • Father of them all.
        • Complex.
    • HyperText Markup Language (HTML)
        • The most popular child.
        • Focus on presentation: for humans.

Overview
• Extensible Markup Language (XML)
    • Has become increasingly popular.
    • Similar to HTML.
    • Focuses on describing data.
        • For humans and machines.
    • Extensible
        • A language for creating other languages.
        • Base syntax.
        • User-defined structure.

Example
    <?xml version="1.0" encoding="UTF-8"?>
    <endangered_species>
      <animal>
        <name language="English">Tiger</name>
        <name language="Latin">Panthera tigris</name>
        <threats>
          <threat>poachers</threat>
          <threat>habitat destruction</threat>
          <threat>trade in tiger bones for traditional Chinese medicine (TCM)</threat>
        </threats>
        <weight>500 pounds</weight>
        <length>3 yards from nose to tail</length>
        ...
    </endangered_species>

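To see how a program consumes a document like the example above, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The sample is a trimmed copy of the slide's document: the elided part is left out and <animal> is closed so that it parses. The function name summarize and the shape of its result are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example; the elided "..." part is omitted and
# <animal> is closed so that the sample is well-formed.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<endangered_species>
  <animal>
    <name language="English">Tiger</name>
    <name language="Latin">Panthera tigris</name>
    <threats>
      <threat>poachers</threat>
      <threat>habitat destruction</threat>
    </threats>
    <weight>500 pounds</weight>
  </animal>
</endangered_species>
"""


def summarize(xml_bytes):
    """Return one dict per <animal> with its names, threats and weight."""
    root = ET.fromstring(xml_bytes)
    animals = []
    for animal in root.findall("animal"):
        # The language attribute (metadata) becomes the key,
        # the element content (data) becomes the value.
        names = {n.get("language"): n.text for n in animal.findall("name")}
        threats = [t.text for t in animal.findall("threats/threat")]
        animals.append({
            "names": names,
            "threats": threats,
            "weight": animal.findtext("weight"),
        })
    return animals


if __name__ == "__main__":
    for entry in summarize(SAMPLE.encode("utf-8")):
        print(entry)
```

The same data is readable by a person in the raw file and by a program through the parser, which is the "for humans and machines" point made earlier.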
XML Siblings
• XML structure definition
    • Document Type Definition (DTD).
    • XML Schema.
• XML parsers
    • DOM.
    • SAX.
• XML-related technologies
    • XSLT.
    • XPath.

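The two parser styles listed above differ in how they hand the document to the program. The sketch below, which assumes Python's standard xml.dom.minidom and xml.sax modules, extracts the same <threat> values both ways: DOM loads the whole tree and then queries it, while SAX reports events as the parser walks through the text. The names DOC, threats_dom, threats_sax and ThreatHandler are made up for this illustration.

```python
import xml.dom.minidom
import xml.sax

DOC = (b"<animal><name>Tiger</name><threats>"
       b"<threat>poachers</threat><threat>habitat destruction</threat>"
       b"</threats></animal>")


def threats_dom(data):
    """DOM style: build the full tree in memory, then look things up."""
    dom = xml.dom.minidom.parseString(data)
    return [node.firstChild.data for node in dom.getElementsByTagName("threat")]


class ThreatHandler(xml.sax.ContentHandler):
    """SAX style: react to start/characters/end events as they arrive."""

    def __init__(self):
        super().__init__()
        self.threats = []
        self._buffer = None  # collects text only while inside <threat>

    def startElement(self, name, attrs):
        if name == "threat":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "threat":
            self.threats.append("".join(self._buffer))
            self._buffer = None


def threats_sax(data):
    handler = ThreatHandler()
    xml.sax.parseString(data, handler)
    return handler.threats


if __name__ == "__main__":
    print(threats_dom(DOC))
    print(threats_sax(DOC))
```

DOM is usually the more convenient choice for small documents; the event style keeps memory use low because no tree is built.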
XML Components
• Element: tag and content
    • Data.
    <name>Tiger</name>

XML Components
• Attribute: name and value
    • Metadata = data about data.
    <name language="English">Tiger</name>
• The same information written with child elements instead:
    <name>
      <language>English</language>
      <text>Tiger</text>
    </name>

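The choice between an attribute and a child element changes how a program reads the value. A small sketch with Python's standard xml.etree.ElementTree (the helper names are hypothetical):

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring('<name language="English">Tiger</name>')
as_elements = ET.fromstring(
    "<name><language>English</language><text>Tiger</text></name>"
)


def read_attribute_style(element):
    # Metadata comes from the attribute, data from the element content.
    return element.get("language"), element.text


def read_element_style(element):
    # Both pieces of information are ordinary child elements.
    return element.findtext("language"), element.findtext("text")


if __name__ == "__main__":
    print(read_attribute_style(as_attribute))
    print(read_element_style(as_elements))
```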
XML Components
• Nested element
    <animal>
      <name language="English">Tiger</name>
      <name language="Latin">Panthera tigris</name>
      <weight>500 pounds</weight>
    </animal>

XML Components
• Empty element
    <animal></animal>
    <animal />
    <picture filename="tiger.jpg" />

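The two spellings of an empty element are equivalent to a parser. A minimal check, assuming Python's standard xml.etree.ElementTree; is_empty is a helper invented for this sketch:

```python
import xml.etree.ElementTree as ET


def is_empty(element):
    """True when the element has no child elements and no text content."""
    return len(element) == 0 and not (element.text or "").strip()


long_form = ET.fromstring("<animal></animal>")
short_form = ET.fromstring("<animal />")
picture = ET.fromstring('<picture filename="tiger.jpg" />')

if __name__ == "__main__":
    print(is_empty(long_form), is_empty(short_form))
    # An empty element can still carry information in its attributes.
    print(picture.get("filename"))
```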
XML Components
• Special symbols
    • &amp; for ampersand (&).
    • &lt; for less-than sign (<).
    • &gt; for greater-than sign (>).
    • &quot; for double quotation mark (").
    • &apos; for single quotation mark or apostrophe (').
    <weight>&lt;500 pounds</weight>

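Programs rarely write these entity references by hand. The sketch below assumes Python's standard xml.sax.saxutils helpers: escape and unescape handle &, < and > by default, and the two quotation marks are passed in explicitly. The names QUOTES, to_xml_text and from_xml_text are made up for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

# escape() covers &, < and > on its own; quotes must be requested.
QUOTES = {'"': "&quot;", "'": "&apos;"}


def to_xml_text(raw):
    """Replace the five special symbols with their entity references."""
    return escape(raw, QUOTES)


def from_xml_text(escaped):
    """Turn entity references back into the original characters."""
    return unescape(escaped, {ref: char for char, ref in QUOTES.items()})


if __name__ == "__main__":
    escaped = to_xml_text("<500 pounds")
    print(escaped)
    # A parser undoes the escaping when it reads the element content.
    print(ET.fromstring("<weight>" + escaped + "</weight>").text)
```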
XML Components
• Comment
    <!-- This is a comment.
         It can span multiple lines. -->

Basic XML Syntax
• All XML files/applications must conform to the basic XML syntax.
• The XML declaration is not required (but recommended).
    <?xml version="1.0"?>
    <endangered_species>
      <name>Tiger</name>
    </endangered_species>

Basic XML Syntax
• One and only one root element.
    <?xml version="1.0"?>
    <endangered_species>
      <name>Tiger</name>
    </endangered_species>

Basic XML Syntax
• Balanced and matched opening/closing tags.
    <?xml version="1.0"?>
    <endangered_species>
      <name>Tiger</name>
      <picture filename="tiger.jpg" />
    </endangered_species>

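Basic XML Syntax
• Case-sensitive. The following is not well-formed, because name and Name are different tags:
    <name>Tiger</Name>
• Case-sensitive. Attribute names follow the same rule:
    <picture filename="tiger.jpg" />
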
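The basic syntax rules above (optional declaration, exactly one root element, matched tags, case sensitivity) are what a parser enforces before it looks at anything else. A minimal sketch, assuming Python's standard xml.etree.ElementTree, whose parser raises ParseError on any violation; the function name is_well_formed is an assumption.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text obeys the basic XML syntax rules."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


CASES = {
    "with declaration": '<?xml version="1.0"?><a><name>Tiger</name></a>',
    "without declaration": "<a><name>Tiger</name></a>",
    "two root elements": "<a></a><b></b>",
    "mismatched tags": "<endanger_species></endangered_species>",
    "case mismatch": "<name>Tiger</Name>",
    "unclosed empty element": '<a><picture filename="tiger.jpg"></a>',
}

if __name__ == "__main__":
    for label, text in CASES.items():
        print(label, "->", is_well_formed(text))
```

This checks well-formedness only. Whether the document also follows a user-defined structure is a separate question, answered by validation.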
User-Defined XML Structure
• XML basic syntax
    • The pattern of all XML documents.
    • Says nothing about "structure".
    • Follows the basic syntax = well-formed document.
• User-defined XML structure
    • Which "tags" and "attributes" are allowed.
    • Describes the structure.
    • Follows the "structure" = valid document.

Parser and DTD
• The XML parser checks the input against the basic syntax and the DTD.
[Diagram: the XML Document and the DTD go into the XML Parser, which answers Yes/No]

Document Type Definition (DTD)
• Old-fashioned, simple, but widely used.
• Internal DTD.
    <?xml version="1.0"?>
    <!DOCTYPE endangered_species [
      ...
    ]>
    <endangered_species>
      <animal>
      ...

Document Type Definition (DTD)
• External DTD.
    <?xml version="1.0" standalone="no"?>
    <!DOCTYPE endangered_species SYSTEM "http://www.natawut.com/xml/my_xml.dtd">
    <endangered_species>
      <animal>
      ...

Defining Elements
    <!ELEMENT endangered_species (animal)>
    <!ELEMENT picture EMPTY>
    <!ELEMENT endangered_species ANY>

Defining Elements
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT weight (#PCDATA)>
    <!ELEMENT threat (#PCDATA)>

    <name language="English">Tiger</name>
    <weight>500 pounds</weight>
    ...

Defining Elements
    <!ELEMENT animal (name, threats, weight, length, source, picture, subspecies)>

    <animal>
      <name language="English">Tiger</name>
      <threats>
        <threat>poachers</threat>
      </threats>
      <weight>500 pounds</weight>
      ...
    </animal>

Defining Elements
    <!ELEMENT characteristics ((weight, length) | picture)>

    <characteristics>
      <weight>500 pounds</weight>
      <length>3 yards from nose to tail</length>
    </characteristics>

    <characteristics>
      <picture filename="tiger.jpg"/>
    </characteristics>

Defining Elements
    <!ELEMENT animal (name+, threats, weight?, length?, source, picture, subspecies*)>
    <!ELEMENT threats (threat, threat, threat+)>

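The content models above read much like regular expressions: a comma means "followed by", | means "or", and ?, * and + mean "optional", "zero or more" and "one or more". The sketch below makes that analogy concrete by translating a content model into a Python regular expression over the list of child element names. It is an illustration only, not how a real validating parser works; it handles element-only models (no #PCDATA, EMPTY or ANY), and both function names are invented.

```python
import re


def content_model_to_regex(model):
    """Translate an element-only DTD content model into a compiled regex.

    Each child name becomes a group that matches the name followed by one
    space, so "threat" can never be confused with "threats".
    """
    pattern = re.sub(r"\s+", "", model)       # whitespace is insignificant
    pattern = pattern.replace("(", "(?:")     # plain (non-capturing) groups
    pattern = re.sub(
        r"[A-Za-z_][\w.\-]*",
        lambda m: "(?:" + re.escape(m.group()) + " )",
        pattern,
    )
    pattern = pattern.replace(",", "")        # a sequence is concatenation
    return re.compile(pattern)


def children_match(model, child_names):
    """Check a list of child element names against a content model."""
    sequence = "".join(name + " " for name in child_names)
    return content_model_to_regex(model).fullmatch(sequence) is not None


ANIMAL = "(name+, threats, weight?, length?, source, picture, subspecies*)"
THREATS = "(threat, threat, threat+)"
CHARACTERISTICS = "((weight, length) | picture)"

if __name__ == "__main__":
    print(children_match(ANIMAL, ["name", "name", "threats", "source", "picture"]))
    print(children_match(THREATS, ["threat", "threat"]))
    print(children_match(CHARACTERISTICS, ["picture"]))
```

Read this way, (threat, threat, threat+) requires at least three threat elements, and name+ requires at least one name.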
Defining Attributes
    <!ELEMENT population (#PCDATA)>
    <!ATTLIST population year CDATA #IMPLIED>

    <population>445</population>
    <population year="2002">445</population>
    <population year="year-rabbit">445</population>

Defining Attributes
    <!ELEMENT population (#PCDATA)>
    <!ATTLIST population year CDATA #REQUIRED>

    <!ELEMENT population (#PCDATA)>
    <!ATTLIST population year (2002|2003) #REQUIRED>

Defining Attributes
    <!ELEMENT population (#PCDATA)>
    <!ATTLIST population year CDATA "2002">

    <!ELEMENT population (#PCDATA)>
    <!ATTLIST population year CDATA #FIXED "2002">

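The attribute declarations above differ in what happens when the attribute is omitted or given an unexpected value. The function below is a hand-written sketch of those rules, not the API of any XML library; its name and the way a declaration is encoded as Python values are assumptions made for illustration.

```python
def resolve_attribute(decl_type, default_decl, given=None):
    """Return the effective attribute value, or raise ValueError if invalid.

    decl_type    -- "CDATA" (any text) or a tuple of allowed values
    default_decl -- "#REQUIRED", "#IMPLIED", ("#FIXED", value),
                    or a plain string used as the default value
    given        -- the value written in the document, None if omitted
    """
    if given is not None and decl_type != "CDATA" and given not in decl_type:
        raise ValueError("value %r is not one of %r" % (given, decl_type))
    if default_decl == "#REQUIRED":
        if given is None:
            raise ValueError("required attribute is missing")
        return given
    if default_decl == "#IMPLIED":
        return given                      # optional, no default supplied
    if isinstance(default_decl, tuple):   # ("#FIXED", value)
        fixed = default_decl[1]
        if given is not None and given != fixed:
            raise ValueError("attribute is fixed to %r" % fixed)
        return fixed
    return given if given is not None else default_decl


if __name__ == "__main__":
    print(resolve_attribute("CDATA", "#IMPLIED"))               # omitted
    print(resolve_attribute("CDATA", "#IMPLIED", "year-rabbit"))
    print(resolve_attribute("CDATA", "2002"))                   # default
    print(resolve_attribute("CDATA", ("#FIXED", "2002")))
```

With CDATA any text is accepted, which is why year="year-rabbit" passes under the #IMPLIED declaration; the enumerated type (2002|2003) is what restricts the value.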
Putting Them Together
    <!ELEMENT endangered_species (animal*)>
    <!ELEMENT animal (name+, threats, weight?, length?, source, picture, subspecies+)>
    <!ELEMENT name (#PCDATA)>
    <!ATTLIST name language (English | Latin) #IMPLIED>
    ...