This document discusses validating XML instances using NVDL (Namespace-based Validation Dispatching Language). The "what" part identifies, using a taxonomy, what the document contains (e.g., a philosophy book or the Chinese language), and the "payload" part contains the actual data (such as book details or language attributes). The solution validates each section of the document independently against its respective schema, and then uses Schematron to check that the taxonomy value corresponds to the payload content.
Constraint Between Components
Consider an XML instance document that has two parts:
1. The "what" part describes, using a standard taxonomy, what's in the other part.
2. The "payload" part contains a data component.

EXAMPLE #1
The "what" part uses the DMOZ taxonomy to provide the hierarchy for a philosophy book. The "payload" part contains data on a philosophy book.

    <Document xmlns="http://www.example.org">
        <What taxonomy="http://www.dmoz.org/Society/Book/Philosophy/" />
        <Payload>
            <Book xmlns="http://www.book.org">
                <Title>The First and Last Freedom</Title>
                <Author>J. Krishnamurti</Author>
                <Date>1954</Date>
                <ISBN>0-06-064831-7</ISBN>
                <Publisher>Harper &amp; Row</Publisher>
            </Book>
        </Payload>
    </Document>
Constraint Between Components
EXAMPLE #2
The "what" part uses the DMOZ taxonomy to provide the hierarchy for the Chinese language. The "payload" part contains data on the Chinese language.

    <Document xmlns="http://www.example.org">
        <What taxonomy="http://www.dmoz.org/World/Language/Chinese/" />
        <Payload>
            <Chinese xmlns="http://www.chinese.org">
                <Pronouns>
                    <Spoken>Ta</Spoken>
                </Pronouns>
            </Chinese>
        </Payload>
    </Document>
Constraint Between Components
PROBLEM
How do you create an NVDL script which will validate that:
• If the value of the taxonomy attribute is http://www.dmoz.org/Society/Book/Philosophy/ then the data component in <Payload> is validated against the Book schema
• If the value of the taxonomy attribute is http://www.dmoz.org/World/Language/Chinese/ then the data component in <Payload> is validated against the Chinese schema
Further, the data components in <Payload> may be expressed in any schema language (XML Schema, Relax NG, DTD, Schematron). How would you create an NVDL script to do this?
An NVDL Processor Divides the Instance into Sections
The instance:

    <Document xmlns="http://www.example.org">
        <What taxonomy="http://www.dmoz.org/World/Language/Chinese/" />
        <Payload>
            <Chinese xmlns="http://www.chinese.org">
                <Pronouns>
                    <Spoken>Ta</Spoken>
                </Pronouns>
            </Chinese>
        </Payload>
    </Document>

The NVDL processor divides it into two sections, one per namespace:

    <Document xmlns="http://www.example.org">
        <What taxonomy="http://www.dmoz.org/World/Language/Chinese/" />
        <Payload>
        </Payload>
    </Document>

    <Chinese xmlns="http://www.chinese.org">
        <Pronouns>
            <Spoken>Ta</Spoken>
        </Pronouns>
    </Chinese>
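The sectioning step can be illustrated with a small Python sketch. This is not an NVDL processor, only a simplified illustration under stated assumptions: the helper names `namespace_of` and `split_sections` are hypothetical, a section boundary is taken to be any element whose namespace differs from the root's, and sections nested inside a detached child are not split further.

```python
import copy
import xml.etree.ElementTree as ET

INSTANCE = """
<Document xmlns="http://www.example.org">
    <What taxonomy="http://www.dmoz.org/World/Language/Chinese/" />
    <Payload>
        <Chinese xmlns="http://www.chinese.org">
            <Pronouns>
                <Spoken>Ta</Spoken>
            </Pronouns>
        </Chinese>
    </Payload>
</Document>
"""


def namespace_of(elem):
    """Return the namespace URI of an ElementTree element ('' if none)."""
    tag = elem.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def split_sections(root):
    """Split a tree into a parent section and its detached child sections.

    Hypothetical helper: works on a copy of the tree and detaches every
    element whose namespace differs from the root's namespace.
    """
    parent = copy.deepcopy(root)
    parent_ns = namespace_of(parent)
    detached = []

    def walk(elem):
        for child in list(elem):  # copy the child list so removal is safe
            if namespace_of(child) != parent_ns:
                elem.remove(child)
                detached.append(child)
            else:
                walk(child)

    walk(parent)
    return parent, detached


root = ET.fromstring(INSTANCE)
parent_section, child_sections = split_sections(root)
for section in [parent_section] + child_sections:
    print(namespace_of(section))
```

The parent section keeps an empty Payload element, matching the slide, and each detached child can then be handed to the schema for its own namespace.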
… and then Validates each Section
The NVDL processor validates the Document section against Document.xsd:

    <Document xmlns="http://www.example.org">
        <What taxonomy="http://www.dmoz.org/World/Language/Chinese/" />
        <Payload>
        </Payload>
    </Document>

and the Chinese section against Chinese.xsd:

    <Chinese xmlns="http://www.chinese.org">
        <Pronouns>
            <Spoken>Ta</Spoken>
        </Pronouns>
    </Chinese>
The NVDL processor has divided the instance into two sections:

    <Document xmlns="http://www.example.org">
        <What taxonomy="http://www.dmoz.org/World/Language/Chinese/" />
        <Payload>
        </Payload>
    </Document>

    <Chinese xmlns="http://www.chinese.org">
        <Pronouns>
            <Spoken>Ta</Spoken>
        </Pronouns>
    </Chinese>

How do we express the constraint between the taxonomy value and the payload value?
Acknowledgement • Thanks to George Cristian Bina for explaining to me how to solve this problem (see the following slides for the solution). Thanks George!
Solution
• Validate each section independently
    • Validate the Document section against Document.xsd
    • Validate the payload section against its schema
        • Validate a Book section against Book.xsd
        • Validate a Chinese section against Chinese.xsd
• Attach the payload section to its parent (Document) section and then use Schematron to validate the relationship between the taxonomy and the payload
NVDL Script

    <?xml version="1.0" encoding="UTF-8"?>
    <rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0" startMode="example">
        <mode name="example">
            <namespace ns="http://www.example.org">
                <validate schema="Document.xsd" useMode="content"/>
                <validate schema="rules.sch">
                    <mode>
                        <anyNamespace>
                            <attach/>
                        </anyNamespace>
                    </mode>
                </validate>
            </namespace>
        </mode>
        <mode name="content">
            <namespace ns="http://www.book.org">
                <validate schema="book.xsd"/>
            </namespace>
            <namespace ns="http://www.chinese.org">
                <validate schema="chinese.xsd"/>
            </namespace>
        </mode>
    </rules>

See following slides for an explanation
From the NVDL script:

    <validate schema="Document.xsd" useMode="content"/>

Validate the Document section against Document.xsd and then for the child (payload) section switch to the content mode.
From the NVDL script:

    <mode name="content">
        <namespace ns="http://www.book.org">
            <validate schema="book.xsd"/>
        </namespace>
        <namespace ns="http://www.chinese.org">
            <validate schema="chinese.xsd"/>
        </namespace>
    </mode>

Validate a www.book.org payload against book.xsd and a www.chinese.org payload against chinese.xsd
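The content mode behaves much like a lookup from namespace to schema. A minimal Python sketch of that dispatch follows. The names `CONTENT_MODE` and `schema_for` are hypothetical, the sketch only selects a schema file and does not run any validation, and returning None for an unknown namespace is a choice of this sketch rather than a statement about NVDL behaviour.

```python
import xml.etree.ElementTree as ET

# Mirrors the rules in <mode name="content">: namespace URI -> schema file.
CONTENT_MODE = {
    "http://www.book.org": "book.xsd",
    "http://www.chinese.org": "chinese.xsd",
}


def namespace_of(elem):
    """Return the namespace URI of an ElementTree element ('' if none)."""
    tag = elem.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def schema_for(section_xml):
    """Pick the schema file for a payload section by its root namespace.

    Hypothetical helper: returns None when no rule in CONTENT_MODE matches.
    """
    root = ET.fromstring(section_xml)
    return CONTENT_MODE.get(namespace_of(root))


print(schema_for('<Chinese xmlns="http://www.chinese.org"/>'))
print(schema_for('<Book xmlns="http://www.book.org"/>'))
```

Keeping the mapping in one table is what makes it easy to add another payload type: one more namespace rule, one more schema.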
From the NVDL script:

    <validate schema="rules.sch">
        <mode>
            <anyNamespace>
                <attach/>
            </anyNamespace>
        </mode>
    </validate>

Attach any child sections of a www.example.org section to their parent, and then use a Schematron schema to validate constraints between the sections.
Schematron expresses the constraint between the taxonomy value and the payload

    <?xml version="1.0" encoding="UTF-8"?>
    <schema xmlns="http://www.ascc.net/xml/schematron">
        <ns uri="http://www.example.org" prefix="ex"/>
        <ns uri="http://www.chinese.org" prefix="ch"/>
        <ns uri="http://www.book.org" prefix="bk"/>
        <pattern name="checkTaxonomyPayloadMatch">
            <rule context="ex:Document[ex:What/@taxonomy='http://www.dmoz.org/World/Language/Chinese/']">
                <assert test="ex:Payload/*[namespace-uri()='http://www.chinese.org']">
                    When taxonomy is http://www.dmoz.org/World/Language/Chinese/
                    the Payload should contain content from the
                    http://www.chinese.org namespace
                </assert>
            </rule>
            <rule context="ex:Document[ex:What/@taxonomy='http://www.dmoz.org/Society/Book/Philosophy/']">
                <assert test="ex:Payload/*[namespace-uri()='http://www.book.org']">
                    When taxonomy is http://www.dmoz.org/Society/Book/Philosophy/
                    the Payload should contain content from the
                    http://www.book.org namespace
                </assert>
            </rule>
        </pattern>
    </schema>
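To make the logic of the two Schematron rules concrete, here is a rough Python equivalent. It is a sketch only, not a Schematron engine: the names `EXPECTED_PAYLOAD_NS` and `check_taxonomy_payload` are hypothetical, and it assumes that Document is the root element and carries a single What child.

```python
import xml.etree.ElementTree as ET

EX = "http://www.example.org"

# Taxonomy value -> namespace the payload content is expected to come from.
EXPECTED_PAYLOAD_NS = {
    "http://www.dmoz.org/World/Language/Chinese/": "http://www.chinese.org",
    "http://www.dmoz.org/Society/Book/Philosophy/": "http://www.book.org",
}


def check_taxonomy_payload(xml_text):
    """Return a list of error messages (empty when the constraint holds)."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{EX}}}Document":
        return []  # neither rule context matches
    what = root.find(f"{{{EX}}}What")
    taxonomy = what.get("taxonomy") if what is not None else None
    expected_ns = EXPECTED_PAYLOAD_NS.get(taxonomy)
    if expected_ns is None:
        return []  # unknown taxonomy: no rule fires
    # Gather the element children of every Payload child of Document.
    children = [c for p in root.findall(f"{{{EX}}}Payload") for c in p]
    if any(c.tag.startswith("{" + expected_ns + "}") for c in children):
        return []
    return [
        f"When taxonomy is {taxonomy} the Payload should contain "
        f"content from the {expected_ns} namespace"
    ]


MISMATCH = """
<Document xmlns="http://www.example.org">
    <What taxonomy="http://www.dmoz.org/World/Language/Chinese/" />
    <Payload>
        <Book xmlns="http://www.book.org">
            <Title>The First and Last Freedom</Title>
        </Book>
    </Payload>
</Document>
"""

for message in check_taxonomy_payload(MISMATCH):
    print(message)
```

The check only works because the payload has been attached back to its Document parent: the taxonomy attribute and the payload element must be visible in the same tree.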
Implementation • See the folder example23 for the NVDL script, the schemas, and actual XML instances.
Validating Constraints Across Components
• This example has shown how to express constraints across components
• Furthermore, the components can be expressed in different schema languages: XML Schema, Relax NG, DTD, Schematron. Wow!