RelaxNGCC is a parser generator and data binding tool designed to simplify XML parsing and development. It bridges the gap between existing code and schemas, reducing development time and improving code quality. With RelaxNGCC, anchoring data to variables, executing code at the right moment, and utilizing pattern blocks become easier. The code generation process assigns each pattern block its own class and fields, creating a more efficient system. The runtime environment provided by RelaxNGCC helps manage generated code seamlessly, allowing the addition of user-defined functions and fields. Put RelaxNGCC into practice to read XML configuration files and extend the runtime for customized functionality effortlessly.
Flexible Data-binding With RelaxNGCC Kohsuke Kawaguchi kk@kohsuke.org
What’s RelaxNGCC? • RelaxNGCC is: • Parser generator (e.g., YACC, JavaCC, or ANTLR) • Data-binding tool (e.g., JAXB, Castor, or Relaxer) • Purpose • To simplify XML parsing development
Before RelaxNGCC • XML document -> XML parser -> hand-written SAX handler • But writing a SAX handler ... • Is hard and tiring • Takes time • Is routine and not fun • So people turn their eyes to data-binding
Problems With Data-binding Tools • Impedance mismatch between XML and the ideal object model (OM) • What does (A|B|C)* mean? • Customization is limited • Generated code is low in quality • Exposes a lot of unnecessary methods
Problems With Data-binding Tools • Unable to bridge existing code and an existing schema • Takes time to get used to the generated code • Need to know how schemas are mapped
After RelaxNGCC • Reduces development time • Annotated RELAX NG schema -> generated SAX handler • XML document -> XML parser -> generated SAX handler
How Does RelaxNGCC Work? • By associating code with the schema • Plain schema: <element name="team"> <oneOrMore> <element name="player"> <attribute name="number"> number=<data type="int" /> </attribute> <element name="name"> name=<text /> </element> </element> </oneOrMore> </element> • Annotated schema: <element name="team">System.out.println("start"); <oneOrMore> <element name="player"> <attribute name="number"> number=<data type="int" /> System.out.print(number+":"); </attribute> <element name="name"> name=<text /> System.out.println(name); </element> </element> </oneOrMore> System.out.println("end"); </element>
Key Concepts • Anchoring data to variables • Values are copied to the specified variables as the document is parsed • Schema: <attribute name="number"> number=<data type="int" /> </attribute> • Input: <player number="1" />
Key Concepts • Code is also executed at the "right" moment • Input: <?xml version="1.0"?> <team> <player number="1"> <name>me</name> </player> <player number="2"> <name>you</name> </player> </team> • Output: start 1:me 2:you end
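For comparison, here is a minimal sketch of the hand-written SAX handler that the annotated schema replaces. It is not RelaxNGCC output; the class names `TeamSaxDemo` and `TeamHandler` are made up for this illustration, and the handler collects its output in a buffer rather than printing directly. It uses only the JDK's built-in SAX parser.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hand-written equivalent of the annotated "team" schema (illustrative only).
class TeamSaxDemo {

    // Collects the output in a buffer so it can be inspected after parsing.
    static class TeamHandler extends DefaultHandler {
        final StringBuilder out = new StringBuilder();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // start collecting character data afresh
            if (qName.equals("team")) {
                out.append("start\n");
            } else if (qName.equals("player")) {
                // corresponds to number=<data type="int"/> plus the print statement
                int number = Integer.parseInt(atts.getValue("number").trim());
                out.append(number).append(':');
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("name")) {
                // corresponds to name=<text/> plus the println statement
                out.append(text.toString().trim()).append('\n');
            } else if (qName.equals("team")) {
                out.append("end\n");
            }
        }
    }

    // Parses the document and returns everything the handler "printed".
    static String parse(String xml) {
        try {
            TeamHandler handler = new TeamHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<team>"
                + "<player number=\"1\"><name>me</name></player>"
                + "<player number=\"2\"><name>you</name></player>"
                + "</team>";
        System.out.print(parse(xml));
    }
}
```

Even for this tiny document the handler has to track element names and buffer text by hand, and it does no validation at all; that bookkeeping is what the annotated schema is meant to remove.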
Key Concepts • Pattern blocks work like function calls, passing data down and up across boundaries • Schema: <start> <element name="foo">result=<ref name="body"/> </element> </start> <define name="body" c:params="int i" c:return-type="int" c:return-value="i"> <element name="bar">j=<text/> </element> </define> • With code added: <start> <element name="foo">result=<ref name="body"/>(3); </element> </start> <define name="body" c:params="int i" c:return-type="int" c:return-value="i"> <element name="bar">j=<text/> i+=Integer.parseInt(j); </element> </define>
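The function-call analogy can be spelled out in plain Java. This is only an analogy, not what RelaxNGCC generates: the class `PatternBlockAnalogy` and the `barText` parameter (standing in for the text content of `<bar>`) are assumptions made for this sketch.

```java
// Plain-Java analogy for the "body" pattern block; not generated code.
class PatternBlockAnalogy {

    // Mirrors <define name="body" c:params="int i" c:return-type="int" c:return-value="i">.
    static int body(int i, String barText) {
        String j = barText;        // j=<text/>
        i += Integer.parseInt(j);  // i+=Integer.parseInt(j);
        return i;                  // c:return-value="i"
    }

    // Mirrors result=<ref name="body"/>(3); inside <element name="foo">.
    static int foo(String barText) {
        int result = body(3, barText);
        return result;
    }

    public static void main(String[] args) {
        // Corresponds to the input <foo><bar>4</bar></foo>
        System.out.println(foo("4")); // prints 7
    }
}
```

The parameter `3` flows down into the block, the parsed text is added to it, and the sum flows back up into `result`, just as an argument and a return value would.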
Code Generation • Each pattern block gets its own class • At runtime, a new object is allocated to process each new block • Schema: <grammar> <define name="Foo"> ... </define> <define name="Bar"> ... </define> </grammar> • Generated: class Foo, class Bar
Code Generation • Aliases become fields • Additional methods can be defined • Schema: <define name="Foo"> <cc:java-import> *** 1 *** </cc:java-import> <cc:java-body> *** 2 *** </cc:java-body> abc = <text/> </define> • Generated code: import x.y.z; *** 1 *** class Foo { *** 2 *** String abc; ... }
Runtime • Code that supports the generated code • Just 3 classes • No runtime version dependency • The runtime receives SAX events and coordinates the handlers • SAX events -> runtime -> generated SAX handlers
Runtime • Provides services to user-specified code • Retrieve Locator object • Resolve namespace prefix • Redirect sub-tree to another SAX ContentHandler
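To make the first two services concrete, here is a sketch of the plain SAX facilities they rest on: the `Locator` that the parser hands to a handler, and the prefix-mapping callbacks. It deliberately does not use the RelaxNGCC runtime API; `RuntimeServicesSketch`, `ServiceHandler`, and the `ref` attribute are hypothetical names chosen for this example, and the flat prefix map ignores nested re-declarations of the same prefix. Sub-tree redirection is not shown.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Plain-SAX sketch of location lookup and prefix resolution (not the RelaxNGCC runtime).
class RuntimeServicesSketch {

    static class ServiceHandler extends DefaultHandler {
        private Locator locator;
        // Simplification: one flat map, so shadowed prefixes are not restored.
        private final Map<String, String> prefixes = new HashMap<>();
        String resolvedUri; // namespace URI of the prefix used in the "ref" attribute
        int line = -1;      // line reported by the parser, or -1 if unknown

        @Override
        public void setDocumentLocator(Locator locator) {
            this.locator = locator;
        }

        @Override
        public void startPrefixMapping(String prefix, String uri) {
            prefixes.put(prefix, uri);
        }

        @Override
        public void endPrefixMapping(String prefix) {
            prefixes.remove(prefix);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            String ref = atts.getValue("", "ref"); // QName-valued attribute such as "a:item"
            if (ref == null) {
                return;
            }
            int colon = ref.indexOf(':');
            String prefix = colon < 0 ? "" : ref.substring(0, colon);
            resolvedUri = prefixes.get(prefix);
            if (locator != null) {
                line = locator.getLineNumber();
            }
        }
    }

    static ServiceHandler parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed to receive prefix-mapping events
            ServiceHandler handler = new ServiceHandler();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root xmlns:a=\"urn:example:a\">\n<use ref=\"a:item\"/>\n</root>";
        ServiceHandler h = parse(xml);
        System.out.println("prefix a resolves to " + h.resolvedUri);
        System.out.println("reported line: " + h.line);
    }
}
```

A runtime that offers these as services saves each generated handler from repeating this bookkeeping.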
Runtime • User-defined code can be added • Added methods/fields are available to handlers • Useful for keeping global info • SAX events -> default runtime -> generated SAX handlers • The customized runtime extends the default runtime; the generated handlers access it
Runtime (example) <grammar cc:runtime-type="org.acme.foo.MyRuntime"> <define name="Foo"> runtime.myFunction(); ... </define> </grammar> class MyRuntime extends NGCCRuntime { public void myFunction() { ... } }
Put in Practice • Reading an XML configuration file • Extend the runtime to hold an Options class • Fill in the structure as you go through the document • Schema: <element name="config"> <oneOrMore> <element name="param"> name = <attribute name="name"/> value= <text/> </element> runtime.opt.properties.put( name,value); </oneOrMore> <attribute name="paramX"> runtime.opt.paramX = <text/> </attribute> </element> • Code: class Options { Properties properties; String paramX; } class MyRuntime extends NGCCRuntime { public Options opt; }
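The effect of those annotations can be sketched with a small stand-alone program. This is a hand-written approximation using plain SAX, not RelaxNGCC-generated code: `ConfigReaderSketch` and `ConfigHandler` are hypothetical names, and the handler's `opt` field plays the role that `runtime.opt` plays in the slide.

```java
import java.io.StringReader;
import java.util.Properties;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hand-written approximation of the annotated "config" schema (illustrative only).
class ConfigReaderSketch {

    static class Options {
        final Properties properties = new Properties();
        String paramX;
    }

    static class ConfigHandler extends DefaultHandler {
        final Options opt = new Options(); // stands in for runtime.opt
        private String name;
        private final StringBuilder value = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (qName.equals("config")) {
                opt.paramX = atts.getValue("paramX"); // runtime.opt.paramX = <text/>
            } else if (qName.equals("param")) {
                name = atts.getValue("name");         // name = <attribute name="name"/>
                value.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            value.append(ch, start, length);          // value = <text/>
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("param")) {
                // runtime.opt.properties.put(name, value);
                opt.properties.put(name, value.toString());
            }
        }
    }

    static Options read(String xml) {
        try {
            ConfigHandler handler = new ConfigHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.opt;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<config paramX=\"x\">"
                + "<param name=\"a\">1</param>"
                + "<param name=\"b\">2</param>"
                + "</config>";
        Options opt = read(xml);
        System.out.println(opt.paramX + " " + opt.properties.getProperty("a")
                + " " + opt.properties.getProperty("b"));
    }
}
```

The sketch assumes every `param` carries a `name` attribute, as the schema requires; unlike a schema-driven handler, it would not reject a document that violates this.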
Put in Practice • Quickly build Abstract Syntax Tree • Just use generated class hierarchy and their fields • Use cc:class to throw in extra classes <element name="config"> <cc:java-body> public Set params; </cc:java-body> <oneOrMore> p= <group cc:class="Param"> <element name="param"> name = <attribute name="name"/> value= <text/> </element> </group> params.add(p); </oneOrMore> <attribute name="paramX"> paramX = <text/> </attribute> </element>
Put in Practice • Build a full-blown object model • RelaxNGCC uses itself to parse RELAX NG • Design the OM without worrying about syntax • Then use RelaxNGCC to build a parser for it • Result: a good, efficient parser in a short time
Why RELAX NG? • XML Schema cannot hold an annotation like this: <xs:sequence> <xs:element ref="foo"/> <cc:java>...</cc:java> <xs:element ref="bar"/> </xs:sequence> • How can I anchor 10 values to 10 different variables? <xs:element ref="foo" maxOccurs="10"/> • No way!
Why RELAX NG? • Formal model makes RelaxNGCC simple • Simpler state management • Simpler schema parsing • Uniform treatment of attributes/elements
Why RELAX NG? • Some XML Schema features don't work well • Nillable • Type substitution • Substitution group • ... ironic, because all those features are supposed to be for data-oriented XML
Loosely-coupled Systems • Type sharing makes systems tightly coupled • Some say that's what XML is trying to avoid • Better to share the syntax without sharing the data model • RelaxNGCC allows you to do this!
License • Compiler • GPL • Generated code, including runtime • All yours!
To Get More Information • Project web-site • http://relaxngcc.sourceforge.net/ • Contact developers • http://groups.yahoo.com/group/reldeve/ • RELAX NG Info • http://relaxng.org/ • This presentation • http://www.kohsuke.org/
End • Acknowledgements • Daisuke Okajima, the inventor of RelaxNGCC • Sun Microsystems, for allowing me to work on RELAX NG • Any questions?