1 / 34

DB2 for Linux, Unix, and Windows

DB2 for Linux, Unix, and Windows. pureXML Indexing Overview DB2 9, 9.5, and 9.7 for Linux, Unix, and Windows Christina (Tina) Lee IBM Silicon Valley Laboratory. September 2009. Agenda. pureXML Basics Regions and Paths Indexes Index on XML column DB2 9.5 Reject Invalid Values option

york
Télécharger la présentation

DB2 for Linux, Unix, and Windows

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. DB2 for Linux, Unix, and Windows pureXML Indexing Overview DB2 9, 9.5, and 9.7 for Linux, Unix, and WindowsChristina (Tina) LeeIBM Silicon Valley Laboratory September 2009

  2. Agenda • pureXML Basics • Regions and Paths Indexes • Index on XML column • DB2 9.5 Reject Invalid Values option • Common User Errors • Queries and XML Indexes • Catalog Changes • DB2 9.7 • XML indexes on Range Partitioned Tables • Online Index Create and Online Index Reorg • Index Compression

  3. pureXML Basics • XML stored in a parsed hierarchical format • No fixed XML schema per XML column required • XML Schema validation is optional, per document • XML indexes for specific elements/attributes • XQuery and SQL/XML Integration create table dept (deptID char(8),…, deptdoc xml); DB2 Storage Relational Storage page page page pureXML Storage

  4. Storing the XML Document

  5. XML Regions and Column Path Indexes • XML Regions Index • System generated when first XML column created or added to table • Nodes and subtrees in a data page form regions in a document • Provides logical mapping of regions to retrieve document data • XML Column Path Index • System generated for each XML column created or added to table • Maps unique paths to path ids for each XML column • Used to improve performance during queries

  6. Index on an XML Column

  7. The Big XML Indexing Picture

  8. CREATE INDEX for Index on XML Column • Index created on single XML column • Composite keys not supported • Only indexes document nodes that satisfy XML pattern • XML index specification • GENERATE KEY USING XMLPATTERN • XML pattern expression • Data type

  9. Key Generation • Relational index inserts one key per table row • Index on XML Column may insert multiple keys per table row • Multiple parts of document may satisfy XML pattern

  10. XML Documents for Examples

  11. XML Pattern: Path Expression Steps • Supports subset of XQuery path expressions • Path expression steps separated by forward slash (/) • Double forward slash (//) is abbreviated syntax for /descendant-or-self::node()/

  12. Qualifying Paths and Nodes • Set of nodes may qualify if single path specified • Set of paths and nodes may qualify if wildcard, descendant axis, or descendant-or-self axis specified

  13. Specifying text()

  14. If text() not specified

  15. Data Types • Four SQL data types are supported • VARCHAR • DOUBLE • DATE • TIMESTAMP

  16. VARCHAR(n) • Values longer than specified length(n) are not indexed • Document insertion or index creation will fail • Index can support both range scans and equality look-ups • Trailing blanks are significant during string comparisons

  17. VARCHAR HASHED • Has no length limit and can index arbitrary length character strings • System generates an 8-byte hash code over entire string • Only used for equality look-ups and not range scans

  18. DOUBLE • All numeric values will be converted and stored in the index as the DOUBLE data type • Special numeric values (NaN, INF, -INF, +0, -0) indexed even though not supported by SQL DOUBLE data type

  19. DATE and TIMESTAMP • If timezone not specified, original value stored in index • If timezone specified, DATE and TIMESTAMP data type values are normalized to UTC (Coordinated Universal Time) before storing in index

  20. Document Rejection or CREATE INDEX Failures • Errors causing document rejection for INSERT or UPDATE statements and CREATE INDEX failure if table already populated • VARCHAR(n): Value length exceeds length constraint • Conversion Errors: Valid XML value but can't convert to DB2's representation for the data type because of DB2 limitations

  21. Invalid XML Values • For DOUBLE, DATE, and TIMESTAMP indexes • XML values without a valid lexical form for the target index XML data type are invalid • DB2 9 XML indexes always ignore invalid XML values • Invalid XML values can be rejected or ignored on new CREATE INDEX option for DB2 9.5

  22. DB2 9.5 Reject Invalid Values • New REJECT INVALID VALUES option for DB2 9.5 • If XML value can’t be cast to index XML data type, error returned • If index does not exist, index is not created • XML data not inserted or updated in the table if index exists

  23. DB2 9.5 Ignore Invalid Values • Invalid values for index XML data type ignored and not indexed • No error or warning is issued • Default option

  24. Unique Keyword • Uniqueness enforced across all documents within a single XML column • Enforced within index data type, XML path to node, and value of node after value cast to index data type

  25. Query Operators for XML • XSCAN ( XML Document Scan) • Traverses XML document trees and may evaluate predicates and extract document values • XISCAN (XML Index Scan) • Performs probes and scans on XML indexes and can evaluate predicates. • XANDOR (XML Index ANDing and ORing) • Evaluates two or more equality predicates by driving multiple XISCANs. CREATE INDEX AgeIndex on t1(XMLDoc) GENERATE KEY USING XMLPATTERN '/Person/Age' AS SQL DOUBLE; XQUERY for $i in db2-fn:xmlcolumn(‘T1.XMLDOC’)/Person where $i/Age = 17 return $i;

  26. Even if these requirements are satisfied, the optimizer can still decide NOT to use an eligible index! Index Eligibility Requirements for an XML index to be used for a query: • Index “contains” the query predicate, i.e. isequally or less restrictive than the predicate • Query predicate matches the index data type • /text() is used consistently in query predicate and index definition: both specify /text() or not specify /text()

  27. Queries using an Index on an XML Column • Some sample queries using equality and range predicates

  28. SYSCAT.INDEXXMLPATTERNS • Catalog view externalizes information on the XML pattern specified for an Index on an XML Column

  29. SYSCAT.INDEXES • Index on an XML Column has a logical and physical index • Logical index contains XML pattern created by user • Physical index contains index values • DB2 system generated key columns

  30. DB2 9.7 XML Indexes on Range Partitioned Tables CREATE INDEX zipcode ON sales(customer_info) GENERATE KEY USING XMLPATTERN ’/Customer/Address/Zipcode’ AS SQL varchar(10) NOT PARTITIONED; • Relational Indexes may be not partitioned or partitioned in DB2 9.7 • User-defined XML Indexes may be not partitioned (DB2 9.7 GA) or partitioned (DB2 9.7 FP1) • System generated XML Paths Indexes are always not partitioned • System generated XML Regions Indexes are always partitioned CREATE INDEX zipcode ON sales(customer_info) GENERATE KEY USING XMLPATTERN ’/Customer/Address/Zipcode’ AS SQL varchar(10) PARTITIONED;

  31. DB2 9.7 XML Indexes on Range Partitioned Tables Non-Partitioned Relational Index or Index on XML Column Partitioned Relational Index or Index on XML Column Partitioned Relational Index or Index on XML Column Partitioned Relational Index or Index on XML Column Base Table Partition 1 Base Table Partition 3 Base Table Partition 2 Partitioned Regions Index Partitioned Regions Index Partitioned Regions Index XDA XDA XDA Non-Partitioned XML Path Index

  32. DB2 9.7 Online XML Index Create and Reorg • Insert/Update/Delete transactions no longer need to wait until the CREATE INDEX/REORG INDEXES/REORG INDEX statement completes • Results in increased throughput and faster response time for concurrent transactions.

  33. DB2 9.7 Index Compression • Default for relational and XML indexes enables compression if data row compression enabled • New COMPRESS keyword on CREATE/ALTER INDEX can override default behavior • Index can be compressed even if data rows not compressed • MDC Block Indexes and XML Paths Indexes not compressed

  34. What Did You Learn Today? • What the difference is between XML and relational indexes • How to create an index on an XML column • How to avoid common user errors • What the requirements are for queries to use XML indexes • How the XML indexes are defined in the catalog • DB2 9.5 and DB2 9.7 XML index enhancements

More Related