
Introduction to Bitmap Indices in Scientific Data Management

This presentation gives an overview of bitmap indices, a multi-dimensional index data structure optimized for read-only data and commonly used in data warehouses and decision support systems. It notes their good performance for multi-dimensional queries with low selectivity (queries that return few records) and introduces equality and range encoding for discrete attribute values. The pros and cons are discussed: bitmap indices are easy to build and maintain and make it easy to identify records that satisfy multi-attribute predicates, but they are space inefficient for high-cardinality attributes. An application to High-Energy Physics (HEP) data is also shown.


Presentation Transcript


  1. Brief Introduction to Bitmap Indices for Scientific Data. Kurt Stockinger, CERN, IT-Division, Database Group, Geneva, Switzerland. Database Workshop, July 11-13, Geneva, Switzerland.

  2. Features of Bitmap Indices
     • Multi-dimensional index data structure optimised for read-only data
     • “Good” performance for multi-dimensional queries with low selectivity (few records result from the query)
     • Applied in data warehouses and decision support systems (e.g. Oracle, Informix, Sybase)

  3. Encoding Techniques for Discrete Attribute Values
     [Figure panels: a) list of attributes, b) equality encoding, c) range encoding; attribute cardinality = 10]
     • Range encoding is optimised for one-sided range queries, e.g. a0 <= 3
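The difference between the two encodings on slide 3 can be sketched in a few lines of Python. This is a minimal illustration, not the implementation behind the slides: the function names, the sample column `values`, and the use of Python integers as bit vectors are all assumptions made for the example. For simplicity it keeps one range-encoded bitmap per value, including the last one, which is always all ones.

```python
from typing import List


def equality_encode(values: List[int], cardinality: int) -> List[int]:
    """One bitmap per distinct value: bit r of bitmap v is set when values[r] == v."""
    bitmaps = [0] * cardinality
    for row, v in enumerate(values):
        bitmaps[v] |= 1 << row
    return bitmaps


def range_encode(values: List[int], cardinality: int) -> List[int]:
    """One bitmap per value: bit r of bitmap v is set when values[r] <= v."""
    bitmaps = [0] * cardinality
    for row, v in enumerate(values):
        for threshold in range(v, cardinality):
            bitmaps[threshold] |= 1 << row
    return bitmaps


def rows(bitmap: int) -> List[int]:
    """Return the row numbers whose bits are set in the bitmap."""
    out = []
    row = 0
    while bitmap:
        if bitmap & 1:
            out.append(row)
        bitmap >>= 1
        row += 1
    return out


# Hypothetical column a0 with attribute cardinality 10 (values 0..9).
values = [3, 7, 0, 9, 3, 2]
eq = equality_encode(values, 10)
rg = range_encode(values, 10)

# One-sided range query a0 <= 3.
le3_eq = eq[0] | eq[1] | eq[2] | eq[3]  # equality encoding: OR four bitmaps
le3_rg = rg[3]                          # range encoding: read a single bitmap

print(rows(le3_eq), rows(le3_rg))
```

Both encodings return the same rows; the range-encoded index answers the one-sided query from a single bitmap, whereas the equality-encoded index has to combine one bitmap per qualifying value.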

  4. Pros and Cons of Bitmap Indices (BMI)
     Pros:
     • Easy to build and to maintain
     • Easy to identify records that satisfy a complex multi-attribute predicate (multi-dimensional ad-hoc queries)
     • Bit-wise operators (AND, OR, XOR, NOT) are very efficiently supported by hardware
     • Very space efficient for attributes with low cardinality (number of distinct attribute values, e.g. “Yes”, “No”)
     Cons:
     • Space inefficient for attributes with high cardinality
     • Commercial database systems only “efficiently” support bitmap indices for discrete attribute values
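The point about multi-attribute predicates and bit-wise operators can be illustrated with a small Python sketch. The attribute names (`triggered`, `region`), their values, and the helper functions are hypothetical and chosen only for this example; Python integers stand in for bit vectors.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def build_index(column: Sequence[Hashable]) -> Dict[Hashable, int]:
    """Equality-encoded bitmap index: one integer bitmap per distinct value."""
    index: Dict[Hashable, int] = defaultdict(int)
    for row, value in enumerate(column):
        index[value] |= 1 << row
    return dict(index)


def matching_rows(bitmap: int, n_rows: int) -> List[int]:
    """Return the row numbers whose bits are set in the bitmap."""
    return [r for r in range(n_rows) if (bitmap >> r) & 1]


# Two hypothetical low-cardinality attributes over six records.
triggered = ["Yes", "No", "Yes", "Yes", "No", "Yes"]
region = ["barrel", "endcap", "barrel", "forward", "barrel", "endcap"]
n_rows = len(triggered)

t = build_index(triggered)
g = build_index(region)
all_rows = (1 << n_rows) - 1  # mask so that NOT stays within n_rows bits

# Predicate: triggered == "Yes" AND NOT (region == "barrel")
not_barrel = all_rows & ~g["barrel"]
result = t["Yes"] & not_barrel

# The same set expressed with OR instead of NOT.
result_or = t["Yes"] & (g["endcap"] | g["forward"])

print(matching_rows(result, n_rows))
```

Each attribute here needs only two or three bitmaps, which is the low-cardinality case the slide calls space efficient. An attribute with thousands of distinct values would need one bitmap per value under this encoding.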

  5. Example: Bitmap Indices for HEP Data
     [Figure labels: attribute indices (bit matrices), events (bit vectors), bins (bit slices)]

  6. 2-Sided Range Query
     • E.g.: (pT > 25.7) && (pT < 91.8)
     • Bin ranges: [0;20) [20;40) [40;60) [60;80) [80;100) ...
     [Figure steps: 1) Candidate slices, 2) Hit slices, 3) OR, 4) OR, 5) “Check”]
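A possible reading of the steps on slide 6 is sketched below in Python. The interpretation is an assumption: bins lying entirely inside the query range are treated as hit slices, bins that only partly overlap it as candidate slices, each group is OR-ed together, and only candidate rows are checked against the raw values. The sample `pt_values`, the function names, and the truncation of the bin list at 100 are likewise hypothetical.

```python
from typing import List, Sequence

# Bin boundaries from the slide; the slide's "..." is cut off at 100 here.
BIN_EDGES = [0, 20, 40, 60, 80, 100]


def build_bin_slices(values: Sequence[float], edges: Sequence[float]) -> List[int]:
    """One bit slice per bin [edges[b], edges[b+1]); bit r is set if values[r] falls in it."""
    slices = [0] * (len(edges) - 1)
    for row, v in enumerate(values):
        for b in range(len(slices)):
            if edges[b] <= v < edges[b + 1]:
                slices[b] |= 1 << row
                break
    return slices


def range_query(values: Sequence[float], slices: Sequence[int],
                edges: Sequence[float], lo: float, hi: float) -> List[int]:
    """Return the rows with lo < value < hi."""
    hits = 0        # OR of slices whose bins lie entirely inside (lo, hi)
    candidates = 0  # OR of slices whose bins only partly overlap (lo, hi)
    for b, bit_slice in enumerate(slices):
        left, right = edges[b], edges[b + 1]
        if right <= lo or left >= hi:
            continue  # bin lies completely outside the query range
        if left > lo and right <= hi:
            hits |= bit_slice
        else:
            candidates |= bit_slice

    # "Check": only candidate rows are compared against the raw values.
    result = hits
    row = 0
    pending = candidates
    while pending:
        if (pending & 1) and lo < values[row] < hi:
            result |= 1 << row
        pending >>= 1
        row += 1
    return [r for r in range(len(values)) if (result >> r) & 1]


# Hypothetical pT values for eight events.
pt_values = [12.0, 25.7, 30.1, 55.0, 79.9, 85.0, 95.2, 22.3]
slices = build_bin_slices(pt_values, BIN_EDGES)
selected = range_query(pt_values, slices, BIN_EDGES, 25.7, 91.8)
print(selected)
```

For the query on the slide, the bins [40;60) and [60;80) are answered from the index alone, while rows in [20;40) and [80;100) need a look at the original data because the query boundaries 25.7 and 91.8 fall inside those bins.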
