1 / 24

Issues in Database Performance

Issues in Database Performance. Performance in Read / write are hardware issues => throw money at it Performance of DB = ability of engine to locate data Factors affecting speed of retrieval Cache (sizing of objects) Access method (table scan / indexes…) Contention between processes

naomi
Télécharger la présentation

Issues in Database Performance

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Issues in Database Performance • Performance in Read / write are hardware issues => throw money at it • Performance of DB = ability of engine to locate data • Factors affecting speed of retrieval • Cache (sizing of objects) • Access method (table scan / indexes…) • Contention between processes • Indirect processes (roll back, archiving…)

  2. Reading data in Oracle 1 – determine how to access block where data is located • read data dictionary stored in separate part of DB • DD may be loaded up in cache (aka row cache) to limit I/O activity 2 - DD indicates preferred access method for block • B tree index, partitioning, hash clusters etc… 3 - Search begins either in full scan or with index until data found

  3. Designing DB for reading • Supply methods for high precision access to data • But some queries will defeat the strategies • E.g. credit cards transactions – monthly report of scattered items • No solution = take off-line

  4. Changing data • Oracle makes hard work of changes • Rollback data (immediate) • Log files (long term) • Changed blocks read and updated in buffer • Released to disk as buffer is cleared • But rollback info generate most I/O operations • In sensitive environments, simultaneous archiving makes it worse (ARCHIVELOG mode)

  5. Indexing Problems • Super-fast indexes need updating as data is changed => DB slows down. • More complex index = more complex update mechanism = more rollback …. • DB physical structure degrades, so does index (eg split blocks) • Performance decreases over time • Rebuild needed (which interferes with operations) INDEX_STATS tells you how big the index has grown

  6. Side effects • Role of DBMS enforce data consistency • A reading process may need an older version of the data • Need to create a private version of the data => processes that should be Read Only require writes Attempt to describe all required operations in executing a query requiring old data

  7. Solution • To create an older version of data • Must apply roll back • find old roll back block (I/O) • Roll back index (I/O) • Find data (I/O) • Roll back data • Read old data • Reverse all changes (multiple I/O) • Significant I/O implications + buffer full of old stuff

  8. Conclusions • Writers and readers DO interfere with each other • The mechanisms used by Oracle to bypass locking have performance side effects • Performance come with minimising I/O • i.e. with good access techniques • Precision of data location • Physical proximity of related data (cf: caching) • However: Techniques to reduce I/O numbers tend to reduce the speed of access!

  9. Indexing example (see figure 3.1): • Alphabetic search using B tree index for name = Oscar Smith • Split the table in sections (eg: half) and read until find start beyond letter “S” + go back one • Do same in branch block • Smith, N is the one • Then look for page with Smith, N in header • Scan for actual entry in index • Read address • Move to table • Read

  10. About Btree indexes • Root and Branch blocks = approx 2% of index (small) • In frequent hit situations, both blocks loaded in Data Buffer all the time • Then only 2 I/O may be required: • One to read leaf block • One to read the table • In practice, read in index and in table may require reading several blocks • ?

  11. Creating and Using Indexes • Important for live access, • Even more for querying multiple tables • Value matching is costly process in RDB • No pointers • Connection purely on comparison basis only • One value against all values in joint field • If link between 2 huge tables, perf is low • All RDBs use some form of indexing • Some complexity involved as index can reduce physical I/O at the cost of logical I/O (CPU time)

  12. Btree indexing • Creating an index means creating a table with X+1 columns • X = number of columns in index • Rowid (added field) [table block + row] • Index is then copied into consecutive blocks • PCTFREE function leaves space for data growth • high value will generate many leaf blocks but reduce occurrence of split blocks in time • Pointer is added to previous and next leaf blocks in header of block

  13. Btree indexing (2) • Then branch layer is built: • If index > one block • Collect all first entries + block address of each leaf block • Write down into the first level branch block (packed) • If branch block is full, initiates second level of branch blocks etc…. • Room is saved in branch blocks: • No forward and backward pointer in branch blocks • Entries are “trimmed” to the bare minimum • First entries are omitted • See figure 6.1

  14. Syntax • CREATE INDEX name ON table name (field1, field2 …) PCTFREE 50; • utility programmes to assess performance of indexes – eg INDEX_STATS View

  15. Updating indexes • Index entries are NEVER changed • Marked as deleted and re-inserted • Space made available cannot be used until after index is re-built • Inserts that don’t fit split the block (rarely 50/50!) • If a blocks becomes empty, it is marked as free, but is never removed • Also, blocks never merge automatically

  16. Some problems • Some situations cannot be addressed with indexes • e.g. In a FIFO processing situation (e.g. a queue), indexes will prove counterproductive • Index may grow to stupid proportions even with small error rate (unsuccessful processing of data) • Every time a transaction is added or processed (deleted) the index must change

  17. Alternative: Bitmap indexing • Imagine following query in huge table Find customers living in London, with 2 cars and 3 children occupying a 4 bed house • Index not useful – why? • Too big • If query changes in any way =>new index needed • Maintaining a set of indexes for each query would just be too costly • Use a bitmap (see table 6.1, 6.2 and 6.3)

  18. Bitmap indexes (2) • Special for data warehouse type DBs • Build one bitmap for each relevant parameter • Combine bitmaps using the “and” SQL keyword • Also possible to use “not”ing of bitmap (see table 6.4, 6.5)

  19. Key points to remember • What is the key advantage of a bitmap index? • What situation does it best suit ? • Bitmaps can also be packed by Oracle compression features • But size is unpredictable – why?

  20. Example: • Table with 1,000,000 rows • Bitmap on one column that can contain one of 8 different values (e.g. city names) • Data is such that all same city together 125,000 times • Write the bitmap • Imagine what compression can be achieved • Data is such that cities are in random order, but same number of each • Same questions

  21. solution First scenario • Bitmap for first city: 125,000 ones and 875,000 zeros [trimmed off] • Size ~ 125,000 bits or approx 18Kbytes • Full bitmap = 156 Kbytes Second scenario: • Bitmap for first city sequences of 1’s and 7 zeros, repeated 125,000 times • Size ~ 1,000,000 bits or approx 140 Kbytes • Full bitmap = 1.12 Mbytes • But BTree index for such data would be around 12MB

  22. Conclusion • Bitmap indexes work best when combined • They are very quick to build • Up to a million rows for 10 seconds • Work best when limited number of values + when high repetitions • Best way to deal with huge volumes => make drastic selection of interesting rows before reading the table • Warning: one entry in a Bitmap = hundreds of records = > locking can be crazy (OLTP systems) • => for datawarehouse type applications (no contention)

  23. What oracle says about it • Use index for queries with low hit ratio or when queries access < 2 - 4% of data • Index maintenance is costly so index “just in case” is silly • Must analyse the type of data when deciding what kind of index • Do NOT use columns with loads of changes in an index • Use indexed fields in “where” statement • Can also write queries with NO_INDEX

  24. Administering indexes • Indexes degrade over time • Should stabilise around 75% efficiency, but don’t • Run stats: Analyse index NAME validate structure Analyse index NAME compute statistics Analyse index NAME estimate statistics sample 1 percent See table 6.6

More Related