1 / 156

CS 245: Database System Principles Notes 4: Indexing

CS 245: Database System Principles Notes 4: Indexing. Hector Garcia-Molina. Chapter 4. Indexing & Hashing value. record. ?. value. Topics. Conventional indexes B-trees Hashing schemes. 10. 30. 50. 70. 90. 20. 40. 60. 80. 100. Sequential File. 70. 50. 30. 10. 110. 90.

chet
Télécharger la présentation

CS 245: Database System Principles Notes 4: Indexing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS 245: Database System PrinciplesNotes 4: Indexing Hector Garcia-Molina Notes 4

  2. Chapter 4 Indexing & Hashing value record ? value Notes 4

  3. Topics • Conventional indexes • B-trees • Hashing schemes Notes 4

  4. 10 30 50 70 90 20 40 60 80 100 Sequential File Notes 4

  5. 70 50 30 10 110 90 120 80 60 40 20 100 90 10 70 50 30 80 60 40 20 100 Sequential File Dense Index Notes 4

  6. 130 90 50 10 210 170 230 150 110 70 30 190 90 10 70 50 30 80 60 40 20 100 Sequential File Sparse Index Notes 4

  7. 210 50 10 130 170 490 90 170 330 10 30 570 70 110 410 150 190 90 230 250 10 50 30 90 70 100 80 60 40 20 Sequential File Sparse 2nd level Notes 4

  8. Comment: {FILE,INDEX} may be contiguous or not (blocks chained) Notes 4

  9. Question: • Can we build a dense, 2nd level index for a dense index? Notes 4

  10. Notes on pointers: (1) Block pointer (sparse index) can be smaller than record pointer BP RP Notes 4

  11. Notes on pointers: (2) If file is contiguous, then we can omit pointers (i.e., compute them) Notes 4

  12. R2 R1 R3 R4 K1 K2 K3 K4 Notes 4

  13. say: 1024 B per block R2 R1 R3 • if we want K3 block: • get it at offset • (3-1)1024 • = 2048 bytes R4 K1 K2 K3 K4 Notes 4

  14. Sparse vs. Dense Tradeoff • Sparse: Less index space per record can keep more of index in memory • Dense: Can tell if any record exists without accessing file (Later: • sparse better for insertions • dense needed for secondary indexes) Notes 4

  15. Terms • Index sequential file • Search key ( primary key) • Primary index (on Sequencing field) • Secondary index • Dense index (all Search Key values in) • Sparse index • Multi-level index Notes 4

  16. Next: • Duplicate keys • Deletion/Insertion • Secondary indexes Notes 4

  17. 10 10 20 30 40 10 20 30 30 45 Duplicate keys Notes 4

  18. 10 10 10 10 20 20 30 30 40 40 45 10 20 30 10 20 45 30 30 30 Duplicate keys Dense index, one way to implement? 10 10 10 10 10 10 20 20 20 20 30 30 30 30 30 30 Notes 4

  19. 30 10 10 20 40 45 30 30 10 20 Duplicate keys Dense index, better way? 10 20 30 40 Notes 4

  20. 30 10 10 20 40 45 30 30 10 20 Duplicate keys Sparse index, one way? 10 10 20 30 Notes 4

  21. 30 10 10 20 40 45 30 30 10 20 careful if looking for 20 or 30! Duplicate keys Sparse index, one way? 10 10 20 30 Notes 4

  22. 30 10 10 20 40 45 30 30 10 20 Duplicate keys Sparse index, another way? • place first new key from block 10 20 30 30 Notes 4

  23. 10 10 20 30 40 30 20 30 45 10 should this be 40? Duplicate keys Sparse index, another way? • place first new key from block 10 20 30 30 Notes 4

  24. Duplicate values, primary index Summary • Index may point to first instance of each value only File Index a a a . . b Notes 4

  25. 10 30 50 70 80 20 40 60 Deletion from sparse index 10 30 50 70 90 110 130 150 Notes 4

  26. 70 10 30 50 20 40 60 80 Deletion from sparse index • delete record 40 10 30 50 70 90 110 130 150 Notes 4

  27. 70 10 30 50 20 40 60 80 Deletion from sparse index • delete record 40 10 30 50 70 90 110 130 150 Notes 4

  28. 70 10 30 50 20 40 60 80 Deletion from sparse index • delete record 30 10 30 50 70 90 110 130 150 Notes 4

  29. 10 30 50 70 20 40 60 80 40 40 Deletion from sparse index • delete record 30 10 30 50 70 90 110 130 150 Notes 4

  30. 70 10 30 50 20 40 60 80 Deletion from sparse index • delete records 30 & 40 10 30 50 70 90 110 130 150 Notes 4

  31. 70 10 30 50 20 40 60 80 Deletion from sparse index • delete records 30 & 40 10 30 50 70 90 110 130 150 Notes 4

  32. 10 30 50 70 20 40 60 80 50 70 Deletion from sparse index • delete records 30 & 40 10 30 50 70 90 110 130 150 Notes 4

  33. 10 30 50 70 80 20 40 60 Deletion from dense index 10 20 30 40 50 60 70 80 Notes 4

  34. 70 10 50 20 80 60 Deletion from dense index • delete record 30 10 20 30 30 40 40 50 60 70 80 Notes 4

  35. 70 10 50 80 20 60 40 Deletion from dense index • delete record 30 10 20 30 30 40 40 50 60 70 80 Notes 4

  36. 10 50 70 20 60 80 40 40 Deletion from dense index • delete record 30 10 20 30 30 40 40 50 60 70 80 Notes 4

  37. 10 30 40 60 20 50 Insertion, sparse index case 10 30 40 60 Notes 4

  38. 10 30 40 60 20 50 Insertion, sparse index case • insert record 34 10 30 40 60 Notes 4

  39. 10 30 40 60 20 50 34 • our lucky day! • we have free space • where we need it! Insertion, sparse index case • insert record 34 10 30 40 60 Notes 4

  40. 10 30 40 60 20 50 Insertion, sparse index case • insert record 15 10 30 40 60 Notes 4

  41. 10 30 40 60 20 50 15 20 20 30 Insertion, sparse index case • insert record 15 10 30 40 60 Notes 4

  42. 10 30 40 60 20 50 15 20 20 30 Insertion, sparse index case • insert record 15 10 30 40 60 • Illustrated: Immediate reorganization • Variation: • insert new block (chained file) • update index Notes 4

  43. 10 30 40 60 20 50 Insertion, sparse index case • insert record 25 10 30 40 60 Notes 4

  44. 10 30 40 60 25 20 50 overflow blocks (reorganize later...) Insertion, sparse index case • insert record 25 10 30 40 60 Notes 4

  45. Insertion, dense index case • Similar • Often more expensive . . . Notes 4

  46. 30 20 80 100 90 50 70 40 10 60 Secondary indexes Sequence field Notes 4

  47. 30 20 80 100 90 40 70 60 10 50 30 90 20 ... 80 100 Secondary indexes Sequence field • Sparse index Notes 4

  48. 100 30 20 80 90 50 70 40 10 60 30 90 ... 20 80 100 does not make sense! Secondary indexes Sequence field • Sparse index Notes 4

  49. 30 20 80 100 90 50 70 40 10 60 Secondary indexes Sequence field • Dense index Notes 4

  50. 100 90 20 30 80 70 40 10 60 50 10 50 20 60 70 30 40 ... Secondary indexes Sequence field • Dense index Notes 4

More Related