
ADAPTING TREE STRUCTURES FOR PROCESSING WITH SIMD INSTRUCTIONS

Authors: Steffen Zeuch, Frank Huber, Johann-Christoph Freytag, Humboldt-Universität zu Berlin, {zeuchste,huber,freytag}@informatik.hu-berlin.de





Presentation Transcript


  1. ADAPTING TREE STRUCTURES FOR PROCESSING WITH SIMD INSTRUCTIONS • Authors: Steffen Zeuch, Frank Huber, Johann-Christoph Freytag • Humboldt-Universität zu Berlin • {zeuchste,huber,freytag}@informatik.hu-berlin.de

  2. Motivation • B+-Tree: a common index structure • Common node-internal search algorithm: binary search in O(log2 n) • Can we do better? Yes, with SIMD!

  3. Outline • Background • Binary Search and SIMD • Segmented Tree • Segmented Trie • Evaluation • Conclusion

  4. SIMD • Single Instruction Multiple Data • Available on CPU and GPU • Operation classes: arithmetic, comparison, conversion, logical [Figure: three example operations on vectors: add a constant to a vector, compare two vectors, add two vectors]
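
The three operations named on this slide can be imitated lane by lane in plain Python. This is only a scalar sketch: real SIMD hardware processes all lanes with one instruction, and the function names and the comparison operands are illustrative assumptions, not part of the presentation.

```python
# Scalar imitation of SIMD lane operations (illustrative names, not a real API).

def simd_add_const(vec, const):
    """Add the same constant to every lane."""
    return [lane + const for lane in vec]

def simd_add(a, b):
    """Add two vectors lane by lane."""
    return [x + y for x, y in zip(a, b)]

def simd_cmp_ge(a, b):
    """Compare lane by lane: -1 (all bits set) where a >= b, otherwise 0."""
    return [-1 if x >= y else 0 for x, y in zip(a, b)]

vec = [3, 67, 65, 2]
print(simd_add_const(vec, 2))             # [5, 69, 67, 4]
print(simd_add(vec, [1, 1, 1, 1]))        # [4, 68, 66, 3]
print(simd_cmp_ge([3, 2], [2, 3]))        # [-1, 0]
```

The comparison returns -1 for "true" because SSE-style comparisons set all bits of a matching lane, which reads as -1 in two's complement.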

  5. Binary Search [Figure: binary search for search key = 9, shown per iteration; legend: search space, search key, separator, excluded]
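
As a baseline for the later slides, a minimal scalar binary search that also counts its iterations might look as follows. The key range 0 to 15 is an assumption chosen for the example; the slide does not state the node contents.

```python
def binary_search(keys, key):
    """Return (index, iterations); index is -1 when the key is absent."""
    lo, hi = 0, len(keys) - 1
    iterations = 0
    while lo <= hi:
        iterations += 1
        mid = (lo + hi) // 2          # one separator per iteration
        if keys[mid] == key:
            return mid, iterations
        if keys[mid] < key:
            lo = mid + 1              # exclude the lower half
        else:
            hi = mid - 1              # exclude the upper half
    return -1, iterations

keys = list(range(16))                # assumed example data
print(binary_search(keys, 9))         # (9, 3)
```

Each iteration inspects a single separator and halves the search space, which is the behaviour the SIMD variants try to improve on.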

  6. Outline • Background • Binary Search and SIMD • Segmented Tree • Segmented Trie • Evaluation • Conclusion

  7. Binary Search with Two Separators [Figure: search for search key = 9 using two separators per iteration; legend: search space, search key, separator, excluded]

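  8. Binary Search + SIMD [Figure: SIMD registers A and B hold the copies of the search key (9, 9) and the separators (8, 17); the >= comparison writes the result (0, -1) to SIMD register C; legend: search space, search key, separator, excluded]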
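
The comparison step on this slide can be sketched with simulated lanes. The operand order (separators >= search key) is an assumption that reproduces the result (0, -1) shown for the separators 8 and 17 and the search key 9; the helper names are hypothetical.

```python
def cmp_ge_lanes(separators, search_key):
    """Broadcast the search key and compare it with all separators at once."""
    key_lanes = [search_key] * len(separators)        # one copy of the key per lane
    return [-1 if s >= k else 0 for s, k in zip(separators, key_lanes)]

def next_partition(separators, search_key):
    """Index of the partition to continue in: separators smaller than the key."""
    return cmp_ge_lanes(separators, search_key).count(0)

print(cmp_ge_lanes([8, 17], 9))     # [0, -1]
print(next_partition([8, 17], 9))   # 1, i.e. the partition between 8 and 17
```

With two separators, one comparison narrows the search to one of three partitions instead of one of two.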

  9. Problem: SIMD on CPU • SIMD instructions on the CPU do not support scatter and gather functionality. [Figure: SIMD load(start position) fills a 4 x 32-bit SIMD register with the consecutive values 8, 9, 10, 11]

  10. Solution: K-ary Search by Schlegel et al. [Figure: 3-ary search tree (k = 3) and its linearized order; search key = 9; legend: search space, search key, separator, excluded]

  11. Applied K-ary Search [Figure: k-ary search for search key = 9 on the 3-ary search tree and its linearized order, shown per iteration; legend: search space, search key, separator, excluded]
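
A minimal sketch of the idea behind k-ary search: the sorted keys are rearranged into the breadth-first order of a k-ary search tree so that the k-1 separators of each step lie next to each other and can be fetched with one contiguous load. The code assumes a perfectly filled tree (k^h - 1 keys) and the keys 0 to 25; the function names and the breadth-first node numbering are assumptions, not taken from the slides or from Schlegel et al.

```python
def linearize(sorted_keys, k):
    """Rearrange sorted keys into breadth-first k-ary search tree order."""
    n = len(sorted_keys)
    size = n + 1
    while size % k == 0:
        size //= k
    assert size == 1, "sketch requires k**h - 1 keys"
    out = [None] * n

    def fill(node, lo, count):
        if count == 0:
            return
        child = (count - (k - 1)) // k            # keys per child subtree
        for j in range(k - 1):                    # the node's k-1 separators
            out[node * (k - 1) + j] = sorted_keys[lo + (j + 1) * (child + 1) - 1]
        for j in range(k):                        # recurse into the k children
            fill(node * k + 1 + j, lo + j * (child + 1), child)

    fill(0, 0, n)
    return out

def kary_contains(lin, k, key):
    """Search the linearized keys; each step reads k-1 adjacent separators."""
    node, num_nodes = 0, len(lin) // (k - 1)
    while node < num_nodes:
        seps = lin[node * (k - 1):(node + 1) * (k - 1)]   # contiguous "SIMD load"
        c = sum(1 for s in seps if s < key)               # simulated lane compare
        if c < k - 1 and seps[c] == key:
            return True
        node = node * k + 1 + c                           # descend into child c
    return False

lin = linearize(list(range(26)), 3)
print(lin[:2])                      # [8, 17], the root separators
print(kary_contains(lin, 3, 9))     # True
```

For these assumed keys the root separators are 8 and 17, and the search for key 9 finishes in three steps.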

  12. Degree of Parallelism

  13. Outline • Background • Binary Search and SIMD • Segmented Tree • Segmented Trie • Evaluation • Conclusion

  14. Segmented Tree • Change the inner-node search algorithm from the commonly used binary search to k-ary search.

  15. Problem: Unfilled Nodes • K-ary requirement: the key count must be a multiple of k-1 [Figure: 3-ary search tree and its linearized order, with the label Smax+1]
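
One way to read the Smax+1 label is as a filler value: a node with too few keys is padded with "largest key plus one" until the key count is a multiple of k-1. The sketch below rests on that interpretation, which the transcript does not spell out, and `pad_keys` is a hypothetical helper name.

```python
def pad_keys(sorted_keys, k):
    """Pad a sorted key list with Smax+1 up to a multiple of k-1 keys."""
    if not sorted_keys:
        return []
    filler = sorted_keys[-1] + 1                   # Smax + 1 (assumed meaning)
    missing = (-len(sorted_keys)) % (k - 1)        # keys needed to fill the last node
    return sorted_keys + [filler] * missing

print(pad_keys([1, 2, 3, 4, 5], 3))   # [1, 2, 3, 4, 5, 6]
print(pad_keys([1, 2, 3, 4], 3))      # [1, 2, 3, 4]
```

Because the filler is larger than every real key, the padded list stays sorted and the filler never sorts ahead of a stored key.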

  16. Reordering • New keys require reordering: sorting → inserting → linearizing • Exceptions: • Empty node • Key is greater than the largest existing key

  17. Segmented Tree • Advantages: • High resource utilization • Fewer iterations required: binary search needs log2 n, k-ary search needs logk n • Disadvantages: • Reordering overhead • Large data types decrease performance
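
The iteration counts and the effect of the key size can be made concrete with a small calculation. The sketch assumes that one SIMD comparison handles k-1 separators, so k equals the number of lanes plus one, and it uses a 128-bit register; the node size of 256 keys is an arbitrary example value.

```python
def lanes_to_k(register_bits, key_bits):
    """Assumption: k - 1 separators fit into one SIMD register."""
    return register_bits // key_bits + 1

def iterations(n, k):
    """Smallest tree height h such that k**h - 1 >= n keys."""
    levels, capacity = 0, 0
    while capacity < n:
        levels += 1
        capacity = k ** levels - 1
    return levels

n = 256                                            # assumed keys per node
print("binary search:", iterations(n, 2))
for key_bits in (8, 16, 32, 64):
    k = lanes_to_k(128, key_bits)
    print(f"{key_bits}-bit keys: k = {k}, iterations = {iterations(n, k)}")
```

Under these assumptions, wider keys leave fewer lanes per register, so k shrinks and the iteration count moves back toward that of binary search.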

  18. Outline • Background • Binary Search and SIMD • Segmented Tree • Segmented Trie • Evaluation • Conclusion

  19. Segmented Trie [Figure: a key in hexadecimal and decimal notation, split into partial keys (hex) that are stored on trie levels 1 and 2]

  20. Segmented Trie

  21. Segmented Trie • Advantages: • High SIMD search performance • Prefix compression • Early termination • Disadvantages: • Fixed level count • Reordering overhead
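
The trie idea can be sketched by cutting each key into fixed-width partial keys, one per level. The 8-bit segment width, the two levels, and the nested dictionaries are assumptions for illustration; in the presented structure each trie node would be searched with k-ary SIMD search rather than a hash lookup.

```python
def split_key(key, levels=2, bits=8):
    """Split a key into partial keys, most significant segment first."""
    mask = (1 << bits) - 1
    return [(key >> (bits * (levels - 1 - i))) & mask for i in range(levels)]

def trie_insert(root, key, levels=2, bits=8):
    node = root
    for part in split_key(key, levels, bits):
        node = node.setdefault(part, {})       # shared prefixes reuse one node

def trie_contains(root, key, levels=2, bits=8):
    node = root
    for part in split_key(key, levels, bits):
        if part not in node:
            return False                       # early termination on a missing prefix
        node = node[part]
    return True

root = {}
for key in (0x1234, 0x1256, 0xAB00):
    trie_insert(root, key)
print([hex(p) for p in split_key(0x1234)])     # ['0x12', '0x34']
print(len(root))                               # 2 first-level entries: 0x12 and 0xab
print(trie_contains(root, 0x1256), trie_contains(root, 0x9900))   # True False
```

The keys 0x1234 and 0x1256 share the first-level entry 0x12, which illustrates prefix compression, and the lookup of 0x9900 stops at level 1, which illustrates early termination. The number of levels is fixed by key width divided by segment width, matching the fixed level count listed as a disadvantage.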

  22. SegTree vs. SegTrie

  23. Outline • Background • Binary Search and SIMD • Segmented Tree • Segmented Trie • Evaluation • Conclusion

  24. Test Setup • HW/SW configuration: • CPU: Intel Xeon 5520, 4 x 2.26 GHz • L1: 32 KB, L2: 256 KB, L3: 8 MB, main memory: 8 GB • Cache line: 128 bytes, SIMD bandwidth: 128 bits • Windows 7 64-bit Professional • Test dataset: synthetically generated, ascending, starting at 0

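  25. Evaluation: Bitmask • Three algorithms for evaluating the comparison result: • Bit shifting • Case-switch • PopCnt [Figure: SIMD registers A and B hold the search key (9, 9) and the separators (8, 17); the >= comparison writes the result (0, -1) to SIMD register C]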
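
The slide only names the three algorithms, so the following is a guess at what each could look like, not the authors' implementation. It assumes four lanes, sorted separators, a lane-wise "separator >= search key" comparison, and a movemask-style step that packs one bit per lane into an integer; every function name is hypothetical.

```python
def movemask(lanes):
    """Pack one bit per lane: bit i is set when lane i is -1 (comparison true)."""
    mask = 0
    for i, lane in enumerate(lanes):
        if lane == -1:
            mask |= 1 << i
    return mask

def position_bit_shifting(mask, lanes=4):
    """Shift through the mask until the first set bit is found."""
    pos = 0
    while pos < lanes and ((mask >> pos) & 1) == 0:
        pos += 1
    return pos

# One case per mask value that sorted separators can produce.
CASES = {0b1111: 0, 0b1110: 1, 0b1100: 2, 0b1000: 3, 0b0000: 4}

def position_case_switch(mask):
    """Map each possible mask directly to a position (switch statement in C)."""
    return CASES[mask]

def position_popcnt(mask, lanes=4):
    """Count the set bits; the remaining lanes hold smaller separators."""
    return lanes - bin(mask).count("1")

lanes = [-1 if s >= 9 else 0 for s in (8, 17, 25, 30)]   # [0, -1, -1, -1]
mask = movemask(lanes)
print(bin(mask))                                          # 0b1110
print(position_bit_shifting(mask), position_case_switch(mask), position_popcnt(mask))
```

All three variants return the number of separators smaller than the search key, which is the index of the partition or child to continue with; they differ only in how that number is extracted from the mask.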

  26. Evaluation: SegTree

  27. Evaluation: SegTrie

  28. Outline • Background • Binary Search and SIMD • Segmented Tree • Segmented Trie • Evaluation • Conclusion

  29. Our Contributions • B+-Tree and prefix B-Tree using SIMD • Transformation and search algorithms for breadth-first and depth-first data layouts • Three algorithms for interpreting a SIMD comparison result • Solution for an arbitrary key count • Thanks

  30. Backup
