
STEPS Towards Cache-Resident Transaction Processing




Presentation Transcript


  1. STEPS Towards Cache-Resident Transaction Processing Yifei Tao Kitsuregawa Lab

  2. Outline • Background • Steps: introducing cache-resident code • Applying Steps to OLTP workloads • Summary

  3. Background • OLTP (OnLine Transaction Processing) is one of the core technologies in RDBMSs, especially in business settings. • OLTP code is large and complex, and it is rarely rewritten to exploit advances in hardware. A new methodology is needed that improves performance by sharing application code in the CPU cache, without changing the code.

  4. Background Research [AD+99][LB+98][SBG02] shows that OLTP workloads are predominantly delayed by instruction-cache misses, especially L1-I misses.

  5. Background To maximize L1-I cache utilization and minimize stalls: 1. application code should have few branches, and 2. most importantly, the "working set" code footprint should fit in the L1-I cache. Unfortunately, OLTP workloads exhibit exactly the opposite behavior.

  6. Contents • Background • Steps: introducing cache-resident code • Applying Steps to OLTP workloads • Summary

  7. What is STEPS? • Synchronized Transactions through Explicit Processor Scheduling • Multiplexes concurrent transactions and exploits their common code paths • One transaction paves the cache with instructions, while close followers enjoy a nearly miss-free execution.
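The effect can be illustrated with a toy instruction-cache simulation (the `LRUCache` class and the footprint/capacity numbers below are ours, chosen only to make the point; they are not from the paper). When the code footprint exceeds the cache, running each transaction to completion thrashes the cache, while switching at cache-sized segment boundaries lets followers reuse the instructions the lead thread just faulted in:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU model of an instruction cache, counting misses."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.misses = 0

    def access(self, line):
        if line in self.lines:
            self.lines.move_to_end(line)        # hit: refresh LRU position
        else:
            self.misses += 1                    # miss: fetch the line
            if len(self.lines) >= self.capacity:
                self.lines.popitem(last=False)  # evict least-recently used
            self.lines[line] = True

FOOTPRINT = 64   # hypothetical code footprint, in cache lines
CAPACITY = 16    # hypothetical L1-I capacity (footprint > capacity)
THREADS = 10

# Conventional execution: each transaction runs the whole code path
# to completion, so the sequential scan thrashes the small cache.
cache = LRUCache(CAPACITY)
for _ in range(THREADS):
    for line in range(FOOTPRINT):
        cache.access(line)
naive_misses = cache.misses    # every access misses: 10 * 64 = 640

# Steps-style execution: switch threads after each cache-sized code
# segment, so only the lead thread faults each segment in.
cache = LRUCache(CAPACITY)
for seg_start in range(0, FOOTPRINT, CAPACITY):
    for _ in range(THREADS):
        for line in range(seg_start, seg_start + CAPACITY):
            cache.access(line)
steps_misses = cache.misses    # only the lead thread misses: 64

print(naive_misses, steps_misses)  # 640 64
```

In this idealized setting the followers are entirely miss-free, a 90% reduction; real code paths diverge, which is what the α/β analysis later in the talk accounts for.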

  8. Basic Idea of Steps

  9. Fast, efficient context-switching Typical context-switching mechanisms occupy a significant portion of the L1-I cache and take hundreds of processor cycles to run. Steps executes only the core context-switch code and updates only the CPU state, deferring updates to thread-specific software structures, such as the ready queue, until they must be made.
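As a loose analogy (not the paper's implementation), Python generators give a user-level switch that, like Steps' stripped-down switch, saves and restores only execution state and defers thread bookkeeping:

```python
def transaction(segments):
    """One transaction as a coroutine; each yield is a switch point
    placed at the boundary of an L1-I-sized code segment."""
    for seg in range(segments):
        # ... execute one cache-sized code segment ...
        yield seg  # cheap switch: no ready-queue or priority updates

# Steps-style scheduler: cycle through the whole group at every switch
# point, so all threads execute segment 0, then segment 1, and so on.
group = [transaction(3) for _ in range(4)]
trace = []
while group:
    runnable = []
    for t in group:
        try:
            trace.append(next(t))   # resume: restore execution state only
            runnable.append(t)
        except StopIteration:
            pass                    # bookkeeping deferred until a thread finishes
    group = runnable

print(trace)  # [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
```

The trace shows all four threads re-executing the same code segment back-to-back, which is exactly the interleaving that keeps the segment resident in the L1-I cache.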

  10. Steps in practice

  11. Instruction misses and thread group size

  12. Instruction misses For the execution of an operation P: • α: the factor by which cache misses decrease in a warm cache (0 < α ≤ 1) • β: the sharing ratio (0 < β < 1)

  13. Gain analysis of Steps Comparing Steps to Shore, the L1-I cache miss reduction is computed as (1 − #misses after / #misses before) · 100%. For index fetch, α = 0.373 and β = 0.033, giving an overall reduction in L1-I misses of 82%–87% for 10 threads and 90%–96% for 100 threads.
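These figures are consistent with a simple model (our reading of the slide's numbers, not the paper's exact derivation): without Steps each of the N threads pays the full miss count m; with Steps the lead thread pays m, and each follower pays between β·m (best case) and (β/α)·m (worst case):

```python
def reduction_bounds(n, alpha, beta):
    """L1-I miss reduction bounds for a group of n threads, under the
    assumed model: n*m misses without Steps; with Steps the lead
    thread pays m and each follower pays beta*m to (beta/alpha)*m."""
    low = 1 - (1 + (n - 1) * beta / alpha) / n   # worst case
    high = 1 - (1 + (n - 1) * beta) / n          # best case
    return low, high

alpha, beta = 0.373, 0.033  # measured for index fetch (from the slide)
lo10, hi10 = reduction_bounds(10, alpha, beta)
lo100, hi100 = reduction_bounds(100, alpha, beta)
print(f"10 threads: {lo10:.0%}-{hi10:.0%}")    # 10 threads: 82%-87%
print(f"100 threads: {lo100:.0%}-{hi100:.0%}") # 100 threads: 90%-96%
```

The model reproduces both quoted ranges, but the authoritative bound derivation is in the original STEPS paper.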

  14. Detailed behavior on two different processors

  15. Contents • Background • Steps: introducing cache-resident code • Applying Steps to OLTP workloads • Summary

  16. Experimentation setup

  17. TPC-C • OLTP benchmark • Models a business application, with transactions including New-Order, Payment, Delivery, and Stock-Level • Experiment 1: Payment • Experiment 2: New-Order, Payment, Stock-Level

  18. TPC-C results: Experiment 1

  19. TPC-C results: Experiment 2

  20. Outline • Background • Steps: introducing cache-resident code • Applying Steps to OLTP workloads • Summary

  21. Summary • Introduction to Steps • Steps shows that instruction-cache misses in OLTP workloads can be significantly reduced • Simulation with TPC-C shows that L1-I cache misses and branch mispredictions decrease

  22. Thank you!
