
Scheduling in Staged-DB Systems

Nicolas Bonvin, Rammohan Narendula, and Surender Reddy Yerva





Presentation Transcript


  1. Scheduling in Staged-DB Systems. Nicolas Bonvin, Rammohan Narendula, and Surender Reddy Yerva

  2. Organization
     • What is Staged-DB?
     • Scheduling in Staged-DB
     • Our Contribution
     • Scheduling in Execution Phase
     • System Modeling
     • System Design Details
     • Performance Study
     • Future Work

  3. Motivation
     • Response time: the time needed to produce the first page of output
     • Big advantage for the overlapping case ('1')

  4. Query Lifetime in DBMS
     • Query → PARSER → query tree → OPTIMIZER (catalogs and statistics) → query plan → EXECUTION (operators, data) → answer
     • EXECUTION (disk I/O): 90% of time

  5. DB Paradigm So Far
     • Query → Query Execution Plan (tree of operators)
     • Multiple queries
     • Each query is handled by a different thread
     • No cross-communication or sharing across threads → the sharing opportunity is missed
     • [Figure: DBMS with a thread pool and no coordination; one query, multiple operators]

  6. Staged-DB Paradigm
     • The DB is remodeled as various stages
     • Stage
       • "Common execution logic" is grouped into a stage
       • Each operator in the QEP can be seen as a stage
     • A query passes through all the needed stages to produce an output
     • Common data needs → detected by the stage
     • [Figure: DBMS and StagedDB thread pools; one operator, multiple queries]

  7. Staged Database Systems
     • [Figure: a conventional DBMS receiving queries next to a StagedDB with Stage 1, Stage 2, and Stage 3]
     • DB → stages; execution stage → microEngines
     • Each stage has a queue; each microEngine also has a request queue
     • High concurrency → locality across requests

  8. Scheduling in Staged-DB
     • Scheduling at different levels
       • Stages (parser, optimizer, execution)
       • Across microEngines (the execution engine has SCAN, JOIN, etc. microEngines)
       • Within a microEngine
     • We consider only scheduling "across microEngines"
     • Scheduling policies:
       • Round-Robin
       • Heavy Load First
       • Light Load First
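The three policies named on this slide can be sketched as functions that pick which microEngine runs next. This is only an illustration: the slides do not define "load", so the sketch assumes it means the number of packets waiting in each microEngine's input queue, and the names `make_round_robin`, `heavy_load_first`, and `light_load_first` are hypothetical.

```python
from itertools import count


def make_round_robin():
    """Return a picker that cycles through engines in name order, skipping idle ones."""
    position = count()  # remembers where the previous pick stopped

    def pick(queue_lengths):
        names = sorted(queue_lengths)
        for _ in range(len(names)):
            name = names[next(position) % len(names)]
            if queue_lengths[name] > 0:
                return name
        return None  # every engine is idle

    return pick


def heavy_load_first(queue_lengths):
    """Pick the engine with the most waiting packets (assumed meaning of 'load')."""
    busy = {name: n for name, n in queue_lengths.items() if n > 0}
    return max(busy, key=busy.get) if busy else None


def light_load_first(queue_lengths):
    """Pick the non-idle engine with the fewest waiting packets."""
    busy = {name: n for name, n in queue_lengths.items() if n > 0}
    return min(busy, key=busy.get) if busy else None
```

Each picker takes a mapping from engine name to queue length, for example `{"join": 2, "scan": 5, "sort": 0}`, and returns the chosen engine name or `None` when nothing is waiting.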

  9. Detailed System Design
     • Based on the discrete event simulation technique
     • All computation, data needs, and dependencies are modeled using events
     • System components
       • Global System Queue
       • Dispatcher
       • Operator (or mEngine)
       • Global Scheduler
       • Main Memory
       • Overlap Detector

  10. [Figure: Global System Queue]
     • Event fields: eventId, componentId, functionId, firingTime, packet
     • Events and components shown: Query Arrival event, Dispatcher, Scheduler, Engine Insert, Engine Exec-Begin, Engine Exec-End, Memory, Disk-Fetch
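A discrete event simulation of this kind is commonly built around a priority queue ordered by firing time. The sketch below is a minimal illustration, not the authors' implementation: the field names follow the slide (eventId, componentId, functionId, firingTime, packet) in snake_case, while the `GlobalSystemQueue` class, its methods, and the handler lookup by `(component_id, function_id)` are assumptions.

```python
import heapq
from dataclasses import dataclass, field
from typing import Any


@dataclass(order=True)
class Event:
    # Events are ordered by firing time; event_id breaks ties in insertion order.
    firing_time: float
    event_id: int
    component_id: str = field(compare=False)
    function_id: str = field(compare=False)
    packet: Any = field(compare=False, default=None)


class GlobalSystemQueue:
    """Hypothetical event queue: always fires the earliest pending event."""

    def __init__(self):
        self._heap = []
        self._next_id = 0
        self.now = 0.0  # simulated clock

    def schedule(self, firing_time, component_id, function_id, packet=None):
        event = Event(firing_time, self._next_id, component_id, function_id, packet)
        self._next_id += 1
        heapq.heappush(self._heap, event)
        return event

    def run(self, handlers):
        """Fire events in time order; handlers may schedule further events."""
        trace = []
        while self._heap:
            event = heapq.heappop(self._heap)
            self.now = event.firing_time
            trace.append((event.firing_time, event.component_id, event.function_id))
            handler = handlers.get((event.component_id, event.function_id))
            if handler is not None:
                handler(self, event)
        return trace
```

A handler is any callable taking `(queue, event)`; a query-arrival handler, for instance, could call `queue.schedule(queue.now + 1.0, "engine", "exec_begin", event.packet)` to model the dispatcher handing the packet to an engine.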

  11. mEngine
     • Input packet queue: receives request packets from the parent node or the dispatcher
     • Packet format: queryId list, queryPlans, pageId, contextInfo
     • Engine Insert: call the overlap detector; insert the packet
     • Engine Execution Begin: pick a packet from the queue; send the packet to the child OR execute and produce output
     • Engine Execution End: insert an event into the event queue for the scheduler
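The packet lifecycle on this slide (insert, execution begin, execution end) can be sketched as follows. This is an assumption-laden illustration: the packet fields come from the slide, but the `Packet` and `MicroEngine` classes, the `notify_scheduler` callback, and the FIFO pick order are hypothetical, and the overlap-detector call and the "send to child" branch are left out for brevity.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    query_ids: list        # queries that share this request
    query_plans: dict      # query id -> plan (representation is an assumption)
    page_id: int
    context_info: dict = field(default_factory=dict)


class MicroEngine:
    """Hypothetical mEngine with a FIFO input packet queue."""

    def __init__(self, name, notify_scheduler):
        self.name = name
        self.input_queue = deque()
        self.notify_scheduler = notify_scheduler  # stands in for the event queue
        self.current = None

    def insert(self, packet):
        # Engine Insert: the overlap detector would be consulted here.
        self.input_queue.append(packet)

    def execution_begin(self):
        # Engine Execution Begin: pick the first packet if the engine is free.
        if self.current is None and self.input_queue:
            self.current = self.input_queue.popleft()
        return self.current

    def execution_end(self):
        # Engine Execution End: report the finished packet to the scheduler.
        finished, self.current = self.current, None
        if finished is not None:
            self.notify_scheduler(self.name, finished)
        return finished
```

In a full simulation, `notify_scheduler` would insert an event into the global event queue rather than call the scheduler directly.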

  12. mEngines
     • Join
     • Sort
     • Aggregation
     • Scan
     • Wait and Scan
     • Index Scan

  13. Overlap detection
     • With memory
     • With input queue
     • Two types
       • Linear
       • Spike
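The two places where overlap is checked (memory and the input queue) can be illustrated with a small function. This is a sketch under assumptions: the slides do not say how a match is decided, so the sketch matches on page id, the function name and the dictionary layout of queued requests are hypothetical, and the "Linear" and "Spike" types are not modeled because the slides do not define them.

```python
def detect_overlap(query_id, page_id, pages_in_memory, input_queue):
    """Classify a page request as overlapping with memory, with the queue, or not at all.

    pages_in_memory: set of page ids currently resident (assumed representation).
    input_queue: list of pending requests, each {"page_id": ..., "query_ids": [...]}.
    """
    # Overlap with memory: the page is already loaded, so no new fetch is needed.
    if page_id in pages_in_memory:
        return "memory"

    # Overlap with the input queue: another query already asked for this page,
    # so the new query is attached to the pending request.
    for pending in input_queue:
        if pending["page_id"] == page_id:
            if query_id not in pending["query_ids"]:
                pending["query_ids"].append(query_id)
            return "queue"

    # No overlap: enqueue a fresh request.
    input_queue.append({"page_id": page_id, "query_ids": [query_id]})
    return "none"
```

Attaching a query to an existing request is what lets one page fetch serve several queries.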

  14. Memory Manager
     • Pinning and unpinning
     • Put()
     • pageExists()
     • consumePage()
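One possible reading of this interface is a pin-counted page store. The slides list only the operation names, so the semantics here are assumptions: `put` pins a page once per query that still needs it, `consume_page` unpins it once, and a page is dropped when its pin count reaches zero. The `peak_pages` counter is also an addition, included because the performance study reports the maximum number of pages held in memory.

```python
class MemoryManager:
    """Hypothetical pin-counted main memory for the simulation."""

    def __init__(self):
        self._pins = {}      # page id -> number of queries still needing the page
        self.peak_pages = 0  # largest number of pages resident at once

    def put(self, page_id, consumers=1):
        """Load a page (or add pins to a resident one) for `consumers` queries."""
        if consumers < 1:
            raise ValueError("consumers must be at least 1")
        self._pins[page_id] = self._pins.get(page_id, 0) + consumers
        self.peak_pages = max(self.peak_pages, len(self._pins))

    def page_exists(self, page_id):
        return page_id in self._pins

    def consume_page(self, page_id):
        """Unpin a page once; evict it when no query needs it any more."""
        if page_id not in self._pins:
            raise KeyError(f"page {page_id} is not in memory")
        self._pins[page_id] -= 1
        if self._pins[page_id] == 0:
            del self._pins[page_id]
```

With this reading, a page shared by two overlapping queries stays resident until both have consumed it.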

  15. Performance study
     • 5 queries
     • 5 runs
     • Uniform arrival rate

  16. Effect of Overlapping
     • Response time: the time needed to produce the first page of output
     • Big advantage for the overlapping case ('1')

  17. Effect of Overlapping
     • Memory consumption: the maximum number of pages held in memory during the lifetime of the query
     • Higher memory consumption with overlapping!

  18. Effect of Overlapping
     • Throughput: the number of queries completed per unit of time
     • Clear advantage with overlap detection!
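The three metrics defined on slides 16 to 18 can be written down directly. This is a sketch of the definitions only: the function names are hypothetical, response time is assumed to be measured from query arrival, and the inputs are assumed to be timestamps and page counts recorded by the simulator.

```python
def response_time(arrival_time, first_page_time):
    """Time from query arrival (assumed start point) to the first output page."""
    return first_page_time - arrival_time


def peak_memory_pages(pages_in_memory_samples):
    """Maximum number of pages held in memory over the query's lifetime."""
    return max(pages_in_memory_samples, default=0)


def throughput(completion_times, window_start, window_end):
    """Queries completed per unit of time within [window_start, window_end)."""
    if window_end <= window_start:
        raise ValueError("window_end must be after window_start")
    completed = sum(1 for t in completion_times if window_start <= t < window_end)
    return completed / (window_end - window_start)
```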

  19. Comparing scheduling policies
     • Mean response time
     • Round-Robin seems to perform a little better

  20. Comparing scheduling policies
     • Memory consumption
     • No differences!

  21. Future Work
     • A few more interesting global scheduling policies are possible.
     • The system does not yet have a local scheduling policy for choosing which packet in the input packet queue to process next. At the moment it picks the first packet in the queue.
     • On the implementation side, experiments should be run with more mEngines and with benchmark-style input queries.
