GeantV scheduling framework revisited (aka V3)

Andrei Gheata, Weekly GeantV Meeting, May 16, 2017

Features

GeantV version 3: A generic vector flow approach
[Diagram; labels recovered from the slide]
• GeantPropagator and its workers
• SimulationStage (several in sequence): virtual Select(track) and virtual DoIt( , ), with default behavior to override
• Handler 1 … Handler "i", with Basketizer 1 … Basketizer "i"; virtual DoIt(track) in scalar and vector (loop) variants, e.g. ComptonFilter::DoIt
• Select next stage if different from SimulationStage::fFollowUp
• GeantTaskData: stage buffers holding GeantTrack *, filled via AddTrack(track, )

Processing flow per thread
[Diagram; labels recovered from the slide]
• Event server (primaries)
• Stack-like buffer with lanes lane0, lane1, …, laneN (secondaries)
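The stage, handler, and basketizer flow named on the "generic vector flow" slide can be illustrated with a minimal sketch. This is not the GeantV code: `Track`, `Basket`, `Process`, `Flush`, `HalveEnergyHandler`, and `DemoStage` are hypothetical names invented for illustration, and flushing partial baskets at the end of each pass is a simplification assumed here. Only `Select`, `DoIt`, `AddTrack`, and `fFollowUp` echo names that appear on the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for GeantTrack.
struct Track {
  int id;
  double energy;
  int nextStage;
  Track(int i, double e) : id(i), energy(e), nextStage(-1) {}
};
using Basket = std::vector<Track *>;

// Handler with a built-in basketizer (illustrative, not the Geant::Handler API).
class Handler {
public:
  explicit Handler(std::size_t basketSize) : fBasketSize(basketSize) {}
  virtual ~Handler() = default;
  virtual void DoIt(Track &track) = 0;  // scalar interface
  virtual void DoIt(Basket &basket) {   // vector interface
    for (Track *t : basket) DoIt(*t);   // default behavior: scalar loop
  }
  // Collect a track; hand back a basket once basketSize tracks are pending.
  bool AddTrack(Track *track, Basket &ready) {
    fPending.push_back(track);
    return fPending.size() >= fBasketSize && Flush(ready);
  }
  // Hand back whatever is pending, even if the basket is not full.
  bool Flush(Basket &ready) {
    if (fPending.empty()) return false;
    ready = std::move(fPending);
    fPending.clear();
    return true;
  }

private:
  std::size_t fBasketSize;
  Basket fPending;
};

// A stage selects a handler per track and forwards processed tracks.
class SimulationStage {
public:
  explicit SimulationStage(int followUp) : fFollowUp(followUp) {}
  virtual ~SimulationStage() = default;
  virtual Handler *Select(Track &track) = 0;  // nullptr: track skips the stage
  void AddHandler(Handler *h) {
    assert(h != nullptr);
    fHandlers.push_back(h);
  }
  void Process(std::vector<Track *> &input, std::vector<Track *> &output) {
    Basket ready;
    for (Track *t : input) {
      t->nextStage = fFollowUp;  // default follow-up; a handler may change it
      Handler *h = Select(*t);
      if (!h) { output.push_back(t); continue; }
      if (h->AddTrack(t, ready)) Run(*h, ready, output);
    }
    input.clear();
    for (Handler *h : fHandlers)  // assumption: flush partial baskets per pass
      if (h->Flush(ready)) Run(*h, ready, output);
  }

private:
  static void Run(Handler &h, Basket &ready, std::vector<Track *> &output) {
    h.DoIt(ready);
    output.insert(output.end(), ready.begin(), ready.end());
    ready.clear();
  }
  int fFollowUp;
  std::vector<Handler *> fHandlers;
};

// Toy handler: halves the energy of every track it receives.
class HalveEnergyHandler : public Handler {
public:
  using Handler::Handler;
  using Handler::DoIt;
  void DoIt(Track &track) override { track.energy *= 0.5; ++fCalls; }
  int fCalls = 0;
};

// Toy stage: only tracks above 1.0 energy units are sent to the handler.
class DemoStage : public SimulationStage {
public:
  DemoStage(int followUp, Handler *h) : SimulationStage(followUp), fHandler(h) { AddHandler(h); }
  Handler *Select(Track &track) override { return track.energy > 1.0 ? fHandler : nullptr; }

private:
  Handler *fHandler;
};
```

In this sketch a vectorized handler would override `DoIt(Basket &)`; handlers that only provide the scalar `DoIt(Track &)` still work through the default loop, which mirrors the "default behavior to override" note on the slide.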

Processing flow per propagator/NUMA node
[Diagram; labels recovered from the slide]
• Event server feeding the stage buffers
• Stages: GeometryStage, PhysicsStage, PropagationStage, each with a Select step
• Handlers: Volume1, Volume2, Process1, Process2, Field prop., Linear prop., each with a Basketizer
• Scalar code goes through scalar DoIt(), vectorized code through vector DoIt()
• Threads on same propagator/socket

Stack-like handling of tracks
[Diagram; labels recovered from the slide]
• Stepping loop stages, each with its own buffer: PreStepStage, DiscreteProcStage, XSecSamplingStage, SteppingActionsStage, PropagationStage, ContinuousProcStage, GeomQueryStage
• Stack-like buffer with one lane per generation: Generation 0 (primaries), Generation 1, …, Generation 8, Generation > 10
• Number of lanes: GeantConfig::fNstackLanes
• Number of lanes flushed into the stepping loop controlled by: GeantConfig::fNmaxBuffSpill
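A minimal sketch of how such a stack-like buffer could behave, assuming one lane per generation, a last lane that absorbs all higher generations, and a flush that empties at most a fixed number of non-empty lanes starting from the highest generation. The class and member names are hypothetical; `nLanes` and `maxSpill` only play the roles the slide assigns to GeantConfig::fNstackLanes and GeantConfig::fNmaxBuffSpill, and the exact GeantV flushing rules may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified track: only what the buffer needs.
struct Track {
  int id;
  unsigned generation;  // 0 = primary, 1 = its secondaries, ...
};

class StackLikeBuffer {
public:
  // nLanes plays the role of fNstackLanes, maxSpill that of fNmaxBuffSpill.
  StackLikeBuffer(std::size_t nLanes, std::size_t maxSpill)
      : fLanes(nLanes), fMaxSpill(maxSpill) {
    assert(nLanes > 0);
  }

  // Store the track in the lane of its generation; the last lane
  // collects every generation beyond the number of lanes.
  void AddTrack(Track *t) {
    std::size_t lane = std::min<std::size_t>(t->generation, fLanes.size() - 1);
    fLanes[lane].push_back(t);
  }

  // Move up to fMaxSpill non-empty lanes into the stepping loop input,
  // highest generation first. Returns the number of tracks moved.
  std::size_t Flush(std::vector<Track *> &steppingInput) {
    std::size_t spilled = 0, moved = 0;
    for (std::size_t i = fLanes.size(); i > 0 && spilled < fMaxSpill; --i) {
      std::vector<Track *> &lane = fLanes[i - 1];
      if (lane.empty()) continue;
      steppingInput.insert(steppingInput.end(), lane.begin(), lane.end());
      moved += lane.size();
      lane.clear();
      ++spilled;
    }
    return moved;
  }

  // Total number of tracks still waiting in the buffer.
  std::size_t Size() const {
    std::size_t n = 0;
    for (const std::vector<Track *> &lane : fLanes) n += lane.size();
    return n;
  }

private:
  std::vector<std::vector<Track *>> fLanes;
  std::size_t fMaxSpill;
};
```

Draining the deepest generations first keeps short-lived secondaries from piling up behind the primaries, which is the memory-control effect the slides attribute to the stack-like buffer.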

Performance
V3 versus V2, memory, scalability, NUMA, tuning knobs

Memory control
• Stack-like control using a special buffer inserted in the stepping loop
• Higher-generation secondaries are flushed with priority
• Very good behavior even for a high number of threads/secondaries

NUMA awareness
• Implemented using hwloc > 1.8
• Enumerates NUMA nodes, cores, and CPUs
• Threads are bound to CPUs
• A propagator uses threads bound to the same NUMA node
• Several propagators can be bound to the same NUMA node
• Compact policy used for threads on the same propagator, scatter policy for distributing propagators over different nodes
• Task data stage buffers, the stack-like buffer, baskets and tracks are bound to memory on the same node as the propagator owning the thread
[Figure annotations: 8.5%, 1%]
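The compact and scatter policies can be illustrated without hwloc by computing the placement on a toy topology. This is only a sketch under stated assumptions: the machine is described by a hypothetical `Topology` with the same number of CPUs on every node, CPUs are assumed to be numbered contiguously per node, and oversubscribed nodes simply wrap around. The real implementation enumerates the hardware through hwloc and performs the actual thread and memory binding, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of the machine (in GeantV this comes from hwloc).
struct Topology {
  std::size_t nNodes;       // number of NUMA nodes
  std::size_t cpusPerNode;  // CPUs per node, assumed uniform
};

// Where one worker thread ends up.
struct Placement {
  std::size_t propagator;
  std::size_t node;
  std::size_t cpu;
};

// Scatter propagators round-robin over NUMA nodes; pack the threads of each
// propagator compactly onto consecutive CPUs of its node.
std::vector<Placement> PlaceThreads(const Topology &topo,
                                    std::size_t nPropagators,
                                    std::size_t threadsPerPropagator) {
  std::vector<Placement> result;
  if (topo.nNodes == 0 || topo.cpusPerNode == 0) return result;
  std::vector<std::size_t> nextCpu(topo.nNodes, 0);  // next free CPU per node
  for (std::size_t p = 0; p < nPropagators; ++p) {
    std::size_t node = p % topo.nNodes;  // scatter policy
    for (std::size_t t = 0; t < threadsPerPropagator; ++t) {
      // compact policy; wraps around if the node is oversubscribed
      std::size_t local = nextCpu[node]++ % topo.cpusPerNode;
      result.push_back({p, node, node * topo.cpusPerNode + local});
    }
  }
  return result;
}
```

With 2 nodes of 4 CPUs each, 3 propagators, and 2 threads per propagator, this sketch puts propagators 0 and 2 on node 0 and propagator 1 on node 1, so more than one propagator can share a node, as the slide notes.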

Scalability
• Not as good as expected
• Less interaction between threads: removed contention points, SOA basketizing, no more basket queue
• Profiling comparison of N versus 2N threads does not reveal obvious hotspots; to be pursued further
• Memory operations are high in the profile; we expect the picture to improve in a more balanced scenario with more (vector) work on the physics side

Performance V3 versus V2
• Relevant improvements in both single-threaded and multi-threaded mode
• Coming mostly from: increased locality (simulation stages), removal of SOA gather/scatter overheads, NUMA awareness
• Yardstick measurements to be redone
[Figure annotations: 40%, 80%]

Tuning knobs
• Far fewer than before: complexity went down by a large factor
• Stack-like buffer parameters: up to 10% influence on performance
• Basket size used by basketizers: negligible impact now, expected to become important for vectorized stages
• NUMA placement parameters (number of propagators, threads per propagator): up to 10% impact on performance
• Many V2 knobs became obsolete

Where we go from here
• Implementation of simulation stages for New Physics
• Abstraction of the interface defining stages, needed for library decoupling
• Mapping of actions performed by processes with stages, Geant::Handler interface for invoking models
• Interfacing/testing vector physics
• Currently all stages basketized by default, no overhead observed so far
• Completion of interfaces for handling user actions
• Porting examples to using them
• … and all the rest…

