A Process Splitting Transformation for Kahn Process Networks

Sjoerd Meijer

Contents
• Background
• Problem Definition and Project Goal
• Splitting
• Producer Selection
• Inter-process Communication
• Consumer Selection
• Implementation
• Conclusion and Further Work

Background
• Parallelization is not new
• Forking a sequential application
Classic example, matrix-matrix multiplication:
  :
  do i, n
    do j, n
      a(i,j) = …
    end do
  end do
  :
• Master processor executes code up to the parallel loop
• Execute parallel iterations on other processors
• Synchronize at the end of the parallel loop
[Figure: four CPUs, each with its own cache, connected to main memory over a memory bus]

Background Applications are specified as parallel tasks: • Example JPEG decoder:

Problem Definition
[Figure: Cakesim (eCos+CCP) profile for the JPEG KPN]

Problem Definition
Automatic procedure for process splitting in KPNs to take advantage of multiprocessor architectures.
[Figure: original process network and the split-up network]

Splitting – The Concept
Required:
• Determine the computationally expensive process: profiling, or pragmas + static support
• Partitioning of the Iteration Space (IS)
• N = number of times a process has to be split
• L = loop-nest level at which the splitting takes place
To do:
• Duplication of code and FIFOs
• Adding control for token production and consumption

Techniques used:
Data dependence analysis:
• Data flow analysis
• Array data flow analysis
Tree transformations:
• Adding/removing/duplicating tree statements
Compiler framework:
• GCC

Solution for KPNs
Four-step approach:
COMPUTATION:
1. Partitioning (computation)
COMMUNICATION:
2. Interprocess communication
3. Token production
4. Token consumption
[Figure: network P1, P2, P3 before splitting; P1, P21, P22, P3 after splitting P2]

Partitioning of the original process computation over the resulting split-up processes
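
As an illustration of the partitioning step, the C sketch below assigns the iterations of a two-level loop nest to N split-up processes by taking the loop counter at level L modulo N. The function names and the two-level nest are assumptions made for this example; the slides only state that the iteration space is partitioned for a given N and L.

```c
#include <assert.h>

/* Hypothetical helper: index of the split-up process that executes the
 * iteration described by `counters`, when splitting at loop level `level`
 * into `n_split` processes (cyclic, modulo-based assignment). */
int owner_of_iteration(const int counters[], int depth, int level, int n_split)
{
    assert(level >= 0 && level < depth && n_split > 0);
    return counters[level] % n_split;
}

/* Count how many iterations of a rows x cols loop nest each split-up
 * process executes. counts[] must have room for n_split entries. */
void count_iterations(int rows, int cols, int level, int n_split, int counts[])
{
    for (int p = 0; p < n_split; p++)
        counts[p] = 0;

    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            int counters[2] = { i, j };
            counts[owner_of_iteration(counters, 2, level, n_split)]++;
        }
    }
}
```

Splitting at the outer level (L = 0) hands whole rows to a process, while splitting at the inner level (L = 1) interleaves the work within each row. The load is balanced only when the loop bound at the chosen level is a multiple of N.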

Interprocess Communication
  :
  for(int i=1; i<10; i++)
    a[i] = a[i-1] + i; //s1
  :
• Interprocess communication is given by the loop-carried dependency: a[i-1] at iteration i is produced at iteration i-1.
• If execution of statement s1 is distributed over different processes, token a[i-1] needs to be communicated:
  :                              :
  for(int i=1; i<10; i++){       for(int i=1; i<10; i++){
    if(i%2==0)                     if(i%2==1)
      a[i] = a[i-1] + i;             a[i] = a[i-1] + i;
  :                              :
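
To make the communication requirement concrete, here is a minimal single-threaded C sketch that simulates the two split-up loops exchanging a[i-1] over single-slot channels. The channel type, the function names, and the sequential interleaving of the two processes are assumptions for illustration only; a real KPN would run the processes concurrently on blocking FIFOs.

```c
#include <assert.h>

#define N_ITER 10

/* Sequential reference: statement s1 executed by a single process. */
void run_sequential(int a[N_ITER])
{
    a[0] = 0;
    for (int i = 1; i < N_ITER; i++)
        a[i] = a[i - 1] + i;
}

/* Hypothetical single-slot channel standing in for a KPN FIFO. */
typedef struct { int value; int full; } channel_t;

static void chan_put(channel_t *c, int v)
{
    assert(!c->full);
    c->value = v;
    c->full = 1;
}

static int chan_get(channel_t *c)
{
    assert(c->full);
    c->full = 0;
    return c->value;
}

/* Split version: process p executes the iterations with i % 2 == p.
 * Every a[i-1] is produced by the other process, so it arrives as a token.
 * Returns the number of tokens exchanged between the two processes. */
int run_split(int a[N_ITER])
{
    channel_t to_proc[2] = { {0, 0}, {0, 0} };
    int tokens_sent = 0;

    a[0] = 0;
    chan_put(&to_proc[1 % 2], a[0]);  /* initial value for iteration 1 */

    for (int i = 1; i < N_ITER; i++) {
        int p = i % 2;                    /* process that owns iteration i */
        int prev = chan_get(&to_proc[p]); /* a[i-1], produced elsewhere */
        a[i] = prev + i;
        if (i + 1 < N_ITER) {             /* forward a[i] to the owner of i+1 */
            chan_put(&to_proc[(i + 1) % 2], a[i]);
            tokens_sent++;
        }
    }
    return tokens_sent;
}
```

With this modulo split every iteration depends on a value computed by the other process, so one token crosses between the processes per iteration. This is the kind of interprocess communication that the data dependence information has to expose.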

Token Production & Consumption
Problems:
• PI. At the producer side: where to send the tokens to?
• PII. At the consumer side: from where to consume tokens?
Solutions PI:
• Producer filters the tokens (static solution)
• Producer sends all tokens to all split-up processes (runtime solution)
Solutions PII:
• The consumer knows by itself when to switch (static solution)
• Each producer sends a signal to the consumer when to switch to reading data from a different FIFO (runtime solution)
[Figure: P2 split into P2’ and P2’’; which of them does P1 send to, and which of them does P3 read from?]

Token Production – runtime vs. static
[Figure: originally P1 sends 100 tokens to P2. Static solution: P1 sends 50 tokens to P2’ and 50 tokens to P2’’. Runtime solution: P1 sends 100 tokens to both P2’ and P2’’.]
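
The difference in channel traffic between the two producer-side solutions can be sketched in C as follows. The `traffic_t` type, the function names, and the modulo ownership rule are assumptions for this example; the slides do not say which tokens belong to which split-up process.

```c
#include <assert.h>

#define N_PARTS 2

typedef struct {
    int sent[N_PARTS];      /* tokens written to the FIFO of each split-up process */
    int processed[N_PARTS]; /* tokens each split-up process actually uses */
} traffic_t;

/* Hypothetical ownership rule: token t belongs to process t % N_PARTS. */
static int owner(int t) { return t % N_PARTS; }

/* Static solution: the producer filters, each token goes to its owner only. */
traffic_t produce_static(int n_tokens)
{
    traffic_t tr = { {0}, {0} };
    assert(n_tokens >= 0);
    for (int t = 0; t < n_tokens; t++) {
        tr.sent[owner(t)]++;
        tr.processed[owner(t)]++;
    }
    return tr;
}

/* Runtime solution: the producer sends every token to every split-up
 * process; each process discards the tokens it does not own. */
traffic_t produce_runtime(int n_tokens)
{
    traffic_t tr = { {0}, {0} };
    assert(n_tokens >= 0);
    for (int t = 0; t < n_tokens; t++) {
        for (int p = 0; p < N_PARTS; p++) {
            tr.sent[p]++;
            if (owner(t) == p)
                tr.processed[p]++;
        }
    }
    return tr;
}
```

For 100 tokens the static variant writes 50 tokens to each FIFO, while the runtime variant writes 100 to each and half of them are discarded on arrival. The runtime variant pays this extra communication so that the producer needs no knowledge of the partitioning.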

Token Consumption – runtime vs. static
[Figure: originally P2 sends 100 tokens to P3. Static solution: P2’ and P2’’ each send 50 tokens and the switch is known internally by the consumer. Runtime solution: P2’ and P2’’ each send 50 tagged tokens and the switch is communicated over the channels to the consumer.]
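
The two consumer-side solutions can be sketched in C as below. The token layout (a value plus a `switch_after` flag), the block-wise distribution with a fixed chunk size, and all function names are assumptions for this illustration; the slides only state that the switch is either known to the consumer or communicated over the channels.

```c
#include <assert.h>

#define MAX_TOKENS 64

/* Hypothetical tagged token: the tag tells the consumer to switch FIFOs. */
typedef struct { int value; int switch_after; } token_t;

typedef struct { token_t buf[MAX_TOKENS]; int head; int tail; } fifo_t;

void fifo_init(fifo_t *f) { f->head = 0; f->tail = 0; }

static void fifo_put(fifo_t *f, token_t t)
{
    assert(f->tail < MAX_TOKENS);
    f->buf[f->tail++] = t;
}

static token_t fifo_get(fifo_t *f)
{
    assert(f->head < f->tail);
    return f->buf[f->head++];
}

/* Two split-up producers: chunks of `chunk` tokens alternate between them.
 * The last token of every chunk carries the switch tag. */
void produce_tagged(fifo_t out[2], int n_tokens, int chunk)
{
    for (int i = 0; i < n_tokens; i++) {
        token_t t;
        t.value = i;
        t.switch_after = ((i + 1) % chunk == 0) || (i + 1 == n_tokens);
        fifo_put(&out[(i / chunk) % 2], t);
    }
}

/* Runtime solution: the consumer switches FIFOs when a tag tells it to. */
void consume_runtime(fifo_t in[2], int n_tokens, int result[])
{
    int cur = 0;
    for (int k = 0; k < n_tokens; k++) {
        token_t t = fifo_get(&in[cur]);
        result[k] = t.value;
        if (t.switch_after)
            cur = 1 - cur;
    }
}

/* Static solution: the consumer knows the chunk size and ignores the tags. */
void consume_static(fifo_t in[2], int n_tokens, int chunk, int result[])
{
    for (int k = 0; k < n_tokens; k++)
        result[k] = fifo_get(&in[(k / chunk) % 2]).value;
}
```

Both consumers restore the original token order. The static one needs the partitioning compiled in, whereas the runtime one only needs the tags and therefore also works when the switch points are not known at compile time.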

Token Production & Consumption – static solution
• Establish the data dependencies over the processes. How?
• Data Dependence function (DD) and its inverse DD^-1:
  DD^-1 : Producer -> Consumer
  DD : Consumer -> Producer
• However, DD cannot always be determined at compile time

Token Production – static solution without DD^-1
Observation: the loop counters on the producer side are equal to the loop counters on the consumer side

Token Production – static solution without DD^-1
DD^-1(w1, w2, w3) = (w4, w5, w6)
P2(DD^-1(w1, w2, w3)) = w5
w5 = w2  =>  P2(DD^-1(w1, w2, w3)) % 2 = w2 % 2
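
The consequence of this observation is that the producer can select the target process from its own loop counter. The C sketch below checks this for a three-level loop nest split at the second level, under the assumption (taken from the observation, not computed) that DD^-1 maps each producer iteration to the consumer iteration with the same counters. All function names are hypothetical.

```c
#include <assert.h>

/* Consumer side: split-up process that executes consumer iteration
 * (w4, w5, w6) when splitting in two at the second loop level. */
static int consumer_part(int w4, int w5, int w6)
{
    (void)w4; (void)w6;
    return w5 % 2;
}

/* Producer side: target process chosen from the producer's own counters,
 * without evaluating DD^-1. */
static int producer_target(int w1, int w2, int w3)
{
    (void)w1; (void)w3;
    return w2 % 2;
}

/* Route one token per producer iteration and count the tokens that would
 * arrive at the wrong split-up process. sent[] receives the tokens per target. */
int count_misrouted(int n1, int n2, int n3, int sent[2])
{
    int misrouted = 0;
    assert(n1 >= 0 && n2 >= 0 && n3 >= 0);
    sent[0] = sent[1] = 0;

    for (int w1 = 0; w1 < n1; w1++)
        for (int w2 = 0; w2 < n2; w2++)
            for (int w3 = 0; w3 < n3; w3++) {
                /* Assumption: DD^-1 is the identity on the loop counters. */
                int w4 = w1, w5 = w2, w6 = w3;
                int target = producer_target(w1, w2, w3);
                sent[target]++;
                if (target != consumer_part(w4, w5, w6))
                    misrouted++;
            }
    return misrouted;
}
```

When the counters do coincide, no token is misrouted and the modulo condition on w2 alone is sufficient. When they do not, the check fails and DD^-1 (or the runtime solution) is needed.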

Token Consumption – static solution without DD Similar to production of tokens.

Runtime solution:

Multiple split-up processes
[Figure: original network P1, P2, P3, P4; P2 and P3 each split up into 3 processes (P2’, P2’’, P2’’’ and P3’, P3’’, P3’’’)]

Copy-nodes insertion
[Figure: original network P1, P2, P3, P4; copy-nodes are inserted before the splitting transformation, which then yields P2’, P2’’ and P3’, P3’’ between P1 and P4]

Copy-nodes
• Pros:
  • Simple network structure
  • Apply the four-step splitting approach
• Cons:
  • More processes => more communication (can be improved) => overhead

Implementation
• Technique used:
  • Runtime solution (general)
• Framework used:
  • GCC (GNU Compiler Collection)
• Advantages of GCC:
  • Availability of data dependence information
  • Supported by a large community
  • We are in contact with Sebastian Pop, maintainer and developer of various compiler phases, e.g. data dependence analysis, control flow, and induction variable analysis.

Implementation
• Data dependence analysis (already present):
  • scalars
  • arrays
• The Data Dependence Graph (DDG) is present only at the RTL level, not at the tree-SSA level
• Two new passes:
  • Create DDG
  • Splitting

Implementation
• Splitting pragma
• Data dependence graph
• Class definition reconstruction
• Function cloning
• Modulo condition insertion

Implementation
To do:
• Copying of class definition
• Copying of class member functions
• Reconstruction of the network structure:
  • FIFO
  • Network definition

Implementation
Final result:
• Data dependence information tells whether splitting is legal (no IPC)
• Semi-automatic transformation/case study

Results
[Figure: original KPN; KPN with copy nodes; processes split up into two]
Improvement of 21%

Future work: YAPI and CCP
• Difference in active and passive connectors:
  • Active connectors in YAPI are modeled as a thread
  • Passive connectors do not run in a separate thread
• More connectors in CCP:
[Figure: Merge, Fork, and Mesh connectors in the split-up network P1, P2’, P2’’, P3’, P3’’, P4]

Future Work
• Connect GCC with SCOTTY:
  • GCC branch
  • Main branch: may not accept the patch
  • GOMP branch targets parallelization
+ data dependence + network topology

Conclusion
• Only split up the most computationally expensive processes
• The transformation is profitable
