Paper Report. Synchronization for Hybrid MPSoC Full-System Simulation. Luis Gabriel Murillo, Juan Eusse, Jovana Jovic, Sergey Yakoushkin, Rainer Leupers and Gerd Ascheid. 49th Design Automation Conference (DAC), 2012. Presenter: Zong-Ze Huang.
Abstract • Full-system simulators are essential to enable early software development and increase MPSoC programming productivity; however, their speed is limited by the speed of the processor models. • Although hybrid processor simulators provide native execution speed and target architecture visibility, their use for modern multi-core OSs and parallel software is restricted by the side effects of dynamic temporal and state decoupling.
Abstract (cont.) • This work analyzes the decoupling effects caused by hybridization and presents a novel synchronization technique which enables full-system hybrid simulation for modern MPSoC software. • Experimental results show speed-ups from 2x to 45x over instruction-accurate simulation while still attaining functional correctness.
What is the Problem • Instruction set simulators (ISSs) are slower than real systems, and increasing their speed is a difficult challenge • Hybrid full-system simulation • Target ISS (TS) • Host-compiled abstract simulator (AS) • Multi-processor system simulation • Temporal decoupling • Hybridization-introduced decoupling
Related work • Simulation frameworks [1][2][3][4][6] • HySim [10] • More abstract processor models • Increasing ISS speed • Estimated time of virtualized functions [7] • Dynamic binary translation [17] • [9], [14], [15], [16] • Synchronizing hybrid processor simulators • This paper: Synchronization for Hybrid MPSoC Full-System Simulation
Traditional simulation workflow • Application: C sources and closed-source libraries • Target compiler • Target binary • Simulator: target ISS and memory
HySim: A Hybrid Simulation Framework • Bridges the gap between two different abstraction levels. • Host mode • One level of abstraction above ISS-IA • Executes directly on the host machine • Disadvantages • Loses visibility of the target architecture • Synchronization problem
Temporal Decoupling (1) • TLM-2 offers temporal decoupling to improve simulation speed. • Concept: • Simulation parts that do not interact frequently with the surrounding environment may run ahead of the current simulation time for a short amount of time. • This avoids unnecessary kernel synchronization points and context switches. (Figure: synchronized simulation time vs. temporally decoupled simulation time)
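To make the saving concrete, here is a minimal sketch that counts kernel synchronizations for one processing element (PE). It is not SystemC or TLM-2 code; the function name `count_kernel_syncs`, the integer time units, and the reset-to-zero policy at each synchronization are assumptions made for illustration only.

```python
def count_kernel_syncs(step_times, quantum):
    """Count how often a PE hands control back to the simulation kernel.

    step_times: durations of the PE's simulated steps (integer time units).
    quantum:    how far the PE may run ahead before it must synchronize.
    """
    syncs = 0
    local_offset = 0  # time this PE has run ahead of the kernel's time
    for dt in step_times:
        local_offset += dt
        if local_offset >= quantum:
            syncs += 1        # synchronization point: kernel time catches up
            local_offset = 0  # simplification: the offset is fully consumed
    if local_offset > 0:
        syncs += 1            # final synchronization for the leftover offset
    return syncs


steps = [1] * 1000                                # 1000 one-unit steps
lockstep = count_kernel_syncs(steps, quantum=1)   # synchronize after every step
decoupled = count_kernel_syncs(steps, quantum=100)
print(lockstep, decoupled)                        # 1000 10
```

With a quantum of 100 time units the PE yields to the kernel 10 times instead of 1000, which is the source of the speed-up; the price is that events from other components are only observed at those synchronization points.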
Temporally Decoupled Timing Entities • TLM-2 defines four timing entities to describe temporal decoupling. • (System) Global Quantum • The time unit on which all PEs synchronize. • (PE) Global Quantum (βi) • The time unit on which a particular PE synchronizes. • Local Quantum (αi) • For each PE, the time remaining from the current SystemC time until the end of the current PE Global Quantum. • Local Time Offset (λi) • The time by which PEi is ahead of the system.
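The relationship between βi, αi and λi can be sketched as a small data structure. This is only an illustration of the definitions above: the class `QuantumState`, its field names, the integer time units, and the assumption that quanta are aligned to multiples of βi are all hypothetical, not taken from the paper or from the TLM-2 API.

```python
from dataclasses import dataclass


@dataclass
class QuantumState:
    """Timing entities of one PE (hypothetical names, integer time units)."""
    global_quantum: int    # beta_i: time unit on which this PE synchronizes
    local_offset: int = 0  # lambda_i: time this PE is ahead of the system

    def local_quantum(self, now):
        """alpha_i: time left from `now` until the current PE quantum ends.

        Assumes quanta are aligned to multiples of beta_i.
        """
        return self.global_quantum - (now % self.global_quantum)

    def needs_sync(self, now):
        """The PE must synchronize once it has run ahead to the quantum end."""
        return self.local_offset >= self.local_quantum(now)


pe = QuantumState(global_quantum=100, local_offset=50)
print(pe.local_quantum(now=230))  # 70: the current quantum ends at time 300
print(pe.needs_sync(now=230))     # False: 50 < 70
pe.local_offset = 70
print(pe.needs_sync(now=230))     # True: the PE has reached the quantum end
```

The quantity αi − λi is then the time the PE may still advance before its next synchronization.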
Hybridization-Introduced Decoupling • Concept: • Host-compiled execution cannot directly affect the simulated time. • From the simulator’s perspective, a virtualized function executes in zero time. • Software performance estimation techniques provide timing values Ƭ for the functions executed natively. • As a result, a hybrid ISS is temporally decoupled from the rest of the system.
Suspension Quantum • A suspension quantum is created dynamically upon the execution of a virtualized function. • Advantage: • Avoids unnecessary kernel synchronizations • Disadvantage: • Interrupts may be lost, causing the system to behave incorrectly.
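A minimal sketch of the idea follows, assuming the estimated time Ƭ of a virtualized function is charged to the PE as a single suspension quantum. The class `HybridPE`, its methods, and the per-instruction synchronization in ISS mode are assumptions for illustration, not the interface of HySim or of any real simulator.

```python
class HybridPE:
    """Toy processing element with an ISS mode and a native (virtualized) mode."""

    def __init__(self):
        self.sim_time = 0      # simulated time as seen by this PE
        self.kernel_syncs = 0  # how often the PE yielded to the kernel

    def run_on_iss(self, n_instructions, cycle_time=1):
        # Instruction-accurate mode: time advances with every instruction.
        for _ in range(n_instructions):
            self.sim_time += cycle_time
            self.kernel_syncs += 1

    def run_virtualized(self, native_fn, estimated_time):
        # Native execution takes zero simulated time ...
        result = native_fn()
        # ... so the estimate is charged afterwards as one suspension
        # quantum: the PE is suspended once for the whole duration.
        self.sim_time += estimated_time
        self.kernel_syncs += 1
        return result


pe = HybridPE()
pe.run_on_iss(10)
value = pe.run_virtualized(lambda: sum(range(100)), estimated_time=500)
print(value, pe.sim_time, pe.kernel_syncs)  # 4950 510 11
```

The 500 time units of the native function cost a single synchronization. The sketch also shows the drawback named above: nothing can reach the PE while it is suspended, so an interrupt raised during those 500 units would go unnoticed.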
Breaking the Suspension Quantum Step 1. The hybrid ISS detects an incoming interrupt. Step 2. The processor is woken up. Step 3. The PC value is associated with the remaining suspension quantum. Step 4. A breakpoint-like mechanism is activated on the saved PC in order to restore the remaining suspension time.
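The four steps above can be sketched as follows. The class `SuspendedPE`, the dictionary keyed by PC, and the `interrupt_at` parameter are hypothetical, and restoring the remaining time by adding it directly to the simulated time is a simplification: in a real simulator the PE would be suspended again and could be interrupted once more.

```python
class SuspendedPE:
    """Toy model of a suspension quantum that an interrupt can break."""

    def __init__(self):
        self.sim_time = 0
        self.pending = {}  # saved PC -> remaining suspension time

    def suspend(self, pc, quantum, interrupt_at=None):
        """Suspend for `quantum`; an interrupt may arrive `interrupt_at` in."""
        if interrupt_at is None or interrupt_at >= quantum:
            self.sim_time += quantum  # undisturbed suspension
            return 0
        # Steps 1 and 2: the interrupt is detected and the PE wakes up early.
        self.sim_time += interrupt_at
        # Step 3: associate the PC with the remaining suspension quantum.
        remaining = quantum - interrupt_at
        self.pending[pc] = remaining
        return remaining

    def on_fetch(self, pc):
        """Step 4: breakpoint-like check when execution reaches `pc`."""
        remaining = self.pending.pop(pc, 0)
        self.sim_time += remaining  # restore the remaining suspension time
        return remaining


pe = SuspendedPE()
pe.suspend(pc=0x1000, quantum=500, interrupt_at=120)  # woken up at time 120
pe.sim_time += 40                                     # interrupt handler runs
pe.on_fetch(0x1000)                                   # back at the saved PC
print(pe.sim_time)                                    # 540
```

The total of 540 is the 500 units of the original suspension quantum plus the 40 units spent in the handler, so no estimated time is lost when the quantum is broken.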
local time exceeds the next synchronizes time • Mix hybridization-introduced decoupling and traditional temporal decoupling in the same PE. • Suspension quantum are used to recompute decoupling parameters. • Update local time exceeds the end of the next βi (i.e. > α-λ) • ’= i + (– (α-λ) ) • ti’ = ti + • Update local time exceeds the end of the next βi (i.e. α-λ) • ti’ = ti +
HySim virtualization chain • In a full-system simulation, virtualizable functions are not allowed to: • Perform software synchronization or access shared memory without restriction. • Memory accesses in AS mode will interact with peripherals and accelerators.
Test Cases 1 • Simulator • Simics • Tensilica Diamond and Xtensa ISSs • Host machine • 64-bit AMD Phenom Quad-Core • 8 GB of memory • Fedora Core 5 • Scenario 1: 3DES on a single-core system • Single-core platform • Tensilica Diamond DC_B_570T • Scenario 2: MJPEG on a single-core system • Single-core platform • Tensilica Diamond DC_B_570T • Multimedia acceleration • LCD controller
Test Cases 2 • Scenario 3: Circular-FFT on a multi-core system • Multi-core platform • Three Xtensa XRC_D2MR cores • Scenario 4: OFDM (Orthogonal Frequency Division Multiplexing) transceiver system • Multi-core platform • Three Xtensa XRC_D2MR cores
Conclusion • Presented an approach to synchronize hybrid processor simulators within full-system simulation. • Defined a specialized temporal decoupling mechanism. • Identified functions that must be avoided in native execution in order to ensure the correctness of parallel applications. • Future work • Combine this hybrid simulation with other advanced simulation techniques. • My comment • A novel idea to improve simulation speed.