Building Fake Body Parts: Digital Mockups
Frank Vahid, Univ. of California, Riverside
Support provided by NSF, SRC, and CareFusion
Building fake body parts
• How do we test medical equipment software?
http://www.nhlbi.nih.gov/
Simulation: Slow/Inaccurate
• Weibel lung complexity:
  • 4 generations: 32 ODEs
  • 6 generations: 128 ODEs
  • 8 generations: 512 ODEs
  • 10 generations: 2048 ODEs
• Accurate simulation is slow: 2-3 minutes to simulate one breath accurately
• Running in real time means decreasing accuracy
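The four data points on this slide all fit the pattern "ODE count = 2^(generations + 1)". The sketch below encodes that pattern. The closed form is inferred from the slide's numbers, not stated by the author, and the function name `weibel_ode_count` is hypothetical.

```c
/* Number of ODEs for a Weibel lung model with the given number of generations.
   Assumption: count = 2^(generations + 1), which reproduces the slide's
   four data points (4 -> 32, 6 -> 128, 8 -> 512, 10 -> 2048). */
unsigned long weibel_ode_count(unsigned generations)
{
    return 1UL << (generations + 1);
}
```

If the pattern holds, each additional pair of generations quadruples the number of equations, which is why accurate simulation falls behind real time so quickly.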
Mockups
• Physical mockup: the device (transducers, processing core) interacts with physical phenomena
• Digital mockup: physical phenomena are disconnected; intercepted transducer packets go over digital communication to transducer models and an environment model
• Video: http://www.youtube.com/watch?feature=player_embedded&v=rb0ik1HopBk
[Figure: device connected to a physical mockup vs. device connected to a digital mockup]
• How do we run this in real time?
Physical models are inherently parallel
[Figure: ODE dependency graph, with nodes such as V[1],F[1]; V[2],F[2]; V[7],F[7]]
GPUs
• We tried, and failed
• A GPU research group did too
• (Results later)
FPGAs: Software as circuits (parallel)
• C code for FIR filter, run on a processor:
  for (i = 0; i < 128; i++) y += c[i] * x[i];
  • 1000s of instructions
  • Several thousand cycles
• Circuit for FIR filter, on an FPGA:
  [Figure: a row of multipliers (*) feeding adders (+)]
  • ~7 cycles (though slower clock)
  • Speedup > 10x-100x
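The contrast between the loop and the circuit can be sketched in C. The first function is the sequential multiply-accumulate from the slide. The second mimics the circuit: every product is formed independently, then the products are summed pairwise, level by level. Reading the slide's "~7 cycles" as the depth of a 128-input adder tree (log2 128 = 7) is an assumption, as are the function names.

```c
#include <stddef.h>

#define TAPS 128

/* Sequential version: one multiply-accumulate per loop iteration. */
double fir_sequential(const double c[TAPS], const double x[TAPS])
{
    double y = 0.0;
    for (size_t i = 0; i < TAPS; i++)
        y += c[i] * x[i];
    return y;
}

/* Circuit-style version: all products first, then pairwise adder levels.
   *levels receives the number of adder levels, i.e. the depth of the tree. */
double fir_adder_tree(const double c[TAPS], const double x[TAPS], int *levels)
{
    double p[TAPS];
    for (size_t i = 0; i < TAPS; i++)      /* in hardware: multipliers side by side */
        p[i] = c[i] * x[i];

    *levels = 0;
    for (size_t n = TAPS; n > 1; n /= 2) { /* each pass models one level of adders */
        for (size_t i = 0; i < n / 2; i++)
            p[i] = p[2 * i] + p[2 * i + 1];
        (*levels)++;
    }
    return p[0];
}
```

On a processor both functions still execute serially. The point of the second form is that nothing inside one level depends on anything else in that level, so a circuit can evaluate a whole level at once.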
FPGAs "101" (A Quick Intro)
• An FPGA is a fabric of lookup tables (LUTs) connected through switch matrices (SMs)
• LUT: a small memory (here 4x2) addressed by inputs a1, a0 and producing outputs d1, d0
• 2x2 switch matrix: configuration bits select which inputs are routed to which outputs
[Figure: LUT contents and switch-matrix settings implementing example functions F and G of inputs a, b, c]
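A 4x2 LUT can be modeled as a four-entry memory of 2-bit words. The sketch below shows how loading different contents makes the same hardware compute different logic functions. The half-adder contents are a hypothetical example chosen for illustration, not the F and G functions from the slide's figure.

```c
#include <stdint.h>

/* A 4x2 LUT: four 2-bit words addressed by the inputs (a1, a0).
   Bit 1 of each word drives output d1, bit 0 drives output d0. */
typedef struct { uint8_t word[4]; } lut4x2;

/* Read the 2-bit word selected by the two input bits. */
uint8_t lut_read(const lut4x2 *lut, unsigned a1, unsigned a0)
{
    unsigned addr = ((a1 & 1u) << 1) | (a0 & 1u);
    return lut->word[addr] & 0x3u;
}

/* Hypothetical configuration: d1 = a1 AND a0, d0 = a1 XOR a0 (a half adder). */
static const lut4x2 HALF_ADDER = { { 0x0, 0x1, 0x1, 0x2 } };
```

Reprogramming the FPGA amounts to rewriting such tables and the switch-matrix settings; no gates are physically changed.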
Differential Equation Processing Element (DEPE)
• General PE
• Differential equations generally can't be solved exactly
• Use iterative approximation (Euler, RK4)
• Computes equation solutions at a given timestep (e.g., 0.1 ms timesteps)
[Figure: FPGA digital mockup containing the DEPE, connected through an interface to the device under test]
Huang, Vahid, Givargis. A Custom FPGA Processor for Physical Model Ordinary Differential Equation Solving. Embedded Systems Letters, Dec. 2011.
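The two iterative methods named on this slide can be sketched as single-step functions for a scalar ODE dy/dt = f(t, y). This is a generic textbook formulation in software, not the DEPE's datapath. The test equation `decay` and the helper `integrate` are hypothetical names added for illustration.

```c
/* Right-hand side of dy/dt = f(t, y). */
typedef double (*ode_rhs)(double t, double y);

/* Forward Euler: follow the slope at the start of the step. */
double euler_step(ode_rhs f, double t, double y, double h)
{
    return y + h * f(t, y);
}

/* Classical fourth-order Runge-Kutta: weighted average of four slopes. */
double rk4_step(ode_rhs f, double t, double y, double h)
{
    double k1 = f(t, y);
    double k2 = f(t + h / 2, y + h * k1 / 2);
    double k3 = f(t + h / 2, y + h * k2 / 2);
    double k4 = f(t + h, y + h * k3);
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6;
}

/* Hypothetical test equation: dy/dt = -y, exact solution y(t) = y(0) * exp(-t). */
double decay(double t, double y)
{
    (void)t;
    return -y;
}

/* Advance from t = 0 over `steps` steps of size h with the chosen stepper. */
double integrate(double (*step)(ode_rhs, double, double, double),
                 ode_rhs f, double y0, double h, int steps)
{
    double y = y0;
    for (int i = 0; i < steps; i++)
        y = step(f, i * h, y, h);
    return y;
}
```

For example, `integrate(rk4_step, decay, 1.0, 0.01, 100)` approximates exp(-1) far more closely than the Euler version with the same step, at the cost of four evaluations of f per step instead of one.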
Single DEPE
• CPU(1), CPU(4): Pentium IV, 3.0 GHz
• DEPE: Xilinx Virtex6-240T
• MicroBlaze: 2000-4000 LUTs
Homogeneous network of general PEs
• Map ODEs to a homogeneous PE network
• Flow: ODE dependency graph → synthesis tool (scheduling) → FPGA digital mockup with PE1, PE2, PE3, ... (100s of PEs)
[Figure: ODE dependency graph nodes V[1],F[1]; V[2],F[2]; V[7],F[7] mapped onto PE1, PE2, PE3 behind the digital mockup interface]
Huang, Vahid, Givargis. Synthesis of networks of custom processing elements for real-time physical system emulation. Transactions on Design Automation of Electronic Systems (TODAES), 2012. To appear (Dec. 2012).
Homogeneous network of general PEs
[Figure: FPGA digital mockup]
Homogeneous network of general PEs
• ODE mapping via simulated annealing
[Figure: mapping after 10K iterations and after 150K iterations]
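A minimal sketch of simulated annealing applied to ODE-to-PE mapping follows. Everything specific in it is an assumption made for illustration: the problem size, the cost function (ODEs on the busiest PE plus dependencies that cross PEs), the cooling schedule, and all names. The slides do not give the synthesis tool's actual cost model.

```c
#include <stdint.h>
#include <string.h>

#define N_ODE 8   /* hypothetical problem size */
#define N_PE  4

typedef struct { int from, to; } dependency;

/* Small LCG so the sketch is deterministic; returns a value in [0, 32767]. */
static uint32_t rng_state = 12345u;
static unsigned rng_next(void)
{
    rng_state = rng_state * 1103515245u + 12345u;
    return (rng_state >> 16) & 0x7fffu;
}

/* exp(-x) for x >= 0, approximated as (1 + x/256)^-256 to avoid needing libm. */
static double exp_neg(double x)
{
    double b = 1.0 + x / 256.0;
    for (int i = 0; i < 8; i++)
        b *= b;
    return 1.0 / b;
}

/* Assumed cost: ODE count on the busiest PE + dependencies crossing PEs. */
int mapping_cost(const int pe_of[N_ODE], const dependency *dep, int n_dep)
{
    int load[N_PE] = {0}, max_load = 0, cut = 0;
    for (int i = 0; i < N_ODE; i++)
        if (++load[pe_of[i]] > max_load)
            max_load = load[pe_of[i]];
    for (int e = 0; e < n_dep; e++)
        if (pe_of[dep[e].from] != pe_of[dep[e].to])
            cut++;
    return max_load + cut;
}

/* Anneal pe_of in place; leaves the best mapping found in pe_of, returns its cost. */
int anneal(int pe_of[N_ODE], const dependency *dep, int n_dep, int iterations)
{
    int best_map[N_ODE];
    int cur = mapping_cost(pe_of, dep, n_dep), best = cur;
    double temperature = 2.0;
    memcpy(best_map, pe_of, sizeof best_map);

    for (int it = 0; it < iterations; it++) {
        int ode = (int)(rng_next() % N_ODE), old_pe = pe_of[ode];
        pe_of[ode] = (int)(rng_next() % N_PE);   /* propose: move one ODE */
        int delta = mapping_cost(pe_of, dep, n_dep) - cur;
        if (delta <= 0 || rng_next() / 32768.0 < exp_neg(delta / temperature)) {
            cur += delta;                        /* accept the move */
            if (cur < best) {
                best = cur;
                memcpy(best_map, pe_of, sizeof best_map);
            }
        } else {
            pe_of[ode] = old_pe;                 /* reject: undo the move */
        }
        temperature *= 0.999;                    /* geometric cooling */
    }
    memcpy(pe_of, best_map, sizeof best_map);
    return best;
}
```

Early on, the high temperature lets the search accept worse mappings and escape poor local arrangements. As the temperature falls it behaves like greedy improvement, which is consistent with the slide showing results at two different iteration counts.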
Homogeneous network of general PEs – FPGA Usage
• 150K LUTs available on Virtex6-240T
• Demo: http://www.youtube.com/watch?v=ThUKVhqoA3Q
Custom Processing Element
• Custom PE
• Custom datapath to solve a specific type of equation:
  V' = F1 - F2
  F' = P1 - P2 - (F*CR)*CL
• One custom PE for each ODE type
[Figure: custom PE datapath inside the FPGA digital mockup: inputs with input select, data RAM (address, write enable), constant ROM, SUB and MUL units, controller, output, interface]
Huang, Vahid, Givargis. Synthesis of networks of custom processing elements for real-time physical system emulation. Transactions on Design Automation of Electronic Systems (TODAES), 2012. To appear (Dec. 2012).
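The two equation types on this slide translate directly into code. The sketch below evaluates each right-hand side exactly as written and applies one Euler update. The slide does not spell out what the symbols mean, so they are treated as plain doubles, and the function names are hypothetical.

```c
/* V' = F1 - F2 */
double volume_rate(double F1, double F2)
{
    return F1 - F2;
}

/* F' = P1 - P2 - (F*CR)*CL, evaluated with the grouping shown on the slide. */
double flow_rate(double P1, double P2, double F, double CR, double CL)
{
    return P1 - P2 - (F * CR) * CL;
}

/* One Euler update of a state value given its rate and the step size h. */
double euler_update(double value, double rate, double h)
{
    return value + h * rate;
}
```

The operation mix matches the units named in the figure: the first equation needs a single subtract, while the second needs two multiplies and two subtracts. A datapath built for exactly these operations avoids the instruction fetch and decode overhead of a general PE.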
Networks of Heterogeneous Processing Elements
• General PE: slow, flexible (can solve any type of ODE)
• Custom PE: fast, inflexible (solves only one type of ODE)
• Multi-type PE: combines multiple ODE types into a single custom PE
• Huge solution space:
  • Which types of PEs to choose?
  • How many PEs to allocate?
  • How to bind ODEs to PEs?
[Figure: FPGA digital mockup with interface]
Huang, Miller, Vahid, Givargis. Synthesis of Heterogeneous Processing Elements for Physical System Emulation. CODES+ISSS 2012, Oct. 2012.
Automatic allocation and binding
[Figure (flowchart): initial random allocation; simulated annealing ODE-to-PE mapper; cycles of each PE; better solution? (Y/N); best solution; PE allocator; new PE allocation]
Network of PEs vs. GPU and PC
Speedup vs. real-time:
• PC(1): 0.76x
• PC(4): 3.07x
• GPU: 1.63x
• HLS: 3.23x
• General PE: 4.94x
• Custom PE: 6.1x
• Hetero PE: 34.5x
[Chart labels: 1430, 1490, 1522, 1184]
Network of general/custom/heterogeneous PEs vs. HLS (regularity extraction)
• Performance (ms): time to emulate 1000 ms, using Euler with a 0.01 ms step
• Size: equivalent LUTs
• Heterogeneous PE compared with each alternative, as (speed, size):
  • vs. HLS: (10x, 1.1x)
  • vs. general PE: (7x, 0.85x)
  • vs. custom PE: (6x, 1.35x)
Speedup / dollar
• Heterogeneous PEs: 3x better than PC(4), 4.5x better than GPU
• FPGA: easier to build custom interfaces
• Platform costs:
  • CPU (I7-950 + Intel X58 board): $480
  • GPU (GTX460 + I3-540 + H55 board): $380
  • FPGA (Xilinx Virtex6 240T-2 board): $1800
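The "3x" and "4.5x" figures can be reproduced from numbers elsewhere in the deck. The sketch pairs the speedups from the "Network of PEs vs. GPU and PC" slide with the platform costs listed here. That pairing is an assumption about how the author computed the ratios, and the type and function names are hypothetical.

```c
typedef struct {
    const char *name;
    double speedup;   /* speedup vs. real-time, from the comparison slide */
    double cost_usd;  /* platform cost, from this slide */
} platform;

static const platform PC4    = { "PC(4)",                    3.07,  480.0 };
static const platform GPU    = { "GPU",                      1.63,  380.0 };
static const platform HETERO = { "Heterogeneous PEs (FPGA)", 34.5, 1800.0 };

/* Speedup obtained per dollar of hardware. */
double speedup_per_dollar(const platform *p)
{
    return p->speedup / p->cost_usd;
}

/* How many times better platform a is than platform b, per dollar. */
double advantage(const platform *a, const platform *b)
{
    return speedup_per_dollar(a) / speedup_per_dollar(b);
}
```

With these inputs, `advantage(&HETERO, &PC4)` is about 3.0 and `advantage(&HETERO, &GPU)` is about 4.5, which agrees with the slide.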
Current: Embedding-based placement of networks
• Most physical models have a regular structure
  • Meshes, trees, grids, etc.
• We can apply theoretical graph embedding techniques to embed models into the FPGA
  • Minimal network dilation
[Figure: example models embedded in the FPGA: heart cells, lungs, neuron mesh]
Embedding-based placement of networks
• Physical model equations (EqP1..EqP7, EqV1..EqV7)
• Map equations to virtual PEs, giving a structured virtual PE graph
• Map virtual PEs to physical PEs via embedding, giving the physical placement
[Figure: placements compared: no placement strategy, simulated annealing placement, embedding placement]
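Dilation, the quality measure named on the previous slide, is the longest distance any single edge of the virtual PE graph is stretched to once its endpoints are placed. The sketch below computes it for a placement on a 2D grid using Manhattan distance. The seven-node binary tree and both placements are hypothetical examples, not the layouts from the figure.

```c
#include <stdlib.h>

typedef struct { int from, to; } edge;
typedef struct { int row, col; } site;

/* Dilation: the largest Manhattan distance between the grid sites
   of any two nodes joined by an edge. */
int dilation(const edge *edges, int n_edges, const site *place)
{
    int worst = 0;
    for (int e = 0; e < n_edges; e++) {
        site a = place[edges[e].from], b = place[edges[e].to];
        int d = abs(a.row - b.row) + abs(a.col - b.col);
        if (d > worst)
            worst = d;
    }
    return worst;
}

/* Hypothetical 7-node binary tree: node 0 is the root. */
static const edge TREE[6] = { {0,1}, {0,2}, {1,3}, {1,4}, {2,5}, {2,6} };

/* Structure-aware placement on a 3x3 grid: every child next to its parent. */
static const site EMBEDDED[7] = { {1,1}, {1,0}, {1,2}, {0,0}, {2,0}, {0,2}, {2,2} };

/* Naive row-major placement of the same nodes. */
static const site ROW_MAJOR[7] = { {0,0}, {0,1}, {0,2}, {1,0}, {1,1}, {1,2}, {2,0} };
```

Here `dilation(TREE, 6, EMBEDDED)` is 1 while `dilation(TREE, 6, ROW_MAJOR)` is 4. Shorter worst-case connections generally mean less routing pressure, which is the motivation for exploiting the model's regular structure.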
Embedding-based placement of networks
[Figure: placement results; some cases are marked "Not routable"]
Work submitted to FPGA'13 (Miller/Vahid/Givargis)
Other projects
• Assistive monitoring
  • www.cs.ucr.edu/~vahid/assistivemonitoring/
  • http://www.youtube.com/watch?feature=player_embedded&v=Sf8tU-78lXs
• Web-based learning
  • "Textbook is dead"
  • pcpp.zyante.com (C++)
• Embedded systems education
  • New programming model, virtual lab
  • Also riosscheduler.org
• Drunk driving (DUI)
  • duicam.org
Contributors
• Chen Huang (UC Riverside, now Amazon)
• Bailey Miller (UC Riverside)
• Prof. Tony Givargis (UC Irvine)
• Ting-Shuo Chou (UC Irvine)
• Others...
https://docs.google.com/file/d/0B7I3PmI9QsJTM2MzY2QyYWQtZjk4Mi00YWE0LTk1NzQtZTUwMTM5ZDA5ZDc5/edit
• Fastest cost-effective execution of physical models
• Real-time (or faster) cyber-physical system testing
• Scientific research
• More apps