
Reconfigurable Computing

Dominique LAVENIER, IRISA / CNRS, Rennes (lavenier@irisa.fr)



  1. Reconfigurable Computing • Dominique LAVENIER, IRISA / CNRS, Rennes • lavenier@irisa.fr

  2. Reconfigurable Computing Idea (1) • Microprocessor: programmable, slow • ASIC: not programmable, fast • FPGA [Figure labels: program, architecture]

  3. Reconfigurable Computing Idea (2) • Y(i) = Σ_k X(i-k) W(k) • Von Neumann model: sequence of pre-defined instructions, with memory • Reconfigurable model: assembly of boolean functions, with memory
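Reading the slide's formula as a convolution sum over k (an assumption about the intended meaning), the "sequence of pre-defined instructions" side can be sketched as an ordinary loop nest that issues one multiply-add at a time. The function name and the zero-padding at the start of the input are illustrative choices, not taken from the slides.

```python
def convolve_sequential(x, w):
    """Compute y[i] = sum over k of x[i-k] * w[k], one multiply-add at a time."""
    y = []
    for i in range(len(x)):
        acc = 0
        for k in range(len(w)):
            if i - k >= 0:  # samples before the start of x are treated as zero
                acc += x[i - k] * w[k]
        y.append(acc)
    return y


if __name__ == "__main__":
    # Hypothetical input values, chosen so each term is easy to read off.
    print(convolve_sequential([1, 2, 3, 4], [1, 10, 100]))  # [1, 12, 123, 234]
```

On a processor, the inner loop costs one instruction sequence per term; the reconfigurable alternative is to lay out the multipliers and adders in hardware so that the terms are computed side by side.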

  4. Talk overview • FPGA Technology • Reconfigurable Architectures • Reconfigurable Processor Arrays • Perspectives

  5. FPGA in short • FPGA: Field Programmable Gate Array • Introduced by Xilinx in 1985 • Implements a few million logic gates • Market [Chart: millions of dollars, axis from 500 to 2500, for 1995 to 2001]

  6. FPGA Market Share - Q1 1997

  7. FPGA Structure [Figure labels: I/O, logic block, switching box, routing network]

  8. CLB (configurable logic block) [Figure labels: REG, RAM, look-up table]
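A look-up table is a small memory addressed by the logic inputs, so any boolean function of those inputs can be "programmed" by writing its truth table. The sketch below illustrates that idea only; the function names and the 3-input full-adder example are hypothetical and do not describe the CLB of any particular device.

```python
def make_lut(func, n_inputs):
    """Build the configuration bits (truth table) of an n-input LUT for func."""
    table = []
    for address in range(2 ** n_inputs):
        # Bit b of the address is the value of input number b.
        bits = [(address >> b) & 1 for b in range(n_inputs)]
        table.append(func(*bits) & 1)
    return table


def lut_eval(table, *inputs):
    """Evaluate the LUT: the inputs form an address into the stored table."""
    address = sum(bit << b for b, bit in enumerate(inputs))
    return table[address]


# Two configurations of the same 3-input LUT: the sum and carry of a full adder.
xor3 = make_lut(lambda a, b, c: a ^ b ^ c, 3)
majority = make_lut(lambda a, b, c: (a & b) | (a & c) | (b & c), 3)

if __name__ == "__main__":
    print(xor3)                         # [0, 1, 1, 0, 1, 0, 0, 1]
    print(lut_eval(majority, 1, 0, 1))  # 1
```

Reconfiguring the logic amounts to loading a different table; the evaluation path does not change.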

  9. Conventional FPGA Tile

  10. XC4K Interconnect Details

  11. Traditional Design Flow • Input formats: VHDL, EDIF • Steps: RTL → technology-independent optimization → LUT mapping → placement → routing → bitstream generation → configuration data • Duration: a few minutes to a few hours

  12. FPGA Component Use • FPGA components are used for • ASIC substitution • Rapid prototyping • VHDL simulation • Reconfigurable Computing • . . .

  13. Reconfigurable Architectures • Functional Unit • Co-processor • Accelerator • System

  14. Reconfigurable Functional Unit • FPGA integrated into the datapath • Idea: tailor the operations/instructions to the application • Level of reconfigurability: instructions [Figure labels: ALU, MEM]

  15. Spyder Project • C. Iseli (Swiss Federal Institute of Technology, Lausanne) [Figure labels: RFU1, RFU2, RFU3, registers]

  16. Why does it not work? • RFUs are slow: 5 to 10 times slower than standard functional units • No programming tools: the synthesis of specific operators must be automatic

  17. Reconfigurable Co-Processor • Close connection to the CPU • Integrated on the same die • Not (yet?) available • Level of reconfigurability: functions [Figure labels: ALU, MEM]

  18. ArMen • B. Pottier (UBO, Brest) [Figure labels: P, M]

  19. Reconfigurable Accelerator • Communicates through the I/O bus • External board • Matrix of FPGA components with external RAM • Commercial boards available • Level of reconfigurability: application [Figure labels: ALU, MEM]

  20. PAM boards • PAM: Programmable Active Memory • J. Vuillemin, P. Bertin, D. Roncin (DEC PRL) • Perle-0 (87), Perle-1 (91), Pamette (95), … [Figure labels: host computer, FPGA, memory]

  21. Reconfigurable System • System on Chip: one reconfigurable zone connected to several components; available soon • Virtex + PowerPC (Xilinx/IBM) • Level of reconfigurability: system [Figure labels: P, M]

  22. Architectures - Applications • Architectures: functional unit, co-processor, accelerator, system • Intensive computation: cryptography, image processing, DNA sequencing, … • Embedded systems: third-generation mobile phones, ... [Figure: question marks between the architectures and the applications]

  23. Reconfigurable Processor Arrays • Principle: parallelize intensive computation on an array of hundreds (or thousands) of tailored processors • Performance comes from • the parallelization • the customization

  24. Parallelization • Initial code: nested loops, for ( … ) for ( … ) for ( … ) … • Parallelized code: processes that communicate, … send ( … ) receive ( … ) …
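As one possible illustration of turning a loop nest into an array of communicating processors, the sketch below simulates a linear array that computes a convolution. This is an assumed example, not the code from the slide: each cell holds one coefficient, receives a partial sum from its right-hand neighbour, adds its own product, and sends the result to the left on the next cycle. All names are hypothetical.

```python
def fir_reference(x, w):
    """Loop-nest version: y[i] = sum over k of w[k] * x[i-k]."""
    return [
        sum(w[k] * x[i - k] for k in range(len(w)) if i - k >= 0)
        for i in range(len(x))
    ]


def fir_array(x, w):
    """Cycle-by-cycle simulation of a linear array with one cell per coefficient."""
    n_cells = len(w)
    regs = [0] * n_cells  # regs[k]: partial sum cell k sent at the previous cycle
    out = []
    for sample in x:  # one input sample is broadcast to all cells per cycle
        new_regs = []
        for k in range(n_cells):
            # "receive" the neighbour's previous partial sum (0 at the array edge)
            received = regs[k + 1] if k + 1 < n_cells else 0
            # local work of cell k, then "send" the result leftwards
            new_regs.append(w[k] * sample + received)
        regs = new_regs
        out.append(regs[0])  # cell 0 delivers one result per cycle
    return out


if __name__ == "__main__":
    x, w = [1, 2, 3, 4], [1, 10, 100]
    print(fir_array(x, w))  # [1, 12, 123, 234]
    print(fir_array(x, w) == fir_reference(x, w))  # True
```

The inner loop over k disappears in hardware: every cell performs its multiply-add in the same cycle, and only neighbouring cells exchange data.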

  25. Customization • data-path width • dedicated operator • parallelism [Figure labels: A, B, C, D]

  26. Design of Reconfigurable Processor Arrays • fast design time thanks to • regular structure: specify one processor, then replicate • local interconnection: optimizes the place-and-route step

  27. Reconfigurable Processor Arrays: Applications • Image processing • Signal processing • Bio-computing • Cryptography • Text processing • ... Today: mostly integer applications

  28. Performance examples • DNA search: PeRLe-1 board (16 Xilinx 3090, 1991), speed-up = 50 • K-means clustering: Wildforce board (4 Xilinx 4036, 1997), speed-up = 100 • PPI algorithm: Spyder board (1 Xilinx V800, 2000), speed-up = 200 [Figure label: host, same technology]

  29. Limitations • host-board data bandwidth: bottleneck • programming tools: automatic parallelization, partitioning, hardware generation, portability! [Figure label: host]
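To see why host-board bandwidth can dominate, a simple back-of-the-envelope model helps. The sketch below assumes that data transfer and computation do not overlap, and every number in it is hypothetical; none comes from the slides.

```python
def effective_speedup(host_seconds, array_speedup, bytes_moved, bus_bytes_per_second):
    """Overall speed-up once the host-board transfer time is included.

    Assumes transfers and computation are strictly sequential (no overlap).
    """
    compute_seconds = host_seconds / array_speedup
    transfer_seconds = bytes_moved / bus_bytes_per_second
    return host_seconds / (compute_seconds + transfer_seconds)


if __name__ == "__main__":
    # Hypothetical job: 100 s on the host, array 100x faster on the computation,
    # 1 GB exchanged over a bus sustaining 100 MB/s.
    print(effective_speedup(100.0, 100.0, 1_000_000_000, 100_000_000))
```

With these made-up figures the computation takes 1 s on the array but the transfer takes 10 s, so the overall gain is about 9 rather than 100.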

  30. Perspectives • Technology • Applications • Architecture

  31. Exponential Growth in Density [Chart: LUT logic cells (1,000 to 1,000,000) and logic gates (12 K to 12 M) versus year, 1994 to 2006]

  32. Technology • Timeline 1998, 2000, 2002, 2005, in order of increasing density: Xilinx Virtex XCV300 (0.3M gates); Xilinx Virtex XCV3200 (2M gates); Xilinx Virtex II (10M gates); 30-50M gates (400 Nios) • Altera APEX20K1500 (2.4M gates) • 30 x 32-bit Nios processors (80K gates)

  33. Applications • until now: performance has been demonstrated on integer applications with a high degree of parallelism • from now on: it becomes "reasonable" to investigate the implementation of floating-point applications

  34. Floating-point operators • Estimation based on current research at IRISA • Component: Xilinx XCV1000 (1 Mgates) • Pipelined operators

                         Single precision   Double precision
      adder area         3%                 5%
      multiplier area    5%                 20%
      frequency          50 MHz             100 MHz

  35. Floating-point performance • 1998: 5 FPA, 25 MHz • 2000: 25 FPA, 50 MHz • 2002: 125 FPA, 100 MHz • 2005: 500 FPA, 200 MHz • FPA: double-precision floating-point adder [Chart: GFlops on a logarithmic axis, 0.1 to 100, versus year]
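The chart's axis values (0.1, 1, 10, 100 GFlops) are consistent with a simple peak estimate: number of adders times clock frequency, assuming each pipelined adder completes one operation per cycle. That relation and the pairing of each year with one (FPA count, frequency) point are assumptions made here to reproduce the figures; the names below are hypothetical.

```python
# (year, number of double-precision adders, clock in MHz), as read from the slide
ROADMAP = [(1998, 5, 25), (2000, 25, 50), (2002, 125, 100), (2005, 500, 200)]


def peak_gflops(n_adders, mhz):
    """Peak rate if every pipelined adder delivers one result per clock cycle."""
    return n_adders * mhz / 1000.0  # (operations per microsecond) / 1000 = GFlops


if __name__ == "__main__":
    for year, n_adders, mhz in ROADMAP:
        print(year, peak_gflops(n_adders, mhz))
    # 1998 0.125
    # 2000 1.25
    # 2002 12.5
    # 2005 100.0
```

Each step multiplies the adder count by 4 or 5 and doubles the clock, which gives roughly one order of magnitude per step.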

  36. Architecture • Today's accelerator boards: • restricted bandwidth • parallelism on a 1D array

  37. Architecture • dual-port RAM connection [Figure label: fast dual-port memory]

  38. Architecture • On-chip FPGA: an alternative way of using the one-billion-transistor processors of the next decade

  39. Conclusion • The technology is available for reconfigurable computing: 30-50M gates in 2005 • Application domains are increasing: floating point • No programming tools: model? portability?
