Video on DSP and FPGA

Presentation Transcript

  1. Video on DSP and FPGA. John Johansson, April 12, 2004

  2. Agenda • Overview of video processing • A typical video encoder and the DCT • Requirements of DCT • Comparison of DSP and FPGA chips • Analysis and conclusions • Questions

  3. Overview of Video Processing Video processing generally involves • Compression / Decompression • Special Effects • TV Broadcasting • Focus on Compression

  4. Video Encoding Typical Video Encoder • Focus on DCT algorithm

  5. The Discrete Cosine Transform (DCT) • Converts a block of spatial samples into frequency coefficients, much as the FFT does • Rearranges data into a more compressible format • Typically done on 64 (8x8) pixels at a time • Big nasty equation … • … But no sharp teeth (optimizes extremely well)
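
For reference, the equation the slide alludes to is commonly written as follows for an 8x8 block. This is the normalization used by JPEG- and MPEG-style codecs, which is an assumption here, since the transcript does not show which form the author had on the slide. Here f(x,y) is the input sample at row x and column y, and F(u,v) is the output coefficient.

```latex
F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)\,
         \cos\!\left[\frac{(2x+1)\,u\,\pi}{16}\right]
         \cos\!\left[\frac{(2y+1)\,v\,\pi}{16}\right],
\qquad
C(k) = \begin{cases} 1/\sqrt{2}, & k = 0 \\ 1, & k > 0 \end{cases}
```

The double sum factors into a row pass and a column pass, which is one reason the transform optimizes well.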

  6. Requirements for DCT Basic Idea • Read in data (64 values, 8-24 bits signed / unsigned) • Do transformation • Write out data • Profit !!! • Easy, right ??
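
The read / transform / write steps can be sketched in a few lines of Python. This is a minimal, unoptimized illustration and not the author's implementation: the function name `dct_8x8`, the list-of-rows layout, and the flat test block are all assumptions made for the example.

```python
import math

N = 8  # block edge length; the slides work on 8x8 = 64 values


def dct_8x8(block):
    """Direct (unoptimized) 2-D DCT-II of an N x N block given as a list of rows."""
    def c(k):
        # Normalization factor: 1/sqrt(2) for the zero-frequency term, else 1.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for x in range(N):
                for y in range(N):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = (2.0 / N) * c(u) * c(v) * acc
    return out


# Step 1: read in 64 values (a flat mid-grey block, chosen only for illustration).
pixels = [[128] * N for _ in range(N)]
# Step 2: do the transformation.
coeffs = dct_8x8(pixels)
# Step 3: write out the data (here, just print the DC coefficient).
print(round(coeffs[0][0], 3))
```

A flat block puts all of its energy into the single DC coefficient `coeffs[0][0]` and leaves the other 63 coefficients at (numerically) zero, which is the "more compressible format" the previous slide refers to.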

  7. Requirements for DCT Memory Limitations • Load an entire frame? • One uncompressed frame can vary from 50 KB to 50 MB in size • External memory is much slower, but more plentiful • Do the DCT in chunks (8x8 block)
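
Processing in chunks means only 64 values need to sit in fast internal memory at any moment. The sketch below walks a frame one 8x8 tile at a time. The helper name `iter_blocks` and the tiny 16x24 test frame are hypothetical, and the code assumes the frame dimensions are multiples of 8.

```python
def iter_blocks(frame, n=8):
    """Yield (row, col, block) for each n x n tile of a frame stored as a list of rows.

    Assumes the frame height and width are multiples of n.
    """
    height, width = len(frame), len(frame[0])
    for r in range(0, height, n):
        for c in range(0, width, n):
            # Copy out one tile; on real hardware this is the transfer
            # from slow external memory into fast internal memory.
            yield r, c, [row[c:c + n] for row in frame[r:r + n]]


# Hypothetical 16x24 test frame whose pixel value encodes its position.
frame = [[y * 24 + x for x in range(24)] for y in range(16)]
blocks = list(iter_blocks(frame))
print(len(blocks))  # number of 8x8 blocks in the frame
```

Because the function is a generator, a caller can transform and write back each tile before the next one is fetched, so the whole frame never has to be resident at once.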

  8. Requirements for DCT Degree of Parallelism • DCT can be done serially, or broken up and done in parallel • Parallelism depends largely on available memory • Price / Performance tradeoffs
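
One common way to break the DCT up is the row-column decomposition: run a 1-D DCT on each of the 8 rows, then on each of the 8 columns of the result. The sketch below illustrates that idea under the usual orthonormal scaling. It is not taken from the slides, and the function names are hypothetical.

```python
import math

N = 8


def dct_1d(vec):
    """Orthonormal 1-D DCT-II of a length-N sequence."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(scale * sum(v * math.cos((2 * i + 1) * k * math.pi / (2 * N))
                               for i, v in enumerate(vec)))
    return out


def dct_2d_separable(block):
    """2-D DCT as 8 row transforms followed by 8 column transforms."""
    # Pass 1: the 8 row transforms do not depend on each other,
    # so they could run serially on one unit or on 8 units at once.
    rows = [dct_1d(row) for row in block]
    # Pass 2: same for the 8 columns of the intermediate result.
    cols = [dct_1d(col) for col in zip(*rows)]
    # Transpose back so the result is indexed [vertical freq][horizontal freq].
    return [list(r) for r in zip(*cols)]


# Illustrative input: a horizontal ramp that is constant from row to row.
ramp = [[float(x) for x in range(N)] for _ in range(N)]
out = dct_2d_separable(ramp)
print(round(out[0][0], 3))
```

Written this way, the work is 16 independent 8-point transforms (about 1,024 multiply-adds in this naive form) instead of the 4,096 of the direct double sum. The price/performance tradeoff is then how many of those 1-D units to build or schedule at once.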

  9. The Challengers: Xilinx Spartan-3 FPGA • 50K – 5M gates • 326 MHz • 100 KB – 2.3 MB internal memory • 4 – 104 dedicated multipliers • Oodles of I/O pins (up to 784) • Look at XC3S1000: 1M gates, 560 KB memory, 24 multipliers, 376 I/O pins

  10. The Challengers: ADSP-BF5xx Blackfin Processor • 200 – 750 MHz • Single or dual core • DMA memory controller • 52 KB – 326 KB internal memory • Other processor goodies • Look at ADSP-BF533: 500 MHz, single core, 148 KB memory

  11. Performance How do we correctly benchmark an algorithm between two completely different processors? • I don’t really know • Look at some rough performance indicators and try to draw a conclusion

  12. Performance FPGA: • Varies from 1 to 25 cycles / pixel for DCT • Reading and writing of data takes additional time • Clock speed limited by degree of parallelism DSP: • Roughly 5 cycles / pixel for DCT • DMA controller allows parallel reading and writing with some setup overhead

  13. (Ideal) Performance Spartan-3: • 64 read + 64 compute + 64 write = 192 cycles / block • 326 MHz / 192 cycles ≈ 1.70 Mblocks / second Blackfin: • 319 compute + 10 DMA transfer = 329 cycles / block • 500 MHz / 329 cycles ≈ 1.52 Mblocks / second
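
The ideal throughput is just the clock rate divided by the cycles needed per block. The short calculation below redoes that arithmetic from the per-phase cycle counts and clock rates quoted on the slide. Summing the three 64-cycle FPGA phases gives 192 cycles per block. The helper name `blocks_per_second` is hypothetical.

```python
def blocks_per_second(clock_hz, cycles_per_block):
    """Ideal throughput: one block completes every cycles_per_block cycles."""
    return clock_hz / cycles_per_block


# Cycle counts and clock rates are the figures quoted on the slide.
spartan_cycles = 64 + 64 + 64      # read + compute + write
blackfin_cycles = 319 + 10         # compute + DMA transfer

spartan = blocks_per_second(326e6, spartan_cycles)
blackfin = blocks_per_second(500e6, blackfin_cycles)

print(f"Spartan-3: {spartan_cycles} cycles/block, {spartan / 1e6:.2f} Mblocks/s")
print(f"Blackfin:  {blackfin_cycles} cycles/block, {blackfin / 1e6:.2f} Mblocks/s")
```

These are best-case numbers: they assume the FPGA sustains 1 cycle / pixel at its maximum clock and that the DSP's DMA transfers overlap fully with computation.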

  14. Advantages FPGA: • Potential for very high parallelism • Existing video designs available for purchase • Good middleman functionality DSP: • Higher potential clock speed • Much more flexible design • DMA memory controller

  15. Disadvantages FPGA: • Low flexibility • Hard to optimize • Limited logic blocks DSP: • Difficult to achieve full utilization • Higher power consumption

  16. Conclusions FPGA: • Best for well-defined roles, like DCT • Faster in situations where throughput matters • Can be very expensive DSP: • Better suited to more flexible roles, like a full encoder • Situations where large amounts of (additional) memory are needed

  17. Questions?

  18. References • Xilinx Spartan-3: http://www.xilinx.com/xlnx/xil_prodcat_landingpage.jsp?title=Spartan-3 • Analog Devices Blackfin: http://www.analog.com/processors/processors/blackfin/index.html

  19. References (other articles) • http://www.xilinx.com/publications/products/services/xc_pdf/xc_videoapps44.pdf • http://www.xilinx.com/publications/products/sp2e/xc_dspvid43.htm • http://www.reed-ectronics.com/ednmag/article/CA336860?stt=000&pubdate=11%2F27%25