
Accelerating Belief Propagation in Hardware

Skand Hurkat and José Martínez, Computer Systems Laboratory, Cornell University (http://www.csl.cornell.edu/). The Cornell Team: Prof. José Martínez (PI) and Prof. Rajit Manohar @ Computer Systems Lab

Presentation Transcript


  1. Accelerating Belief Propagation in Hardware • Skand Hurkat and José Martínez • Computer Systems Laboratory, Cornell University • http://www.csl.cornell.edu/

  2. The Cornell Team • Prof. José Martínez (PI), Prof. Rajit Manohar @ Computer Systems Lab • Prof. Tsuhan Chen @ Advanced Multimedia Processing Lab • MS/Ph.D. students • Yuan Tian, MS ’13 • Skand Hurkat • Xiaodong Wang

  3. The Cornell Graph

  4. The Cornell Project • Provide hardware accelerators for belief propagation algorithms on embedded SoCs (retail/car/home/mobile) • High speed • Very low power • Self-optimizing • Highly programmable [Diagram labels: Inference Algorithm, Graph, BP Accelerator within SoC, Result]

  5. What is belief propagation? Belief propagation is a message-passing algorithm for performing inference on graphical models, such as Bayesian networks or Markov random fields (MRFs).

  6. What is belief propagation? • Labelling problem • Energy as a measure of convergence • Minimize energy (MAP label estimation) • Exact results for trees • Converges in exactly two iterations • Approximate results for graphs with loops • Yields “good” results in practice • Minimum over large neighbourhoods • Close to optimal solution
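
The "exact results for trees" and "two iterations" points above can be illustrated on the simplest tree, a chain. The following is a minimal sketch, not the presenters' library: the function names, the cost tables, and the shared pairwise cost are all assumptions made for illustration. One forward pass and one backward pass of min-sum messages are enough to recover the minimum-energy (MAP) labelling, which the sketch checks against brute-force enumeration.

```python
import itertools

def energy(labels, unary, pairwise):
    """Total energy: per-node data costs plus costs between neighbouring labels."""
    e = sum(unary[i][a] for i, a in enumerate(labels))
    e += sum(pairwise[a][b] for a, b in zip(labels, labels[1:]))
    return e

def chain_min_sum(unary, pairwise):
    """MAP labelling of a chain MRF using two sweeps of min-sum messages.

    unary[i][a]    : cost of giving node i label a
    pairwise[a][b] : cost of adjacent labels (a, b), shared by all edges
    Assumes the minimum-energy labelling is unique (no tie-breaking logic).
    """
    n, k = len(unary), len(unary[0])
    # Forward sweep: fwd[i][b] is the message arriving at node i from the left.
    fwd = [[0.0] * k for _ in range(n)]
    for i in range(1, n):
        for b in range(k):
            fwd[i][b] = min(fwd[i - 1][a] + unary[i - 1][a] + pairwise[a][b]
                            for a in range(k))
    # Backward sweep: bwd[i][a] is the message arriving at node i from the right.
    bwd = [[0.0] * k for _ in range(n)]
    for i in range(n - 2, -1, -1):
        for a in range(k):
            bwd[i][a] = min(bwd[i + 1][b] + unary[i + 1][b] + pairwise[a][b]
                            for b in range(k))
    # Belief at each node = own cost + both incoming messages; pick the cheapest label.
    return [min(range(k), key=lambda a: unary[i][a] + fwd[i][a] + bwd[i][a])
            for i in range(n)]

def brute_force(unary, pairwise):
    """Reference answer: enumerate every labelling and keep the cheapest."""
    n, k = len(unary), len(unary[0])
    return list(min(itertools.product(range(k), repeat=n),
                    key=lambda lab: energy(lab, unary, pairwise)))

# Illustrative costs: three nodes, two labels, Potts-style smoothness term.
unary = [[0, 2], [1, 2], [3, 0]]
pairwise = [[0, 1], [1, 0]]
print(chain_min_sum(unary, pairwise), brute_force(unary, pairwise))
```

On graphs with loops the same message update is simply iterated, and the result is only approximate, as the slide notes.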

  7. Not all “that” alien to embedded • Remember the Viterbi algorithm? • Used extensively in digital communications
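
For readers who do not remember it, the Viterbi algorithm can be sketched as follows. It is max-product message passing along a chain (a hidden Markov model), computed here in the log domain with back-pointers. The function name, argument layout, and the toy probabilities are illustrative assumptions, not taken from the presentation.

```python
import math

def viterbi(obs, start_p, trans_p, emit_p):
    """Most likely hidden-state sequence of an HMM (log-domain Viterbi).

    obs          : list of observation indices
    start_p[s]   : probability of starting in state s
    trans_p[r][s]: probability of moving from state r to state s
    emit_p[s][o] : probability of state s emitting observation o
    """
    n_states = len(start_p)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # Scores for the first observation.
    score = [log(start_p[s]) + log(emit_p[s][obs[0]]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev, ptr, score = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log(trans_p[r][s]))
            ptr.append(best)
            score.append(prev[best] + log(trans_p[best][s]) + log(emit_p[s][o]))
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Toy two-state model with three possible observations (illustrative numbers).
start_p = [0.6, 0.4]
trans_p = [[0.7, 0.3], [0.4, 0.6]]
emit_p = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
print(viterbi([0, 1, 2], start_p, trans_p, emit_p))
```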

  8. What does this mean? • Every mobile device uses Viterbi decoders • Error correction codes (e.g., turbo codes) • Mitigating inter-symbol interference (ISI) • An increasing number of mobile applications involve belief propagation • More general belief propagation accelerators can greatly improve user experience with mobile devices

  9. Target markets
     Retail/Car/Home/Mobile • Image processing • De-noising • Segmentation • Object detection • Gesture recognition • Handwriting recognition • Improved recognition through context identification • Speech recognition • Hidden Markov models are key to speech recognition
     Servers • Data mining tasks • Part-of-speech tagging • Information retrieval • “Knowledge graph”-like applications • Machine-learning-based tasks • Constructive machine learning • Recommendation systems • Scientific computing • Protein structure inference

  10. Hardware accelerator for BP [Diagram labels: Inference Algorithm, Graph, BP Accelerator within SoC, Result]

  11. Work done so far
     Software • General-purpose MRF inference library • Support for arbitrary graphs • Floating-point math • Parallel techniques for faster inference • Library optimized for grid graphs • Optimized data structures • Template can use any data type • Multiple inference techniques optimized for early vision • Stereo matching in 200 ms
     Hardware • High-level synthesis of message update unit • Vivado HLS (C-to-gates) tool used to synthesize message update unit on ZedBoard • ~2x improvement in inference speed on CPU+FPGA compared to CPU-only inference • Fixed-point math • GraphGen collaboration • On-going work • Stereo matching task mapped to multiple platforms • 10x speedup on GPU w.r.t. CPU-only implementation
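
To make the "message update unit" and "fixed-point math" items more concrete, here is a rough software sketch of one min-sum message update using unsigned saturating integer arithmetic. It is not the synthesized design from the presentation: the 8-bit range (`max_val=255`), the function names, and the normalization step are all assumptions chosen for illustration.

```python
def saturating_add(a, b, max_val=255):
    """Add two non-negative integer costs, clamping at max_val instead of overflowing."""
    return min(a + b, max_val)

def message_update_fixed(unary, incoming, pairwise, max_val=255):
    """One min-sum message update in saturating integer arithmetic.

    unary[a]      : integer data cost of label a at the sending node (<= max_val)
    incoming      : messages from the sender's other neighbours, each a list over labels
    pairwise[a][b]: integer smoothness cost between sender label a and receiver label b
    Returns the outgoing message, shifted so that its smallest entry is 0.
    """
    k = len(unary)
    # Accumulate the sender's own cost and all incoming messages per label.
    h = []
    for a in range(k):
        total = unary[a]
        for m in incoming:
            total = saturating_add(total, m[a], max_val)
        h.append(total)
    # Minimize over the sender's label for every receiver label.
    out = [min(saturating_add(h[a], pairwise[a][b], max_val) for a in range(k))
           for b in range(len(pairwise[0]))]
    # Subtract the minimum so values stay small across iterations.
    lo = min(out)
    return [v - lo for v in out]

# Illustrative costs; the second label saturates at 255 before the minimization.
print(message_update_fixed([10, 200], [[5, 100]], [[0, 20], [20, 0]]))
```

Saturation keeps large costs from wrapping around, and the final shift is one common way to keep messages within a narrow integer range over many iterations.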

  14. Hierarchical belief propagation

  15. Results – Stereo Matching

  18. GraphGen synthesis of BP-M • BP-M update (log-space messages) implemented using GraphGen (Intel/CMU/UW) • GPU implementation 10x faster than CPU-based implementation • On-going work on an FPGA-based implementation and on implementing the hierarchical update
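
As background on "log-space messages": a max-product message computed on probabilities and a min-sum message computed on negative log-probabilities carry the same information, because the negative logarithm turns products into sums and maxima into minima. The sketch below checks this numerically. It does not use GraphGen, and the function names and potentials are hypothetical values picked for illustration.

```python
import math

def max_product_message(phi, psi, incoming):
    """Message in the probability domain: max over sender label a of phi * incoming * psi."""
    k = len(phi)
    return [max(phi[a] * incoming[a] * psi[a][b] for a in range(k))
            for b in range(len(psi[0]))]

def min_sum_message(theta, theta_pair, incoming_cost):
    """Same message in log space: min over sender label a of summed costs."""
    k = len(theta)
    return [min(theta[a] + incoming_cost[a] + theta_pair[a][b] for a in range(k))
            for b in range(len(theta_pair[0]))]

def neg_log(values):
    """Element-wise negative logarithm of a list of positive numbers."""
    return [-math.log(v) for v in values]

# Illustrative potentials for a two-label problem.
phi = [0.7, 0.3]                    # node potential
psi = [[0.9, 0.1], [0.1, 0.9]]      # edge potential
incoming = [0.5, 0.5]               # product of the other incoming messages

prob_msg = max_product_message(phi, psi, incoming)
cost_msg = min_sum_message(neg_log(phi), [neg_log(row) for row in psi], neg_log(incoming))
# Mapping the log-space message back with exp(-cost) reproduces the probability-domain one.
print(prob_msg, [math.exp(-c) for c in cost_msg])
```

Working in log space replaces multiplications with additions and avoids numerical underflow when many small probabilities are multiplied.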

  19. Cornell Publications (2013 only) • 3x Comp. Vision & Pattern Recognition (CVPR) • 3x Asynchronous VLSI (ASYNC) • 2x Intl. Symp. Computer Architecture (ISCA) • 1x Intl. Conf. Image Processing (ICIP) • 1x ASPLOS (w/ GraphGen folks, under review)

  20. Year 3 Plans • GraphGen extensions for BP applications • Multiple inference techniques • Extraction of “BP ISA” • Ops on arbitrary graphs • Efficient representation • Amplification work on UAV ensembles • Self-optimizing, collaborative SoCs • One-day “graph” workshop with GraphGen+UIUC

