
A Fast Fourier Transform Compiler




  1. A Fast Fourier Transform Compiler Silvio D Carnevali

  2. Contents • FFTW and genfft: an introduction • genfft: How it works 1.) DAG Creation 2.) Simplifier 3.) Scheduler 4.) Unparsing • Conclusion: similar applications

  3. genfft • Special-purpose compiler • Written in Objective Caml • Produces DFT subroutines • Outputs C code • Parameterized according to: - Input length - Data type

  4. FFTW • Collection of “Codelets” • Codelets: fragments of C code • Generated by genfft • Plan: optimal composition of codelets - Depends on input size and hardware - Automatically selected by FFTW (FJ98)

  5. Performance of FFTW • (Plots; panel titles: "Powers of 2" and "Any powers of 2, 3, 5, 7")

  6. genfft: creation of the codelet’s DAG • Nodes: data types - Encode arithmetic expressions - Use real numbers only (complex values are split into real and imaginary parts) for C compatibility • Generic node = operator • Children = operands • DAG-creation algorithm depends on input size
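
The node structure described on this slide could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not genfft's own code: the class names `Num`, `Load`, `Plus`, `Times` and the helper `evaluate` are hypothetical. Each node is an operator whose children are its operands, and a node reused by two parents is what makes the structure a DAG rather than a tree.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Num:
    """A real numeric constant."""
    value: float


@dataclass(frozen=True)
class Load:
    """A real-valued input variable (hypothetical naming)."""
    name: str


@dataclass(frozen=True)
class Plus:
    """Sum of any number of child nodes."""
    terms: Tuple["object", ...]


@dataclass(frozen=True)
class Times:
    """Product of two child nodes."""
    left: "object"
    right: "object"


def evaluate(node, env):
    """Evaluate a node given a dict mapping input names to real values."""
    if isinstance(node, Num):
        return node.value
    if isinstance(node, Load):
        return env[node.name]
    if isinstance(node, Plus):
        return sum(evaluate(t, env) for t in node.terms)
    if isinstance(node, Times):
        return evaluate(node.left, env) * evaluate(node.right, env)
    raise TypeError(f"unknown node: {node!r}")


# Size-2 transform on real inputs: y0 = x0 + x1, y1 = x0 - x1.
x0, x1 = Load("x0"), Load("x1")
y0 = Plus((x0, x1))
y1 = Plus((x0, Times(Num(-1.0), x1)))  # x0 is shared with y0
```

Here `x0` is the same object under both `y0` and `y1`, which is the sharing that a later common-subexpression pass relies on.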

  7. DAG creation Algorithms

  8. FT Equation • X = input vector • Y = FT of X • ω_n = primitive nth root of unity
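
The equation itself did not survive in this transcript. For reference, the standard forward discrete Fourier transform in the slide's symbols (X the input vector, Y its transform, and the nth root of unity written ω_n) reads as below. The sign of the exponent is a convention and is an assumption here; the original slide may have used the opposite sign.

```latex
Y[k] \;=\; \sum_{j=0}^{n-1} X[j]\,\omega_n^{-jk},
\qquad 0 \le k < n,
\qquad \omega_n = e^{2\pi i / n}
```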

  9. genfft: DAG Simplifier • Bottom-up traversal of DAG • Local improvements: - Algebraic transformations (constant folding, +/* simplification) - Common-subexpression elimination (CSE): eliminates existing common subexpressions and creates new ones - DFT-specific improvements

  10. Algebraic transformations • Simplifies multiplication by 1, 0 or -1 • Simplifies addition of 0 • Distribution: kx + ky = k(x + y)
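
The rules on this slide can be sketched as a small bottom-up rewriter. This is a minimal Python illustration, not genfft's simplifier: the tuple encoding (`"num"`, `"var"`, `"neg"`, `"add"`, `"mul"`) and the function names are assumptions made for the example.

```python
def num(v):
    """Numeric constant node."""
    return ("num", float(v))


def var(name):
    """Input variable node."""
    return ("var", name)


def simplify(e):
    """Simplify children first (bottom-up), then apply local rules."""
    tag = e[0]
    if tag in ("num", "var"):
        return e
    if tag == "neg":
        a = simplify(e[1])
        return num(-a[1]) if a[0] == "num" else ("neg", a)
    a, b = simplify(e[1]), simplify(e[2])
    if tag == "mul":
        if a[0] == "num" and b[0] == "num":
            return num(a[1] * b[1])            # constant folding
        if b[0] == "num":
            a, b = b, a                        # keep the constant on the left
        if a[0] == "num":
            if a[1] == 0.0:
                return num(0)                  # 0 * x = 0
            if a[1] == 1.0:
                return b                       # 1 * x = x
            if a[1] == -1.0:
                return ("neg", b)              # -1 * x = -x
        return ("mul", a, b)
    if tag == "add":
        if a[0] == "num" and b[0] == "num":
            return num(a[1] + b[1])            # constant folding
        if a == num(0):
            return b                           # 0 + x = x
        if b == num(0):
            return a                           # x + 0 = x
        # Distribution: k*x + k*y = k*(x + y), saving one multiplication.
        if (a[0] == "mul" and b[0] == "mul"
                and a[1][0] == "num" and a[1] == b[1]):
            return ("mul", a[1], ("add", a[2], b[2]))
        return ("add", a, b)
    raise ValueError(f"unknown node: {e!r}")
```

Because children are simplified before their parent, a rule such as `x + 0*y` collapses in a single pass: the product folds to 0 first, and the addition rule then removes it.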

  11. DFT-Specific improvements • Numeric constants made positive (local) - Constants generally occur as both k and -k - Reduces number of loads • DAG transposition (for linear functions) - Simplify DAG, then transpose + simplify, then transpose + simplify - Reduces number of multiplications only

  12. DFT-Specific improvements • (Figure: example DAG with nodes X, Y, A, B and edge constants 2, 3, 4, 5, shown at three stages; stage labels: DAG D, Simplify, DAG E, Transpose, DAG ET, Simplify, DAG FT, Transpose, DAG F, Simplify)
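
Transposition of a linear DAG can be illustrated with a small sketch. The idea shown here is the general one: reversing every edge of a DAG that computes a linear function yields a DAG that computes the transposed matrix. The edge-list encoding, the function names, and the example graph are assumptions for illustration; the graph is not the one from the slide's figure.

```python
def transpose(edges):
    """Reverse every edge of the DAG; edge weights are unchanged."""
    return [(dst, src, w) for (src, dst, w) in edges]


def matrix(edges, inputs, outputs):
    """Return M where M[i][j] is the gain from inputs[j] to outputs[i].

    The gain is the sum, over all paths, of the product of edge weights.
    """
    def gain(node, target):
        if node == target:
            return 1.0
        return sum(w * gain(dst, target)
                   for (src, dst, w) in edges if src == node)

    return [[gain(i, o) for i in inputs] for o in outputs]


# Hypothetical linear DAG: s = X + Y, A = 2*s, B = 3*s + 4*Y.
edges = [
    ("X", "s", 1.0), ("Y", "s", 1.0),
    ("s", "A", 2.0), ("s", "B", 3.0),
    ("Y", "B", 4.0),
]
m = matrix(edges, ["X", "Y"], ["A", "B"])
# In the transposed DAG the roles of inputs and outputs are swapped.
mt = matrix(transpose(edges), ["A", "B"], ["X", "Y"])
```

Since the transposed DAG has a different shape, a simplifier may find rewrites there that were not visible in the original; transposing back then recovers a DAG for the original function.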

  13. genfft: DAG Scheduler • Goal: minimize use of registers • No instruction scheduling • Partitions DAG in 2 recursively - Register mapping - Optimal for n = 2^k - Partitioning heuristics • Optimality? Not for n ≠ 2^k
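
To make the scheduler's goal concrete, the sketch below measures how many values must be held at once under a given execution order. It is not genfft's scheduling algorithm; it only computes the quantity the scheduler tries to keep small. The function name `max_live`, the dependency encoding, and the example are assumptions for illustration.

```python
def max_live(schedule, deps):
    """Peak number of simultaneously live values for a given order.

    schedule: node names in execution order (operands before users).
    deps: dict mapping a node to the tuple of nodes it reads.
    A value is live from the step that computes it until the step of
    its last consumer; a value with no consumer dies immediately.
    """
    last_use = {node: step for step, node in enumerate(schedule)}
    for step, node in enumerate(schedule):
        for d in deps.get(node, ()):
            last_use[d] = max(last_use[d], step)

    live, peak = set(), 0
    for step, node in enumerate(schedule):
        live.add(node)
        peak = max(peak, len(live))
        # Drop every value whose last consumer has now run.
        live = {v for v in live if last_use[v] > step}
    return peak


# Two independent additions: s0 = a0 + a1 and s1 = b0 + b1.
deps = {"s0": ("a0", "a1"), "s1": ("b0", "b1")}
loads_first = ["a0", "a1", "b0", "b1", "s0", "s1"]
half_by_half = ["a0", "a1", "s0", "b0", "b1", "s1"]
```

In this toy case, finishing one independent half before starting the other needs fewer live values than loading everything first, which is the intuition behind splitting the DAG in two and scheduling each part in turn.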

  14. genfft: Unparsing • Schedule unparsed to C • Pipeline usage managed by C compiler • genfft + C compiler: performance problems - egcs “optimizer”
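
Unparsing a schedule to C can be sketched as straightforward string generation. The Python below is a minimal illustration under stated assumptions: the tuple encoding, the function names, and the emitted signature `void dft2(const double *in, double *out)` are hypothetical and do not reflect the actual form of FFTW codelets.

```python
def unparse_expr(e):
    """Turn an expression tuple into a C expression string."""
    tag = e[0]
    if tag == "num":
        return repr(e[1])
    if tag == "var":
        return e[1]
    if tag == "neg":
        return f"(-{unparse_expr(e[1])})"
    op = {"add": "+", "sub": "-", "mul": "*"}[tag]
    return f"({unparse_expr(e[1])} {op} {unparse_expr(e[2])})"


def unparse(name, schedule):
    """Emit a C function with one assignment per scheduled step."""
    lines = [f"void {name}(const double *in, double *out)", "{"]
    for target, expr in schedule:
        # Temporaries are declared where they are first assigned;
        # stores to the output array need no declaration.
        decl = "" if target.startswith("out[") else "double "
        lines.append(f"    {decl}{target} = {unparse_expr(expr)};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical schedule for a size-2 transform on real inputs.
schedule = [
    ("t0", ("var", "in[0]")),
    ("t1", ("var", "in[1]")),
    ("out[0]", ("add", ("var", "t0"), ("var", "t1"))),
    ("out[1]", ("sub", ("var", "t0"), ("var", "t1"))),
]
code = unparse("dft2", schedule)
```

The generated text is straight-line code with no loops or branches, so register allocation and pipeline usage are left to the C compiler, as the slide notes.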

  15. Conclusion & future work • FFTW: The best of the best of the best… • Over 100 downloads every week! • genfft: specialized for linear functions - Crystallographic FT - FIR & IIR filters - Image processing (JPEG discrete cosine transform)
