
Selective, Embedded Just-in-Time Specialization (SEJITS)



Presentation Transcript


  1. Selective, Embedded Just-in-Time Specialization (SEJITS): a platform for implementing communication-avoiding algorithms accessible from Python

  2. Traditional deployment • C/C++ libraries • LAPACK • MKL • Accessible from high-level languages like Python using C bindings • Most execution time spent in the library, so it’s fast

  3. Problems: Static source code • C/C++ library has static source code • Cannot adapt to different architectures • Cannot adapt to different input data • Optimizations and functionality are mixed together • SEJITS solves these problems with runtime code generation

  4. Problems: Composition • Example: Suppose we want to compute a complex matrix expression like the dot product of A·B·x and C·x • Library interface requires decomposing into a sequence of operations: • T1 = matrix_matrix_multiply(A, B) • t1 = matrix_vector_multiply(T1, x) • t2 = matrix_vector_multiply(C, x) • result = dot_product(t1, t2) • Performance problem! This is not the best sequence: forming A·B first costs a matrix-matrix multiply, whereas computing B·x and then A·(B·x) needs only two matrix-vector multiplies
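To make the cost difference on this slide concrete, here is a small pure-Python sketch. The helper names (`matmul`, `matvec`, `dot`, `count`) and the multiplication counter are illustrative assumptions, not part of LAPACK, MKL, or SEJITS. It evaluates the same expression in both orders and counts scalar multiplications.

```python
# Illustrative helpers only; a real application would call library routines.
mults = 0  # running count of scalar multiplications

def matvec(M, v):
    global mults
    mults += len(M) * len(v)
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def matmul(A, B):
    global mults
    n, m, p = len(A), len(B), len(B[0])
    mults += n * m * p
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def dot(u, v):
    global mults
    mults += len(u)
    return sum(a * b for a, b in zip(u, v))

def library_order(A, B, C, x):
    """Sequence from the slide: form A*B first, then multiply by x."""
    return dot(matvec(matmul(A, B), x), matvec(C, x))

def reordered(A, B, C, x):
    """Same value, but A*(B*x) avoids the matrix-matrix product."""
    return dot(matvec(A, matvec(B, x)), matvec(C, x))

def count(f, *args):
    """Run f and report (result, number of scalar multiplications)."""
    global mults
    mults = 0
    result = f(*args)
    return result, mults

n = 4
A = [[i + j for j in range(n)] for i in range(n)]
B = [[i * j + 1 for j in range(n)] for i in range(n)]
C = [[(i - j) % 3 for j in range(n)] for i in range(n)]
x = [1, 2, 3, 4]

r1, cost1 = count(library_order, A, B, C, x)
r2, cost2 = count(reordered, A, B, C, x)
print(r1 == r2, cost1, cost2)
```

For n-by-n inputs the first ordering performs on the order of n³ multiplications and the second on the order of n², so the gap widens quickly as n grows.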

  5. Problems: Composition • Complex operations are formed by combining simpler ones • Application programmer interface consists of an awkward sequence of low-level operations • User doesn’t know how to choose best sequence • Library can’t see future operations • SEJITS solves these problems by providing a rich fluent interface and giving you access to the entire expression
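One way to picture "access to the entire expression" is a deferred object that records operands and only computes when asked. The sketch below is a hypothetical illustration (the class `LazyProduct` and its `evaluate` method are invented here and are not the SEJITS interface). It shows how seeing the whole chain lets the evaluator pick a right-to-left order.

```python
class LazyProduct:
    """Hypothetical deferred product M1 @ M2 @ ... @ v.

    Nothing is computed until evaluate() is called.
    """

    def __init__(self, *factors):
        self.factors = list(factors)

    def __matmul__(self, other):
        # Record the operand instead of multiplying right away.
        return LazyProduct(*self.factors, other)

    def evaluate(self):
        *matrices, vec = self.factors
        # With the whole chain visible, apply matrices right to left so
        # every step is a matrix-vector product.
        for M in reversed(matrices):
            vec = [sum(a * b for a, b in zip(row, vec)) for row in M]
        return vec

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
x = [5, 6]

expr = LazyProduct(A) @ B @ x   # builds the expression; no arithmetic yet
result = expr.evaluate()
print(result)
```

A conventional library call would have been forced to compute `A @ B` as soon as it was written, because it cannot know that a vector follows.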

  6. SEJITS architecture • Applications are written in Python, a high-level productivity language • When certain functions are called, a specializer is invoked which compiles that function down to C/C++ and executes it on-the-fly • Specializers are written in Python and supported by the Asp infrastructure

  7. SEJITS architecture [Diagram; labels: Productivity app, .py, f(), h(), PLL Interp, ASP.py, Specializer, .c, cc/ld, $, .so, OS/HW]

  8. Implementation methods • Templates • Static C/C++ code with “holes” filled in at runtime, e.g.: for (int i = 0; i < ${num_items}; i++) { arr[i] *= 2.0; } • Facilitates compiler optimizations • Allows adapting to machine parameters • Allows choosing among implementations based on architecture
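A minimal sketch of filling such a hole from Python follows. The live exercise later in the deck uses Mako (`.mako` files); this sketch uses the standard library's `string.Template` as a stand-in, since it accepts the same `${name}` syntax for simple placeholders. The function name `render_loop` is hypothetical.

```python
from string import Template

# C++ source with one hole, ${num_items}, as on the slide.
LOOP_TEMPLATE = Template(
    "for (int i = 0; i < ${num_items}; i++) {\n"
    "    arr[i] *= 2.0;\n"
    "}\n"
)

def render_loop(num_items):
    """Fill the hole with a value known only at run time."""
    return LOOP_TEMPLATE.substitute(num_items=num_items)

generated = render_loop(1024)
print(generated)
```

Because the loop bound becomes a literal constant in the generated source, the C++ compiler is free to unroll or vectorize the loop with full knowledge of the trip count.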

  9. Implementation methods • Tree transformations • Input/output code expressed as abstract syntax tree • Specializer walks over tree and translates nodes • Facilitates complex transformations and optimizations • Can be used together with templates
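As a rough illustration of the tree-walking approach, the sketch below uses Python's standard `ast` module to parse a small function and translate its nodes into C source text. It handles only a tiny subset (one `return` of an arithmetic expression) and assumes every parameter is a `double`. The class name `ToC` is hypothetical and this is not how Asp itself is implemented.

```python
import ast

class ToC(ast.NodeVisitor):
    """Translate a tiny subset of Python into C source text."""

    OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

    def visit_BinOp(self, node):
        op = self.OPS[type(node.op)]
        return f"({self.visit(node.left)} {op} {self.visit(node.right)})"

    def visit_Name(self, node):
        return node.id

    def visit_Constant(self, node):
        return repr(node.value)

    def visit_Return(self, node):
        return f"return {self.visit(node.value)};"

    def visit_FunctionDef(self, node):
        # Assumption: all parameters and the return value are doubles.
        params = ", ".join(f"double {a.arg}" for a in node.args.args)
        body = "\n".join("    " + self.visit(stmt) for stmt in node.body)
        return f"double {node.name}({params}) {{\n{body}\n}}"

    def generic_visit(self, node):
        # Refuse anything outside the supported subset.
        raise NotImplementedError(type(node).__name__)

source = "def axpy(a, x, y):\n    return a * x + y\n"
tree = ast.parse(source)
c_code = ToC().visit(tree.body[0])
print(c_code)
```

A fuller specializer would rewrite the tree before emitting code (for example, fusing loops), and could splice the emitted text into a template.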

  10. Akx specializer • Built by Jeffrey Morlan • Uses a communication-avoiding algorithm to compute A^k x for many values of k • Building block in other algorithms like Conjugate Gradient • Generates different code depending on dimensions of the input matrices as well as their contents
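For reference, the quantity this kernel produces can be written as a naive loop. The sketch below is only a plain baseline on dense Python lists, with hypothetical function names. It is not the communication-avoiding algorithm and not the specializer's generated code.

```python
def matvec(A, v):
    """Dense matrix-vector product on plain lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def matrix_powers(A, x, k):
    """Return [x, A x, A^2 x, ..., A^k x], one product at a time.

    Naive reference only: each step reads all of A again, which is the
    data movement a communication-avoiding kernel tries to reduce.
    """
    vectors = [x]
    for _ in range(k):
        vectors.append(matvec(A, vectors[-1]))
    return vectors

A = [[2, 1], [0, 1]]
x = [1, 1]
basis = matrix_powers(A, x, 3)
print(basis)
```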

  11. Akx specializer Conjugate Gradient solver performance using communication-avoiding matrix powers kernel. A matrix labeled 141K/7.3M has 141K rows and 7.3M nonzero elements. The dark part of each bar shows time spent on matrix powers while the light part shows time in the remainder of the solver.

  12. Live exercise • SSH to: moonflare.com • Log in as username “cs294-76”, password “2xyb3pex” • Do: • mkdir yourname • cp *.py *.mako yourname • cd yourname • Run with: python double.py • View generated C++ in “cache” subdirectory

  13. Live exercise • Edit double_template.mako • Use your favorite editor or “nano” if you don’t have one • Try changing it to multiply the vector by 3.0 instead of 2.0 • Then run “python double.py” again • Don’t worry about assertion failure (sorry!)

  14. Live exercise • Next we’ll make it so you can multiply by any scalar you want • Replace constant in double_template.mako with a placeholder ${scalar} • Edit double.py • Add a parameter to double_using_template for the scalar multiple • Pass it to mytemplate.render • Update test_generated to add the argument • Then run “python double.py” again

  15. Download / Questions? • Download SEJITS at: • https://github.com/shoaibkamil/asp • Or just Google “SEJITS” • Contact parlab-sejits@lists.eecs for support • Questions?
