
Research with Ocelot



Presentation Transcript


  1. Research with Ocelot

  2. Workload Characterization and Analysis • SM Load Imbalance (Mandelbrot) • Intra-Thread Data Sharing • Activity Factor
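
As a rough illustration of two of the metrics named on this slide, the sketch below computes a per-warp activity factor and an SM load-imbalance ratio for one row of a Mandelbrot image. It is not Ocelot's implementation. It assumes that activity factor means the fraction of thread-iterations doing useful work while a warp runs as long as its slowest thread, and that imbalance is the maximum over the mean work per SM. `WARP_SIZE`, `NUM_SMS`, and all function names are hypothetical.

```python
def mandelbrot_iters(cx, cy, max_iter=256):
    """Iterations of z -> z*z + c before |z| exceeds 2, capped at max_iter."""
    z = 0j
    c = complex(cx, cy)
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


def warp_activity_factor(iters):
    """Useful thread-iterations divided by (warp width * longest thread).

    Assumed definition: the warp stays resident until its slowest thread
    finishes, so threads that exit early count as idle lanes.
    """
    longest = max(iters)
    if longest == 0:
        return 1.0
    return sum(iters) / (len(iters) * longest)


def load_imbalance(work_per_unit):
    """Maximum work divided by mean work; 1.0 means perfectly balanced."""
    mean = sum(work_per_unit) / len(work_per_unit)
    return max(work_per_unit) / mean if mean else 1.0


WARP_SIZE = 32  # assumed warp width
NUM_SMS = 4     # hypothetical number of SMs

# One thread per pixel along the real axis from -2.0 to 1.0.
row = [mandelbrot_iters(-2.0 + 3.0 * x / 255, 0.0) for x in range(256)]
warps = [row[i:i + WARP_SIZE] for i in range(0, len(row), WARP_SIZE)]
factors = [warp_activity_factor(w) for w in warps]

# Each SM gets a contiguous chunk of warps; a warp costs its slowest thread.
warp_cost = [max(w) for w in warps]
chunk = len(warps) // NUM_SMS
sm_work = [sum(warp_cost[i * chunk:(i + 1) * chunk]) for i in range(NUM_SMS)]
imbalance = load_imbalance(sm_work)
```

Pixels inside the set run to the iteration cap while pixels outside escape early, so warps and SMs covering different regions of the image end up with different amounts of work.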

  3. Constructing Performance Models: Eiger • Develop a portable methodology to discover relationships between architectures and applications • Extensions to Ocelot for the synthesis of performance models • Used in macroscale simulation models • Used in JIT compilers to make optimization decisions • Used in run-times to make scheduling decisions [Figure: Adapteva’s multicore, from electronicdesign.com]

  4. Eiger Methodology • Use data analysis techniques to uncover application-architecture relationships • Discover and synthesize analytic models • Extensible in source data, analysis passes, model construction techniques, and destination/use [Figure labels: Ocelot JIT, SST/Macro]
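
The slides do not say which analysis techniques Eiger uses. As a minimal sketch of what "synthesizing an analytic model" from measurements could look like, the example below fits a one-variable linear model by ordinary least squares. The function name, the choice of a linear model, and the sample numbers are all assumptions made for illustration; the numbers are synthetic, not measurements.

```python
def fit_linear(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate a fitted (slope, intercept) model at x."""
    slope, intercept = model
    return slope * x + intercept


# Synthetic placeholder data: an application metric (for example a dynamic
# instruction count, in arbitrary units) against an observed runtime.
metric = [1.0, 2.0, 3.0, 4.0]
runtime = [3.0, 5.0, 7.0, 9.0]

model = fit_linear(metric, runtime)
estimate = predict(model, 5.0)
```

A model of this form is cheap to evaluate, which is what would let a simulator, JIT compiler, or run-time scheduler consult it when making decisions.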

  5. Feedback-Driven Optimization: Autotuning • Use Ocelot’s dynamic instrumentation capability • Real-time feedback drives the Ocelot kernel JIT • Decision models to drive existing/new auto-tuners • Change data layout to improve memory efficiency • Use different algorithms • Selective invocation → hot path profiling → algorithm selection [Figure labels: Workload Characterization, Decision Models, Not available with CUPTI, Measurements, Code Generation]
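
The algorithm-selection step can be pictured as a small feedback loop: measure each candidate on a sample input, then pick the cheapest. The sketch below is a generic illustration and does not use Ocelot's instrumentation API. `select_variant`, `measure_seconds`, and the two example variants are hypothetical names introduced here.

```python
import time


def measure_seconds(fn, arg, repeats=3):
    """Best-of-N wall-clock time for fn(arg), used as the feedback signal."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        best = min(best, time.perf_counter() - start)
    return best


def select_variant(variants, sample, measure=measure_seconds):
    """Measure every candidate on a sample and return (best name, all costs)."""
    costs = {name: measure(fn, sample) for name, fn in variants.items()}
    best = min(costs, key=costs.get)
    return best, costs


def sum_loop(values):
    """Candidate 1: explicit accumulation loop."""
    total = 0
    for v in values:
        total += v
    return total


def sum_builtin(values):
    """Candidate 2: the built-in sum."""
    return sum(values)


variants = {"loop": sum_loop, "builtin": sum_builtin}
chosen, costs = select_variant(variants, list(range(10000)))
```

The `measure` parameter is injectable so that a different feedback source, such as counters gathered by instrumentation, could replace wall-clock timing without changing the selection logic.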

  6. Feedback-Driven Resource Management Applications • Real-time customized information available about GPU usage • Can drive scheduling decisions • Can drive management policies, e.g., power, throughput, etc. [Figure labels: Ocelot, Ocelot’s Lynx, Management Layer, Instrumentation, GPU Clusters, PTX, Instrumented PTX, Instrumentation APIs, C-on-Demand JIT, Instrumentor, C-PTX Translator, PTX-PTX Transformer]

  7. Domain Specific Compilation: Red Fox • Joint with LogicBlox Inc. • Targeting Accelerator Clouds for meeting the demands of data warehousing applications [Figure labels: Datalog Queries, LogicBlox Front-End, Language Front-End, src-src Optimization, Datalog-to-RA (nvcc + RA-Lib), Translation Layer, RA Primitives, Harmony Kernel IR, IR Optimization, Harmony, Machine Neutral Back-End, Ocelot]
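
To make the Datalog-to-RA step concrete, the sketch below evaluates one Datalog rule with three relational algebra (RA) primitives over sets of tuples. It is a toy illustration only: these functions are not Red Fox's RA primitives or RA-Lib, and the rule, relation names, and data are hypothetical.

```python
def select(rel, pred):
    """RA selection: keep the tuples that satisfy pred."""
    return {t for t in rel if pred(t)}


def project(rel, cols):
    """RA projection: keep only the listed column positions."""
    return {tuple(t[c] for c in cols) for t in rel}


def join(left, right, lcol, rcol):
    """RA equi-join: concatenate tuple pairs where left[lcol] == right[rcol]."""
    return {l + r for l in left for r in right if l[lcol] == r[rcol]}


# Hypothetical Datalog rule:
#   two_hop(x, z) :- edge(x, y), edge(y, z).
# One possible RA form: join edge with itself on y, then project (x, z).
edge = {(1, 2), (2, 3), (3, 4)}
two_hop = project(join(edge, edge, 1, 0), (0, 3))

# Selection applied on top, keeping only the hops that start at node 1.
from_one = select(two_hop, lambda t: t[0] == 1)
```

Once a query is expressed with a small fixed set of primitives like these, each primitive can be implemented and optimized separately for the target hardware.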

  8. Thank You • Questions?
