1 / 9

Satisfying Data-Intensive Queries Using GPU Clusters

Satisfying Data-Intensive Queries Using GPU Clusters. Haicheng Wu, Jeff Young Sudhakar Yalamanchili Computer Architecture and Systems Laboratory Center for Experimental Research in Computer Systems School of Electrical and Computer Engineering Georgia Institute of Technology.

shania
Télécharger la présentation

Satisfying Data-Intensive Queries Using GPU Clusters

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Satisfying Data-Intensive Queries Using GPU Clusters Haicheng Wu, Jeff Young Sudhakar Yalamanchili Computer Architecture and Systems Laboratory Center for Experimental Research in Computer Systems School of Electrical and Computer Engineering Georgia Institute of Technology Sponsors: AIC, AMD, LogicBlox Inc., National Science Foundation, NEC, NVIDIA

  2. Application: Data Warehousing • On-line and off-line analysis • Retail analysis • Forecasting • Pricing • Etc… • Combination of relational data queries and computational kernels • Current applications process 1 to 50 TBs of data [1] • Techniques can be applied to other “Big Data” problems like irregular graphs, sorting …… LargeQty(p) <- Qty(q), q > 1000. …… [1] Independent Oracle Users Group. A New Dimension to Data Warehousing: 2011 IOUG Data Warehousing Survey.

  3. Proposed System Model • Red Fox: Compilation and optimization of queries for GPUs • Remove need for application developer to optimize applications to run on GPUs • Oncilla: Global Address Space (GAS) layer • Create an API to simplify data movement and scheduling

  4. Red Fox Compilation Flow Datalog Queries Primitives Library RA-to-PTX (nvcc + RA-Lib) LogicBlox Front-End Runtime Manager Kernel Weaver Translation Layer Back-End Language Front-End Query Plan PTX/Binary Kernel

  5. Relational Algebra Primitives on GPUs Raw Performance (NVIDIA C2050) Fastest known for GPUs! • Multi-stage algorithm • Simple primitives are close to maximum performance • More complex primitives could show better performance with newer implementations (in progress with NVIDIA Research) Practical MAX

  6. Red Fox: TPC-H Q1 Results • GPU computation scales well with problem size • Improved primitives could lead to further 10x speedup

  7. Oncilla: Fabrics for Accelerator Clouds Companies: LogicBlox, NVIDIA, AIC • Goal: Transparent, efficient host memory aggregation across node for accelerators • Solution: Use Global Address Spaces (GAS) and commodity fabrics (HT, QPI, PCIe, 10GE, IB) • Support in-core databases using software from Red Fox project

  8. Oncilla: Efficient Data Movement • Oncilla aims to combine support for multiple types of data transfer and CUDA-based optimizations under a simplified runtime. • Ex: “oncilla_malloc(2 GB, node2, gpumem)” • Enable application developers and schedulers to take advantage of high-performance GAS without needing to be experts in specialized hardware

  9. Questions? For more information: Red Fox: H. Wu, G. Diamos, H. Cadambi, and S. Yalamanchili, “KernelWeaver: Automatically Fusing Database Primitives for Efficient GPU Computation,” MICRO, December 2012 http://gpuocelot.gatech.edu/projects/compiler-projects/ Oncilla: http://gpuocelot.gatech.edu/projects/compiler-projects/oncilla-gas-infrastructure/ J. Young, S. Yalamanchili, Commodity Converged Fabrics for Global Address Spaces in Accelerator Clouds,” HPCC, June, 2012

More Related