Cache-Conscious Structure Definition

Cache-ConsciousStructure Definition By Trishul M. Chilimbi, Bob Davidson, and James R. Larus Presented by Shelley Chen March 10, 2003

Motivation • Processor-Memory Performance Gap • Growing at 53% per year! • Chilimbi and Larus previous work • Placed objects with high temporal locality in the same cache block • Works best with objects < ½ cache block • Current paper proposes techniques for larger data structures

Improving Cache Performance

Two Cache-Conscious Definition Techniques • Structure Splitting • Split large data structures into two smaller structures • Field Reordering • Group fields in a structure with high temporal locality into the same cache block

Structure Splitting • Split large structures into two smaller structures • “hot” structure contains frequently accessed fields • “cold” structure contains rarely accessed fields • Allow more “hot” structures to fit into the cache • Has been done manually in the past, this paper is first to automate

Class Splitting Overview

Experimental Setup • Ran compiled programs on Sun Ultraserver E5000 • 2 GB of memory • L1 dcache, 16 KB DM, 16 byte blocks • L2 cache, 1 MB DM, 64 byte blocks

Results: Structure Splitting Reduces L2 miss rates by 10-27%; improves execution time by 10-20%

Field Reordering • Logical ordering of the program is not usually consistent with its data access patterns • Frequently accessed fields may be coded next to rarely accessed fields, putting them in the same cache block • cause excessive cache misses • Reorder field definitions of structure • fields with high temporal affinity in same cache block

bbcache • Tool that produces structure field reordering recommendations • Construct a database containing both static and dynamic information about the structure field accesses • Process database to construct field affinity graphs for each structure • Produce the structure field order recommendations for the affinity graphs • Attempts to group fields with high temporal affinity into the same cache block

Experimental Setup • 4 processor 400MHz Pentium II Xeon system • 1MB L2 cache/processor • 4GB memory • Ran TPC-C on Microsoft SQL Server 7.0 to collect traces as input to bbcache • Chose 5 structures which showed largest potential from the benefits of reordering

Results: Field Reordering Modified SQL Server was consistently better by 2-3% Cache block pressure =Σ (b1, …,bn)/n Cache block utilization = Σ(f11, …, fnbn)/ Σ(b1, …,bn)

Conclusion • 2 techniques to improve cache performance • change the internal organization of fields in a data structure • Structure splitting • Field reordering • Structure Layouts better left to compiler • Easier to determine field access order

Questions?

Class Splitting Algorithm

Structure Access Database

Building Field Affinity Graphs

Cache-Conscious Structure Definition

Cache-Conscious Structure Definition

Presentation Transcript

Conscious Sedation

Cache-Conscious Data Placement

Cache-Conscious Algorithms and Data Structures

Conscious Discipline

Conscious Discipline

Conscious Capitalism (CC) Conscious Business (CB)

A Comparison of Cache-conscious and Cache-oblivious Programs

Cache-Conscious Wavefront Scheduling

Conscious Sedation

Effect of Node Size on the Performance of Cache-Conscious B+ Trees

Making B+-Trees Cache Conscious in Main Memory

Cache-Conscious Data Placement

Cache-conscious Frequent Pattern Mining on a Modern Processor

Cache-Conscious Performance Optimization for Similarity Search

Topics 10: Cache Conscious Indexes

DEFINITION AND PROPERTIES OF ANTIGEN IMMUNOLOGICAL DEFINITION Any chemical structure

Conscious clubbing

SDG Data Structure Definition

Efficient Similarity Search with Cache-Conscious Data Traversal

Cache-Conscious Wavefront Scheduling

Conscious Sedation