
HDF5 Chunking



Presentation Transcript


  1. HDF5 Chunking Performance issues HDF5 Workshop at PSI

  2. Goal • To help you understand how HDF5 chunking works, so you can efficiently store and retrieve data from HDF5 files

  3. Outline • HDF5 chunking overview • I/O considerations • HDF5 chunk cache • Case study, or how to avoid performance pitfalls • Future directions • Current THG efforts • Proposals in the works

  4. HDF5 chunking overview

  5. HDF5 Dataset Components • A dataset consists of metadata and raw data • Metadata (dataset header): Dataspace – rank 3, dimensions Dim_1 = 4, Dim_2 = 5, Dim_3 = 7; Datatype – IEEE 32-bit float; Attributes – Time = 32.4, Pressure = 987, Temp = 56; Storage info – chunked, compressed • Raw data: the dataset data array

  6. Contiguous storage layout • In application memory, the dataset array data; in the metadata cache (MDC), the dataset header • In the HDF5 file, the dataset header is followed by the raw data, which is stored in one contiguous block

  7. What is HDF5 chunking? • Data is stored in chunks of predefined size • The two-dimensional case is sometimes referred to as data tiling • The HDF5 library always writes/reads a whole chunk

  8. Chunked storage layout • In application memory, the dataset array data; in the metadata cache (MDC), the dataset header and chunk index • In the HDF5 file, raw data is stored in separate chunks (A, B, C, D), located via the chunk index

  9. Why HDF5 chunking? • Chunking is required for several HDF5 features • Applying compression and other filters, such as checksum • Expanding/shrinking dataset dimensions and adding/"deleting" data

  10. Why HDF5 chunking? • If used appropriately, chunking improves partial I/O for big datasets • Only two chunks are involved in the I/O shown

  11. Creating a chunked dataset • Programming model: create a dataset creation property list, set it to use the chunked storage layout, and create the dataset with that property list
  dcpl_id = H5Pcreate(H5P_DATASET_CREATE);
  rank = 2;
  ch_dims[0] = 100;
  ch_dims[1] = 200;
  H5Pset_chunk(dcpl_id, rank, ch_dims);
  dset_id = H5Dcreate(…, dcpl_id);
  H5Pclose(dcpl_id);

  12. Chunked dataset • Things to remember: • A chunk always has the same rank as its dataset • A chunk's dimensions do not need to be factors of the dataset's dimensions • Caution: this may cause more I/O than desired (see the grey portions of the chunks below)

  13. Writing or reading a chunked dataset • The chunked storage mechanism is transparent to the application • Use the same set of operations as for a contiguous dataset, for example, H5Dopen(…); H5Sselect_hyperslab(…); H5Dread(…); • Selections do not need to align precisely with chunk boundaries

  14. Extending a dataset • Chunked storage is required • Programming model • Set the new extent with H5Dset_extent • Select a hyperslab • Write the data

  15. Applying HDF5 filters • Chunked storage is required • Types of filters • Algebraic data transformation • Data shuffling • Checksum • Data compression • Scale + offset • N-bit • GZIP (deflate) • SZIP • Compression methods supported by the HDF5 user community

  16. Applying filters to a dataset
  dcpl_id = H5Pcreate(H5P_DATASET_CREATE);
  cdims[0] = 100;
  cdims[1] = 100;
  H5Pset_chunk(dcpl_id, 2, cdims);
  H5Pset_shuffle(dcpl_id);
  H5Pset_deflate(dcpl_id, 9);
  dset_id = H5Dcreate(…, dcpl_id);
  H5Pclose(dcpl_id);

  17. Check point • Can I change the chunk size after a dataset is created? • No • Use h5repack to change the storage layout or chunking/compression parameters • Why shouldn't I make a chunk with dimension sizes equal to one? • Next slide…

  18. Pitfall – chunk size • Chunks are too small • The file has too many chunks • Extra metadata increases file size • Extra time to look up each chunk • More I/O, since each chunk is stored independently • Larger chunks result in fewer chunk lookups, smaller file size, and fewer I/O operations

  19. Pitfall – chunk size • Chunks are too large • For example, setting the chunk size equal to the dataset size just to enable compression on an otherwise contiguous dataset • The entire chunk has to be read and uncompressed before performing any operation • Great performance penalty for reading a small subset • The entire chunk has to be in memory and may cause the OS to page memory to disk, slowing down the entire system

  20. I/O considerations • Motivation for using chunked storage and the chunk cache

  21. Accessing a row in a contiguous dataset • One seek is needed to find the starting location of a row of data • Data is read/written using one disk access

  22. Accessing a row in a chunked dataset • Five seeks are needed, one for each chunk • Data is read/written using five disk accesses • For this access pattern, chunked storage is less efficient than contiguous storage

  23. Check point • How might I improve this situation, if it is common to access my data in this way?

  24. Accessing data in a contiguous dataset of M rows • M seeks are needed to find the starting location of each element • Data is read/written using M disk accesses • Performance may be very bad

  25. Motivation for chunked storage • Two seeks are needed to find the two chunks • Data is read/written using two disk accesses • For this pattern, chunking helps with I/O performance

  26. Check point • If I know I shall always access a column at a time, what size and shape should I make my chunks?

  27. Motivation for the chunk cache • The selection shown is written by two H5Dwrite calls (one for each row) • Chunks A and B are each accessed twice (once for each row) • If both chunks fit into the cache, only two I/O accesses are needed to write the shown selection

  28. Motivation for the chunk cache • Question: what happens if there is space for only one chunk at a time?

  29. HDF5 chunk cache

  30. HDF5 raw data chunk cache • Improves performance whenever the same chunks are read or written multiple times • The chunk cache is per-dataset

  31. Reading a row selection • (Figure: chunks A, B, and C move from the HDF5 file into the chunk cache and then into the application buffer)

  32. HDF5 chunk cache APIs • H5Pset_chunk_cache sets raw data chunk cache parameters for a dataset: H5Pset_chunk_cache(dapl, …); • H5Pset_cache sets raw data chunk cache parameters for all datasets in a file: H5Pset_cache(fapl, …); • Parameters that control the chunk cache (defaults in parentheses) • nbytes – total size in bytes (1MB) • nslots – number of slots in a hash table (521) • w0 – preemption policy (0.75)

  33. HDF5 raw data chunk cache • A chunk stays in cache until it is evicted, either because space is needed for new chunks or because hash values collide • The chunk preemption policy is defined by the parameter w0, 0 ≤ w0 ≤ 1 • It determines when to evict fully read or written chunks from cache • 0 means fully read or written chunks are treated no differently than other chunks (the preemption is strictly LRU) • 1 means fully read or written chunks are always preempted before other chunks • If the application only reads or writes data once, w0 can safely be set to 1; otherwise, set it lower, depending on how often you re-read or re-write the same data

  34. HDF5 raw data chunk cache • The current implementation doesn't adjust parameters (cache size, size of the hash table) automatically • Chunks are indexed with a simple hash table • Hash function = (cindex mod nslots), where cindex is the linear index into a hypothetical array of chunks and nslots is the size of the hash table • Only one of several chunks with the same hash value stays in cache • nslots should be a prime number to minimize the number of hash value collisions

  35. Pitfall – improper hash table size (nslots) • What will happen when nslots = 5 and the application reads by columns? Chunk A will be evicted as soon as data from chunk B is needed; the same happens with B and C; C in turn will be evicted as soon as the second row is read, and so on. A lot of chunk swapping!

  36. Pitfall – cache is not big enough (nbytes) • All chunks in the selection will be read into the cache, then written to disk (if writing) and evicted • If the application revisits the same chunks, they will have to be read and possibly written again

  37. Pitfall – cache is not big enough (nbytes) • M rows spanning chunks A and B; each row is read by a separate call to H5Dread • If both chunks fit into the cache, 2 disk accesses are needed to read the data • If only one chunk fits into the cache, 2M disk accesses are needed (compare with M accesses for contiguous storage!)

  38. Hints for chunk cache settings • Chunk dimension sizes should align as closely as possible with the hyperslab dimensions used for read/write • The chunk cache size (nbytes) should be large enough to hold all the chunks in a selection • If this is not possible, it may be best to disable chunk caching altogether (set nbytes to 0) • nslots should be a prime number that is at least 10 to 100 times the number of chunks that can fit into nbytes • w0 should be set to 1 if chunks that have been fully read/written will never be read/written again

  39. Case study • Better to see something once than hear about it a hundred times

  40. Case study: writing a chunked dataset • 1000x100x100 dataset • 4-byte integers • Random values 0–99 • 50x100x100 chunks (20 total) • Chunk size: 2 MB • Write the entire dataset using 1x100x100 slices • Slices are written sequentially • Chunk cache size of 1MB (the default) compared with a chunk cache size of 5MB

  41. Test setup • 20 chunks • 1000 slices • Chunk size ~ 2MB • Total size ~ 40MB • Each slice ~ 40KB

  42. Aside: writing the dataset with contiguous storage • 1000 disk accesses to write 1000 planes • Total size written: 40MB

  43. Writing the chunked dataset • Example: chunk fits into cache • Each chunk is filled in the cache and then written to disk • 20 disk accesses are needed (vs. 1000 for contiguous storage) • Total size written: 40MB

  44. Writing the chunked dataset • Example: chunk doesn't fit into cache • For each chunk (20 total): fill the chunk in memory with the first plane and write it to the file, then write the 49 remaining planes to the file directly • Total disk accesses: 20 x (1 + 49) = 1000 • Total data written: ~80MB (vs. 40MB) November 3-5, 2009, HDF/HDF-EOS Workshop XIII

  45. Writing a compressed chunked dataset • Example: chunk fits into cache • For each chunk (20 total): fill the chunk in memory, compress it, and write it to the file • Total disk accesses: 20 • Total data written: less than 40MB

  46. Writing a compressed chunked dataset • Example: chunk doesn't fit into cache • For each chunk (20 total): fill the chunk with the first plane, compress it, and write it to the file; then, for each of the 49 remaining planes: read the chunk back, fill it with the plane, compress, and write the chunk to the file • Total disk accesses: 20 x (1 + 2x49) = 1980 • Total data written and read? (see next slide) • Note: HDF5 could probably detect such behavior and increase the cache size

  47. Effect of chunk cache size on write • (Charts: write performance with no compression and with gzip compression; chunk size is 2MB)

  48. Effect of chunk cache size on write • With the 1MB cache size, a chunk will not fit into the cache • All writes to the dataset must be immediately written to disk • With compression, the entire chunk must be read and rewritten every time a part of the chunk is written to • Data must also be decompressed and recompressed each time • Non-sequential writes could result in a larger file • Without compression, the entire chunk must be written when it is first written to the file • If the selection were not contiguous on disk, it could require as much as one disk access per element

  49. Effect of chunk cache size on write • With the 5MB cache size, a chunk is written only after it is full • This drastically reduces the number of I/O operations • Reduces the amount of data that must be written (and read) • Reduces processing time, especially with a compression filter

  50. Conclusion • It is important to make sure that a chunk fits into the raw data chunk cache • If you will be writing to multiple chunks at once, increase the cache size accordingly • Try to design chunk dimensions to minimize the number of chunks you write to at once
