
Introduction to HDF5

Introduction to HDF5. HDF & HDF-EOS Workshop XII, October 15, 2008. Topics covered: introduce HDF5; describe the HDF5 data and programming models; walk through example code.


Presentation Transcript


  1. Introduction to HDF5. HDF & HDF-EOS Workshop XII, October 15, 2008

  2. Topics Covered
     • Introduce HDF5
     • Describe HDF5 Data and Programming Models
     • Walk Through Example Code

  3. For More Information … All workshop slides will be available from: http://hdfeos.org/workshops/ws12/workshop_twelve.php

  4. What is HDF5? HDF = Hierarchical Data Format
     • Data model, library and file format for managing data
     • Tools for accessing data in the HDF5 format

  5. Brief History of HDF
     • 1987: At NCSA (University of Illinois), a task force formed to create an architecture-independent format and library: AEHOO (All Encompassing Hierarchical Object Oriented format). Became HDF.
     • Early 1990's: NASA adopted HDF for Earth Observing System project.
     • 1996: DOE's ASC (Advanced Simulation and Computing) Project began collaborating with the HDF group (NCSA) to create "Big HDF" (increase in computing power of DOE systems at LLNL, LANL and Sandia National labs required bigger, more complex data files). "Big HDF" became HDF5.
     • 1998: HDF5 was released with support from National Labs, NASA, NCSA.
     • 2006: The HDF Group spun off from University of Illinois as non-profit corporation.

  6. Why HDF5? In one sentence ...

  7. Answering big questions … Matter and the universe; life and nature; weather and climate. [Figure: Total Column Ozone (Dobson), August 24, 2001 and August 24, 2002]

  8. … involves big data …

  9. … varied data … [Figure credit: LCI Tutorial; thanks to Mark Miller, LLNL]

  10. … and complex relationships … [Figure labels: SNP Score, Contig Summaries, Discrepancies, Contig Qualities, Coverage Depth, Trace, Reads, Aligned bases, Read quality, Contig, Percent match]

  11. … on big computers … and small computers …

  12. How do we…
     • Describe our data?
     • Read it? Store it? Find it? Share it? Mine it?
     • Move it into, out of, and between computers and repositories?
     • Achieve storage and I/O efficiency?
     • Give applications and tools easy access to our data?

  13. Solution: HDF5!
     • Can store all kinds of data in a variety of ways
     • Runs on most systems
     • Lots of tools to access data
     • Emphasis on standards (HDF-EOS, CGNS)
     • Library and format emphasis on I/O efficiency and storage

  14. Structure of HDF5 Library (layers, top to bottom)
     • Applications
     • Object API (C, F90, C++, Java)
     • Library internals
     • Virtual file I/O
     • File or other “storage”

  15. HDF Tools
     - HDFView and Java Products
     - Command-line utilities (h5dump, h5ls, h5cc, h5diff, h5repack)

  16. HDF5 layers, from applications down to storage
     • Applications & Domains: simulation, visualization, remote sensing… Examples: thermonuclear simulations, product modeling, data mining tools, visualization tools, climate models
     • Communities: HDF-EOS, CGNS, ASC
     • HDF5 Data Model & API
     • Virtual File Layer (I/O Drivers): Stdio, Split Files, MPI I/O, Custom
     • Storage (HDF5 format): File, Split metadata and raw data files, File on parallel file system, User-defined device

  17. Lots of Layers in HDF5! “Ogres are like onions.” (Shrek) HDF5 Monster?? Just like Shrek, once you get to know HDF5 you will really like it!!

  18. The HDF5 Format

  19. An HDF5 file is a container… …into which you can put your data objects.
     [Figure: example data objects, including a palette and the table below]
       lat | lon | temp
       ----|-----|-----
        12 |  23 |  3.1
        15 |  24 |  4.2
        17 |  21 |  3.6

  20. HDF5 Structures for Organizing Objects
     [Figure: “/” (root) group and group “foo” organizing a 3-D array, a 2-D array, two raster images, a palette, and the table below]
       lat | lon | temp
       ----|-----|-----
        12 |  23 |  3.1
        15 |  24 |  4.2
        17 |  21 |  3.6

  21. HDF5 Data Model
     Primary Objects
     • Groups
     • Datasets
     Additional ways to organize and annotate data
     • Attributes
     • Storage and access properties
     Everything else is built from these parts.

  22. HDF5 Dataset
     Metadata
     • Dataspace: Rank = 3; Dimensions: Dim_1 = 4, Dim_2 = 5, Dim_3 = 7
     • Datatype: Integer
     • Attributes: Time = 32.4, Pressure = 987, Temp = 56
     • Storage info: Chunked, Compressed
     Data

  23. Dataspaces
     Two roles:
     • Dataspace contains spatial info about a dataset stored in a file
       • Rank and dimensions
       • Permanent part of dataset definition
     • Partial I/O: Dataspace describes application’s data buffer and data elements participating in I/O
     [Figure: two example dataspaces, one with Rank = 2, Dimensions = 4x6 and one with Rank = 1, Dimension = 10]

  24. Write – from memory to disk [Figure: data moving from memory to disk]

  25. Partial I/O: move just part of a dataset
     (a) Slab from a 2D array (disk) to the corner of a smaller 2D array (memory)
     (b) Regular series of blocks from a 2D array (disk) to a contiguous sequence at a certain offset in a 1D array (memory)
     The number of elements selected in each must be the same.
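
     The two transfer patterns on slide 25 can be modeled without the HDF5 library. The sketch below is plain Python, not the HDF5 API: the function name `hyperslab_indices`, the 4 x 6 array contents, and all offsets are assumptions chosen for illustration. The parameter names (start, stride, count, block) follow the vocabulary the HDF5 library uses for regular selections, but nothing here calls the library.

```python
from itertools import product

def hyperslab_indices(start, stride, count, block):
    """Return the index tuples selected by a regular block pattern.

    In each dimension, `count` blocks of `block` elements are selected;
    consecutive block origins are `stride` elements apart, beginning at `start`.
    """
    per_dim = []
    for s, st, c, b in zip(start, stride, count, block):
        per_dim.append([s + i * st + j for i in range(c) for j in range(b)])
    return list(product(*per_dim))

# Hypothetical 4 x 6 "dataset on disk" holding 1..24 in row-major order.
disk = [[r * 6 + c + 1 for c in range(6)] for r in range(4)]

# (a) A 2 x 3 slab starting at (1, 2), copied to the corner of a 3 x 4 array.
slab = hyperslab_indices(start=(1, 2), stride=(1, 1), count=(2, 3), block=(1, 1))
small = [[0] * 4 for _ in range(3)]
for r, c in slab:
    small[r - 1][c - 2] = disk[r][c]

# (b) A regular series of 1 x 2 blocks, copied to a contiguous run
#     starting at offset 3 of a 1-D buffer.
blocks = hyperslab_indices(start=(0, 1), stride=(2, 3), count=(2, 2), block=(1, 2))
memory = [0] * 12
offset = 3
if offset + len(blocks) > len(memory):
    raise ValueError("source and destination selections must hold the same number of elements")
for k, (r, c) in enumerate(blocks):
    memory[offset + k] = disk[r][c]

print(small)
print(memory)
```

     In both cases the source and destination selections contain the same number of elements (6 for the slab, 8 for the blocks), which is the constraint stated on the slide.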

  26. Datatypes (array elements)
     • Datatype – how to interpret a data element
     • Permanent part of the dataset definition
     • Two classes: atomic and compound

  27. Datatypes
     • HDF5 atomic types include:
       • integer & float
       • user-definable (e.g., 13-bit integer)
       • variable length types (e.g., strings)
       • references to objects/dataset regions
       • enumeration - names mapped to integers
     • HDF5 compound types
       • Comparable to C structs (“records”)
       • Members can be atomic or compound types

  28. HDF5 dataset: array of records
     • Dimensionality: 5 x 3
     • Datatype: Record with members int8, int4, int16, and a 2x3x2 array of float32
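
     To make the "array of records" idea concrete, the sketch below packs one such record with Python's standard `struct` module. It is a model of the record layout only, not HDF5 code. Two assumptions are made here that the slide does not state: "int4" is read as a 4-byte integer, and the members are stored little-endian with no padding. The helper names `pack_record` and `unpack_record` are hypothetical.

```python
import struct

# Assumed layout: "<" = little-endian, no padding;
# b = int8, i = 4-byte integer (assumed meaning of "int4"), h = int16,
# 12f = the 2x3x2 array of float32 flattened to 12 values.
RECORD = struct.Struct("<bih12f")

def pack_record(a, b, c, cube):
    """Serialize one record; `cube` is a 2 x 3 x 2 nested list of floats."""
    flat = [x for plane in cube for row in plane for x in row]
    return RECORD.pack(a, b, c, *flat)

def unpack_record(buf):
    """Inverse of pack_record: rebuild the scalars and the 2 x 3 x 2 array."""
    a, b, c, *flat = RECORD.unpack(buf)
    cube = [[flat[p * 6 + r * 2 : p * 6 + r * 2 + 2] for r in range(3)]
            for p in range(2)]
    return a, b, c, cube

# Example values chosen to be exactly representable as float32.
cube = [[[p + r * 0.5 + k * 0.25 for k in range(2)] for r in range(3)]
        for p in range(2)]
buf = pack_record(1, 70000, -2, cube)

record_bytes = RECORD.size          # bytes per record under the assumed layout
dataset_bytes = 5 * 3 * RECORD.size  # a 5 x 3 dataset holds 15 such records
print(record_bytes, dataset_bytes)
```

     Each element of the 5 x 3 dataset is one whole record, so the datatype describes the record and the dataspace describes the 5 x 3 arrangement.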

  29. Properties
     • Properties are characteristics of HDF5 objects that can be modified
     • Default properties handle most needs
     • By changing properties, you can take advantage of the more powerful features in HDF5

  30. Special Storage Properties
     • chunked: Better subsetting access time; extensible
     • compressed: Improves storage efficiency, transmission speed
     • extensible: Arrays can be extended in any direction
     • split file: Metadata in one file, raw data in another (figure: for dataset “Fred”, File A holds the metadata for Fred and File B holds the data for Fred)
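
     One way to see why chunking can help subsetting is to count how many chunks a partial read overlaps. The sketch below is a plain-Python model, not the HDF5 API. It reuses the 4 x 5 x 7 dimensions from the dataset slide; the chunk shape (2, 2, 4) and the function names are assumptions made for illustration, and real library behavior (caching, compression filters) is not modeled.

```python
import math
from itertools import product

def chunk_grid(dims, chunk):
    """Number of chunks along each dimension (edge chunks may be partial)."""
    return tuple(math.ceil(d / c) for d, c in zip(dims, chunk))

def chunk_of(index, chunk):
    """Grid coordinates of the chunk that contains one element."""
    return tuple(i // c for i, c in zip(index, chunk))

def chunks_touched(start, count, chunk):
    """Grid coordinates of every chunk overlapped by a contiguous slab."""
    ranges = [range(s // c, (s + n - 1) // c + 1)
              for s, n, c in zip(start, count, chunk)]
    return list(product(*ranges))

dims = (4, 5, 7)     # Dim_1, Dim_2, Dim_3 from the dataset slide
chunk = (2, 2, 4)    # hypothetical chunk shape

grid = chunk_grid(dims, chunk)
total_chunks = math.prod(grid)

# Reading the single plane with first index 0 (a 1 x 5 x 7 slab).
plane = chunks_touched(start=(0, 0, 0), count=(1, 5, 7), chunk=chunk)

print(grid, total_chunks, len(plane), chunk_of((3, 4, 6), chunk))
```

     Under these assumptions the plane overlaps half of the chunks, so a reader that fetches whole chunks would touch only part of the stored data.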

  31. Attributes (optional)
     • Attribute – data of the form “name = value”, attached to an object
     • Operations similar to dataset operations, but …
       • Not extensible
       • No compression or partial I/O
     • Can be overwritten, deleted, added during the “life” of a dataset

  32. HDF5 Dataset (again)
     Metadata
     • Dataspace: Rank = 3; Dimensions: Dim_1 = 4, Dim_2 = 5, Dim_3 = 7
     • Datatype: Integer
     • Attributes: Time = 32.4, Pressure = 987, Temp = 56
     • Storage info: Chunked, Compressed
     Data

  33. Groups
     • A mechanism for organizing collections
     • Every file starts with a root group
     • Similar to UNIX directories
     • Can have attributes
     [Figure: tree rooted at “/” with nodes labeled A, B, C, l, k, m]

  34. Path to HDF5 Object in a File
     • / (root)
     • /x
     • /foo
     • /foo/temp
     • /foo/bar/temp
     [Figure: root group “/” containing x and foo; foo contains temp and bar; bar contains temp]

  35. Shared Objects
     • /A/P
     • /B/R
     • /C/P
     [Figure: root group “/” with groups A, B, C and links named P, R, P]
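
     Paths and shared objects can be modeled with nested mappings. The sketch below is plain Python, not the HDF5 API: the `Group` and `Dataset` classes and the `resolve` function are hypothetical names. The tree for /x, /foo, /foo/temp and /foo/bar/temp follows the path slide. Which of /A/P, /B/R and /C/P refer to the same object cannot be recovered from the transcript, so treating /A/P and /B/R as two names for one object is an assumption made only to show the idea.

```python
class Dataset:
    """Stand-in for a dataset; carries only a label for identification."""
    def __init__(self, label):
        self.label = label

class Group(dict):
    """A group modeled as a mapping from link name to a group or dataset."""

def resolve(root, path):
    """Follow an absolute path such as '/foo/bar/temp' from the root group."""
    obj = root
    for name in path.strip("/").split("/"):
        if name:              # the path "/" has no components
            obj = obj[name]
    return obj

# Tree from the path slide.
root = Group(
    x=Dataset("x"),
    foo=Group(temp=Dataset("foo temp"), bar=Group(temp=Dataset("bar temp"))),
)

# Shared object (assumed sharing pattern): one dataset, two link names.
shared = Dataset("shared")
root["A"] = Group(P=shared)
root["B"] = Group(R=shared)
root["C"] = Group(P=Dataset("other"))

print(resolve(root, "/foo/bar/temp").label)
print(resolve(root, "/A/P") is resolve(root, "/B/R"))
```

     The two datasets named temp are distinct objects because they live in different groups, while a shared object is a single object that can be reached by more than one path.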

  36. Questions So Far?

  37. Useful Tools For New Users
     • h5dump: Tool to “dump” or display contents of HDF5 files
     • h5cc, h5c++, h5fc: Scripts to compile applications
     • HDFView: Java browser to view HDF4 and HDF5 files

  38. H5dump Command-line Utility To View HDF5 File
     h5dump [--header] [-a <names>] [-d <names>] [-g <names>] [-l <names>] [-t <names>] [-p] <file>
       --header     Display header only; no data is displayed.
       -a <names>   Display the specified attribute(s).
       -d <names>   Display the specified dataset(s).
       -g <names>   Display the specified group(s) and all the members.
       -l <names>   Displays the value(s) of the specified soft link(s).
       -t <names>   Display the specified named datatype(s).
       -p           Display properties.
     <names> is one or more appropriate object names.

  39. Example of h5dump Output
     [Figure: root group “/” containing the dataset ‘dset’]
     HDF5 "dset.h5" {
     GROUP "/" {
        DATASET "dset" {
           DATATYPE { H5T_STD_I32BE }
           DATASPACE { SIMPLE ( 4, 6 ) / ( 4, 6 ) }
           DATA {
              1, 2, 3, 4, 5, 6,
              7, 8, 9, 10, 11, 12,
              13, 14, 15, 16, 17, 18,
              19, 20, 21, 22, 23, 24
           }
        }
     }
     }
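
     The h5dump listing can be read as follows: the dataspace is 4 x 6, the values are listed in row-major order, and H5T_STD_I32BE denotes 32-bit signed integers stored big-endian. The sketch below illustrates that reading with Python's standard `struct` module. It does not open dset.h5 or use the HDF5 library; the byte string is only a model of how 24 such values would look as raw big-endian data, ignoring file headers and storage layout. The function name `element` is hypothetical.

```python
import struct

ROWS, COLS = 4, 6                           # DATASPACE { SIMPLE ( 4, 6 ) / ( 4, 6 ) }
values = list(range(1, ROWS * COLS + 1))    # the 24 values in the DATA block

# ">" = big-endian, "i" = 32-bit signed integer, matching H5T_STD_I32BE.
raw = struct.pack(f">{len(values)}i", *values)

def element(raw, row, col):
    """Read one element from the row-major big-endian buffer."""
    offset = (row * COLS + col) * 4         # 4 bytes per 32-bit integer
    return struct.unpack_from(">i", raw, offset)[0]

# Rebuild the 4 x 6 grid that the flat listing represents.
grid = [values[r * COLS:(r + 1) * COLS] for r in range(ROWS)]

print(len(raw), element(raw, 2, 3), grid[3])
```

     With row-major ordering, the element at row r and column c of this dataset is the value r * 6 + c + 1.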

  40. HDF5 Compile Scripts
     • h5cc – HDF5 C compiler command
     • h5fc – HDF5 F90 compiler command
     • h5c++ – HDF5 C++ compiler command
     To compile:
       % h5cc h5prog.c
       % h5fc h5prog.f90

  41. Compile option: -show
     -show: displays the compiler commands and options without executing them
       % h5cc -show Sample_c.c
       gcc -I/home/packages/hdf5_1.6.6/Linux_2.6/include -UH5_DEBUG_API -DNDEBUG -I/home/packages/szip/static/encoder/Linux2.6-gcc/include -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_POSIX_SOURCE -D_BSD_SOURCE -std=c99 -Wno-long-long -O -fomit-frame-pointer -finline-functions -c Sample_c.c
       gcc -std=c99 -Wno-long-long -O -fomit-frame-pointer -finline-functions -L/home/packages/szip/static/encoder/Linux2.6-gcc/lib Sample_c.o -L/home/packages/hdf5_1.6.6/Linux_2.6/lib /home/packages/hdf5_1.6.6/Linux_2.6/lib/libhdf5_hl.a /home/packages/hdf5_1.6.6/Linux_2.6/lib/libhdf5.a -lsz -lz -lm -Wl,-rpath -Wl,/home/packages/hdf5_1.6.6/Linux_2.6/lib

  42. Browsing HDF5 Files with HDFView

  43. HDFView: Structure of File; Contents of Dataset

  44. HDFView File Menu

  45. [Figure only; no text on slide]

  46. Simple HDF5 File in HDFView
     • Right-click and select “Open” with mouse
     • Right-click and select “Show Properties” with mouse

  47. Simple HDF5 File in HDFView

  48. HDF-EOS5 File in HDFView

  49. Right-click and select “Open As” with mouse

  50. What you can’t see with slides:
     • Picture displayed instantly
     • File size is 906,229,176
