HDF Update
Mike Folk, The HDF Group
HDF and HDF-EOS Workshop XI, November 7, 2007
Outline
• What is The HDF Group?
• HDF Software Update
• Other Activities of Interest
What is The HDF Group (THG)?
THG, the Company
• Spun off from the University of Illinois in July 2006
• Non-profit
• 20+ scientific, technology, and professional staff
• Intellectual property:
  • THG owns HDF4 and HDF5
  • HDF formats and libraries to remain open
  • Libraries have BSD-type license
• Continue ties to U of I and NCSA
The mission of The HDF Group is to ensure long-term accessibility of HDF data through sustainable development and support of HDF technologies.
Goals
• Maintain, evolve HDF for sponsors and communities that depend on it
• Do consulting, training, tuning, development, research
• Sustain The HDF Group for long term to assure data access over time
THG Services
• Helpdesk and Mailing Lists
  • Available to all users as a first level of support
• Standard Support
  • Rapid issue resolution support
• Consulting
  • Needs assessment, troubleshooting, design reviews, etc.
• Enterprise Support
  • Coordinating HDF activities across divisions
• Special Projects
  • Adapting customer applications to HDF
  • New features and tools, with changes normally incorporated into open source product
• Research and Development
• Training
  • Tutorials and hands-on practical experience
HDF Software Update
HDF4 update
HDF 4.2r2 Released in October
New features and changes
• New APIs added to the SD and GR interfaces:
  • SDreset_maxopenfiles, SDget_maxopenfiles: Modify and report the maximum allowable number of open files
  • SDget_numopenfiles: Gets number of open files
  • SDgetcompinfo, GRgetcompinfo: Get compression info
  • SDgetfilename: Retrieves name of file, given its ID
  • SDgetnamelen: Retrieves length of object name, given its ID
• SZIP compression
  • Now can be invoked by Fortran API
  • Now available for raster images via GR interface
• SDS, Vgroup names no longer limited to 64 characters
New features and changes
• HDF configuration changes
  • --enable-netcdf flag introduced
  • Autotools versions updated
• Many bug fixes made to hrepack and hdiff
• See RELEASE.txt for a full list of changes
Platforms to drop/add next release
Drop:
• Windows XP with MSVC++ 6.0
• Linux 2.4
• IRIX64 6.5
• SunOS 5.8, 5.9
Add:
• Windows 64-bit (32 and 64-bit binaries)
Platforms tested
Systems (* = new platforms):
• AIX 5.3 (32-bit, 64-bit)
• FreeBSD 6.2 (32-bit, 64-bit)*
• HP-UX B.11.23 (32-bit, 64-bit)*
• IRIX 64 v6.5 (32-bit, 64-bit)
• Linux 2.4, 2.6*
• Linux ia64
• Linux x86_64
• SunOS 5.8, 5.10* (32-bit, 64-bit)
• SunOS 5.10 on Intel
• Windows XP, Vista
• Mac OS X Intel*
Compilers:
• IBM C and Fortran compilers
• GNU gcc 3.4* and GNU Fortran
• HPUX C and Fortran compilers
• GNU gcc 3.4 and 4.*
• Intel C and Fortran versions 9.1 and 10.00
• SUN WorkShop C and Fortran
• Visual Studio .NET and 2005 and Intel Fortran
• Visual Studio 2005 (no Fortran)
• GNU gcc 4.0.1 with gfortran and g95
For detailed info, see RELEASE.txt
HDF5 Update
HDF5 1.6.6
HDF5 1.6.6 release
• Primarily a bug-fix release
• Some tool changes (see later slide)
• http://hdfgroup.org/HDF5/release/obtain5.html
Platforms dropped
• Operating systems
  • AIX 5.3
  • Solaris 2.8 and 2.9
  • OSF1
  • Windows XP with MSVC++ 6.0
• Compilers
  • PGI 6.5-*
http://www.hdfgroup.org/HDF5/release/alpha/obtain518.html
Platforms added
• Systems
  • Alpha OpenVMS
  • Mac OS X 10.4 (Intel)
  • Solaris 2.* on Intel
  • Cray XT3
  • Windows 64-bit (32 and 64-bit)
  • BG/L
• Compilers
  • PGI V. 7.*
  • Intel 10.*
  • MPICH 1.2.7
  • MPICH2
HDF5 1.8
HDF5 1.8 – new library features
• Datatype and dataspace features
  • Create datatype from text description
  • Integer to float conversions during I/O
  • Compact storage for N-bit datatypes
  • Offset+size storage filter, saving space
  • “Null” dataspace – datasets with no elements
• Data transformation filter
HDF5 1.8 – new library features
• Group improvements
  • Creation order access
  • Compact groups – small groups take less space
  • Large group storage improvements
  • Intermediate group creation
• Link improvements
  • Unicode names allowed
  • External links – to objects in another file
  • User defined links – create own kinds of links
HDF5 1.8 – new library features
• Attribute improvements
  • Improved storage for large number of attributes
  • Iterate or look up by creation order
  • Unicode names allowed
• Support for Unicode UTF-8 character set
• Shared header information, possibly saving space
• Metadata cache improvements – faster I/O on files with many objects
• Better UNIX/Linux portability
HDF5 1.8 – new APIs
• New extendible error-handling API
• New APIs to copy objects between files quickly
• Dimension scale model and API
• “HDFpacket” API, to read/write packets efficiently
HDF5 1.8 – Backward and Forward Compatibility
HDF5 1.8 and 1.6
• Differences between 1.8 and 1.6.x
  • Some file format changes
  • Several new routines added
  • Old APIs deprecated – may be removed in later release
• Consequences
  • Applications requiring 1.8 format changes will generate objects that cannot be read by 1.6 library
  • To exploit 1.8 changes, applications need to be rewritten
“The art of progress is to preserve order amid change, and to preserve change amid order.”
Alfred North Whitehead
Principle of Maximum File Format Compatibility
Unless instructed otherwise, the HDF5 library will write objects using the earliest version of the format possible for describing the information.
This assures that older library versions are forward compatible whenever possible: objects in new files can be read with old versions of the library, if the objects are “known” to the old libraries.
New versions of the library can always read objects in files written with older versions.
Command Line Tools
New features for existing tools
• -V option for all tools
  • Prints HDF5 library version number used by tool
• h5repack: -L option
  • Use latest version of file format to create objects
• h5dump: dumps groups/attributes in creation or name order
  • -q Q, --sort_by=Q: Sort groups and attributes by index Q
  • -z Z, --sort_order=Z: Sort groups and attributes by order Z
New command line tools
• h5mkgrp
  • Creates new groups and group hierarchies in an HDF5 file
• h5stat
  • Provides statistics regarding the file, such as number of objects per group, sizes of datasets, amount of free space in file
• h5copy
  • Copies objects within a file or across files
• h5check
  • Verifies an HDF5 file against the defined HDF5 File Format Specification
  • Completed for 1.6
  • In progress for 1.8
Tool work in the pipeline
• Export numeric data formatted in several different ways (such as MS Excel, XML, etc.)
• Import ASCII data that conforms to certain format
• Use a common text format for h5import and h5dump
• Support NaN in tools such as h5diff. Challenges:
  • NaN is platform specific
  • NaN can have different values for the same machine
  • Checking NaN can be a performance hit
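The NaN challenges listed above can be illustrated with a small C sketch. It assumes IEEE 754 binary64 doubles; `double_bits`, `make_nan`, and `values_match` are hypothetical helper names chosen for this example and are not part of h5diff or the HDF5 library. It shows that two NaNs on the same machine can carry different bit patterns, and that a comparison has to test for NaN explicitly because NaN never compares equal to anything.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a double (assumes 64-bit doubles). */
uint64_t double_bits(double x)
{
    uint64_t bits;
    assert(sizeof(double) == sizeof(uint64_t));
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Build a quiet NaN with the given payload in its low mantissa bits
 * (assumes IEEE 754 binary64 layout). */
double make_nan(uint64_t payload)
{
    uint64_t bits = UINT64_C(0x7FF8000000000000)
                  | (payload & UINT64_C(0x0007FFFFFFFFFFFF));
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Hypothetical comparison rule: two elements match if they are
 * numerically equal, or if both are NaN regardless of bit pattern. */
int values_match(double a, double b)
{
    if (isnan(a) || isnan(b))
        return isnan(a) && isnan(b);
    return a == b;
}
```

With this rule, `make_nan(1)` and `make_nan(2)` are treated as matching even though their bit patterns differ. The extra `isnan` tests on every element are the kind of per-value cost the slide refers to as a possible performance hit.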
HDF Java Products
HDF5 Java is Growing UP
HDFView changes
• HDFView 2.4 released
• Many new features, such as
  • Support for compound datatypes of 2D+ arrays
  • Support for "filtering fill value" in Image Viewer
  • Effective handling of large 3D images
  • Support large fonts in GUI components
  • New autogain algorithm for image Brightness/Contrast
• New platforms
  • Mac Intel
  • Linux 64-bit AMD
  • Solaris 64-bit
Other Java products
• 36 new enhancements and 44 bugs fixed
• Test suite (using JUnit testing framework)
  • Tests all public methods in the object package
  • Added “make check” to run the test suite
• Enhanced documentation
  • All public methods in the object package are fully documented
Future work for Java
• Update HDF5 JNI APIs for HDF5 1.8 release
• Release HDFView with bug fixes/new features with HDF5 1.8 release
• Port HDF5-SRB model to HDF5-iRODS model
• Writing capability for HDF5-iRODS model
Other Activities of Interest
New THG Website
HDF Performance Framework
Goals
• A framework for performance regression testing
• A tool for
  • Testing on multiple platforms
  • Testing different versions
  • Long term regression testing
  • Assistance in debugging
Solution
[Architecture diagram. Components shown: HDF5 1.6, HDF5 1.8, a user’s benchmark, performance library, cron, database, PHP web server, www, graph/text output]
Sample Usage
H5Perf_startTimer(&time);
for (i = 0; i < 1000; i++) {
    H5Gcreate(fileid, group_name, (size_t)0); // Add groups
}
H5Perf_endTimer(&time);
H5Perf_addInstance(db_host, date, time);
Crontab entry:
00 21 * * * /home/local/hyoklee/src/chicago/test-perf-hdfdap-3.sh
Sample database record:
| 178820 | 2007-08-17 21:51:14 | 10000 groups | creating 10000 empty groups | 1.8.0 | hdfdap | 0.670198 | 4384 |
Column labels on the slide: Timestamp, Instance, Name, Version, Platform, Time
Improved Crash Survivability in the HDF5 Library
Crash Survivability in HDF5
Problem: Data in HDF5 files susceptible to corruption in the event of an application or system crash.
• Corruption possible if structural metadata is being written when the crash occurs.
Initial Objective: Guarantee an HDF5 file with consistent metadata can be reconstructed in the event of a crash.
• No guarantee on state of raw data – contains whatever made it to disk prior to crash.
Crash Survivability in HDF5
Approach: Metadata Journaling
• When a piece of metadata is modified and in a consistent state, make a journal note.
• If the application crashes, a recovery program can replay the journal by applying in order all metadata writes until the end of the last completed transaction written to the journal file.
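The replay step described above can be sketched in C. This is a minimal illustration under stated assumptions, not the HDF5 journal format or recovery tool: the record layout (`struct journal_rec`), the record kinds, and `replay_journal` are all hypothetical, transactions are assumed to be written one after another without interleaving, and each write carries a single byte to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical journal record kinds; not the HDF5 journal file format. */
enum rec_kind { REC_BEGIN, REC_WRITE, REC_END };

struct journal_rec {
    enum rec_kind kind;
    unsigned long trans;  /* transaction number */
    size_t offset;        /* target offset of a metadata write */
    unsigned char value;  /* one-byte payload keeps the sketch short */
};

/* Replay every write that belongs to a completed transaction.
 * Pass 1 locates the last REC_END record; pass 2 applies, in journal
 * order, all writes before it. Writes from a transaction that never
 * reached its REC_END are ignored. Returns the number of writes applied. */
size_t replay_journal(const struct journal_rec *j, size_t n,
                      unsigned char *file, size_t file_len)
{
    size_t end = 0;      /* one past the last REC_END record */
    size_t applied = 0;
    size_t i;

    assert(j != NULL || n == 0);

    for (i = 0; i < n; i++)
        if (j[i].kind == REC_END)
            end = i + 1;

    for (i = 0; i < end; i++) {
        if (j[i].kind == REC_WRITE && j[i].offset < file_len) {
            file[j[i].offset] = j[i].value;
            applied++;
        }
    }
    return applied;
}
```

If the journal holds one complete transaction followed by a `REC_BEGIN` and a write with no matching `REC_END`, only the writes of the first transaction reach the file, which is the consistent-metadata state the slide describes.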
Faster HDF5 Data Appends
Fast Data Appends
• Problem: Metadata operations limit the rate at which HDF5 can append data to datasets.
• Solution: new data structure for indexing chunks:
  • Allows constant time extend, shrink and lookup of chunks in datasets with single unlimited dimension
  • # of metadata I/O operations to append to dataset is independent of # of chunks
  • Allows single-writer/multiple-reader access
• Details at: http://www.hdfgroup.uiuc.edu/RFC/HDF5/SkipListChunkIndex/SkipListChunkIndex.html
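To see why a single unlimited dimension makes constant-time chunk operations plausible, consider the in-memory sketch below. It is not the data structure from the RFC linked above and says nothing about on-disk metadata I/O; `struct chunk_index` and its functions are hypothetical names used only for illustration. Because chunks are only ever added or removed at the end, the chunk number can serve directly as an array index.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical chunk index for a 1-D dataset whose only dimension is
 * unlimited: slot k holds the file address of chunk k. */
struct chunk_index {
    unsigned long *addr;  /* chunk addresses, in dataset order */
    size_t nchunks;       /* chunks currently in the dataset */
    size_t cap;           /* allocated slots */
    size_t chunk_len;     /* elements per chunk */
};

/* Append one chunk; amortized constant time because capacity doubles.
 * Returns 0 on success, -1 if memory runs out. */
int chunk_index_extend(struct chunk_index *ix, unsigned long chunk_addr)
{
    if (ix->nchunks == ix->cap) {
        size_t new_cap = ix->cap ? ix->cap * 2 : 4;
        unsigned long *p = realloc(ix->addr, new_cap * sizeof *p);
        if (p == NULL)
            return -1;
        ix->addr = p;
        ix->cap = new_cap;
    }
    ix->addr[ix->nchunks++] = chunk_addr;
    return 0;
}

/* Drop the last chunk; constant time. */
void chunk_index_shrink(struct chunk_index *ix)
{
    assert(ix->nchunks > 0);
    ix->nchunks--;
}

/* Return the address of the chunk holding element elem; constant time. */
unsigned long chunk_index_lookup(const struct chunk_index *ix, size_t elem)
{
    size_t k = elem / ix->chunk_len;
    assert(k < ix->nchunks);
    return ix->addr[k];
}
```

With 100 elements per chunk, element 250 falls in chunk 2, so the lookup is a single division and array access no matter how many chunks the dataset has.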
netCDF-4