NetCDF-4 Interoperability with HDF4 and HDF5 Ed Hartnett Unidata, 8/4/9


  1. NetCDF-4 Interoperability with HDF4 and HDF5 • Ed Hartnett, Unidata, 8/4/9

  2. Purpose of Interoperability Features: World Conquest • The purpose of the interoperability features is to allow users to use netCDF programs on non-netCDF data archives. • NetCDF-Java can read many data formats; the idea is to bring some of this functionality to the C/Fortran/C++ libraries.

  3. Warning and Request • HDF4 and HDF5 interoperability features are still being tested. They are not ready for operational use yet. • The interoperability features are available in the netCDF daily snapshot release. • Please use them and send feedback to: support-netcdf@unidata.ucar.edu

  4. Overview • HDF4 Interoperability • What is HDF4 and why bother with it? • Reading HDF4 files with netCDF. • Limitations and request for help. • HDF5 Interoperability • What is HDF5 and why bother with it? • Reading HDF5 files with netCDF. • Limitations.

  5. What is HDF4? • The original HDF format, superseded by HDF5. • HDF4 has built-in 32-bit limits that make it unattractive for new data sets. It is still actively supported by The HDF Group, but no new features are added. • Get more info about HDF4 at: http://www.hdfgroup.org/products/hdf4

  6. Why Read HDF4? • Some important data sets are distributed in HDF4, for example the Aqua/Terra satellite data.

  7. HDF4 Background • HDF4 has several different APIs. The one of greatest interest to netCDF users is the SD (Scientific Data) API. • The SD API is (intentionally) very similar to the netCDF classic data model.

  8. Confusing: HDF4 Includes NetCDF v2 API • HDF4 ships with its own netCDF v2 API, which writes SD data files. • This must be turned off at HDF4 install-time if netCDF and HDF4 are to be linked in the same application. • There is no easy way to use both HDF4's netCDF API and netCDF's HDF4 read capability in the same program.

  9. Reading HDF4 SD Files • Starting with version 4.1, netCDF will be able to read HDF4 files created with the “Scientific Dataset” (SD) API. • This is read-only: NetCDF can't write HDF4! • The intention is to make netCDF software work automatically with important HDF4 scientific data collections.

  10. Building NetCDF to Read HDF4 • HDF4 reading is only available when netCDF is also built with HDF5. • HDF4, HDF5, zlib, and other compression libraries must be installed before netCDF is built. • Build like this: ./configure --with-hdf5=/home/ed --enable-hdf4

  11. Compiling with HDF4 • Include netcdf header file as usual. • Include locations of netCDF, HDF5, and HDF4 include directories: • -I/loc/of/netcdf/include -I/loc/of/hdf5/include -I/loc/of/hdf4/include

  12. Linking with HDF4 • The HDF4 and HDF5 libraries (and associated libraries) are needed and must be linked into all netCDF applications. The locations of the lib directories must also be provided: • -L/loc/of/netcdf/lib -L/loc/of/hdf5/lib -L/loc/of/hdf4/lib • -lmfhdf -ldf -ljpeg -lhdf5_hl -lhdf5 -lz

  13. Use nc-config to Help with Compile Flags • The nc-config utility is provided to help with compiler flags:
      $ ./nc-config --cflags
      -I/usr/local/include
      $ ./nc-config --libs
      -L/usr/local/lib -lnetcdf -L/machine/local/lib -lhdf5_hl -lhdf5 -lz -lm -lhdf4
      $ ./nc-config --flibs
      -M/usr/local/lib -lnetcdf -L/machine/local/lib -lhdf5_hl -lhdf5 -lz -lm -lhdf4

  14. Implementation Notes • You don't need to identify the file as HDF4 when opening it with netCDF, but you do have to open it read-only. • The HDF4 SD API provides a named, shared dimension, which fits easily into the netCDF model. • The HDF4 SD API uses other HDF4 APIs, (like vgroups) to store metadata. This can be confusing when using the HDF4 data dumping tool hdp.
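A library can recognize a file without being told its format by inspecting the first few bytes. The sketch below illustrates that idea only; it is not netCDF's actual detection code, and `file_format` and `guess_format` are hypothetical names introduced here. It relies on the documented file signatures: HDF4 files begin with the bytes 0x0E 0x03 0x13 0x01, HDF5 files with the 8-byte signature `\211HDF\r\n\032\n`, and classic netCDF files with "CDF" followed by a version byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for this sketch; these are not netCDF constants. */
typedef enum {
    FMT_UNKNOWN,
    FMT_NETCDF_CLASSIC,
    FMT_HDF4,
    FMT_HDF5
} file_format;

/* Guess the on-disk format from the first len bytes of a file.
 * Simplification: only offset 0 is checked. An HDF5 file with a user
 * block keeps its signature at a later offset and is not detected. */
file_format guess_format(const unsigned char *buf, size_t len)
{
    static const unsigned char hdf5_sig[8] =
        {0x89, 'H', 'D', 'F', '\r', '\n', 0x1A, '\n'};
    static const unsigned char hdf4_sig[4] = {0x0E, 0x03, 0x13, 0x01};

    assert(buf != NULL || len == 0);

    if (len >= 8 && memcmp(buf, hdf5_sig, 8) == 0)
        return FMT_HDF5;
    if (len >= 4 && memcmp(buf, hdf4_sig, 4) == 0)
        return FMT_HDF4;
    /* Classic netCDF: "CDF" plus version byte 1 (classic) or 2 (64-bit offset). */
    if (len >= 4 && memcmp(buf, "CDF", 3) == 0 && (buf[3] == 1 || buf[3] == 2))
        return FMT_NETCDF_CLASSIC;
    return FMT_UNKNOWN;
}
```

A caller would read the first eight bytes of the file into a buffer and pass them to `guess_format`; anything reported as HDF4 would then have to be opened read-only.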

  15. C Code to Read HDF4 SD File
      /* Create a file with one SDS, containing our phony data. */
      sd_id = SDstart(FILE_NAME, DFACC_CREATE);
      sds_id = SDcreate(sd_id, PRES_NAME, DFNT_INT32, DIMS_2, dim_size);
      SDwritedata(sds_id, start, NULL, edge, (void *)data_out);
      if (SDendaccess(sds_id)) ERR;
      if (SDend(sd_id)) ERR;

      /* Now open with netCDF and check the contents. */
      if (nc_open(FILE_NAME, NC_NOWRITE, &ncid)) ERR;
      if (nc_inq(ncid, &ndims_in, &nvars_in, &natts_in, &unlimdim_in)) ERR;
      ...

  16. ncdump and HDF4 SD Files • With HDF4 reading enabled, ncdump works on HDF4 files. • Sample MODIS file:
      ../ncdump/ncdump -h MOD29.A2000055.0005.005.2006267200024.hdf
      netcdf MOD29.A2000055.0005.005.2006267200024 {
      dimensions:
              Coarse_swath_lines_5km\:MOD_Swath_Sea_Ice = 406 ;
              Coarse_swath_pixels_5km\:MOD_Swath_Sea_Ice = 271 ;
              Along_swath_lines_1km\:MOD_Swath_Sea_Ice = 2030 ;
              Cross_swath_pixels_1km\:MOD_Swath_Sea_Ice = 1354 ;
      variables:
              float Latitude(Coarse_swath_lines_5km\:MOD_Swath_Sea_Ice, Coarse_swath_pixels_5km\:MOD_Swath_Sea_Ice) ;
                      Latitude:long_name = "Coarse 5 km resolution latitude" ;
                      Latitude:units = "degrees" ;
      ...

  17. HDF-EOS Not Understood • Many HDF4 data sets of interest follow the HDF-EOS metadata standard. • Stored as a long text string in global attributes, the HDF-EOS metadata looks messy.
      // global attributes:
              :HDFEOSVersion = "HDFEOS_V2.9" ;
              :StructMetadata.0 = "GROUP=SwathStructure\n\tGROUP=SWATH_1\n\t\tSwathName=\"MOD_Swath_Sea_Ice\"\n\t\tGROUP=Dimension\n\t\t\tOBJECT=Dimension_1\n\t\t\t\tDimensionName=\"Coarse_swath_lines_5km\"\n\t\t\t\tSize=406\n\t\t\tEND_OBJECT=Dimension_1\n\t\t\tOBJECT=Dimension_2\n\t\t\t\tDimensionName=\"Coarse_swath_pixels_5km\"\n\t\t\t\tSize=271\n\t\t\t...
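Because netCDF hands this metadata back as one opaque string, an application that needs the HDF-EOS dimension definitions has to pick them out of the text itself. The following is a minimal sketch of that, assuming the attribute text has already been read into memory (for instance with `nc_get_att_text`) and that each `DimensionName="..."` entry is followed by its own `Size=` line, as in the sample above. `eos_dim` and `parse_eos_dims` are hypothetical names. It is a hand-rolled scan, not the HDF-EOS toolkit and not a full parser for the metadata syntax.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NAME 64

/* Hypothetical record for one HDF-EOS dimension. */
typedef struct {
    char name[MAX_NAME];
    long size;
} eos_dim;

/* Scan the metadata text for DimensionName="..." entries and the Size=N
 * that follows each one. Stores at most max_dims entries in dims and
 * returns the number stored. */
size_t parse_eos_dims(const char *odl, eos_dim *dims, size_t max_dims)
{
    static const char name_key[] = "DimensionName=\"";
    static const char size_key[] = "Size=";
    size_t count = 0;
    const char *p = odl;

    assert(odl != NULL && dims != NULL);

    while (count < max_dims && (p = strstr(p, name_key)) != NULL) {
        p += strlen(name_key);
        const char *end = strchr(p, '"');
        if (end == NULL)
            break;                      /* unterminated name: stop scanning */

        size_t n = (size_t)(end - p);
        if (n >= MAX_NAME)
            n = MAX_NAME - 1;           /* truncate overly long names */
        memcpy(dims[count].name, p, n);
        dims[count].name[n] = '\0';

        const char *s = strstr(end, size_key);
        if (s == NULL)
            break;                      /* name without a size: stop scanning */
        dims[count].size = strtol(s + strlen(size_key), NULL, 10);

        count++;
        p = s;                          /* continue after this Size= entry */
    }
    return count;
}
```

Run on the sample text above, this would report `Coarse_swath_lines_5km` with size 406 and `Coarse_swath_pixels_5km` with size 271, the same values ncdump shows for the first two netCDF dimensions.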

  18. HDF4 Read Testing • Tested in libsrc4/tst_interops2.c, which creates some HDF4 files with the SD API, and then reads them with netCDF. • If --enable-hdf4-file-tests is used with netCDF configure, some Aqua/Terra satellite data files are downloaded from the Unidata FTP site, then read by libsrc4/tst_interops3.c.

  19. HDF4 Interoperability Limitations • File must be opened read-only. • Only HDF4 SD data files are currently understood. • This feature cannot be used at the same time as HDF4's netCDF v2 API, because HDF4 steals the netCDF v2 API function names. So you must use --disable-netcdf when building HDF4. (It might also work to use --disable-v2 for the netCDF build.)

  20. Future HDF4 Work • More tests. • Support for HDF4 image types. • Test support for compressed data. • Add some support for HDF-EOS metadata in the libcf library, using the HDF-EOS toolkit.

  21. Request for User Help – What Data to Read? • Please send me pointers to scientifically important HDF4 datasets. • The intention is not to read arbitrary HDF4 data, only datasets of wide scientific interest.

  22. Contribute Code to Write HDF4? • Some programmers use the netCDF v2 API to write HDF4 files. • It would not be too hard to write the glue code to allow the v2 API -> HDF4 output from the netCDF library. • The next step would be to allow netCDF v3/v4 API code to write HDF4 files. • Writing HDF4 seems like a low priority to our users. I would be happy to help any user who would like to undertake this task.

  23. What is HDF5? • HDF5 is an extremely general data storage format with many advanced features: on-the-fly compression, parallel I/O, a rich data model, etc. • Starting with netCDF-4.0, netCDF has been able to use HDF5 as a storage layer, exposing some of the advanced features. • But, until version 4.1, only HDF5 files created with netCDF-4 could be understood by netCDF-4.

  24. Why Read HDF5 Files? • Many important datasets are available in HDF5 format, including data from the Aqua satellite.

  25. Rules for Reading HDF5 Files • NetCDF-4.1 provides read-only access to existing HDF5 files if they do not violate some rules: • Must not use circular group structure. • HDF5 reference type (and some other obscure types) are not understood. • Write access is still only possible with netCDF-4/HDF5 files.
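To make "circular group structure" concrete: HDF5 allows a group to contain a link back to one of its own ancestors, whereas the netCDF-4 group model is a strict tree. The sketch below shows a depth-first check for such a loop on a small in-memory picture of group links. It is an illustration under stated assumptions, not how netCDF inspects files: `group_graph` and `has_circular_groups` are hypothetical, and a real check would have to walk the links through the HDF5 API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GROUPS 16

/* Hypothetical in-memory picture of group links: child[i][j] != 0 means
 * group i contains a link to group j. Group 0 is the root. */
typedef struct {
    size_t ngroups;
    unsigned char child[MAX_GROUPS][MAX_GROUPS];
} group_graph;

enum { UNVISITED = 0, IN_PROGRESS = 1, DONE = 2 };

/* Depth-first walk; returns 1 as soon as a link leads back to a group
 * that is still on the current path, which is exactly a cycle. */
static int visit(const group_graph *g, size_t node, int *state)
{
    state[node] = IN_PROGRESS;
    for (size_t j = 0; j < g->ngroups; j++) {
        if (!g->child[node][j])
            continue;
        if (state[j] == IN_PROGRESS)
            return 1;
        if (state[j] == UNVISITED && visit(g, j, state))
            return 1;
    }
    state[node] = DONE;
    return 0;
}

/* Return 1 if a cycle is reachable from the root group, else 0. */
int has_circular_groups(const group_graph *g)
{
    int state[MAX_GROUPS] = {0};

    assert(g != NULL && g->ngroups <= MAX_GROUPS);
    if (g->ngroups == 0)
        return 0;
    return visit(g, 0, state);
}
```

Note that two groups linking to the same child is not a cycle under this check; only a link back to a group on the current path counts.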

  26. HDF5 Version 1.8 Background • In version 1.8, HDF5 introduced “dimension scales” as a way of supporting shared dimensions. • Also in version 1.8, HDF5 introduced ordering by creation, rather than ordering alphabetically. • However, most data providers don't use these features and instead use HDF5 1.6.
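The ordering matters because netCDF numbers dimensions and variables in the order they were defined. A file that only offers alphabetical iteration presents the same objects in a different sequence. The small sketch below makes that difference visible with plain C strings; `sort_alphabetically` is a hypothetical helper and no HDF5 calls are involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C-string pointers. */
static int cmp_names(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Reorder names in place from creation order to alphabetical order,
 * mimicking what an alphabetical listing of the objects would return. */
void sort_alphabetically(const char **names, size_t n)
{
    assert(names != NULL || n == 0);
    if (n > 1)
        qsort(names, n, sizeof names[0], cmp_names);
}
```

For example, objects created as `time`, `lat`, `lon`, `pressure` come back as `lat`, `lon`, `pressure`, `time`, so an index that meant `time` to the writer means `lat` to an alphabetical reader.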

  27. NetCDF-4.1 Relaxes Some Restrictions for HDF5 Files • Before netCDF-4.1, HDF5 files had to use creation ordering and dimension scales in order to be understood by netCDF-4. • Starting with netCDF-4.1, read-only access is possible to HDF5 files with alphabetical ordering and no dimension scales (created by HDF5 1.6, perhaps). • An HDF5 file may have dimension scales for all dimensions or for none of them, but not for just some.

  28. HDF5 C Code to Write HDF5 File
      /* Create file. */
      if ((fileid = H5Fcreate(FILE_NAME, H5F_ACC_TRUNC, H5P_DEFAULT,
                              H5P_DEFAULT)) < 0) ERR;

      /* Create the space for the dataset. */
      dims[0] = LAT_LEN;
      dims[1] = LON_LEN;
      if ((pres_spaceid = H5Screate_simple(DIMS_2, dims, dims)) < 0) ERR;

      /* Create a variable. It will not have dimension scales. */
      if ((pres_datasetid = H5Dcreate(fileid, PRES_NAME, H5T_NATIVE_FLOAT,
                                      pres_spaceid, H5P_DEFAULT)) < 0) ERR;

      if (H5Dclose(pres_datasetid) < 0 ||
          H5Sclose(pres_spaceid) < 0 ||
          H5Fclose(fileid) < 0) ERR;

  29. NetCDF C Code to Read HDF5 File
      /* Read the data with netCDF. */
      if (nc_open(FILE_NAME, NC_NOWRITE, &ncid)) ERR;
      if (nc_inq(ncid, &ndims_in, &nvars_in, &natts_in, &unlimdim_in)) ERR;
      if (ndims_in != 2 || nvars_in != 1 || natts_in != 0 ||
          unlimdim_in != -1) ERR;
      if (nc_close(ncid)) ERR;

  30. Future Plans for HDF5 Interoperability • More testing. • Proper handling of reference types. This will require (probably) an extension of the netCDF APIs. • Better handling of strange group structures, if this proves necessary to read important data.

  31. Summary • With the 4.1 release, the netCDF C/Fortran/C++ libraries allow read-only access to some existing HDF4 and HDF5 data archives. • The intention is not to develop a completely general translation, but instead to focus on datasets of significance to the Earth science community. • Write capability is quite possible, but we don't plan on providing it because the demand for this is low.
