HDF5 version 1.15.0 currently under development
================================================================================
INTRODUCTION
============
This document describes the differences between this release and the previous
HDF5 release. It contains information on the platforms tested and known
problems in this release. For more details check the HISTORY*.txt files in the
HDF5 source.
Note that documentation in the links below will be updated at the time of each
final release.
Links to HDF5 documentation can be found on:
https://portal.hdfgroup.org/documentation/
The official HDF5 releases can be obtained from:
https://www.hdfgroup.org/downloads/hdf5/
Changes from Release to Release and New Features in the HDF5-1.16.x release series
can be found at:
https://portal.hdfgroup.org/documentation/hdf5-docs/release_specific_info.html
If you have any questions or comments, please send them to the HDF Help Desk:
help@hdfgroup.org
CONTENTS
========
- New Features
- Support for new platforms, languages and compilers
- Bug Fixes since HDF5-1.14.0
- Platforms Tested
- Known Problems
- CMake vs. Autotools installations
New Features
============
Configuration:
-------------
- Added configure options for enabling/disabling non-standard programming
language features
* Added a new configuration option that allows enabling or disabling of
support for features that are extensions to programming languages, such
as support for the _Float16 datatype:
CMake: HDF5_ENABLE_NONSTANDARD_FEATURES (ON/OFF) (Default: ON)
Autotools: --enable-nonstandard-features (yes/no) (Default: yes)
When this option is enabled, configure time checks are still performed
to ensure that a feature can be used properly, but these checks may not
be sufficient when compiler support for a feature is incomplete or broken,
resulting in library build failures. When set to OFF/no, this option
provides a way to disable support for all non-standard features to avoid
these issues. Individual features can still be re-enabled with their
respective configuration options.
* Added a new configuration option that allows enabling or disabling of
support for the _Float16 C datatype:
CMake: HDF5_ENABLE_NONSTANDARD_FEATURE_FLOAT16 (ON/OFF) (Default: ON)
Autotools: --enable-nonstandard-feature-float16 (yes/no) (Default: yes)
While support for the _Float16 C datatype can generally be detected and
used properly, some compilers have incomplete support for the datatype
and will pass configure time checks while still failing to build HDF5.
This option provides a way to disable support for the _Float16 datatype
when the compiler doesn't have the proper support for it.
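The interaction between the two options can be sketched as follows. This is
an illustrative model written in Python, not code from either build system.
The function name and its arguments are hypothetical, and the sketch assumes
that the per-feature option follows the general option unless it is set
explicitly.

```python
def float16_enabled(nonstandard_features=True, feature_float16=None,
                    compiler_check_passes=True):
    """Model whether _Float16 support ends up enabled (hypothetical helper).

    nonstandard_features  -- HDF5_ENABLE_NONSTANDARD_FEATURES (Default: ON)
    feature_float16       -- HDF5_ENABLE_NONSTANDARD_FEATURE_FLOAT16; None means
                             "not set explicitly" (assumed to follow the
                             general option)
    compiler_check_passes -- outcome of the configure time checks
    """
    if feature_float16 is None:
        requested = nonstandard_features
    else:
        requested = feature_float16
    # Configure time checks are still performed when the option is enabled.
    return requested and compiler_check_passes


print(float16_enabled())                                  # True
print(float16_enabled(nonstandard_features=False))        # False
print(float16_enabled(nonstandard_features=False,
                      feature_float16=True))              # True
```

The last call models turning all non-standard features off and then
re-enabling only _Float16 support.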
- Deprecate bin/cmakehdf5 script
With the improvements made in CMake since version 3.23 and the addition
of CMake preset files, this script is no longer necessary.
See INSTALL_CMake.txt file, Section X: Using CMakePresets.json for compiling
- Overhauled LFS support checks
In 2024, we can assume that Large File Support (LFS) exists on all
systems we support, though it may require flags to enable it,
particularly when building 32-bit binaries. The HDF5 source does
not use any of the 64-bit specific API calls (e.g., ftello64)
or explicit 64-bit offsets via off64_t.
Autotools
* We now use AC_SYS_LARGEFILE to determine how to support LFS. We
previously used a custom m4 script for this.
CMake
* The HDF_ENABLE_LARGE_FILE option (advanced) has been removed
* We no longer run a test program to determine if LFS works, which
will help with cross-compiling
* On Linux we now unilaterally set -D_LARGEFILE_SOURCE and
-D_FILE_OFFSET_BITS=64, regardless of 32/64 bit system. CMake
doesn't offer a nice equivalent to AC_SYS_LARGEFILE and since
those options do nothing on 64-bit systems, this seems safe and
covers all our bases. We don't set -D_LARGEFILE64_SOURCE since
we don't use any of the POSIX 64-bit specific API calls like
ftello64, as noted above.
* We didn't test for LFS support on non-Linux platforms. We've added
comments for how LFS should probably be supported on AIX and Solaris,
which seem to be alive, though uncommon. PRs would be appreciated if
anyone wishes to test this.
This overhaul also fixes GitHub #2395, which points out that the LFS flags
used when building with CMake differ based on whether CMake has been
run before. The LFS check program that caused this problem no longer exists.
- The CMake HDF5_ENABLE_DEBUG_H5B option has been removed
This enabled some additional version-1 B-tree checks. These have been
removed so the option is no longer necessary.
This option was CMake-only and marked as advanced.
- New option for building with static CRT in Windows
The following option has been added:
HDF5_BUILD_STATIC_CRT_LIBS "Build With Static Windows CRT Libraries" OFF
Because our minimum CMake version is 3.18 and CMake changed its default
runtime library behavior in CMake 3.15, the macro that previously changed
the runtime flags no longer works.
Fixes GitHub issue #3984
- Added support for the new MSVC preprocessor
Microsoft added support for a new, standards-conformant preprocessor
to MSVC, which can be enabled with the /Zc:preprocessor option. This
preprocessor would trip over our HDopen() variadic function-like
macro, which uses a feature that only works with the legacy preprocessor.
ifdefs have been added that select the correct HDopen() form and
allow building HDF5 with the /Zc:preprocessor option.
The HDopen() macro is located in an internal header file and only
affects building the HDF5 library from source.
Fixes GitHub #2515
- Renamed HDF5_ENABLE_USING_MEMCHECKER to HDF5_USING_ANALYSIS_TOOL
HDF5_USING_ANALYSIS_TOOL indicates to the test macros that an analysis
tool is being used and that the tests should not use the runTest.cmake
macros and their variations. When run through those macros, analysis tools
such as valgrind test the macro code instead of the program under test.
HDF5_ENABLE_USING_MEMCHECKER is still used for controlling the HDF5
define, H5_USING_MEMCHECKER.
- New option for building and naming tools in CMake
The following option has been added:
HDF5_BUILD_STATIC_TOOLS "Build Static Tools Not Shared Tools" OFF
The default will build shared tools unless BUILD_SHARED_LIBS = OFF.
Tool names will no longer have a "-shared" suffix, as only one set of tools
will be created.
- Incorporated HDF5 examples repository into HDF5 library.
The HDF5Examples folder is equivalent to the hdf5-examples repository.
This enables building and testing the examples
during the library build process or after the library has been installed.
Previously, the hdf5-examples archives were downloaded
for packaging with the library. Now the examples can be built
and tested without a packaged install of the library.
However, to maintain the ability to use the HDF5Examples with an installed
library, it is necessary to map the option names used by the library
to those used by the examples. The typical pattern is:
<example option> = <library option>
HDF_BUILD_FORTRAN = ${HDF5_BUILD_FORTRAN}
- Added new option for CMake to mark tests as SKIPPED.
HDF5_DISABLE_TESTS_REGEX is a regular expression that is matched against
test names. Any test whose name matches has its DISABLED property set.
HDF5_DISABLE_TESTS_REGEX can be initialized on the command line:
"-DHDF5_DISABLE_TESTS_REGEX:STRING=<regex>"
See the CMake documentation for the regex specification.
- Added defaults to CMake for long double conversion checks
HDF5 performs a couple of checks at build time to see if long double
values can be converted correctly (IBM's Power architecture uses a
special format for long doubles). These checks were performed using
TRY_RUN, which is a problem when cross-compiling.
These checks now use default values appropriate for most non-Power
systems when cross-compiling. The cache values can be pre-set if
necessary, which will preempt both the TRY_RUN and the default.
Affected values:
H5_LDOUBLE_TO_LONG_SPECIAL (default no)
H5_LONG_TO_LDOUBLE_SPECIAL (default no)
H5_LDOUBLE_TO_LLONG_ACCURATE (default yes)
H5_LLONG_TO_LDOUBLE_CORRECT (default yes)
H5_DISABLE_SOME_LDOUBLE_CONV (default no)
Fixes GitHub #3585
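The order of precedence described above can be sketched as follows. This is
a model in Python rather than the CMake code itself. The resolve() helper
and the try_run callable are hypothetical names used only for illustration,
and the sketch assumes that native builds still run the check.

```python
# Defaults used when cross-compiling, as listed above.
DEFAULTS = {
    "H5_LDOUBLE_TO_LONG_SPECIAL": False,
    "H5_LONG_TO_LDOUBLE_SPECIAL": False,
    "H5_LDOUBLE_TO_LLONG_ACCURATE": True,
    "H5_LLONG_TO_LDOUBLE_CORRECT": True,
    "H5_DISABLE_SOME_LDOUBLE_CONV": False,
}


def resolve(name, cache, cross_compiling, try_run):
    """Return the value of one long double check (hypothetical helper)."""
    if name in cache:
        # A pre-set cache value preempts both the TRY_RUN and the default.
        return cache[name]
    if cross_compiling:
        # Target binaries cannot be run on the build host, so use the default.
        return DEFAULTS[name]
    # Native build (assumption): run the test program.
    return try_run(name)


def failing_try_run(name):
    raise RuntimeError("cannot run target binaries on the build host")


print(resolve("H5_LDOUBLE_TO_LLONG_ACCURATE", {}, True, failing_try_run))  # True
print(resolve("H5_LDOUBLE_TO_LONG_SPECIAL",
              {"H5_LDOUBLE_TO_LONG_SPECIAL": True}, True,
              failing_try_run))                                            # True
```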
- Improved support for Intel oneAPI
* Separates the old 'classic' Intel compiler settings and warnings
from the oneAPI settings
* Uses `-check nouninit` in debug builds to avoid false positives
when building H5_buildiface with `-check all`
* Both Autotools and CMake
- Added new options for CMake and Autotools to control the Doxygen
warnings as errors setting.
* HDF5_ENABLE_DOXY_WARNINGS: ON/OFF (Default: ON)
* --enable-doxygen-errors: enable/disable (Default: enable)
By default, the build fails if Doxygen parsing generates warnings.
The option can be disabled for Doxygen versions that have parsing
issues, e.g., 1.9.5 and 1.9.8.
Addresses GitHub issue #3398
- Added support for AOCC and classic Flang w/ the Autotools
* Adds a config/clang-fflags options file to support Flang
* Corrects missing "-Wl," from linker options in the libtool wrappers
when using Flang, the MPI Fortran compiler wrappers, and building
the shared library. This would often result in unrecognized options
like -soname.
* Enable -nomp w/ Flang to avoid linking to the OpenMP library.
CMake can build the parallel, shared library w/ Fortran using AOCC
and Flang, so no changes were needed for that build system.
Fixes GitHub issues #3439, #1588, #366, #280
- Converted the build of libaec and zlib to use FETCH_CONTENT with CMake.
Using the CMake FetchContent module, the external filters can populate
content at configure time via any method supported by the ExternalProject
module. Whereas ExternalProject_Add() downloads at build time, the
FetchContent module makes content available immediately, allowing the
configure step to use the content in commands like add_subdirectory(),
include() or file() operations.
Removed HDF options for using FETCH_CONTENT explicitly:
BUILD_SZIP_WITH_FETCHCONTENT:BOOL
BUILD_ZLIB_WITH_FETCHCONTENT:BOOL
- Thread-safety + static library disabled on Windows w/ CMake
The thread-safety feature requires hooks in DllMain(), which is only
present in the shared library.
We previously just warned about this, but now any CMake configuration
that tries to build thread-safety and the static library will fail.
This cannot be overridden with ALLOW_UNSUPPORTED.
Fixes GitHub issue #3613
- Autotools builds now build the szip filter by default when an appropriate
library is found
Since libaec is prevalent and BSD-licensed for both encoding and
decoding, we build the szip filter by default now.
Both the Autotools and CMake build systems now process the szip filter the
same way they process the zlib filter.
- Removed CMake cross-compiling variables
* HDF5_USE_PREGEN
* HDF5_BATCH_H5DETECT
These were used to work around H5detect and H5make_libsettings and
are no longer required.
- Running H5make_libsettings is no longer required for cross-compiling
The functionality of H5make_libsettings is now handled via template files,
so H5make_libsettings has been removed.
- Running H5detect is no longer required for cross-compiling
The functionality of H5detect is now exercised at library startup,
so H5detect has been removed.
- Updated HDF5 API tests CMake code to support VOL connectors
* Implemented support for fetching, building and testing HDF5
VOL connectors during the library build process and documented
the feature under doc/cmake-vols-fetchcontent.md
* Implemented the HDF5_TEST_API_INSTALL option that enables
installation of the HDF5 API tests on the system
- Added new CMake options for building and running HDF5 API tests
(Experimental)
HDF5 API tests are an experimental feature, primarily targeted
toward HDF5 VOL connector authors, that is currently being developed.
These tests exercise the HDF5 API and are being integrated back
into the HDF5 library from the HDF5 VOL tests repository
(https://github.com/HDFGroup/vol-tests). To support this feature,
the following new options have been added to CMake:
* HDF5_TEST_API: ON/OFF (Default: OFF)
Controls whether the HDF5 API tests will be built. These tests
will only be run during testing of HDF5 if the HDF5_TEST_SERIAL
(for serial tests) and HDF5_TEST_PARALLEL (for parallel tests)
options are enabled.
* HDF5_TEST_API_INSTALL: ON/OFF (Default: OFF)
Controls whether the HDF5 API test executables will be installed
on the system alongside the HDF5 library. This option is currently
not functional.
* HDF5_TEST_API_ENABLE_ASYNC: ON/OFF (Default: OFF)
Controls whether the HDF5 Async API tests will be built. These
tests will only be run if the VOL connector used supports Async
operations.
* HDF5_TEST_API_ENABLE_DRIVER: ON/OFF (Default: OFF)
Controls whether to build the HDF5 API test driver program. This
test driver program is useful for VOL connectors that use a
client/server model where the server needs to be up and running
before the VOL connector can function. This option is currently
not functional.
* HDF5_TEST_API_SERVER: String (Default: "")
Used to specify a path to the server executable that the test
driver program should execute.
- Added support for CMake presets file.
CMake supports two main files, CMakePresets.json and CMakeUserPresets.json,
that allow users to specify common configure options and share them with others.
HDF5 now provides a CMakePresets.json file with a typical configuration and
a support file, config/cmake-presets/hidden-presets.json.
A section with a very basic explanation of how to use CMakePresets was also
added to INSTALL_CMake.txt.
- Deprecated and removed old SZIP library in favor of LIBAEC library
LIBAEC library has been used in HDF5 binaries as the szip library of choice
for a few years. We are removing the options for using the old SZIP library.
Also removed the config/cmake/FindSZIP.cmake file.
- Enabled instrumentation of the library by default in CMake for parallel
debug builds
HDF5 can be configured to instrument portions of the parallel library to
aid in debugging. Autotools builds of HDF5 turn this capability on by
default for parallel debug builds and off by default for other build types.
CMake has been updated to match this behavior.
- Added new option to build libaec and zlib inline with CMake.
Using the CMake FetchContent module, the external filters can populate
content at configure time via any method supported by the ExternalProject
module. Whereas ExternalProject_Add() downloads at build time, the
FetchContent module makes content available immediately, allowing the
configure step to use the content in commands like add_subdirectory(),
include() or file() operations.
The HDF options (and defaults) for using this are:
BUILD_SZIP_WITH_FETCHCONTENT:BOOL=OFF
LIBAEC_USE_LOCALCONTENT:BOOL=OFF
BUILD_ZLIB_WITH_FETCHCONTENT:BOOL=OFF
ZLIB_USE_LOCALCONTENT:BOOL=OFF
The CMake variables to control the path and file names:
LIBAEC_TGZ_ORIGPATH:STRING
LIBAEC_TGZ_ORIGNAME:STRING
ZLIB_TGZ_ORIGPATH:STRING
ZLIB_TGZ_ORIGNAME:STRING
See the CMakeFilters.cmake and config/cmake/cacheinit.cmake files for usage.
- Added the CMake variable HDF5_ENABLE_ROS3_VFD to the HDF5 CMake config
file hdf5-config.cmake. This makes it easy to detect whether the library
has been built with or without read-only S3 functionality.
Library:
--------
- Relaxed behavior of H5Pset_page_buffer_size() when opening files
This API call sets the size of a file's page buffer cache. This call
was extremely strict about matching its parameters to the file strategy
and page size used to create the file, requiring a separate open of the
file to obtain these parameters.
These requirements have been relaxed when using the fapl to open
a previously-created file:
* When opening a file that does not use the H5F_FSPACE_STRATEGY_PAGE
strategy, the setting is ignored and the file will be opened, but
without a page buffer cache. This was previously an error.
* When opening a file that has a page size larger than the desired
page buffer cache size, the page buffer cache size will be increased
to the file's page size. This was previously an error.
The behavior when creating a file using H5Pset_page_buffer_size() is
unchanged.
Fixes GitHub issue #3382
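The relaxed rules for opening an existing file can be modeled as below. This
is a sketch in Python, not the library's implementation. The function and
parameter names are hypothetical, and sizes are in bytes.

```python
def effective_page_buffer_size(requested, file_uses_paged_strategy,
                               file_page_size):
    """Model the page buffer size used when opening an existing file."""
    if requested == 0:
        # No page buffer was requested on the fapl.
        return 0
    if not file_uses_paged_strategy:
        # File does not use H5F_FSPACE_STRATEGY_PAGE: the setting is ignored
        # and the file is opened without a page buffer cache.
        return 0
    # A file page size larger than the request grows the cache to one page.
    return max(requested, file_page_size)


print(effective_page_buffer_size(1048576, False, 4096))   # 0
print(effective_page_buffer_size(4096, True, 65536))      # 65536
print(effective_page_buffer_size(1048576, True, 4096))    # 1048576
```

Both of the first two cases were previously reported as errors when the file
was opened.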
- Added support for _Float16 16-bit half-precision floating-point datatype
Support for the _Float16 C datatype has been added on platforms where:
- The _Float16 datatype and its associated macros (FLT16_MIN, FLT16_MAX,
FLT16_EPSILON, etc.) are available
- A simple test program that converts between the _Float16 datatype and
other datatypes with casts can be successfully compiled and run at
configure time. Some compilers appear to be buggy or feature-incomplete
in this regard and will generate calls to compiler-internal functions
for converting between the _Float16 datatype and other datatypes, but
will not link these functions into the build, resulting in build
failures.
The following new macros have been added:
H5_HAVE__FLOAT16 - This macro is defined in H5pubconf.h and will have
the value 1 if support for the _Float16 datatype is
available. It will not be defined otherwise.
H5_SIZEOF__FLOAT16 - This macro is defined in H5pubconf.h and will have
a value corresponding to the size of the _Float16
datatype, as computed by sizeof(). It will have the
value 0 if support for the _Float16 datatype is not
available.
H5_HAVE_FABSF16 - This macro is defined in H5pubconf.h and will have the
value 1 if the fabsf16 function is available for use.
H5_LDOUBLE_TO_FLOAT16_CORRECT - This macro is defined in H5pubconf.h and
will have the value 1 if the platform can
correctly convert long double values to
_Float16. Some compilers have issues with
this.
H5T_NATIVE_FLOAT16 - This macro maps to the ID of an HDF5 datatype representing
the native C _Float16 datatype for the platform. If
support for the _Float16 datatype is not available, the
macro will map to H5I_INVALID_HID and should not be used.
H5T_IEEE_F16BE - This macro maps to the ID of an HDF5 datatype representing
a big-endian IEEE 754 16-bit floating-point datatype. This
datatype is available regardless of whether _Float16 support
is available or not.
H5T_IEEE_F16LE - This macro maps to the ID of an HDF5 datatype representing
a little-endian IEEE 754 16-bit floating-point datatype.
This datatype is available regardless of whether _Float16
support is available or not.
The following new hard datatype conversion paths have been added, but
will only be used when _Float16 support is available:
H5T_NATIVE_SCHAR <-> H5T_NATIVE_FLOAT16 | H5T_NATIVE_UCHAR <-> H5T_NATIVE_FLOAT16
H5T_NATIVE_SHORT <-> H5T_NATIVE_FLOAT16 | H5T_NATIVE_USHORT <-> H5T_NATIVE_FLOAT16
H5T_NATIVE_INT <-> H5T_NATIVE_FLOAT16 | H5T_NATIVE_UINT <-> H5T_NATIVE_FLOAT16
H5T_NATIVE_LONG <-> H5T_NATIVE_FLOAT16 | H5T_NATIVE_ULONG <-> H5T_NATIVE_FLOAT16
H5T_NATIVE_LLONG <-> H5T_NATIVE_FLOAT16 | H5T_NATIVE_ULLONG <-> H5T_NATIVE_FLOAT16
H5T_NATIVE_FLOAT <-> H5T_NATIVE_FLOAT16 | H5T_NATIVE_DOUBLE <-> H5T_NATIVE_FLOAT16
H5T_NATIVE_LDOUBLE <-> H5T_NATIVE_FLOAT16
The H5T_NATIVE_LDOUBLE -> H5T_NATIVE_FLOAT16 hard conversion path will only
be available and used if H5_LDOUBLE_TO_FLOAT16_CORRECT has a value of 1. Otherwise,
the conversion will be emulated in software by the library.
Note that in the absence of any compiler flags for architecture-specific
tuning, the generated code for datatype conversions with the _Float16 type
may perform conversions by first promoting the type to float. Use of
architecture-specific tuning compiler flags may instead allow for the
generation of specialized instructions, such as AVX512-FP16 instructions,
if available.
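As an illustration of the 16-bit format itself (independent of the HDF5 API),
Python's struct module implements the IEEE 754 binary16 encoding through its
"e" format code. The byte strings below show the layouts that a
little-endian and a big-endian 16-bit IEEE floating-point datatype, such as
H5T_IEEE_F16LE and H5T_IEEE_F16BE, describe. The chosen value is arbitrary.

```python
import struct

value = 1.5

# "<e" packs a little-endian binary16 value, ">e" a big-endian one.
le = struct.pack("<e", value)
be = struct.pack(">e", value)
print(le.hex(), be.hex())          # 003e 3e00

# The two encodings differ only in byte order.
print(be == le[::-1])              # True

# 0x7BFF is the largest finite binary16 value.
largest = struct.unpack("<e", bytes.fromhex("ff7b"))[0]
print(largest)                     # 65504.0
```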
- Made several improvements to the datatype conversion code
* The datatype conversion code was refactored to use pointers to
H5T_t datatype structures internally rather than IDs wrapping
the pointers to those structures. These IDs are needed if an
application-registered conversion function or conversion exception
function are involved during the conversion process. For simplicity,
the conversion code simply passed these IDs down and let the internal
code unwrap the IDs as necessary when needing to access the wrapped
H5T_t structures. However, this could cause a significant amount of
repeated ID lookups for compound datatypes and other container-like
datatypes. The code now passes down pointers to the datatype
structures and only creates IDs to wrap those pointers as necessary.
Quick testing showed an average ~3x to ~10x improvement in performance
of conversions on container-like datatypes, depending on the
complexity of the datatype.
* A conversion "context" structure was added to hold information about
the current conversion being performed. This allows conversions on
container-like datatypes to be optimized better by skipping certain
portions of the conversion process that remain relatively constant
when multiple elements of the container-like datatype are being
converted.
* After refactoring the datatype conversion code to use pointers
internally rather than IDs, several copies of datatypes that were
made by higher levels of the library were able to be removed. The
internal IDs that were previously registered to wrap those copied
datatypes were also able to be removed.
- Implemented optimized support for vector I/O in the Subfiling VFD
Previously, the Subfiling VFD would handle vector I/O requests by
breaking them down into individual I/O requests, one for each entry
in the I/O vectors provided. This could result in poor I/O performance
for features in HDF5 that utilize vector I/O, such as parallel I/O
to filtered datasets. The Subfiling VFD now properly handles vector
I/O requests in their entirety, resulting in fewer I/O calls, improved
vector I/O performance and improved vector I/O memory efficiency.
- Added a simple cache to the read-only S3 (ros3) VFD
The read-only S3 VFD now caches the first N bytes of a file stored
in S3 to avoid a lot of small I/O operations when opening files.
This cache is per-file and created when the file is opened.
N is currently 16 MiB or the size of the file, whichever is smaller.
Addresses GitHub issue #3381
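The caching idea can be sketched as follows. This is a simplified model in
Python, not the ros3 VFD code. The class, the fetch callable, and the limit
parameter are hypothetical, and sending any read that is not fully inside
the cached range straight to the store is an assumption of this sketch.

```python
CACHE_LIMIT = 16 * 1024 * 1024  # 16 MiB, as stated above


class FirstBytesCache:
    """Per-file cache of the first N bytes, filled when the file is opened."""

    def __init__(self, file_size, fetch, limit=CACHE_LIMIT):
        self.fetch = fetch
        # N is the limit or the size of the file, whichever is smaller.
        self.size = min(limit, file_size)
        self.data = fetch(0, self.size)   # a single request at open time

    def read(self, offset, length):
        if offset + length <= self.size:
            # Fully inside the cached range: no request is issued.
            return self.data[offset:offset + length]
        return self.fetch(offset, length)


# Stand-in for an object in S3, with a log of the requests issued.
blob = bytes(range(256)) * 4
requests = []


def fetch(offset, length):
    requests.append((offset, length))
    return blob[offset:offset + length]


cache = FirstBytesCache(len(blob), fetch, limit=100)
cache.read(0, 8)       # served from the cache
cache.read(50, 50)     # served from the cache
cache.read(90, 20)     # extends past the cached range, so it is fetched
print(requests)        # [(0, 100), (90, 20)]
```

The small limit is used only to keep the demonstration short.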
- Added new API function H5Pget_actual_selection_io_mode()
This function allows the user to determine if the library performed
selection I/O, vector I/O, or scalar (legacy) I/O during the last HDF5
operation performed with the provided DXPL.
- Added support for in-place type conversion in most cases
In-place type conversion allows the library to perform type conversion
without an intermediate type conversion buffer. This can improve
performance by allowing I/O in a single operation over the entire
selection instead of being limited by the size of the intermediate buffer.
Implemented for I/O on contiguous and chunked datasets when the selection
is contiguous in memory and when the memory datatype is not smaller than
the file datatype.
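The conditions listed above can be written as a single predicate. This is an
illustrative Python sketch with hypothetical names; the library's actual
checks may involve further conditions not described here.

```python
def can_convert_in_place(layout, selection_contiguous_in_memory,
                         mem_type_size, file_type_size):
    """Model the stated conditions for in-place type conversion."""
    return (layout in ("contiguous", "chunked")
            and selection_contiguous_in_memory
            and mem_type_size >= file_type_size)


# 8-byte memory type, 4-byte file type, contiguous selection in memory.
print(can_convert_in_place("chunked", True, 8, 4))      # True
# Memory type smaller than the file type.
print(can_convert_in_place("contiguous", True, 2, 4))   # False
```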
- Changed selection I/O to be on by default when using the MPIO file driver
- Added support for selection I/O in the MPIO file driver
Previously, only vector I/O operations were supported. Support for
selection I/O should improve performance and reduce memory usage in some
cases.
- Changed the error handling for a path that is not found during the plugin
search.
Previously, while attempting to load a plugin, the HDF5 library would fail
if one of the directories in the plugin paths did not exist, even if there
were more paths to check. Instead of exiting the function with an error,
the library now logs the error and continues processing the list of paths.
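The changed control flow can be sketched as follows. This is a Python model
with hypothetical names, not the library's plugin code. Looking a plugin up
by file name is a simplification made for the example.

```python
import os
import tempfile


def find_plugin(paths, filename, log):
    """Search the directories in order; a missing directory is not fatal."""
    for directory in paths:
        if not os.path.isdir(directory):
            # Log the problem and keep going instead of returning an error.
            log.append("cannot open directory: " + directory)
            continue
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


with tempfile.TemporaryDirectory() as real_dir:
    plugin = os.path.join(real_dir, "libexample_filter.so")
    open(plugin, "w").close()
    missing = os.path.join(real_dir, "does-not-exist")

    log = []
    found = find_plugin([missing, real_dir], "libexample_filter.so", log)
    print(found == plugin, len(log))   # True 1
```

The missing first directory produces one log entry, and the plugin is still
found in the second directory.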
- Implemented support for temporary security credentials for the Read-Only
S3 (ROS3) file driver.
When using temporary security credentials, one also needs to specify a
session/security token in addition to the access key ID and secret access key.
This token can be specified by the new API function H5Pset_fapl_ros3_token().
The API function H5Pget_fapl_ros3_token() can be used to retrieve
the currently set token.
- Added a Subfiling VFD configuration file prefix environment variable
The Subfiling VFD now checks for values set in a new environment
variable "H5FD_SUBFILING_CONFIG_FILE_PREFIX" to determine if the
application has specified a pathname prefix to apply to the file
path for its configuration file. For example, this can be useful
for cases where the application wishes to write subfiles to a
machine's node-local storage while placing the subfiling configuration
file on a file system readable by all machine nodes.
- Added H5Pset_selection_io(), H5Pget_selection_io(), and
H5Pget_no_selection_io_cause() API functions to manage the selection I/O
feature. This can be used to enable collective I/O with type conversion,
or it can be used with custom VFDs that support vector or selection I/O.
- Added H5Pset_modify_write_buf() and H5Pget_modify_write_buf() API
functions to allow the library to modify the contents of write buffers, in
order to avoid malloc/memcpy. Currently only used for type conversion
with selection I/O.
Parallel Library:
-----------------
- Added optimized support for the parallel compression feature when
using the multi-dataset I/O API routines collectively
Previously, calling H5Dwrite_multi/H5Dread_multi collectively in parallel
with a list containing one or more filtered datasets would cause HDF5 to
break out of the optimized multi-dataset I/O mode and instead perform I/O
by looping over each dataset in the I/O request. The library has now been
updated to perform I/O in a more optimized manner in this case by first
performing I/O on all the filtered datasets at once and then performing
I/O on all the unfiltered datasets at once.
- Changed H5Pset_evict_on_close so that it can be called with a parallel
build of HDF5
Previously, H5Pset_evict_on_close would always fail when called from a
parallel build of HDF5, stating that the feature is not supported with
parallel HDF5. This failure would occur even if a parallel build of HDF5
was used with a serial HDF5 application. H5Pset_evict_on_close can now
be called regardless of the library build type and the library will
instead fail during H5Fcreate/H5Fopen if the "evict on close" property
has been set to true and the file is being opened for parallel access
with more than 1 MPI process.
Fortran Library:
----------------
- Added Fortran H5E APIs:
h5eregister_class_f, h5eunregister_class_f, h5ecreate_msg_f, h5eclose_msg_f
h5eget_msg_f, h5epush_f, h5eget_num_f, h5ewalk_f, h5eget_class_name_f,
h5eappend_stack_f, h5eget_current_stack_f, h5eset_current_stack_f, h5ecreate_stack_f,
h5eclose_stack_f, h5epop_f, h5eprint_f (C h5eprint v2 signature)
- Added API support for Fortran MPI_F08 module definitions:
Adds support for MPI's MPI_F08 module datatypes: type(MPI_COMM) and type(MPI_INFO) for HDF5 APIs:
H5PSET_FAPL_MPIO_F, H5PGET_FAPL_MPIO_F, H5PSET_MPI_PARAMS_F, H5PGET_MPI_PARAMS_F
Ref. #3951
- Added Fortran APIs:
H5FGET_INTENT_F, H5SSEL_ITER_CREATE_F, H5SSEL_ITER_GET_SEQ_LIST_F,
H5SSEL_ITER_CLOSE_F, H5SSEL_ITER_RESET_F
- Added Fortran Parameters:
H5S_SEL_ITER_GET_SEQ_LIST_SORTED_F, H5S_SEL_ITER_SHARE_WITH_DATASPACE_F
- Added Fortran Parameters:
H5S_BLOCK_F and H5S_PLIST_F
- The configuration definitions file, H5config_f.inc, is now installed
and the HDF5 version number has been added to it.
- Added Fortran APIs:
h5fdelete_f
- Added Fortran APIs:
h5vlnative_addr_to_token_f and h5vlnative_token_to_address_f
- Fixed an uninitialized error return value: hdferr now returns the error
state of the h5aopen_by_idx_f API.
- Added h5pget_vol_cap_flags_f and related Fortran VOL
capability definitions.
- Fortran async APIs H5A, H5D, H5ES, H5G, H5F, H5L and H5O were added.
- Added Fortran APIs:
h5pset_selection_io_f, h5pget_selection_io_f,
h5pget_actual_selection_io_mode_f,
h5pset_modify_write_buf_f, h5pget_modify_write_buf_f
- Added Fortran APIs:
h5get_free_list_sizes_f, h5dwrite_chunk_f, h5dread_chunk_f,
h5fget_info_f, h5lvisit_f, h5lvisit_by_name_f,
h5pget_no_selection_io_cause_f, h5pget_mpio_no_collective_cause_f,
h5sselect_shape_same_f, h5sselect_intersect_block_f,
h5pget_file_space_page_size_f, h5pset_file_space_page_size_f,
h5pget_file_space_strategy_f, h5pset_file_space_strategy_f
- Removed "-commons" linking option on Darwin, as COMMON and EQUIVALENCE
are no longer used in the Fortran source.
Fixes GitHub issue #3571
C++ Library:
------------
-
Java Library:
-------------
-
Tools:
------
-
High-Level APIs:
----------------
- Added Fortran HL API: h5doappend_f
C Packet Table API:
-------------------
-
Internal header file:
---------------------
-
Documentation:
--------------
-
Support for new platforms, languages and compilers
==================================================
-
Bug Fixes since HDF5-1.14.0 release
===================================
Configuration:
-------------
- Fix Autotools -Werror cleanup
The Autotools temporarily scrub -Werror(=whatever) from CFLAGS, etc.
so configure checks don't trip over warnings generated by configure
check programs. The sed line originally only scrubbed -Werror but not
-Werror=something, which would cause errors when the '=something' was
left behind in CFLAGS.
The sed line has been updated to handle -Werror=something lines.
Fixes one issue raised in #3872
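The difference between the two behaviors can be illustrated with a small
Python model. The actual sed expression is not reproduced here, and both
functions below are hypothetical stand-ins for it.

```python
import re


def scrub_werror_old(flags):
    # Earlier behavior as described above: only the literal "-Werror" is
    # removed, so the "=something" part of -Werror=something is left behind.
    return flags.replace("-Werror", "")


def scrub_werror(flags):
    # Updated behavior: drop -Werror and -Werror=<something> as whole tokens.
    kept = [flag for flag in flags.split()
            if not re.fullmatch(r"-Werror(=\S+)?", flag)]
    return " ".join(kept)


cflags = "-O2 -Werror=format -Wall"
print(scrub_werror_old(cflags))   # -O2 =format -Wall
print(scrub_werror(cflags))       # -O2 -Wall
```

The stray "=format" token in the first result is the kind of leftover that
caused the configure errors.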
Library
-------
- Fixed a divide-by-zero issue when a corrupt file sets the page size to 0
If a corrupt file sets the page buffer size in the superblock to zero,
the library could attempt to divide by zero when allocating space in
the file. The library now checks for valid page buffer sizes when
reading the superblock message.
Fixes oss-fuzz issue 58762
- Fixed a bug when using array datatypes with certain parent types
Array datatype conversion would never use a background buffer, even if the
array's parent type (what the array is an array of) required a background
buffer for conversion. This resulted in crashes in some cases when using
an array of compound, variable length, or reference datatypes. Array types
now use a background buffer if needed by the parent type.
- Fixed potential buffer read overflows in H5PB_read
H5PB_read previously did not account for the fact that the size of the
read it's performing could overflow the page buffer pointer, depending
on the calculated offset for the read. This has been fixed by adjusting
the size of the read if it's determined that it would overflow the page.
- Fixed CVE-2017-17507
This CVE was previously declared fixed, but later testing with a static
build of HDF5 showed that it was not fixed.
When parsing a malformed (fuzzed) compound type containing variable-length
string members, the library could produce a segmentation fault, crashing
the library.
This was fixed after GitHub PR #4234
Fixes GitHub issue #3446
- Fixed a cache assert with very large metadata objects
If the library tries to load a metadata object that is above a
certain size, this would trip an assert in debug builds. This could
happen if you create a very large number of links in an old-style
group that uses local heaps.
There is no need for this assert. The library's metadata cache
can handle large objects. The assert has been removed.
Fixes GitHub #3762
- Fixed an issue with the Subfiling VFD and multiple opens of a
file
An issue with the way the Subfiling VFD handles multiple opens
of the same file caused the file structures for the extra opens
to occasionally get mapped to an incorrect subfiling context
object. The VFD now correctly maps the file structures for
additional opens of an already open file to the same context
object.
- Fixed a bug that causes the library to incorrectly identify
the endian-ness of 16-bit and smaller C floating-point datatypes
When detecting the endian-ness of an in-memory C floating-point
datatype, the library previously always assumed that the type
was at least 32 bits in size. This resulted in invalid memory
accesses and would usually cause the library to identify the
datatype as having an endian-ness of H5T_ORDER_VAX. This has
now been fixed.
- Fixed a bug that causes an invalid memory access issue when
converting 16-bit floating-point values to integers with the
library's software conversion function
The H5T__conv_f_i function previously always assumed that
floating-point values were at least 32 bits in size and would
access invalid memory when attempting to convert 16-bit
floating-point values to integers. To fix this, parts of the
H5T__conv_f_i function had to be rewritten, which also resulted
in a significant speedup when converting floating-point values
to integers where the library does not have a hard conversion
path. This is the case for any floating-point values with a
datatype not represented by H5T_NATIVE_FLOAT16 (if _Float16 is
supported), H5T_NATIVE_FLOAT, H5T_NATIVE_DOUBLE or
H5T_NATIVE_LDOUBLE.
- Fixed a bug that can cause incorrect data when overflows occur
while converting integer values to floating-point values with
the library's software conversion function
The H5T__conv_i_f function had a bug which previously caused it
to return incorrect data when an overflow occurs and an application's
conversion exception callback function decides not to handle the
overflow. Rather than return positive infinity, the library would
return truncated data. This has now been fixed.
- Corrected H5Soffset_simple() when offset is NULL
The reference manual states that the offset parameter of H5Soffset_simple()
can be set to NULL to reset the offset of a simple dataspace to 0. This
has never been true, and passing NULL was regarded as an error.
The library will now accept NULL for the offset parameter and will
correctly set the offset to zero.
Fixes HDFFV-9299
- Fixed an issue where the Subfiling VFD's context object cache could
grow too large
The Subfiling VFD keeps a cache of its internal context objects to
speed up access to a context object for a particular file, as well
as access to that object across multiple opens of the same file.
However, opening a large number of files with the Subfiling VFD over
the course of an application's lifetime could cause this cache to grow
too large and result in the application running out of available MPI
communicator objects. On file close, the Subfiling VFD now simply
evicts context objects out of its cache and frees them. It is assumed
that multiple opens of a file will be a less common use case for the
Subfiling VFD, but this can be revisited if it proves to be an issue
for performance.
- Fixed error when overwriting certain nested variable length types
Previously, when using a datatype that included a variable length type
within a compound or array within another variable length type, and
overwriting data with a shorter (top level) variable length sequence, an
error could occur. This has been fixed.
- Take user block into account in H5Dchunk_iter() and H5Dget_chunk_info()
The address reported by the following functions did not correctly
take the user block into account:
* H5Dchunk_iter() <-- addr passed to callback
* H5Dget_chunk_info() <-- addr parameter
* H5Dget_chunk_info_by_coord() <-- addr parameter
This means that these functions reported logical HDF5 file addresses,
which would only be equal to the physical addresses when there is no
user block prepended to the HDF5 file. This is unfortunate, as the
primary use of these functions is to get physical addresses in order
to directly access the chunks.
The listed functions now correctly take the user block into account,
so they will emit physical addresses that can be used to directly
access the chunks.
Fixes #3003
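The relationship between the two kinds of address can be sketched as follows. This is only an illustration, not the library's internal code: the function name and the use of UINT64_MAX as a stand-in for an undefined address are assumptions made for this sketch.

```c
#include <stdint.h>

/* Stand-in for an undefined address; this sentinel is hypothetical. */
#define SKETCH_ADDR_UNDEF UINT64_MAX

/* Convert a logical HDF5 address into a physical file offset by adding
 * the size of the user block that precedes the HDF5 data in the file. */
static uint64_t
sketch_physical_addr(uint64_t logical_addr, uint64_t userblock_size)
{
    if (logical_addr == SKETCH_ADDR_UNDEF)
        return SKETCH_ADDR_UNDEF; /* unallocated chunk: nothing to offset */
    if (logical_addr >= UINT64_MAX - userblock_size)
        return SKETCH_ADDR_UNDEF; /* sum would overflow or hit the sentinel */
    return logical_addr + userblock_size;
}
```

With a 512-byte user block, a chunk at logical address 2048 would be found at byte offset 2560 in the file. With no user block, the two addresses coincide.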
- Fixed asserts raised by large values of H5Pset_est_link_info() parameters
If large values for est_num_entries and/or est_name_len were passed
to H5Pset_est_link_info(), the library would attempt to create an
object header NIL message to reserve enough space to hold the links in
compact form (i.e., concatenated), which could exceed allowable object
header message size limits and trip asserts in the library.
This bug only occurred when using the HDF5 1.8 file format or later and
required the product of the two values to be ~64k more than the size
of any links written to the group, which would cause the library to
write out a too-large NIL spacer message to reserve the space for the
unwritten links.
The library now inspects the phase change values to see if the group
is likely to be compact and checks the size to ensure any NIL spacer
messages won't be larger than the library allows.
Fixes GitHub #1632
- Fixed a bug where H5Tset_fields does not account for any offset
set for a floating-point datatype when determining if values set
for spos, epos, esize, mpos and msize make sense for the datatype
Previously, H5Tset_fields did not take datatype offsets into account
when determining if the values set make sense for the datatype.
This would cause the function to fail when the precision for a
datatype is correctly set such that the offset bits are not included.
This has now been fixed.
- Fixed H5Fget_access_plist so that it returns the file locking
settings for a file
When H5Fget_access_plist (and the internal H5F_get_access_plist)
is called on a file, the returned File Access Property List has
the library's default file locking settings rather than any
settings set for the file. This causes two problems:
- Opening an HDF5 file through an external link using H5Gopen,
H5Dopen, etc. with H5P_DEFAULT for the Dataset/Group/etc.
Access Property List will cause the external file to be opened
with the library's default file locking settings rather than
inheriting them from the parent file. This can be surprising
when a file is opened with file locking disabled, but its
external files are opened with file locking enabled.
- An application cannot make use of the H5Pset_elink_fapl
function to match file locking settings between an external
file and its parent file without knowing the correct setting
ahead of time, as calling H5Fget_access_plist on the parent
file will not return the correct settings.
This has been fixed by copying a file's file locking settings
into the newly-created File Access Property List in H5F_get_access_plist.
This fix partially addresses GitHub issue #4011
- Memory usage growth issue
Starting with the HDF5 1.12.1 release, an issue (GitHub issue #1256)
was observed where running a simple program that has a loop of opening
a file, reading from an object with a variable-length datatype and
then closing the file would result in the process fairly quickly
running out of memory. Upon further investigation, it was determined
that this memory was being kept around in the library's datatype
conversion pathway cache that is used to speed up datatype conversions
which are repeatedly used within an HDF5 application's lifecycle. For
conversions involving variable-length or reference datatypes, each of
these cached pathway entries keeps a reference to its associated file
for later use. Since the file was being closed and reopened on each
loop iteration, and since the library compares for equality between
instances of opened files (rather than equality of the actual files)
when determining if it can reuse a cached conversion pathway, it was
determining that no cached conversion pathways could be reused and was
creating a new cache entry on each loop iteration during I/O. This
would lead to constant growth of that cache and the memory it consumed,
as well as constant growth of the memory consumed by each cached entry
for the reference to its associated file.
To fix this issue, the library now removes any cached datatype
conversion path entries for variable-length or reference datatypes
associated with a particular file when that file is closed.
Fixes GitHub #1256
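The effect of the fix can be illustrated with a toy cache. This is a simplified sketch, not the library's datatype conversion path table: every name below is hypothetical, and an integer handle stands in for an open-file instance.

```c
#include <stddef.h>

#define SKETCH_MAX_PATHS 16

/* One cached conversion path; file_handle identifies the open-file
 * instance that the path holds a reference to. */
typedef struct {
    int file_handle;
} sketch_path_t;

typedef struct {
    sketch_path_t paths[SKETCH_MAX_PATHS];
    size_t        npaths;
} sketch_cache_t;

/* Look up a path for this open-file instance, adding one if absent.
 * Returns 0 on success, -1 if the cache is full. */
static int
sketch_cache_use(sketch_cache_t *c, int file_handle)
{
    for (size_t i = 0; i < c->npaths; i++)
        if (c->paths[i].file_handle == file_handle)
            return 0; /* an existing entry can be reused */
    if (c->npaths == SKETCH_MAX_PATHS)
        return -1;
    c->paths[c->npaths++].file_handle = file_handle;
    return 0;
}

/* On file close, drop every path tied to that open-file instance. */
static void
sketch_cache_evict_file(sketch_cache_t *c, int file_handle)
{
    size_t kept = 0;

    for (size_t i = 0; i < c->npaths; i++)
        if (c->paths[i].file_handle != file_handle)
            c->paths[kept++] = c->paths[i];
    c->npaths = kept;
}
```

Because each reopen yields a new handle, a loop that only calls sketch_cache_use() adds one entry per iteration. Calling sketch_cache_evict_file() at close keeps the cache from growing.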
- Suppressed floating-point exceptions in H5T init code
The floating-point datatype initialization code in H5Tinit_float.c
could raise FE_INVALID exceptions while munging bits and performing
comparisons that might involve NaN. This was not a problem when the
initialization code was executed in H5detect at compile time (prior
to 1.14.3), but now that the code is executed at library startup
(1.14.3+), these exceptions can be caught by user code, as is the
default in the NAG Fortran compiler.
Starting in 1.14.4, we now suppress floating-point exceptions while
initializing the floating-point types and clear FE_INVALID before
restoring the original environment.
Fixes GitHub #3831
- Fixed a file handle leak in the core VFD
When opening a file with the core VFD and a file image, if the file
already exists, the file check would leak the POSIX file handle.
Fixes GitHub issue #635
- Fixed some issues with chunk index metadata not getting read
collectively when collective metadata reads are enabled
When looking up dataset chunks during I/O, the parallel library
temporarily disables collective metadata reads since it's generally
unlikely that the application will read the same chunks from all
MPI ranks. Leaving collective metadata reads enabled during
chunk lookups can lead to hangs or other bad behavior depending
on the chunk indexing structure used for the dataset in question.
However, due to the way that dataset chunk index metadata was
previously loaded in a deferred manner, this could mean that
the metadata for the main chunk index structure or its
accompanying pieces of metadata (e.g., fixed array data blocks)
could end up being read independently if these chunk lookup
operations are the first chunk index-related operation that
occurs on a dataset. This behavior is generally observed when
opening a dataset for which the metadata isn't in the metadata
cache yet and then immediately performing I/O on that dataset.
This behavior is not generally observed when creating a dataset
and then performing I/O on it, as the relevant metadata will
usually be in the metadata cache as a side effect of creating
the chunk index structures during dataset creation.
This issue has been fixed by adding callbacks to the different
chunk indexing structure classes that allow more explicit control
over when chunk index metadata gets loaded. When collective
metadata reads are enabled, the necessary index metadata will now
get loaded collectively by all MPI ranks at the start of dataset
I/O to ensure that the ranks don't unintentionally read this
metadata independently further on. These changes fix collective
loading of the main chunk index structure, as well as v2 B-tree
root nodes, extensible array index blocks and fixed array data
blocks. There are still pieces of metadata that cannot currently
be loaded collectively, however, such as extensible array data
blocks, data block pages and super blocks, as well as fixed array
data block pages. These pieces of metadata are not necessarily
read in by all MPI ranks since this depends on which chunks the
ranks have selected in the dataset. Therefore, reading of these
pieces of metadata remains an independent operation.
- Fixed potential hangs in parallel library during collective I/O with
independent metadata writes
When performing collective parallel writes to a dataset where metadata
writes are requested as (or left as the default setting of) independent,
hangs could potentially occur during metadata cache sync points. This
was due to incorrect management of the internal state tracking whether
an I/O operation should be collective or not, causing the library to
attempt collective writes of metadata when they were meant to be
independent writes. During the metadata cache sync points, if the number
of cache entries being flushed was a multiple of the number of MPI ranks
in the MPI communicator used to access the HDF5 file, an equal number of
collective MPI I/O calls were made and the dataset write call would be
successful. However, when the number of cache entries being flushed was
NOT a multiple of the number of MPI ranks, the ranks with more entries
than others would get stuck in an MPI_File_set_view call, while other
ranks would get stuck in a post-write MPI_Barrier call. This issue has
been fixed by correctly switching to independent I/O temporarily when
writing metadata independently during collective dataset I/O.
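The arithmetic behind the hang can be sketched as follows. The sketch assumes, for illustration only, that the flushed entries are divided as evenly as possible among the ranks; the function names are hypothetical and num_ranks must be at least 1.

```c
#include <stdbool.h>
#include <stddef.h>

/* Number of entries a given rank flushes when num_entries are divided as
 * evenly as possible among num_ranks (an assumed distribution). */
static size_t
sketch_entries_for_rank(size_t num_entries, size_t num_ranks, size_t rank)
{
    size_t base  = num_entries / num_ranks;
    size_t extra = num_entries % num_ranks;

    return base + (rank < extra ? 1 : 0);
}

/* True when every rank would issue the same number of collective calls.
 * Under the assumed distribution, rank 0 has the most entries and the
 * last rank has the fewest, so comparing those two is sufficient. */
static bool
sketch_collective_calls_match(size_t num_entries, size_t num_ranks)
{
    return sketch_entries_for_rank(num_entries, num_ranks, 0) ==
           sketch_entries_for_rank(num_entries, num_ranks, num_ranks - 1);
}
```

With 8 entries on 4 ranks, every rank makes 2 calls. With 10 entries on 4 ranks, ranks 0 and 1 make 3 calls while ranks 2 and 3 make only 2, so a collective call on the former never finds its partners.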
- Dropped support for MPI-2
The MPI-2 supporting artifacts have been removed due to the cessation
of MPI-2 maintenance and testing since version HDF5 1.12.
- Fixed a bug with the way the Subfiling VFD assigns I/O concentrators
During a file open operation, the Subfiling VFD determines the topology
of the application and uses that to select a subset of MPI ranks that
I/O will be forwarded to, called I/O concentrators. The code for this
had previously assumed that the parallel job launcher application (e.g.,
mpirun, srun, etc.) would distribute MPI ranks sequentially among a node
until all processors on that node have been assigned before going on to
the next node. When the launcher application mapped MPI ranks to nodes
in a different fashion, such as round-robin, this could cause the Subfiling
VFD to incorrectly map MPI ranks as I/O concentrators, leading to missing
subfiles.
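The two launcher placements mentioned above can be compared with a small sketch. The function names are hypothetical and the formulas describe idealized "block" and "round-robin" placements, not the VFD's actual topology discovery code.

```c
#include <stddef.h>

/* Node index of a rank when the launcher fills each node before moving on
 * ("block" placement), with procs_per_node ranks on each node. */
static size_t
sketch_node_block(size_t rank, size_t procs_per_node)
{
    return rank / procs_per_node;
}

/* Node index of a rank when the launcher deals ranks out one node at a
 * time ("round-robin" placement) across num_nodes nodes. */
static size_t
sketch_node_round_robin(size_t rank, size_t num_nodes)
{
    return rank % num_nodes;
}
```

On 2 nodes with 2 ranks each, block placement puts rank 2 on node 1, whereas round-robin placement puts it on node 0. Code that assumes block placement and picks ranks 0 and 2 as one concentrator per node would, under round-robin placement, select two ranks that share node 0.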
- Fixed performance regression with some compound type conversions
In-place type conversion was introduced for most use cases in 1.14.2.
While being able to use the read buffer for type conversion potentially
improves performance by performing the entire I/O at once, it also
disables the optimized compound type conversion used when the destination
is a subset of the source. In-place type conversion is now disabled when
this optimized conversion is used and there is no benefit in terms of the
I/O size.
- Fixed an assertion in a previous fix for CVE-2016-4332
An assert could fail when processing corrupt files that have invalid
shared message flags (as in CVE-2016-4332).
The assert statement in question has been replaced with pointer checks
that don't raise errors. Since the function is in cleanup code, we do
our best to close and free things, even when presented with partially
initialized structs.
Fixes CVE-2016-4332 and HDFFV-9950 (confirmed via the cve_hdf5 repo)
- Fixed a file space allocation bug in the parallel library for chunked
datasets
With the addition of support for incremental file space allocation for
chunked datasets with filters applied to them that are created/accessed
in parallel, a bug was introduced to the library's parallel file space
allocation code. This could cause file space to not be allocated correctly
for datasets without filters applied to them that are created with serial
file access and later opened with parallel file access. In turn, this could
cause parallel writes to those datasets to place incorrect data in the file.
- Fixed an assertion failure in Parallel HDF5 when a file can't be created
due to an invalid library version bounds setting
An assertion failure could occur in H5MF_settle_raw_data_fsm when a file
can't be created with Parallel HDF5 due to specifying the use of a paged,
persistent file free space manager
(H5Pset_file_space_strategy(..., H5F_FSPACE_STRATEGY_PAGE, 1, ...)) with
an invalid library version bounds combination
(H5Pset_libver_bounds(..., H5F_LIBVER_EARLIEST, H5F_LIBVER_V18)). This
has now been fixed.
- Fixed bugs in selection I/O
Previously, the library could fail in some cases when performing selection
I/O with type conversion.
- Fixed CVE-2018-13867
A corrupt file containing an invalid local heap datablock address
could trigger an assert failure when the metadata cache attempted
to load the datablock from storage.
The local heap now verifies that the datablock address is valid
when the local heap header information is parsed.
- Fixed CVE-2018-11202
A malformed file could result in chunk index memory leaks. Under most
conditions (i.e., when the --enable-using-memchecker option is NOT
used), this would result in a small memory leak and an infinite loop
and abort when shutting down the library. The infinite loop would be
due to the "free list" package not being able to clear its resources
so the library couldn't shut down. When the "using a memory checker"
option is used, the free lists are disabled so there is just a memory
leak with no abort on library shutdown.
The chunk index resources are now correctly cleaned up when reading
misparsed files and valgrind confirms no memory leaks.
- Fixed an issue where an assert statement was converted to an
incorrect error check statement
An assert statement in the library dealing with undefined dataset data
fill values was converted to an improper error check that would always
trigger when a dataset's fill value was set to NULL (undefined). This
has now been fixed.
- Fixed an assertion failure when attempting to use the Subfiling IOC
VFD directly
The Subfiling feature makes use of two Virtual File Drivers, the
Subfiling VFD and the IOC (I/O Concentrator) VFD. The two VFDs are
intended to be stacked together such that the Subfiling VFD sits
"on top" of the IOC VFD and routes I/O requests through it; using the
IOC VFD alone is currently unsupported. The IOC VFD has been fixed so
that an error message is displayed in this situation rather than causing
an assertion failure.
- Fixed a potential bug when copying empty enum datatypes
Copying an empty enum datatype (including implicitly, as when an enum
is a part of a compound datatype) would fail in an assert in debug
mode and could fail in release mode depending on how the platform
handles undefined behavior regarding size 0 memory allocations and
using memcpy with a NULL src pointer.
The library is now more careful about using memory operations when
copying empty enum datatypes and will not error or raise an assert.
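The kind of guard described above might look like the following. This is a generic sketch with hypothetical names, not the library's enum copy routine. In C, malloc(0) has an implementation-defined result, and memcpy with a NULL pointer is undefined behavior even for a size of 0.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate an array of nmembs elements of elem_size bytes each.
 * For an empty array, return NULL without calling malloc(0) or memcpy
 * with a NULL source. *ok is set to 0 only on a real failure. */
static void *
sketch_copy_members(const void *src, size_t nmembs, size_t elem_size, int *ok)
{
    void *dst;

    *ok = 1;
    if (nmembs == 0 || elem_size == 0 || src == NULL)
        return NULL; /* nothing to copy: a valid, empty result */

    if (nmembs > SIZE_MAX / elem_size) {
        *ok = 0; /* total size is not representable */
        return NULL;
    }

    dst = malloc(nmembs * elem_size);
    if (dst == NULL) {
        *ok = 0;
        return NULL;
    }
    memcpy(dst, src, nmembs * elem_size);
    return dst;
}
```

The caller distinguishes "empty but fine" (NULL with *ok set to 1) from a failure (NULL with *ok set to 0).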
- Added an AAPL check to H5Acreate
A check was added to H5Acreate to ensure that a failure is correctly
returned when an invalid Attribute Access Property List is passed
in to the function. The HDF5 API tests were failing for certain
build types due to this condition not being checked previously.
- Fixed a bug in H5Ocopy that could generate invalid HDF5 files
H5Ocopy was missing a check to determine whether the new object's
object header version is greater than version 1. Without this check,
copying of objects with object headers that are smaller than a
certain size would cause H5Ocopy to create an object header for the
new object that has a gap in the header data. According to the
HDF5 File Format Specification, this is not allowed for version
1 of the object header format.
Fixes GitHub issue #2653
- Fixed H5Pget_vol_cap_flags and H5Pget_vol_id to accept H5P_DEFAULT
H5Pget_vol_cap_flags and H5Pget_vol_id were updated to correctly
accept H5P_DEFAULT for the 'plist_id' FAPL parameter. Previously,
they would fail if provided with H5P_DEFAULT as the FAPL.
- Fixed ROS3 VFD anonymous credential usage with h5dump and h5ls
ROS3 VFD anonymous credential functionality became broken in h5dump
and h5ls in the HDF5 1.14.0 release with the added support for VFD
plugins, which changed the way that the tools handled setting of
credential information that the VFD uses. The tools could be
provided the command-line option of "--s3-cred=(,,)" as a workaround
for anonymous credential usage, but the documentation for this
option stated that anonymous credentials could be used by simply
omitting the option. The latter functionality has been restored.
Fixes GitHub issue #2406
- Fixed memory leaks when processing malformed object header continuation messages
Malformed object header continuation messages can result in a too-small
buffer being passed to the decode function, which could lead to reading
past the end of the buffer. Additionally, errors in processing these
malformed messages can lead to allocated memory not being cleaned up.
This fix adds bounds checking and cleanup code to the object header
continuation message processing.
Fixes GitHub issue #2604
- Fixed memory leaks, aborts, and overflows in H5O EFL decode
The external file list code could call assert(), read past buffer
boundaries, and not properly clean up resources when parsing malformed
external data files messages.
This fix cleans up allocated memory, adds buffer bounds checks, and
converts asserts to HDF5 error checking.
Fixes GitHub issue #2605
- Fixed potential heap buffer overflow in decoding of link info message
Buffer overflow checks were added for decoding the version, index
flags, link creation order value, and the next three addresses. These
checks prevent the potential invalid reads of any of these
values that could be triggered by a malformed file.
Fixes GitHub issue #2603
- Memory leak
A memory leak was detected when running h5dump on the fuzzed file "pov". The memory was
allocated via H5FL__malloc() in hdf5/src/H5FL.c.
The fuzzed file "pov" was an HDF5 file containing an illegal continuation message.
When deserializing the object header chunks for the file, memory is allocated for the
array of continuation messages (cont_msg_info->msgs) in the continuation message info struct.
When an error is encountered while loading the illegal message, the memory allocated for
cont_msg_info->msgs is now freed.
Fixes GitHub issue #2599
- Fixed memory leaks that could occur when reading a dataset from a
malformed file
When attempting to read layout, pline, and efl information for a
dataset, memory leaks could occur if attempting to read pline/efl
information threw an error, which is due to the memory that was
allocated for pline and efl not being properly cleaned up on error.
Fixes GitHub issue #2602
- Fixed potential heap buffer overrun in group info header decoding from malformed file
H5O__ginfo_decode could sometimes read past allocated memory when parsing a
group info message from the header of a malformed file.
It now checks buffer size before each read to properly throw an error in these cases.
Fixes GitHub issue #2601
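The "check before each read" pattern can be sketched as follows. This is a generic illustration with a hypothetical helper, not the code in H5O__ginfo_decode. It assumes a little-endian 16-bit field and an end pointer that is one past the last valid byte of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian 16-bit value at *pp, refusing to read past end.
 * Returns 0 and advances *pp on success; returns -1 and leaves *pp
 * untouched if fewer than 2 bytes remain. */
static int
sketch_decode_u16(const uint8_t **pp, const uint8_t *end, uint16_t *out)
{
    const uint8_t *p = *pp;

    if (p > end || (size_t)(end - p) < 2)
        return -1;
    *out = (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
    *pp  = p + 2;
    return 0;
}
```

A decoder built from such helpers fails cleanly on a truncated message instead of reading beyond the allocation.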
- Fixed potential buffer overrun issues in some object header decode routines
Several checks were added to H5O__layout_decode and H5O__sdspace_decode to
ensure that memory buffers don't get overrun when decoding buffers read from
a (possibly corrupted) HDF5 file.
- Fixed a heap buffer overflow that occurs when reading from
a dataset with a compact layout within a malformed HDF5 file
During opening of a dataset that has a compact layout, the
library allocates a buffer that stores the dataset's raw data.
The dataset's object header that gets written to the file
contains information about how large of a buffer the library
should allocate. If this object header is malformed such that
it causes the library to allocate a buffer that is too small
to hold the dataset's raw data, future I/O to the dataset can
result in heap buffer overflows. To fix this issue, an extra
check is now performed for compact datasets to ensure that
the size of the allocated buffer matches the expected size
of the dataset's raw data (as calculated from the dataset's
dataspace and datatype information). If the two sizes do not
match, opening of the dataset will fail.
Fixes GitHub issue #2606
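The consistency check can be sketched roughly as below. The function name and parameters are hypothetical; in the library the expected size is derived from the dataset's dataspace and datatype, which this sketch reduces to an element count and an element size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the size recorded for a compact dataset's raw data buffer
 * equals nelems * elem_size, computed without overflowing size_t. */
static bool
sketch_compact_size_ok(size_t stored_size, size_t nelems, size_t elem_size)
{
    if (elem_size != 0 && nelems > SIZE_MAX / elem_size)
        return false; /* expected size is not representable */
    return stored_size == nelems * elem_size;
}
```

A header that records 16 bytes for 10 four-byte elements would be rejected, since 40 bytes are expected.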
- Fixed a memory corruption issue that can occur when reading
from a dataset using a hyperslab selection in the file
dataspace and a point selection in the memory dataspace
When reading from a dataset using a hyperslab selection in
the dataset's file dataspace and a point selection in the
dataset's memory dataspace where the file dataspace's "rank"
is greater than the memory dataspace's "rank", memory corruption
could occur due to an incorrect number of selection points
being copied when projecting the point selection onto the
hyperslab selection's dataspace.
- Fixed issues in the Subfiling VFD when using the SELECT_IOC_EVERY_NTH_RANK
or SELECT_IOC_TOTAL I/O concentrator selection strategies
Multiple bugs involving these I/O concentrator selection strategies
were fixed, including:
* A bug that caused the selection strategy to be altered when
criteria for the strategy was specified in the
H5FD_SUBFILING_IOC_SELECTION_CRITERIA environment variable as
a single value, rather than in the old and undocumented
'integer:integer' format
* Two bugs which caused a request for 'N' I/O concentrators to
result in 'N - 1' I/O concentrators being assigned, which also
led to issues if only 1 I/O concentrator was requested
Also added a regression test for these two I/O concentrator selection
strategies to prevent future issues.
- Fixed CVE-2021-37501 / GHSA-rfgw-5vq3-wrjf
Check for overflow when calculating on-disk attribute data size.
A bogus HDF5 file may contain dataspace messages with sizes
that cause the on-disk data size to exceed what is addressable.
When calculating the size, the library now makes sure that the
multiplication does not overflow.
The test case was crafted in a way that the overflow caused the
size to be 0.
Fixes GitHub #2458
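A common way to guard such a multiplication, shown here as a generic sketch with hypothetical names rather than the library's actual fix, is to compare one operand against SIZE_MAX divided by the other before multiplying. The second helper produces a value whose square wraps to exactly 0, the situation described for the test case.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply a by b into *out. Returns 0 on success, or -1 (leaving *out
 * untouched) if the product does not fit in size_t. */
static int
sketch_checked_mul(size_t a, size_t b, size_t *out)
{
    if (a != 0 && b > SIZE_MAX / a)
        return -1;
    *out = a * b;
    return 0;
}

/* A value whose square wraps around to exactly 0 in size_t arithmetic:
 * 2^(N/2), where N is the bit width of size_t. */
static size_t
sketch_wraps_to_zero(void)
{
    return (size_t)1 << (sizeof(size_t) * CHAR_BIT / 2);
}
```

An unchecked product of two such values is 0, which makes a huge attribute look empty. The checked version reports the overflow instead.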
- Fixed an issue with collective metadata writes of global heap data
New test failures in parallel netCDF started occurring with debug
builds of HDF5 due to an assertion failure and this was reported in
GitHub issue #2433. The assertion failure began happening after the
collective metadata write pathway in the library was updated to use
vector I/O so that parallel-enabled HDF5 Virtual File Drivers (other
than the existing MPI I/O VFD) can support collective metadata writes.
The assertion failure was fixed by updating collective metadata writes
to treat global heap metadata as raw data, as done elsewhere in the
library.
Fixes GitHub issue #2433
- Fixed buffer overflow error in image decoding function.
The error occurred in the function for decoding an address from the specified
buffer, which is called many times from the function responsible for image
decoding. The length of the buffer is known in the image decoding function,
but no checks were performed, so a buffer overflow could occur in many places,
including the callee functions for address decoding.
The error was fixed by inserting the corresponding checks for buffer overflow.
Fixes GitHub issue #2432
- Reading a H5std_string (std::string) via a C++ DataSet previously
truncated the string at the first null byte as if reading a C string.
Fixed length datasets are now read into H5std_string as a fixed length
string of the appropriate size. Variable length datasets will still be
truncated at the first null byte.
Fixes GitHub issue #3034
- Fixed write buffer overflow in H5O__alloc_chunk
The overflow was found by OSS-Fuzz https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=58658
- Fixed a segfault when using a user-defined conversion function between compound datatypes
During type info initialization for compound datatype conversion, the library checked if the
datatypes are subsets of one another in order to perform special conversion handling.
This check uses information that is only defined if a library conversion function is in use.
The library now skips this check for user-defined conversion functions.
Fixes GitHub issue #3840
Java Library
------------
- Fixed switch case 'L' block missing a break statement.
The HDF5Array.arrayify method was missing a break statement in the case 'L': section,
which caused it to fall through and throw an HDF5JavaException when attempting to
read an Array[Array[Long]].
The error was fixed by inserting a break statement at the end of the case 'L': section.
Fixes GitHub issue #3056
Configuration
-------------
- Changed default of 'Error on HDF5 doxygen warnings' DOXYGEN_WARN_AS_ERROR option.
The default setting of DOXYGEN_WARN_AS_ERROR has been changed from 'FAIL_ON_WARNINGS'
to 'NO'. It was decided that the setting was too aggressive and should be a user choice.
The GitHub Actions workflows and scripts have been updated to reflect this.
* HDF5_ENABLE_DOXY_WARNINGS: ON/OFF (Default: OFF)
* --enable-doxygen-errors: enable/disable (Default: disable)
- Removed an Autotools configure hack that caused problems on macOS
A sed line in configure.ac was added in the past to paper over some
problems with older versions of the Autotools that would add incorrect
linker flags. This hack is not needed with recent versions of the
Autotools and the sed line errors on macOS (though this was a silent
error that didn't break the build) so the hack has been removed.
Fixes GitHub issue #3843
- Fixed an issue where the h5tools_test_utils test program was being
installed on the system for Autotools builds of HDF5
The h5tools_test_utils test program was mistakenly added to bin_PROGRAMS
in its Makefile.am configuration file, causing the executable to be
installed on the system. The executable is now added to noinst_PROGRAMS
instead and will no longer be installed on the system for Autotools builds
of HDF5. The CMake configuration code already avoids installing the
executable on the system.
- Fixed a configuration issue that prevented building of the Subfiling VFD on macOS
Checks were added to the CMake and Autotools code to verify that CLOCK_MONOTONIC_COARSE,
PTHREAD_MUTEX_ADAPTIVE_NP and pthread_condattr_setclock() are available before attempting
to use them in Subfiling VFD-related utility code. Without these checks, attempting
to build the Subfiling VFD on macOS would fail.
- Fixed the ordering of INCLUDES when building with CMake
Include directories in the source or build tree should come before other
directories to prioritize headers in the sources over installed ones.
Fixes GitHub #1027
- The accum test now passes on macOS 12+ (Monterey) w/ CMake
Due to changes in the way macOS handles LD_LIBRARY_PATH, the accum test
started failing on macOS 12+ when building with CMake. CMake has been
updated to set DYLD_LIBRARY_PATH on macOS and the test now passes.
Fixes GitHub #2994, #2261, and #1289
- Changed the default settings used by CMake for the GZIP filter
The default for the option HDF5_ENABLE_Z_LIB_SUPPORT was OFF. Now the default is ON.
This was done to match the defaults used by the autotools configure.ac.
In addition, the CMake message level for not finding a suitable filter library was
changed from FATAL_ERROR (which would halt the build process) to WARNING (which
will print a message to stderr). Associated files and documentation were changed to match.
In addition, the default settings in the config/cmake/cacheinit.cmake file were changed to
allow CMake to disable building the filters if the tgz file could not be found. The option
to allow CMake to download the file from the original GitHub location requires setting
the ZLIB_USE_LOCALCONTENT option to OFF for gzip and the LIBAEC_USE_LOCALCONTENT
option to OFF for libaec (szip).
Fixes GitHub issue #2926
- Fixed syntax of generator expressions used by CMake
Adding quotes around the generator expression allows CMake to
correctly parse the expression. Generator expressions are typically
parsed after command arguments. If a generator expression contains
spaces, new lines, semicolons or other characters that may be
interpreted as command argument separators, the whole expression
should be surrounded by quotes when passed to a command. Failure to
do so may result in the expression being split and it may no longer
be recognized as a generator expression.
Fixes GitHub issue #2906
- Fixed improper include of Subfiling VFD build directory
With the release of the Subfiling Virtual File Driver feature, compiler
flags were added to the Autotools build's CPPFLAGS and AM_CPPFLAGS
variables to always include the Subfiling VFD source code directory,
regardless of whether the VFD is enabled and built or not. These flags
are needed because the header files for the VFD contain macros that are
assumed to always be available, such as H5FD_SUBFILING_NAME, so the
header files are unconditionally included in the HDF5 library. However,
these flags are only needed when building HDF5, so they belong in the
H5_CPPFLAGS variable instead. Inclusion in the CPPFLAGS and AM_CPPFLAGS
variables would export these flags to the h5cc and h5c++ wrapper scripts,
as well as the libhdf5.settings file, which would break builds of software
that use HDF5 and try to use or parse information out of these files after
deleting temporary HDF5 build directories.
Fixes GitHub issue #2621
- Correct the CMake generated pkg-config file
The pkg-config file generated by CMake had the order and placement of the
libraries wrong. Also added support for debug library names.
Changed the order of Libs.private libraries so that dependencies come after
dependents. Did not move the compression libraries into Requires.private
because there was not a way to determine if the compression libraries had
supported pkg-config files. Still recommend that the CMake config file method
be used for building projects with CMake.
Fixes GitHub issues #1546 and #2259
- Force lowercase Fortran module file names
The Cray Fortran compiler uses uppercase Fortran module file names, which
caused CMake installs to fail. A compiler option was added to use lowercase
instead.
Tools
-----
- Renamed h5fuse.sh to h5fuse
Addresses Discussion #3791
- Fixed an issue with unmatched MPI messages in ph5diff
The "manager" MPI rank in ph5diff was unintentionally sending "program end"
messages to its workers twice, leading to an error from MPICH similar to the
following:
Abort(810645519) on node 1 (rank 1 in comm 0): Fatal error in internal_Finalize: Other MPI error, error stack:
internal_Finalize(50)...........: MPI_Finalize failed
MPII_Finalize(394)..............:
MPIR_Comm_delete_internal(1224).: Communicator (handle=44000000) being freed has 1 unmatched message(s)
MPIR_Comm_release_always(1250)..:
MPIR_finalize_builtin_comms(154):
- Fixed an issue in h5repack for variable-length typed datasets
When repacking datasets into a new file, h5repack tries to determine whether
it can use H5Ocopy to copy each dataset into the new file, or if it needs to
manually re-create the dataset, then read data from the old dataset and write
it to the new dataset. H5repack was previously using H5Ocopy for datasets with
variable-length datatypes, but this can be problematic if the global heap
addresses involved do not match exactly between the old and new files. These
addresses could change for a variety of reasons, such as the command-line options
provided to h5repack, how h5repack allocates space in the repacked file, etc.
Since H5Ocopy does not currently perform any translation when these addresses
change, datasets that were repacked with H5Ocopy could become unreadable in the
new file. H5repack has been fixed to repack variable-length typed datasets without
using H5Ocopy to ensure that the new datasets always have the correct global heap
addresses.
- Added the --no-compact-subset option to h5diff
Object names containing square brackets are not handled correctly by h5dump
unless the special argument --no-compact-subset is given on the command line.
h5diff did not have this option; it has now been added.
Fixes GitHub issue #2682
- Fixed skipped cleanup in the tools traverse function
An error in either visit call would bypass the cleanup of the local data
variables. The H5TOOLS_GOTO_ERROR calls were replaced with H5TOOLS_ERROR.
Fixes GitHub issue #2598
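The cleanup problem described in the traverse fix can be illustrated with a
small, self-contained sketch. It is not the HDF5 tools code: visit_step and
traverse_sketch are hypothetical names, and the only point shown is that
recording an error and falling through lets the cleanup of local data run on
every path, whereas a jump past the cleanup would skip it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a visit call; fails when `should_fail` is set */
int
visit_step(int should_fail)
{
    return should_fail ? -1 : 0;
}

/*
 * Hypothetical traversal routine. A failure is recorded in `ret_value` and
 * control falls through, so the cleanup at the end runs on every path.
 * `tables_freed` counts how many times the cleanup was reached.
 * Returns 0 on success and -1 on failure.
 */
int
traverse_sketch(int visit_fails, int *tables_freed)
{
    int   ret_value = 0;
    char *visited   = malloc(64); /* stand-in for the local tracking data */

    assert(tables_freed != NULL);

    if (visited == NULL)
        return -1;

    /* Record the error instead of jumping past the cleanup below */
    if (visit_step(visit_fails) < 0)
        ret_value = -1;

    /* Cleanup of local data, reached on both success and failure */
    free(visited);
    (*tables_freed)++;

    return ret_value;
}
```

Calling traverse_sketch once with a succeeding visit and once with a failing
visit increments the counter both times, which is the property the fix is
meant to restore.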
Performance
-------------
-
Fortran API
-----------
- Fixed: HDF5 fails to compile with -Werror=lto-type-mismatch
Removed the use of the offending C stub wrapper.
Fixes GitHub issue #3987
High-Level Library
------------------
- Fixed a memory leak in H5LTopen_file_image with H5LT_FILE_IMAGE_DONT_COPY flag
When the H5LT_FILE_IMAGE_DONT_COPY flag is passed to H5LTopen_file_image, the
internally-allocated udata structure is leaked because the core file driver has
no way to determine when or if it needs to call the "udata_free" callback.
This has been fixed by freeing the udata structure when the "image_free" callback
is invoked during file close, where the file is holding the last reference to the
udata structure.
Fixes GitHub issue #827
Fortran High-Level APIs
-----------------------
-
Documentation
-------------
-
F90 APIs
--------
-
C++ APIs
--------
-
Testing
-------
- Fixed a bug in the dt_arith test when H5_WANT_DCONV_EXCEPTION is not
defined
The dt_arith test program's test_particular_fp_integer sub-test tries
to ensure that the library correctly raises a datatype conversion
exception when converting a floating-point value to an integer overflows.
However, this test would run even when H5_WANT_DCONV_EXCEPTION isn't
defined, causing the test to fail due to the library not raising
datatype conversion exceptions. This has now been fixed by not running
the test when H5_WANT_DCONV_EXCEPTION is not defined.
- Disabled running of MPI Atomicity tests for OpenMPI major versions < 5
Support for MPI atomicity operations is not implemented for major
versions of OpenMPI less than version 5. This would cause the MPI
atomicity tests for parallel HDF5 to sporadically fail when run
with OpenMPI. Testphdf5 now checks if OpenMPI is being used and will
skip running the atomicity tests if the major version of OpenMPI is
< 5.
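A minimal sketch of such a version check is shown below. It is not the
testphdf5 implementation: skip_atomicity_tests is a hypothetical helper, and
it assumes that the MPI library version string for OpenMPI begins with
"Open MPI v<major>.<minor>", which should be verified against the MPI
installation in use.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: decide whether the atomicity tests should be skipped
 * based on an MPI library version string.
 * ASSUMPTION: OpenMPI identifies itself with a string that starts with
 * "Open MPI v<major>.<minor>...".
 * Returns 1 to skip the tests and 0 to run them.
 */
int
skip_atomicity_tests(const char *library_version)
{
    const char  *prefix     = "Open MPI v";
    const size_t prefix_len = strlen(prefix);
    int          major      = 0;

    assert(library_version != NULL);

    /* Any other MPI implementation: run the tests */
    if (strncmp(library_version, prefix, prefix_len) != 0)
        return 0;

    /* Version could not be parsed: be conservative and skip */
    if (sscanf(library_version + prefix_len, "%d", &major) != 1)
        return 1;

    /* Skip for OpenMPI major versions below 5 */
    return major < 5;
}
```

In a real test program the string would come from the MPI library itself, for
example through MPI_Get_library_version; that call is omitted here so the
sketch compiles without MPI.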
- Fixed a testing failure in testphdf5 on Cray machines
On some Cray machines, what appears to be a bug in Cray MPICH was causing
calls to H5Fis_accessible to create a 0-byte file with strange Unix
permissions. This was causing an H5Fdelete file deletion test in the
testphdf5 program to fail due to a just-deleted HDF5 file appearing to
still be accessible on the file system. The issue in Cray MPICH has been
worked around for the time being by resetting the MPI_Info object on the
File Access Property List used to MPI_INFO_NULL before passing it to the
H5Fis_accessible call.
- A bug was fixed in the HDF5 API test random datatype generation code
A bug in the random datatype generation code could cause test failures
when trying to generate an enumeration datatype that has duplicated
name/value pairs in it. This has now been fixed.
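One way to avoid duplicated member values when generating a random
enumeration is to redraw a candidate until it differs from every value chosen
so far. The sketch below illustrates that idea only; it is not the code used
in the HDF5 API tests, and value_in_use and generate_unique_enum_values are
hypothetical names. Member names could be kept unique in a similar way, for
instance by embedding the member index in each name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Return 1 if `candidate` already appears in the first `count` entries */
int
value_in_use(const int *values, size_t count, int candidate)
{
    for (size_t i = 0; i < count; i++)
        if (values[i] == candidate)
            return 1;
    return 0;
}

/*
 * Hypothetical helper: fill `values` with `n` distinct pseudo-random member
 * values in the range [0, range). Returns 0 on success, or -1 if `range`
 * cannot supply `n` distinct values with rand().
 */
int
generate_unique_enum_values(int *values, size_t n, int range)
{
    assert(values != NULL);

    if (range <= 0 || range > RAND_MAX || (size_t)range < n)
        return -1;

    for (size_t i = 0; i < n; i++) {
        int candidate;

        /* Redraw until the value is not already taken by an earlier member */
        do {
            candidate = rand() % range;
        } while (value_in_use(values, i, candidate));

        values[i] = candidate;
    }

    return 0;
}
```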
- A bug was fixed in the HDF5 API test VOL connector registration checking code
The HDF5 API test code checks to see if the VOL connector specified by the
HDF5_VOL_CONNECTOR environment variable (if any) is registered with the library
before attempting to run tests with it so that testing can be skipped and an
error can be returned when a VOL connector fails to register successfully.
Previously, this code didn't account for VOL connectors that specify extra
configuration information in the HDF5_VOL_CONNECTOR environment variable and
would incorrectly report that the specified VOL connector isn't registered
due to including the configuration information as part of the VOL connector
name being checked for registration status. This has now been fixed.
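The separation of the connector name from its configuration information can be
sketched as follows. This is an illustration rather than the actual test code:
vol_connector_name is a hypothetical helper, and it assumes that the connector
name is the first whitespace-delimited token of the HDF5_VOL_CONNECTOR value,
with any remaining text being configuration information.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the HDF5 API): copy only the connector
 * name out of an HDF5_VOL_CONNECTOR-style value.
 * ASSUMPTION: the name is the first whitespace-delimited token and everything
 * after it is configuration information for the connector.
 * Returns 0 on success, or -1 if no name is present or `out` is too small.
 */
int
vol_connector_name(const char *env_value, char *out, size_t out_size)
{
    size_t len = 0;

    assert(env_value != NULL && out != NULL);

    /* Skip leading whitespace */
    while (isspace((unsigned char)*env_value))
        env_value++;

    /* The name ends at the first whitespace character or at end of string */
    while (env_value[len] != '\0' && !isspace((unsigned char)env_value[len]))
        len++;

    if (len == 0 || len >= out_size)
        return -1;

    memcpy(out, env_value, len);
    out[len] = '\0';

    return 0;
}
```

For a value such as "pass_through under_vol=0;under_info={}", the helper
yields "pass_through", which is the string that would then be checked for
registration status.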
- Fixed Fortran 2003 test with gfortran 13 at optimization levels -O2 and -O3
Fixed a failing Fortran 2003 test with gfortran at optimization levels -O2
and -O3 when used with -fdefault-real-16.
Fixes GitHub issue #2928
Platforms Tested
===================
- HDF5 supports the latest macOS versions, including the current and two
preceding releases. As new major macOS versions become available, HDF5
will discontinue support for the oldest version and add the latest
version to its list of compatible systems, along with the previous two
releases.
Linux 5.16.14-200.fc35            GNU gcc (GCC) 11.2.1 20220127 (Red Hat 11.2.1-9)
#1 SMP x86_64 GNU/Linux           GNU Fortran (GCC) 11.2.1 20220127 (Red Hat 11.2.1-9)
Fedora35                          clang version 13.0.0 (Fedora 13.0.0-3.fc35)
                                  (cmake and autotools)

Linux 5.19.0-1027-aws             GNU gcc (GCC) 11.3.0-1ubuntu1
#36-Ubuntu SMP x86_64 GNU/Linux   GNU Fortran (GCC) 11.3.0-1ubuntu1
Ubuntu 22.04                      Intel oneAPI DPC++/C++ Compiler, IFX 2023.1.0
                                  Ubuntu clang version 14.0.0-1ubuntu1
                                  (cmake and autotools)

Linux 5.15.0-1037-aws             GNU gcc (GCC) 9.4.0-1ubuntu1
#36-Ubuntu SMP x86_64 GNU/Linux   GNU Fortran (GCC) 9.4.0-1ubuntu1
Ubuntu 20.04                      Intel oneAPI DPC++/C++ Compiler, IFX 2023.1.0
                                  Ubuntu clang version 10.0.0-4ubuntu1
                                  (cmake and autotools)

Linux 5.14.21-cray_shasta_c       cray-mpich/8.1.25
#1 SMP x86_64 GNU/Linux           cce 15.0.1
(perlmutter)                      GCC 12.2.0
                                  intel-oneapi/2023.1.0
                                  nvidia/22.7
                                  (cmake)

Linux 5.14.21-cray_shasta_c       cray-mpich/8.1.23
#1 SMP x86_64 GNU/Linux           cce 15.0.1
(crusher)                         GCC 12.2.0
                                  (cmake)

Linux-4.14.0-115.21.2             spectrum-mpi/rolling-release
#1 SMP ppc64le GNU/Linux          clang 12.0.1, 14.0.5
(lassen)                          GCC 8.3.1
                                  XL 16.1.1.2, 2021.09.22, 2022.08.05
                                  (cmake)

Linux-4.12.14-197.99-default      cray-mpich/7.7.14
#1 SMP x86_64 GNU/Linux           cce 12.0.3
(theta)                           GCC 11.2.0
                                  llvm 9.0
                                  Intel 19.1.2

Linux 3.10.0-1160.36.2.el7.ppc64  gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
#1 SMP ppc64be GNU/Linux          g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
Power8 (echidna)                  GNU Fortran (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
                                  IBM XL C for Linux, V13.1
                                  IBM XL Fortran for Linux, V15.1

Linux 3.10.0-1160.24.1.el7        GNU C (gcc), Fortran (gfortran), C++ (g++)
#1 SMP x86_64 GNU/Linux           compilers:
Centos7                              Version 4.8.5 20150623 (Red Hat 4.8.5-4)
(jelly/kituo/moohan)                 Version 4.9.3, Version 5.3.0, Version 6.3.0,
                                     Version 7.2.0, Version 8.3.0, Version 9.1.0
                                     Version 10.2.0
                                  Intel(R) C (icc), C++ (icpc), Fortran (ifort)
                                  compilers:
                                     Version 17.0.0.098 Build 20160721
                                  GNU C (gcc) and C++ (g++) 4.8.5 compilers
                                     with NAG Fortran Compiler Release 6.1(Tozai)
                                  Intel(R) C (icc) and C++ (icpc) 17.0.0.098 compilers
                                     with NAG Fortran Compiler Release 6.1(Tozai)
                                  MPICH 3.3 compiled with GCC 7.2.0
                                  MPICH 4.0.3 compiled with GCC 7.2.0
                                  OpenMPI 3.1.3 compiled with GCC 7.2.0
                                  OpenMPI 4.1.2 compiled with GCC 9.1.0
                                  PGI C, Fortran, C++ for 64-bit target on
                                  x86_64;
                                     Version 19.10-0
                                  NVIDIA C, Fortran, C++ for 64-bit target on
                                  x86_64;
                                     Version 22.5-0
                                  (autotools and cmake)

Linux-3.10.0-1160.0.0.1chaos      openmpi-4.1.2
#1 SMP x86_64 GNU/Linux           clang 6.0.0, 11.0.1
(quartz)                          GCC 7.3.0, 8.1.0
                                  Intel 19.0.4, 2022.2, oneapi.2022.2

macOS Apple M1 11.6               Apple clang version 12.0.5 (clang-1205.0.22.11)
Darwin 20.6.0 arm64               gfortran GNU Fortran (Homebrew GCC 11.2.0) 11.1.0
(macmini-m1)                      Intel icc/icpc/ifort version 2021.3.0 20210609

macOS Big Sur 11.3.1              Apple clang version 12.0.5 (clang-1205.0.22.9)
Darwin 20.4.0 x86_64              gfortran GNU Fortran (Homebrew GCC 10.2.0_3) 10.2.0
(bigsur-1)                        Intel icc/icpc/ifort version 2021.2.0 20210228

macOS High Sierra 10.13.6         Apple LLVM version 10.0.0 (clang-1000.10.44.4)
64-bit                            gfortran GNU Fortran (GCC) 6.3.0
(bear)                            Intel icc/icpc/ifort version 19.0.4.233 20190416

Mac OS X El Capitan 10.11.6       Apple clang version 7.3.0 from Xcode 7.3
64-bit                            gfortran GNU Fortran (GCC) 5.2.0
(osx1011test)                     Intel icc/icpc/ifort version 16.0.2

Linux 2.6.32-573.22.1.el6         GNU C (gcc), Fortran (gfortran), C++ (g++)
#1 SMP x86_64 GNU/Linux           compilers:
Centos6                              Version 4.4.7 20120313
(platypus)                           Version 4.9.3, 5.3.0, 6.2.0
                                  MPICH 3.1.4 compiled with GCC 4.9.3
                                  PGI C, Fortran, C++ for 64-bit target on
                                  x86_64;
                                     Version 19.10-0

Windows 10 x64                    Visual Studio 2019 w/ clang 12.0.0
                                     with MSVC-like command-line (C/C++ only - cmake)
                                  Visual Studio 2019 w/ Intel (C/C++ only - cmake)
                                  Visual Studio 2022 w/ clang 15.0.1
                                     with MSVC-like command-line (C/C++ only - cmake)
                                  Visual Studio 2022 w/ Intel C/C++/Fortran oneAPI 2023 (cmake)
                                  Visual Studio 2019 w/ MSMPI 10.1 (C only - cmake)
Known Problems
==============
When building with the NAG Fortran compiler using the Autotools and libtool
2.4.2 or earlier, the -shared flag will be missing '-Wl,', which will cause
compilation to fail. This is due to a bug in libtool that was fixed in 2012
and released in 2.4.4 in 2014.
When HDF5 is compiled with NVHPC versions 23.5 - 23.9 (additional versions may
also be applicable) and with -O2 (or higher) and -DNDEBUG, test failures occur
in the following tests:
H5PLUGIN-filter_plugin
H5TEST-flush2
H5TEST-testhdf5-base
MPI_TEST_t_filters_parallel
Since these tests pass with an optimization level of -O1 (and -O0) and it is
currently unclear whether the test failures are due to issues in HDF5 or issues
in the 'nvc' compiler, the maximum optimization level for NVHPC has been set
to -O1 until the test failures can be resolved. Note that even at -O1 optimization
level, there still appears to be a sporadic test failure in the Java JUnit tests
that has occasionally been seen in JUnit-TestH5Pfapl and JUnit-TestH5D. It is also
unclear whether this is an issue in HDF5 or with the 'nvc' compiler. Finally, note
that NVHPC 23.9 will fail to compile the test/tselect.c test file with a compiler
error of 'use of undefined value' when the optimization level is -O2 or higher.
Nvidia is aware of this issue and has suggested lowering the optimization level to
-O1 for the time being:
https://forums.developer.nvidia.com/t/hdf5-no-longer-compiles-with-nv-23-9/269045.
CMake files do not behave correctly with paths containing spaces.
Do not use spaces in paths because the required escaping for handling spaces
results in very complex and fragile build files.
At present, metadata cache images may not be generated by parallel
applications. Parallel applications can read files with metadata cache
images, but since this is a collective operation, a deadlock is possible
if one or more processes do not participate.
The subsetting option in ph5diff currently will fail and should be avoided.
The subsetting option works correctly in serial h5diff.
Flang Fortran compilation will fail (last checked with version 17) because the
following are not yet implemented: (1) derived type arguments passed by value
(H5VLff.F90), and (2) support for REAL with KIND = 2 in the intrinsic SPACING,
which is used in testing.
Fortran tests HDF5_1_8.F90 and HDF5_F03.F90 will fail with Cray compilers greater than
version 16.0 due to a compiler bug. The latest version verified as failing was version 17.0.
Several tests currently fail on certain platforms:
MPI_TEST-t_bigio fails with spectrum-mpi on ppc64le platforms.
MPI_TEST-t_subfiling_vfd and MPI_TEST_EXAMPLES-ph5_subfiling fail with
cray-mpich on theta and with XL compilers on ppc64le platforms.
MPI_TEST_testphdf5_tldsc fails with cray-mpich 7.7 on cori and theta.
Known problems in previous releases can be found in the HISTORY*.txt files
in the HDF5 source. Please report any new problems found to
help@hdfgroup.org.
File space may not be released when overwriting or deleting certain nested
variable length or reference types.
CMake vs. Autotools installations
=================================
While both build systems produce similar results, there are differences.
Each system produces the same set of folders on Linux (only CMake works
on standard Windows): bin, include, lib and share. Autotools places the
COPYING and RELEASE.txt files in the root folder, while CMake places them in
the share folder.
The bin folder contains the tools and the build scripts. Additionally, CMake
creates dynamic versions of the tools with the suffix "-shared". Autotools
installs one set of tools depending on the "--enable-shared" configuration
option.
build scripts
-------------
Autotools: h5c++, h5cc, h5fc
CMake: h5c++, h5cc, h5hlc++, h5hlcc
The include folder holds the header files and the Fortran mod files. CMake
places the Fortran mod files into separate shared and static subfolders,
while Autotools places one set of mod files into the include folder. Because
CMake produces a tools library, the header files for tools will appear in
the include folder.
The lib folder contains the library files, and CMake adds the pkgconfig
subfolder with the hdf5*.pc files used by the bin/build scripts created by
the CMake build. CMake separates the C interface code from the Fortran code by
creating C-stub libraries for each Fortran library. In addition, only CMake
installs the tools library. The names of the szip libraries are different
between the build systems.
The share folder will have the most differences because CMake builds include
a number of CMake specific files for support of CMake's find_package and support
for the HDF5 Examples CMake project.
The issues with the gif tool are:
HDFFV-10592 CVE-2018-17433
HDFFV-10593 CVE-2018-17436
HDFFV-11048 CVE-2020-10809
These CVE issues have not yet been addressed and are avoided by not building
the gif tool by default. Enable building the High-Level GIF tools with these options:
autotools: --enable-hlgiftools
cmake: HDF5_BUILD_HL_GIF_TOOLS=ON