Commit Graph

17 Commits

Author SHA1 Message Date
Dave Allured
9f461848b5
Format compatibility when re-opening files
This commit selects the best HDF5 format compatibility options when re-opening an existing netCDF-4 file for writing, such as when appending data or adding new groups or variables.

The general objective is to make netCDF-4 files that can be read and written by all previous library versions. Optimal HDF5 v1.8 compatibility is selected whenever possible; otherwise this falls back to adequate v1.6 compatibility.

Format compatibility is a transient property of the HDF5 library, rather than baked in at file creation time.  Therefore, compatibility options must be re-selected every time a netCDF-4 file is re-opened for writing.

This builds on the previous update for initial file creation, PR #1931, by @brtnfld, released in netcdf-c version 4.8.1.

In particular, this commit moves compatibility controls into a single central location, a new common function that is shared by both create and open functions.

For more details, see issue #951 and the documentation at the top of libhdf5/hdf5set_format_compatibility.c.
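
As an illustration only (not the library's actual code), a minimal sketch of the kind of HDF5 call involved, assuming an HDF5 release new enough to provide the H5F_LIBVER_V18 bound:

```c
/* Minimal sketch: re-select format compatibility on a file access
 * property list each time an existing netCDF-4 file is reopened for
 * writing. The real logic lives in libhdf5/hdf5set_format_compatibility.c. */
#include <hdf5.h>

hid_t reopen_with_compat(const char *path)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    if (fapl < 0)
        return -1;

    /* Prefer v1.8 object formats so that older readers keep working;
     * the H5F_LIBVER_V18 bound assumes HDF5 1.10 or later. */
    H5Pset_libver_bounds(fapl, H5F_LIBVER_EARLIEST, H5F_LIBVER_V18);

    hid_t fid = H5Fopen(path, H5F_ACC_RDWR, fapl);
    H5Pclose(fapl);
    return fid;    /* negative id on failure */
}
```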

This commit also makes several corrections and cleanups to previous comments about the use of related property lists.
2022-01-07 18:34:52 -07:00
Dan Ibanez
69182dc438 Ensure MPI header found without wrapper 2021-01-19 09:38:07 -07:00
Dennis Heimbigner
d2316f866c Additional Fixes to NCZarr
Primary Fixes:
* Add a whole-variable optimization -- used in the rare case that nc_get/put_vara covers the whole of a variable and the variable has a single chunk (see the sketch after this list).
* Fix chunking error when stride causes whole chunks to be skipped.
* Fix some memory leaks
* Add test cases
* Add one performance test to nczarr_test/. This uses the timer utils from unit_test: timer_utils.[ch].
* Move ncdumpchunks utility from ncdump to nczarr_test
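
The test for that rare case can be pictured with the sketch below; the struct and field names are illustrative only, not the library's internal types:

```c
#include <stddef.h>

/* Illustrative stand-in for internal variable metadata (hypothetical names). */
typedef struct {
    int ndims;
    const size_t *dimlens;   /* length of each dimension */
    size_t nchunks;          /* total number of storage chunks */
} var_info;

/* Whole-variable case: the variable is stored as a single chunk and the
 * request starts at the origin and spans every dimension completely. */
static int is_whole_var_request(const var_info *v,
                                const size_t *start, const size_t *count)
{
    if (v->nchunks != 1)
        return 0;
    for (int i = 0; i < v->ndims; i++)
        if (start[i] != 0 || count[i] != v->dimlens[i])
            return 0;
    return 1;
}
```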

Misc. Other Changes:
* Make check for aws libraries conditional on --enable-nczarr-s3
* Remove all but one bm test from nczarr_test until they are working.
* Remove another dependency on HDF5 from supposedly non-HDF5 specific code; specifically hdf5_log_hdf5.
* Make the BAIL2 macro HDF5-specific and replace it elsewhere with an HDF5-independent equivalent.
* Move hdf5cache.c to libsrc4/nc4cache.c because it is used by nczarr.
* Modify unit_tests so that some of them are run even if using Windows.
* Misc. small bug fixes, refactors, and memory-leak fixes.
* Rename some conflicting tests for cmake.
* Attempted to make nc_perf work with cmake and failed.
2020-12-16 20:48:02 -07:00
Dennis Heimbigner
44d0dcaad2 Add support for multiple filters per variable.
re: https://github.com/Unidata/netcdf-c/issues/1584

Support has been added for multiple filters per variable.  This
affects a number of components in netcdf. The new APIs are
documented in NUG/filters.md.

The primary changes are:
* A set of new functions is provided (see __include/netcdf_filter.h__ and the sketch after this list).
    - Obtain a list of the filters associated with a variable
    - Obtain the parameters for a specific filter.
* The existing __nc_inq_var_filter__ function now returns info
  about the first defined filter.
* The utilities (ncgen, ncdump, and nccopy) now support
  an extended format for specifying a sequence of filters.
  The general form is __<filter>|<filter>...__.
* The ncdump **_Filter** attribute now dumps a list of all the
  filters associated with a variable using the above new format.
* Filter specifications can now use a filter name instead of number
  for filters known to the netcdf library, which in turn is taken
  from the HDF5 filter registration page.
* New errors are defined: NC_EFILTER and NC_ENOFILTER. The latter
  is returned if an attempt is made to access an unknown filter.
* Internally, the dispatch table has been extended to add a function
  to handle all of the filter functions.
* New, filter-related, tests were added to nc_test4.
* A new plugin was added to the plugins directory to help with testing.
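
A rough usage sketch of the new inquiry functions (signatures taken from a recent __include/netcdf_filter.h__ and may have differed slightly at the time of this commit; error checking omitted):

```c
#include <stdio.h>
#include <stdlib.h>
#include <netcdf.h>
#include <netcdf_filter.h>

/* Print every filter id attached to a variable and its parameter count. */
static void show_filters(int ncid, int varid)
{
    size_t nfilters = 0;
    nc_inq_var_filter_ids(ncid, varid, &nfilters, NULL);   /* count only */
    unsigned int *ids = malloc(nfilters * sizeof *ids);
    nc_inq_var_filter_ids(ncid, varid, &nfilters, ids);    /* fill the id list */

    for (size_t i = 0; i < nfilters; i++) {
        size_t nparams = 0;
        nc_inq_var_filter_info(ncid, varid, ids[i], &nparams, NULL);
        printf("filter %u: %zu parameter(s)\n", ids[i], nparams);
    }
    free(ids);
}
```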

Notes:
1. The shuffle and fletcher32 filters are not part of the multifilter system.

Misc. changes:
1. A debug module was added to libhdf5 to help catch error locations.
2020-02-16 12:59:33 -07:00
Dennis Heimbigner
0c59e13bf7 Master merge, conflict resolution, cleanup 2019-02-24 16:54:13 -07:00
Dennis Heimbigner
bf2746b8ea Provide byte-range reading of remote datasets
re: issue https://github.com/Unidata/netcdf-c/issues/1251

Assume that you have the URL to a remote dataset
which is a normal netcdf-3 or netcdf-4 file.

This PR allows the netcdf-c library to read that dataset's
contents as a netcdf file using HTTP byte ranges
if the remote server supports byte-range access.

Originally, this PR was set up to access Amazon S3 objects,
but it can also access other remote datasets such as those
provided by a Thredds server via the HTTPServer access protocol.
It may also work for other kinds of servers.
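
A minimal sketch of such an open, using a placeholder URL on a server that honors HTTP Range requests; the #mode=bytes fragment selects the byte-range access path:

```c
#include <stdio.h>
#include <netcdf.h>

int main(void)
{
    int ncid, format, stat;
    /* Placeholder URL; any HTTP(S) server that supports Range requests works. */
    const char *url = "https://example.com/data/sample.nc#mode=bytes";

    if ((stat = nc_open(url, NC_NOWRITE, &ncid))) {
        fprintf(stderr, "nc_open: %s\n", nc_strerror(stat));
        return 1;
    }
    nc_inq_format(ncid, &format);    /* which on-disk format was detected */
    printf("format code: %d\n", format);
    nc_close(ncid);
    return 0;
}
```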

Note that this is not intended as a true production
capability because, as is well known, this kind of access
can be quite slow. In addition, the byte-range IO drivers
do not currently do any sort of optimization or caching.

An additional goal here is to gain some experience with
the Amazon S3 REST protocol.

This architecture and its use are documented in
the file docs/byterange.dox.

There are currently two test cases:

1. nc_test/tst_s3raw.c - this does a simple open, check format, close cycle
   for a remote netcdf-3 file and a remote netcdf-4 file.
2. nc_test/test_s3raw.sh - this uses ncdump to investigate some remote
   datasets.

This PR also incorporates significantly changed model inference code
(see the superseded PR https://github.com/Unidata/netcdf-c/pull/1259).

1. It centralizes the code that infers the dispatcher.
2. It adds support for byte-range URLs.

Other changes:

1. NC_HDF5_finalize was not being properly called by nc_finalize().
2. Fix a minor bug in ncgen3.l.
3. Fix a memory leak in nc4info.c.
4. Add code to walk the .daprc triples and to replace the protocol=
   fragment tag with a more general mode= tag.

Final Note:
The inference code is still way too complicated. We need to move
to the validfile() model used by netcdf Java, where each
dispatcher is asked if it can process the file. This decentralizes
the inference code. This will be done after all the major new
dispatchers (PIO, Zarr, etc) have been implemented.
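
Purely as an illustration of that decentralized model (none of these names exist in netcdf-c), each dispatcher could expose a probe that the open path queries in turn:

```c
#include <stddef.h>

/* Hypothetical dispatcher descriptor with a "can you read this?" probe. */
typedef struct dispatcher {
    const char *name;
    int (*validfile)(const char *path);   /* nonzero if this dispatcher applies */
} dispatcher;

/* Ask each registered dispatcher in turn instead of centralizing inference. */
static const dispatcher *infer_dispatcher(const dispatcher **table, size_t n,
                                          const char *path)
{
    for (size_t i = 0; i < n; i++)
        if (table[i]->validfile(path))
            return table[i];
    return NULL;   /* no dispatcher recognized the file */
}
```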
2019-01-01 18:27:36 -07:00
Ed Hartnett
8bb644204e added hdf5dispatch.c to cmake build 2018-11-26 06:00:38 -07:00
Ed Hartnett
d9ef143d1e separated cache code from hdf5file.c 2018-09-14 13:33:22 -06:00
Ed Hartnett
9b0192fe94 moved memfile code to libhdf5 2018-07-19 07:26:27 -06:00
Ed Hartnett
3c0abc3d28 moved hdf5 var code to hdf5var.c 2018-07-19 07:23:03 -06:00
Ed Hartnett
bdca4313c4 split nc4var.c 2018-07-19 07:05:55 -06:00
Ed Hartnett
b0f9f965b7 clean up, moved hdf5open and hdf5create code to their own code files 2018-07-17 08:29:47 -06:00
Ed Hartnett
9354535ae4 moved code to hdf5create.c 2018-07-17 07:55:27 -06:00
Ed Hartnett
f2cb4678ee moving HDF5 functions to libhdf5 2018-05-24 14:27:16 -06:00
Ed Hartnett
2e181fbb88 fixed cmake build 2018-05-15 07:12:56 -06:00
Ed Hartnett
3e320a5bfb moved more HDF5 functions to libhdf5 2018-05-15 06:47:52 -06:00
Ed Hartnett
a68f57a0e5 added missing files 2018-05-08 12:20:55 -06:00