Commit Graph

9442 Commits

Author SHA1 Message Date
Dennis Heimbigner
df3636b959 Mitigate S3 test interference + Unlimited Dimensions in NCZarr
This PR started as an attempt to add unlimited dimensions to NCZarr.
It did that, but this exposed significant problems with test interference.
So this PR is mostly about fixing -- well mitigating anyway -- test
interference.

The problem of test interference is now documented in the document docs/internal.md.
The solutions implemented here are also described in that document.
The solution is somewhat fragile but multiple cleanup mechanisms
are provided. Note that this feature requires the
AWS command line utility to be installed.

## Unlimited Dimensions.
The existing NCZarr extensions to Zarr are modified to support unlimited dimensions.
NCZarr extends the Zarr meta-data for the ".zgroup" object to include netcdf-4 model extensions. This information is stored in ".zgroup" as a dictionary named "_nczarr_group".
Inside "_nczarr_group", there is a key named "dims" that stores information about netcdf-4 named dimensions. The value of "dims" is a dictionary whose keys are the named dimensions. The value associated with each dimension name has one of two forms.
Form 1 is a special case of form 2, and is kept for backward compatibility. Whenever a new file is written, it uses form 1 if possible, otherwise form 2.
* Form 1: An integer representing the size of the dimension, which is used for simple named dimensions.
* Form 2: A dictionary with the following keys and values:
   - "size" with an integer value representing the (current) size of the dimension.
   - "unlimited" with a value of either "1" or "0" to indicate if this dimension is an unlimited dimension.

For Unlimited dimensions, the size is initially zero, and as variables extend the length of that dimension, the size value for the dimension increases.
That dimension size is shared by all arrays referencing that dimension, so if one array extends an unlimited dimension, it is implicitly extended for all other arrays that reference that dimension.
This is the standard semantics for unlimited dimensions.
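
The two forms and the shared-size behavior can be sketched in C as follows. This is a minimal illustration, not the NCZarr implementation: `DimSketch`, `dim_to_json`, and `dim_extend` are hypothetical names, and writing "unlimited" as a bare integer is an assumption.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical model of one named dimension; not the actual NCZarr type. */
typedef struct {
    size_t size;    /* current size; starts at 0 for an unlimited dimension */
    int unlimited;  /* 1 if unlimited, 0 otherwise */
} DimSketch;

/* Render the value stored under "dims" for this dimension.
 * Fixed-size dimensions fit form 1 (a bare integer); unlimited ones need form 2.
 * Assumption: "unlimited" is emitted as a bare integer. */
int dim_to_json(const DimSketch *d, char *buf, size_t len)
{
    if (!d->unlimited)
        return snprintf(buf, len, "%zu", d->size);
    return snprintf(buf, len, "{\"size\": %zu, \"unlimited\": 1}", d->size);
}

/* Record that some array now spans `needed` elements along this dimension.
 * The size only grows, and because the struct is shared, every array that
 * references the dimension sees the new size. Fixed dimensions are untouched. */
void dim_extend(DimSketch *d, size_t needed)
{
    if (d->unlimited && needed > d->size)
        d->size = needed;
}
```

Two arrays that hold a pointer to the same `DimSketch` observe the same size after either one calls `dim_extend`, which mirrors the sharing described above.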

Adding unlimited dimensions required a number of other changes to the NCZarr code-base. These included the following.
* Did a partial refactor of the slice handling code in zwalk.c to clean it up.
* Added a number of tests for unlimited dimensions derived from the corresponding tests in nc_test4.
* Added several NCZarr specific unlimited tests; more are needed.
* Add test of endianness.

## Misc. Other Changes
* Modify libdispatch/ncs3sdk_aws.cpp to optionally support use of the
   AWS Transfer Utility mechanism. This is controlled by the
   `#define TRANSFER` directive in that file. It is disabled by default.
* Parameterize both the standard Unidata S3 bucket (S3TESTBUCKET) and the netcdf-c test data prefix (S3TESTSUBTREE).
* Fixed an obscure memory leak in ncdump.
* Removed some obsolete unit testing code and test cases.
* Uncovered a bug in the netcdf-c handling of big-endian floats and doubles. It has not been fixed yet. See tst_h5_endians.c.
* Renamed some nczarr_tests testcases to avoid name conflicts with nc_test4.
* Modify the semantics of zmap\#ncsmap_write to only allow total rewrite of objects.
* Modify the semantics of zodom to properly handle stride > 1.
* Add a truncate operation to the libnczarr zmap code.
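
As background for the odometer item above, here is a generic sketch of a multi-dimensional odometer that steps by a stride greater than 1. It is not the zodom code: `OdomSketch` and the `odom_*` functions are hypothetical, `stop` is assumed to be exclusive, and every dimension is assumed to be non-empty (start < stop).

```c
#include <stddef.h>

#define ODOM_MAXRANK 8  /* hypothetical limit for this sketch */

typedef struct {
    size_t rank;
    size_t start[ODOM_MAXRANK];
    size_t stop[ODOM_MAXRANK];    /* exclusive upper bound */
    size_t stride[ODOM_MAXRANK];  /* must be >= 1 */
    size_t index[ODOM_MAXRANK];   /* current position */
} OdomSketch;

/* Move the odometer back to its first position. */
void odom_reset(OdomSketch *o)
{
    for (size_t i = 0; i < o->rank; i++)
        o->index[i] = o->start[i];
}

/* True while the odometer points at a valid position. */
int odom_more(const OdomSketch *o)
{
    return o->rank > 0 && o->index[0] < o->stop[0];
}

/* Advance the last (fastest-varying) dimension by its stride and carry
 * leftward whenever a dimension steps past its stop value. */
void odom_next(OdomSketch *o)
{
    for (size_t i = o->rank; i-- > 0;) {
        o->index[i] += o->stride[i];
        if (o->index[i] < o->stop[i])
            return;
        if (i == 0)
            return;  /* index[0] >= stop[0] now signals the end */
        o->index[i] = o->start[i];
    }
}

/* Count the positions visited in one full pass. */
size_t odom_count(OdomSketch *o)
{
    size_t n = 0;
    for (odom_reset(o); odom_more(o); odom_next(o))
        n++;
    return n;
}
```

With start {0, 1}, stop {4, 6}, and stride {2, 2}, the pass visits indices 0 and 2 in the first dimension and 1, 3, and 5 in the second.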
2023-09-26 16:56:48 -06:00
Ward Fisher
3c789c6899
Merge pull request #2749 from WardF/fix_nc-config.wif
Fix --has-quantize in autotools-generated nc-config.
2023-09-06 10:46:38 -06:00
Ward Fisher
6d426ce006 Fix --has-quantize in autotools-generated nc-config. 2023-09-05 18:42:47 -06:00
Ward Fisher
ef94285ac1
Merge pull request #2737 from DennisHeimbigner/cachesizes2.dmh
Fix major bug in the NCZarr cache management
2023-08-17 14:23:57 -06:00
Dennis Heimbigner
c5b5a8a17e Update release notes 2023-08-16 23:09:26 -06:00
Dennis Heimbigner
9094d25409 Fix major bug in the NCZarr cache management
re: PR https://github.com/Unidata/netcdf-c/pull/2734
re: Issue https://github.com/Unidata/netcdf-c/issues/2733

As a result of an investigation by https://github.com/uweschulzweida,
I discovered a significant bug in the NCZarr cache management.
This PR extends the above PR to fix that bug.

## Change Overview
* Insert extra checks for cache overflow.
* Added test cases contingent on the --enable-large-file-tests option.
* The Columbia server is down, so it has been temporarily disabled.
2023-08-16 23:07:05 -06:00
Ward Fisher
032b910edf
Merge pull request #2726 from DennisHeimbigner/shifterr.dmh
Fix a number of minor bugs
2023-08-11 13:32:55 -06:00
Ward Fisher
ccfe62de72 Fix a missing 'fi' 2023-08-11 11:29:30 -06:00
Ward Fisher
939245ca4a
Merge branch 'main' into shifterr.dmh 2023-08-11 11:02:55 -06:00
Ward Fisher
f72e3cbdfe
Merge pull request #2734 from DennisHeimbigner/cachesizes.dmh
Cleanup the handling of cache parameters.
2023-08-11 10:26:46 -06:00
Dennis Heimbigner
ad1e16a7ae Update release notes 2023-08-10 17:00:22 -06:00
Dennis Heimbigner
f1a3a64b65 Cleanup the handling of cache parameters.
re: https://github.com/Unidata/netcdf-c/issues/2733

When addressing the above issue, I noticed that there was a disconnect
in NCZarr between nc_set_chunk_cache and nc_set_var_chunk_cache.
Specifically, calling nc_set_chunk_cache had no impact on the per-variable cache parameters when nc_set_var_chunk_cache was not used.

So, modified the NCZarr code so that the per-variable cache parameters are set in this order (#1 is first choice):
1. The values set by nc_set_var_chunk_cache
2. The values set by nc_set_chunk_cache
3. The defaults set by configure.ac
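
The precedence order above can be sketched as a small C function. This is an illustration only, not the NCZarr code: `CacheParamsSketch` and `resolve_cache` are hypothetical names, and a NULL pointer is used here to mean that the corresponding call was never made.

```c
#include <stddef.h>

/* Hypothetical bundle of chunk cache parameters. */
typedef struct {
    size_t size;       /* cache size in bytes */
    size_t nelems;     /* number of cache slots */
    float preemption;  /* preemption policy value */
} CacheParamsSketch;

/* Pick the effective per-variable parameters.
 * var_set:    values from nc_set_var_chunk_cache, or NULL if never called
 * global_set: values from nc_set_chunk_cache, or NULL if never called
 * defaults:   the build-time defaults */
CacheParamsSketch resolve_cache(const CacheParamsSketch *var_set,
                                const CacheParamsSketch *global_set,
                                CacheParamsSketch defaults)
{
    if (var_set != NULL)
        return *var_set;      /* 1. per-variable setting wins */
    if (global_set != NULL)
        return *global_set;   /* 2. then the global setting */
    return defaults;          /* 3. otherwise the defaults */
}
```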
2023-08-10 16:57:57 -06:00
Ward Fisher
c798bb6405
Merge pull request #2730 from DennisHeimbigner/filtervlen2.dmh
Explicitly suppress variable length type compression
2023-08-10 16:50:21 -06:00
Ward Fisher
c374536240
Merge pull request #2732 from DennisHeimbigner/corrupt.dmh
Fix a crash when accessing a corrupted classic file.
2023-08-09 15:48:31 -06:00
Dennis Heimbigner
bb8aa25348 ~S3 fix 2023-08-08 16:49:33 -06:00
Dennis Heimbigner
5cee82fd66 reversed conditional 2023-08-08 15:14:04 -06:00
Dennis Heimbigner
15c31b9eb6 Fix a crash when accessing a corrupted classic file.
re: Issue https://github.com/Unidata/netcdf-c/issues/2731

A corrupted classic file is causing the library to crash.
Fix so that it returns NC_ENOTNC instead.
2023-08-08 12:51:42 -06:00
Dennis Heimbigner
db772ce34c Explicitly suppress variable length type compression
re: PR https://github.com/Unidata/netcdf-c/pull/2716
re: Issue https://github.com/Unidata/netcdf-c/issues/2189

The basic change is to make use of the fact that HDF5 automatically suppresses optional filters when an attempt is made to apply them to variable-length typed arrays.
This means that e.g. ncdump or nccopy will properly see meaningful data.
Note that if a filter is defined as HDF5 mandatory, then the corresponding variable will be suppressed and will be invisible to ncdump and nccopy.
This functionality is also propagated to NCZarr.

This PR makes some minor changes to PR https://github.com/Unidata/netcdf-c/pull/2716 as follows:
* Move the test for filter X variable-length from dfilter.c down into the dispatch table functions.
* Make all filters for HDF5 optional rather than mandatory so that the built-in HDF5 test for filter X variable-length will be invoked.

The test case for this was expanded to verify that the filters are defined, but suppressed.
2023-08-03 15:47:28 -06:00
Ward Fisher
2cd9d2674a
Merge pull request #2646 from ZhipengXue97/next
Fix potential null dereference
2023-08-01 14:00:30 -06:00
Ward Fisher
dc3c45e5b0
Merge branch 'main' into next 2023-07-31 17:33:56 -06:00
Ward Fisher
aee19e263f
Merge pull request #2684 from Dave-Allured/release-notes.minor-fixes
Release notes:  Minor.  Add historical tag, and spell fix.
2023-07-31 17:26:23 -06:00
Ward Fisher
6e1895fee8
Merge pull request #2722 from WardF/gh2712-parity.wif
Enable/Disable some plugins at configure time
2023-07-26 16:54:40 -06:00
Ward Fisher
4c71b59b52 Update Release Notes 2023-07-26 14:33:48 -06:00
Ward Fisher
18b04b4c9e Merge branch 'main' into gh2712-parity.wif 2023-07-25 15:23:46 -06:00
Ward Fisher
dac29bb15b
Merge pull request #2725 from DennisHeimbigner/plistfix.dmh
Fix memory leak
2023-07-25 15:22:56 -06:00
Ward Fisher
8a45d26c78
Merge branch 'main' into plistfix.dmh 2023-07-25 15:22:38 -06:00
Dennis Heimbigner
c4ecdd6403 Fix a number of minor bugs
1. Fix a shift bug in ncexhash.c (Issue https://github.com/Unidata/netcdf-c/issues/2702)
2. Fix an S3 related error in test_byterange.sh
3. Fix bz2/bzip2 handling in configure.ac
2023-07-24 16:20:26 -06:00
Ward Fisher
db2519cf89
Merge pull request #2724 from DennisHeimbigner/transientname.dmh
Modify PR 2655 to ensure transient types have names.
2023-07-24 11:01:14 -06:00
Ward Fisher
11e589d394 Merge branch 'gh2712-parity.wif' of github.com:WardF/netcdf-c into gh2712-parity.wif 2023-07-24 09:32:42 -06:00
Ward Fisher
b65bba0b79 Additional cmake-based logic. 2023-07-24 09:32:39 -06:00
Dennis Heimbigner
65f866fff6 Fix memory leak
re: Issue https://github.com/Unidata/netcdf-c/issues/2723

H/T to Roland Ambs for finding a memory leak where an allocated
HDF5 plist is not being reclaimed.
2023-07-23 17:25:30 -06:00
Dennis Heimbigner
a446ebfc29 Update release notes 2023-07-23 12:44:00 -06:00
Dennis Heimbigner
a37ca49d25 Modify PR https://github.com/Unidata/netcdf-c/pull/2655 to ensure transient types have names.
re: PR https://github.com/Unidata/netcdf-c/pull/2655

This PR modifies the transient types PR so that every created
transient type is given a generated name that is unique within its
group. The form of the name is "_Anonymous<Class>NN". The class
is the user-defined type class: Enum, Compound, Opaque, or
Vlen. NN is an integer identifier to ensure uniqueness.
Additionally, this was applied to DAP/4 anonymous dimensions.
This also required some test baseline data changes.

The transient test case is modified to verify that the name exists.
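
A name of that form could be built along these lines. This is a sketch, not the netcdf-c code: `ClassSketch` and `anon_type_name` are hypothetical, and rendering NN as a plain decimal number with no padding is an assumption.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical tags for the four user-defined type classes. */
typedef enum { CLS_ENUM, CLS_COMPOUND, CLS_OPAQUE, CLS_VLEN } ClassSketch;

/* Write "_Anonymous<Class>NN" into buf and return the name length,
 * as reported by snprintf. The caller supplies an id that is unique
 * within the enclosing group. */
int anon_type_name(ClassSketch cls, unsigned id, char *buf, size_t len)
{
    static const char *const class_names[] = {
        "Enum", "Compound", "Opaque", "Vlen"
    };
    return snprintf(buf, len, "_Anonymous%s%u", class_names[cls], id);
}
```

For example, `anon_type_name(CLS_VLEN, 12, buf, sizeof buf)` fills `buf` with `_AnonymousVlen12`.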
2023-07-22 20:40:53 -06:00
Ward Fisher
c6b853a860 Logic to ensure libsz is searched for if Zarr is enabled but enable_filter_szip is false. 2023-07-21 14:46:45 -06:00
Ward Fisher
ae28dd36e6 Additional tweaking of search logic. 2023-07-21 14:32:25 -06:00
Ward Fisher
890251c611 String handling in CMakeLists.txt 2023-07-21 14:19:50 -06:00
Ward Fisher
9787de121c Small change in CMakeLists.txt 2023-07-21 14:02:30 -06:00
Ward Fisher
4a61f4771b Add autotools option to disable checking for libzstd. 2023-07-20 16:08:07 -06:00
Ward Fisher
dc7da87e7c Add option for blosc filter. 2023-07-20 15:59:53 -06:00
Ward Fisher
401bdd5541 Parity for enable_bz2. BZ2 cannot be disabled altogether, but can fall back to internal implementation. 2023-07-20 15:54:56 -06:00
Ward Fisher
4a092c7f5d Merge branch 'patch-57' of https://github.com/gsjaardema/netcdf-c into gh2712-parity.wif 2023-07-20 13:51:12 -06:00
Ward Fisher
dc2b0f7608
Merge pull request #2655 from ZedThree/hdf5-transient-types
Add support for HDF5 transient types
2023-07-18 16:49:33 -06:00
Ward Fisher
9137873b3d
Merge pull request #2707 from WardF/remove_fortran_bootstrap.wif
Remove fortran bootstrap option
2023-06-26 10:31:09 -06:00
Ward Fisher
6b430c92d9
Merge pull request #2716 from DennisHeimbigner/filtervlen.dmh
Suppress filters on variables with non-fixed-size types.
2023-06-26 10:28:44 -06:00
Dennis Heimbigner
fb422e696b Update docs/filters.md and RELEASENOTES.md 2023-06-23 13:42:16 -06:00
Ward Fisher
ddc67c708f
Merge pull request #2711 from DennisHeimbigner/xml2osx.dmh
Update tinyxml and allow its use under OS/X.
2023-06-22 10:24:36 -06:00
Dennis Heimbigner
8cab468169 Suppress filters on variables with non-fixed-size types.
re: Discussion https://github.com/Unidata/netcdf-c/discussions/2554
re: PR https://github.com/Unidata/netcdf-c/pull/2231
re: Issue https://github.com/Unidata/netcdf-c/issues/2189

After some discussion, the issue of applying filters to variables
whose type is not fixed-size was resolved as follows:
1. A call to nc_def_var_filter will ignore such filters, but will issue a log warning.
2. Loading (from an existing file) a variable whose type is not fixed-size and which has filters will cause the variable to be suppressed.

This PR enforces those rules.
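
The two rules can be summarized as a decision function. This is a sketch only: `FilterOutcomeSketch` and `filter_outcome` are hypothetical names and do not appear in netcdf-c.

```c
/* Hypothetical outcomes for this sketch; not netcdf-c identifiers. */
typedef enum {
    OUTCOME_NORMAL,       /* no filters, or fixed-size type: nothing special */
    OUTCOME_IGNORE_WARN,  /* rule 1: filter is ignored and a warning is logged */
    OUTCOME_SUPPRESS_VAR  /* rule 2: the variable is suppressed */
} FilterOutcomeSketch;

/* type_is_fixed_size: nonzero if the variable's type has a fixed size
 * has_filters:        nonzero if a filter is being defined or is present
 * loading_existing:   nonzero when reading the variable from an existing file,
 *                     zero when defining the filter via nc_def_var_filter */
FilterOutcomeSketch filter_outcome(int type_is_fixed_size,
                                   int has_filters,
                                   int loading_existing)
{
    if (type_is_fixed_size || !has_filters)
        return OUTCOME_NORMAL;
    if (loading_existing)
        return OUTCOME_SUPPRESS_VAR;
    return OUTCOME_IGNORE_WARN;
}
```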

### Misc. Other changes
* Add a test case to test the vlen change.
* Make some minor clean-ups in various cmake and automake files.
* Remove unused test
2023-06-21 14:46:22 -06:00
Dennis Heimbigner
9a5a6aa961 revert 2023-06-19 18:49:36 -06:00
Dennis Heimbigner
878dc465c3 mam2 2023-06-19 18:47:02 -06:00
Greg Sjaardema
05d5b3c130
Don't call find_package if not enabled
The `FIND_PACKAGE` should not be called if the filter/compression library is not enabled.  It was causing some inconsistencies in link libraries and CMake configure output...
2023-06-14 08:22:01 -06:00