NOTE: it is important that this fix gets into 4.9.3
re: Issue https://github.com/Unidata/netcdf-c/issues/2798
## Modifications
* This PR includes PR https://github.com/Unidata/netcdf-c/pull/2813
* Support the following AWS environment variables in the internal S3 library
(they are already supported by aws-sdk-cpp).
- AWS_REGION
- AWS_DEFAULT_REGION
- AWS_ACCESS_KEY_ID
- AWS_CONFIG_FILE
- AWS_PROFILE
- AWS_SECRET_ACCESS_KEY
(Source: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html)
* Support an empty region when specifying s3.amazonaws.com as the host.
* Move some S3/AWS related functions to ds3util.c
* Add a test case to test empty region and AWS_[DEFAULT]_REGION.
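A minimal sketch of how a region might be resolved when the URL leaves it empty. The function name, the lookup order (explicit value, then AWS_REGION, then AWS_DEFAULT_REGION) and the `us-east-1` fallback are assumptions made for illustration; they are not taken from the internal S3 library.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fallback, used only by this sketch. */
#define DEMO_DEFAULT_REGION "us-east-1"

/* Return the first non-empty value among: the explicit region,
 * AWS_REGION, AWS_DEFAULT_REGION; otherwise the fallback.
 * The order is an assumption, not the library's documented behavior. */
const char *demo_resolve_region(const char *explicit_region)
{
    const char *names[] = { "AWS_REGION", "AWS_DEFAULT_REGION" };
    size_t i;

    if (explicit_region != NULL && explicit_region[0] != '\0')
        return explicit_region;
    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        const char *value = getenv(names[i]);
        if (value != NULL && value[0] != '\0')
            return value;
    }
    return DEMO_DEFAULT_REGION;
}
```

Treating an empty string the same as a missing value lets a URL with host s3.amazonaws.com and no region fall through to the environment.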
The most critical bug is in nch5s3comms.c.
For some reason I assumed that signing keys
never contain zero bytes. They obviously can,
so the test for that was removed.
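The following sketch shows why a binary key cannot be handled as a C string. The type and function names are hypothetical and do not come from nch5s3comms.c; the 32-byte buffer matches the length of an HMAC-SHA256 digest.

```c
#include <stddef.h>
#include <string.h>

/* A binary key carries an explicit length; it is not a C string. */
typedef struct {
    unsigned char bytes[32];
    size_t len;
} DemoKey;

/* Compare two keys over their full length, embedded zeros included. */
int demo_key_equal(const DemoKey *a, const DemoKey *b)
{
    return a->len == b->len && memcmp(a->bytes, b->bytes, a->len) == 0;
}

/* What string-based code would see: the length up to the first zero byte. */
size_t demo_key_strlen(const DemoKey *k)
{
    const unsigned char *zero = memchr(k->bytes, 0, k->len);
    return zero != NULL ? (size_t)(zero - k->bytes) : k->len;
}
```

Any check or copy based on the string length silently truncates a key that contains a zero byte, so the length has to travel with the key.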
Other fixes:
1. Guarantee allocated memory is initialized to all zeros.
2. Cleanup errmsg handling in libncpoco.
3. Fix processing of aws list-objects-v2 because I misread the syntax.
Use memcpy to copy correctly even for unaligned memory. This was already done for some functions here, but not all.
Also took the opportunity to remove a bunch of seemingly obsolete/commented code.
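As an illustration of the memcpy technique, here is a minimal sketch with hypothetical helper names; it is not the code changed in this commit. Copying through memcpy avoids dereferencing a misaligned pointer, which is undefined behavior in C and traps on some architectures.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value in host byte order from any byte offset. */
uint32_t demo_load_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* safe even when p is not 4-byte aligned */
    return v;
}

/* Write a 32-bit value in host byte order to any byte offset. */
void demo_store_u32(unsigned char *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

Compilers typically turn these fixed-size copies into a single load or store where the target permits it, so the safe form usually costs nothing.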
This commit fixes an issue with unescaped paths while building the
manpages for the netcdf.3 target. The unescaped paths break the build
when the build path contains spaces.
This PR started as an attempt to add unlimited dimensions to NCZarr.
It did that, but this exposed significant problems with test interference.
So this PR is mostly about fixing -- well mitigating anyway -- test
interference.
The problem of test interference is now documented in the document docs/internal.md.
The solutions implemented here are also described in that document.
The solution is somewhat fragile but multiple cleanup mechanisms
are provided. Note that this feature requires that the
AWS command line utility be installed.
## Unlimited Dimensions
The existing NCZarr extensions to Zarr are modified to support unlimited dimensions.
NCZarr extends the Zarr meta-data for the ".zgroup" object to include netcdf-4 model extensions. This information is stored in ".zgroup" as a dictionary named "_nczarr_group".
Inside "_nczarr_group", there is a key named "dims" that stores information about netcdf-4 named dimensions. The value of "dims" is a dictionary whose keys are the named dimensions. The value associated with each dimension name has one of the two forms listed below.
Form 1 is a special case of form 2, and is kept for backward compatibility. Whenever a new file is written, it uses form 1 if possible, otherwise form 2.
* Form 1: An integer representing the size of the dimension, which is used for simple named dimensions.
* Form 2: A dictionary with the following keys and values:
- "size" with an integer value representing the (current) size of the dimension.
- "unlimited" with a value of either "1" or "0" to indicate if this dimension is an unlimited dimension.
For unlimited dimensions, the size is initially zero, and as variables extend the length of that dimension, the size value for the dimension increases.
That dimension size is shared by all arrays referencing that dimension, so if one array extends an unlimited dimension, it is implicitly extended for all other arrays that reference that dimension.
This is the standard semantics for unlimited dimensions.
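The shared-size semantics can be sketched as follows. The struct and function are hypothetical and only model the behavior described above; they are not the NCZarr data structures.

```c
#include <stddef.h>

/* Mirrors the "size" and "unlimited" keys of form 2. */
typedef struct {
    size_t size;     /* current size; starts at 0 for an unlimited dimension */
    int unlimited;   /* 1 if unlimited, 0 otherwise */
} DemoDim;

/* Grow a dimension so that index `last` becomes valid.
 * Returns 0 on success, -1 if a fixed-size dimension would be exceeded. */
int demo_dim_extend(DemoDim *dim, size_t last)
{
    if (last < dim->size)
        return 0;            /* already large enough */
    if (!dim->unlimited)
        return -1;           /* fixed dimensions never grow */
    dim->size = last + 1;
    return 0;
}
```

Because every array holds a reference to the same dimension object rather than its own copy of the size, a write through one array is immediately visible to all the others.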
Adding unlimited dimensions required a number of other changes to the NCZarr code-base. These included the following.
* Did a partial refactor of the slice handling code in zwalk.c to clean it up.
* Added a number of tests for unlimited dimensions derived from the same test in nc_test4.
* Added several NCZarr specific unlimited tests; more are needed.
* Add test of endianness.
## Misc. Other Changes
* Modify libdispatch/ncs3sdk_aws.cpp to optionally support use of the
AWS Transfer Utility mechanism. This is controlled by the
`#define TRANSFER` command in that file. It defaults to being disabled.
* Parameterize both the standard Unidata S3 bucket (S3TESTBUCKET) and the netcdf-c test data prefix (S3TESTSUBTREE).
* Fixed an obscure memory leak in ncdump.
* Removed some obsolete unit testing code and test cases.
* Uncovered a bug in the netcdf-c handling of big-endian floats and doubles. Have not fixed yet. See tst_h5_endians.c.
* Renamed some nczarr_tests testcases to avoid name conflicts with nc_test4.
* Modify the semantics of zmap\#ncsmap_write to only allow total rewrite of objects.
* Modify the semantics of zodom to properly handle stride > 1.
* Add a truncate operation to the libnczarr zmap code.
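For readers unfamiliar with odometers, the sketch below walks a multi-dimensional index range with a stride greater than 1. It is a generic illustration with hypothetical names and a fixed maximum rank, not the zodom implementation, and it assumes start[i] < stop[i] and stride[i] >= 1 in every dimension.

```c
#include <stddef.h>

#define DEMO_MAX_RANK 8

typedef struct {
    size_t rank;
    size_t start[DEMO_MAX_RANK];
    size_t stop[DEMO_MAX_RANK];    /* exclusive upper bound */
    size_t stride[DEMO_MAX_RANK];  /* must be >= 1 */
    size_t index[DEMO_MAX_RANK];   /* current position */
} DemoOdom;

/* Position the odometer on the first index tuple. */
void demo_odom_reset(DemoOdom *o)
{
    size_t i;
    for (i = 0; i < o->rank; i++)
        o->index[i] = o->start[i];
}

/* True while the slowest-varying dimension is still in range. */
int demo_odom_more(const DemoOdom *o)
{
    return o->rank > 0 && o->index[0] < o->stop[0];
}

/* Advance the fastest-varying dimension by its stride, carrying leftward. */
void demo_odom_next(DemoOdom *o)
{
    size_t i = o->rank;
    while (i-- > 0) {
        o->index[i] += o->stride[i];
        if (o->index[i] < o->stop[i])
            return;                    /* no carry needed */
        if (i == 0)
            return;                    /* leave index[0] out of range: done */
        o->index[i] = o->start[i];     /* wrap and carry into dimension i-1 */
    }
}
```

With start {0,1}, stop {4,6} and stride {2,3}, the loop `for (demo_odom_reset(&o); demo_odom_more(&o); demo_odom_next(&o))` visits (0,1), (0,4), (2,1) and (2,4).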
## Improvements to S3 Documentation
* Create a new document *quickstart_paths.md* that gives a summary of the legal path formats used by netcdf-c. This includes both file paths and URL paths.
* Modify *nczarr.md* to remove most of the S3 related text.
* Move the S3 text from *nczarr.md* to a new document *cloud.md*.
* Add some S3-related text to the *byterange.md* document.
Hopefully, this will make it easier for users to find the information they want.
## Rebuild NCZarr Testing
In order to avoid problems with running make check in parallel, two changes were made:
1. The *nczarr_test* test system was rebuilt. Now, for each test,
any generated files are kept in a test-specific directory, isolated
from all other test executions.
2. Similarly, since the S3 test bucket is shared, any generated S3 objects
are isolated using a test-specific key path.
## Other S3 Related Changes
* Add code to ensure that files created on S3 are reclaimed at end of testing.
* Used the bash "trap" command to ensure S3 cleanup even if the test fails.
* Clean up the S3-related configure.ac flags, since S3 is used in several places. One should now use the option *--enable-s3* instead of *--enable-nczarr-s3*; the latter is kept as a deprecated alias for the former.
* Get some of the GitHub Actions yml files to work with S3; this required fixing various test scripts and adding a secret to access the Unidata S3 bucket.
* Cleanup S3 portion of libnetcdf.settings.in and netcdf_meta.h.in and test_common.in.
* Merge partial S3 support into dhttp.c.
* Create an experimental s3 access library especially for use with Windows. It is enabled by using the options *--enable-s3-internal* (automake) or *-DENABLE_S3_INTERNAL=ON* (CMake). Also add a unit-test for it.
* Move some definitions from ncrc.h to ncs3sdk.h
## Other Changes
* Provide a default implementation of strlcpy and move this and similar defaults into *dmissing.c*.
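For reference, a fallback with the usual strlcpy semantics can be sketched as below. The name `demo_strlcpy` is hypothetical, chosen to avoid clashing with a system strlcpy, and this is not the code in *dmissing.c*.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst, writing at most size-1 characters plus a terminator.
 * Returns strlen(src), so a result >= size signals truncation. */
size_t demo_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);

    if (size > 0) {
        size_t n = (srclen < size - 1) ? srclen : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';   /* always terminated when size > 0 */
    }
    return srclen;
}
```

Unlike strncpy, the result is always NUL-terminated when the buffer is non-empty, and the return value lets the caller detect truncation.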
When "getopt()" is not available, several of the netcdf-c utilities
use XGetopt instead. This occurs primarily when building under Windows,
so the build changes are restricted to CMake.
This PR tries to isolate XGetopt.c to the libdispatch directory
and then builds the various utilities using this idiom:
````
IF(USE_X_GETOPT)
SET(XGETOPTSRC "${CMAKE_CURRENT_SOURCE_DIR}/../libdispatch/XGetopt.c")
ENDIF()
````
This avoids the need to copy XGetopt.c to all the directories that
use it.
re: Discussion https://github.com/Unidata/netcdf-c/discussions/2214
The primary change is to support so-called "standard filters".
A standard filter is one that is defined by the following
netcdf-c API:
````
int nc_def_var_XXX(int ncid, int varid, size_t nparams, unsigned* params);
int nc_inq_var_XXX(int ncid, int varid, int* usefilterp, unsigned* params);
````
So for example, zstandard would be a standard filter by defining
the functions *nc_def_var_zstandard* and *nc_inq_var_zstandard*.
In order to define these functions, we need a new dispatch function:
````
int nc_inq_filter_avail(int ncid, unsigned filterid);
````
This function, combined with the existing filter API can be used
to implement arbitrary standard filters using a simple code pattern.
Note that I would have preferred that this function return a list
of all available filters, but HDF5 does not support that functionality.
So this PR implements the dispatch function and implements
the following standard functions:
+ bzip2
+ zstandard
+ blosc
Specific test cases are also provided for HDF5 and NCZarr.
Over time, other specific standard filters will be defined.
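The "simple code pattern" mentioned above might look like the following sketch. Everything prefixed with `demo_` or `DEMO_` is a stand-in written so the example compiles by itself: the two helper functions imitate *nc_inq_filter_avail* and the existing filter-definition call, and the error code is invented. The wrapper follows the generic nc_def_var_XXX signature shown earlier, and 32015 is the HDF5-registered id for Zstandard.

```c
#include <stddef.h>

#define DEMO_NOERR        0
#define DEMO_ENOFILTER  (-1)       /* stand-in error code, not a netcdf-c value */
#define DEMO_FILTER_ZSTD 32015u    /* HDF5-registered Zstandard filter id */

/* Stand-in state: whether the zstd plugin is "installed", and what was defined. */
int demo_zstd_installed = 1;
unsigned demo_last_filter_id = 0;

/* Stand-in for nc_inq_filter_avail(): report whether a filter can be used. */
int demo_inq_filter_avail(int ncid, unsigned id)
{
    (void)ncid;
    return (id == DEMO_FILTER_ZSTD && demo_zstd_installed)
               ? DEMO_NOERR : DEMO_ENOFILTER;
}

/* Stand-in for the existing generic filter-definition call. */
int demo_def_var_filter(int ncid, int varid, unsigned id,
                        size_t nparams, const unsigned *params)
{
    (void)ncid; (void)varid; (void)nparams; (void)params;
    demo_last_filter_id = id;
    return DEMO_NOERR;
}

/* The pattern: check availability first, then delegate to the generic API. */
int demo_def_var_zstandard(int ncid, int varid, size_t nparams, unsigned *params)
{
    int stat = demo_inq_filter_avail(ncid, DEMO_FILTER_ZSTD);
    if (stat != DEMO_NOERR)
        return stat;               /* filter missing: fail before defining */
    return demo_def_var_filter(ncid, varid, DEMO_FILTER_ZSTD, nparams, params);
}
```

Checking availability before defining the filter lets the caller get a clear error at definition time instead of a failure later when data is written.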
## Primary Changes
* Add nc_inq_filter_avail() to netcdf-c API.
* Add standard filter implementations to test use of *nc_inq_filter_avail*.
* Bump the dispatch table version number and add to all the relevant
dispatch tables (libsrc, libsrcp, etc).
* Create a program to invoke nc_inq_filter_avail so that it is accessible
to shell scripts.
* Cleanup szip support to properly support szip
when HDF5 is disabled. This involves detecting
libsz separately from testing if HDF5 supports szip.
* Integrate shuffle and fletcher32 into the existing
filter API. This means that, for example, nc_def_var_fletcher32
is now a wrapper around nc_def_var_filter.
* Extend the Codec defaulting to allow multiple default shared libraries.
## Misc. Changes
* Modify configure.ac/CMakeLists.txt to look for the relevant
libraries implementing standard filters.
* Modify libnetcdf.settings to list available standard filters
(including deflate and szip).
* Add CMake test modules to locate libbz2 and libzstd.
* Cleanup the HDF5 memory manager function use in the plugins.
* Remove unused file include/ncfilter.h
* Remove tests for the HDF5 memory operations e.g. H5allocate_memory.
* Add flag to ncdump to force use of _Filter instead of _Deflate
or _Shuffle or _Fletcher32. Used for testing.
re: Issue https://github.com/Unidata/netcdf-c/issues/2190
The primary purpose of this PR is to improve the utf8 support
for Windows. This is pursuant to a change in Windows that
supports utf8 natively (almost). The "almost" means that it is
still utf16 internally and the set of characters representable
by utf8 is larger than those representable by utf16.
This leaves open the question in the Issue about handling
the Windows 1252 character set.
This required the following changes:
1. Test the Windows build and major version in order to see if
native utf8 is supported.
2. If native utf8 is supported, modify dpathmgr.c to call the 8-bit
version of the Windows fopen() and open() functions.
3. In support of this, programs that use XGetOpt (Windows versions)
need to get the command line as utf8 and then parse to
argc+argv as utf8. This requires using a homegrown command line parser
named XCommandLineToArgvA.
4. Add a utility program called "acpget" that prints out the
current Windows code page and locale.
Additionally, some technical debt was cleaned up as follows:
1. Unify all the places which attempt to read all or a part
of a file into the dutil.c#NC_readfile code.
2. Similarly unify all the code that creates temp files into
dutil.c#NC_mktmp code.
3. Convert almost all remaining calls to fopen() and open()
to NCfopen() and NCopen3(). This is to ensure that path management
is used consistently. This touches a number of files.
4. extern->EXTERNL as needed to get it to work under Windows.
re: PR https://github.com/Unidata/netcdf-c/pull/2088
re: PR https://github.com/Unidata/netcdf-c/pull/2130
replaces: https://github.com/Unidata/netcdf-c/pull/2140
Changes:
* Add NCZarr-specific quantize functions to the dispatch table.
* Copy (modified) quantize code from libhdf5 to NCZarr
* Add quantize invocation to zvar.c
* Add support for _QuantizeBitgroomNumberOfSignificantDigits
and _QuantizeGranularBitgroomNumberOfSignificantDigits to ncgen.
* Modify nc_test4/tst_quantize.c to allow it to be used both for hdf5
and for nczarr.
* Make dap4 properly handle quantize functions in dispatch table.
* Add quantize attribute support to ncgen.
Other changes:
* Caught and fixed some S3 problems
* Fixed some nczarr fillvalue problems.
* Fixed some nczarr cache problems.
* Cleanup some flaws in libdispatch/dinfermodel.c
* Allow byterange requests to S3 to be readable by dinfermodel.c/check_file_type
* Remove the libnczarr ztracedispatch code (big change).