re: Issue
The byterange handling of the following URLS fails.
### Problem 1: "https://crudata.uea.ac.uk/cru/data/temperature/HadCRUT.4.6.0.0.median.nc#mode=bytes"
It turns out that byterange in hdf5 has two possible targets: S3 and not-S3 (e.g. a thredds server or the crudata URL above). Each uses a different HDF5 Virtual File Driver (VFD).
I incorrectly set up the byterange code in libhdf5 so that it would choose one or the other of the two VFD's for any netcdf-c library build. The fix is to allow it to choose either one at run-time.
### Problem 2: "https://noaa-goes16.s3.amazonaws.com/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc#mode=bytes,s3"
When given what appears to be an S3-related URL, the netcdf-c library code converts it into a canonical, so-called "path" format. In casing out the possible input URL formats, I missed the case where the host contains the bucket ("noaa-goes16"), but not the region. So the fix was to check for this case.
## Misc. Related Changes
1. Since S3 is used in more than just NCZarr, I changed the automake/cmake options to replace "--enable-nczarr-s3" with "--enable-s3", but keeping the former option as a synonym for the latter. This also entailed cleaning up libnetcdf.settings WRT S3 support
2. Added the above URLS as additional test cases
## Misc. Un-Related Changes
1. CURLOPT_PUT is deprecated in favor to CURLOPT_UPLOAD
2. Fix some minor warnings
## Open Problems
* Under Ubuntu, either libcrypto or aws-sdk-cpp has a memory leak.
re: Issue https://github.com/Unidata/netcdf-c/issues/2551
Ryan May identified the use of a common scratch file (tmp.cdl)
across multiple test shell scripts in ncdump directory
and the nczarr_test directory.
This sometimes causes errors because of race conditions
between those scripts.
I renamed those common files to avoid the race condition. I
also did some further checking and found some additional,
similar conflicts and fixed those. Also did some minor cleanup
of unused files.
Tests fixed:
ncdump: run_back_comp_tests.sh tst_bom.sh tst_nccopy4.sh tst_nccopy5.sh
nczarr_test: git df master -- run_nccopyz.sh run_nczarr_fill.sh run_scalar.sh
re: https://github.com/Unidata/netcdf-c/issues/2521
Charlie Zender has discovered that the netcdf created file VERSION
conflicts with the C++ version file on OSX case-insensitive file systems,
and maybe other case-insensitvie file systems.
Note:
1. Cmake does not create the VERSION file
2. The VERSION file is not installed
3. It turns out that the VERSION file is not required by the autoconf build.
It is possible that clients or package build system (e.g apt or brew)
might use the VERSION file, so we cannot delete it altogether.
So as a fix, we move the creation of the VERSION file to after the
build is complete by inserting a all-local hook into netcdf-c/Makefile.am.
# Misc. other changes
1. Suppressed warning by making use of the systeminfo command contingent on the platform being Windows.
* re: https://github.com/Unidata/netcdf-c/pull/2278
* re: https://github.com/Unidata/netcdf-c/issues/2485
* re: https://github.com/Unidata/netcdf-c/issues/2474
This PR subsumes PR https://github.com/Unidata/netcdf-c/pull/2278.
Actually is a bit an omnibus covering several issues.
## PR https://github.com/Unidata/netcdf-c/pull/2278
Add support for the Zarr string type.
Zarr strings are restricted currently to be of fixed size.
The primary issue to be addressed is to provide a way for user to
specify the size of the fixed length strings. This is handled by providing
the following new attributes special:
1. **_nczarr_default_maxstrlen** —
This is an attribute of the root group. It specifies the default
maximum string length for string types. If not specified, then
it has the value of 64 characters.
2. **_nczarr_maxstrlen** —
This is a per-variable attribute. It specifies the maximum
string length for the string type associated with the variable.
If not specified, then it is assigned the value of
**_nczarr_default_maxstrlen**.
This PR also requires some hacking to handle the existing netcdf-c NC_CHAR
type, which does not exist in zarr. The goal was to choose numpy types for
both the netcdf-c NC_STRING type and the netcdf-c NC_CHAR type such that
if a pure zarr implementation read them, it would still work and an
NC_CHAR type would be handled by zarr as a string of length 1.
For writing variables and NCZarr attributes, the type mapping is as follows:
* "|S1" for NC_CHAR.
* ">S1" for NC_STRING && MAXSTRLEN==1
* ">Sn" for NC_STRING && MAXSTRLEN==n
Note that it is a bit of a hack to use endianness, but it should be ok since for
string/char, the endianness has no meaning.
For reading attributes with pure zarr (i.e. with no nczarr
atribute types defined), they will always be interpreted as of
type NC_CHAR.
## Issue: https://github.com/Unidata/netcdf-c/issues/2474
This PR partly fixes this issue because it provided more
comprehensive support for Zarr attributes that are JSON valued expressions.
This PR still does not address the problem in that issue where the
_ARRAY_DIMENSION attribute is incorrectly set. Than can only be
fixed by the creator of the datasets.
## Issue: https://github.com/Unidata/netcdf-c/issues/2485
This PR also fixes the scalar failure shown in this issue.
It generally cleans up scalar handling.
It also adds a note to the documentation describing that
NCZarr supports scalars while Zarr does not and also how
scalar interoperability is achieved.
## Misc. Other Changes
1. Convert the nczarr special attributes and keys to be all lower case. So "_NCZARR_ATTR" now used "_nczarr_attr. Support back compatibility for the upper case names.
2. Cleanup my too-clever-by-half handling of scalars in libnczarr.
re: Issue https://github.com/Unidata/netcdf-c/issues/2458
The above Github Issue revealed some bugs in the file netcdf-c/plugins/H5Zblosc.c. Fixed and added a testcase. Also discovered that the Blosc LZ sub-compressors do not work well with small datasets.
Misc. Other Change(s): I noticed that the file "dap4_test/baselinethredds/GOES16_CONUS_20170821_020218_0.47_1km_33.3N_91.4W.nc4.thredds" is still causing tar errors during "make distcheck", so I made some changes to do rename at test-time.
re: https://github.com/Unidata/netcdf-c/issues/2420
nc_test4/tst_vars3.c has the wrong conditional
around the szip tests (did I do that?).
Anyway, the current test is
> #ifdef HAVE_SZ
and it should be
> #ifdef HAVE_H5Z_SZIP
because the only thing that matters is that HDF5 lib has szip support.
re: https://github.com/Unidata/netcdf-c/issues/2337
re: https://github.com/Unidata/netcdf-c/issues/2407
Add two functions to netcdf.h to allow programs to get/set
selected entries into the internal .rc tables. This should fix
the above issues by allowing HTTP.CAINFO to be set to the
certificates directory. Note that the changes should be
performed as early as possible in the program because some of
the .rc table entries may get cached internally and changing the
entry after that caching occurs may have no effect.
The new signatures are as follows:
1. Get the value of a simple .rc entry of the form "key=value".
Note that caller must free the returned value, which might be NULL.
````
char* nc_rc_get(char* const * key);
@param key table entry key
@return value if .rc table has entry of the form key=value
@return NULL if no such entry is found.
````
2. Insert/Overwrite the specified key=value pair in the .rc table.
````
int nc_rc_set(const char* key, const char* value);
@param key table entry key -- may not be NULL
@param value table entry value -- may not be NULL
@return NC_NOERR if no error
@return NC_EINVAL if error
````
Addendum:
re: https://github.com/Unidata/netcdf-c/issues/2407
Modify dhttp.c to use the .rc entry HTTP.CAINFO if defined.
## Overwriting
I think I solved the file overwrite problem by doing light name
mangling of the shared library names. With this change the probabilty
is very small that installing our filter wrappers in a directory will
overwrite code produced by others.
## Default Install Location
I have setup the --with-plugin-dir option default to install in
the following locations in order of preference
1. If HDF5_PLUGIN_PATH is defined (at build time remember), then the last directory in that path will be where the filter wrapper shared libraries will be installed.
2. Otherwise the default is "/usr/local/hdf5/lib/plugin" (on *nix*) or "%ALLUSERSPROFILE%\\hdf5\\lib\\plugin" for Windows or Mingw.
Currently, --with-plugin-dir is disabled by default.
I should note that even if I enable it by default, installing
netcdf-c will still not run "out of the box" because the hypothetical
naive user will not know which compressor libraries need to be
pre-installed before netcdf is installed. Nor will that user have any
way to find out what needs to be installed.
re: https://github.com/Unidata/netcdf-c/issues/2338
re: https://github.com/Unidata/netcdf-c/issues/2294
In issue https://github.com/Unidata/netcdf-c/issues/2338,
Ed Hartnett suggested a better way to install filters to a user
defined location -- for Automake, anyway.
This PR implements that suggestion. It turns out to be more
complicated than it appears, so there are fair number of changes;
mostly to shell scripts. Most of the change is in plugins/Makefile.am.
NOTE: this PR still does NOT address the use of HDF5_PLUGIN_PATH
as the default; this turns out to be complex when dealing with NCZarr.
So this will be addressed in a subsequent post 4.9.0 PR.
## Misc. Changes
1. Record the occurrences of incomplete codecs in libnczarr so that
they can be included in _Codecs attribute correctly. This allows
users to see what missing filters are referenced in the Zarr file.
Primarily affects libnczarr/zfilter.[ch]. Also required creating a
new no-effect filter: H5Zunknown.c.
2. Move the unknown filter test to a separate test file.
3. Incorporates PR https://github.com/Unidata/netcdf-c/pull/2343