cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".
The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.
More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).
WARNING: this code has had limited testing, so do use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:
Platform | Build System | S3 support
------------------------------------
Linux+gcc | Automake | yes
Linux+gcc | CMake | yes
Visual Studio | CMake | no
Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future. Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.
In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*. The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
and the version bumped.
4. An overly complex set of structs was created to support funnelling
all of the filterx operations thru a single dispatch
"filter_actions" entry.
5. Move common code to from libhdf5 to libsrc4 so that it is accessible
to nczarr.
Changes directly related to Zarr:
1. Modified CMakeList.txt and configure.ac to support both C and C++
-- this is in support of S3 support via the awd-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
support zarr and to regularize the structure of the fragments
section of a URL.
Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
* Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.
Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
re: https://github.com/Unidata/netcdf-c/issues/1584
Support has been added for multiple filters per variable. This
affects a number of components in netcdf. The new APIs are
documented in NUG/filters.md.
The primary changes are:
* A set of new functions are provided (see __include/netcdf_filter.h__).
- Obtain a list of the filters associated with a variable
- Obtain the parameters for a specific filter.
* The existing __nc_inq_var_filter__ function now returns info
about the first defined filter.
* The utilities (ncgen, ncdump, and nccopy) now support
an extended format for specifying a sequence of filters.
The general form is __<filter>|<filter>..._.
* The ncdump **_Filter** attribute now dumps a list of all the
filters associated with a variable using the above new format.
* Filter specifications can now use a filter name instead of number
for filters known to the netcdf library, which in turn is taken
from the HDF5 filter registration page.
* New errors are defined: NC_EFILTER and NC_ENOFILTER. The latter
is returned if an attempt is made to access an unknown filter.
* Internally, the dispatch table has been extended to add a function
to handle all of the filter functions.
* New, filter-related, tests were added to nc_test4.
* A new plugin was added to the plugins directory to help with testing.
Notes:
1. The shuffle and fletcher32 filters are not part of the multifilter system.
Misc. changes:
1. A debug module was added to libhdf5 to help catch error locations.
This is a follow up to PR https://github.com/Unidata/netcdf-c/pull/1173
Sorry that it is so big, but leak suppression can be complex.
This PR fixes all remaining memory leaks -- as determined by
-fsanitize=address, and with the exceptions noted below.
Unfortunately. there remains a significant leak that I cannot
solve. It involves vlens, and it is unclear if the leak is
occurring in the netcdf-c library or the HDF5 library.
I have added a check_PROGRAM to the ncdump directory to show the
problem. The program is called tst_vlen_demo.c To exercise it,
build the netcdf library with -fsanitize=address enabled. Then
go into ncdump and do a "make clean check". This should build
tst_vlen_demo without actually executing it. Then do the
command "./tst_vlen_demo" to see the output of the memory
checker. Note the the lost malloc is deep in the HDF5 library
(in H5Tvlen.c).
I am temporarily working around this error in the following way.
1. I modified several test scripts to not execute known vlen tests
that fail as described above.
2. Added an environment variable called NC_VLEN_NOTEST.
If set, then those specific tests are suppressed.
This should mean that the --disable-utilities option to
./configure should not need to be set to get a memory leak clean
build. This should allow for detection of any new leaks.
Note: I used an environment variable rather than a ./configure
option to control the vlen tests. This is because it is
temporary (I hope) and because it is a bit tricky for shell
scripts to access ./configure options.
Finally, as before, this only been tested with netcdf-4 and hdf5 support.
2. Factored out the parameter string parsing for ncgen and nccopy
int libdispatch/dfilter.c + include/ncfilter.h
3. Allow a parameter string to use constant types other than
unsigned int. See docs/filters.md for details.
4. Moved the old content of include/netcdf_filter.h into include/netcdf.h
and removed include/netcdf_filter.h as no longer needed.
5. Force the test filter (bzip2) in nc_test4/filter_test to
be built using BUILT_SOURCES.
This relies on the HDF5 capability to
dynamically load compression filters.
Note that a compression filter is just
a subcase of filters.
The primary user-visible changes are as follows:
1. Add a standard header "netcdf_filter.h" that defines
the necessary API extensions
2. Modify ncgen to support two new special attributes
"_Filter_ID" and "_Filter_Parameters" so that compression
can be turned on when creating a file using ncgen.
4. Add a detailed description of filtering support
to the user's guide; see the file filters.md
5. Add a test case directory for this: nc_test4/filter_test.
It is fragile and a ./configure flags (-enable-filter-test)
is defined (default disabled) to shut this off this test
to avoid spurious 'make check' failures.
Note that the HDF5 documentation is not up-to-date, so
much of what is encoded here comes from examining the
actual code in the file H5PL.c in the HDF5 source code.
This consists of a persistent attribute named
_NCProperties plus two computed attributes
_IsNetcdf4 and _SuperblockVersion.
See the 'Provenance Attributes' section
of docs/attribute_conventions.md for details.
- Fix NCF-157 to modify DAP code to support
partial variable retrieval.
- Fix of NCF-154 to solve problem of ncgen
improperly processing data lists for variables
of size greater than 2**18 bytes.
- Fix ncgen processing of char variables that have
multiple unlimited dimensions.
- Partly fix Jira issue: NCF-145 (vlen issues).
- Benchmark program nc_test4/tst_ar4_*) requires arguments
and should only be invoked inside a shell
script; fixed so that they terminate cleanly
if invoked with no arguments.
- Fix the Doxygen processing so it will work
with make distcheck.
- Begin switchover to using an alternative to ncio.
- Begin support for in-memory (diskless) files.
ncgen: nan bug
made semicolons optional after type decls
libncdap{3,4}: revamped the NC surrogate to better match
with libsrc
libdispatch: Added a new_nc function to the dispatch table; its purpose
is to allow hierarchical use of NC compatible data structures.
libsrc: cleaned up the NC structure by removing drno field
general: removed --with-oc options