Commit Graph

6 Commits

Author SHA1 Message Date
Dennis Heimbigner
11fe00ea05 Add filter support to NCZarr
Filter support has three goals:

1. Use the existing HDF5 filter implementations,
2. Allow filter metadata to be stored in the NumCodecs metadata format used by Zarr,
3. Allow filters to be used even when HDF5 is disabled

Detailed usage directions are define in docs/filters.md.

For now, the existing filter API is left in place. So filters
are defined using ''nc_def_var_filter'' using the HDF5 style
where the id and parameters are unsigned integers.

This is a big change since filters affect many parts of the code.

In the following, the terms "compressor" and "filter" and "codec" are generally
used synonomously.

### Filter-Related Changes:
* In order to support dynamic loading of shared filter libraries, a new library was added in the libncpoco directory; it helps to isolate dynamic loading across multiple platforms.
* Provide a json parsing library for use by plugins; this is created by merging libdispatch/ncjson.c with include/ncjson.h.
* Add a new _Codecs attribute to allow clients to see what codecs are being used; let ncdump -s print it out.
* Provide special headers to help support compilation of HDF5 filters when HDF5 is not enabled: netcdf_filter_hdf5_build.h and netcdf_filter_build.h.
* Add a number of new test to test the new nczarr filters.
* Let ncgen parse _Codecs attribute, although it is ignored.

### Plugin directory changes:
* Add support for the Blosc compressor; this is essential because it is the most common compressor used in Zarr datasets. This also necessitated adding a CMake FindBlosc.cmake file
* Add NCZarr support for the big-four filters provided by HDF5: shuffle, fletcher32, deflate (zlib), and szip
* Add a Codec defaulter (see docs/filters.md) for the big four filters.
* Make plugins work with windows by properly adding __declspec declaration.

### Misc. Non-Filter Changes
* Replace most uses of USE_NETCDF4 (deprecated) with USE_HDF5.
* Improve support for caching
* More fixes for path conversion code
* Fix misc. memory leaks
* Add new utility -- ncdump/ncpathcvt -- that does more or less the same thing as cygpath.
* Add a number of new test to test the non-filter fixes.
* Update the parsers
* Convert most instances of '#ifdef _MSC_VER' to '#ifdef _WIN32'
2021-09-02 17:04:26 -06:00
Dennis Heimbigner
aeb3ac2809 Mostly revert the filter code to reduce its complexity of use.
re: https://github.com/Unidata/netcdf-c/issues/1836

Revert the internal filter code to simplify it. From the user's
point of view, the only visible changes should be:

1. The functions that convert text to filter specs have had their signature reverted and have been moved to netcdf_aux.h
2. Some filter API functions now return NC_ENOFILTER when inquiry is made about some filter.

Internally,the dispatch table has been modified to get rid of the filter_actions
entry and associated complex structures. It has been replaced with
inq_var_filter_ids and inq_var_filter_info entries and the dispatch table
version has been bumped to 3. Corresponding NOOP and NOTNC4 functions
were added to libdispatch/dnotnc4.c. Also, the filter_action entries
in dispatch tables were replaced for all dispatch code bases (HDF5, DAP2,
etc). This should only impact UDF users.

In the process, it became clear that the form of the filters
field in NC_VAR_INFO_T was format dependent, so I converted it to
be of type void* and pushed its management into the various dispatch
code bases. Specifically libhdf5 and libnczarr now manage the filters
field in their own way.

The auxilliary functions for parsing textual filter specifications
were moved to netcdf_aux.h and were renamed to the following:
* ncaux_h5filterspec_parse
* ncaux_h5filterspec_parselist
* ncaux_h5filterspec_free
* ncaux_h5filter_fix8

Misc. Other Changes:

1. Document NUG/filters.md updated to reflect the changes above.
2. All the old data types (structs and enums)
   used by filter_actions actions were deleted.
   The exception is the NC_H5_Filterspec because it is needed
   by ncaux_h5filterspec_parselist.
3. Clientside filters were removed -- another enhancement
   for which no-one ever asked.
4. The ability to remove filters was itself removed.
5. Some functionality needed by nczarr was moved from libhdf5
   to libsrc4 e.g. nc4_find_default_chunksizes
6. All the filterx code was removed
7. ncfilter.h and nc4filter.c no longer used

Misc. Unrelated Changes:

1. The nczarr_test makefile clean was leaving some directories; so
   add clean-local to take care of them.
2020-09-27 12:43:46 -06:00
Dennis Heimbigner
6f86660da8 Fix missing forward declarations
re: issue https://github.com/Unidata/netcdf-c/issues/1687

static functions are being used before decl and it causes
errors. Only occurs when BIG_ENDIAN is defined.
Solution is to add the forward declarations.
2020-04-03 20:15:34 -06:00
Dennis Heimbigner
4c1ed0144b Fix nc_test4/tst_filter.sh for big endian
re: issue https://github.com/Unidata/netcdf-c/issues/1338

Changes:

1. nc_test4/tst_filter.sh + nc_test4/ref_filteredvv.cdl --
   properly suppress _Endianness attribute
2. fix some warnings
2019-02-24 22:20:01 -07:00
Ward Fisher
e69432704a Corrected cmake error on OSX in plugins compilation. 2019-02-15 16:07:22 -07:00
Dennis Heimbigner
8714066b18 Fix errors when building on big-endian machine
re: issue https://github.com/Unidata/netcdf-c/issues/1278
re: issue https://github.com/Unidata/netcdf-c/issues/876
re: issue https://github.com/Unidata/netcdf-c/issues/806

* Major change to the handling of 8-byte parameters for nc_def_var_filter.
  The old code was not well thought out.
  * The new algorithm is documented in docs/filters.md.
  * Added new utility file plugins/H5Zutil.c to support
  * Modified plugins/H5Zmisc.c to use new algorithm
  the new algorithm.
  * Renamed include/ncfilter.h to include/netcdf_filter.h
    and made it an installed header so clients can access the
    new algorithm utility.
  * Fixed nc_test4/tst_filterparser.c and nc_test4/test_filter_misc.c
    to use the new algorithm
* libdap4/ fixes:
  * d4swap.c has an error in the endian pre-processing such
    that record counts were not being swapped correctly.
  * d4data.c had an error in that checksums were being computed
    after endian swapping rather than before.
* ocinitialize() was never being called, so xxdr bigendian handling
  was never set correctly.
  * Required adding debug statements to occompile
* Found and fixed memory leak in ncdump.c

Not tested:
* HDF4
* Pnetcdf
* parallel HDF5
2019-01-31 21:13:06 -07:00