Commit Graph

46 Commits

Author SHA1 Message Date
Edward Hartnett
d1cbd60960 fixed missing szip constants in netcdf.h 2022-05-04 09:48:46 -06:00
Dennis Heimbigner
3ffe7be446 Enhance/Fix filter support
re: Discussion https://github.com/Unidata/netcdf-c/discussions/2214

The primary change is to support so-called "standard filters".
A standard filter is one that is defined by the following
netcdf-c API:
````
int nc_def_var_XXX(int ncid, int varid, size_t nparams, unsigned* params);
int nc_inq_var_XXXX(int ncid, int varid, int* usefilterp, unsigned* params);
````
So for example, zstandard would be a standard filter by defining
the functions *nc_def_var_zstandard* and *nc_inq_var_zstandard*.

In order to define these functions, we need a new dispatch function:
````
int nc_inq_filter_avail(int ncid, unsigned filterid);
````
This function, combined with the existing filter API can be used
to implement arbitrary standard filters using a simple code pattern.
Note that I would have preferred that this function return a list
of all available filters, but HDF5 does not support that functionality.

So this PR implements the dispatch function and implements
the following standard functions:
    + bzip2
    + zstandard
    + blosc
Specific test cases are also provided for HDF5 and NCZarr.
Over time, other specific standard filters will be defined.

## Primary Changes
* Add nc_inq_filter_avail() to netcdf-c API.
* Add standard filter implementations to test use of *nc_inq_filter_avail*.
* Bump the dispatch table version number and add to all the relevant
   dispatch tables (libsrc, libsrcp, etc).
* Create a program to invoke nc_inq_filter_avail so that it is accessible
  to shell scripts.
* Cleanup szip support to properly support szip
  when HDF5 is disabled. This involves detecting
  libsz separately from testing if HDF5 supports szip.
* Integrate shuffle and fletcher32 into the existing
  filter API. This means that, for example, nc_def_var_fletcher32
  is now a wrapper around nc_def_var_filter.
* Extend the Codec defaulting to allow multiple default shared libraries.

## Misc. Changes
* Modify configure.ac/CMakeLists.txt to look for the relevant
  libraries implementing standard filters.
* Modify libnetcdf.settings to list available standard filters
  (including deflate and szip).
* Add CMake test modules to locate libbz2 and libzstd.
* Cleanup the HDF5 memory manager function use in the plugins.
* remove unused file include//ncfilter.h
* remove tests for the HDF5 memory operations e.g. H5allocate_memory.
* Add flag to ncdump to force use of _Filter instead of _Deflate
  or _Shuffle or _Fletcher32. Used for testing.
2022-03-14 12:39:37 -06:00
Dennis Heimbigner
11fe00ea05 Add filter support to NCZarr
Filter support has three goals:

1. Use the existing HDF5 filter implementations,
2. Allow filter metadata to be stored in the NumCodecs metadata format used by Zarr,
3. Allow filters to be used even when HDF5 is disabled

Detailed usage directions are define in docs/filters.md.

For now, the existing filter API is left in place. So filters
are defined using ''nc_def_var_filter'' using the HDF5 style
where the id and parameters are unsigned integers.

This is a big change since filters affect many parts of the code.

In the following, the terms "compressor" and "filter" and "codec" are generally
used synonomously.

### Filter-Related Changes:
* In order to support dynamic loading of shared filter libraries, a new library was added in the libncpoco directory; it helps to isolate dynamic loading across multiple platforms.
* Provide a json parsing library for use by plugins; this is created by merging libdispatch/ncjson.c with include/ncjson.h.
* Add a new _Codecs attribute to allow clients to see what codecs are being used; let ncdump -s print it out.
* Provide special headers to help support compilation of HDF5 filters when HDF5 is not enabled: netcdf_filter_hdf5_build.h and netcdf_filter_build.h.
* Add a number of new test to test the new nczarr filters.
* Let ncgen parse _Codecs attribute, although it is ignored.

### Plugin directory changes:
* Add support for the Blosc compressor; this is essential because it is the most common compressor used in Zarr datasets. This also necessitated adding a CMake FindBlosc.cmake file
* Add NCZarr support for the big-four filters provided by HDF5: shuffle, fletcher32, deflate (zlib), and szip
* Add a Codec defaulter (see docs/filters.md) for the big four filters.
* Make plugins work with windows by properly adding __declspec declaration.

### Misc. Non-Filter Changes
* Replace most uses of USE_NETCDF4 (deprecated) with USE_HDF5.
* Improve support for caching
* More fixes for path conversion code
* Fix misc. memory leaks
* Add new utility -- ncdump/ncpathcvt -- that does more or less the same thing as cygpath.
* Add a number of new test to test the non-filter fixes.
* Update the parsers
* Convert most instances of '#ifdef _MSC_VER' to '#ifdef _WIN32'
2021-09-02 17:04:26 -06:00
Dennis Heimbigner
aeb3ac2809 Mostly revert the filter code to reduce its complexity of use.
re: https://github.com/Unidata/netcdf-c/issues/1836

Revert the internal filter code to simplify it. From the user's
point of view, the only visible changes should be:

1. The functions that convert text to filter specs have had their signature reverted and have been moved to netcdf_aux.h
2. Some filter API functions now return NC_ENOFILTER when inquiry is made about some filter.

Internally,the dispatch table has been modified to get rid of the filter_actions
entry and associated complex structures. It has been replaced with
inq_var_filter_ids and inq_var_filter_info entries and the dispatch table
version has been bumped to 3. Corresponding NOOP and NOTNC4 functions
were added to libdispatch/dnotnc4.c. Also, the filter_action entries
in dispatch tables were replaced for all dispatch code bases (HDF5, DAP2,
etc). This should only impact UDF users.

In the process, it became clear that the form of the filters
field in NC_VAR_INFO_T was format dependent, so I converted it to
be of type void* and pushed its management into the various dispatch
code bases. Specifically libhdf5 and libnczarr now manage the filters
field in their own way.

The auxilliary functions for parsing textual filter specifications
were moved to netcdf_aux.h and were renamed to the following:
* ncaux_h5filterspec_parse
* ncaux_h5filterspec_parselist
* ncaux_h5filterspec_free
* ncaux_h5filter_fix8

Misc. Other Changes:

1. Document NUG/filters.md updated to reflect the changes above.
2. All the old data types (structs and enums)
   used by filter_actions actions were deleted.
   The exception is the NC_H5_Filterspec because it is needed
   by ncaux_h5filterspec_parselist.
3. Clientside filters were removed -- another enhancement
   for which no-one ever asked.
4. The ability to remove filters was itself removed.
5. Some functionality needed by nczarr was moved from libhdf5
   to libsrc4 e.g. nc4_find_default_chunksizes
6. All the filterx code was removed
7. ncfilter.h and nc4filter.c no longer used

Misc. Unrelated Changes:

1. The nczarr_test makefile clean was leaving some directories; so
   add clean-local to take care of them.
2020-09-27 12:43:46 -06:00
Dennis Heimbigner
44d0dcaad2 Add support for multiple filters per variable.
re: https://github.com/Unidata/netcdf-c/issues/1584

Support has been added for multiple filters per variable.  This
affects a number of components in netcdf. The new APIs are
documented in NUG/filters.md.

The primary changes are:
* A set of new functions are provided (see __include/netcdf_filter.h__).
    - Obtain a list of the filters associated with a variable
    - Obtain the parameters for a specific filter.
* The existing __nc_inq_var_filter__ function now returns info
  about the first defined filter.
* The utilities (ncgen, ncdump, and nccopy) now support
  an extended format for specifying a sequence of filters.
  The general form is __<filter>|<filter>..._.
* The ncdump **_Filter** attribute now dumps a list of all the
  filters associated with a variable using the above new format.
* Filter specifications can now use a filter name instead of number
  for filters known to the netcdf library, which in turn is taken
  from the HDF5 filter registration page.
* New errors are defined: NC_EFILTER and NC_ENOFILTER. The latter
  is returned if an attempt is made to access an unknown filter.
* Internally, the dispatch table has been extended to add a function
  to handle all of the filter functions.
* New, filter-related, tests were added to nc_test4.
* A new plugin was added to the plugins directory to help with testing.

Notes:
1. The shuffle and fletcher32 filters are not part of the multifilter system.

Misc. changes:
1. A debug module was added to libhdf5 to help catch error locations.
2020-02-16 12:59:33 -07:00
Edward Hartnett
558988bb18 fixed docs, removed unneeded defines in test 2020-02-07 07:54:12 -07:00
Edward Hartnett
a5079585a0 test more values for szip 2020-02-07 07:05:29 -07:00
Edward Hartnett
b9aa2e76db test more values for szip 2020-02-07 07:04:48 -07:00
Edward Hartnett
c4d3937099 now check number of elements in chunk against pixels_per_block for szip compression 2020-02-07 07:03:40 -07:00
Edward Hartnett
ff7280512e checking for some bad pixels_per_block values for szip 2020-02-07 06:53:52 -07:00
Edward Hartnett
81a079997e test development for different szip option mask settings 2020-02-07 06:02:02 -07:00
Edward Hartnett
67a562393b adding test of different szip option mask settings 2020-02-07 05:58:59 -07:00
Edward Hartnett
6ae81a17c7 starting to add test for szip param values 2020-02-07 05:16:21 -07:00
Edward Hartnett
6d2d751e4e disallow zlib if szip already in use 2020-02-07 05:01:06 -07:00
Edward Hartnett
78cd7f7512 removed unneeded var 2020-02-07 04:37:04 -07:00
Edward Hartnett
3d169c432b removed redundant test 2020-02-07 04:32:57 -07:00
Edward Hartnett
fb2a1048bb documentation improvements for nc_inq_var_szip() 2020-02-06 07:42:53 -07:00
Edward Hartnett
6d50ba67e8 turned on some previously failing tests for nc_inq_var_szip() 2020-02-06 07:39:01 -07:00
Edward Hartnett
ca5c0234d4 renamed var for clarity 2020-02-06 07:36:10 -07:00
Edward Hartnett
d5859e91b7 not return 0 for parameters to nc_inq_var_szip if szip is not turned on for var 2020-02-06 07:35:07 -07:00
Edward Hartnett
b09446b0da test nc_inq_var_szip after call to nc_def_var_szip, but before enddef 2020-02-04 08:47:18 -07:00
Edward Hartnett
e363bd59c7 testing nc_def_var_szip when szip has not been built into HDF5 2020-02-04 05:33:52 -07:00
Edward Hartnett
7e332690d2 more szip testing 2020-01-06 08:04:27 -07:00
Edward Hartnett
9c8719bd00 more szip tests 2020-01-06 08:01:28 -07:00
Edward Hartnett
65e5897533 whitespace clean up 2020-01-06 07:58:11 -07:00
Edward Hartnett
8a341372ea clean up 2020-01-06 07:57:50 -07:00
Edward Hartnett
1c3edc503d now checking for proper error from nc_def_var_szip() when szip is not present 2020-01-06 07:53:31 -07:00
Edward Hartnett
c47b404531 added test for when szip is missing from HDF5 2020-01-06 07:50:02 -07:00
edwardhartnett
965da1de01 now testing that endianness can only be set on atomic ints and floats 2019-11-15 11:10:10 -07:00
Ed Hartnett
091880fd4d fixed null stride problem for vars calls 2018-12-31 07:36:39 -07:00
Ward Fisher
462ec93913 Whew! Updated copyright stanza in nc_test4. 2018-12-06 15:27:32 -07:00
Dennis Heimbigner
108dc0f01d Fix szip filter handling code and correspondingtests
re: https://github.com/Unidata/netcdf-c/issues/972

The current szip plugin code in the HDF5 library has some
unexpected behaviors that require some changes to how
nc_inq_var_szip is implemented and to the corresponding tests:
nc_test4/{test_szip,tst_vars3}.

Specifically, the following can happen:

1. The number of parameters provided by the user will be two,
   but the number of parameters returned by nc_inq_var_filter
   will be four because the HDF5 code (H5Zszip) will add two
   extra parameters for internal use. It turns out that the two
   parameters provided when calling nc_def_var_filter correspond
   to the first two parameters of the four parameters returned
   by nc_inq_var_filter.

2. The nc_inq_var_szip values corresponding to the ones provided
   by the caller may be different than those provided by
   nc_def_var_filter.  The value of the options_mask argument is
   known to add additional flag bits, and the pixels_per_block
   parameter may be modified.
2018-09-15 15:21:51 -06:00
Ed Hartnett
0fed77be82 fixed warning 2018-07-06 06:32:09 -06:00
Ed Hartnett
c1106c5f18 szip may not be present 2018-06-08 04:46:33 -06:00
Ed Hartnett
2d0ef80cc4 added tests 2018-05-13 07:31:46 -06:00
Ed Hartnett
bf6dab14e0 fixed error value from nc4_find_g_var_nc() when ncid is bad 2017-12-03 04:06:56 -07:00
Ed Hartnett
9e4c564656 more tests 2017-12-02 18:00:22 -07:00
Ed Hartnett
8055c1cede more tests 2017-12-02 17:43:46 -07:00
Ed Hartnett
d840bc7160 starting adding testing for _int() functions for caching 2017-12-02 06:52:13 -07:00
Ed Hartnett
300236cd8c more tests 2017-12-01 15:08:22 -07:00
Ward Fisher
1c11fc5d59 More refactoring, pushing to avoid accidental loss. 2016-10-21 19:24:40 +00:00
tbeu
e2820e4d8a Fix common typos
Detected by https://github.com/vlajos/misspell_fixer
2015-08-20 11:42:05 +02:00
Russ Rew
5c9368a7fa Fix bug NCF-222, scalar non-coordinate variables with same name as dimensions 2013-02-04 15:48:47 +00:00
Russ Rew
fab1c8675c Fix NCF-217, a bug depending on order of calls to netCDF-4 functions 2013-02-01 17:38:26 +00:00
Dennis Heimbigner
7e27052f87 - Implemented diskless files for both netcdf classic and extended.
The in-memory files can be made persistent if nc_create is called with
  NC_DISKLESS|NC_WRITE flags set. Initial test case also included.
- Modified ncio mechanism to support
  multiple ncio packages; this is so we
  can have posixio and memio operating
  at the same time.
- cleanup up a bunch of lint issues (unused variables, etc).
2012-03-26 01:34:32 +00:00
Ed Hartnett
965a3aac70 minor refactor of the build system to work better for cross-compiling 2011-03-15 10:19:08 +00:00