127 Commits

Author SHA1 Message Date
Edward Hartnett
0ce463761c
Merge branch 'main' into ejh_quantize_2 2021-09-07 10:44:45 -06:00
Dennis Heimbigner
11fe00ea05 Add filter support to NCZarr
Filter support has three goals:

1. Use the existing HDF5 filter implementations,
2. Allow filter metadata to be stored in the NumCodecs metadata format used by Zarr,
3. Allow filters to be used even when HDF5 is disabled

Detailed usage directions are defined in docs/filters.md.

For now, the existing filter API is left in place, so filters
are defined with ''nc_def_var_filter'' in the HDF5 style,
where the id and parameters are unsigned integers.

This is a big change since filters affect many parts of the code.

In the following, the terms "compressor", "filter", and "codec" are generally
used synonymously.

### Filter-Related Changes:
* In order to support dynamic loading of shared filter libraries, a new library was added in the libncpoco directory; it helps to isolate dynamic loading across multiple platforms.
* Provide a json parsing library for use by plugins; this is created by merging libdispatch/ncjson.c with include/ncjson.h.
* Add a new _Codecs attribute to allow clients to see what codecs are being used; let ncdump -s print it out.
* Provide special headers to help support compilation of HDF5 filters when HDF5 is not enabled: netcdf_filter_hdf5_build.h and netcdf_filter_build.h.
* Add a number of new tests to test the new nczarr filters.
* Let ncgen parse _Codecs attribute, although it is ignored.

### Plugin directory changes:
* Add support for the Blosc compressor; this is essential because it is the most common compressor used in Zarr datasets. This also necessitated adding a CMake FindBlosc.cmake file
* Add NCZarr support for the big-four filters provided by HDF5: shuffle, fletcher32, deflate (zlib), and szip
* Add a Codec defaulter (see docs/filters.md) for the big four filters.
* Make plugins work with Windows by properly adding __declspec declarations.

### Misc. Non-Filter Changes
* Replace most uses of USE_NETCDF4 (deprecated) with USE_HDF5.
* Improve support for caching
* More fixes for path conversion code
* Fix misc. memory leaks
* Add new utility -- ncdump/ncpathcvt -- that does more or less the same thing as cygpath.
* Add a number of new tests to test the non-filter fixes.
* Update the parsers
* Convert most instances of '#ifdef _MSC_VER' to '#ifdef _WIN32'
2021-09-02 17:04:26 -06:00
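The Windows plugin fix in commit 11fe00ea05 relies on `__declspec` declarations for exported symbols, and the same commit switches most platform checks from `_MSC_VER` to `_WIN32`. A minimal sketch of how the two fit together follows. The macro name `PLUGIN_EXPORT`, the function `plugin_filter_id`, and the id value are all hypothetical and are not taken from netcdf-c.

```c
/*
 * Hypothetical export macro (the name PLUGIN_EXPORT is illustrative only).
 * Keyed on _WIN32 rather than _MSC_VER, so Windows compilers other than
 * MSVC (for example MinGW) take the same branch.
 */
#if defined(_WIN32)
#  define PLUGIN_EXPORT __declspec(dllexport)
#else
#  define PLUGIN_EXPORT
#endif

/* Hypothetical plugin entry point: reports the filter id the plugin implements. */
PLUGIN_EXPORT unsigned int plugin_filter_id(void)
{
    return 32768u; /* arbitrary id chosen for illustration */
}
```

On non-Windows platforms the macro expands to nothing, so the same source builds unchanged.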
Edward Hartnett
f880a63f73 added parallel I/O quantize test 2021-09-02 10:18:42 -06:00
Edward Hartnett
9a18689ffa getting ready for next try at quantization code 2021-08-24 00:45:38 -06:00
Ward Fisher
9f798e2ed6 Merge branch 'virtual_datasets' of https://github.com/d70-t/netcdf-c into gh1983.wif 2021-07-19 09:44:35 -07:00
Dennis Heimbigner
c984b3a428 fix CMake error 2021-04-16 18:54:35 -06:00
Dennis Heimbigner
e07e32d552 Add test case 2021-04-16 16:12:53 -06:00
Tobias Kölling
a80d473ca8 Merge remote-tracking branch 'upstream/master' into virtual_datasets 2020-09-15 09:50:25 +02:00
Tobias Kölling
b6b9f2fd6f enable test for virtual datasets only for HDF5 >= 1.10.0 2020-09-14 18:06:34 +02:00
Dennis Heimbigner
2f0a6d22e9 Fix error where not converting fill data
re: Github Issue https://github.com/Unidata/netcdf-c/issues/1826

It turns out that the common get code (NC4_get_vars) in libhdf5
(and libnczarr) has an optimization where it does not attempt to
read from the file if the file is all fill values. Rather it
just fills the output buffer with the fill value.  The problem
is that -- in that case -- it forgets that conversion might still be
needed.  So the conversion never occurs and the raw bits of
the fill data are stored directly into the memory space.

Solution: move some code around to properly do the
conversion no matter how the data was obtained.

Added test cases nc_test4/test_fillonly.sh and
nczarr_test/test_fillonlyz.sh
2020-09-12 14:49:59 -06:00
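The fix described in commit 2f0a6d22e9 can be illustrated with a small sketch: whether the data comes from the file or from the fill-value shortcut, the type conversion runs afterwards. This is not the actual `NC4_get_vars` code. The function names, the fixed `short` to `double` type pair, and the scratch-buffer layout are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Convert n values from the file type (short here) to the memory type (double here). */
static void convert_short_to_double(const short *src, double *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (double)src[i];
}

/*
 * Hypothetical read path. If the variable holds only fill values, skip the
 * file read and synthesize fill values in the file type; otherwise copy the
 * stored data. `scratch` holds n values in the file type.
 */
void read_as_double(const short *stored, int all_fill, short fill_value,
                    short *scratch, double *out, size_t n)
{
    assert(scratch != NULL && out != NULL);
    if (all_fill) {
        for (size_t i = 0; i < n; i++)
            scratch[i] = fill_value;
    } else {
        assert(stored != NULL);
        for (size_t i = 0; i < n; i++)
            scratch[i] = stored[i];
    }
    /* The point of the fix: convert no matter how the data was obtained. */
    convert_short_to_double(scratch, out, n);
}
```

The bug corresponds to returning from the `all_fill` branch before the conversion call, which would leave the raw fill bits in the caller's buffer.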
Tobias Kölling
1a7dd2332d added test for reading virtual datasets 2020-07-22 17:55:28 +02:00
Greg Sjaardema
102758d3ce
Remove test since file was moved to nc_perf
In commit ba6ab3, the `tst_gfs_data_1.c` file was moved from `nc_test4` to `nc_perf`, but the test/executable that uses that file was not removed from nc_test4 CMakeLists.txt
2020-07-10 15:14:12 -06:00
Edward Hartnett
b13fcf96e7 added tst_gfs_data1 2020-06-28 19:38:18 -06:00
Dennis Heimbigner
84c69afca7 Allow redefinition of variable filters
re: Github issue https://github.com/Unidata/netcdf-c/issues/1713

If nc_def_var_filter or nc_def_var_deflate or nc_def_var_szip is
called multiple times with the same filter id, but possibly with
different sets of parameters, then the first invocation is
sticky and later invocations are ignored. The desired behavior
is to have the last invocation be used.

This PR implements that desired behavior, with some special
cases.  If you call nc_def_var_deflate multiple times, then the
last invocation rule applies with respect to deflate. However,
the shuffle filter, if enabled, is always applied just before
applying deflate.

Misc unrelated changes:
1. Make client-side filters be disabled by default
2. Fix the definition of uintptr_t and use in oc2 and libdap4
3. Add some test cases
4. Modify filter order tests to use plugin filters rather
   than client-side filters
2020-05-11 09:42:31 -06:00
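The "last invocation wins" rule from commit 84c69afca7 can be sketched with a small per-variable filter list. The types `FilterSpec` and `FilterList` and the function `define_filter` are hypothetical and do not reflect netcdf-c's internal structures. Keeping a redefined filter at its original position in the list is also an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILTERS 8
#define MAX_PARAMS  8

/* Hypothetical filter record: HDF5-style unsigned id and unsigned parameters. */
typedef struct {
    unsigned int id;
    size_t nparams;
    unsigned int params[MAX_PARAMS];
} FilterSpec;

typedef struct {
    size_t count;
    FilterSpec filters[MAX_FILTERS];
} FilterList;

/*
 * Define a filter on a variable. If the id is already present, its
 * parameters are overwritten, so the last call wins. Returns 0 on
 * success, -1 if there is no room or too many parameters.
 */
int define_filter(FilterList *list, unsigned int id,
                  size_t nparams, const unsigned int *params)
{
    assert(list != NULL);
    if (nparams > MAX_PARAMS)
        return -1;

    FilterSpec *slot = NULL;
    for (size_t i = 0; i < list->count; i++) {
        if (list->filters[i].id == id) {
            slot = &list->filters[i];
            break;
        }
    }
    if (slot == NULL) {
        if (list->count == MAX_FILTERS)
            return -1;
        slot = &list->filters[list->count++];
        slot->id = id;
    }
    slot->nparams = nparams;
    if (nparams > 0)
        memcpy(slot->params, params, nparams * sizeof(unsigned int));
    return 0;
}
```

Before this change, the behavior described in the commit message corresponds to returning early when the id is already present, which made the first definition sticky.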
Dennis Heimbigner
44d0dcaad2 Add support for multiple filters per variable.
re: https://github.com/Unidata/netcdf-c/issues/1584

Support has been added for multiple filters per variable.  This
affects a number of components in netcdf. The new APIs are
documented in NUG/filters.md.

The primary changes are:
* A set of new functions is provided (see __include/netcdf_filter.h__).
    - Obtain a list of the filters associated with a variable
    - Obtain the parameters for a specific filter.
* The existing __nc_inq_var_filter__ function now returns info
  about the first defined filter.
* The utilities (ncgen, ncdump, and nccopy) now support
  an extended format for specifying a sequence of filters.
  The general form is __<filter>|<filter>...__.
* The ncdump **_Filter** attribute now dumps a list of all the
  filters associated with a variable using the above new format.
* Filter specifications can now use a filter name instead of number
  for filters known to the netcdf library, which in turn is taken
  from the HDF5 filter registration page.
* New errors are defined: NC_EFILTER and NC_ENOFILTER. The latter
  is returned if an attempt is made to access an unknown filter.
* Internally, the dispatch table has been extended to add a function
  to handle all of the filter functions.
* New, filter-related, tests were added to nc_test4.
* A new plugin was added to the plugins directory to help with testing.

Notes:
1. The shuffle and fletcher32 filters are not part of the multifilter system.

Misc. changes:
1. A debug module was added to libhdf5 to help catch error locations.
2020-02-16 12:59:33 -07:00
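The extended `<filter>|<filter>...` format from commit 44d0dcaad2 amounts to a sequence of filter specifications separated by `|`. A minimal sketch of splitting such a string is shown below. The function `split_filter_spec` and the `PIECE_CAP` limit are hypothetical, and the sketch treats each `<filter>` piece as an opaque string rather than assuming its internal syntax.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PIECE_CAP 32   /* max bytes per piece, including the terminating NUL */

/*
 * Split `spec` on '|' into at most `max_out` pieces, copying each piece
 * into out[i]. Returns the number of pieces, or -1 if a piece is empty,
 * a piece is too long, or there are more pieces than max_out.
 */
int split_filter_spec(const char *spec, char out[][PIECE_CAP], size_t max_out)
{
    assert(spec != NULL && out != NULL);
    size_t count = 0;
    const char *start = spec;

    for (;;) {
        const char *bar = strchr(start, '|');
        size_t len = bar ? (size_t)(bar - start) : strlen(start);
        if (len == 0 || len >= PIECE_CAP || count == max_out)
            return -1;
        memcpy(out[count], start, len);
        out[count][len] = '\0';
        count++;
        if (bar == NULL)
            break;
        start = bar + 1;   /* continue after the separator */
    }
    return (int)count;
}
```

For example, an illustrative input such as `"307,9|32015,3"` yields two pieces, in the order the filters are written.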
Edward Hartnett
cf74f49fdb moved tst_parallel_zlib2 to tst_parallel_compress 2020-02-05 17:40:31 -07:00
Edward Hartnett
6d2d92ddec added new tests to cmake, also additional test development 2019-12-20 06:30:23 -07:00
Edward Hartnett
cde58d23c1 allowing parallel write of gzipped data, plus added test 2019-12-19 09:20:20 -07:00
edwardhartnett
cebe84157b adding test for anonymous dims in HDF5 file 2019-11-13 12:51:34 -07:00
Ward Fisher
9a92201c94 Wiring unit test directory into cmake-based builds. 2019-08-21 14:50:09 -06:00
Even Rouault
0c7be1d278
Add test case for bugfix of #1442 2019-07-18 02:23:43 +02:00
Ward Fisher
44fce0904e
Merge pull request #1387 from Unidata/fixffilter.dmh
Minor config.h changes to support filters in Fortran
2019-05-01 14:44:53 -06:00
Ward Fisher
5410967b00 Merge branch 'master' into addfilter.dmh 2019-04-30 14:51:25 -06:00
Dennis Heimbigner
62feacee00 missing cmake file 2019-04-29 20:55:28 -06:00
Dennis Heimbigner
8d0bced60d Allow in-line definition of filters
Priority: Low

re: issue https://github.com/Unidata/netcdf-c/issues/1329

HDF5 has the ability to programmatically define new filters,
as opposed to using HDF5_PLUGIN_PATH env variable.
This PR adds support for that feature.
Not clear how useful this is, though.
See docs/filters.md for details.
2019-03-21 11:33:27 -06:00
Ed Hartnett
94d9cd7c8f don't allow benchmarks for classic only builds 2019-03-18 08:40:18 -06:00
Ed Hartnett
51b8ba10b4 attempting to get cmake build working with new nc_perf directory 2019-03-18 08:09:46 -06:00
Ward Fisher
404f87b8c2 Turned off filterparser test when building static library. 2019-02-20 15:11:06 -07:00
Ed Hartnett
1dd76c996e added tst_rename3.c for more rename testing 2019-02-02 05:53:45 -07:00
Ward Fisher
462ec93913 Whew! Updated copyright stanza in nc_test4. 2018-12-06 15:27:32 -07:00
Ed Hartnett
8847c843fb further cleanup for benchmark builds 2018-08-24 12:48:42 -06:00
Ed Hartnett
e1bd6f2c20 fixing cmake benchmarks, also removing unneeded run_bm.sh 2018-08-24 09:04:01 -06:00
Ed Hartnett
77d3a6db22 removed tst_ar5 from Makefile.am and cmake build 2018-08-24 07:11:51 -06:00
Ed Hartnett
1b318a01fb getting automake build working 2018-08-16 10:55:11 -06:00
Greg Sjaardema
c3f63b6ffe Enable metadata_perf test in CMake build
Currently defaults to enabled, can change top-level
CMakeLists.txt file to change default to disabled
2018-07-11 13:40:31 -04:00
Ed Hartnett
5112ef8c72 merged master 2018-06-07 07:27:47 -06:00
Ed Hartnett
fd23cb3cf2 cleanup makefile, develop test 2018-05-14 08:11:32 -06:00
Ed Hartnett
350946b1e0 user defined formats only if netcdf-4 is built 2018-05-13 16:03:04 -06:00
Ward Fisher
65f7dd0397
Merge branch 'master' into ejh_move_tests 2018-04-23 13:36:53 -06:00
Ed Hartnett
b5e282996c moved tests 2018-04-23 03:19:11 -06:00
Dennis Heimbigner
d3b309722e re: gh issue https://github.com/Unidata/netcdf-c/issues/911
I took Ed's advice and moved the plugin stuff to its own
top-level directory. This is an attempt to solve the problem of
copying files that we have experienced. In any case, it will
serve as a place to stick additional plugins.
2018-04-21 20:10:47 -06:00
Dennis Heimbigner
25f062528b This completes (for now) the refactoring of libsrc4.
The file docs/indexing.dox tries to provide design
information for the refactoring.

The primary change is to replace all walking of linked
lists with the use of the NCindex data structure.
NCindex is a combination of a hash table (for name-based
lookup) and a vector (for walking the elements in the index).
Additionally, global vectors are added to NC_HDF5_FILE_INFO_T
to support direct mapping of, e.g., a dimid to the NC_DIM_INFO_T
object. These global vectors exist for dimensions, types, and groups
because they have globally unique id numbers.

WARNING:
1. since libsrc4 and libsrchdf4 share code, there are also
   changes in libsrchdf4.
2. Any outstanding pull requests that change libsrc4 or libhdf4
   are likely to cause conflicts with this code.
3. The original reason for doing this was for performance improvements,
   but as noted elsewhere, this may not be significant because
   the meta-data read performance apparently is being dominated
   by the hdf5 library because we do bulk meta-data reading rather
   than lazy reading.
2018-03-16 11:46:18 -06:00
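The idea behind the index described in commit 25f062528b, a hash table for name-based lookup combined with a vector for ordered walking, can be sketched as follows. This is not the real NCindex layout or API. The type `NameIndex`, its functions, the fixed capacities, and the choice of linear probing are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INDEX_CAP  16   /* max objects; fixed for brevity */
#define TABLE_SIZE 32   /* hash slots; power of two, larger than INDEX_CAP */

typedef struct {
    const char *names[INDEX_CAP]; /* vector: insertion order, id == position */
    size_t count;
    int slots[TABLE_SIZE];        /* hash table: vector position + 1, 0 if empty */
} NameIndex;

static size_t hash_name(const char *s)
{
    size_t h = 5381;              /* djb2-style string hash */
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h & (TABLE_SIZE - 1);
}

void index_init(NameIndex *ix)
{
    memset(ix, 0, sizeof *ix);
}

/* Append a name; returns its id (vector position), or -1 if full or duplicate. */
int index_add(NameIndex *ix, const char *name)
{
    assert(ix != NULL && name != NULL);
    if (ix->count == INDEX_CAP)
        return -1;
    size_t h = hash_name(name);
    while (ix->slots[h] != 0) {   /* linear probing */
        if (strcmp(ix->names[ix->slots[h] - 1], name) == 0)
            return -1;
        h = (h + 1) & (TABLE_SIZE - 1);
    }
    ix->names[ix->count] = name;
    ix->slots[h] = (int)ix->count + 1;
    return (int)ix->count++;
}

/* Name-based lookup through the hash table; returns the id or -1. */
int index_lookup(const NameIndex *ix, const char *name)
{
    assert(ix != NULL && name != NULL);
    size_t h = hash_name(name);
    while (ix->slots[h] != 0) {
        int id = ix->slots[h] - 1;
        if (strcmp(ix->names[id], name) == 0)
            return id;
        h = (h + 1) & (TABLE_SIZE - 1);
    }
    return -1;
}
```

Walking all elements is a plain loop over `names[0..count-1]`, while a lookup by name avoids the linked-list traversal that the refactoring replaced. The id-to-object mapping in the vector mirrors the direct dimid mapping mentioned in the commit message.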
Ed Hartnett
4e646e03f6 fixed cmake build file 2018-03-05 05:03:46 -07:00
Ed Hartnett
49ebcc0e57 fixed problem of test file 2018-02-27 11:10:04 -07:00
Ed Hartnett
64e5742d88 added tst_bug324 to cmake build 2018-02-27 08:53:58 -07:00
Ed Hartnett
dbbd5094cb added rename test to CMakeLists.txt 2018-02-12 10:12:49 -07:00
Ward Fisher
d0339c8902 Merge branch 'pr-catchup' into v4.6.0-release-branch 2018-01-24 15:51:24 -06:00
Ward Fisher
ca814c17f0 Added fenceposts for filter testing. 2018-01-23 18:18:58 -06:00
Ed Hartnett
001483505f rehabilitated tst_types.c 2018-01-18 05:34:52 -07:00
Ward Fisher
62abc6af09 Updated CMakeLists.txt to correct an issue seen on OSX when using cmake and running 'make test' directly from nc_test4 2018-01-17 14:00:52 -07:00