Commit Graph

192 Commits

Author SHA1 Message Date
Dennis Heimbigner
289103d2b1 Merge branch 'master' into zarrs3.dmh 2021-10-07 15:10:03 -06:00
Dennis Heimbigner
dc2ecc74ac (1) improve INI parser (2) Fix make discheck 2021-09-30 13:45:09 -06:00
Edward Hartnett
0ce463761c
Merge branch 'main' into ejh_quantize_2 2021-09-07 10:44:45 -06:00
Dennis Heimbigner
11fe00ea05 Add filter support to NCZarr
Filter support has three goals:

1. Use the existing HDF5 filter implementations,
2. Allow filter metadata to be stored in the NumCodecs metadata format used by Zarr,
3. Allow filters to be used even when HDF5 is disabled

Detailed usage directions are define in docs/filters.md.

For now, the existing filter API is left in place. So filters
are defined using ''nc_def_var_filter'' using the HDF5 style
where the id and parameters are unsigned integers.

This is a big change since filters affect many parts of the code.

In the following, the terms "compressor" and "filter" and "codec" are generally
used synonomously.

### Filter-Related Changes:
* In order to support dynamic loading of shared filter libraries, a new library was added in the libncpoco directory; it helps to isolate dynamic loading across multiple platforms.
* Provide a json parsing library for use by plugins; this is created by merging libdispatch/ncjson.c with include/ncjson.h.
* Add a new _Codecs attribute to allow clients to see what codecs are being used; let ncdump -s print it out.
* Provide special headers to help support compilation of HDF5 filters when HDF5 is not enabled: netcdf_filter_hdf5_build.h and netcdf_filter_build.h.
* Add a number of new test to test the new nczarr filters.
* Let ncgen parse _Codecs attribute, although it is ignored.

### Plugin directory changes:
* Add support for the Blosc compressor; this is essential because it is the most common compressor used in Zarr datasets. This also necessitated adding a CMake FindBlosc.cmake file
* Add NCZarr support for the big-four filters provided by HDF5: shuffle, fletcher32, deflate (zlib), and szip
* Add a Codec defaulter (see docs/filters.md) for the big four filters.
* Make plugins work with windows by properly adding __declspec declaration.

### Misc. Non-Filter Changes
* Replace most uses of USE_NETCDF4 (deprecated) with USE_HDF5.
* Improve support for caching
* More fixes for path conversion code
* Fix misc. memory leaks
* Add new utility -- ncdump/ncpathcvt -- that does more or less the same thing as cygpath.
* Add a number of new test to test the non-filter fixes.
* Update the parsers
* Convert most instances of '#ifdef _MSC_VER' to '#ifdef _WIN32'
2021-09-02 17:04:26 -06:00
Edward Hartnett
f880a63f73 added parallel I/O quantize test 2021-09-02 10:18:42 -06:00
Edward Hartnett
9a18689ffa getting ready for next try at quantization code 2021-08-24 00:45:38 -06:00
Ward Fisher
7446311dfd
Merge pull request #2014 from Unidata/fix-makedist.wif
Another PR to fix 'make distcheck' issues that had crept in.
2021-06-01 15:26:23 -06:00
Ward Fisher
64fa63a774 Clean up stray file that was fouling make distcheck 2021-06-01 13:55:10 -06:00
Dennis Heimbigner
e632d02041 Re-enable DAP2 authorization tests
The thredds-test server now has some password protected datasets
that can be used to test DAP2 authorization support.
The general location is
````
https://thredds.ucar.edu/thredds/tdscapabilities/authTest.html
````
and specifically:
````
https://thredds.ucar.edu/thredds/dodsC/test3/testData.nc.html
````

This PR replaces old testcases with ncdap_test/testauth.sh.
This testcase allows us to test use of the .dodsrc file and .netrc file
and embedded user+pwd.

As part of this, I had to create a program (ncdap_test/pathcvt.c)
that is essentially the equivalent to cygpath. Given a path in
windows, unix, msys or cygwin format, it converts it to the
equivalent format in one of those four cases.  So it can be used
to convert a cygwin path to a windows path, for example. This is
needed in testpathcvt and testauth to make sure that the paths
in .daprc (e.g. the reference to .netrc) are of the proper
format.

Misc. Other Changes:
1. Fix some memory leaks in libdap2
2. Setting the env variable CURLOPT_VERBOSE allows tracking of curl
   operations.
3. Make tst_charvlenbug be conditional on NC_VLEN_NOTEST.
2021-05-29 21:30:33 -06:00
Dennis Heimbigner
e07e32d552 Add test case 2021-04-16 16:12:53 -06:00
Dennis Heimbigner
efd905a323 Add tests for filter order on read and write cases
re: https://github.com/Unidata/netcdf-c/issues/1923
re: https://github.com/Unidata/netcdf-c/issues/1921

The issue was raised about the order of returned filter ids
for nc_inq_var_filter_ids() when creating a file as opposed
to later reading the file.

For creation, the order is the same as the order in which the
calls to nc_def_var_filter() occur.
However, after the file is closed and then reopened for reading,
the question was raised if the returned order is the same or the reverse.
In fact the order is the same in both cases.

This PR extends the existing filter order testcase to check the create
versus read orders. This also required changing the H5Znoop(1) filters
in the plugins directory.

Misc. Unrelated Changes
1. fix calls to fdopen under windows
2. Temporarily suppres the nczarr_tests/run_chunkcases test
   since it seems to be causing problems with github actions.
2020-12-29 20:12:35 -07:00
Dennis Heimbigner
aeb3ac2809 Mostly revert the filter code to reduce its complexity of use.
re: https://github.com/Unidata/netcdf-c/issues/1836

Revert the internal filter code to simplify it. From the user's
point of view, the only visible changes should be:

1. The functions that convert text to filter specs have had their signature reverted and have been moved to netcdf_aux.h
2. Some filter API functions now return NC_ENOFILTER when inquiry is made about some filter.

Internally,the dispatch table has been modified to get rid of the filter_actions
entry and associated complex structures. It has been replaced with
inq_var_filter_ids and inq_var_filter_info entries and the dispatch table
version has been bumped to 3. Corresponding NOOP and NOTNC4 functions
were added to libdispatch/dnotnc4.c. Also, the filter_action entries
in dispatch tables were replaced for all dispatch code bases (HDF5, DAP2,
etc). This should only impact UDF users.

In the process, it became clear that the form of the filters
field in NC_VAR_INFO_T was format dependent, so I converted it to
be of type void* and pushed its management into the various dispatch
code bases. Specifically libhdf5 and libnczarr now manage the filters
field in their own way.

The auxilliary functions for parsing textual filter specifications
were moved to netcdf_aux.h and were renamed to the following:
* ncaux_h5filterspec_parse
* ncaux_h5filterspec_parselist
* ncaux_h5filterspec_free
* ncaux_h5filter_fix8

Misc. Other Changes:

1. Document NUG/filters.md updated to reflect the changes above.
2. All the old data types (structs and enums)
   used by filter_actions actions were deleted.
   The exception is the NC_H5_Filterspec because it is needed
   by ncaux_h5filterspec_parselist.
3. Clientside filters were removed -- another enhancement
   for which no-one ever asked.
4. The ability to remove filters was itself removed.
5. Some functionality needed by nczarr was moved from libhdf5
   to libsrc4 e.g. nc4_find_default_chunksizes
6. All the filterx code was removed
7. ncfilter.h and nc4filter.c no longer used

Misc. Unrelated Changes:

1. The nczarr_test makefile clean was leaving some directories; so
   add clean-local to take care of them.
2020-09-27 12:43:46 -06:00
Dennis Heimbigner
2f0a6d22e9 Fix error where not converting fill data
re: Github Issue https://github.com/Unidata/netcdf-c/issues/1826

It turns out that the common get code (NC4_get_vars) in libhdf5
(and libnczarr) has an optimization where it does not attempt to
read from the file if the file is all fill values. Rather it
just fills the output buffer with the fill value.  The problem
is that -- in that case -- it forgets that conversion might still be
needed.  So the conversion never occurs and the raw bits of
the fill data are stored directly into the memory space.

Solution: move some code around to properly do the
conversion no matter how the data was obtained.

Added a test cases nc_test4/test_fillonly.sh and
nczarr_test/test_fillonlyz.sh
2020-09-12 14:49:59 -06:00
Dennis Heimbigner
84c69afca7 Allow redefinition of variable filters
re: Github issue https://github.com/Unidata/netcdf-c/issues/1713

If nc_def_var_filter or nc_def_var_deflate or nc_def_var_szip is
called multiple times with the same filter id, but possibly with
different sets of parameters, then the first invocation is
sticky and later invocations are ignored. The desired behavior
is to have the last invocation be used.

This PR implements that desired behavior, with some special
cases.  If you call nc_def_var_deflate multiple times, then the
last invocation rule applies with respect to deflate. However,
the shuffle filter, if enabled, is always applied just before
applying deflate.

Misc unrelated changes:
1. Make client-side filters be disabled by default
2. Fix the definition of uintptr_t and use in oc2 and libdap4
3. Add some test cases
4. modify filter order tests to use plugin filters rather
   than client-side filters
2020-05-11 09:42:31 -06:00
Dennis Heimbigner
44d0dcaad2 Add support for multiple filters per variable.
re: https://github.com/Unidata/netcdf-c/issues/1584

Support has been added for multiple filters per variable.  This
affects a number of components in netcdf. The new APIs are
documented in NUG/filters.md.

The primary changes are:
* A set of new functions are provided (see __include/netcdf_filter.h__).
    - Obtain a list of the filters associated with a variable
    - Obtain the parameters for a specific filter.
* The existing __nc_inq_var_filter__ function now returns info
  about the first defined filter.
* The utilities (ncgen, ncdump, and nccopy) now support
  an extended format for specifying a sequence of filters.
  The general form is __<filter>|<filter>..._.
* The ncdump **_Filter** attribute now dumps a list of all the
  filters associated with a variable using the above new format.
* Filter specifications can now use a filter name instead of number
  for filters known to the netcdf library, which in turn is taken
  from the HDF5 filter registration page.
* New errors are defined: NC_EFILTER and NC_ENOFILTER. The latter
  is returned if an attempt is made to access an unknown filter.
* Internally, the dispatch table has been extended to add a function
  to handle all of the filter functions.
* New, filter-related, tests were added to nc_test4.
* A new plugin was added to the plugins directory to help with testing.

Notes:
1. The shuffle and fletcher32 filters are not part of the multifilter system.

Misc. changes:
1. A debug module was added to libhdf5 to help catch error locations.
2020-02-16 12:59:33 -07:00
Edward Hartnett
04624c7be3 fix error in makefile building tst_parallel_compress.c 2020-02-06 09:17:49 -07:00
Edward Hartnett
cf74f49fdb moved tst_parallel_zlib2 to tst_parallel_compress 2020-02-05 17:40:31 -07:00
Edward Hartnett
d534b1298a adding another zlib parallel I/O test 2019-12-20 05:28:20 -07:00
Edward Hartnett
3e00967879 allow parallel writes to use zlib 2019-12-19 09:19:23 -07:00
edwardhartnett
cebe84157b adding test for anonymous dims in HDF5 file 2019-11-13 12:51:34 -07:00
Dennis Heimbigner
9e016b85aa Add test cases 2019-11-03 12:03:13 -07:00
Even Rouault
16babd3e89
nc_test4/Makefile.am: add tst_bug1442 2019-07-18 02:38:28 +02:00
Ward Fisher
103e5a3ff5
Merge pull request #1385 from Unidata/mmapfix.dmh
Fix cmake wrt mmap
2019-05-02 11:09:12 -06:00
Ward Fisher
5410967b00 Merge branch 'master' into addfilter.dmh 2019-04-30 14:51:25 -06:00
Dennis Heimbigner
c9d16d82d6 Fix cmake X mmap
supercede PR: https://github.com/Unidata/netcdf-c/pull/1384

Since we have an mmap user, undeprecate it and make sure
it works. Other changes:

* fix test cases to work with make -j
* fix exposed ncgen error.
2019-04-19 20:32:26 -06:00
Ed Hartnett
d4ab03ea23 merged upstream/master 2019-03-26 11:29:01 -06:00
Dennis Heimbigner
8d0bced60d Allow in-line definition of filters
Priority: Low

re: issue https://github.com/Unidata/netcdf-c/issues/1329

HDF5 has the ability to programmatically define new filters,
as opposed to using HDF5_PLUGIN_PATH env variable.
This PR adds support for that feature.
Not clear how useful this is, though.
See docs/filters.md for details.
2019-03-21 11:33:27 -06:00
Ed Hartnett
54eb9dad77
Merge branch 'master' into ejh_non_contoversial 2019-03-19 14:16:52 -06:00
Ed Hartnett
d709624174 now run tst_put_vars_two_unlim_dim 2019-03-18 11:16:44 -06:00
Ed Hartnett
8799fff54b cleanup Makefile.am 2019-03-18 11:09:44 -06:00
Ed Hartnett
06c9f681f4 got distcheck target working 2019-03-18 10:15:18 -06:00
Ed Hartnett
e3c628482e moving metadata performance tests to nc_perf 2019-03-18 08:30:14 -06:00
Ed Hartnett
cf7126b294 fixing distcheck issues 2019-03-17 11:17:29 -06:00
Ed Hartnett
e48bd5e4c2 removed benchmarking stuff from nc_test4/Makefile.am 2019-03-17 10:57:45 -06:00
Ed Hartnett
73354a9862 moved tst_create_files from nc_test4 to nc_perf 2019-03-17 08:11:08 -06:00
Ed Hartnett
7902615b1c adding nc_perf directory 2019-03-17 08:03:27 -06:00
Ed Hartnett
c89fad34e0 only run some benchmark tests if utilities have been built 2019-03-12 09:23:49 -06:00
Dennis Heimbigner
0c59e13bf7 Master merge, conflict resolution, cleanup 2019-02-24 16:54:13 -07:00
Ward Fisher
53ba2ff316
Merge pull request #1311 from Unidata/filterexpr.dmh
Extend nccopy -F option syntax.
2019-02-15 15:16:25 -07:00
Dennis Heimbigner
5f0fdd7e5d Update Makefile.am to use ref_ filter files 2019-02-11 16:07:03 -07:00
Dennis Heimbigner
a6b04c0c66 Extend nccopy -F option syntax.
A user suggested that the nccopy -F option
syntax should be extended to support specification
of multiple (or all) variables in a single -F option.

The new syntax allows:

1. '*' as the name of the variable; this means apply the
   filter to all variables in the data set.
2. *var1|var2|...* as the variable name to indicate that the filter
   should be applied to the multiple specified variables.
2019-02-08 18:48:17 -07:00
Ward Fisher
1d56285145
Merge branch 'master' into ejh_more_rename_woes_2 2019-02-07 14:28:43 -07:00
Ward Fisher
6511599a6d Corrected issue in support of https://github.com/Unidata/netcdf-c/issues/1310 2019-02-07 12:43:07 -07:00
Ed Hartnett
1dd76c996e added tst_rename3.c for more rename testing 2019-02-02 05:53:45 -07:00
Ed Hartnett
59ee728b93 fixed parallel build problem with tst_files2.c when --enable-benchmarks is used 2019-01-20 10:07:13 -07:00
Ed Hartnett
35017d9549 fixed mistake in Makefile.am 2019-01-02 06:21:11 -07:00
Ed Hartnett
c15c3fa8d5 cleaned up makefile.am, added config.h to some tests 2019-01-02 05:31:15 -07:00
Ed Hartnett
dc982efb3d moved tst_attsperf to be with rest of benchmarks 2018-12-17 08:44:36 -07:00
Ed Hartnett
89b2c48d4c added benchmark program tst_wrf_reads.c 2018-12-11 06:32:25 -07:00
Dennis Heimbigner
751300ec59 Fix more memory leaks in netcdf-c library
This is a follow up to PR https://github.com/Unidata/netcdf-c/pull/1173

Sorry that it is so big, but leak suppression can be complex.

This PR fixes all remaining memory leaks -- as determined by
-fsanitize=address, and with the exceptions noted below.

Unfortunately. there remains a significant leak that I cannot
solve. It involves vlens, and it is unclear if the leak is
occurring in the netcdf-c library or the HDF5 library.

I have added a check_PROGRAM to the ncdump directory to show the
problem.  The program is called tst_vlen_demo.c To exercise it,
build the netcdf library with -fsanitize=address enabled. Then
go into ncdump and do a "make clean check".  This should build
tst_vlen_demo without actually executing it.  Then do the
command "./tst_vlen_demo" to see the output of the memory
checker.  Note the the lost malloc is deep in the HDF5 library
(in H5Tvlen.c).

I am temporarily working around this error in the following way.
1. I modified several test scripts to not execute known vlen tests
   that fail as described above.
2. Added an environment variable called NC_VLEN_NOTEST.
   If set, then those specific tests are suppressed.

This should mean that the --disable-utilities option to
./configure should not need to be set to get a memory leak clean
build.  This should allow for detection of any new leaks.

Note: I used an environment variable rather than a ./configure
option to control the vlen tests. This is because it is
temporary (I hope) and because it is a bit tricky for shell
scripts to access ./configure options.

Finally, as before, this only been tested with netcdf-4 and hdf5 support.
2018-11-15 10:00:38 -07:00