Commit Graph

808 Commits

Author SHA1 Message Date
Dennis Heimbigner
c7cf0d3807 Fix LGTM errors 2020-06-28 19:07:08 -06:00
Dennis Heimbigner
59e04ae071 This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".

The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.

More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).

WARNING: this code has had limited testing, so do use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:

Platform | Build System | S3 support
------------------------------------
Linux+gcc      | Automake     | yes
Linux+gcc      | CMake        | yes
Visual Studio  | CMake        | no

Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future.  Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.

In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*.  The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
   and the version bumped.
4. An overly complex set of structs was created to support funnelling
   all of the filterx operations thru a single dispatch
   "filter_actions" entry.
5. Move common code to from libhdf5 to libsrc4 so that it is accessible
   to nczarr.

Changes directly related to Zarr:
1. Modified CMakeList.txt and configure.ac to support both C and C++
   -- this is in support of S3 support via the awd-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
   support zarr and to regularize the structure of the fragments
   section of a URL.

Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
   e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
   * Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
   and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.

Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-28 18:02:47 -06:00
Edward Hartnett
87226c4879 readded NOTNC3 varm functions to dispatch 2020-06-03 05:55:30 -06:00
Sean Arms
c37cc13dca Treat time units as case-insensitive in ncdump
Enables ncdump -t (-i) to recognize a wider variety of time related units
and calendar names. This brings ncdump closer to what it advertises in its
man page regarding its understanding of udunits compliant time units.
2020-05-14 06:48:03 -06:00
Dennis Heimbigner
84c69afca7 Allow redefinition of variable filters
re: Github issue https://github.com/Unidata/netcdf-c/issues/1713

If nc_def_var_filter or nc_def_var_deflate or nc_def_var_szip is
called multiple times with the same filter id, but possibly with
different sets of parameters, then the first invocation is
sticky and later invocations are ignored. The desired behavior
is to have the last invocation be used.

This PR implements that desired behavior, with some special
cases.  If you call nc_def_var_deflate multiple times, then the
last invocation rule applies with respect to deflate. However,
the shuffle filter, if enabled, is always applied just before
applying deflate.

Misc unrelated changes:
1. Make client-side filters be disabled by default
2. Fix the definition of uintptr_t and use in oc2 and libdap4
3. Add some test cases
4. modify filter order tests to use plugin filters rather
   than client-side filters
2020-05-11 09:42:31 -06:00
Ward Fisher
d772543a9b
Merge branch 'master' into dispnoop.dmh 2020-04-27 15:54:22 -06:00
Dennis Heimbigner
f0cd7f8ec1 Support no-op dispatch functions
re: https://github.com/Unidata/netcdf-c/issues/1693

1. Add functions to libdispatch/dnotnc4.c to support
   dispatch table operations that should work for any
   dispatch table, even if they do not do anything.
   Functions such as nc_inq_var_filter.
2. Modify selected dispatch tables to utilize
   the noop functions.
3. Extend nc_test/tst_formats.c to test.

This is an extension of Ed's work to do this for
chunking and deflate and szip. See PRs
https://github.com/Unidata/netcdf-c/pull/1697
and
https://github.com/Unidata/netcdf-c/pull/1692

As a side effect, elide libdispatch/dnotnc3.c since
it is no longer used.
2020-04-15 14:44:58 -06:00
Edward Hartnett
9ac441ad6a cleanup 2020-04-15 05:53:59 -06:00
Dennis Heimbigner
313121a229 Use proper CURLOPT values for VERIFYHOST and VERIFYPEER
re: https://github.com/Unidata/netcdf-c/issues/1684
re: e-support VZL-904142

Two issues:
1. As of libcurl 7.66, the semantics of CURLOPT_SSL_VERIFYHOST
   changed so that the non-zero values affects certificate processing.
2. The current library was forcing the values of VERIFYPEER
   and VERIFYHOST to zero instead of leaving them to the default values.

Solution was first to leave the defaults in place for VERIFYPEER and VERIFYHOST
as long as they are not set in .ocrc/.dodsrc file.
Second, the value of HTTP.SSL.VERIFYPEER or HTTP.SSL.VERIFYHOST
as set in .ocrc/.dodrc is used to set the corresponding CURLOPT flags.
So for example, adding
> HTTP.SSL.VERIFYHOST=2
will set the value of CURLOPT_SSL_VERIFYHOST to 2, the default.
Using
> HTTP.SSL.VERIFYHOST=0
will set the value of CURLOPT_SSL_VERIFYHOST to 0, which disables it.
Similarly for VERIFYPEER.

Finally the semantics of HTTP.SSL.VALIDATE is now equivalent to
> HTTP.SSL.VERIFYPEER=1
> HTTP.SSL.VERIFYHOST=2
2020-04-10 13:42:27 -06:00
Edward Hartnett
b76a0c8521 documentation improvements 2020-04-08 09:12:19 -06:00
Edward Hartnett
7366edb43f documentation improvements 2020-04-08 09:10:42 -06:00
Edward Hartnett
58e5d53e96 documentation improvements 2020-04-08 09:09:46 -06:00
Edward Hartnett
41ea23a8ac
Merge branch 'master' into ejh_fix_nc3_deflate 2020-04-08 08:54:50 -06:00
Edward Hartnett
1c189b2c56 dealing with nc_inq_var_szip(), testing, and release notes 2020-04-08 08:49:04 -06:00
Edward Hartnett
aab2f998b3 now testing that nc_inq_var_deflate() works for all formats and returns 0 deflate and deflate_level 2020-04-08 08:31:53 -06:00
Dennis Heimbigner
6f86660da8 Fix missing forward declarations
re: issue https://github.com/Unidata/netcdf-c/issues/1687

static functions are being used before decl and it causes
errors. Only occurs when BIG_ENDIAN is defined.
Solution is to add the forward declarations.
2020-04-03 20:15:34 -06:00
Edward Hartnett
9b6215936b updated documentation of nc_inq_var_deflate() to describe behavior of deflate_level when deflate not in use 2020-03-17 10:33:53 -06:00
Edward Hartnett
edea5e3552 now pass 0 for deflate_level if deflate not in use 2020-03-16 11:01:13 -06:00
Dennis Heimbigner
1bce6b9b5c Fix open/create of UTF8 names
re: issue https://github.com/Unidata/netcdf-c/issues/1666

The code in NC_open and NC_create (in dfile.c)
was using improperly testing for leading whitespace chars.
It was treating UTF-8 as whitespace.

Fix is to do tests using unsigned char.
2020-03-11 11:25:57 -06:00
Edward Hartnett
7004bbc2d5 updated documentation 2020-03-06 09:54:26 -07:00
Edward Hartnett
d5aba68cec updated docs for nc_def_var_chunking WRT scalars 2020-03-02 16:41:01 -07:00
Edward Hartnett
ba0491bb40 documentation improvements for nc_var_par_access() 2020-03-02 16:36:56 -07:00
Dennis Heimbigner
b488c272d5 Fix conflicts with master 2020-02-27 14:06:45 -07:00
Dennis Heimbigner
44d0dcaad2 Add support for multiple filters per variable.
re: https://github.com/Unidata/netcdf-c/issues/1584

Support has been added for multiple filters per variable.  This
affects a number of components in netcdf. The new APIs are
documented in NUG/filters.md.

The primary changes are:
* A set of new functions are provided (see __include/netcdf_filter.h__).
    - Obtain a list of the filters associated with a variable
    - Obtain the parameters for a specific filter.
* The existing __nc_inq_var_filter__ function now returns info
  about the first defined filter.
* The utilities (ncgen, ncdump, and nccopy) now support
  an extended format for specifying a sequence of filters.
  The general form is __<filter>|<filter>..._.
* The ncdump **_Filter** attribute now dumps a list of all the
  filters associated with a variable using the above new format.
* Filter specifications can now use a filter name instead of number
  for filters known to the netcdf library, which in turn is taken
  from the HDF5 filter registration page.
* New errors are defined: NC_EFILTER and NC_ENOFILTER. The latter
  is returned if an attempt is made to access an unknown filter.
* Internally, the dispatch table has been extended to add a function
  to handle all of the filter functions.
* New, filter-related, tests were added to nc_test4.
* A new plugin was added to the plugins directory to help with testing.

Notes:
1. The shuffle and fletcher32 filters are not part of the multifilter system.

Misc. changes:
1. A debug module was added to libhdf5 to help catch error locations.
2020-02-16 12:59:33 -07:00
Edward Hartnett
c0d9c6237d added more documentation to nc_def_var_filter() 2020-02-09 17:59:41 -07:00
Edward Hartnett
8057a552ef move nc_def_var_szip function so it will appear in the documentation 2020-02-07 09:09:01 -07:00
Edward Hartnett
cbc2677094 updated documentation 2020-02-07 05:02:59 -07:00
Edward Hartnett
fb2a1048bb documentation improvements for nc_inq_var_szip() 2020-02-06 07:42:53 -07:00
Edward Hartnett
d5859e91b7 not return 0 for parameters to nc_inq_var_szip if szip is not turned on for var 2020-02-06 07:35:07 -07:00
Edward Hartnett
a6fbc3eea2 fix problem with pre-enddef call to nc_inq_var_szip() 2020-02-04 07:11:44 -07:00
Edward Hartnett
626f40843c more documentation for nc_inq_var_szip 2020-02-04 05:41:20 -07:00
Ward Fisher
8771d0bdf4
Merge pull request #1582 from NOAA-GSD/ejh_parallel_zlib
Allow user to turn on zlib, shuffle, and/or fletcher32 filters with parallel I/O for HDF5-1.10.2+
2020-01-13 16:06:51 -07:00
Dennis Heimbigner
748d26c114 Add support for CURLOPT_CONNECTTIMEOUT
I see that there is no way to set CURLOPT_CONNECTTIMEOUT,
but there is support for CURLOPT_TIMEOUT.
So, accept the line 'HTTP.CONNECTTIMEOUT'
in .rc file to allow user to set CURLOPT_CONNECTTIMEOUT.
2020-01-09 11:48:04 -07:00
Dennis Heimbigner
f587654670 Make the dap4 code resistant to various server errors.
Some versions of some servers are returning malformed responses.
Make the library either handle them or gracefully fail.
The three server errors "fixed" here are as follows.
1. The attribute _NCProperties sometimes has a trailing nul character
   in its value. Soln is to elide the nul(s).
2. Sometimes a DAP response has no data part, only a DMR.
   Soln is to detect and return an error code instead of crashing.
3. Sometimes a server returns a redirection, but our current
   openmagic() function was not following the redirect. Soln
   is to follow redirects.
Also because of #2, I am temporarily making --disable-dap-remote-tests
be the default.
2020-01-08 15:18:31 -07:00
Ward Fisher
438119dd69
Merge pull request #1560 from NOAA-GSD/ejh_cache_docs
increase default cache size for netCDF-4/HDF5 files, also improve cache docs and add benchmarking program
2020-01-07 11:46:30 -07:00
Ward Fisher
fb062f4406 Correct a cmake linking error discovered when working in a mips qemu environment. 2020-01-02 12:57:59 -05:00
Edward Hartnett
995cfdad96 merged master 2019-12-20 11:16:11 -07:00
Edward Hartnett
accb83a8b5 even more documentation updates 2019-12-20 07:20:02 -07:00
Edward Hartnett
8681b0d241 more documentaiton 2019-12-20 07:10:13 -07:00
Edward Hartnett
2136063d69 better documentation 2019-12-20 07:05:23 -07:00
Edward Hartnett
6952eb779b documentation updates 2019-12-20 05:36:09 -07:00
Ward Fisher
6c75e97764
Merge pull request #1570 from NOAA-GSD/ejh_compact
enable compact storage for netcdf-4 vars
2019-12-19 16:47:05 -07:00
James Sharpe
c5d1e4bdec Call find_package(MPI) to locate MPI paths and link to libdispatch if required 2019-12-18 16:48:40 +00:00
Edward Hartnett
19fef32a9e better documentation for compact storage 2019-12-16 09:52:59 -07:00
Edward Hartnett
e43a5d952c updated docs for NC_COMPACT 2019-12-04 09:16:33 -07:00
Edward Hartnett
64d821b568 removed non-relaxed coord bounds from test code 2019-11-26 06:20:34 -07:00
Edward Hartnett
5ab7bf7796 now always relax! 2019-11-26 05:36:16 -07:00
Edward Hartnett
2682ffd68d improved docs for cache functions, added libhdf5/hdf5cache.c to Doxyfile.in, added benchmark program for cache settings 2019-11-25 16:33:04 -07:00
Ward Fisher
af8f9ad2cf
Merge pull request #1523 from NetCDF-World-Domination-Council/ejh_udf
User-defined formats must come first in NC_infermodel, plus test
2019-11-15 16:40:07 -07:00
Ward Fisher
e4003be502
Merge pull request #1515 from NetCDF-World-Domination-Council/ejh_att_docs
update for attribute documentation
2019-11-15 16:39:46 -07:00