Commit Graph

857 Commits

Author SHA1 Message Date
Dennis Heimbigner
59e04ae071 This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".

The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.

More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).

WARNING: this code has had limited testing, so do not use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:

Platform       | Build System | S3 support
---------------|--------------|-----------
Linux+gcc      | Automake     | yes
Linux+gcc      | CMake        | yes
Visual Studio  | CMake        | no

Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future.  Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.

In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*.  The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
   and the version bumped.
4. An overly complex set of structs was created to support funnelling
   all of the filterx operations thru a single dispatch
   "filter_actions" entry.
5. Move common code from libhdf5 to libsrc4 so that it is accessible
   to nczarr.
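
As a rough illustration of the wrapper idea described above (not the actual code in libdispatch/dfilterx.c), the sketch below turns an unsigned int filter id and its parameters into strings and forwards them to a string-based routine. The names `FilterX`, `filterx_define` and `filter_define_uint` are hypothetical stand-ins; the real nc_filterx_XXX signatures are not given in this message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXPARAMS 8
#define NUMLEN 16 /* room for a 32-bit unsigned value in decimal plus NUL */

/* Hypothetical string-based filter record; not a netcdf-c type. */
typedef struct {
    char id[NUMLEN];
    size_t nparams;
    char params[MAXPARAMS][NUMLEN];
} FilterX;

/* Hypothetical string-based entry point: stores id and params as text. */
static int filterx_define(FilterX *f, const char *id,
                          size_t nparams, const char *const *params)
{
    assert(f != NULL && id != NULL);
    if (nparams > MAXPARAMS || strlen(id) >= NUMLEN) return -1;
    strcpy(f->id, id);
    f->nparams = nparams;
    for (size_t i = 0; i < nparams; i++) {
        if (strlen(params[i]) >= NUMLEN) return -1;
        strcpy(f->params[i], params[i]);
    }
    return 0;
}

/* Old-style unsigned int call, written as a thin wrapper over the string call. */
static int filter_define_uint(FilterX *f, unsigned int id,
                              size_t nparams, const unsigned int *params)
{
    char idbuf[NUMLEN];
    char pbuf[MAXPARAMS][NUMLEN];
    const char *pptr[MAXPARAMS];

    if (nparams > MAXPARAMS) return -1;
    snprintf(idbuf, sizeof idbuf, "%u", id);
    for (size_t i = 0; i < nparams; i++) {
        snprintf(pbuf[i], sizeof pbuf[i], "%u", params[i]);
        pptr[i] = pbuf[i];
    }
    return filterx_define(f, idbuf, nparams, pptr);
}
```

With this shape, existing callers keep passing numbers while only the string-based path has to understand the more general filter model.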

Changes directly related to Zarr:
1. Modified CMakeLists.txt and configure.ac to support both C and C++
   -- this is in support of S3 support via the aws-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
   support zarr and to regularize the structure of the fragments
   section of a URL.

Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
   e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
   * Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
   and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.

Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-28 18:02:47 -06:00
Edward Hartnett
87226c4879 readded NOTNC3 varm functions to dispatch 2020-06-03 05:55:30 -06:00
Sean Arms
c37cc13dca Treat time units as case-insensitive in ncdump
Enables ncdump -t (-i) to recognize a wider variety of time related units
and calendar names. This brings ncdump closer to what it advertises in its
man page regarding its understanding of udunits compliant time units.
2020-05-14 06:48:03 -06:00
Dennis Heimbigner
84c69afca7 Allow redefinition of variable filters
re: Github issue https://github.com/Unidata/netcdf-c/issues/1713

If nc_def_var_filter or nc_def_var_deflate or nc_def_var_szip is
called multiple times with the same filter id, but possibly with
different sets of parameters, then the first invocation is
sticky and later invocations are ignored. The desired behavior
is to have the last invocation be used.

This PR implements that desired behavior, with some special
cases.  If you call nc_def_var_deflate multiple times, then the
last invocation rule applies with respect to deflate. However,
the shuffle filter, if enabled, is always applied just before
applying deflate.
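
The "last invocation wins" rule can be pictured with a small stand-alone sketch. The `VarFilters` structure and `def_filter` function are hypothetical and use a single parameter per filter for brevity; this is not the data structure netcdf-c uses internally.

```c
#include <assert.h>
#include <stddef.h>

#define MAXFILTERS 4

/* Hypothetical per-variable filter list; not the netcdf-c data structure. */
typedef struct {
    unsigned int id;
    unsigned int level; /* single parameter, for brevity */
} Filter;

typedef struct {
    size_t nfilters;
    Filter filters[MAXFILTERS];
} VarFilters;

/* Define a filter. If the id is already present, overwrite its parameter
   so that the most recent invocation is the one that takes effect. */
static int def_filter(VarFilters *v, unsigned int id, unsigned int level)
{
    assert(v != NULL);
    for (size_t i = 0; i < v->nfilters; i++) {
        if (v->filters[i].id == id) {
            v->filters[i].level = level; /* redefinition replaces */
            return 0;
        }
    }
    if (v->nfilters == MAXFILTERS) return -1;
    v->filters[v->nfilters].id = id;
    v->filters[v->nfilters].level = level;
    v->nfilters++;
    return 0;
}
```

Calling `def_filter` twice with the same id leaves one entry holding the second set of parameters, which mirrors the behavior this PR aims for.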

Misc unrelated changes:
1. Make client-side filters be disabled by default
2. Fix the definition of uintptr_t and use in oc2 and libdap4
3. Add some test cases
4. modify filter order tests to use plugin filters rather
   than client-side filters
2020-05-11 09:42:31 -06:00
Ward Fisher
d772543a9b
Merge branch 'master' into dispnoop.dmh 2020-04-27 15:54:22 -06:00
Dennis Heimbigner
f0cd7f8ec1 Support no-op dispatch functions
re: https://github.com/Unidata/netcdf-c/issues/1693

1. Add functions to libdispatch/dnotnc4.c to support
   dispatch table operations that should work for any
   dispatch table, even if they do not do anything.
   Functions such as nc_inq_var_filter.
2. Modify selected dispatch tables to utilize
   the noop functions.
3. Extend nc_test/tst_formats.c to test.
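
A minimal sketch of the no-op idea, assuming a one-entry table: a format without filter support plugs in a function that succeeds and reports "no filter". `MiniDispatch`, `noop_inq_var_filter` and the exact values reported are illustrative assumptions, not the contents of libdispatch/dnotnc4.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced dispatch table with a single entry. */
typedef struct {
    const char *name;
    int (*inq_var_filter)(int varid, unsigned int *idp, size_t *nparamsp);
} MiniDispatch;

/* No-op inquiry: succeeds and reports "no filter" instead of failing. */
static int noop_inq_var_filter(int varid, unsigned int *idp, size_t *nparamsp)
{
    (void)varid;
    if (idp) *idp = 0;
    if (nparamsp) *nparamsp = 0;
    return 0; /* success */
}

/* A format without filter support simply plugs in the no-op. */
static const MiniDispatch classic_dispatch = { "classic", noop_inq_var_filter };

/* Front-end call: always goes through the table, whatever the format. */
static int inq_var_filter(const MiniDispatch *d, int varid,
                          unsigned int *idp, size_t *nparamsp)
{
    assert(d != NULL && d->inq_var_filter != NULL);
    return d->inq_var_filter(varid, idp, nparamsp);
}
```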

This is an extension of Ed's work to do this for
chunking and deflate and szip. See PRs
https://github.com/Unidata/netcdf-c/pull/1697
and
https://github.com/Unidata/netcdf-c/pull/1692

As a side effect, elide libdispatch/dnotnc3.c since
it is no longer used.
2020-04-15 14:44:58 -06:00
Edward Hartnett
9ac441ad6a cleanup 2020-04-15 05:53:59 -06:00
Dennis Heimbigner
313121a229 Use proper CURLOPT values for VERIFYHOST and VERIFYPEER
re: https://github.com/Unidata/netcdf-c/issues/1684
re: e-support VZL-904142

Two issues:
1. As of libcurl 7.66, the semantics of CURLOPT_SSL_VERIFYHOST
   changed so that non-zero values affect certificate processing.
2. The current library was forcing the values of VERIFYPEER
   and VERIFYHOST to zero instead of leaving them to the default values.

Solution was first to leave the defaults in place for VERIFYPEER and VERIFYHOST
as long as they are not set in .ocrc/.dodsrc file.
Second, the value of HTTP.SSL.VERIFYPEER or HTTP.SSL.VERIFYHOST
as set in .ocrc/.dodsrc is used to set the corresponding CURLOPT flags.
So for example, adding
> HTTP.SSL.VERIFYHOST=2
will set the value of CURLOPT_SSL_VERIFYHOST to 2, the default.
Using
> HTTP.SSL.VERIFYHOST=0
will set the value of CURLOPT_SSL_VERIFYHOST to 0, which disables it.
Similarly for VERIFYPEER.

Finally the semantics of HTTP.SSL.VALIDATE is now equivalent to
> HTTP.SSL.VERIFYPEER=1
> HTTP.SSL.VERIFYHOST=2
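
The rules above can be sketched without libcurl as follows. `SslOpts`, `UNSET` and `apply_rc_line` are hypothetical names, and treating any non-zero HTTP.SSL.VALIDATE value as "enable both" is an assumption; the real parsing lives in the oc2 rc-file code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define UNSET (-1L) /* sentinel: leave the libcurl default in place */

/* Hypothetical holder for the two settings; not an oc2 structure. */
typedef struct {
    long verifypeer;
    long verifyhost;
} SslOpts;

/* True if the first klen characters of line are exactly key. */
static int key_is(const char *line, size_t klen, const char *key)
{
    return klen == strlen(key) && strncmp(line, key, klen) == 0;
}

/* Apply one "KEY=VALUE" rc line to opts; unknown keys are ignored. */
static void apply_rc_line(SslOpts *opts, const char *line)
{
    assert(opts != NULL && line != NULL);
    const char *eq = strchr(line, '=');
    if (eq == NULL) return;
    size_t klen = (size_t)(eq - line);
    long value = strtol(eq + 1, NULL, 10);

    if (key_is(line, klen, "HTTP.SSL.VERIFYPEER")) {
        opts->verifypeer = value;
    } else if (key_is(line, klen, "HTTP.SSL.VERIFYHOST")) {
        opts->verifyhost = value;
    } else if (key_is(line, klen, "HTTP.SSL.VALIDATE") && value != 0) {
        /* Assumed: VALIDATE behaves like VERIFYPEER=1 plus VERIFYHOST=2. */
        opts->verifypeer = 1;
        opts->verifyhost = 2;
    }
}
```

A caller would start from `{ UNSET, UNSET }` and pass a field to `curl_easy_setopt` only when it is no longer `UNSET`, so that untouched options keep the libcurl defaults.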
2020-04-10 13:42:27 -06:00
Edward Hartnett
b76a0c8521 documentation improvements 2020-04-08 09:12:19 -06:00
Edward Hartnett
7366edb43f documentation improvements 2020-04-08 09:10:42 -06:00
Edward Hartnett
58e5d53e96 documentation improvements 2020-04-08 09:09:46 -06:00
Edward Hartnett
41ea23a8ac
Merge branch 'master' into ejh_fix_nc3_deflate 2020-04-08 08:54:50 -06:00
Edward Hartnett
1c189b2c56 dealing with nc_inq_var_szip(), testing, and release notes 2020-04-08 08:49:04 -06:00
Edward Hartnett
aab2f998b3 now testing that nc_inq_var_deflate() works for all formats and returns 0 deflate and deflate_level 2020-04-08 08:31:53 -06:00
Dennis Heimbigner
6f86660da8 Fix missing forward declarations
re: issue https://github.com/Unidata/netcdf-c/issues/1687

Static functions are used before they are declared, which causes
errors. This only occurs when BIG_ENDIAN is defined.
Solution is to add the forward declarations.
2020-04-03 20:15:34 -06:00
Edward Hartnett
9b6215936b updated documentation of nc_inq_var_deflate() to describe behavior of deflate_level when deflate not in use 2020-03-17 10:33:53 -06:00
Edward Hartnett
edea5e3552 now pass 0 for deflate_level if deflate not in use 2020-03-16 11:01:13 -06:00
Dennis Heimbigner
1bce6b9b5c Fix open/create of UTF8 names
re: issue https://github.com/Unidata/netcdf-c/issues/1666

The code in NC_open and NC_create (in dfile.c)
was improperly testing for leading whitespace chars.
It was treating UTF-8 as whitespace.

Fix is to do tests using unsigned char.
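
A small sketch of why the unsigned comparison matters. The helper name `skip_leading_whitespace` and the "anything up to and including space" test are assumptions for illustration, not a copy of the code in dfile.c.

```c
#include <assert.h>
#include <stddef.h>

/* Skip leading blanks and control characters in a path.
   The comparison is done on unsigned char: where plain char is signed,
   UTF-8 bytes of 0x80 and above are negative, compare as "less than or
   equal to space", and would be skipped as if they were whitespace. */
static const char *skip_leading_whitespace(const char *path)
{
    assert(path != NULL);
    const unsigned char *p = (const unsigned char *)path;
    while (*p != '\0' && *p <= ' ')
        p++;
    return (const char *)p;
}
```

For a name that starts with two spaces followed by the UTF-8 bytes 0xC3 0xA9, the function stops at the 0xC3 byte rather than running past it.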
2020-03-11 11:25:57 -06:00
Edward Hartnett
7004bbc2d5 updated documentation 2020-03-06 09:54:26 -07:00
Edward Hartnett
d5aba68cec updated docs for nc_def_var_chunking WRT scalars 2020-03-02 16:41:01 -07:00
Edward Hartnett
ba0491bb40 documentation improvements for nc_var_par_access() 2020-03-02 16:36:56 -07:00
Dennis Heimbigner
b488c272d5 Fix conflicts with master 2020-02-27 14:06:45 -07:00
Dennis Heimbigner
44d0dcaad2 Add support for multiple filters per variable.
re: https://github.com/Unidata/netcdf-c/issues/1584

Support has been added for multiple filters per variable.  This
affects a number of components in netcdf. The new APIs are
documented in NUG/filters.md.

The primary changes are:
* A set of new functions are provided (see __include/netcdf_filter.h__).
    - Obtain a list of the filters associated with a variable
    - Obtain the parameters for a specific filter.
* The existing __nc_inq_var_filter__ function now returns info
  about the first defined filter.
* The utilities (ncgen, ncdump, and nccopy) now support
  an extended format for specifying a sequence of filters.
  The general form is __<filter>|<filter>...__.
* The ncdump **_Filter** attribute now dumps a list of all the
  filters associated with a variable using the above new format.
* Filter specifications can now use a filter name instead of number
  for filters known to the netcdf library, which in turn is taken
  from the HDF5 filter registration page.
* New errors are defined: NC_EFILTER and NC_ENOFILTER. The latter
  is returned if an attempt is made to access an unknown filter.
* Internally, the dispatch table has been extended to add a function
  to handle all of the filter functions.
* New, filter-related, tests were added to nc_test4.
* A new plugin was added to the plugins directory to help with testing.
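
To make the `<filter>|<filter>...` form concrete, here is a minimal sketch that splits such a specification at the `|` separators. The function name `split_filter_spec` is hypothetical, and the "id,param" shape of each piece in the usage note is an assumption; NUG/filters.md is the authority on the exact syntax.

```c
#include <assert.h>
#include <string.h>

/* Split a "<filter>|<filter>..." specification in place.
   Each '|' is overwritten with NUL and pointers to the pieces are stored
   in out[]. Returns the number of pieces, or -1 if there are more than max. */
static int split_filter_spec(char *spec, char **out, int max)
{
    assert(spec != NULL && out != NULL);
    int n = 0;
    char *p = spec;
    for (;;) {
        if (n == max) return -1;
        out[n++] = p;
        char *bar = strchr(p, '|');
        if (bar == NULL) break;
        *bar = '\0';
        p = bar + 1;
    }
    return n;
}
```

Given a writable buffer holding `"32768,2|307,9"` (made-up ids and parameters), the function yields two pieces, one per filter, in the order in which they were written.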

Notes:
1. The shuffle and fletcher32 filters are not part of the multifilter system.

Misc. changes:
1. A debug module was added to libhdf5 to help catch error locations.
2020-02-16 12:59:33 -07:00
Edward Hartnett
c0d9c6237d added more documentation to nc_def_var_filter() 2020-02-09 17:59:41 -07:00
Edward Hartnett
8057a552ef move nc_def_var_szip function so it will appear in the documentation 2020-02-07 09:09:01 -07:00
Edward Hartnett
cbc2677094 updated documentation 2020-02-07 05:02:59 -07:00
Edward Hartnett
fb2a1048bb documentation improvements for nc_inq_var_szip() 2020-02-06 07:42:53 -07:00
Edward Hartnett
d5859e91b7 now return 0 for parameters to nc_inq_var_szip if szip is not turned on for var 2020-02-06 07:35:07 -07:00
Edward Hartnett
a6fbc3eea2 fix problem with pre-enddef call to nc_inq_var_szip() 2020-02-04 07:11:44 -07:00
Edward Hartnett
626f40843c more documentation for nc_inq_var_szip 2020-02-04 05:41:20 -07:00
Ward Fisher
8771d0bdf4
Merge pull request #1582 from NOAA-GSD/ejh_parallel_zlib
Allow user to turn on zlib, shuffle, and/or fletcher32 filters with parallel I/O for HDF5-1.10.2+
2020-01-13 16:06:51 -07:00
Dennis Heimbigner
748d26c114 Add support for CURLOPT_CONNECTTIMEOUT
I see that there is no way to set CURLOPT_CONNECTTIMEOUT,
but there is support for CURLOPT_TIMEOUT.
So, accept the line 'HTTP.CONNECTTIMEOUT'
in .rc file to allow user to set CURLOPT_CONNECTTIMEOUT.
2020-01-09 11:48:04 -07:00
Dennis Heimbigner
f587654670 Make the dap4 code resistant to various server errors.
Some versions of some servers are returning malformed responses.
Make the library either handle them or gracefully fail.
The three server errors "fixed" here are as follows.
1. The attribute _NCProperties sometimes has a trailing nul character
   in its value. Soln is to elide the nul(s).
2. Sometimes a DAP response has no data part, only a DMR.
   Soln is to detect and return an error code instead of crashing.
3. Sometimes a server returns a redirection, but our current
   openmagic() function was not following the redirect. Soln
   is to follow redirects.
Also because of #2, I am temporarily making --disable-dap-remote-tests
be the default.
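
The first fix, eliding trailing nul characters from an attribute value, might look roughly like the following. `elide_trailing_nuls` is a hypothetical helper, not the function used in libdap4.

```c
#include <assert.h>
#include <stddef.h>

/* Return the length of value[0..len) once trailing NUL bytes are dropped. */
static size_t elide_trailing_nuls(const char *value, size_t len)
{
    assert(value != NULL || len == 0);
    while (len > 0 && value[len - 1] == '\0')
        len--;
    return len;
}
```

Working on an explicit length rather than on `strlen` matters here, because the extra nul bytes are part of the counted value that the server sent.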
2020-01-08 15:18:31 -07:00
Ward Fisher
438119dd69
Merge pull request #1560 from NOAA-GSD/ejh_cache_docs
increase default cache size for netCDF-4/HDF5 files, also improve cache docs and add benchmarking program
2020-01-07 11:46:30 -07:00
Ward Fisher
fb062f4406 Correct a cmake linking error discovered when working in a mips qemu environment. 2020-01-02 12:57:59 -05:00
Edward Hartnett
995cfdad96 merged master 2019-12-20 11:16:11 -07:00
Edward Hartnett
accb83a8b5 even more documentation updates 2019-12-20 07:20:02 -07:00
Edward Hartnett
8681b0d241 more documentation 2019-12-20 07:10:13 -07:00
Edward Hartnett
2136063d69 better documentation 2019-12-20 07:05:23 -07:00
Edward Hartnett
6952eb779b documentation updates 2019-12-20 05:36:09 -07:00
Ward Fisher
6c75e97764
Merge pull request #1570 from NOAA-GSD/ejh_compact
enable compact storage for netcdf-4 vars
2019-12-19 16:47:05 -07:00
James Sharpe
c5d1e4bdec Call find_package(MPI) to locate MPI paths and link to libdispatch if required 2019-12-18 16:48:40 +00:00
Edward Hartnett
19fef32a9e better documentation for compact storage 2019-12-16 09:52:59 -07:00
Edward Hartnett
e43a5d952c updated docs for NC_COMPACT 2019-12-04 09:16:33 -07:00
Edward Hartnett
64d821b568 removed non-relaxed coord bounds from test code 2019-11-26 06:20:34 -07:00
Edward Hartnett
5ab7bf7796 now always relax! 2019-11-26 05:36:16 -07:00
Edward Hartnett
2682ffd68d improved docs for cache functions, added libhdf5/hdf5cache.c to Doxyfile.in, added benchmark program for cache settings 2019-11-25 16:33:04 -07:00
Ward Fisher
af8f9ad2cf
Merge pull request #1523 from NetCDF-World-Domination-Council/ejh_udf
User-defined formats must come first in NC_infermodel, plus test
2019-11-15 16:40:07 -07:00
Ward Fisher
e4003be502
Merge pull request #1515 from NetCDF-World-Domination-Council/ejh_att_docs
update for attribute documentation
2019-11-15 16:39:46 -07:00
Ward Fisher
2462cda15e
Merge pull request #1525 from NetCDF-World-Domination-Council/ejh_anon_dims
If HDF5 dataset has multiple anonymous dimensions of the same size, assume they are different dimensions
2019-11-14 16:58:24 -07:00
Constantine Khrulev
91d3a89bdd Fix NC_coord_zero indexing in NCDISPATCH_initialize()
Fixes #1518.
2019-11-14 08:26:33 -09:00
edwardhartnett
b9f57b2b5f now UDF mode flags take priority over NC_NETCDF4 2019-11-13 12:13:33 -07:00
edwardhartnett
0bbe91e438 udf must take priority in NC_infermodel 2019-11-13 12:07:33 -07:00
edwardhartnett
c76dae1c5d added anchors for reading_attributes and writing_attributes, and refs to them, also changed order of files in Doxygen.in 2019-11-08 05:19:51 -07:00
edwardhartnett
09fe16c847 cleanup 2019-11-08 04:47:57 -07:00
edwardhartnett
8b2630913a adding doxygen docs for every att function 2019-11-08 04:45:45 -07:00
edwardhartnett
7919e2c052 fixing documentation for attribute put functions 2019-11-07 12:19:18 -07:00
edwardhartnett
42df9b09e5 fixing documentation for attribute put functions 2019-11-07 11:58:41 -07:00
edwardhartnett
3ecef5e7f0 fixing documentation for attribute get functions 2019-11-07 11:47:27 -07:00
edwardhartnett
209da6563c greater distinction between netCDF-4 and classic formats in attribute documentation 2019-11-07 11:32:57 -07:00
edwardhartnett
f46679c8cc cleanup and minor fixes for attribute rename/delete functions 2019-11-07 09:53:43 -07:00
edwardhartnett
ed8ef60855 cleanup and minor fixes for attribute inq functions 2019-11-07 09:46:23 -07:00
edwardhartnett
d961f7b76e cleanup of documentation format in attributes write code 2019-11-07 09:31:04 -07:00
edwardhartnett
837dccd217 changed format to match other docs, fixed file documentation 2019-11-07 07:09:54 -07:00
Ward Fisher
f77b96b066 Fixed a potential null/garbage free. 2019-10-24 16:37:52 -06:00
Ward Fisher
36ccecf053 Addressing a potential null argument to strlen 2019-10-24 14:28:39 -06:00
Greg Sjaardema
5ecad63c6a
Remove incorrect comment
The comment states that prefix must end in '/', but the '/' is added in the function itself, so the prefix should *not* end in '/' and the comment is incorrect.
2019-10-16 08:40:58 -06:00
Ward Fisher
d001ec8590 Removing a problematic const causing issues on OSX. 2019-10-09 17:18:48 -06:00
Dennis Heimbigner
f1506d552e Change (again), and hopefully simplify, the file model inference algorithm.
* For URL paths, the new approach essentially centralizes all information
  in the URL into the "#mode=" fragment key and uses that value
  to determine the dispatcher for (most) URLs.

* The new approach has the following steps:

  1. canonicalize the path if it is a URL.
  2. use the mode= fragment key to determine the dispatcher
  3. if dispatcher still not determined, then use the mode flags
     argument to nc_open/nc_create to determine the dispatcher.
  4. if the path points to something readable, attempt to read the
     magic number at the front, and use that to determine the dispatcher.
     this case may override all previous cases.

* Misc changes.

  1. Update documentation
  2. Moved some unit tests from libdispatch to unit_test directory.
  3. Fixed use of wrong #ifdef macro in test_filter_reg.c
     [I think this may fix a previously reported esupport query].
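
Step 2 of the approach above relies on the "mode=" key in the URL fragment. A minimal sketch of pulling that value out of a URL is shown below; `get_mode` is a hypothetical helper and the use of '&' to separate fragment pairs is an assumption, so this should not be read as the logic in libdispatch/dinfermodel.c.

```c
#include <assert.h>
#include <string.h>

/* Copy the value of the "mode" key from a URL fragment into buf.
   Fragment pairs are assumed to be separated by '&'.
   Returns 1 if found, 0 otherwise (including when buf is too small). */
static int get_mode(const char *url, char *buf, size_t buflen)
{
    assert(url != NULL && buf != NULL);
    const char *p = strchr(url, '#');
    while (p != NULL) {
        p++; /* skip the '#' or '&' */
        if (strncmp(p, "mode=", 5) == 0) {
            const char *val = p + 5;
            size_t n = strcspn(val, "&");
            if (n + 1 > buflen) return 0;
            memcpy(buf, val, n);
            buf[n] = '\0';
            return 1;
        }
        p = strchr(p, '&');
    }
    return 0;
}
```

The returned string would then be mapped to a dispatcher; when no mode key is present, the later steps (mode flags, then the magic number) would take over.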
2019-09-29 12:59:28 -06:00
Ward Fisher
e7cc899264 Merge branch 'ejh_try2' of https://github.com/NetCDF-World-Domination-Council/netcdf-c into gh1487.wif 2019-09-20 14:04:56 -06:00
Greg Sjaardema
56c0d5cf8a Spelling fixes 2019-09-18 08:03:01 -06:00
edwardhartnett
2cd228bcd4 porting changes from other PR 2019-09-16 11:28:18 -06:00
edwardhartnett
e4ef7b1a65 more unit tests, this time for nc4internal.c 2019-08-21 04:46:00 -06:00
Ed Hartnett
bce3fa6169
Merge branch 'master' into ejh_next 2019-08-16 03:42:32 -06:00
edwardhartnett
94f1a89a40 final removal 2019-08-15 07:05:10 -06:00
edwardhartnett
c7c2892de5 clean up 2019-08-15 06:53:57 -06:00
edwardhartnett
2077729abc removed base_pe functions from dispatch table 2019-08-15 06:51:06 -06:00
edwardhartnett
60f436e7ee starting to remove obsolete _CRAYMPP macros 2019-08-14 06:13:45 -06:00
edwardhartnett
dce6f32a76 documentation 2019-08-13 14:57:43 -06:00
edwardhartnett
f007523826 fixed missing dependency in unit_test Makefile.am 2019-08-13 11:06:06 -06:00
edwardhartnett
978707c319 only run slow nclist test if --enable-large-file-tests is used 2019-08-13 10:55:44 -06:00
edwardhartnett
88077fe26e more comments 2019-08-13 06:31:06 -06:00
edwardhartnett
821b749186 removed unnecessary checking in find_in_NCList() 2019-08-13 06:03:48 -06:00
edwardhartnett
d76114aab3 more testing, sorting out some memory issues in test 2019-08-13 05:45:03 -06:00
edwardhartnett
8b8ece4f4b more testing of nclistmgr.c 2019-08-09 13:49:52 -06:00
edwardhartnett
f20db2e024 more documentation 2019-08-09 11:13:55 -06:00
edwardhartnett
d558873c93 documented functions in nclistmgr.c 2019-08-09 09:15:59 -06:00
edwardhartnett
916802bf4c starting to add doxygen docs for nclistmgr.c 2019-08-09 08:48:28 -06:00
edwardhartnett
fb32957b2b whitespace cleanup 2019-08-09 08:45:39 -06:00
edwardhartnett
3a9207d55c more changes for user-defined formats 2019-08-03 18:33:43 -06:00
edwardhartnett
83c6cd58a7 more changes in support of user-defined formats 2019-08-03 17:19:13 -06:00
edwardhartnett
170c5b0901 removed NC from open in dispatch table 2019-08-01 14:30:20 -06:00
Dennis Heimbigner
4c92fc3405 Remove netcdf-4 conditional on the dispatch table.
Partially address: https://github.com/Unidata/netcdf-c/issues/1056

Currently, some of the entries in the dispatch table
are conditional'd on USE_NETCDF4.

As a step in upgrading the dispatch table for use
with user-defined tables, we remove that conditional.
This means that all dispatch tables must implement the
netcdf-4 specific functions even if only to make them
return NC_ENOTNC4. To simplify this, a set of default
functions are defined in libdispatch/dnotnc4.c to provide this
behavior. The file libdispatch/dnotnc3.c is also relevant to
this.

The primary fix is to modify the various dispatch tables to
remove the conditional and use the functions in
libdispatch/dnotnc4.c as appropriate. In practice, all of the
existing tables are prepared to handle this, so the only
real change is to remove the conditionals.
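
The idea of an unconditional table entry backed by a default function can be sketched as follows. `TinyDispatch`, `notnc4_def_grp` and `ERR_NOTNC4` are hypothetical stand-ins (the last one for NC_ENOTNC4); the real defaults are the functions in libdispatch/dnotnc4.c.

```c
#include <assert.h>
#include <stddef.h>

#define ERR_NOTNC4 (-111) /* stand-in for NC_ENOTNC4, value for illustration */

/* Hypothetical one-entry table: the entry exists for every format,
   with no preprocessor conditional around it. */
typedef struct {
    int (*def_grp)(int parent, const char *name, int *new_id);
} TinyDispatch;

/* Default for formats that lack groups: always report "not netCDF-4". */
static int notnc4_def_grp(int parent, const char *name, int *new_id)
{
    (void)parent; (void)name; (void)new_id;
    return ERR_NOTNC4;
}

static const TinyDispatch classic_table = { notnc4_def_grp };

/* Front-end call: no NULL-entry special case is needed any more. */
static int def_grp(const TinyDispatch *d, int parent,
                   const char *name, int *new_id)
{
    assert(d != NULL && d->def_grp != NULL);
    return d->def_grp(parent, name, new_id);
}
```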

Misc. Unrelated fixes
1. Fix some annoying warnings in ncvalidator.

Notes:
1. This has not been tested with either pnetcdf or hdf4 enabled.
   When those are enabled, it is possible that there are still
   some conditionals that need to be fixed.
2019-07-20 13:59:40 -06:00
Dennis Heimbigner
000f22b12a Fix encoding of a DAP2 constraint specified outside the URL.
re: github issue #1425

The 'ncdump -v' command causes a constraint to be sent
to the opendap code (in libdap2). This is a separate path
from specifying the constraint via a URL.

This separate path encoded its constraint using code independent
of and duplicative of that provided by ncuri.c and this duplicate
code did not properly encode the constraint, which might include
square brackets.

Solution chosen here was to get rid of the duplicate code and
ensure that all URL escaping is performed in the ncuribuild function
in the ncuri.c file.

Also removed the use of the NEWESCAPE conditional in ncuri.c
because it is no longer needed.
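
For illustration, a constraint containing square brackets could be percent-encoded along these lines. `escape_constraint` is a hypothetical helper and its pass-through character set is an assumption; the actual escaping rules are those implemented by ncuribuild in ncuri.c.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Percent-encode src into dst. Letters, digits and a few punctuation
   characters pass through; everything else, including '[' and ']',
   becomes %XX. The pass-through set is an assumption for illustration.
   Returns 0 on success, -1 if dst is too small. */
static int escape_constraint(const char *src, char *dst, size_t dstlen)
{
    static const char hex[] = "0123456789ABCDEF";
    assert(src != NULL && dst != NULL);
    size_t o = 0;
    for (const unsigned char *p = (const unsigned char *)src; *p; p++) {
        if (isalnum(*p) || strchr("-_.~,:=&", *p) != NULL) {
            if (o + 1 >= dstlen) return -1; /* keep room for the NUL */
            dst[o++] = (char)*p;
        } else {
            if (o + 3 >= dstlen) return -1;
            dst[o++] = '%';
            dst[o++] = hex[*p >> 4];
            dst[o++] = hex[*p & 0x0F];
        }
    }
    if (dstlen == 0) return -1;
    dst[o] = '\0';
    return 0;
}
```

Keeping a single routine like this on both paths (URL-supplied and `ncdump -v`-supplied constraints) is what avoids the kind of divergence this commit removes.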
2019-07-14 15:56:29 -06:00
Ward Fisher
9db0e26b80
Merge pull request #1432 from NetCDF-World-Domination-Council/ejh_dispatch
create header netcdf_dispatch.h
2019-07-09 12:58:28 -06:00
Ward Fisher
281ac5ff15
Merge pull request #1431 from NetCDF-World-Domination-Council/ejh_remove_macro
remove unused macro USE_REFCOUNT
2019-07-09 12:58:17 -06:00
Ward Fisher
8b1c4e3ff8
Merge pull request #1410 from Unidata/ansifix2.dmh
Fix ncconfigure.h to solve a -ansi problem with strdup()
2019-07-09 12:57:31 -06:00
Ed Hartnett
d408006d06 handle UDF formats on NC_create() 2019-07-05 13:39:50 -06:00
Ed Hartnett
620f17d5ef finished removing refcount from dfile.c 2019-07-04 15:46:15 -06:00
Ed Hartnett
f6ea863011 finished removing refcount from dfile.c 2019-07-04 15:45:49 -06:00