Commit Graph

53 Commits

Author SHA1 Message Date
Dennis Heimbigner
231ae96c4b Add support for Zarr string type to NCZarr
* re: https://github.com/Unidata/netcdf-c/pull/2278
* re: https://github.com/Unidata/netcdf-c/issues/2485
* re: https://github.com/Unidata/netcdf-c/issues/2474

This PR subsumes PR https://github.com/Unidata/netcdf-c/pull/2278.
Actually is a bit an omnibus covering several issues.

## PR https://github.com/Unidata/netcdf-c/pull/2278
Add support for the Zarr string type.
Zarr strings are restricted currently to be of fixed size.
The primary issue to be addressed is to provide a way for user to
specify the size of the fixed length strings. This is handled by providing
the following new attributes special:
1. **_nczarr_default_maxstrlen** —
This is an attribute of the root group. It specifies the default
maximum string length for string types. If not specified, then
it has the value of 64 characters.
2. **_nczarr_maxstrlen** —
This is a per-variable attribute. It specifies the maximum
string length for the string type associated with the variable.
If not specified, then it is assigned the value of
**_nczarr_default_maxstrlen**.

This PR also requires some hacking to handle the existing netcdf-c NC_CHAR
type, which does not exist in zarr. The goal was to choose numpy types for
both the netcdf-c NC_STRING type and the netcdf-c NC_CHAR type such that
if a pure zarr implementation read them, it would still work and an
NC_CHAR type would be handled by zarr as a string of length 1.

For writing variables and NCZarr attributes, the type mapping is as follows:
* "|S1" for NC_CHAR.
* ">S1" for NC_STRING && MAXSTRLEN==1
* ">Sn" for NC_STRING && MAXSTRLEN==n

Note that it is a bit of a hack to use endianness, but it should be ok since for
string/char, the endianness has no meaning.

For reading attributes with pure zarr (i.e. with no nczarr
atribute types defined), they will always be interpreted as of
type NC_CHAR.

## Issue: https://github.com/Unidata/netcdf-c/issues/2474
This PR partly fixes this issue because it provided more
comprehensive support for Zarr attributes that are JSON valued expressions.
This PR still does not address the problem in that issue where the
_ARRAY_DIMENSION attribute is incorrectly set. Than can only be
fixed by the creator of the datasets.

## Issue: https://github.com/Unidata/netcdf-c/issues/2485
This PR also fixes the scalar failure shown in this issue.
It generally cleans up scalar handling.
It also adds a note to the documentation describing that
NCZarr supports scalars while Zarr does not and also how
scalar interoperability is achieved.

## Misc. Other Changes
1. Convert the nczarr special attributes and keys to be all lower case. So "_NCZARR_ATTR" now used "_nczarr_attr. Support back compatibility for the upper case names.
2. Cleanup my too-clever-by-half handling of scalars in libnczarr.
2022-08-27 20:21:13 -06:00
Dennis Heimbigner
11fe00ea05 Add filter support to NCZarr
Filter support has three goals:

1. Use the existing HDF5 filter implementations,
2. Allow filter metadata to be stored in the NumCodecs metadata format used by Zarr,
3. Allow filters to be used even when HDF5 is disabled

Detailed usage directions are define in docs/filters.md.

For now, the existing filter API is left in place. So filters
are defined using ''nc_def_var_filter'' using the HDF5 style
where the id and parameters are unsigned integers.

This is a big change since filters affect many parts of the code.

In the following, the terms "compressor" and "filter" and "codec" are generally
used synonomously.

### Filter-Related Changes:
* In order to support dynamic loading of shared filter libraries, a new library was added in the libncpoco directory; it helps to isolate dynamic loading across multiple platforms.
* Provide a json parsing library for use by plugins; this is created by merging libdispatch/ncjson.c with include/ncjson.h.
* Add a new _Codecs attribute to allow clients to see what codecs are being used; let ncdump -s print it out.
* Provide special headers to help support compilation of HDF5 filters when HDF5 is not enabled: netcdf_filter_hdf5_build.h and netcdf_filter_build.h.
* Add a number of new test to test the new nczarr filters.
* Let ncgen parse _Codecs attribute, although it is ignored.

### Plugin directory changes:
* Add support for the Blosc compressor; this is essential because it is the most common compressor used in Zarr datasets. This also necessitated adding a CMake FindBlosc.cmake file
* Add NCZarr support for the big-four filters provided by HDF5: shuffle, fletcher32, deflate (zlib), and szip
* Add a Codec defaulter (see docs/filters.md) for the big four filters.
* Make plugins work with windows by properly adding __declspec declaration.

### Misc. Non-Filter Changes
* Replace most uses of USE_NETCDF4 (deprecated) with USE_HDF5.
* Improve support for caching
* More fixes for path conversion code
* Fix misc. memory leaks
* Add new utility -- ncdump/ncpathcvt -- that does more or less the same thing as cygpath.
* Add a number of new test to test the non-filter fixes.
* Update the parsers
* Convert most instances of '#ifdef _MSC_VER' to '#ifdef _WIN32'
2021-09-02 17:04:26 -06:00
Ward Fisher
9e6b6bcd6b Correct logic error. 2021-08-31 09:51:19 -06:00
Ward Fisher
5d5a7c4f6e Correct an issue with appveyor builds, specific to systems where a bash shell is not available. 2021-08-31 09:27:51 -06:00
Ward Fisher
e85d95702e Attempt at test orchestration to avoid a race condition when running tests in parallel. 2021-08-25 13:33:49 -06:00
Ward Fisher
ad11c54a79 Added a temporary fix in support of https://github.com/Unidata/netcdf-c/issues/2018. Also, fixed a missing reference file when using cmake-based build systems. 2021-06-09 15:55:11 -06:00
Ward Fisher
b0f85f41df
Merge branch 'master' into authfix.dmh 2021-06-01 14:09:43 -06:00
Dennis Heimbigner
e632d02041 Re-enable DAP2 authorization tests
The thredds-test server now has some password protected datasets
that can be used to test DAP2 authorization support.
The general location is
````
https://thredds.ucar.edu/thredds/tdscapabilities/authTest.html
````
and specifically:
````
https://thredds.ucar.edu/thredds/dodsC/test3/testData.nc.html
````

This PR replaces old testcases with ncdap_test/testauth.sh.
This testcase allows us to test use of the .dodsrc file and .netrc file
and embedded user+pwd.

As part of this, I had to create a program (ncdap_test/pathcvt.c)
that is essentially the equivalent to cygpath. Given a path in
windows, unix, msys or cygwin format, it converts it to the
equivalent format in one of those four cases.  So it can be used
to convert a cygwin path to a windows path, for example. This is
needed in testpathcvt and testauth to make sure that the paths
in .daprc (e.g. the reference to .netrc) are of the proper
format.

Misc. Other Changes:
1. Fix some memory leaks in libdap2
2. Setting the env variable CURLOPT_VERBOSE allows tracking of curl
   operations.
3. Make tst_charvlenbug be conditional on NC_VLEN_NOTEST.
2021-05-29 21:30:33 -06:00
Dennis Heimbigner
6901206927 Regularize the semantics of mkstemp.
re: https://github.com/Unidata/netcdf-c/issues/1827

The issue is partly resolved by this PR. The proximate problem appears to be that the semantics of mkstemp in **nix is different than the semantics of _mktemp_s in Windows. I had thought they were the same but that is incorrect. The _mktemp_s function will only produce 26 different files and so the netcdf temp file code will fail after about that many iterations.

So, to solve this, I created my own version of mkstemp for windows that uses a random number generator. This appears to solve the reported issue.  I also added the testcase ncdap_test/test_manyurls but made it conditional on --enable-dap-long-tests because it is very slow.

I did note that the provided test program now fails after some 800 iterations with a libcurl error claiming it cannot resolve the host name. My belief is that the library is just running out of resources at this point: too many open curl handles or some such. I doubt if this failure is fixable.

So bottom line is that it is really important to do nc_close when you are finished with a file.

Misc. Other Changes:

1. I took the opportunity to clean up some bad string hacks in the code. Specifically
    * change all uses of strncat to strlcat
    * remove old string hacks: occoncat and occopycat
2. Add heck to see if test.opendap.org is running and if not, then skip test
3. Make CYGWIN use TEMP environment variable
2021-05-14 11:33:03 -06:00
Dennis Heimbigner
90fd1406bc Make use of clock_gettime be conditional.
Re: GH Issue https://github.com/Unidata/netcdf-c/issues/1900

Apparently the clock_gettime() function is not always available.
It is used in unit_test/tst_exhash.c and unit_test/tst_xcache.c.

To solve this, a number of things were changed:
* Move the timing code to a new file unit_tests/timer_utils.[ch]
* Modify the timing code to choose one of several timing methods
depending on availability. The prioritized order is as follows:
    1. If Windows, use the QueryPerformanceCounter mechanism else
    2. Use clock_gettime if available else
    3. Use gettimeofday if available else
    4. Use getrusage if available

Note that the resolution of 3 and 4 is less than 1 or 2.

Misc. Other Changes:
* Move the test in CMakeLists.txt that disables unit tests for WIN32 to unit_test/CMakeLists.txt since some unit tests actually work under Visual Studio.
* Fix some of the unit tests to work under visual studio
* Fix problem with using remove() in zmap_nzf.c
* Remove some warning about use of EXTERNL
2020-12-06 18:19:53 -07:00
Ward Fisher
bfa8fde35e Temporarily disabled encoding test under cmake 2020-12-04 15:49:30 -07:00
Dennis Heimbigner
6d3546b8a0 Add encode= tests 2020-11-12 16:06:01 -07:00
Dennis Heimbigner
59e04ae071 This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".

The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.

More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).

WARNING: this code has had limited testing, so do use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:

Platform | Build System | S3 support
------------------------------------
Linux+gcc      | Automake     | yes
Linux+gcc      | CMake        | yes
Visual Studio  | CMake        | no

Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future.  Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.

In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*.  The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
   and the version bumped.
4. An overly complex set of structs was created to support funnelling
   all of the filterx operations thru a single dispatch
   "filter_actions" entry.
5. Move common code to from libhdf5 to libsrc4 so that it is accessible
   to nczarr.

Changes directly related to Zarr:
1. Modified CMakeList.txt and configure.ac to support both C and C++
   -- this is in support of S3 support via the awd-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
   support zarr and to regularize the structure of the fragments
   section of a URL.

Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
   e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
   * Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
   and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.

Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-28 18:02:47 -06:00
Dennis Heimbigner
7223c4a5aa Avoid spurious test failures when servers fail.
re: https://github.com/Unidata/netcdf-c/issues/1451

The situation with the various DAP (and other) remote test
servers is currently in a state of flux.  For example, Unidata
admin is planning to forcibly shift the remote test server to
remotetest.unidata.ucar.edu soon.  In addition, the server
test.opendap.org has shown some recent instability.

The result is that various DAP (and byterange) tests can fail
unexpectedly. This is an irritant to users and reveals nothing
about test sucess or failure.

Solve by modifying tests to report server inaccessibility and
otherwise pretend to succeed.

This puts an onus on Unidata to detect such server failures, but
will not cause users to see spurious failures. [Note. Do similar
fix for netcdf-java]. The check is:
1. export SETX=1 to cause all the shell scripts to trace
2. search the log files for the phrase "WARNING" (in upper case)
and see if it is complaining about not finding a server.

Misc. Changes
-------------
1. Added a pingurl program to see if a server was up.
2. modified some test case url targets
2019-12-31 15:42:58 -07:00
Turing Eret
b633ea97fd Reverted changes to C files. Can't change them as that messes with the
other configuration paths. Tweaked CMakeLists.txt to set the TOPSRCDIR
to the netcdf root.
2019-11-07 15:46:50 -07:00
Turing Eret
84a293351e Changes to make it possible to nest this project inside of another CMake
project.
2019-11-07 14:00:53 -07:00
Dennis Heimbigner
d13619b3e1 Fix additional big-endian machine error in dap4.
Follow on to PR: https://github.com/Unidata/netcdf-c/pull/1302
2019-02-15 11:36:29 -07:00
Ward Fisher
df5a146432 Added test in support of https://github.com/Unidata/netcdf-c/issues/1300 and https://github.com/Unidata/netcdf-c/issues/1301 2019-01-29 15:23:03 -07:00
Ward Fisher
02937d2d0e ncdump, other directories updated with copyright stanza. 2018-12-06 15:36:53 -07:00
Dennis Heimbigner
8072d1f6bb Modify DAP2 and DAP4 to optionally allow Fillvalue/Variable mismatch
re: issue https://github.com/Unidata/netcdf-c/issues/1151

Modify DAP2 and DAP4 code to handle case when _FillValue type is not
same as the parent variable type.

Specifically:
1. Define a parameter [fillmismatch] to allow this mismatch;
   default is to disallow.
2. If allowed, forcibly change the type of the _FillValue to match
   the parent variable.
3. If allowed Convert the values to match new type
4. Generate a log message
5. if not allowed, then fail

Implementing this required some changes to ncdap_test/dapcvt.c
Also added test cases.

Minor Unrelated Changes:
1. There were a number of warnings about e.g.
   assigning a const char* to a char*. Fix these
2. In nccopy.1, replace .NP with .IP "n"
   (re PR https://github.com/Unidata/netcdf-c/pull/1144)
3. fix minor error in ncdump/ocprint
2018-10-01 15:51:43 -06:00
Dennis Heimbigner
d62a9e623c Fix the NC_INMEMORY code to work in all cases with HDF5 1.10.
re: github issue https://github.com/Unidata/netcdf-c/issues/1111

One of the less common use cases for the in-memory feature is
apparently failing with HDF5-1.10.x.  The fix is complicated and
requires significant changes to libhdf5/nc4memcb.c. The current
setup is detailed in the file docs/inmeminternal.dox.

Additionally, it was discovered that the program
nc_test/tst_inmemory.c, which is invoked by
nc_test/run_inmemory.sh, actually was failing because of the
above problem. But the failure is not detected since the script
does not return non-zero value.

Other Changes:
1. Fix nc_test_tst_inmemory to return errors correctly.
2. Make ncdap_tests/findtestserver.c and dap4_tests/findtestserver4.c
   be generated from ncdap_test/findtestserver.c.in.
3. Make LOG() print output to stderr instead of stdout to
   avoid contaminating e.g. ncdump output.
4. Modify the handling of NC_INMEMORY and NC_DISKLESS flags
   to properly handle that NC_DISKLESS => NC_INMEMORY. This
   affects a number of code pieces, especially memio.c.
2018-09-04 11:27:47 -06:00
Dennis Heimbigner
05d078a8b0 The ncdap_tests were a mess, so I decided to clean them up
to remove cruft and to remove unused site tests
and make the tests somewhat more understandable.

Also did a fix to libdispatch/dwinpath to convert
relative paths to absolute paths. This will, I hope,
take care of some windows path problems when using
$srcdir in shell scripts.
2018-03-20 21:31:31 -06:00
Dennis Heimbigner
081963ef8a forgot to add to cmake 2017-07-14 16:44:50 -06:00
Dennis Heimbigner
3db4f013bf Primary change: add dap4 support
Specific changes:
1. Add dap4 code: libdap4 and dap4_test.
   Note that until the d4ts server problem is solved, dap4 is turned off.
2. Modify various files to support dap4 flags:
	configure.ac, Makefile.am, CMakeLists.txt, etc.
3. Add nc_test/test_common.sh. This centralizes
   the handling of the locations of various
   things in the build tree: e.g. where is
   ncgen.exe located. See nc_test/test_common.sh
   for details.
4. Modify .sh files to use test_common.sh
5. Obsolete separate oc2 by moving it to be part of
   netcdf-c. This means replacing code with netcdf-c
   equivalents.
5. Add --with-testserver to configure.ac to allow
   override of the servers to be used for --enable-dap-remote-tests.
6. There were multiple versions of nctypealignment code. Try to
   centralize in libdispatch/doffset.c and include/ncoffsets.h
7. Add a unit test for the ncuri code because of its complexity.
8. Move the findserver code out of libdispatch and into
   a separate, self contained program in ncdap_test and dap4_test.
9. Move the dispatch header files (nc{3,4}dispatch.h) to
   .../include because they are now shared by modules.
10. Revamp the handling of TOPSRCDIR and TOPBUILDDIR for shell scripts.
11. Make use of MREMAP if available
12. Misc. minor changes e.g.
	- #include <config.h> -> #include "config.h"
	- Add some no-install headers to /include
	- extern -> EXTERNL and vice versa as needed
	- misc header cleanup
	- clean up checking for misc. unix vs microsoft functions
13. Change copyright decls in some files to point to LICENSE file.
14. Add notes to RELEASENOTES.md
2017-03-08 17:01:10 -07:00
Dennis Heimbigner
ee2916f60a Originally, libdap2 was also going to provide
a dap2->netcdf-4 translation. That is now
superceded by dap4. But there is some cruft
in the dap2 code around this that should be removed.
2017-02-21 15:13:37 -07:00
Ward Fisher
b477a9c1b3 Corrected a typo resulting in a cmake warning. 2016-09-06 12:41:48 -06:00
Ward Fisher
486f69522c Merged utility-based tests fix into cmake-based builds. 2016-08-31 15:48:10 -06:00
dmh
dbc7215063 1. fix nc_open_mem doxygen description
2. fix some bugs wrt to cygwin vs linux.
3. add to RELEASE_NOTES
4. try to fix tst_inmemory fd error by putting the first
   arg to dotest in quotes.
2015-05-28 19:23:33 -06:00
Ward Fisher
281c30bed0 disabled a remote test that is currently hanging. 2015-05-15 16:08:48 -06:00
Ward Fisher
d06e15579f Moved test for [NCF-330] out of 'known-failing' fenceposts and into the regular test rotation. Also added it to the autotools-based tests. 2015-05-08 14:02:12 -06:00
Ward Fisher
5cc6e915f1 Added test for issue found in [NCF-330]. 2015-05-08 14:02:11 -06:00
Ward Fisher
2d72b7ca7f More work on getting the shell scripts working properly with cmake under MSYS. 2015-02-02 14:46:50 -07:00
dmh
d2a832163b Make cmake run testauth.sh 2014-12-31 22:53:03 -07:00
dmh
20720199c8 1. synch with oc
2. fix ocuri parameter handling
3. add ncdap_test/testuri.sh to test parameter handling.
2014-12-27 20:42:01 -07:00
Ward Fisher
7f812b367e Manual merge of pull request https://github.com/Unidata/netcdf-c/pull/64 contributed by nschloe. Assorted CMake improvements. 2014-06-11 15:51:31 -06:00
Ward Fisher
e968429a0c Cleaned up CMake configuration file a little bit. 2014-04-22 14:51:57 -06:00
Ward Fisher
44fae42214 Cleaned up indentation, white space in multiple CMakeLists.txt files. 2014-04-21 11:15:33 -06:00
Ward Fisher
1480ae77a9 Added test_varm3 back in to the list of active tests, to keep it in step with autotools.
Corrected configure.ac; CURLOPT_RESPONSE_CODE should have been CURLINFO_RESPONSE_CODE.
2014-03-13 10:06:12 -06:00
dmh
a189b98b0b 1. Any test that references nctestserver/NC_findtestserver
should be under ENABLE_DAP_REMOTE_TESTS.
   Fixed to make sure that this is so.
   Also attempted to fix ncdap_test/CMakeLists.txt,
   but probably got it wrong.
   HT to Nico Schlomer.
2. Attempted to reduce the number of conversion errors
   when -Wconversion is set. Fixed oc2, but
   rest of netcdf remains to be done.
   HT to Nico Schlomer.
3. When doing #2, I discovered an error in ncgen.y
   that has remained hidden. This required some other
   test case fixes.
2014-03-08 20:41:30 -07:00
Ward Fisher
c72ac7219e Added ENABLE_DAP_AUTH_TESTS flag to CMakelists files. 2014-03-07 13:58:09 -07:00
dmh
e3545bf88d 1. Allow files to be opened by name multiple
times, but returning the same ncid.
2. Separate out the dap auth tests
   and make them disabled by default.
3. turn of ncdap_test/test_varm3 until
   we can find a copy of coads_climatology.nc
2014-03-07 12:04:38 -07:00
Ward Fisher
16400df5e0 Resolved Conflict when cherry-picking patches. 2014-02-06 11:33:31 -07:00
Ward Fisher
9b0a98383a Added CMakeLists.txt to missing Makefile.am files, so that CMake-based builds can be used on source packages generated by make dist 2014-02-06 11:32:32 -07:00
dmh
dfc3749506 [NCF-282]/ HAD-595112
The code that tests if a path is a url is
faulting when the url does not end in a slash
(e.g. http://thredds-tests.ucar.edu).
The code that tests if a path is a url is
faulting when the url does not end in a slash
(e.g. http://thredds-tests.ucar.edu).
CF-273]/HZY-708311

Solution was to do a null pointer test.
Added a test (tst_misc).
2014-01-20 16:11:45 -07:00
Ward Fisher
0032af978b Added INQ_FORMAT_EXTENDED-related tests to the cmake build system (based on inclusion in autotools-based builds). 2014-01-16 13:14:49 -07:00
dmh
c9bb8c0abf Add http auth test to cmake 2013-11-15 13:44:27 -07:00
Ward Fisher
33d3d06971 Added initial 'make dist', 'make distcheck' support to
CMake-based builds.
2013-06-03 16:42:04 +00:00
Ward Fisher
67f96188ff Merged latest from netCDF-cmake branch in preparation for 4.3.0 release. 2013-04-23 21:50:07 +00:00
Ward Fisher
7a226dd3f1 Merging the win_netcdf branch into the trunk. 2012-09-27 22:50:41 +00:00
Ward Fisher
34efd517f9 2012-09-21 17:37:57 +00:00