Commit Graph

227 Commits

Author SHA1 Message Date
Dennis Heimbigner
1243c3d866 Allow .rc tests to work in parallel by isolation 2021-04-25 22:02:29 -06:00
Dennis Heimbigner
74b40fd788 Upgrade the nczarr code to match Zarr V2
Re: https://github.com/zarr-developers/zarr-python/pull/716

The Zarr version 2 spec has been extended to include the ability
to choose the dimension separator in chunk name keys. The legal
separators has been extended from {'.'} to {'.' '/'}.  So now it
is possible to use a key like "0/1/2/0" for chunk names.

This PR implements this for NCZarr. The V2 spec now says that
this separator can be set on a per-variable basis. For now, I
have chosen to allow this be set only globally by adding a key
named "ZARR.DIMENSION_SEPARATOR=<char>" in the
.daprc/.dodsrc/ncrc file. Currently, the only legal separator
characters are '.' (the default) and '/'. On writing, this key
will only be written if its value is different than the default.
This change caused problems because supporting a separator of '/'
is difficult to parse when keys/paths use '/' as the path separator.
A test case was added for this.

Additionally, make nczarr be enabled default by default. This required
some additional changes so that if zip and/or AWS S3 sdk are unavailable,
then they are disabled for NCZarr.

In addition the following unrelated changes were made.

1. Tested that pure-zarr mode could read an nczarr formatted store.
1. The .rc file handling now merges all known .rc files (.ncrc,.daprc, and .dodsrc) in that order and using those in HOME first, then in current directory. For duplicate entries, the later ones override the earlier ones. This change is to remove some of the conflicts inherent in the current .rc file load process. A set of test cases was also added.
1. Re-order tests in configure.ac and CMakeLists.txt so that if libcurl
   is not found then the other options that depend upon it properly
   are disabled.
1. I decided that xarray support should be enabled by default for pure
   zarr. In order to allow disabling, I added a new mode flag "noxarray".
1. Certain test in nczarr_test depend on use of .dodsrc. In order for these
   to work when testing in parallel, some inter-test dependencies needed to
   be added.
1. Improved authorization testing to use changes in thredds.ucar.edu
2021-04-24 19:48:15 -06:00
Dennis Heimbigner
a27283e0e6 fix test cases 2021-01-07 19:37:03 -07:00
Dennis Heimbigner
3f11e0395d fix tst_fillmismatch.sh 2021-01-07 14:22:45 -07:00
Dennis Heimbigner
d2316f866c Additional Fixes to NCZarr
Primary Fixes:
* Add a whole variable optimization -- used in the rare case that nc_get/put_vara covers the whole of a variable and the variable has a single chunk.
* Fix chunking error when stride causes whole chunks to be skipped.
* Fix some memory leaks
* Add test cases
* Add one performance test to nczarr_test/. This uses the timer utils from unit_test: timer_utils.[ch].
* Move ncdumpchunks utility from ncdump to nczarr_test

Misc. Other Changes:
* Make check for aws libraries conditional on --enable-nczarr-s3
* Remove all but one bm tests from nczarr_test until they are working.
* Remove another dependency on HDF5 from supposedly non-HDF5 specific code; specifically hdf5_log_hdf5.
* Make the BAIL2 macro be hdf5 specific and replace elsewhere with an HDF5 independent equivalent.
* Move hdf5cache.c to libsrc4/nc4cache.c because it is used by nczarr.
* Modify unit_tests so that some of them are run even if using Windows.
* Misc. small bug fixes and refactors and memory leaks.
* Rename some conflicting tests for cmake.
* Attempted to make nc_perf work with cmake and failed.
2020-12-16 20:48:02 -07:00
Dennis Heimbigner
90fd1406bc Make use of clock_gettime be conditional.
Re: GH Issue https://github.com/Unidata/netcdf-c/issues/1900

Apparently the clock_gettime() function is not always available.
It is used in unit_test/tst_exhash.c and unit_test/tst_xcache.c.

To solve this, a number of things were changed:
* Move the timing code to a new file unit_tests/timer_utils.[ch]
* Modify the timing code to choose one of several timing methods
depending on availability. The prioritized order is as follows:
    1. If Windows, use the QueryPerformanceCounter mechanism else
    2. Use clock_gettime if available else
    3. Use gettimeofday if available else
    4. Use getrusage if available

Note that the resolution of 3 and 4 is less than 1 or 2.

Misc. Other Changes:
* Move the test in CMakeLists.txt that disables unit tests for WIN32 to unit_test/CMakeLists.txt since some unit tests actually work under Visual Studio.
* Fix some of the unit tests to work under visual studio
* Fix problem with using remove() in zmap_nzf.c
* Remove some warning about use of EXTERNL
2020-12-06 18:19:53 -07:00
Ward Fisher
4cf290dc0b
Merge pull request #1897 from DennisHeimbigner/oceanunavail.dmh
Disable use of opendap2.oceanbrowser.net
2020-12-04 16:11:47 -07:00
Ward Fisher
bfa8fde35e Temporarily disabled encoding test under cmake 2020-12-04 15:49:30 -07:00
Dennis Heimbigner
6f72cec65e opendap2.oceanbrowser.net is temporarily unavailable 2020-12-02 11:18:54 -07:00
Dennis Heimbigner
4a5e228bfd update file permission 2020-12-01 22:17:07 -07:00
Dennis Heimbigner
6d3546b8a0 Add encode= tests 2020-11-12 16:06:01 -07:00
Ward Fisher
31dee0c4da
Revert "Revert "Fix nczarr-experimental: improve build support, disengage hdf5 vs netcdf4 flags, and find AWS libraries"" 2020-08-17 19:15:47 -06:00
Ward Fisher
16c27ca13f
Revert "Fix nczarr-experimental: improve build support, disengage hdf5 vs netcdf4 flags, and find AWS libraries" 2020-08-17 15:51:01 -06:00
Dennis Heimbigner
d85bb6fe20 The big change for this commit is complete the
disengagement of enable-netcdf4 from enable-hdf5.
That is, with the advent of nczarr, it is possible
to turn off hdf5 but still need netcdf-4 enabled
because nczarr uses libsrc4, but not libhdf5.
This change involves a bunch of things:
1. Modify configure.ac and CMakelist to make enable_hdf5
   control if hdf5 support is provided. For back compatibility,
   disable-netcdf4 is treated as disable-hdf5. But internally,
   netcdf4 support is controlled only by the enabling of formats
   that require it.
2. In support of #1, modify .travis.yml to use enable/disable-hdf5
   instead of enable/disable-netcdf4.
3. test_common.in is modified to track selected features,
   including enable-hdf5 and enable-s3-tests. This is used in
   selected tests that mix netcdf-3 and netcdf4 tests.
4. The conflation of USE_HDF5 and USE_NETCDF4 is common in
   code, tests, and build files, so all of those had to be weeded out.
5. It turns out that some of the NC4_dim functions really are HDF5 specific,
   but are not treated as such. So they are moved from nc4dim.c to
   hdf5dim.c or hdf5dispatch.c
6. Some generic functions in libhdf5 can be (and were) moved to libsrc4.
2020-08-12 15:42:50 -06:00
Dennis Heimbigner
59e04ae071 This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".

The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.

More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).

WARNING: this code has had limited testing, so do use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:

Platform | Build System | S3 support
------------------------------------
Linux+gcc      | Automake     | yes
Linux+gcc      | CMake        | yes
Visual Studio  | CMake        | no

Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future.  Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.

In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*.  The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
   and the version bumped.
4. An overly complex set of structs was created to support funnelling
   all of the filterx operations thru a single dispatch
   "filter_actions" entry.
5. Move common code to from libhdf5 to libsrc4 so that it is accessible
   to nczarr.

Changes directly related to Zarr:
1. Modified CMakeList.txt and configure.ac to support both C and C++
   -- this is in support of S3 support via the awd-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
   support zarr and to regularize the structure of the fragments
   section of a URL.

Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
   e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
   * Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
   and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.

Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-28 18:02:47 -06:00
Dennis Heimbigner
6e715135ba Fix windows \r problem 2020-05-30 20:14:45 -06:00
Dennis Heimbigner
7223c4a5aa Avoid spurious test failures when servers fail.
re: https://github.com/Unidata/netcdf-c/issues/1451

The situation with the various DAP (and other) remote test
servers is currently in a state of flux.  For example, Unidata
admin is planning to forcibly shift the remote test server to
remotetest.unidata.ucar.edu soon.  In addition, the server
test.opendap.org has shown some recent instability.

The result is that various DAP (and byterange) tests can fail
unexpectedly. This is an irritant to users and reveals nothing
about test sucess or failure.

Solve by modifying tests to report server inaccessibility and
otherwise pretend to succeed.

This puts an onus on Unidata to detect such server failures, but
will not cause users to see spurious failures. [Note. Do similar
fix for netcdf-java]. The check is:
1. export SETX=1 to cause all the shell scripts to trace
2. search the log files for the phrase "WARNING" (in upper case)
and see if it is complaining about not finding a server.

Misc. Changes
-------------
1. Added a pingurl program to see if a server was up.
2. modified some test case url targets
2019-12-31 15:42:58 -07:00
Turing Eret
b633ea97fd Reverted changes to C files. Can't change them as that messes with the
other configuration paths. Tweaked CMakeLists.txt to set the TOPSRCDIR
to the netcdf root.
2019-11-07 15:46:50 -07:00
Turing Eret
84a293351e Changes to make it possible to nest this project inside of another CMake
project.
2019-11-07 14:00:53 -07:00
Greg Sjaardema
56c0d5cf8a Spelling fixes 2019-09-18 08:03:01 -06:00
Dennis Heimbigner
6934aa2e8b Thread safety: step 1: cleanup
re: https://github.com/Unidata/netcdf-c/issues/1373 (partial)

* Mark some global constants be const to indicate to make them easier to track.
* Hide direct access to the ncrc_globalstate behind a function call.
* Convert dispatch tables to constants (except the user defined ones)
  This has some consequences in terms of function arguments needing to be marked
  as const also.
* Remove some no longer needed global fields
* Aggregate all the globals in nclog.c
* Uniformly replace nc_sizevector{0,1} with NC_coord_{zero,one}
* Uniformly replace nc_ptrdffvector1 with NC_stride_one
* Remove some obsolete code
2019-03-30 14:06:20 -06:00
Dennis Heimbigner
d13619b3e1 Fix additional big-endian machine error in dap4.
Follow on to PR: https://github.com/Unidata/netcdf-c/pull/1302
2019-02-15 11:36:29 -07:00
Ward Fisher
12724837b0 Fixed a typo. 2019-01-30 10:34:51 -07:00
Ward Fisher
df5a146432 Added test in support of https://github.com/Unidata/netcdf-c/issues/1300 and https://github.com/Unidata/netcdf-c/issues/1301 2019-01-29 15:23:03 -07:00
Ward Fisher
02937d2d0e ncdump, other directories updated with copyright stanza. 2018-12-06 15:36:53 -07:00
Dennis Heimbigner
751300ec59 Fix more memory leaks in netcdf-c library
This is a follow up to PR https://github.com/Unidata/netcdf-c/pull/1173

Sorry that it is so big, but leak suppression can be complex.

This PR fixes all remaining memory leaks -- as determined by
-fsanitize=address, and with the exceptions noted below.

Unfortunately. there remains a significant leak that I cannot
solve. It involves vlens, and it is unclear if the leak is
occurring in the netcdf-c library or the HDF5 library.

I have added a check_PROGRAM to the ncdump directory to show the
problem.  The program is called tst_vlen_demo.c To exercise it,
build the netcdf library with -fsanitize=address enabled. Then
go into ncdump and do a "make clean check".  This should build
tst_vlen_demo without actually executing it.  Then do the
command "./tst_vlen_demo" to see the output of the memory
checker.  Note the the lost malloc is deep in the HDF5 library
(in H5Tvlen.c).

I am temporarily working around this error in the following way.
1. I modified several test scripts to not execute known vlen tests
   that fail as described above.
2. Added an environment variable called NC_VLEN_NOTEST.
   If set, then those specific tests are suppressed.

This should mean that the --disable-utilities option to
./configure should not need to be set to get a memory leak clean
build.  This should allow for detection of any new leaks.

Note: I used an environment variable rather than a ./configure
option to control the vlen tests. This is because it is
temporary (I hope) and because it is a bit tricky for shell
scripts to access ./configure options.

Finally, as before, this only been tested with netcdf-4 and hdf5 support.
2018-11-15 10:00:38 -07:00
Dennis Heimbigner
245961de00 re: github issues
https://github.com/Unidata/netcdf-c/issues/1168
    https://github.com/Unidata/netcdf-c/issues/1163
    https://github.com/Unidata/netcdf-c/issues/1162

This PR partially fixes memory leaks in the netcdf-c library,
in the ncdump utility, and in some test cases.

The netcdf-c library now runs memory clean with the assumption
that the --disable-utilities option is used. The primary remaining
problem is ncgen. Once that is fixed, I believe the netcdf-c library
will run memory clean with no limitations.

Notes
-----------
1. Memory checking was performed using gcc -fsanitize=address.
   Valgrind-based testing has yet to be performed.
2. The pnetcdf, hdf4, and examples code has not been tested.

Misc. Non-leak changes
1. Make tst_diskless2 only run when netcdf4 is enabled (issue 1162)
2. Fix CmakeLists.txt to turn off logging if ENABLE_NETCDF_4 is OFF
3. Isolated all my debug scripts into a single top-level directory
   called debug
4. Fix some USE_NETCDF4 dependencies in nc_test and nc_test4 Makefile.am
2018-10-30 20:48:12 -06:00
Dennis Heimbigner
26da2fae01 Remove debug output 2018-10-02 11:43:11 -06:00
Dennis Heimbigner
8072d1f6bb Modify DAP2 and DAP4 to optionally allow Fillvalue/Variable mismatch
re: issue https://github.com/Unidata/netcdf-c/issues/1151

Modify DAP2 and DAP4 code to handle case when _FillValue type is not
same as the parent variable type.

Specifically:
1. Define a parameter [fillmismatch] to allow this mismatch;
   default is to disallow.
2. If allowed, forcibly change the type of the _FillValue to match
   the parent variable.
3. If allowed Convert the values to match new type
4. Generate a log message
5. if not allowed, then fail

Implementing this required some changes to ncdap_test/dapcvt.c
Also added test cases.

Minor Unrelated Changes:
1. There were a number of warnings about e.g.
   assigning a const char* to a char*. Fix these
2. In nccopy.1, replace .NP with .IP "n"
   (re PR https://github.com/Unidata/netcdf-c/pull/1144)
3. fix minor error in ncdump/ocprint
2018-10-01 15:51:43 -06:00
Ward Fisher
d0bb5ddde8 Merge remote-tracking branch 'origin/inmemory10.dmh' into combined-pr.wif 2018-09-04 13:39:34 -06:00
Dennis Heimbigner
d62a9e623c Fix the NC_INMEMORY code to work in all cases with HDF5 1.10.
re: github issue https://github.com/Unidata/netcdf-c/issues/1111

One of the less common use cases for the in-memory feature is
apparently failing with HDF5-1.10.x.  The fix is complicated and
requires significant changes to libhdf5/nc4memcb.c. The current
setup is detailed in the file docs/inmeminternal.dox.

Additionally, it was discovered that the program
nc_test/tst_inmemory.c, which is invoked by
nc_test/run_inmemory.sh, actually was failing because of the
above problem. But the failure is not detected since the script
does not return non-zero value.

Other Changes:
1. Fix nc_test_tst_inmemory to return errors correctly.
2. Make ncdap_tests/findtestserver.c and dap4_tests/findtestserver4.c
   be generated from ncdap_test/findtestserver.c.in.
3. Make LOG() print output to stderr instead of stdout to
   avoid contaminating e.g. ncdump output.
4. Modify the handling of NC_INMEMORY and NC_DISKLESS flags
   to properly handle that NC_DISKLESS => NC_INMEMORY. This
   affects a number of code pieces, especially memio.c.
2018-09-04 11:27:47 -06:00
Dennis Heimbigner
79e38de840 Add the ability to set some additional curlopt values
Add the ability to set some additional curlopt values via .daprc (aka .dodsrc).
This effects both DAP2 and DAP4 protocols.

Related issues:
[1] re: esupport: KOZ-821332
[2] re: github issue https://github.com/Unidata/netcdf4-python/issues/836
[3] re: github issue https://github.com/Unidata/netcdf-c/issues/1074

1. CURLOPT_BUFFERSIZE: Relevant to [1]. Allow user to set the read/write
buffersizes used by curl.
This is done by adding the following to .daprc (aka .dodsrc):
	HTTP.READ.BUFFERSIZE=n
where n is the buffersize in bytes. There is a built-in (to curl)
limit of 512k for this value.

2. CURLOPT_TCP_KEEPALIVE (and CURLOPT_TCP_KEEPIDLE and CURLOPT_TCP_KEEPINTVL):
Relevant (maybe) to [2] and [3]. Allow the user to turn on KEEPALIVE
This is done by adding the following to .daprc (aka .dodsrc):
	HTTP.KEEPALIVE=on|n/m
If the value is "on", then simply enable default KEEPALIVE. If the value
is n/m, then enable KEEPALIVE and set KEEPIDLE to n and KEEPINTVL to m.
2018-08-26 17:04:46 -06:00
Dennis Heimbigner
1299a0fcee The Jetstream remote test server is now working.
So it now becomes the first default test server to try.
This also means that the dap4 remote testing is enabled.
The only issue to watch is to see if the jetstream-based
server can stay up for significant periods of time.
A uptimerobot (https://uptimerobot.com) has been set ups
to monitor this hourly, so we shall see.
2018-06-26 13:58:45 -06:00
Dennis Heimbigner
02bd33f250 Merge branch 'dapparams.dmh' of https://github.com/Unidata/netcdf-c into dapparams.dmh 2018-05-19 16:11:24 -06:00
Dennis Heimbigner
5fc6be9472 Code duplicated; merge failure? 2018-05-18 20:28:51 -06:00
Ward Fisher
ebe6633006
Merge branch 'master' into dapparams.dmh 2018-05-16 14:34:28 -06:00
luz.paz
b4d0fe651a Follow-up trivial typos 2018-04-26 23:04:01 -04:00
luz.paz
74fbacdb82 Misc. source comment typos
Some are user-facing. Found via `codespell` and through the downstream FreeCAD.
2018-04-26 23:04:01 -04:00
Ward Fisher
ed9fa9f64c
Merge branch 'master' into dapparams.dmh 2018-04-23 13:38:18 -06:00
Ward Fisher
dad2101a04
Merge branch 'master' into index.dmh 2018-04-02 13:56:00 -06:00
Ward Fisher
a13b722af9
Merge branch 'master' into dapparams.dmh 2018-04-02 13:55:02 -06:00
Ward Fisher
e7a16a19a6 Modified tst_remote3.sh to work on OSX. 2018-03-29 18:30:16 -06:00
Ward Fisher
913b4a3961
Merge branch 'master' into index.dmh 2018-03-26 14:22:46 -06:00
Dennis Heimbigner
e6e3583e30 Merge branch 'master' into ncdaptestsclean.dmh 2018-03-20 21:37:35 -06:00
Dennis Heimbigner
05d078a8b0 The ncdap_tests were a mess, so I decided to clean them up
to remove cruft and to remove unused site tests
and make the tests somewhat more understandable.

Also did a fix to libdispatch/dwinpath to convert
relative paths to absolute paths. This will, I hope,
take care of some windows path problems when using
$srcdir in shell scripts.
2018-03-20 21:31:31 -06:00
Dennis Heimbigner
cc136cad08 Add a configurable "test case" that will create
and then open a file with a lot of metadata.
The test is configurable to determine the parameters
for the created metadata.
2018-03-17 16:25:13 -06:00
Dennis Heimbigner
25f062528b This completes (for now) the refactoring of libsrc4.
The file docs/indexing.dox tries to provide design
information for the refactoring.

The primary change is to replace all walking of linked
lists with the use of the NCindex data structure.
Ncindex is a combination of a hash table (for name-based
lookup) and a vector (for walking the elements in the index).
Additionally, global vectors are added to NC_HDF5_FILE_INFO_T
to support direct mapping of an e.g. dimid to the NC_DIM_INFO_T
object. These global vectors exist for dimensions, types, and groups
because they have globally unique id numbers.

WARNING:
1. since libsrc4 and libsrchdf4 share code, there are also
   changes in libsrchdf4.
2. Any outstanding pull requests that change libsrc4 or libhdf4
   are likely to cause conflicts with this code.
3. The original reason for doing this was for performance improvements,
   but as noted elsewhere, this may not be significant because
   the meta-data read performance apparently is being dominated
   by the hdf5 library because we do bulk meta-data reading rather
   than lazy reading.
2018-03-16 11:46:18 -06:00
Ward Fisher
da87652aa7
Merge branch 'master' into dapparams.dmh 2018-03-07 12:50:03 -07:00
Dennis Heimbigner
8cb1fc4cfe This is the second step in refactoring the libsrc4 code.
The first was branch newhash0.dmh.

As with newhash0.dmh, these changes should be transparent.
2018-02-24 20:36:24 -07:00
Dennis Heimbigner
99fccab359 1. Keep up to date by merging master
2. Fixed plugin building (nc_test4/hdf5plugins)
   to be done properly by cmake and automake.
4. Duplicated part of the nc_test4 filter test code
   in examples/C

An incomplete and untested set of hooks exist
for OS-X in nc_test4/findplugins.in. They need testing.
2018-01-16 11:00:09 -07:00