Commit Graph

25 Commits

Author SHA1 Message Date
Dennis Heimbigner
df3636b959 Mitigate S3 test interference + Unlimited Dimensions in NCZarr
This PR started as an attempt to add unlimited dimensions to NCZarr.
It did that, but this exposed significant problems with test interference.
So this PR is mostly about fixing -- well mitigating anyway -- test
interference.

The problem of test interference is now documented in the document docs/internal.md.
The solutions implemented here are also describe in that document.
The solution is somewhat fragile but multiple cleanup mechanisms
are provided. Note that this feature requires that the
AWS command line utility must be installed.

## Unlimited Dimensions.
The existing NCZarr extensions to Zarr are modified to support unlimited dimensions.
NCzarr extends the Zarr meta-data for the ".zgroup" object to include netcdf-4 model extensions. This information is stored in ".zgroup" as dictionary named "_nczarr_group".
Inside "_nczarr_group", there is a key named "dims" that stores information about netcdf-4 named dimensions. The value of "dims" is a dictionary whose keys are the named dimensions. The value associated with each dimension name has one of two forms
Form 1 is a special case of form 2, and is kept for backward compatibility. Whenever a new file is written, it uses format 1 if possible, otherwise format 2.
* Form 1: An integer representing the size of the dimension, which is used for simple named dimensions.
* Form 2: A dictionary with the following keys and values"
   - "size" with an integer value representing the (current) size of the dimension.
   - "unlimited" with a value of either "1" or "0" to indicate if this dimension is an unlimited dimension.

For Unlimited dimensions, the size is initially zero, and as variables extend the length of that dimension, the size value for the dimension increases.
That dimension size is shared by all arrays referencing that dimension, so if one array extends an unlimited dimension, it is implicitly extended for all other arrays that reference that dimension.
This is the standard semantics for unlimited dimensions.

Adding unlimited dimensions required a number of other changes to the NCZarr code-base. These included the following.
* Did a partial refactor of the slice handling code in zwalk.c to clean it up.
* Added a number of tests for unlimited dimensions derived from the same test in nc_test4.
* Added several NCZarr specific unlimited tests; more are needed.
* Add test of endianness.

## Misc. Other Changes
* Modify libdispatch/ncs3sdk_aws.cpp to optionally support use of the
   AWS Transfer Utility mechanism. This is controlled by the
   ```#define TRANSFER```` command in that file. It defaults to being disabled.
* Parameterize both the standard Unidata S3 bucket (S3TESTBUCKET) and the netcdf-c test data prefix (S3TESTSUBTREE).
* Fixed an obscure memory leak in ncdump.
* Removed some obsolete unit testing code and test cases.
* Uncovered a bug in the netcdf-c handling of big-endian floats and doubles. Have not fixed yet. See tst_h5_endians.c.
* Renamed some nczarr_tests testcases to avoid name conflicts with nc_test4.
* Modify the semantics of zmap\#ncsmap_write to only allow total rewrite of objects.
* Modify the semantics of zodom to properly handle stride > 1.
* Add a truncate operation to the libnczarr zmap code.
2023-09-26 16:56:48 -06:00
Dennis Heimbigner
c3c89693c4 Fix URL encoding in DAP2 url processing
re: Github issue https://github.com/Unidata/netcdf-c/issues/1832
and Github issue https://github.com/Unidata/netcdf4-python/issues/1041

Handling of URL escape sequences for some servers
(e.g. http://iridl.ldeo.columbia.edu) appears to be somewhat
non-standard.
In particular, certain characters need escaping that other servers
do not. Fortunately, the changes should also work existing other servers.
2020-09-08 12:41:12 -06:00
Dennis Heimbigner
59e04ae071 This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".

The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.

More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).

WARNING: this code has had limited testing, so do use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:

Platform | Build System | S3 support
------------------------------------
Linux+gcc      | Automake     | yes
Linux+gcc      | CMake        | yes
Visual Studio  | CMake        | no

Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future.  Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.

In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*.  The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
   and the version bumped.
4. An overly complex set of structs was created to support funnelling
   all of the filterx operations thru a single dispatch
   "filter_actions" entry.
5. Move common code to from libhdf5 to libsrc4 so that it is accessible
   to nczarr.

Changes directly related to Zarr:
1. Modified CMakeList.txt and configure.ac to support both C and C++
   -- this is in support of S3 support via the awd-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
   support zarr and to regularize the structure of the fragments
   section of a URL.

Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
   e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
   * Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
   and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.

Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-28 18:02:47 -06:00
Dennis Heimbigner
6934aa2e8b Thread safety: step 1: cleanup
re: https://github.com/Unidata/netcdf-c/issues/1373 (partial)

* Mark some global constants be const to indicate to make them easier to track.
* Hide direct access to the ncrc_globalstate behind a function call.
* Convert dispatch tables to constants (except the user defined ones)
  This has some consequences in terms of function arguments needing to be marked
  as const also.
* Remove some no longer needed global fields
* Aggregate all the globals in nclog.c
* Uniformly replace nc_sizevector{0,1} with NC_coord_{zero,one}
* Uniformly replace nc_ptrdffvector1 with NC_stride_one
* Remove some obsolete code
2019-03-30 14:06:20 -06:00
Ward Fisher
c70480dc05 Updated COPYRIGHT stanza in libdap2 2018-12-06 14:21:03 -07:00
Dennis Heimbigner
0efc4e5023 Turn off debug noise 2017-07-14 15:32:17 -06:00
Dennis Heimbigner
eee41eea55 turn off debugging 2017-07-13 11:45:52 -06:00
Dennis Heimbigner
715a6fe5eb The files libdispatch/dwinpath.c and include/ncwinpath.h
were added to provide a path name converter from e.g. cygwin
paths to e.g. windows paths. This is necessary because
the shell scripts may produce cygwin paths, but the code
may have been compiled with Visual Studio. Similar issues
arise with Mingw.

At appropriate places, and if using Visual Studio or Mingw,
I added calls to the path conversion code.
Apparently I forgot to find all the places where this
conversion was needed. So this pr does the following:
1. Push the calls to the converter to the various libXXX
   directories and out of libdispatch/dfile.c.
2. Add conversion calls to other parts of the code like oc2.

I also turns out that conversion code in dapcvt.c
had a bug when handling DAP Byte type under visual studio.

Notes:
1. there may still be places I missed that need to do path conversion.
2. need to make sure that calls to e.g. H5open also use converted path.
2017-07-13 10:40:07 -06:00
dmh
764a1c40a3 ckp 2016-04-06 19:51:40 -06:00
Ward Fisher
0693b42078 For some reason, setting DAPDEBUG and OCDEBUG to 1 causes a bunch of failures on OSX. 2015-05-27 11:04:09 -06:00
dmh
ab5022256d need to set -e many of the .sh programs 2015-05-16 15:46:39 -06:00
dmh
f423f27693 Sync with oc project.
This supports better authorization
handling for DAP requests, especially redirection
based authorization. I also added a new test case
ncdap_tests/testauth.sh.

Specifically, suppose I have a netrc file /tmp/netrc
containing this.
    machine uat.urs.earthdata.nasa.gov login xxxxxx password yyyyyy
Also suppose I have a .ocrc file containing these lines
    HTTP.COOKIEJAR=/tmp/cookies
    HTTP.NETRC=/tmp/netrc
Assume that .ocrc is in the local directory or HOME.

Then this command should work (assuming a valid login and password).
    ncdump -h "https://54.86.135.31/opendap/data/nc/fnoc1.nc"
2014-12-24 10:22:47 -07:00
dmh
af566dd300 In preparation for adding dap4 support, I have cleaned up
the libdap2 code to make it dap2 protocol + netcdf classic
only.
2014-03-24 14:02:52 -06:00
dmh
baed147ba4 [NCF-277]
Fix Http Basic Authorization.
The problem is really in oc2.0.
In order for it to work,
the CURLOPT_COOKIEJAR must have
a non-null value. The code
was already there, but not being
used for some reason.
1. fixed cookiejar code in oc2.0
2. synched oc2.0 with netcdf-c/oc2
3. added a test case
2013-11-15 11:38:54 -07:00
dmh
2bc308432d sync with oc2.0 2013-11-14 15:13:20 -07:00
Dennis Heimbigner
e3a4a617cf Add test for Jira NCF-249 2013-04-23 21:06:14 +00:00
Dennis Heimbigner
7e8bfb01f3 Fix jira NCF-249 2013-04-23 20:18:16 +00:00
Dennis Heimbigner
6cf31dcf2d jira: NCF-232: bad conversion of grids 2013-02-26 04:31:06 +00:00
Dennis Heimbigner
42999f4c7c move from oc1.0 to oc2.0; create new dir oc2 2012-07-31 20:34:13 +00:00
Dennis Heimbigner
3e592a2604 fix dap string handling 2012-02-03 21:31:50 +00:00
Dennis Heimbigner
aebd11348a 1)Integrate the oc with improved performance 2012-01-29 18:56:29 +00:00
Dennis Heimbigner
efd9808b0a completely rewritten constraint system 2011-11-14 04:20:19 +00:00
Dennis Heimbigner
5c652a42a9 2011-08-12 16:14:56 +00:00
Dennis Heimbigner
eae569c6ee fixed cache handling problem: Jira NCF-106 2011-08-06 02:49:23 +00:00
Dennis Heimbigner
9fe1223316 2011-04-17 18:56:10 +00:00