Commit Graph

23 Commits

Author SHA1 Message Date
Peter Hill
c72511404e
Fix conversion warnings in libdispatch 2024-04-08 11:31:13 +01:00
Dennis Heimbigner
27f615bebc Properly handle missing regions in URLS
NOTE: it is important that this fix gets into 4.9.3

re: Issue https://github.com/Unidata/netcdf-c/issues/2798

## Modifications
* This PR includes PR https://github.com/Unidata/netcdf-c/pull/2813
* Support the following AWS environment variables in the internal S3 library
  (they are already supported by aws-sdk-cpp).
  - AWS_REGION
  - AWS_DEFAULT_REGION
  - AWS_ACCESS_KEY_ID
  - AWS_CONFIG_FILE
  - AWS_PROFILE
  - AWS_SECRET_ACCESS_KEY
  - (source https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-envvars.html).
* Support an empty region when specifying s3.amazonaws.com as the host.
* Move some S3/AWS related functions to ds3util.c
* Add a test case to test empty region and AWS_[DEFAULT]_REGION.
2023-12-02 21:03:59 -07:00
Dennis Heimbigner
98477b9f25 ## Addendum [5/9/23]
It turns out that attempting to test S3 using a github action secret is a very complex process. So, this was disabled for github actions. However, a new *run_tests_s3.yml* action file was added that will eventually encapsulate S3 testing.
2023-05-09 21:13:49 -06:00
Dennis Heimbigner
e31ce10842 Enable ACCEPT_ENCODING on DAP requests
re: PR https://github.com/Unidata/netcdf-c/issues/2622

H/T Nathan Potter for finding this.

Apparently the existing library DAP code for supporting
compressed http responses was disabled.

So:
1. enable CURLOPT_ACCEPT_ENCODING by default
2. Add a new HTTP.ENCODE for .dodsrc to allow it to be disabled.
2023-02-16 20:21:22 -07:00
Ward Fisher
3bf18fbdb8
Merge pull request #2039 from mathstuf/various-fixes
Various fixes
2022-03-10 14:52:30 -07:00
Dennis Heimbigner
56f1f595b8 Fix new lgtm alerts 2021-09-28 14:19:07 -06:00
Dennis Heimbigner
6b69b9c52c Significantly Improve Amazon S3 Cloud Storage Support
## S3 Related Fixes

* Add comprehensive support for specifying AWS profiles to provide access credentials.
* Parse the files "~/.aws/config" and "~/.aws/credentials to provide credentials for the HDF5 ROS3 driver and to locate default region.
* Add a function to obtain the currently active S3 credentials. The search rules are defined in docs/nczarr.md.
* Provide documentation for the new features.
* Modify the struct NCauth (in include/ncauth.h) to replace specific S3 credentials with a profile name.
* Add a unit test to test the operation of profile and credentials management.
* Add support for URLS of the form "s3://<bucket>/<key>"; this requires obtaining a default region.
* Allows the specification of profile and/or region in a URL of the form "#mode=nczarr,...&aws.region=...&aws.profile=..."

## Misc. Fixes

* Move the ezxml code to libdispatch so that it can be used both by DAP4 and nczarr.
* Modify nclist to provide a deep clone operation.
* Modify ncuri to provide a deep clone operation.
* Modify the .rc file format to allow the specification of a path to be tested when looking for an entry in the .rc file.
* Ensure that the NC_rcload function is called.
* Modify nchttp to support setting request headers.
2021-09-27 18:36:33 -06:00
Ben Boeckel
47428e9816 libdispatch: avoid warnings about string size computations
GCC warns if the length parameter to `strncpy` is computed from the
source since it is actually the destination that is relevant here. Since
these allocations are all made with the right amount, other string
functions may be used instead.
2021-07-28 15:19:02 -04:00
Dennis Heimbigner
6901206927 Regularize the semantics of mkstemp.
re: https://github.com/Unidata/netcdf-c/issues/1827

The issue is partly resolved by this PR. The proximate problem appears to be that the semantics of mkstemp in **nix is different than the semantics of _mktemp_s in Windows. I had thought they were the same but that is incorrect. The _mktemp_s function will only produce 26 different files and so the netcdf temp file code will fail after about that many iterations.

So, to solve this, I created my own version of mkstemp for windows that uses a random number generator. This appears to solve the reported issue.  I also added the testcase ncdap_test/test_manyurls but made it conditional on --enable-dap-long-tests because it is very slow.

I did note that the provided test program now fails after some 800 iterations with a libcurl error claiming it cannot resolve the host name. My belief is that the library is just running out of resources at this point: too many open curl handles or some such. I doubt if this failure is fixable.

So bottom line is that it is really important to do nc_close when you are finished with a file.

Misc. Other Changes:

1. I took the opportunity to clean up some bad string hacks in the code. Specifically
    * change all uses of strncat to strlcat
    * remove old string hacks: occoncat and occopycat
2. Add heck to see if test.opendap.org is running and if not, then skip test
3. Make CYGWIN use TEMP environment variable
2021-05-14 11:33:03 -06:00
Dennis Heimbigner
eb3d9eb0c9 Provide a Number of fixes/improvements to NCZarr
Primary changes:
* Add an improved cache system to speed up performance.
* Fix NCZarr to properly handle scalar variables.

Misc. Related Changes:
* Added unit tests for extendible hash and for the generic cache.
* Add config parameter to set size of the NCZarr cache.
* Add initial performance tests but leave them unused.
* Add CRC64 support.
* Move location of ncdumpchunks utility from /ncgen to /ncdump.
* Refactor auth support.

Misc. Unrelated Changes:
* More cleanup of the S3 support
* Add support for S3 authentication in .rc files: HTTP.S3.ACCESSID and HTTP.S3.SECRETKEY.
* Remove the hashkey from the struct OBJHDR since it is never used.
2020-11-19 17:01:04 -07:00
Dennis Heimbigner
25d2e05444 Prepare for the path management code
Rename some files in prep for eventual implementation
of more comprehensive cross-platform file path management.
2020-10-13 19:12:15 -06:00
Dennis Heimbigner
59e04ae071 This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".

The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.

More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).

WARNING: this code has had limited testing, so do use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:

Platform | Build System | S3 support
------------------------------------
Linux+gcc      | Automake     | yes
Linux+gcc      | CMake        | yes
Visual Studio  | CMake        | no

Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future.  Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.

In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*.  The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
   and the version bumped.
4. An overly complex set of structs was created to support funnelling
   all of the filterx operations thru a single dispatch
   "filter_actions" entry.
5. Move common code to from libhdf5 to libsrc4 so that it is accessible
   to nczarr.

Changes directly related to Zarr:
1. Modified CMakeList.txt and configure.ac to support both C and C++
   -- this is in support of S3 support via the awd-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
   support zarr and to regularize the structure of the fragments
   section of a URL.

Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
   e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
   * Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
   and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.

Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-28 18:02:47 -06:00
Dennis Heimbigner
313121a229 Use proper CURLOPT values for VERIFYHOST and VERIFYPEER
re: https://github.com/Unidata/netcdf-c/issues/1684
re: e-support VZL-904142

Two issues:
1. As of libcurl 7.66, the semantics of CURLOPT_SSL_VERIFYHOST
   changed so that the non-zero values affects certificate processing.
2. The current library was forcing the values of VERIFYPEER
   and VERIFYHOST to zero instead of leaving them to the default values.

Solution was first to leave the defaults in place for VERIFYPEER and VERIFYHOST
as long as they are not set in .ocrc/.dodsrc file.
Second, the value of HTTP.SSL.VERIFYPEER or HTTP.SSL.VERIFYHOST
as set in .ocrc/.dodrc is used to set the corresponding CURLOPT flags.
So for example, adding
> HTTP.SSL.VERIFYHOST=2
will set the value of CURLOPT_SSL_VERIFYHOST to 2, the default.
Using
> HTTP.SSL.VERIFYHOST=0
will set the value of CURLOPT_SSL_VERIFYHOST to 0, which disables it.
Similarly for VERIFYPEER.

Finally the semantics of HTTP.SSL.VALIDATE is now equivalent to
> HTTP.SSL.VERIFYPEER=1
> HTTP.SSL.VERIFYHOST=2
2020-04-10 13:42:27 -06:00
Dennis Heimbigner
748d26c114 Add support for CURLOPT_CONNECTTIMEOUT
I see that there is no way to set CURLOPT_CONNECTTIMEOUT,
but there is support for CURLOPT_TIMEOUT.
So, accept the line 'HTTP.CONNECTTIMEOUT'
in .rc file to allow user to set CURLOPT_CONNECTTIMEOUT.
2020-01-09 11:48:04 -07:00
Dennis Heimbigner
f1506d552e Change (again), and hopefully simplify, the file model inference algorithm.
* For URL paths, the new approach essentially centralizes all information
  in the URL into the "#mode=" fragment key and uses that value
  to determine the dispatcher for (most) URLs.

* The new approach has the following steps:

  1. canonicalize the path if it is a URL.
  2. use the mode= fragment key to determine the dispatcher
  3. if dispatcher still not determined, then use the mode flags
     argument to nc_open/nc_create to determine the dispatcher.
  4. if the path points to something readable, attempt to read the
     magic number at the front, and use that to determine the dispatcher.
     this case may override all previous cases.

* Misc changes.

  1. Update documentation
  2. Moved some unit tests from libdispatch to unit_test directory.
  3. Fixed use of wrong #ifdef macro in test_filter_reg.c
     [I think this may fix an previously reported esupport query].
2019-09-29 12:59:28 -06:00
Dennis Heimbigner
6934aa2e8b Thread safety: step 1: cleanup
re: https://github.com/Unidata/netcdf-c/issues/1373 (partial)

* Mark some global constants be const to indicate to make them easier to track.
* Hide direct access to the ncrc_globalstate behind a function call.
* Convert dispatch tables to constants (except the user defined ones)
  This has some consequences in terms of function arguments needing to be marked
  as const also.
* Remove some no longer needed global fields
* Aggregate all the globals in nclog.c
* Uniformly replace nc_sizevector{0,1} with NC_coord_{zero,one}
* Uniformly replace nc_ptrdffvector1 with NC_stride_one
* Remove some obsolete code
2019-03-30 14:06:20 -06:00
Dennis Heimbigner
4cd440dd96 URL with username+pwd in url is not working.
re: esupport (DVK-211460)

Turns out it was a typo in libdispatch/dauth.c
Fix is to Change:
HTTP.USERNAME -> HTTP.CREDENTIALS.USERNAME
and
HTTP.PASSWORD-> HTTP.CREDENTIALS.PASSWORD
2019-02-09 15:06:03 -07:00
Ward Fisher
25417880f7 Updated libdispatch/ files with copyright notice. 2018-12-06 14:29:57 -07:00
Dennis Heimbigner
d62a9e623c Fix the NC_INMEMORY code to work in all cases with HDF5 1.10.
re: github issue https://github.com/Unidata/netcdf-c/issues/1111

One of the less common use cases for the in-memory feature is
apparently failing with HDF5-1.10.x.  The fix is complicated and
requires significant changes to libhdf5/nc4memcb.c. The current
setup is detailed in the file docs/inmeminternal.dox.

Additionally, it was discovered that the program
nc_test/tst_inmemory.c, which is invoked by
nc_test/run_inmemory.sh, actually was failing because of the
above problem. But the failure is not detected since the script
does not return non-zero value.

Other Changes:
1. Fix nc_test_tst_inmemory to return errors correctly.
2. Make ncdap_tests/findtestserver.c and dap4_tests/findtestserver4.c
   be generated from ncdap_test/findtestserver.c.in.
3. Make LOG() print output to stderr instead of stdout to
   avoid contaminating e.g. ncdump output.
4. Modify the handling of NC_INMEMORY and NC_DISKLESS flags
   to properly handle that NC_DISKLESS => NC_INMEMORY. This
   affects a number of code pieces, especially memio.c.
2018-09-04 11:27:47 -06:00
Ward Fisher
3cce9e99e2 Corrected several potential null dereference or memory leak issues detected by static analysis. 2018-06-07 15:21:42 -06:00
Nehal J Wani
1b91bd89d4
Fix build on pre-C99 compilers
- Make sure that the variables are declared at the top of the block.
 - Add fix to enable inline for various compilers
2017-11-26 01:47:54 +05:30
Dennis Heimbigner
a2e0f069ec This pr should probably be delayed until after Version 4.5.
Primary change is to cleanup code and remove duplicated code.

1. Unify the rc file reading into libdispatch/drc.c. Eventually extend
   if we need rc file for netcdf itself as opposed to the dap code.
2. Unify the extraction from the rc file of DAP authorization info.
3. Misc. other small unifications: make temp file, read file.
4. Avoid use of libcurl when reading file:// because
   there is some kind of problem with the Visual Studio version.
   Might be related to the winpath problem.
   In any case, do direct read instead.
5. Add new error code NC_ERCFILE for errors in reading RC file.
6. Complete documentation cleanup as indicated in this comment
   https://github.com/Unidata/netcdf-c/pull/472#issuecomment-325926426
7. Convert some occurrences of #ifdef _WIN32 to #ifdef _MSC_VER
2017-09-02 18:09:36 -06:00
Dennis Heimbigner
35b97e0454 add dauth.c 2017-08-31 14:36:04 -06:00