Commit Graph

589 Commits

Author SHA1 Message Date
Dennis Heimbigner
f6e25b695e Fix additional S3 support issues
re: https://github.com/Unidata/netcdf-c/issues/2117
re: https://github.com/Unidata/netcdf-c/issues/2119

* Modify libsrc to allow byte-range reading of netcdf-3 files in private S3 buckets; this required using the aws sdk. Also add a test case.
* The aws sdk can sometimes cause problems if the Aws::ShutdownAPI function is not called. So add optional atexit() support to ensure it is called. This is disabled for Windows.
* Add documentation to nczarr.md on how to build and use the aws sdk under windows. Currently it builds, but testing fails.
* Switch testing from stratus to the Unidata bucket on S3.
* Improve support for the s3: url protocol.
* Add a s3 specific utility code file: ds3util.c
* Modify NC_infermodel to attempt to read the magic number of byte-ranged files in S3.
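
The magic-number check mentioned in the last item can be illustrated with a minimal sketch. It assumes the leading bytes of the object have already been fetched (for example with a byte-range request) and only classifies them. `demo_infer_format` and the `demo_format` enum are hypothetical names for this illustration, not the library's NC_infermodel code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical format tags, used by this sketch only. */
typedef enum { DEMO_UNKNOWN = 0, DEMO_NETCDF3, DEMO_HDF5 } demo_format;

/* Classify a file from its first bytes.
 * Only offset 0 is examined; an HDF5 file with a user block keeps its
 * signature at a later offset, which this sketch does not search for. */
demo_format demo_infer_format(const unsigned char *buf, size_t len)
{
    static const unsigned char hdf5_sig[8] =
        { 0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n' };

    if (buf == NULL)
        return DEMO_UNKNOWN;
    if (len >= 8 && memcmp(buf, hdf5_sig, 8) == 0)
        return DEMO_HDF5;
    /* "CDF" followed by version byte 1 (classic), 2 (64-bit offset)
     * or 5 (CDF-5). */
    if (len >= 4 && memcmp(buf, "CDF", 3) == 0 &&
        (buf[3] == 1 || buf[3] == 2 || buf[3] == 5))
        return DEMO_NETCDF3;
    return DEMO_UNKNOWN;
}
```

Because only the first eight bytes are needed, a single small ranged read is enough to choose a dispatcher before any further data is transferred.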

## Misc.

* Move and rename the core S3 SDK wrapper code (libnczarr/zs3sdk.cpp) to libdispatch since it is now used in libsrc as well as libnczarr.
* Add calls to nc_finalize in the utilities in case atexit is disabled.
* Add a header-only json parser to the distribution rather than as a built source.
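
The optional atexit() arrangement, with an explicit finalize call as a fallback, might look roughly like the sketch below. All `demo_` names are hypothetical stand-ins; the real code calls into the aws sdk, which is not reproduced here.

```c
#include <stdlib.h>

static int demo_initialized = 0;
static int demo_shutdown_calls = 0;

/* Stand-in for the SDK shutdown call. It is idempotent, so it is safe to
 * call it explicitly and then again from the atexit() handler. */
void demo_finalize(void)
{
    if (!demo_initialized)
        return;
    demo_initialized = 0;
    demo_shutdown_calls++;
}

/* Initialize once and, where enabled, register the shutdown hook.
 * Returns 0 on success and -1 if the hook could not be registered. */
int demo_initialize(void)
{
    if (demo_initialized)
        return 0;
#ifndef _WIN32
    /* Assumption mirrored from the text: no atexit() hook on Windows. */
    if (atexit(demo_finalize) != 0)
        return -1;
#endif
    demo_initialized = 1;
    return 0;
}

/* Accessor so callers can observe how often shutdown really ran. */
int demo_shutdown_count(void)
{
    return demo_shutdown_calls;
}
```

A utility that cannot rely on the hook simply calls `demo_finalize()` before returning from `main`; the guard makes the later atexit() invocation a no-op.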
2021-10-29 20:06:37 -06:00
Dennis Heimbigner
58ba84de0b update merge 2021-10-26 20:53:05 -06:00
Greg Sjaardema
9c6181da09
Remove duplicate line
Remove a duplicate line...
2021-10-25 17:09:17 -06:00
Dennis Heimbigner
2d08c64290 Fix bug in the default HDF5 byte-range reader
re: https://github.com/Unidata/netcdf-c/issues/2122

There was a string allocation error in H5FDhttp.c
2021-10-17 13:55:03 -06:00
Ward Fisher
4086bbd887
Merge pull request #2056 from gsjaardema/WIP-attribute-creation-order-tracking-option
Attribute creation order on/off
2021-10-13 10:18:36 -06:00
Dennis Heimbigner
289103d2b1 Merge branch 'master' into zarrs3.dmh 2021-10-07 15:10:03 -06:00
Ward Fisher
5cd17ba059
Merge pull request #2113 from rouault/fix_stack_read_overflow_ncindexlookup
Fix a stack-read-overflow in ncindexlookup()
2021-10-01 17:09:49 -05:00
Dennis Heimbigner
6b69b9c52c Significantly Improve Amazon S3 Cloud Storage Support
## S3 Related Fixes

* Add comprehensive support for specifying AWS profiles to provide access credentials.
* Parse the files "~/.aws/config" and "~/.aws/credentials" to provide credentials for the HDF5 ROS3 driver and to locate the default region.
* Add a function to obtain the currently active S3 credentials. The search rules are defined in docs/nczarr.md.
* Provide documentation for the new features.
* Modify the struct NCauth (in include/ncauth.h) to replace specific S3 credentials with a profile name.
* Add a unit test to test the operation of profile and credentials management.
* Add support for URLS of the form "s3://<bucket>/<key>"; this requires obtaining a default region.
* Allow the specification of profile and/or region in a URL of the form "#mode=nczarr,...&aws.region=...&aws.profile=..."
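
As a rough illustration of the URL fragment form in the last item, the sketch below looks up one `key=value` pair in the part of a URL after `#`. The function name `demo_fragment_get` is hypothetical, and the library's own URL handling (ncuri) is not shown or implied here.

```c
#include <stddef.h>
#include <string.h>

/* Look up `key` in the "#k1=v1&k2=v2" fragment of `url`.
 * On success the value is copied into `out` (NUL-terminated) and 1 is
 * returned. Returns 0 if the key is absent or the value does not fit. */
int demo_fragment_get(const char *url, const char *key,
                      char *out, size_t outlen)
{
    const char *p;
    size_t klen;

    if (url == NULL || key == NULL || out == NULL || outlen == 0)
        return 0;
    p = strchr(url, '#');
    if (p == NULL)
        return 0;
    p++;                                   /* skip '#' */
    klen = strlen(key);
    while (*p != '\0') {
        const char *end = strchr(p, '&');
        size_t plen = (end != NULL) ? (size_t)(end - p) : strlen(p);

        if (plen > klen && strncmp(p, key, klen) == 0 && p[klen] == '=') {
            size_t vlen = plen - klen - 1;
            if (vlen >= outlen)
                return 0;                  /* value would be truncated */
            memcpy(out, p + klen + 1, vlen);
            out[vlen] = '\0';
            return 1;
        }
        p += plen;
        if (*p == '&')
            p++;                           /* move to the next pair */
    }
    return 0;
}
```

With a made-up URL such as `s3://bucket/key#mode=nczarr,s3&aws.region=us-east-1`, asking for `aws.region` yields `us-east-1`, while the `mode` value is returned whole as `nczarr,s3`.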

## Misc. Fixes

* Move the ezxml code to libdispatch so that it can be used both by DAP4 and nczarr.
* Modify nclist to provide a deep clone operation.
* Modify ncuri to provide a deep clone operation.
* Modify the .rc file format to allow the specification of a path to be tested when looking for an entry in the .rc file.
* Ensure that the NC_rcload function is called.
* Modify nchttp to support setting request headers.
2021-09-27 18:36:33 -06:00
Even Rouault
0582c2044a
Fix a stack-read-overflow in ncindexlookup()
Fixes an issue with strlen() reading outside the stack allocated buffer
by NC4_HDF5_inq_att, when reading a name whose length is NC_MAX_NAME.
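
The class of bug described here, a name that fills its buffer completely and so has no terminator for strlen() to find, can be sketched as follows. This is not the actual patch; `demo_copy_name` is a hypothetical helper and `DEMO_MAX_NAME` is assumed to mirror NC_MAX_NAME.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_NAME 256   /* assumed to mirror NC_MAX_NAME */

/* Copy a name that may lack a terminator within its first `srclen` bytes.
 * `dst` must provide DEMO_MAX_NAME + 1 bytes so that a maximum-length
 * name still has room for the trailing NUL. Returns the string length. */
size_t demo_copy_name(char dst[DEMO_MAX_NAME + 1],
                      const char *src, size_t srclen)
{
    size_t limit = srclen < DEMO_MAX_NAME ? srclen : DEMO_MAX_NAME;
    size_t n = 0;

    while (n < limit && src[n] != '\0')    /* bounded scan, never past limit */
        n++;
    memcpy(dst, src, n);
    dst[n] = '\0';                         /* always terminated */
    return n;
}
```

The extra byte in the destination is the important part: a buffer of exactly `DEMO_MAX_NAME` bytes cannot hold a maximum-length name and its terminator at the same time.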

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=39189 found
on GDAL

==1895951== Conditional jump or move depends on uninitialised value(s)
==1895951==    at 0x483EF58: strlen (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1895951==    by 0x48EF73E: ncindexlookup (ncindex.c:60)
==1895951==    by 0x48E81DF: nc4_find_grp_att (nc4internal.c:587)
==1895951==    by 0x48E5B39: nc4_get_att_ptrs (nc4attr.c:72)
==1895951==    by 0x48F98A0: NC4_HDF5_inq_att (hdf5attr.c:818)
==1895951==    by 0x48847F7: nc_inq_att (dattinq.c:91)
==1895951==    by 0x10D693: pr_att (ncdump.c:767)
==1895951==    by 0x110ADB: do_ncdump_rec (ncdump.c:1887)
==1895951==    by 0x1112F1: do_ncdump (ncdump.c:2038)
==1895951==    by 0x11248B: main (ncdump.c:2478)
==1895951==
==1895951== Use of uninitialised value of size 8
==1895951==    at 0x48A24E4: crc64_little (dcrc64.c:173)
==1895951==    by 0x48A27F4: NC_crc64 (dcrc64.c:229)
==1895951==    by 0x4892D49: NC_hashmapkey (nchashmap.c:159)
==1895951==    by 0x489314B: NC_hashmapget (nchashmap.c:263)
==1895951==    by 0x48EF75F: ncindexlookup (ncindex.c:60)
==1895951==    by 0x48E81DF: nc4_find_grp_att (nc4internal.c:587)
==1895951==    by 0x48E5B39: nc4_get_att_ptrs (nc4attr.c:72)
==1895951==    by 0x48F98A0: NC4_HDF5_inq_att (hdf5attr.c:818)
==1895951==    by 0x48847F7: nc_inq_att (dattinq.c:91)
==1895951==    by 0x10D693: pr_att (ncdump.c:767)
==1895951==    by 0x110ADB: do_ncdump_rec (ncdump.c:1887)
==1895951==    by 0x1112F1: do_ncdump (ncdump.c:2038)
==1895951==
2021-09-24 11:59:48 +02:00
Edward Hartnett
5200477de1 now nsd of 0 is NC_EINVAL for nc_def_var_quantize() 2021-09-10 06:10:20 -06:00
Edward Hartnett
0ce463761c
Merge branch 'main' into ejh_quantize_2 2021-09-07 10:44:45 -06:00
Ward Fisher
0a4f4e16ed
Merge pull request #2098 from DennisHeimbigner/fortcache.dmh
Make the fortran cache API always be defined.
2021-09-07 10:30:00 -06:00
Dennis Heimbigner
11fe00ea05 Add filter support to NCZarr
Filter support has three goals:

1. Use the existing HDF5 filter implementations,
2. Allow filter metadata to be stored in the NumCodecs metadata format used by Zarr,
3. Allow filters to be used even when HDF5 is disabled

Detailed usage directions are defined in docs/filters.md.

For now, the existing filter API is left in place. So filters
are defined using ''nc_def_var_filter'' using the HDF5 style
where the id and parameters are unsigned integers.

This is a big change since filters affect many parts of the code.

In the following, the terms "compressor" and "filter" and "codec" are generally
used synonymously.

### Filter-Related Changes:
* In order to support dynamic loading of shared filter libraries, a new library was added in the libncpoco directory; it helps to isolate dynamic loading across multiple platforms.
* Provide a json parsing library for use by plugins; this is created by merging libdispatch/ncjson.c with include/ncjson.h.
* Add a new _Codecs attribute to allow clients to see what codecs are being used; let ncdump -s print it out.
* Provide special headers to help support compilation of HDF5 filters when HDF5 is not enabled: netcdf_filter_hdf5_build.h and netcdf_filter_build.h.
* Add a number of new tests to test the new nczarr filters.
* Let ncgen parse _Codecs attribute, although it is ignored.

### Plugin directory changes:
* Add support for the Blosc compressor; this is essential because it is the most common compressor used in Zarr datasets. This also necessitated adding a CMake FindBlosc.cmake file.
* Add NCZarr support for the big-four filters provided by HDF5: shuffle, fletcher32, deflate (zlib), and szip
* Add a Codec defaulter (see docs/filters.md) for the big four filters.
* Make plugins work with Windows by properly adding __declspec declarations.

### Misc. Non-Filter Changes
* Replace most uses of USE_NETCDF4 (deprecated) with USE_HDF5.
* Improve support for caching
* More fixes for path conversion code
* Fix misc. memory leaks
* Add new utility -- ncdump/ncpathcvt -- that does more or less the same thing as cygpath.
* Add a number of new tests to test the non-filter fixes.
* Update the parsers
* Convert most instances of '#ifdef _MSC_VER' to '#ifdef _WIN32'
2021-09-02 17:04:26 -06:00
Dennis Heimbigner
b81f8b676a Make the fortran cache API always be defined.
re: Issue https://github.com/Unidata/netcdf-c/issues/2096

The methods nc_set_var_chunk_cache_ints and nc_def_var_chunking_ints
are Fortran entry points for accessing the cache. They are not defined
if netcdf-c is built with --disable-hdf5.

Fix is to create dummy versions that do nothing and return NC_NOERR
when invoked. These dummy versions are defined when USE_HDF5 is false.
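
A minimal sketch of such no-op fallbacks is shown below. The function names carry a `demo_` prefix and their argument lists are assumptions made for illustration; the real Fortran entry points and their exact signatures are not reproduced here. `DEMO_NOERR` stands in for NC_NOERR.

```c
#define DEMO_NOERR 0   /* stands in for NC_NOERR */

#ifndef USE_HDF5
/* Fallbacks compiled only when HDF5 support is disabled. They accept the
 * same arguments as the full versions would, ignore them, and report
 * success so that callers still link and run. */
int demo_set_var_chunk_cache_ints(int ncid, int varid, int size,
                                  int nelems, int preemption)
{
    (void)ncid; (void)varid; (void)size; (void)nelems; (void)preemption;
    return DEMO_NOERR;
}

int demo_def_var_chunking_ints(int ncid, int varid, int storage,
                               int *chunksizesp)
{
    (void)ncid; (void)varid; (void)storage; (void)chunksizesp;
    return DEMO_NOERR;
}
#endif /* !USE_HDF5 */
```

Keeping the symbols defined in every build configuration means a Fortran wrapper library can link against the C library without knowing how it was configured.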
2021-09-01 14:10:02 -06:00
Edward Hartnett
eabbd686b0 bitgroom working for floats 2021-08-29 00:30:17 -06:00
Edward Hartnett
4f96fcc7b2 improved doxygen documentation 2021-08-26 10:46:43 -06:00
Edward Hartnett
d29436c99d improved doxygen documentation 2021-08-26 10:39:27 -06:00
Edward Hartnett
b2c0bb9810 more quantize testing 2021-08-25 01:54:25 -06:00
Edward Hartnett
a02faa0cb5 more testing of quantize setting 2021-08-25 01:45:38 -06:00
Edward Hartnett
0f26083f4d preparing to apply bitgroom algorithm 2021-08-25 01:31:26 -06:00
Edward Hartnett
c9eca4bbca moving function 2021-08-24 04:09:57 -06:00
Edward Hartnett
148706b657 now reading quantize attribute to get settings 2021-08-24 03:35:52 -06:00
Edward Hartnett
233ddfb0a4 further development 2021-08-24 02:57:49 -06:00
Edward Hartnett
24ed2a40c4 fixed comment 2021-08-24 02:09:40 -06:00
Edward Hartnett
d053418867 merged nc4hdf.c with changes from master branch 2021-08-24 02:07:43 -06:00
Edward Hartnett
d6d9825b2c now quantizing with inq function in dispatch table 2021-08-24 01:53:16 -06:00
Edward Hartnett
3202b8b37c adding quantize functions to all the dispatch tables 2021-08-24 01:26:44 -06:00
Edward Hartnett
dabe008ad0 further preparation for try 2 at quantizing 2021-08-24 00:55:51 -06:00
Ward Fisher
e06d8b744f Merge branch 'patch-48' of https://github.com/gsjaardema/netcdf-c into v4.8.1-wellspring.wif 2021-08-16 10:37:52 -06:00
Greg Sjaardema
3ce6df07ba Detect attribute creation order tracking setting
When opening an existing file for NC_WRITE access,
check whether the file was created originally with
attribute creation order tracking disabled and if
so, use that setting for subsequent attribute creation.

Also check the `mode` passed in to the open call
in case application is explicitly disabling the
attribute creation order tracking with the NC_NOATTCREORD flag.
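
The decision described above can be sketched as a small predicate. `DEMO_NOATTCREORD` and its bit value are placeholders for the opt-out mode flag, and `demo_track_attr_order` is a hypothetical name; how the library actually detects the setting stored in the file is not shown.

```c
#define DEMO_NOATTCREORD 0x20000   /* placeholder bit for the opt-out flag */

/* Decide whether attribute creation order should be tracked when a file
 * is opened for writing.
 *   mode              - mode flags passed to the open call
 *   file_has_tracking - nonzero if the file was created with tracking on
 * Returns 1 to track, 0 to leave tracking off. */
int demo_track_attr_order(int mode, int file_has_tracking)
{
    if (!file_has_tracking)
        return 0;              /* keep the setting the file was created with */
    if (mode & DEMO_NOATTCREORD)
        return 0;              /* caller explicitly opted out */
    return 1;
}
```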
2021-08-12 13:22:49 -06:00
Greg Sjaardema
a611f19b32 Finish argument name refactoring... 2021-08-10 09:03:21 -06:00
Greg Sjaardema
d2c3165664 WIP: attribute creation order on/off
Work in progress / Proof of concept:

Add a capability to disable the tracking of attribute creation order.
See #2054 for details.

This PR adds a `NC_NOATTCREORD` define which can be passed in the
`mode` argument to `nc_create`.  If it is present, then the
calls to set the attribute creation order tracking are disabled.
This should only be used for files in which you *know* that the
ordering of the attributes does not matter to *any* potential
readers of this database.
2021-08-06 13:30:47 -06:00
Greg Sjaardema
b9d192d0c4
Only write the coord dimids if ndims >= 1
It looks like some vars have ndims==0 in which case the coord_dimids should not be written. Modify patch to catch those cases.
2021-08-04 09:49:48 -06:00
Greg Sjaardema
7b6f11c544
The coord dimids should be written for all variables
See discussion in #1279
2021-08-03 13:36:34 -06:00
Edward Hartnett
c77c4a9a40 fixing H5Literate() API compatibility problem 2021-08-03 02:27:57 -06:00
Ward Fisher
84f0696e7d
Merge pull request #2036 from Unidata/gh1983.wif
Address optimization issue
2021-07-29 11:15:29 -06:00
Ward Fisher
9f798e2ed6 Merge branch 'virtual_datasets' of https://github.com/d70-t/netcdf-c into gh1983.wif 2021-07-19 09:44:35 -07:00
Bruno Pagani
74fe2fe95b
libhdf5/H5FDhttp: add missing semicolons to H5Epush_ret
In HDF5 1.12.1, this was changed from optional to required per the changelog:

>H5Epush_ret() is a function-like macro that has been changed to
>contain a `do {} while(0)` loop. Consequently, a trailing semicolon
>is now required to end the `while` statement. Previously, a trailing
>semi would work, but was not mandatory.

This should be backward compatible with older versions of HDF5.
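
The reason a `do {} while(0)` macro needs the trailing semicolon can be seen in a small sketch. `DEMO_PUSH_RET` is a hypothetical macro written in the same style; it is not the HDF5 definition of H5Epush_ret.

```c
/* Hypothetical error macro in the do { } while (0) style: it counts the
 * error and returns from the enclosing function. */
#define DEMO_PUSH_RET(counter, ret_val) \
    do {                                \
        (counter)++;                    \
        return (ret_val);               \
    } while (0)

static int demo_errors = 0;

/* Returns `value` unchanged, or -1 after recording an error when the
 * value is negative. */
int demo_check(int value)
{
    if (value < 0)
        DEMO_PUSH_RET(demo_errors, -1);   /* the ';' ends the while (0) */
    else
        return value;
}

/* Accessor for the number of errors recorded so far. */
int demo_error_count(void)
{
    return demo_errors;
}
```

Without the semicolon the `while (0)` is left unterminated and the `else` fails to parse. With it, the macro behaves like a single statement, which is also why the same source keeps compiling against older headers where the semicolon was merely optional.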
2021-07-18 20:12:47 +00:00
Greg Sjaardema
e2d0bbb8ea
Merge branch 'master' into eliminate_need_for_hdf5-1.6-API 2021-05-28 07:11:13 -06:00
Dennis Heimbigner
74e7812d83 Improve error message when non-existent filter is encountered.
re: https://github.com/Unidata/netcdf-c/issues/1996

Improve the error message and the location that is reported when reading a variable that uses a filter that is not available on the reading platform.

This requires checking the availability of the filter, recording it, and failing when any attempt is made to read or write that variable. A test case was added for this in tst_filter.sh. Also, a LOG level 0 message is generated giving the variable and the filter id.

Note that by design if there is no attempt to read or write the variable, then no error is reported; this means that, for example, ncdump -h will list the filter even though it is not actually available. This is important for allowing a user to see the filter details.
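
The record-now, fail-later behavior described above can be sketched as follows. The struct, the function names and the error value are all hypothetical; the library's internal variable metadata and error codes are not reproduced here.

```c
#define DEMO_NOERR      0
#define DEMO_ENOFILTER (-1)   /* hypothetical error code for this sketch */

/* Per-variable record filled in when the file is opened. */
typedef struct {
    const char  *name;
    unsigned int filter_id;
    int          filter_missing;  /* set if the filter is not installed */
} demo_var;

/* Metadata queries always succeed, so a header listing can still show
 * which filter the variable uses. */
unsigned int demo_inq_filter(const demo_var *v)
{
    return v->filter_id;
}

/* Data access fails only at the point where the filter would be needed. */
int demo_read_var(const demo_var *v)
{
    if (v->filter_missing)
        return DEMO_ENOFILTER;
    return DEMO_NOERR;
}
```

Deferring the failure is what keeps header-only inspection usable on a platform that lacks the filter.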
2021-05-17 19:49:58 -06:00
Greg Sjaardema
f92b7a9505
Fix so works with hdf5-1.8 also
Fix a bad change so can still compile with hdf5-1.8.x
2021-04-28 15:14:57 -06:00
Greg Sjaardema
cbcee382b0 Remove need for HDF5-1.6 API being defined 2021-04-28 13:59:24 -06:00
Dennis Heimbigner
e038553abe Update RELEASE_NOTES.md 2021-04-01 14:12:49 -06:00
Dennis Heimbigner
e7c4e7ead1 add zjson fix 2021-04-01 13:56:04 -06:00
Ward Fisher
ffa8a7067f Merge branch '951' of https://github.com/brtnfld/netcdf-c into 4.8.0-wellspring-prs.wif 2021-03-22 11:51:54 -06:00
Dennis Heimbigner
0b7a5382e7 Codify cross-platform file paths
The netcdf-c code has to deal with a variety of platforms:
Windows, OSX, Linux, Cygwin, MSYS, etc.  These platforms differ
significantly in the kind of file paths that they accept.  So in
order to handle this, I have created a set of replacements for
the most common file system operations such as _open_ or _fopen_
or _access_ to manage the file path differences correctly.

A more limited version of this idea was already implemented via
the ncwinpath.h and dwinpath.c code. So this can be viewed as a
replacement for that code. And in many cases, the only
change that was required was to replace '#include <ncwinpath.h>'
with '#include <ncpathmgr.h>' and then replace file operation
calls with the NCxxx equivalent from ncpathmgr.h. Note that
recently, the ncwinpath.h was renamed ncpathmgr.h, so this pull
request should not require dealing with winpath.

The heart of the change is include/ncpathmgr.h, which provides
alternate operations such as NCfopen or NCaccess and which properly
parse and rebuild path arguments to work for the platform on which
the code is executing. This mostly matters for Windows because of the
way that it uses backslash and drive letters, as compared to *nix*.
One important feature is that the user can do string manipulations
on a file path without having to worry too much about the platform
because the path management code will properly handle most mixed cases.
So one can for example concatenate a path suffix that uses forward
slashes to a Windows path and have it work correctly.

The conversion code is in libdispatch/dpathmgr.c, and the
important function there is NCpathcvt which does the proper
conversions to the local path format.
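
To give a feel for the mixed-separator case, the sketch below rewrites every forward slash and backslash in a path to one chosen separator. It is deliberately far simpler than NCpathcvt, which also has to deal with drive letters and the Cygwin and MSYS conventions; `demo_pathcvt` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Rewrite `path` into `out`, replacing both '/' and '\\' with `sep`.
 * Returns 0 on success and -1 if an argument is NULL or `out` is too
 * small to hold the result and its terminator. */
int demo_pathcvt(const char *path, char sep, char *out, size_t outlen)
{
    size_t i, n;

    if (path == NULL || out == NULL)
        return -1;
    n = strlen(path);
    if (n >= outlen)
        return -1;
    for (i = 0; i < n; i++)
        out[i] = (path[i] == '/' || path[i] == '\\') ? sep : path[i];
    out[n] = '\0';
    return 0;
}
```

Appending a suffix written with forward slashes to a Windows-style prefix and then normalizing once, just before the file operation, is the usage pattern the text describes.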

As a rule, most code should just replace their file operations with
the corresponding NCxxx ones defined in include/ncpathmgr.h. These
NCxxx functions all call NCpathcvt on their path arguments before
executing the actual file operation.

In some rare cases, the client may need to directly use NCpathcvt,
but this should be avoided as much as possible. If there is a need
for supporting a new file operation not already in ncpathmgr.h, then
use the code in dpathmgr.c as a template. Also please notify Unidata
so we can include it as a formal part of our supported operations.
Also, if you see an operation in the library that is not using the
NCxxx form, then please submit an issue so we can fix it.

Misc. Changes:
* Clean up the utf8 testing code; it is impossible to get some
  tests to work under windows using shell scripts; the args do
  not pass as utf8 but as some other encoding.
* Added an extra utf8 test case: test_unicode_path.sh
* Add a true test for HDF5 1.10.6 or later because as noted in
  PR https://github.com/Unidata/netcdf-c/pull/1794,
  HDF5 changed its Windows file path handling.
2021-03-04 13:41:31 -07:00
Dennis Heimbigner
2afbdbd18f Add support for the XArray Zarr _ARRAY_DIMENSIONS attribute
The XArray implementation that uses Zarr for storage
provides a mechanism to simulate named dimensions.
It does this by adding a per-variable attribute called
_ARRAY_DIMENSIONS. This attribute contains a list of names
to be matched against the shape values of the variable.
In effect a named dimension is created with the name
_ARRAY_DIMENSIONS(i) and length shape(i) for all i
in range 0..rank(variable).
Both read and write support is provided.
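
The pairing of attribute names with shape values can be sketched as below. `demo_dim` and `demo_named_dims` are hypothetical names, and the inputs are assumed to have been read already from the variable's `_ARRAY_DIMENSIONS` attribute and its shape metadata.

```c
#include <stddef.h>

/* One simulated named dimension. */
typedef struct {
    const char *name;
    size_t      len;
} demo_dim;

/* Pair names[i] with shape[i] for each i below `rank`.
 * Returns the number of dimensions written, or -1 if `out` cannot hold
 * `rank` entries. */
int demo_named_dims(const char *const names[], const size_t shape[],
                    size_t rank, demo_dim out[], size_t outcap)
{
    size_t i;

    if (rank > outcap)
        return -1;
    for (i = 0; i < rank; i++) {
        out[i].name = names[i];
        out[i].len  = shape[i];
    }
    return (int)rank;
}
```

For a variable with names `time`, `lat`, `lon` and shape `10, 180, 360`, this yields three dimensions of lengths 10, 180 and 360.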

This XArray support is only invoked if the mode value
of "xarray" is defined. So for example, as in this URL.
````
https://s3.us-west-1.amazonaws.com/bucket/dataset#mode=nczarr,xarray,s3
````
Note that the "xarray" mode flag also implies mode flag "zarr", so the above
is equivalent to this URL.
````
https://s3.us-west-1.amazonaws.com/bucket/dataset#mode=nczarr,zarr,xarray,s3
````
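
The implication between mode flags can be sketched with a small parser for the comma-separated `mode=` value. The flag constants and `demo_parse_mode` are hypothetical and cover only the four mode names that appear in the URLs above.

```c
#include <string.h>

/* Hypothetical bit flags for the mode names used in the example URLs. */
enum { DEMO_NCZARR = 1, DEMO_ZARR = 2, DEMO_XARRAY = 4, DEMO_S3 = 8 };

/* Parse a comma-separated mode list such as "nczarr,xarray,s3".
 * Unknown names are ignored. "xarray" also sets the "zarr" flag. */
unsigned int demo_parse_mode(const char *modes)
{
    unsigned int flags = 0;
    const char *p = modes;

    while (p != NULL && *p != '\0') {
        size_t n = strcspn(p, ",");        /* length of the current name */

        if (n == 6 && strncmp(p, "nczarr", 6) == 0)
            flags |= DEMO_NCZARR;
        else if (n == 4 && strncmp(p, "zarr", 4) == 0)
            flags |= DEMO_ZARR;
        else if (n == 6 && strncmp(p, "xarray", 6) == 0)
            flags |= DEMO_XARRAY | DEMO_ZARR;
        else if (n == 2 && strncmp(p, "s3", 2) == 0)
            flags |= DEMO_S3;
        p += n;
        if (*p == ',')
            p++;                           /* skip the separator */
    }
    return flags;
}
```

Under this sketch `nczarr,xarray,s3` and `nczarr,zarr,xarray,s3` produce the same flag set, matching the equivalence of the two URLs.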

The primary change to implement this was to unify the handling
of dimension references in libnczarr/zsync.

A test for this and other pure-zarr features was added as
nczarr_test/run_purezarr.sh

Other changes:
* Make sure distcheck leaves no files around.
* Change the special attribute flag DIMSCALEFLAG to HIDDENATTRFLAG
  to support the xarray attribute.
* Annotate the zmap implementations with feature flags such as
  WRITEONCE (for zip files).
2021-02-24 13:46:11 -07:00
Dan Ibanez
69182dc438 Ensure MPI header found without wrapper 2021-01-19 09:38:07 -07:00
Scot Breitenfeld
a464bea84b removed the check for H5Pset_libver_bounds (HAVE_H5PSET_LIBVER_BOUNDS) since API
function was introduced in 1.8.0 (and some tests used H5Pset_libver_bounds without
checking HAVE_H5PSET_LIBVER_BOUNDS).
2021-01-11 16:36:23 -06:00
Scot Breitenfeld
46b2e1d666 removed the use of H5_VERSION_LT 2021-01-11 10:36:53 -06:00