631 Commits

Author SHA1 Message Date
Dennis Heimbigner
68bcd1122a Enforce that !ENABLE_BYTERANGE => !ENABLE_HDF5_ROS3
This is a follow on to PR https://github.com/Unidata/netcdf-c/pull/1890

Modify configure.ac to enforce that
!ENABLE_BYTERANGE => !ENABLE_HDF5_ROS3
2020-11-28 13:00:06 -07:00
Greg Sjaardema
db0b84252c
Fix undefined struct member access
The `http` field of the hdf5 info struct is not defined unless `ENABLE_HDF5_ROS3` or `ENABLE_BYTERANGE` or `ENABLE_S3_SDK` is defined.  Based on a quick look at the code, I think that the `ENABLE_HDF5_ROS3` define is the relevant one here.  Maybe a better fix is to check if any of them are defined...
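
A minimal sketch of the "check if any of them are defined" alternative mentioned above. The struct, its members, and the `demo_` names are hypothetical stand-ins, not the real netcdf-c definitions; only the three `ENABLE_*` macro names come from the message.

```c
#include <assert.h>
#include <stddef.h>

/* One switch that is true when any of the three features is enabled. */
#if defined(ENABLE_HDF5_ROS3) || defined(ENABLE_BYTERANGE) || defined(ENABLE_S3_SDK)
#define DEMO_HAVE_HTTP_FIELD 1
#else
#define DEMO_HAVE_HTTP_FIELD 0
#endif

/* Hypothetical stand-in for the hdf5 info struct. */
typedef struct demo_hdf5_info {
    int hdfid;
#if DEMO_HAVE_HTTP_FIELD
    struct { int iosp; } http;   /* only present when a feature is enabled */
#endif
} demo_hdf5_info;

/* Reads the http flag only when the member exists; otherwise reports 0. */
static int demo_http_iosp(const demo_hdf5_info *info)
{
#if DEMO_HAVE_HTTP_FIELD
    return info->http.iosp;
#else
    (void)info;
    return 0;
#endif
}
```

Guarding the declaration and every access with the same derived macro keeps the two from drifting apart.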
2020-11-24 08:25:29 -07:00
Dennis Heimbigner
eb3d9eb0c9 Provide a Number of fixes/improvements to NCZarr
Primary changes:
* Add an improved cache system to speed up performance.
* Fix NCZarr to properly handle scalar variables.

Misc. Related Changes:
* Added unit tests for extendible hash and for the generic cache.
* Add config parameter to set size of the NCZarr cache.
* Add initial performance tests but leave them unused.
* Add CRC64 support.
* Move location of ncdumpchunks utility from /ncgen to /ncdump.
* Refactor auth support.

Misc. Unrelated Changes:
* More cleanup of the S3 support
* Add support for S3 authentication in .rc files: HTTP.S3.ACCESSID and HTTP.S3.SECRETKEY.
* Remove the hashkey from the struct OBJHDR since it is never used.
2020-11-19 17:01:04 -07:00
Dennis Heimbigner
d631656966 Remove trailing comma from _NCProperties attribute value.
If NCPROPERTIES_EXTRA (see configure.ac) is defined
but is null or empty, then an extra comma is being generated
at the end of _NCProperties global attribute.

Soln: check for null/empty NCPROPERTIES_EXTRA value.
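
The check can be sketched roughly as follows. This is an illustration only: `demo_build_properties` and the sample strings are hypothetical and do not reproduce the library's provenance code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes "base" or "base,extra" into buf.
 * The comma is emitted only when extra is non-NULL and non-empty.
 * Returns 0 on success, -1 if buf is too small. */
static int demo_build_properties(char *buf, size_t buflen,
                                 const char *base, const char *extra)
{
    int n;
    if (extra != NULL && extra[0] != '\0')
        n = snprintf(buf, buflen, "%s,%s", base, extra);
    else
        n = snprintf(buf, buflen, "%s", base);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Calling it with an empty `extra` yields the base string unchanged, with no trailing comma.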
2020-11-14 15:07:08 -07:00
Dennis Heimbigner
730aa1f6bc Improve the building of NCZARR S3 support in CMake and Autoconf
There were some irregularities in the flags for handling NCZarr S3 support.

The primary change is to regularize the flags controlling this to the following.

1. Automake: --enable-nczarr-s3 and CMake: ENABLE_NCZARR_S3
2. Automake: --enable-nczarr-s3-tests and CMake: ENABLE_NCZARR_S3_TESTS

Flag 1 indicates that NCZarr should be built with S3 support enabled.
Flag 2 indicates that the NCZarr S3 tests should be run.

These two flags are separate because running the NCZarr S3 tests
requires access to protected S3 resources. Currently, running
these tests is restricted to Unidata personnel. However, users
may want to enable S3 support even if they cannot run the tests.
It is, of course, an error to specify 2 without specifying 1.

Additionally, if the AWS S3 SDK library is not found, then the NCZARR S3
support and testing must be disabled. Otherwise an error is signaled
during the build.

Some of these NCZarr and S3 changes are propagated to nc-config.

Misc. Other Changes:

1. Allow testing for CYGWIN or MSVC in shell scripts.
2. Add specific test for HDF5 library version 1.10.6.
   This is encoded as "HDF5_UTF8_PATHS" because that is the first
   version where HDF5 properly supports it under Windows. This is used
   in hdf5internal/nc4_ndf5_ansi_to_utf8.
3. Add an AM Conditional -- AX_IGNORE -- for use in testing
   when it is desirable to temporarily suppress Makefile code.
4. Add MULTIFILTER flag to CMakeLists.txt
2020-10-16 15:04:51 -06:00
Ward Fisher
e4138efa9d
Merge pull request #1851 from brtnfld/master
Replaced deprecated (in 1.8.0) H5Aopen_name with H5Aopen_by_name
2020-10-01 14:25:06 -07:00
Dennis Heimbigner
aeb3ac2809 Mostly revert the filter code to reduce its complexity of use.
re: https://github.com/Unidata/netcdf-c/issues/1836

Revert the internal filter code to simplify it. From the user's
point of view, the only visible changes should be:

1. The functions that convert text to filter specs have had their signature reverted and have been moved to netcdf_aux.h
2. Some filter API functions now return NC_ENOFILTER when inquiry is made about some filter.

Internally, the dispatch table has been modified to get rid of the filter_actions
entry and associated complex structures. It has been replaced with
inq_var_filter_ids and inq_var_filter_info entries and the dispatch table
version has been bumped to 3. Corresponding NOOP and NOTNC4 functions
were added to libdispatch/dnotnc4.c. Also, the filter_action entries
in dispatch tables were replaced for all dispatch code bases (HDF5, DAP2,
etc). This should only impact UDF users.

In the process, it became clear that the form of the filters
field in NC_VAR_INFO_T was format dependent, so I converted it to
be of type void* and pushed its management into the various dispatch
code bases. Specifically libhdf5 and libnczarr now manage the filters
field in their own way.

The auxiliary functions for parsing textual filter specifications
were moved to netcdf_aux.h and were renamed to the following:
* ncaux_h5filterspec_parse
* ncaux_h5filterspec_parselist
* ncaux_h5filterspec_free
* ncaux_h5filter_fix8

Misc. Other Changes:

1. Document NUG/filters.md updated to reflect the changes above.
2. All the old data types (structs and enums)
   used by filter_actions actions were deleted.
   The exception is the NC_H5_Filterspec because it is needed
   by ncaux_h5filterspec_parselist.
3. Clientside filters were removed -- another enhancement
   for which no-one ever asked.
4. The ability to remove filters was itself removed.
5. Some functionality needed by nczarr was moved from libhdf5
   to libsrc4 e.g. nc4_find_default_chunksizes
6. All the filterx code was removed
7. ncfilter.h and nc4filter.c no longer used

Misc. Unrelated Changes:

1. The nczarr_test makefile clean was leaving some directories; so
   add clean-local to take care of them.
2020-09-27 12:43:46 -06:00
Scot Breitenfeld
2620c01067 Replaced deprecated (in 1.8.0) H5Aopen_name with H5Aopen_by_name 2020-09-25 12:17:20 -05:00
Dennis Heimbigner
f3218a2e2c Use the built-in HDF5 byte-range reader, if available.
re: Issue https://github.com/Unidata/netcdf-c/issues/1848

The existing Virtual File Driver built to support byte-range
read-only file access is quite old. It turns out to be extremely
slow (reason unknown at the moment).

Starting with HDF5 1.10.6, the HDF5 library has its own version
of such a file driver. The HDF5 developers have better knowledge
about building such a driver and what incantations are needed to
get good performance.

This PR modifies the byte-range code in hdf5open.c so
that if the HDF5 file driver is available, then it is used
in preference to the one written by the Netcdf group.

Misc. Other Changes:

1. Moved all of nc4print code to ncdump to keep appveyor quiet.
2020-09-24 14:33:58 -06:00
Tobias Kölling
a80d473ca8 Merge remote-tracking branch 'upstream/master' into virtual_datasets 2020-09-15 09:50:25 +02:00
Tobias Kölling
8de6398b99 check if H5D_VIRTUAL exists in installed HDF5 library
Older HDF5 libraries do not support virtual datasets but could otherwise
be supported by netCDF4. This change removes the special case to handle
HDF5 virtual datasets if the installed HDF5 version does not support
virtual datasets.
2020-09-14 17:25:25 +02:00
Dennis Heimbigner
2f0a6d22e9 Fix error where not converting fill data
re: Github Issue https://github.com/Unidata/netcdf-c/issues/1826

It turns out that the common get code (NC4_get_vars) in libhdf5
(and libnczarr) has an optimization where it does not attempt to
read from the file if the file is all fill values. Rather it
just fills the output buffer with the fill value.  The problem
is that -- in that case -- it forgets that conversion might still be
needed.  So the conversion never occurs and the raw bits of
the fill data are stored directly into the memory space.

Solution: move some code around to properly do the
conversion no matter how the data was obtained.
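
To illustrate the bug class (not the actual NC4_get_vars code), here is a small sketch that assumes a fill value stored as `short` in the file and a caller who requested `double` output. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Fills the caller's buffer for the "all fill values" shortcut.
 * The fill value is converted to the memory type first, instead of
 * copying the raw bits of the file type into the output. */
static void demo_fill_as_double(double *out, size_t count, short file_fill)
{
    double converted = (double)file_fill;   /* value conversion, done once */
    for (size_t i = 0; i < count; i++)
        out[i] = converted;
}
```

Copying the two bytes of the `short` into each eight-byte `double` slot instead would produce meaningless values, which is the symptom described above.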

Added test cases nc_test4/test_fillonly.sh and
nczarr_test/test_fillonlyz.sh
2020-09-12 14:49:59 -06:00
Tobias Kölling
69fb44ec4b added NC_VIRTUAL storage layout 2020-09-03 17:51:46 +02:00
Tobias Kölling
a3b753d764 change H5F_CLOSE_SEMI -> H5F_CLOSE_WEAK for nc4_create_file as well
The nc_sync test fails if the settings are different for file creation
and opening.
2020-09-02 20:02:08 +02:00
Tobias Kölling
9f8897762d changed H5Pset_fclose_degree to H5F_CLOSE_WEAK
It seems like it is part of the design of HDF5 virtual datasets that
objects within a file remain opened while the file is already "closed".
Setting the fclose degree to SEMI would cause the library to bail out.
This commit makes nc_test4/tst_virtual_dataset succeed.

See also Unidata/netcdf-c#1799
2020-09-02 16:16:57 +02:00
Tobias Kölling
4c27730ae3 hdf5: added unknown storage specification
In case HDF5 adds more storage specifications, netcdf4 should be able to
cope with them by default. Further specializations could be added
nonetheless.
2020-09-02 16:13:23 +02:00
Ward Fisher
31dee0c4da
Revert "Revert "Fix nczarr-experimental: improve build support, disengage hdf5 vs netcdf4 flags, and find AWS libraries"" 2020-08-17 19:15:47 -06:00
Ward Fisher
16c27ca13f
Revert "Fix nczarr-experimental: improve build support, disengage hdf5 vs netcdf4 flags, and find AWS libraries" 2020-08-17 15:51:01 -06:00
Dennis Heimbigner
d85bb6fe20 The big change for this commit is complete the
disengagement of enable-netcdf4 from enable-hdf5.
That is, with the advent of nczarr, it is possible
to turn off hdf5 but still need netcdf-4 enabled
because nczarr uses libsrc4, but not libhdf5.
This change involves a bunch of things:
1. Modify configure.ac and CMakelist to make enable_hdf5
   control if hdf5 support is provided. For back compatibility,
   disable-netcdf4 is treated as disable-hdf5. But internally,
   netcdf4 support is controlled only by the enabling of formats
   that require it.
2. In support of #1, modify .travis.yml to use enable/disable-hdf5
   instead of enable/disable-netcdf4.
3. test_common.in is modified to track selected features,
   including enable-hdf5 and enable-s3-tests. This is used in
   selected tests that mix netcdf-3 and netcdf4 tests.
4. The conflation of USE_HDF5 and USE_NETCDF4 is common in
   code, tests, and build files, so all of those had to be weeded out.
5. It turns out that some of the NC4_dim functions really are HDF5 specific,
   but are not treated as such. So they are moved from nc4dim.c to
   hdf5dim.c or hdf5dispatch.c
6. Some generic functions in libhdf5 can be (and were) moved to libsrc4.
2020-08-12 15:42:50 -06:00
bombipappoo
ecbb0f5bbf Convert filename from ANSI to UTF-8 before calling HDF5. 2020-07-14 22:44:42 +09:00
Ward Fisher
0825c9767f Merge branch 'ejh_par_test' of https://github.com/NOAA-GSD/netcdf-c into NOAA-GSD-ejh_par_test 2020-07-09 17:29:45 -06:00
Ward Fisher
7d2a646f25 Merge branch 'ejh_fix_redef' of https://github.com/NOAA-GSD/netcdf-c into NOAA-GSD-ejh_fix_redef 2020-07-09 13:55:37 -06:00
Edward Hartnett
3e60a863de fixed warning in hdf5filter.c 2020-07-08 11:24:54 -06:00
Edward Hartnett
832fbf19c8 now don't return error on second redef call for netcdf/HDF5 files 2020-07-08 11:10:15 -06:00
Edward Hartnett
ac3b77d418 merged in changes from ejh_test_szip_unlim 2020-07-04 07:43:50 -06:00
Edward Hartnett
4b78c0c4a3 merged master 2020-07-03 13:57:47 -06:00
Edward Hartnett
6c112efb8e fixed problem setting szip on var with unlimited dim and added test 2020-07-02 10:55:34 -06:00
Edward Hartnett
dc37446a5f more test development 2020-06-29 09:01:24 -06:00
Edward Hartnett
467f342ae9 further test development 2020-06-29 08:35:11 -06:00
Dennis Heimbigner
59e04ae071 This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".

The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.

More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).

WARNING: this code has had limited testing, so do not use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:

Platform       | Build System | S3 support
---------------|--------------|-----------
Linux+gcc      | Automake     | yes
Linux+gcc      | CMake        | yes
Visual Studio  | CMake        | no

Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future.  Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.

In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*.  The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
   and the version bumped.
4. An overly complex set of structs was created to support funnelling
   all of the filterx operations thru a single dispatch
   "filter_actions" entry.
5. Move common code from libhdf5 to libsrc4 so that it is accessible
   to nczarr.

Changes directly related to Zarr:
1. Modified CMakeLists.txt and configure.ac to support both C and C++
   -- this is in support of S3 support via the aws-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
   support zarr and to regularize the structure of the fragments
   section of a URL.

Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
   e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
   * Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
   and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.

Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-28 18:02:47 -06:00
Greg Sjaardema
edf0ca6c98
Avoid potential integer overrun
It is possible for the values stored to `file_value_size` to overrun the storage capacity of a 32-bit integer.  The value does need to store negative values potentially, so it cannot be `size_t` or `hsize_t`; use `hssize_t`, which is a signed 64-bit value.  Could also use `ssize_t`, but that is not used in this routine...
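
The overrun can be illustrated with a standalone sketch. It does not reproduce the routine in question: `int64_t` stands in for `hssize_t`, and the multiplication is a hypothetical example of how such a size might be computed.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if value is representable as a signed 32-bit integer. */
static int demo_fits_int32(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}

/* Computes a byte count in 64-bit arithmetic so large products
 * are not truncated; the result can still hold negative values. */
static int64_t demo_value_size(int64_t nelems, int64_t elem_size)
{
    return nelems * elem_size;
}
```

For example, one billion 8-byte elements need 8,000,000,000 bytes, which fits in the signed 64-bit type but not in a 32-bit integer.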
2020-06-10 15:42:22 -06:00
Dennis Heimbigner
84c69afca7 Allow redefinition of variable filters
re: Github issue https://github.com/Unidata/netcdf-c/issues/1713

If nc_def_var_filter or nc_def_var_deflate or nc_def_var_szip is
called multiple times with the same filter id, but possibly with
different sets of parameters, then the first invocation is
sticky and later invocations are ignored. The desired behavior
is to have the last invocation be used.

This PR implements that desired behavior, with some special
cases.  If you call nc_def_var_deflate multiple times, then the
last invocation rule applies with respect to deflate. However,
the shuffle filter, if enabled, is always applied just before
applying deflate.
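
A rough sketch of the "last invocation wins" rule, using a hypothetical per-variable filter list. The `demo_` types and limits are assumptions for illustration and are not the library's data structures; the shuffle special case is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_FILTERS 8
#define DEMO_MAX_PARAMS  4

typedef struct {
    unsigned int id;
    size_t nparams;
    unsigned int params[DEMO_MAX_PARAMS];
} demo_filter;

typedef struct {
    size_t nfilters;
    demo_filter filters[DEMO_MAX_FILTERS];
} demo_var;

/* Defines a filter on a variable. If the id is already present, its
 * parameters are overwritten, so the last call determines the result.
 * Returns 0 on success, -1 when a limit is exceeded. */
static int demo_def_filter(demo_var *var, unsigned int id,
                           size_t nparams, const unsigned int *params)
{
    demo_filter *slot = NULL;
    if (nparams > DEMO_MAX_PARAMS)
        return -1;
    for (size_t i = 0; i < var->nfilters; i++) {
        if (var->filters[i].id == id) {
            slot = &var->filters[i];   /* redefinition of an existing filter */
            break;
        }
    }
    if (slot == NULL) {
        if (var->nfilters == DEMO_MAX_FILTERS)
            return -1;
        slot = &var->filters[var->nfilters++];
        slot->id = id;
    }
    slot->nparams = nparams;
    if (nparams > 0)
        memcpy(slot->params, params, nparams * sizeof *params);
    return 0;
}
```

Defining the same id twice leaves a single entry whose parameters come from the second call, while the filter keeps its original position in the list.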

Misc unrelated changes:
1. Make client-side filters be disabled by default
2. Fix the definition of uintptr_t and use in oc2 and libdap4
3. Add some test cases
4. modify filter order tests to use plugin filters rather
   than client-side filters
2020-05-11 09:42:31 -06:00
Edward Hartnett
6aa6eff710 now properly setting HDF5 file cache for files created/opened sequentially on parallel IO builds 2020-05-08 11:00:56 -06:00
Edward Hartnett
e3c9e83ecf adding internal function, plus some documentation 2020-05-08 08:58:42 -06:00
Greg Sjaardema
3e919a568f
Remove line that was missed in original patch 2020-04-30 14:00:18 -06:00
Greg Sjaardema
1db3d07beb
Proof-of-Concept: Avoid N^2 behavior in NC4_inq_dim
The current library seems to have some behavior which is N^2 in the number of vars in a file.

The `NC4_inq_dim` routine calls down to `nc4_find_dim_len` which iterates through each `var` in the file/group and calls `find_var_dim_max_length` on each var and finds the largest length of the dim on each of those vars. This is done only for unlimited vars.

I have a file with 129 dim and 1630 vars.  The unlimited dimension is of length 41.  In my test program, I am reading data from 4 files which have the same dim and var count and reading every 4th time step (unlimited dimension).  If I run a profile, I see that 98.2% of the program time is in the `nc_get_vara_float` call tree and most of that is in `find_var_dim_max_length` (94.8%).

There are 66,142 calls to `nc_get_vara_float` resulting in 107,307,290 calls to `find_var_dim_max_length` with twice that number of calls to `malloc/free` and calls to 5 HDF5 routines.  All of this, at least in my case, to return the same `41` each time.

The proof of concept patch here will check whether the file is read-only (or no_write) and if so, it will cache the value of the dim length the first time it is calculated.   With this change, my example run is sped up by a factor of 60.  The time for `NC4_inq_dim` and below drops from 97.2% down to 2.7%.
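
The caching idea can be sketched as below. This is not the patch itself: `demo_dim`, its fields, and the scan counter are hypothetical, and the per-variable extents are passed in as a plain array instead of being read through HDF5.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int read_only;            /* file opened without write access */
    int len_cached;           /* nonzero once cached_len is valid */
    size_t cached_len;
    size_t nvars;
    const size_t *var_lens;   /* extent of each var along this dimension */
    unsigned long scans;      /* counts full scans, for demonstration */
} demo_dim;

/* The expensive path: visit every variable and take the maximum. */
static size_t demo_scan_max_len(demo_dim *dim)
{
    size_t max = 0;
    dim->scans++;
    for (size_t i = 0; i < dim->nvars; i++)
        if (dim->var_lens[i] > max)
            max = dim->var_lens[i];
    return max;
}

/* Returns the dimension length, reusing the first result when the
 * file is read-only and the length therefore cannot change. */
static size_t demo_inq_dim_len(demo_dim *dim)
{
    if (dim->read_only && dim->len_cached)
        return dim->cached_len;
    size_t len = demo_scan_max_len(dim);
    if (dim->read_only) {
        dim->cached_len = len;
        dim->len_cached = 1;
    }
    return len;
}
```

For a writable file the scan still runs on every call, since another write could extend the dimension.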

I'm not sure whether this is the correct fix, or if there is some behavior that I am overlooking, but my users would definitely like a 10 second run compared to a 10 minute run... 

This is on current Netcdf master branch.

I will try to attach some valgrind/callgrind profiles.
2020-04-30 11:01:10 -06:00
Scot Breitenfeld
7b1b06b5ca Merge remote-tracking branch 'upstream/master' 2020-04-23 15:36:14 -05:00
Dennis Heimbigner
b0e0d81aa9 Fix reclamation of the ->format_XXX_info fields
nc4internal.c contains code to free the format_XXX_info
fields. Since these are format specific, this code
was moved to the dispatch code (libhdf5 and libhdf4
in the current case).

Additionally, there are some fields in nc4internal.h (e.g.
dimscale fields) that are specific to HDF5 and have been moved
to the corresponding HDF5 data structures and code.

Misc. other changes:
1. NC_VAR_INFO_T->hdf5_name renamed to alt_name to avoid
   implying it is necessarily HDF5 specific.
2. prefix NC_FILE_INFO_T with an instance of NC_OBJ for consistency.
   this also requires wrapping move_in_NCList() to keep
   hdr.id consistent.
2020-03-29 12:48:59 -06:00
Edward Hartnett
e7b9b1b587 fixed documentation of cache int functions 2020-03-24 15:02:42 -06:00
Scot Breitenfeld
c5d2e99417 Updated to use H5O_info2_t for HDF5 1.12 and the use of H5Oget_info3 instead of H5Gget_objinfo 2020-03-12 15:50:24 +00:00
Edward Hartnett
b29f9f34a0 whitespace cleanup 2020-03-08 09:10:07 -06:00
Edward Hartnett
4c7e162f34 less use of contiguous/compact field 2020-03-08 07:31:21 -06:00
Edward Hartnett
053752440b stop setting contiguous field in nc4hdf5.c 2020-03-08 07:18:52 -06:00
Edward Hartnett
04eafff166 stop setting contiguous field in hdf5filter.c 2020-03-08 07:18:11 -06:00
Edward Hartnett
5574317db7 stop setting contiguous/compact fields at file open 2020-03-08 07:17:01 -06:00
Edward Hartnett
61357cfd4d more use of storage field 2020-03-08 07:09:15 -06:00
Edward Hartnett
1761850795 continuing to switch to storage field 2020-03-08 07:05:51 -06:00
Edward Hartnett
b98a37e0b3 using storage field in nc4var.c 2020-03-08 06:38:44 -06:00
Edward Hartnett
119e8e9465 using storage in hdf5filter.c 2020-03-08 06:31:34 -06:00
Edward Hartnett
8dec9f6c99 now setting storage field when setting var storage 2020-03-08 06:29:49 -06:00
Edward Hartnett
d87a073a34 starting to use storage field when opening file 2020-03-08 06:21:08 -06:00
Edward Hartnett
0c419ec582 removed commented-out code 2020-03-06 09:57:33 -07:00
Edward Hartnett
502336c2c7 now return NC_EINVAL on attempt to set chunking on scalar var 2020-03-03 11:57:16 -07:00
Dennis Heimbigner
73537603e2 Make scalar X filter return an error instead of ignoring it 2020-03-02 15:10:54 -07:00
Dennis Heimbigner
420fdf4625 fix memory allocation failure in hdf5var.c 2020-03-02 11:45:41 -07:00
Dennis Heimbigner
7d1ca9ac85 fix references to var->deflate 2020-03-02 11:12:30 -07:00
Dennis Heimbigner
e66c727c28 Fix Filters x compact 2020-02-29 15:33:27 -07:00
Dennis Heimbigner
f376c23329 Make utilities support NC_COMPACT
re: https://github.com/Unidata/netcdf-c/issues/1642

Modify ncdump, nccopy, and ncgen to support the NC_COMPACT storage option.
Added test cases and added description to the man pages for the utilities.

1. ncdump: For a compact storage variable, print the special attribute _Storage as
````
    <var>: _Storage = "compact";
````

2. ncgen: parse and implement
````
    <var>: _Storage = "compact";
````
in a .cdl file

3. nccopy: Extend the chunk specification (-c flag) to support
   compact using the forms
````
nccopy ... -c <var>:compact
and
nccopy ... -c <var>:contiguous
````

Misc. other changes
1. cleanup the copy_chunking function in ncdump/nccopy.c
2020-02-29 12:06:21 -07:00
Dennis Heimbigner
10d227fc1b fix parallel filter error discovered by Hartnett 2020-02-28 11:36:58 -07:00
Dennis Heimbigner
a3a3e15cb1 fix bad edit 2020-02-27 15:33:39 -07:00
Dennis Heimbigner
afe5a2998c
Merge branch 'master' into multifilter.dmh 2020-02-27 15:02:27 -07:00
Dennis Heimbigner
b488c272d5 Fix conflicts with master 2020-02-27 14:06:45 -07:00
Edward Hartnett
6f95f655dd
Merge branch 'master' into ejh_dispatch 2020-02-26 16:12:57 -07:00
Edward Hartnett
418e428a05 fixed problem with scalar compact 2020-02-26 09:13:12 -07:00
Edward Hartnett
b31aedcc8e all tests passing but compact storage for scalars not being properly written in file yet 2020-02-26 08:14:06 -07:00
Edward Hartnett
6241a6e7a0 more tests for storage, changed var names to reflect storage 2020-02-25 15:55:34 -07:00
Edward Hartnett
2ff24bd6fe more tests for compact storage 2020-02-25 13:30:38 -07:00
Dennis Heimbigner
44d0dcaad2 Add support for multiple filters per variable.
re: https://github.com/Unidata/netcdf-c/issues/1584

Support has been added for multiple filters per variable.  This
affects a number of components in netcdf. The new APIs are
documented in NUG/filters.md.

The primary changes are:
* A set of new functions are provided (see __include/netcdf_filter.h__).
    - Obtain a list of the filters associated with a variable
    - Obtain the parameters for a specific filter.
* The existing __nc_inq_var_filter__ function now returns info
  about the first defined filter.
* The utilities (ncgen, ncdump, and nccopy) now support
  an extended format for specifying a sequence of filters.
  The general form is __<filter>|<filter>...__.
* The ncdump **_Filter** attribute now dumps a list of all the
  filters associated with a variable using the above new format.
* Filter specifications can now use a filter name instead of number
  for filters known to the netcdf library, which in turn is taken
  from the HDF5 filter registration page.
* New errors are defined: NC_EFILTER and NC_ENOFILTER. The latter
  is returned if an attempt is made to access an unknown filter.
* Internally, the dispatch table has been extended to add a function
  to handle all of the filter functions.
* New, filter-related, tests were added to nc_test4.
* A new plugin was added to the plugins directory to help with testing.

Notes:
1. The shuffle and fletcher32 filters are not part of the multifilter system.

Misc. changes:
1. A debug module was added to libhdf5 to help catch error locations.
2020-02-16 12:59:33 -07:00
Edward Hartnett
a8684c730c fixed merge conflict in RELEASE_NOTES 2020-02-15 06:42:49 -07:00
Edward Hartnett
05a6ff74b2 merged changes from master 2020-02-11 17:19:53 -07:00
Edward Hartnett
15059a18b7 merged changes from master 2020-02-11 17:19:25 -07:00
Edward Hartnett
a0839a2a7a added version to dispatch table 2020-02-09 13:07:58 -07:00
Edward Hartnett
b7ac19a43f only close non-zero typeids 2020-02-09 12:03:21 -07:00
Edward Hartnett
af6b6787bf fix for memory leak due to HDF5 types 2020-02-09 11:47:13 -07:00
Edward Hartnett
8057a552ef move nc_def_var_szip function so it will appear in the documentation 2020-02-07 09:09:01 -07:00
Edward Hartnett
558988bb18 fixed docs, removed unneeded defines in test 2020-02-07 07:54:12 -07:00
Edward Hartnett
c4d3937099 now check number of elements in chunk against pixels_per_block for szip compression 2020-02-07 07:03:40 -07:00
Edward Hartnett
ff7280512e checking for some bad pixels_per_block values for szip 2020-02-07 06:53:52 -07:00
Edward Hartnett
6d2d751e4e disallow zlib if szip already in use 2020-02-07 05:01:06 -07:00
Edward Hartnett
dc4e880c37 disallow szip if zlib already in use 2020-02-07 04:46:15 -07:00
Edward Hartnett
6b2947813f adding test for zlib+szip in HDF5 2020-02-07 03:38:43 -07:00
Edward Hartnett
1817790c6b rely completely on nc_def_var_filter for setting szip 2020-02-05 10:32:16 -07:00
Edward Hartnett
517ef4f257 use nc_def_var_filter in nc_def_var_szip 2020-02-05 10:25:30 -07:00
Edward Hartnett
52d745de68 now remember szip setting in filter fields 2020-02-04 08:40:15 -07:00
Ward Fisher
aadd5a2d81
Merge pull request #1589 from NOAA-GSD/ejh_szip
re-implement the nc_def_var_szip() function, including for parallel I/O
2020-01-22 16:27:33 -07:00
Edward Hartnett
6103f442cb fixed compile error 2020-01-21 08:50:38 -07:00
Edward Hartnett
c839a2d6c5
Merge branch 'master' into ejh_var_cache 2020-01-21 07:48:59 -07:00
Edward Hartnett
e94615a0e5
Merge branch 'master' into ejh_szip 2020-01-16 08:49:12 -07:00
Ward Fisher
8771d0bdf4
Merge pull request #1582 from NOAA-GSD/ejh_parallel_zlib
Allow user to turn on zlib, shuffle, and/or fletcher32 filters with parallel I/O for HDF5-1.10.2+
2020-01-13 16:06:51 -07:00
Dennis Heimbigner
f587654670 Make the dap4 code resistant to various server errors.
Some versions of some servers are returning malformed responses.
Make the library either handle them or gracefully fail.
The three server errors "fixed" here are as follows.
1. The attribute _NCProperties sometimes has a trailing nul character
   in its value. Soln is to elide the nul(s).
2. Sometimes a DAP response has no data part, only a DMR.
   Soln is to detect and return an error code instead of crashing.
3. Sometimes a server returns a redirection, but our current
   openmagic() function was not following the redirect. Soln
   is to follow redirects.
Also because of #2, I am temporarily making --disable-dap-remote-tests
be the default.
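
The first fix above (eliding trailing nul characters) might look roughly like this. The helper name is hypothetical and the attribute value is assumed to arrive as a byte buffer with an explicit length.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the length of value with any trailing nul bytes dropped.
 * The buffer itself is not modified. */
static size_t demo_trim_trailing_nuls(const char *value, size_t len)
{
    while (len > 0 && value[len - 1] == '\0')
        len--;
    return len;
}
```

Embedded nul bytes that are followed by other characters are left alone; only the tail is trimmed.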
2020-01-08 15:18:31 -07:00
Ward Fisher
d9eb078bfd
Merge pull request #1592 from Unidata/travis_typo_fix.wif
Correct a typo in travis.yml
2020-01-07 18:31:53 -07:00
Ward Fisher
438119dd69
Merge pull request #1560 from NOAA-GSD/ejh_cache_docs
increase default cache size for netCDF-4/HDF5 files, also improve cache docs and add benchmarking program
2020-01-07 11:46:30 -07:00
Ward Fisher
72b79ac376 Cleaned up an 'uninitialized variable' issue reported by static analysis. Minor fix, rolling in to this PR rather than spinning up a separate one. 2020-01-07 11:43:50 -07:00
Edward Hartnett
184507be5f now using members in NC_VAR_INFO_T to hold szip info 2020-01-06 08:46:03 -07:00
Edward Hartnett
3e3b83bdbc whitespace cleanup 2020-01-06 08:09:20 -07:00
Edward Hartnett
6af1b0bd91 changed error code in nc_def_var_szip() to NC_EFILTER 2020-01-06 07:51:04 -07:00
Edward Hartnett
e703a7678c first stab at re-adding nc_def_var_szip() 2020-01-03 11:38:45 -07:00
Edward Hartnett
808a0e2be9 merged ejh_parallel_zlib 2020-01-02 14:25:31 -07:00
Ward Fisher
8f2be58d95
Merge pull request #1566 from NetCDF-World-Domination-Council/ejh_unlim_dims
Fix problems with read past end of dataset but within dimension length for vars with multiple unlimited dimensions
2019-12-23 15:08:56 -07:00
Edward Hartnett
680e44f628 changed name of macro 2019-12-20 13:58:01 -07:00
Edward Hartnett
995cfdad96 merged master 2019-12-20 11:16:11 -07:00
Edward Hartnett
a06df0e4eb fixing for non-parallel builds 2019-12-20 07:52:00 -07:00
Edward Hartnett
accb83a8b5 even more documentation updates 2019-12-20 07:20:02 -07:00
Edward Hartnett
4b7f839666 switch to collective access when filters are applied 2019-12-20 07:00:12 -07:00
Edward Hartnett
f86c0fb8f9 now check that HDF5 version supports parallel zlib 2019-12-20 05:54:21 -07:00
Edward Hartnett
d534b1298a adding another zlib parallel I/O test 2019-12-20 05:28:20 -07:00
Ward Fisher
6c75e97764
Merge pull request #1570 from NOAA-GSD/ejh_compact
enable compact storage for netcdf-4 vars
2019-12-19 16:47:05 -07:00
Edward Hartnett
3e00967879 allow parallel writes to use zlib 2019-12-19 09:19:23 -07:00
Ward Fisher
29d070c50f
Merge pull request #1564 from NetCDF-World-Domination-Council/ejh_docs_cleanup
fix memory issue that may occur for some HDF5 file opens
2019-12-17 16:18:44 -07:00
Edward Hartnett
bacf017699 better handling of var cache for parallel builds 2019-12-17 06:39:09 -07:00
Edward Hartnett
fd604ddb06 fixed comment 2019-12-16 15:44:14 -07:00
Edward Hartnett
66a2b4c05e more testing for compact vars 2019-12-16 09:37:54 -07:00
Edward Hartnett
06896f432d got compact storage test working 2019-12-04 08:49:37 -07:00
Edward Hartnett
82df2876b6 starting to support compact storage 2019-12-04 07:53:37 -07:00
Edward Hartnett
e52a74520e tests and fix for multiple unlimited dim bug 2019-12-01 15:05:09 -07:00
Edward Hartnett
c5c38148bd moved udata.grps initialization to avoid memory problem on BAIL 2019-12-01 07:37:32 -07:00
Edward Hartnett
5ab7bf7796 now always relax! 2019-11-26 05:36:16 -07:00
Edward Hartnett
2682ffd68d improved docs for cache functions, added libhdf5/hdf5cache.c to Doxyfile.in, added benchmark program for cache settings 2019-11-25 16:33:04 -07:00
Ward Fisher
923d4ccbff
Merge pull request #1530 from NetCDF-World-Domination-Council/ejh_endianness
now testing that endianness can only be set on atomic ints and floats
2019-11-15 15:27:35 -07:00
edwardhartnett
965da1de01 now testing that endianness can only be set on atomic ints and floats 2019-11-15 11:10:10 -07:00
edwardhartnett
8083b3596e fixed problem of unlim dim and var sharing the same name but not being related 2019-11-15 09:18:42 -07:00
Ward Fisher
1a6351dab2
Merge pull request #1521 from ckhroulev/netcdf4-repeated-attribute-modification
Improve the fix for #350 included in #1119
2019-11-14 17:01:09 -07:00
edwardhartnett
d73611de73 now handle two anon dimensions of same size used in same HDF5 var 2019-11-14 06:54:22 -07:00
edwardhartnett
6b9248cef8 adding test 2019-11-14 06:09:45 -07:00
Constantine Khrulev
6abbf8d429 Whitespace changes 2019-11-13 10:07:10 -09:00
Constantine Khrulev
098f2c1056 Modify the condition used to check if an attribute can be re-used
This should make the code a bit cleaner.
2019-11-13 08:38:12 -09:00
Constantine Khrulev
dd181deca9 Improve the fix for #350 included in #1119
1) We have to use H5Tequal() to compare HDF5 type IDs.
2) When checking if we can re-use an NC_CHAR attribute it is enough to
   compare data types (H5Tequal() takes care of the size comparison).
3) This commit adds missing code (reuse_att was set but not used).

Now an attribute in a NetCDF-4 file can be modified as many times as
necessary, as long as its type and length remain the same.

Modifications changing either type or length of an attribute require
deleting and re-creating an attribute which increments the attribute
order creation index. Once this index reaches 65535 all attribute
modifications (for a particular group or variable) will fail.
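
The re-use rule described above can be sketched in plain C as follows. This is a minimal illustration, not the code from this commit: `att_desc`, `types_equal()`, `can_reuse_att()` and `modify_att()` are hypothetical names, `types_equal()` only stands in for what H5Tequal() does on real HDF5 type IDs, and the sketch assumes that only the delete-and-re-create path is blocked once the index limit is hit.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CREATION_ORDER 65535u /* limit quoted in the commit message */

/* Hypothetical description of an attribute as stored in the file. */
typedef struct {
    int type_class;   /* stand-in for an HDF5 type ID */
    size_t type_size; /* size of one element of that type, in bytes */
    size_t len;       /* number of elements */
} att_desc;

/* Hypothetical stand-in for H5Tequal(): compares type class and size. */
static bool types_equal(const att_desc *a, const att_desc *b)
{
    return a->type_class == b->type_class && a->type_size == b->type_size;
}

/* An attribute can be overwritten in place if type and length match. */
static bool can_reuse_att(const att_desc *existing, const att_desc *wanted)
{
    return types_equal(existing, wanted) && existing->len == wanted->len;
}

/* Apply one modification. Returns 0 on success, -1 if the attribute would
 * have to be re-created but the creation order index is exhausted. */
static int modify_att(att_desc *existing, const att_desc *wanted,
                      unsigned *creation_index)
{
    if (can_reuse_att(existing, wanted))
        return 0; /* overwrite in place; the index does not change */
    if (*creation_index >= MAX_CREATION_ORDER)
        return -1; /* cannot delete and re-create any more */
    *existing = *wanted; /* models delete + re-create */
    (*creation_index)++;
    return 0;
}
```

Calling `modify_att()` repeatedly with an unchanged type and length leaves `creation_index` untouched, which is the behavior the commit is aiming for.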

For reference:

Issue 350 title: NetCDF-4 limits the number of times an attribute can
be modified

Pull request 1119 title: Fix checking for HDF5 max dims, no longer
re-create atts if not needed, confirm behavior for HDF5 cyclical
files, allow user to set mpiexec
2019-11-12 21:45:47 -09:00
Dennis Heimbigner
f1506d552e Change (again), and hopefully simplify, the file model inference algorithm.
* For URL paths, the new approach essentially centralizes all information
  in the URL into the "#mode=" fragment key and uses that value
  to determine the dispatcher for (most) URLs.

* The new approach has the following steps:

  1. canonicalize the path if it is a URL.
  2. use the mode= fragment key to determine the dispatcher
  3. if dispatcher still not determined, then use the mode flags
     argument to nc_open/nc_create to determine the dispatcher.
  4. if the path points to something readable, attempt to read the
     magic number at the front, and use that to determine the dispatcher.
     This case may override all previous cases.

* Misc changes.

  1. Update documentation
  2. Moved some unit tests from libdispatch to unit_test directory.
  3. Fixed use of wrong #ifdef macro in test_filter_reg.c
     [I think this may fix a previously reported esupport query].
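
The inference steps listed above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the library's implementation: `file_model`, `FLAG_HDF5`, the `"dap"` and `"hdf5"` fragment values, and all function names are hypothetical. URL canonicalization (step 1) is omitted, and the fallback to the classic model in step 3 is an assumption. Only the two magic numbers (`CDF` for classic files and the 8-byte HDF5 signature) are taken from the respective file formats.

```c
#include <stddef.h>
#include <string.h>

typedef enum { MODEL_UNKNOWN, MODEL_CLASSIC, MODEL_HDF5, MODEL_DAP } file_model;

#define FLAG_HDF5 0x1 /* hypothetical mode flag, not the real NC_NETCDF4 value */

/* Step 2: look at the "#mode=" fragment key (values here are illustrative). */
static file_model model_from_fragment(const char *path)
{
    const char *frag = strstr(path, "#mode=");
    if (frag == NULL)
        return MODEL_UNKNOWN;
    frag += strlen("#mode=");
    if (strncmp(frag, "dap", 3) == 0)
        return MODEL_DAP;
    if (strncmp(frag, "hdf5", 4) == 0)
        return MODEL_HDF5;
    return MODEL_UNKNOWN;
}

/* Step 4: inspect the magic number at the front of a readable file. */
static file_model model_from_magic(const unsigned char *magic, size_t len)
{
    static const unsigned char hdf5_sig[8] =
        {0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n'};
    if (magic != NULL && len >= 8 && memcmp(magic, hdf5_sig, 8) == 0)
        return MODEL_HDF5;
    if (magic != NULL && len >= 3 && memcmp(magic, "CDF", 3) == 0)
        return MODEL_CLASSIC;
    return MODEL_UNKNOWN;
}

/* Combine the steps; pass magic == NULL when the path is not readable. */
static file_model infer_model(const char *path, int mode_flags,
                              const unsigned char *magic, size_t magic_len)
{
    file_model model = model_from_fragment(path);
    if (model == MODEL_UNKNOWN) /* step 3: fall back to the mode flags */
        model = (mode_flags & FLAG_HDF5) ? MODEL_HDF5 : MODEL_CLASSIC;
    file_model seen = model_from_magic(magic, magic_len);
    if (seen != MODEL_UNKNOWN) /* step 4 may override all earlier steps */
        model = seen;
    return model;
}
```

Applying the magic-number check last is what lets it override both the fragment key and the mode flags, matching the ordering given in the step list.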
2019-09-29 12:59:28 -06:00
Greg Sjaardema
56c0d5cf8a Spelling fixes 2019-09-18 08:03:01 -06:00
edwardhartnett
2077729abc removed base_pe functions from dispatch table 2019-08-15 06:51:06 -06:00
edwardhartnett
3c9a25b688 whitespace cleanup 2019-08-03 09:04:58 -06:00
edwardhartnett
7ce322a6f1 now have libhdf5 use nc4_file_list_add() 2019-08-02 09:29:18 -06:00
edwardhartnett
fc1d9baf43 fixed spacing 2019-08-01 18:24:11 -06:00
edwardhartnett
cb3101c59c clean up 2019-08-01 16:13:32 -06:00
edwardhartnett
170c5b0901 removed NC from open in dispatch table 2019-08-01 14:30:20 -06:00
edwardhartnett
e71ff09a0c removed need for NC4_open to have NC passed in 2019-08-01 11:23:58 -06:00
edwardhartnett
f410b31b72 whitespace cleanup of hdf5open.c, plus extra documentation 2019-08-01 11:17:31 -06:00
Ed Hartnett
abfec2ee6e adding missing semicolon 2019-07-28 15:48:25 -06:00
Ed Hartnett
9d128c35b1 comment fix 2019-07-28 13:52:05 -06:00
Ed Hartnett
fb54bf7808 removed unneeded setting of int_ncid by libhdf5 layer 2019-07-28 13:43:01 -06:00
Ward Fisher
9b7472f7ca
Merge pull request #1442 from rouault/fix_NC4_get_vars_with_unlimited_dim
NC4_get_vars(): fix out-of-bounds write with unlimited dimension
2019-07-24 13:48:25 -06:00
Dennis Heimbigner
4c92fc3405 Remove netcdf-4 conditional on the dispatch table.
Partially address: https://github.com/Unidata/netcdf-c/issues/1056

Currently, some of the entries in the dispatch table
are conditional'd on USE_NETCDF4.

As a step in upgrading the dispatch table for use
with user-defined tables, we remove that conditional.
This means that all dispatch tables must implement the
netcdf-4 specific functions even if only to make them
return NC_ENOTNC4. To simplify this, a set of default
functions are defined in libdispatch/dnotnc4.c to provide this
behavior. The file libdispatch/dnotnc3.c is also relevant to
this.

The primary fix is to modify the various dispatch tables to
remove the conditional and use the functions in
libdispatch/dnotnc4.c as appropriate. In practice, all of the
existing tables are prepared to handle this, so the only
real change is to remove the conditionals.
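
As a rough sketch of the idea (not the actual contents of libdispatch/dnotnc4.c), an unconditional table with default stubs might look like this. The struct layout, the function names, and `SKETCH_ENOTNC4` are hypothetical; the latter only stands in for NC_ENOTNC4, and its numeric value here is an assumption.

```c
#include <stddef.h>

#define SKETCH_NOERR 0
#define SKETCH_ENOTNC4 (-111) /* stand-in for NC_ENOTNC4; value assumed */

/* Hypothetical dispatch table: every entry is always present,
 * with no #ifdef around the netcdf-4 specific ones. */
typedef struct {
    int (*close)(int ncid);
    int (*def_grp)(int ncid, const char *name, int *new_ncid); /* netcdf-4 only */
} sketch_dispatch;

/* Default stub shared by every format that has no groups. */
static int notnc4_def_grp(int ncid, const char *name, int *new_ncid)
{
    (void)ncid;
    (void)name;
    (void)new_ncid;
    return SKETCH_ENOTNC4;
}

/* A classic-format close that always succeeds in this sketch. */
static int classic_close(int ncid)
{
    (void)ncid;
    return SKETCH_NOERR;
}

/* The classic table fills its netcdf-4 slot with the default stub. */
static const sketch_dispatch classic_table = {
    classic_close,
    notnc4_def_grp,
};
```

Because no slot is ever NULL or compiled out, callers can invoke any entry without checking which build options were enabled.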

Misc. Unrelated fixes
1. Fix some annoying warnings in ncvalidator.

Notes:
1. This has not been tested with either pnetcdf or hdf4 enabled.
   When those are enabled, it is possible that there are still
   some conditionals that need to be fixed.
2019-07-20 13:59:40 -06:00
Even Rouault
77ffbce43b
NC4_get_vars(): fix out-of-bounds write with unlimited dimension
This fixes an issue hit by GDAL, and that is found in netcdf 4.6.3
and 4.7.0

git bisect traced the start of the problem to

```
77ab979c5f is the first bad commit
commit 77ab979c5f
Author: Ed Hartnett <edwardjameshartnett@gmail.com>
Date:   Sat Jun 16 09:58:48 2018 -0600

    using get_vars but not put_vars

:040000 040000 8611e77aae fc9ffd1d13 M	libsrc4
```

where nc_get_vara_double() started using nc4_get_vars() underneath.

It turns out that nc4_get_vars() was buggy in the situation exercised by GDAL.

This can be reproduced with the following simple test case:

```

#include <assert.h>
#include <stdlib.h>
#include <netcdf.h>

int main()
{
    int status;
    int cdfid = -1;
    int first_dim;
    int varid;
    int other_var;
    size_t anStart[NC_MAX_DIMS];
    size_t anCount[NC_MAX_DIMS];
    double* val = (double*)calloc(3, sizeof(double));

    status = nc_create("foo.nc", NC_NETCDF4, &cdfid);
    assert( status == NC_NOERR );

    status = nc_def_dim(cdfid, "unlimited_dim", NC_UNLIMITED, &first_dim);
    assert( status == NC_NOERR );

    status = nc_def_var(cdfid, "my_var", NC_DOUBLE, 1, &first_dim, &varid);
    assert( status == NC_NOERR );

    status = nc_def_var(cdfid, "other_var", NC_DOUBLE, 1, &first_dim, &other_var);
    assert( status == NC_NOERR );

    status = nc_enddef(cdfid);
    assert( status == NC_NOERR );

    /* Write 3 elements to set the size of the unlimited dim to 3 */
    anStart[0] = 0;
    anCount[0] = 3;
    status = nc_put_vara_double(cdfid, other_var, anStart, anCount, val);
    assert( status == NC_NOERR );

    /* Read 2 elements starting with index=1 */
    anStart[0] = 1;
    anCount[0] = 2;
    status = nc_get_vara_double(cdfid, varid, anStart, anCount, val);
    assert( status == NC_NOERR );

    status = nc_close(cdfid);
    assert( status == NC_NOERR );

    free(val);

    return 0;
}
```

Running it under Valgrind without this patch leads to
```
==19637==
==19637== Invalid write of size 8
==19637==    at 0x4C326CB: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19637==    by 0x4EDBE3D: NC4_get_vars (hdf5var.c:2131)
==19637==    by 0x4EDA24C: NC4_get_vara (hdf5var.c:1342)
==19637==    by 0x4E68878: NC_get_vara (dvarget.c:104)
==19637==    by 0x4E69FDB: nc_get_vara_double (dvarget.c:815)
==19637==    by 0x400C08: main (in /home/even/netcdf-c/build/test)
==19637==  Address 0xb70e3e8 is 8 bytes before a block of size 24 alloc'd
==19637==    at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19637==    by 0x4009E8: main (in /home/even/netcdf-c/build/test)
==19637==
```
2019-07-18 01:25:21 +02:00
Ed Hartnett
76d6b55eff moved call to nc4_rec_grp_del() to inside nc4_nc4f_list_del() 2019-07-16 16:29:06 -06:00
Ed Hartnett
4398cad8f5 whitespace cleanup 2019-07-16 16:17:07 -06:00
Ed Hartnett
b8e50c9254 moved freeing of allvars, alldims, alltypes lists to nc4_nc4f_list_del 2019-07-16 16:16:11 -06:00
Ed Hartnett
e9666f7333 moved free(h5) into nc4_nc4f_list_del 2019-07-16 16:07:21 -06:00
Ed Hartnett
d840c1864c removed unused prototype 2019-07-16 16:02:08 -06:00
Ward Fisher
d6a3944199
Merge pull request #1409 from Unidata/nccopydefault.dmh
Nccopy was overriding default chunking when it should not.
2019-05-29 15:26:09 -06:00
Dennis Heimbigner
112b2cc5e2 Convert to use LOGGING 2019-05-25 12:35:52 -06:00