Commit Graph

660 Commits

Author SHA1 Message Date
Ward Fisher
7d2a646f25 Merge branch 'ejh_fix_redef' of https://github.com/NOAA-GSD/netcdf-c into NOAA-GSD-ejh_fix_redef 2020-07-09 13:55:37 -06:00
Edward Hartnett
3e60a863de fixed warning in hdf5filter.c 2020-07-08 11:24:54 -06:00
Edward Hartnett
832fbf19c8 now don't return error on second redef call for netcdf/HDF5 files 2020-07-08 11:10:15 -06:00
Edward Hartnett
ac3b77d418 merged in changes from ejh_test_szip_unlim 2020-07-04 07:43:50 -06:00
Edward Hartnett
4b78c0c4a3 merged master 2020-07-03 13:57:47 -06:00
Edward Hartnett
6c112efb8e fixed problem setting szip on var with unlimited dim and added test 2020-07-02 10:55:34 -06:00
Edward Hartnett
dc37446a5f more test development 2020-06-29 09:01:24 -06:00
Edward Hartnett
467f342ae9 further test development 2020-06-29 08:35:11 -06:00
Dennis Heimbigner
59e04ae071 This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".

The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.

More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).

WARNING: this code has had limited testing, so do not use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:

Platform       | Build System | S3 support
---------------|--------------|-----------
Linux+gcc      | Automake     | yes
Linux+gcc      | CMake        | yes
Visual Studio  | CMake        | no

Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future.  Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.

In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*.  The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
   and the version bumped.
4. An overly complex set of structs was created to support funnelling
   all of the filterx operations thru a single dispatch
   "filter_actions" entry.
5. Moved common code from libhdf5 to libsrc4 so that it is accessible
   to nczarr.

Changes directly related to Zarr:
1. Modified CMakeLists.txt and configure.ac to support both C and C++
   -- this is needed for S3 support via the aws-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
   support zarr and to regularize the structure of the fragments
   section of a URL.

Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
   e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
   * Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
   and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.

Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-28 18:02:47 -06:00
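As a rough illustration of the NCZarr access path described in the commit above, the sketch below opens a cloud object through the ordinary netCDF open call. The bucket URL and the exact `#mode=` fragment keys are assumptions; NUG/nczarr.md is the authoritative reference.

```
#include <stdio.h>
#include <netcdf.h>

int main(void)
{
    int ncid, status;

    /* Hypothetical S3 URL; the "#mode=nczarr,s3" fragment is assumed here to
       select the NCZarr dispatcher (NUG/nczarr.md has the real syntax). */
    status = nc_open("https://s3.us-east-1.amazonaws.com/example-bucket/data.zarr#mode=nczarr,s3",
                     NC_NOWRITE, &ncid);
    if (status != NC_NOERR) {
        fprintf(stderr, "nc_open: %s\n", nc_strerror(status));
        return 1;
    }

    /* ...inquire dimensions/variables/attributes as with any netCDF file... */

    nc_close(ncid);
    return 0;
}
```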
Greg Sjaardema
edf0ca6c98
Avoid potential integer overrun
It is possible for the values stored in `file_value_size` to overrun the storage capacity of a 32-bit integer.  The value potentially needs to store negative values, so it cannot be `size_t` or `hsize_t`; use `hssize_t`, which is a signed 64-bit value.  Could also use `ssize_t`, but that is not used in this routine...
2020-06-10 15:42:22 -06:00
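A one-line illustration of the type choice argued for in the commit above; `file_value_size` is the variable named in the commit, everything else is assumed context.

```
#include <H5public.h>   /* defines hssize_t, a signed 64-bit integer */

/* file_value_size can exceed the range of a 32-bit int and may also need to
   hold a negative value, which rules out the unsigned size_t/hsize_t types;
   the signed 64-bit hssize_t covers both requirements. */
hssize_t file_value_size = -1;
```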
Dennis Heimbigner
84c69afca7 Allow redefinition of variable filters
re: Github issue https://github.com/Unidata/netcdf-c/issues/1713

If nc_def_var_filter or nc_def_var_deflate or nc_def_var_szip is
called multiple times with the same filter id, but possibly with
different sets of parameters, then the first invocation is
sticky and later invocations are ignored. The desired behavior
is to have the last invocation be used.

This PR implements that desired behavior, with some special
cases.  If you call nc_def_var_deflate multiple times, then the
last invocation rule applies with respect to deflate. However,
the shuffle filter, if enabled, is always applied just before
applying deflate.

Misc unrelated changes:
1. Make client-side filters be disabled by default
2. Fix the definition of uintptr_t and use in oc2 and libdap4
3. Add some test cases
4. modify filter order tests to use plugin filters rather
   than client-side filters
2020-05-11 09:42:31 -06:00
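A minimal sketch of the last-invocation-wins behavior described in the commit above (error checking omitted; the file name and deflate levels are arbitrary):

```
#include <stdio.h>
#include <netcdf.h>

int main(void)
{
    int ncid, dimid, varid;

    nc_create("deflate_redef.nc", NC_NETCDF4 | NC_CLOBBER, &ncid);
    nc_def_dim(ncid, "x", 100, &dimid);
    nc_def_var(ncid, "v", NC_FLOAT, 1, &dimid, &varid);

    /* First call: no shuffle, deflate level 1. */
    nc_def_var_deflate(ncid, varid, 0, 1, 1);

    /* Second call: shuffle on, deflate level 9.  With this PR the later
       settings win; previously the first call was sticky and this one
       was silently ignored. */
    nc_def_var_deflate(ncid, varid, 1, 1, 9);

    nc_close(ncid);
    return 0;
}
```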
Edward Hartnett
6aa6eff710 now properly setting HDF5 file cache for files created/opened sequentially on parallel IO builds 2020-05-08 11:00:56 -06:00
Edward Hartnett
e3c9e83ecf adding internal function, plus some documentation 2020-05-08 08:58:42 -06:00
Greg Sjaardema
3e919a568f
Remove line that was missed in original patch 2020-04-30 14:00:18 -06:00
Greg Sjaardema
1db3d07beb
Proof-of-Concept: Avoid N^2 behavior in NC4_inq_dim
The current library seems to have some behavior which is N^2 in the number of vars in a file.

The `NC4_inq_dim` routine calls down to `nc4_find_dim_len` which iterates through each `var` in the file/group and calls `find_var_dim_max_length` on each var and finds the largest length of the dim on each of those vars. This is done only for unlimited vars.

I have a file with 129 dim and 1630 vars.  The unlimited dimension is of length 41.  In my test program, I am reading data from 4 files which have the same dim and var count and reading every 4th time step (unlimited dimension).  If I run a profile, I see that 98.2% of the program time is in the `nc_get_vara_float` call tree and most of that is in `find_var_dim_max_length` (94.8%).

There are 66,142 calls to `nc_get_vara_float` resulting in 107,307,290 calls to `find_var_dim_max_length` with twice that number of calls to `malloc/free` and calls to 5 HDF5 routines.  All of this, at least in my case, to return the same `41` each time.

The proof of concept patch here will check whether the file is read-only (or no_write) and if so, it will cache the value of the dim length the first time it is calculated.   With this change, my example run is sped up by a factor of 60.  The time for `NC4_inq_dim` and below drops from 97.2% down to 2.7%.

I'm not sure whether this is the correct fix, or if there is some behavior that I am overlooking, but my users would definitely like a 10 second run compared to a 10 minute run... 

This is on current Netcdf master branch.

I will try to attach some valgrind/callgrind profiles.
2020-04-30 11:01:10 -06:00
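A rough sketch of the caching idea in this proof of concept; the struct, field, and function names below are hypothetical stand-ins for the library's internals, not the actual patch.

```
#include <stddef.h>

/* Hypothetical stand-in for the library's internal dimension metadata. */
typedef struct DimInfo {
    size_t cached_len;     /* cached unlimited-dimension length */
    int    len_is_cached;  /* 0 until the first expensive computation */
} DimInfo;

/* find_max_len is the expensive walk over every var that uses the dim
   (the find_var_dim_max_length work profiled in the commit above). */
static size_t
inq_dim_len(DimInfo *dim, int file_is_readonly,
            size_t (*find_max_len)(const DimInfo *))
{
    if (file_is_readonly && dim->len_is_cached)
        return dim->cached_len;              /* skip the O(nvars) scan */

    size_t len = find_max_len(dim);          /* compute it the slow way once */

    if (file_is_readonly) {                  /* read-only: length cannot grow */
        dim->cached_len = len;
        dim->len_is_cached = 1;
    }
    return len;
}
```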
Scot Breitenfeld
7b1b06b5ca Merge remote-tracking branch 'upstream/master' 2020-04-23 15:36:14 -05:00
Dennis Heimbigner
b0e0d81aa9 Fix reclamation of the ->format_XXX_info fields
nc4internal.c contains code to free the format_XXX_info
fields. Since these are format specific, this code
was moved to the dispatch code (libhdf5 and libhdf4
in the current case).

Additionally, there are some fields in nc4internal.h (e.g.
dimscale fields) that are specific to HDF5 and have been moved
to the corresponding HDF5 data structures and code.

Misc. other changes:
1. NC_VAR_INFO_T->hdf5_name renamed to alt_name to avoid
   implying it is necessarily HDF5 specific.
2. prefix NC_FILE_INFO_T with an instance of NC_OBJ for consistency.
   this also requires wrapping move_in_NCList() to keep
   hdr.id consistent.
2020-03-29 12:48:59 -06:00
Edward Hartnett
e7b9b1b587 fixed documentation of cache int functions 2020-03-24 15:02:42 -06:00
Scot Breitenfeld
c5d2e99417 Updated to use H5O_info2_t for HDF5 1.12 and the use of H5Oget_info3 instead of H5Gget_objinfo 2020-03-12 15:50:24 +00:00
Edward Hartnett
b29f9f34a0 whitespace cleanup 2020-03-08 09:10:07 -06:00
Edward Hartnett
4c7e162f34 less use of contiguous/compact field 2020-03-08 07:31:21 -06:00
Edward Hartnett
053752440b stop setting contiguous field in nc4hdf5.c 2020-03-08 07:18:52 -06:00
Edward Hartnett
04eafff166 stop setting contiguous field in hdf5filter.c 2020-03-08 07:18:11 -06:00
Edward Hartnett
5574317db7 stop setting contiguous/compact fields at file open 2020-03-08 07:17:01 -06:00
Edward Hartnett
61357cfd4d more use of storage field 2020-03-08 07:09:15 -06:00
Edward Hartnett
1761850795 continuing to switch to storage field 2020-03-08 07:05:51 -06:00
Edward Hartnett
b98a37e0b3 using storage field in nc4var.c 2020-03-08 06:38:44 -06:00
Edward Hartnett
119e8e9465 using storage in hdf5filter.c 2020-03-08 06:31:34 -06:00
Edward Hartnett
8dec9f6c99 now setting storage field when setting var storage 2020-03-08 06:29:49 -06:00
Edward Hartnett
d87a073a34 starting to use storage field when opening file 2020-03-08 06:21:08 -06:00
Edward Hartnett
0c419ec582 removed commented-out code 2020-03-06 09:57:33 -07:00
Edward Hartnett
502336c2c7 now return NC_EINVAL on attempt to set chunking on scalar var 2020-03-03 11:57:16 -07:00
Dennis Heimbigner
73537603e2 Make scalar X filter return an error instead of ignoring it 2020-03-02 15:10:54 -07:00
Dennis Heimbigner
420fdf4625 fix memory allocation failure in hdf5var.c 2020-03-02 11:45:41 -07:00
Dennis Heimbigner
7d1ca9ac85 fix references to var->deflate 2020-03-02 11:12:30 -07:00
Dennis Heimbigner
e66c727c28 Fix Filters x compact 2020-02-29 15:33:27 -07:00
Dennis Heimbigner
f376c23329 Make utilities support NC_COMPACT
re: https://github.com/Unidata/netcdf-c/issues/1642

Modify ncdump, nccopy, and ncgen to support the NC_COMPACT storage option.
Added test cases and added description to the man pages for the utilities.

1. ncdump: For a variable with compact storage, print the special attribute `_Storage` as
````
    <var>: _Storage = "compact";
````

2. ncgen: parse and implement
````
    <var>: _Storage = "compact";
````
in a .cdl file

3. nccopy: Extend the chunk specification (-c flag) to support
   compact using the forms
````
nccopy ... -c <var>:compact
and
nccopy ... -c <var>:contiguous
````

Misc. other changes
1. cleanup the copy_chunking function in ncdump/nccopy.c
2020-02-29 12:06:21 -07:00
Dennis Heimbigner
10d227fc1b fix parallel filter error discovered by Hartnett 2020-02-28 11:36:58 -07:00
Dennis Heimbigner
a3a3e15cb1 fix bad edit 2020-02-27 15:33:39 -07:00
Dennis Heimbigner
afe5a2998c
Merge branch 'master' into multifilter.dmh 2020-02-27 15:02:27 -07:00
Dennis Heimbigner
b488c272d5 Fix conflicts with master 2020-02-27 14:06:45 -07:00
Edward Hartnett
6f95f655dd
Merge branch 'master' into ejh_dispatch 2020-02-26 16:12:57 -07:00
Edward Hartnett
418e428a05 fixed problem with scalar compact 2020-02-26 09:13:12 -07:00
Edward Hartnett
b31aedcc8e all tests passing but compact storage for scalars not being properly written in file yet 2020-02-26 08:14:06 -07:00
Edward Hartnett
6241a6e7a0 more tests for storage, changed var names to reflect storage 2020-02-25 15:55:34 -07:00
Edward Hartnett
2ff24bd6fe more tests for compact storage 2020-02-25 13:30:38 -07:00
Dennis Heimbigner
44d0dcaad2 Add support for multiple filters per variable.
re: https://github.com/Unidata/netcdf-c/issues/1584

Support has been added for multiple filters per variable.  This
affects a number of components in netcdf. The new APIs are
documented in NUG/filters.md.

The primary changes are:
* A set of new functions are provided (see __include/netcdf_filter.h__).
    - Obtain a list of the filters associated with a variable
    - Obtain the parameters for a specific filter.
* The existing __nc_inq_var_filter__ function now returns info
  about the first defined filter.
* The utilities (ncgen, ncdump, and nccopy) now support
  an extended format for specifying a sequence of filters.
  The general form is __<filter>|<filter>...__.
* The ncdump **_Filter** attribute now dumps a list of all the
  filters associated with a variable using the above new format.
* Filter specifications can now use a filter name instead of number
  for filters known to the netcdf library, which in turn is taken
  from the HDF5 filter registration page.
* New errors are defined: NC_EFILTER and NC_ENOFILTER. The latter
  is returned if an attempt is made to access an unknown filter.
* Internally, the dispatch table has been extended to add a function
  to handle all of the filter functions.
* New, filter-related, tests were added to nc_test4.
* A new plugin was added to the plugins directory to help with testing.

Notes:
1. The shuffle and fletcher32 filters are not part of the multifilter system.

Misc. changes:
1. A debug module was added to libhdf5 to help catch error locations.
2020-02-16 12:59:33 -07:00
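A minimal sketch of building a two-filter chain with the existing per-filter call, as described in the commit above. Error checking is omitted, and the filter ids (307 for bzip2, 1 for deflate) are illustrative; the corresponding HDF5 plugins would have to be available at write time.

```
#include <netcdf.h>
#include <netcdf_filter.h>

int main(void)
{
    int ncid, dimid, varid;
    size_t chunksizes[1] = {256};
    unsigned int level = 5;

    nc_create("multifilter.nc", NC_NETCDF4 | NC_CLOBBER, &ncid);
    nc_def_dim(ncid, "x", 1024, &dimid);
    nc_def_var(ncid, "v", NC_FLOAT, 1, &dimid, &varid);
    nc_def_var_chunking(ncid, varid, NC_CHUNKED, chunksizes);

    /* Two nc_def_var_filter calls now build a filter pipeline on the
       variable instead of the second call replacing or failing. */
    nc_def_var_filter(ncid, varid, 307, 1, &level);  /* bzip2, block size 5 */
    nc_def_var_filter(ncid, varid, 1, 1, &level);    /* deflate, level 5    */

    nc_close(ncid);
    return 0;
}
```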
Edward Hartnett
a8684c730c fixed merge conflict in RELEASE_NOTES 2020-02-15 06:42:49 -07:00
Edward Hartnett
05a6ff74b2 merged changes from master 2020-02-11 17:19:53 -07:00
Edward Hartnett
15059a18b7 merged changes from master 2020-02-11 17:19:25 -07:00
Edward Hartnett
a0839a2a7a added version to dispatch table 2020-02-09 13:07:58 -07:00
Edward Hartnett
b7ac19a43f only close non-zero typeids 2020-02-09 12:03:21 -07:00
Edward Hartnett
af6b6787bf fix for memory leak due to HDF5 types 2020-02-09 11:47:13 -07:00
Edward Hartnett
8057a552ef move nc_def_var_szip function so it will appear in the documentation 2020-02-07 09:09:01 -07:00
Edward Hartnett
558988bb18 fixed docs, removed unneeded defines in test 2020-02-07 07:54:12 -07:00
Edward Hartnett
c4d3937099 now check number of elements in chunk against pixels_per_block for szip compression 2020-02-07 07:03:40 -07:00
Edward Hartnett
ff7280512e checking for some bad pixels_per_block values for szip 2020-02-07 06:53:52 -07:00
Edward Hartnett
6d2d751e4e disallow zlib if szip already in use 2020-02-07 05:01:06 -07:00
Edward Hartnett
dc4e880c37 disallow szip if zlib already in use 2020-02-07 04:46:15 -07:00
Edward Hartnett
6b2947813f adding test for zlib+szip in HDF5 2020-02-07 03:38:43 -07:00
Edward Hartnett
1817790c6b rely completely on nc_def_var_filter for setting szip 2020-02-05 10:32:16 -07:00
Edward Hartnett
517ef4f257 use nc_def_var_filter in nc_def_var_szip 2020-02-05 10:25:30 -07:00
Edward Hartnett
52d745de68 now remember szip setting in filter fields 2020-02-04 08:40:15 -07:00
Ward Fisher
aadd5a2d81
Merge pull request #1589 from NOAA-GSD/ejh_szip
re-implement the nc_def_var_szip() function, including for parallel I/O
2020-01-22 16:27:33 -07:00
Edward Hartnett
6103f442cb fixed compile error 2020-01-21 08:50:38 -07:00
Edward Hartnett
c839a2d6c5
Merge branch 'master' into ejh_var_cache 2020-01-21 07:48:59 -07:00
Edward Hartnett
e94615a0e5
Merge branch 'master' into ejh_szip 2020-01-16 08:49:12 -07:00
Ward Fisher
8771d0bdf4
Merge pull request #1582 from NOAA-GSD/ejh_parallel_zlib
Allow user to turn on zlib, shuffle, and/or fletcher32 filters with parallel I/O for HDF5-1.10.2+
2020-01-13 16:06:51 -07:00
Dennis Heimbigner
f587654670 Make the dap4 code resistant to various server errors.
Some versions of some servers are returning malformed responses.
Make the library either handle them or gracefully fail.
The three server errors "fixed" here are as follows.
1. The attribute _NCProperties sometimes has a trailing nul character
   in its value. Soln is to elide the nul(s).
2. Sometimes a DAP response has no data part, only a DMR.
   Soln is to detect and return an error code instead of crashing.
3. Sometimes a server returns a redirection, but our current
   openmagic() function was not following the redirect. Soln
   is to follow redirects.
Also because of #2, I am temporarily making --disable-dap-remote-tests
be the default.
2020-01-08 15:18:31 -07:00
Ward Fisher
d9eb078bfd
Merge pull request #1592 from Unidata/travis_typo_fix.wif
Correct a typo in travis.yml
2020-01-07 18:31:53 -07:00
Ward Fisher
438119dd69
Merge pull request #1560 from NOAA-GSD/ejh_cache_docs
increase default cache size for netCDF-4/HDF5 files, also improve cache docs and add benchmarking program
2020-01-07 11:46:30 -07:00
Ward Fisher
72b79ac376 Cleaned up an 'uninitialized variable' issue reported by static analysis. Minor fix, rolling in to this PR rather than spinning up a separate one. 2020-01-07 11:43:50 -07:00
Edward Hartnett
184507be5f now using members in NC_VAR_INFO_T to hold szip info 2020-01-06 08:46:03 -07:00
Edward Hartnett
3e3b83bdbc whitespace cleanup 2020-01-06 08:09:20 -07:00
Edward Hartnett
6af1b0bd91 changed error code in nc_def_var_szip() to NC_EFILTER 2020-01-06 07:51:04 -07:00
Edward Hartnett
e703a7678c first stab at re-adding nc_def_var_szip() 2020-01-03 11:38:45 -07:00
Edward Hartnett
808a0e2be9 merged ejh_parallel_zlib 2020-01-02 14:25:31 -07:00
Ward Fisher
8f2be58d95
Merge pull request #1566 from NetCDF-World-Domination-Council/ejh_unlim_dims
Fix problems with read past end of dataset but within dimension length for vars with multiple unlimited dimensions
2019-12-23 15:08:56 -07:00
Edward Hartnett
680e44f628 changed name of macro 2019-12-20 13:58:01 -07:00
Edward Hartnett
995cfdad96 merged master 2019-12-20 11:16:11 -07:00
Edward Hartnett
a06df0e4eb fixing for non-parallel builds 2019-12-20 07:52:00 -07:00
Edward Hartnett
accb83a8b5 even more documentation updates 2019-12-20 07:20:02 -07:00
Edward Hartnett
4b7f839666 switch to collective access when filters are applied 2019-12-20 07:00:12 -07:00
Edward Hartnett
f86c0fb8f9 now check that HDF5 version supports parallel zlib 2019-12-20 05:54:21 -07:00
Edward Hartnett
d534b1298a adding another zlib parallel I/O test 2019-12-20 05:28:20 -07:00
Ward Fisher
6c75e97764
Merge pull request #1570 from NOAA-GSD/ejh_compact
enable compact storage for netcdf-4 vars
2019-12-19 16:47:05 -07:00
Edward Hartnett
3e00967879 allow parallel writes to use zlib 2019-12-19 09:19:23 -07:00
Ward Fisher
29d070c50f
Merge pull request #1564 from NetCDF-World-Domination-Council/ejh_docs_cleanup
fix memory issue that may occur for some HDF5 file opens
2019-12-17 16:18:44 -07:00
Edward Hartnett
bacf017699 better handling of var cache for parallel builds 2019-12-17 06:39:09 -07:00
Edward Hartnett
fd604ddb06 fixed comment 2019-12-16 15:44:14 -07:00
Edward Hartnett
66a2b4c05e more testing for compact vars 2019-12-16 09:37:54 -07:00
Edward Hartnett
06896f432d got compact storage test working 2019-12-04 08:49:37 -07:00
Edward Hartnett
82df2876b6 starting to support compact storage 2019-12-04 07:53:37 -07:00
Edward Hartnett
e52a74520e tests and fix for multiple unlimited dim bug 2019-12-01 15:05:09 -07:00
Edward Hartnett
c5c38148bd moved udata.grps initialization to avoid memory problem on BAIL 2019-12-01 07:37:32 -07:00
Edward Hartnett
5ab7bf7796 now always relax! 2019-11-26 05:36:16 -07:00
Edward Hartnett
2682ffd68d improved docs for cache functions, added libhdf5/hdf5cache.c to Doxyfile.in, added benchmark program for cache settings 2019-11-25 16:33:04 -07:00
Ward Fisher
923d4ccbff
Merge pull request #1530 from NetCDF-World-Domination-Council/ejh_endianness
now testing that endianness can only be set on atomic ints and floats
2019-11-15 15:27:35 -07:00
edwardhartnett
965da1de01 now testing that endianness can only be set on atomic ints and floats 2019-11-15 11:10:10 -07:00
edwardhartnett
8083b3596e fixed problem of unlim dim and var sharing the same name but not being related 2019-11-15 09:18:42 -07:00
Ward Fisher
1a6351dab2
Merge pull request #1521 from ckhroulev/netcdf4-repeated-attribute-modification
Improve the fix for #350 included in #1119
2019-11-14 17:01:09 -07:00
edwardhartnett
d73611de73 now handle two anon dimensions of same size used in same HDF5 var 2019-11-14 06:54:22 -07:00
edwardhartnett
6b9248cef8 adding test 2019-11-14 06:09:45 -07:00
Constantine Khrulev
6abbf8d429 Whitespace changes 2019-11-13 10:07:10 -09:00
Constantine Khrulev
098f2c1056 Modify the condition used to check if an attribute can be re-used
This should make the code a bit cleaner.
2019-11-13 08:38:12 -09:00
Constantine Khrulev
dd181deca9 Improve the fix for #350 included in #1119
1) We have to use H5Tequal() to compare HDF5 type IDs.
2) When checking if we can re-use an NC_CHAR attribute it is enough to
   compare data types (H5Tequal() takes care of the size comparison).
3) This commit adds missing code (reuse_att was set but not used).

Now an attribute in a NetCDF-4 file can be modified as many times as
necessary, as long as its type and length remain the same.

Modifications changing either type or length of an attribute require
deleting and re-creating the attribute, which increments the attribute
creation order index. Once this index reaches 65535 all attribute
modifications (for a particular group or variable) will fail.

For reference:

Issue 350 title: NetCDF-4 limits the number of times an attribute can
be modified

Pull request 1119 title: Fix checking for HDF5 max dims, no longer
re-create atts if not needed, confirm behavior for HDF5 cyclical
files, allow user to set mpiexec
2019-11-12 21:45:47 -09:00
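An isolated illustration of point 1 above: HDF5 type ids are handles rather than canonical values, so datatype equality has to go through H5Tequal() instead of `==`. The helper name below is hypothetical.

```
#include <hdf5.h>

/* Hypothetical helper: decide whether an existing attribute's datatype
   matches the datatype of the new value being written. */
static int
att_types_match(hid_t existing_att_type, hid_t new_value_type)
{
    htri_t eq = H5Tequal(existing_att_type, new_value_type);
    return eq > 0;   /* > 0: equal, 0: not equal, < 0: error */
}
```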
Dennis Heimbigner
f1506d552e Change (again), and hopefully simplify, the file model inference algorithm.
* For URL paths, the new approach essentially centralizes all information
  in the URL into the "#mode=" fragment key and uses that value
  to determine the dispatcher for (most) URLs.

* The new approach has the following steps:

  1. canonicalize the path if it is a URL.
  2. use the mode= fragment key to determine the dispatcher
  3. if dispatcher still not determined, then use the mode flags
     argument to nc_open/nc_create to determine the dispatcher.
  4. if the path points to something readable, attempt to read the
     magic number at the front, and use that to determine the dispatcher.
     this case may override all previous cases.

* Misc changes.

  1. Update documentation
  2. Moved some unit tests from libdispatch to unit_test directory.
  3. Fixed use of wrong #ifdef macro in test_filter_reg.c
     [I think this may fix a previously reported esupport query].
2019-09-29 12:59:28 -06:00
Greg Sjaardema
56c0d5cf8a Spelling fixes 2019-09-18 08:03:01 -06:00
edwardhartnett
2077729abc removed base_pe functions from dispatch table 2019-08-15 06:51:06 -06:00
edwardhartnett
3c9a25b688 whitespace cleanup 2019-08-03 09:04:58 -06:00
edwardhartnett
7ce322a6f1 now have libhdf5 use nc4_file_list_add() 2019-08-02 09:29:18 -06:00
edwardhartnett
fc1d9baf43 fixed spacing 2019-08-01 18:24:11 -06:00
edwardhartnett
cb3101c59c clean up 2019-08-01 16:13:32 -06:00
edwardhartnett
170c5b0901 removed NC from open in dispatch table 2019-08-01 14:30:20 -06:00
edwardhartnett
e71ff09a0c removed need for NC4_open to have NC passed in 2019-08-01 11:23:58 -06:00
edwardhartnett
f410b31b72 whitespace cleanup of hdf5open.c, plus extra documentation 2019-08-01 11:17:31 -06:00
Ed Hartnett
abfec2ee6e adding missing semicolon 2019-07-28 15:48:25 -06:00
Ed Hartnett
9d128c35b1 comment fix 2019-07-28 13:52:05 -06:00
Ed Hartnett
fb54bf7808 removed unneeded setting of int_ncid by libhdf5 layer 2019-07-28 13:43:01 -06:00
Ward Fisher
9b7472f7ca
Merge pull request #1442 from rouault/fix_NC4_get_vars_with_unlimited_dim
NC4_get_vars(): fix out-of-bounds write with unlimited dimension
2019-07-24 13:48:25 -06:00
Dennis Heimbigner
4c92fc3405 Remove netcdf-4 conditional on the dispatch table.
Partially address: https://github.com/Unidata/netcdf-c/issues/1056

Currently, some of the entries in the dispatch table
are conditional on USE_NETCDF4.

As a step in upgrading the dispatch table for use
with user-defined tables, we remove that conditional.
This means that all dispatch tables must implement the
netcdf-4 specific functions even if only to make them
return NC_ENOTNC4. To simplify this, a set of default
functions are defined in libdispatch/dnotnc4.c to provide this
behavior. The file libdispatch/dnotnc3.c is also relevant to
this.

The primary fix is to modify the various dispatch tables to
remove the conditional and use the functions in
libdispatch/dnotnc4.c as appropriate. In practice, all of the
existing tables are prepared to handle this, so the only
real change is to remove the conditionals.

Misc. Unrelated fixes
1. Fix some annoying warnings in ncvalidator.

Notes:
1. This has not been tested with either pnetcdf or hdf4 enabled.
   When those are enabled, it is possible that there are still
   some conditionals that need to be fixed.
2019-07-20 13:59:40 -06:00
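A hypothetical example of the kind of default stub that libdispatch/dnotnc4.c supplies so that non-netCDF-4 dispatch tables can populate their netCDF-4 slots; the function name and signature below are illustrative, not the file's actual contents.

```
#include <netcdf.h>

/* A dispatch table for a format without netCDF-4 features can point its
   group-definition slot at a stub like this instead of leaving it empty. */
static int
stub_def_grp(int parent_ncid, const char *name, int *new_ncidp)
{
    (void)parent_ncid;
    (void)name;
    (void)new_ncidp;
    return NC_ENOTNC4;   /* netCDF-4 operation attempted on a non-netCDF-4 file */
}
```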
Even Rouault
77ffbce43b
NC4_get_vars(): fix out-of-bounds write with unlimited dimension
This fixes an issue hit by GDAL that is present in netcdf 4.6.3
and 4.7.0

git bisect pointed the problem to have started with

```
77ab979c5f is the first bad commit
commit 77ab979c5f
Author: Ed Hartnett <edwardjameshartnett@gmail.com>
Date:   Sat Jun 16 09:58:48 2018 -0600

    using get_vars but not put_vars

:040000 040000 8611e77aae fc9ffd1d13 M	libsrc4
```

where nc_get_vara_double() started using nc4_get_vars() underneath.

It turns out that nc4_get_vars() was buggy in the situation exercised by GDAL.

This can be reproduced with the following simple test case:

```

int main()
{
    int status;
    int cdfid = -1;
    int first_dim;
    int varid;
    int other_var;
    size_t anStart[NC_MAX_DIMS];
    size_t anCount[NC_MAX_DIMS];
    double* val = (double*)calloc(3, sizeof(double));

    status = nc_create("foo.nc", NC_NETCDF4, &cdfid);
    assert( status == NC_NOERR );

    status = nc_def_dim(cdfid, "unlimited_dim", NC_UNLIMITED, &first_dim);
    assert( status == NC_NOERR );

    status = nc_def_var(cdfid, "my_var", NC_DOUBLE, 1, &first_dim, &varid);
    assert( status == NC_NOERR );

    status = nc_def_var(cdfid, "other_var", NC_DOUBLE, 1, &first_dim, &other_var);
    assert( status == NC_NOERR );

    status = nc_enddef(cdfid);
    assert( status == NC_NOERR );

    /* Write 3 elements to set the size of the unlimited dim to 3 */
    anStart[0] = 0;
    anCount[0] = 3;
    status = nc_put_vara_double(cdfid, other_var, anStart, anCount, val);
    assert( status == NC_NOERR );

    /* Read 2 elements starting with index=1 */
    anStart[0] = 1;
    anCount[0] = 2;
    status = nc_get_vara_double(cdfid, varid, anStart, anCount, val);
    assert( status == NC_NOERR );

    status = nc_close(cdfid);
    assert( status == NC_NOERR );

    free(val);

    return 0;
}
```

Running it under Valgrind without this patch leads to
```
==19637==
==19637== Invalid write of size 8
==19637==    at 0x4C326CB: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19637==    by 0x4EDBE3D: NC4_get_vars (hdf5var.c:2131)
==19637==    by 0x4EDA24C: NC4_get_vara (hdf5var.c:1342)
==19637==    by 0x4E68878: NC_get_vara (dvarget.c:104)
==19637==    by 0x4E69FDB: nc_get_vara_double (dvarget.c:815)
==19637==    by 0x400C08: main (in /home/even/netcdf-c/build/test)
==19637==  Address 0xb70e3e8 is 8 bytes before a block of size 24 alloc'd
==19637==    at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19637==    by 0x4009E8: main (in /home/even/netcdf-c/build/test)
==19637==
```
2019-07-18 01:25:21 +02:00
Ed Hartnett
76d6b55eff moved call to nc4_rec_grp_del() to inside nc4_nc4f_list_del() 2019-07-16 16:29:06 -06:00
Ed Hartnett
4398cad8f5 whitespace cleanup 2019-07-16 16:17:07 -06:00
Ed Hartnett
b8e50c9254 moved freeing of allvars, alldims, alltypes lists to nc4_nc4f_list_del 2019-07-16 16:16:11 -06:00
Ed Hartnett
e9666f7333 moved free(h5) intonc4_nc4f_list_del 2019-07-16 16:07:21 -06:00
Ed Hartnett
d840c1864c removed unused prototype 2019-07-16 16:02:08 -06:00
Ward Fisher
d6a3944199
Merge pull request #1409 from Unidata/nccopydefault.dmh
Nccopy was overriding default chunking when it should not.
2019-05-29 15:26:09 -06:00
Dennis Heimbigner
112b2cc5e2 Convert to use LOGGING 2019-05-25 12:35:52 -06:00
Dennis Heimbigner
7901353cf5 Restore nc_perf/CMakeLists.txt 2019-05-25 12:15:56 -06:00
Dennis Heimbigner
06498ff16a various fixes 2019-05-23 16:35:03 -06:00
Ed Hartnett
150662dd0b changes to support build of libsrc4 without libhdf5 2019-05-22 07:50:12 -06:00
Dennis Heimbigner
6ebc108f00 Nccopy was overriding default chunking when it should not.
re: issue https://github.com/Unidata/netcdf-c/issues/1398
re: esupport NDY-294972

The new chunking code added to nccopy missed one case.
In the event that there are no chunking specifications
of any kind, and the input is not netcdf-4, and the output
is netcdf-4 and must be chunked, then use the default chunking
that the library computes as part of the nc_def_var() function.

Misc. changes:
1. add some chunking debug code to hdf5var.c
2019-05-21 15:59:27 -06:00
Dennis Heimbigner
331a1f1c63 Centralize calls to curl_global_init and curl_global_cleanup
re: https://github.com/Unidata/netcdf-c/issues/1388

1. Centralize calls to curl_global_init and curl_global_cleanup
   to libdispatch/ddispatch.c
2. Make the above calls if options require curl: currently
   any of DAP2, DAP4, or byterange.
3. Side issue: Fix obscure bug in mmapio.c involving non-persistent mmap.
2019-05-03 13:22:54 -06:00
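A minimal sketch of the centralization pattern, with a one-time guard; the guard variable and function names are assumptions rather than the actual ddispatch.c code.

```
#include <curl/curl.h>

static int curl_is_initialized = 0;   /* hypothetical one-time guard */

/* Called once from the library's global initialization when any
   curl-based feature (DAP2, DAP4, or byterange) is enabled. */
void
global_curl_init(void)
{
    if (!curl_is_initialized) {
        curl_global_init(CURL_GLOBAL_DEFAULT);
        curl_is_initialized = 1;
    }
}

/* Called once from the library's global finalization. */
void
global_curl_cleanup(void)
{
    if (curl_is_initialized) {
        curl_global_cleanup();
        curl_is_initialized = 0;
    }
}
```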
Ward Fisher
ae1b30990d
Merge pull request #1379 from Unidata/threads_part1.dmh
Thread safety: step 1: cleanup
2019-05-02 10:47:46 -06:00
Ward Fisher
44fce0904e
Merge pull request #1387 from Unidata/fixffilter.dmh
Minor config.h changes to support filters in Fortran
2019-05-01 14:44:53 -06:00
Ward Fisher
3b34a82e19 Merge branch 'master' into threads_part1.dmh 2019-05-01 14:41:13 -06:00
Ward Fisher
5410967b00 Merge branch 'master' into addfilter.dmh 2019-04-30 14:51:25 -06:00
Dennis Heimbigner
62e2b472b4 Minor config.h changes to support filters in Fortran 2019-04-29 16:36:08 -06:00
Dennis Heimbigner
2eb1a8d8cf For some reason, the code for this was incorrect.
Anyway, I repaired it as follows:
1. Created NC4_write_provenance as parallel to NC4_read_provenance
2. Modified hdf5file.c to use NC4_write_provenance
3. Modified hdf5open.c to use NC4_read_provenance (was NC4_read_ncproperties).
4. The creation of the _NCProperties string was seriously hosed:
   was using all the wrong fields.
2019-04-18 14:23:20 -06:00
Ward Fisher
c776039eba Updated call to NC4_read_provenance. 2019-04-18 10:53:16 -06:00
Dennis Heimbigner
6934aa2e8b Thread safety: step 1: cleanup
re: https://github.com/Unidata/netcdf-c/issues/1373 (partial)

* Mark some global constants as const to make them easier to track.
* Hide direct access to the ncrc_globalstate behind a function call.
* Convert dispatch tables to constants (except the user defined ones)
  This has some consequences in terms of function arguments needing to be marked
  as const also.
* Remove some no longer needed global fields
* Aggregate all the globals in nclog.c
* Uniformly replace nc_sizevector{0,1} with NC_coord_{zero,one}
* Uniformly replace nc_ptrdffvector1 with NC_stride_one
* Remove some obsolete code
2019-03-30 14:06:20 -06:00
Dennis Heimbigner
8d0bced60d Allow in-line definition of filters
Priority: Low

re: issue https://github.com/Unidata/netcdf-c/issues/1329

HDF5 has the ability to programmatically define new filters,
as opposed to using HDF5_PLUGIN_PATH env variable.
This PR adds support for that feature.
Not clear how useful this is, though.
See docs/filters.md for details.
2019-03-21 11:33:27 -06:00
Ward Fisher
e2b31ffae4
Merge branch 'master' into byterange.dmh 2019-03-19 12:05:44 -06:00
Dennis Heimbigner
88a7a1753c Simplify libhdf5/nc4info.c to move to lazy parsing
re: https://github.com/Unidata/netcdf-c/issues/1352

When nc4info.c encounters an _NCProperties attribute
with a version number it does not recognize, it does not
show it correctly.

Solution chosen is to arrange so that accessing the attribute
returns the raw value of the Attribute from the file. This way,
even if the version is unrecognized, it will return something
usable.

The changes were primarily to never attempt to parse the value
of _NCProperties until actually required. Since the values are
currently not used, parsing never occurs.

Also modified ncdump/tst_fileinfo.sh to include some extra testing

I tested the original failure by changing the value of NCPROPS to 3.
However, there is no way to test this at build time.

Misc. Changes
* Inlined the provenance info in the NC_FILE_INFO_T structure
* Centralized stuff from elsewhere into include/nc_provenance.h

Misc. Unrelated Changes
* Removed/turned off some misc debug output left on by accident
* Fix CPPFLAGS name error in libhdf5/Makefile.am
2019-03-09 20:35:57 -07:00
Ed Hartnett
1eb7e7a8e8 added comment describing netcdf-4 behavior in data mode dim renames with longer names 2019-02-25 06:36:39 -07:00
Dennis Heimbigner
0c59e13bf7 Master merge, conflict resolution, cleanup 2019-02-24 16:54:13 -07:00
Dennis Heimbigner
959c213c18 conflict resolution 2019-02-23 22:17:53 -07:00
Dennis Heimbigner
45a8a265b8 master merge 2019-02-23 17:14:12 -07:00
Ed Hartnett
cfa8e3808f cleanup of whitespace in HDF5 directory 2019-02-19 05:19:37 -07:00
Ed Hartnett
b1d30a0f67 cleanup of whitespace in HDF5 directory 2019-02-19 05:18:53 -07:00
Ed Hartnett
c771443e43 cleanup of whitespace in HDF5 directory 2019-02-19 05:18:25 -07:00
Ed Hartnett
8b1f5a8fad cleanup of whitespace in HDF5 directory 2019-02-19 05:18:02 -07:00
Ed Hartnett
384a6f1303 cleanup of whitespace in HDF5 directory 2019-02-19 05:17:47 -07:00
Ward Fisher
1fde39c8d7
Merge branch 'master' into byterange.dmh 2019-02-07 14:28:23 -07:00
Ed Hartnett
70201adb51 comment cleanup 2019-02-03 07:43:56 -07:00
Ed Hartnett
5d908a0bbb now preserve order of varids after a var rename 2019-02-03 06:56:03 -07:00
Ed Hartnett
8e6f38b099 detecting conditions for mandatory rename of vars with varid > renamed var 2019-02-03 06:35:29 -07:00
Ed Hartnett
e30a2bf208 added come comments 2019-02-02 07:32:31 -07:00
Ed Hartnett
8acde75e3c converted hdf5gtp.c to use H5Lmove instead of deprecated H5Gmove 2019-02-02 07:24:02 -07:00
Ed Hartnett
f25f050be8 converted hdf5var to use H5Lmove instead of deprecated H5Gmove 2019-02-02 07:20:14 -07:00
Ed Hartnett
1dd76c996e added tst_rename3.c for more rename testing 2019-02-02 05:53:45 -07:00
Ed Hartnett
828304ed41 now using secret hdf5 var name during renames if needed 2019-01-27 11:33:06 -07:00
Ed Hartnett
1b3f397c4c added name parameter to give_var_secret_name to base secret name on 2019-01-27 11:29:49 -07:00
Ed Hartnett
784dc0e0ad now creating secret hdf5 name on var rename, if needed 2019-01-27 11:17:57 -07:00
Ed Hartnett
660bda1be3 made function give_var_secret_name() static again 2019-01-27 11:14:29 -07:00
Ed Hartnett
e74ec6f2a0 made function give_var_secret_name() not static, fixed warning 2019-01-27 11:10:41 -07:00
Ed Hartnett
c95887cc53 removed name param from function give_var_secret_name() 2019-01-27 11:07:57 -07:00
Ed Hartnett
627a55cf78 added function give_var_secret_name() 2019-01-27 11:06:02 -07:00
Ed Hartnett
42c64598dc created function create_dim_wo_var() 2019-01-27 10:51:25 -07:00
Ward Fisher
b27c7d899d Merge branch 'master' into byterange.dmh 2019-01-25 14:50:23 -07:00
Ed Hartnett
a89f9ddeb9
Merge branch 'master' into ejh_rename_bug 2019-01-25 07:05:13 -07:00
Ed Hartnett
66cc9c5020 fixed rename bug 2019-01-24 10:20:46 -07:00
Ward Fisher
237f0c6e65 Merge branch 'ejh_tidy' of https://github.com/NetCDF-World-Domination-Council/netcdf-c into pr-aggregation.wif 2019-01-23 15:11:59 -07:00
Ward Fisher
cb2affbef5 Merge branch 'patch-32' of https://github.com/gsjaardema/netcdf-c into pr-aggregation.wif 2019-01-23 15:09:35 -07:00
Ed Hartnett
840d51d035 changed NC_GRP_INFO_T to use atts_read instead of atts_not_read 2019-01-22 08:11:52 -07:00
Ed Hartnett
243cef8fa5 changed var atts_not_read to atts_read 2019-01-21 08:40:04 -07:00
Ed Hartnett
60132a0ed7 made function static, removed unneeded if statement 2019-01-20 09:53:00 -07:00
Ed Hartnett
9d40e0a2af made function static, removed unneeded if statement 2019-01-20 09:52:42 -07:00
Ed Hartnett
281f67da6e removed unneeded vars, fixed and added comments 2019-01-20 09:46:15 -07:00
Ed Hartnett
adb3356aff merged ejh_test_pnetcdf, fixes broken logging statement, added comments 2019-01-20 09:42:18 -07:00
Ed Hartnett
c6a9948a8e removed unneeded var, fixed broken log statements that cause segfaults 2019-01-20 09:37:13 -07:00
Ed Hartnett
15e6a782db removed unneeded variable, shortened function name 2019-01-20 09:25:04 -07:00
Ed Hartnett
e1cd4018c5 removed unneeded variable 2019-01-20 09:18:14 -07:00
Greg Sjaardema
1ab53924cb
Tests on equalp are always true
If `equalp` is NULL, then the function returns early, so all subsequent tests on `equalp` are not needed.
2019-01-04 17:39:48 -07:00
Dennis Heimbigner
ac2e6f9a10 typo5 2019-01-02 21:37:31 -07:00
Dennis Heimbigner
8cd450206a typeo4 2019-01-02 20:53:44 -07:00
Dennis Heimbigner
dc491f9bf9 typos3 2019-01-02 20:31:06 -07:00
Dennis Heimbigner
fe3ba0904c Another typo (sigh!) 2019-01-02 16:48:11 -07:00
Dennis Heimbigner
6e76972c99 Fix typo 2019-01-02 16:35:22 -07:00
Dennis Heimbigner
a7fa2d8d95 It turns out that the type H5FD_class_t was changed
between HDF5 versions 1.8 and 1.10.
So modify H5FDhttp.c to be conditional on the
HDF5 major+minor version from H5public.h
2019-01-02 14:37:23 -07:00
Dennis Heimbigner
84c2bc0d78 Merge branch 'master' into byterange.dmh 2019-01-02 13:18:45 -07:00
Dennis Heimbigner
bf2746b8ea Provide byte-range reading of remote datasets
re: issue https://github.com/Unidata/netcdf-c/issues/1251

Assume that you have the URL to a remote dataset
which is a normal netcdf-3 or netcdf-4 file.

This PR allows the netcdf-c to read that dataset's
contents as a netcdf file using HTTP byte ranges
if the remote server supports byte-range access.

Originally, this PR was set up to access Amazon S3 objects,
but it can also access other remote datasets such as those
provided by a Thredds server via the HTTPServer access protocol.
It may also work for other kinds of servers.

Note that this is not intended as a true production
capability because, as is known, this kind of access
can be quite slow. In addition, the byte-range IO drivers
do not currently do any sort of optimization or caching.

An additional goal here is to gain some experience with
the Amazon S3 REST protocol.

This architecture and its use are documented in
the file docs/byterange.dox.

There are currently two test cases:

1. nc_test/tst_s3raw.c - this does a simple open, check format, close cycle
   for a remote netcdf-3 file and a remote netcdf-4 file.
2. nc_test/test_s3raw.sh - this uses ncdump to investigate some remote
   datasets.

This PR also incorporates significantly changed model inference code
(see the superseded PR https://github.com/Unidata/netcdf-c/pull/1259).

1. It centralizes the code that infers the dispatcher.
2. It adds support for byte-range URLs

Other changes:

1. NC_HDF5_finalize was not being properly called by nc_finalize().
2. Fix minor bug in ncgen3.l
3. fix memory leak in nc4info.c
4. add code to walk the .daprc triples and to replace protocol=
   fragment tag with a more general mode= tag.

Final Note:
The inference code is still way too complicated. We need to move
to the validfile() model used by netcdf Java, where each
dispatcher is asked if it can process the file. This decentralizes
the inference code. This will be done after all the major new
dispatchers (PIO, Zarr, etc) have been implemented.
2019-01-01 18:27:36 -07:00
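A minimal sketch of byte-range access from client code, roughly the open/check-format/close cycle that nc_test/tst_s3raw.c is described as doing. The URL is a placeholder and the `#mode=bytes` fragment is an assumption based on the mode= tag mentioned above; docs/byterange.dox has the authoritative syntax.

```
#include <stdio.h>
#include <netcdf.h>

int main(void)
{
    int ncid, format, status;

    /* Placeholder URL; the "#mode=bytes" fragment is assumed to request
       HTTP byte-range access to an ordinary netCDF-3/4 file on the server. */
    status = nc_open("https://example.com/data/sample.nc#mode=bytes",
                     NC_NOWRITE, &ncid);
    if (status != NC_NOERR) {
        fprintf(stderr, "nc_open: %s\n", nc_strerror(status));
        return 1;
    }

    nc_inq_format(ncid, &format);   /* which netCDF format did we get? */
    printf("format = %d\n", format);

    nc_close(ncid);
    return 0;
}
```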
Ed Hartnett
029aa5f626 now using hidden coordinates att to speed file opens 2018-12-20 05:59:31 -07:00
Ed Hartnett
7249350c3f now remember whether coords att has been read for a var 2018-12-19 09:43:32 -07:00
Ed Hartnett
6e3d284fcc write coordinates hidden attribute for all variables 2018-12-19 09:16:21 -07:00
Ed Hartnett
f5c7209838 comment and code cleanup 2018-12-19 09:10:15 -07:00
Ed Hartnett
070214f81a more comments, code cleanup 2018-12-19 06:52:26 -07:00
Ed Hartnett
25184f3843 moved code to get_attached_info() 2018-12-18 09:16:03 -07:00
Ed Hartnett
1b38d9aef8 lazy read of some var metadata 2018-12-18 07:48:22 -07:00