Commit Graph

51 Commits

Author SHA1 Message Date
Edward Hartnett
c967557787 starting to make logging output files for each process for parallel IO builds 2022-02-04 10:35:15 -07:00
Dennis Heimbigner
d2316f866c Additional Fixes to NCZarr
Primary Fixes:
* Add a whole variable optimization -- used in the rare case that nc_get/put_vara covers the whole of a variable and the variable has a single chunk.
* Fix chunking error when stride causes whole chunks to be skipped.
* Fix some memory leaks
* Add test cases
* Add one performance test to nczarr_test/. This uses the timer utils from unit_test: timer_utils.[ch].
* Move ncdumpchunks utility from ncdump to nczarr_test

Misc. Other Changes:
* Make check for aws libraries conditional on --enable-nczarr-s3
* Remove all but one bm tests from nczarr_test until they are working.
* Remove another dependency on HDF5 from supposedly non-HDF5 specific code; specifically hdf5_log_hdf5.
* Make the BAIL2 macro be hdf5 specific and replace elsewhere with an HDF5 independent equivalent.
* Move hdf5cache.c to libsrc4/nc4cache.c because it is used by nczarr.
* Modify unit_tests so that some of them are run even if using Windows.
* Misc. small bug fixes and refactors and memory leaks.
* Rename some conflicting tests for cmake.
* Attempted to make nc_perf work with cmake and failed.
2020-12-16 20:48:02 -07:00
Dennis Heimbigner
aeb3ac2809 Mostly revert the filter code to reduce its complexity of use.
re: https://github.com/Unidata/netcdf-c/issues/1836

Revert the internal filter code to simplify it. From the user's
point of view, the only visible changes should be:

1. The functions that convert text to filter specs have had their signature reverted and have been moved to netcdf_aux.h
2. Some filter API functions now return NC_ENOFILTER when inquiry is made about some filter.

Internally,the dispatch table has been modified to get rid of the filter_actions
entry and associated complex structures. It has been replaced with
inq_var_filter_ids and inq_var_filter_info entries and the dispatch table
version has been bumped to 3. Corresponding NOOP and NOTNC4 functions
were added to libdispatch/dnotnc4.c. Also, the filter_action entries
in dispatch tables were replaced for all dispatch code bases (HDF5, DAP2,
etc). This should only impact UDF users.

In the process, it became clear that the form of the filters
field in NC_VAR_INFO_T was format dependent, so I converted it to
be of type void* and pushed its management into the various dispatch
code bases. Specifically libhdf5 and libnczarr now manage the filters
field in their own way.

The auxilliary functions for parsing textual filter specifications
were moved to netcdf_aux.h and were renamed to the following:
* ncaux_h5filterspec_parse
* ncaux_h5filterspec_parselist
* ncaux_h5filterspec_free
* ncaux_h5filter_fix8

Misc. Other Changes:

1. Document NUG/filters.md updated to reflect the changes above.
2. All the old data types (structs and enums)
   used by filter_actions actions were deleted.
   The exception is the NC_H5_Filterspec because it is needed
   by ncaux_h5filterspec_parselist.
3. Clientside filters were removed -- another enhancement
   for which no-one ever asked.
4. The ability to remove filters was itself removed.
5. Some functionality needed by nczarr was moved from libhdf5
   to libsrc4 e.g. nc4_find_default_chunksizes
6. All the filterx code was removed
7. ncfilter.h and nc4filter.c no longer used

Misc. Unrelated Changes:

1. The nczarr_test makefile clean was leaving some directories; so
   add clean-local to take care of them.
2020-09-27 12:43:46 -06:00
Dennis Heimbigner
f3218a2e2c Use the built-in HDF5 byte-range reader, if available.
re: Issue https://github.com/Unidata/netcdf-c/issues/1848

The existing Virtual File Driver built to support byte-range
read-only file access is quite old. It turns out to be extremely
slow (reason unknown at the moment).

Starting with HDF5 1.10.6, the HDF5 library has its own version
of such a file driver. The HDF5 developers have better knowledge
about building such a driver and what incantations are needed to
get good performance.

This PR modifies the byte-range code in hdf5open.c so
that if the HDF5 file driver is available, then it is used
in preference to the one written by the Netcdf group.

Misc. Other Changes:

1. Moved all of nc4print code to ncdump to keep appveyor quiet.
2020-09-24 14:33:58 -06:00
Dennis Heimbigner
59e04ae071 This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".

The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.

More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).

WARNING: this code has had limited testing, so do use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:

Platform | Build System | S3 support
------------------------------------
Linux+gcc      | Automake     | yes
Linux+gcc      | CMake        | yes
Visual Studio  | CMake        | no

Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future.  Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.

In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*.  The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
   and the version bumped.
4. An overly complex set of structs was created to support funnelling
   all of the filterx operations thru a single dispatch
   "filter_actions" entry.
5. Move common code to from libhdf5 to libsrc4 so that it is accessible
   to nczarr.

Changes directly related to Zarr:
1. Modified CMakeList.txt and configure.ac to support both C and C++
   -- this is in support of S3 support via the awd-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
   support zarr and to regularize the structure of the fragments
   section of a URL.

Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
   e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
   * Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
   and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.

Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-28 18:02:47 -06:00
Dennis Heimbigner
44d0dcaad2 Add support for multiple filters per variable.
re: https://github.com/Unidata/netcdf-c/issues/1584

Support has been added for multiple filters per variable.  This
affects a number of components in netcdf. The new APIs are
documented in NUG/filters.md.

The primary changes are:
* A set of new functions are provided (see __include/netcdf_filter.h__).
    - Obtain a list of the filters associated with a variable
    - Obtain the parameters for a specific filter.
* The existing __nc_inq_var_filter__ function now returns info
  about the first defined filter.
* The utilities (ncgen, ncdump, and nccopy) now support
  an extended format for specifying a sequence of filters.
  The general form is __<filter>|<filter>..._.
* The ncdump **_Filter** attribute now dumps a list of all the
  filters associated with a variable using the above new format.
* Filter specifications can now use a filter name instead of number
  for filters known to the netcdf library, which in turn is taken
  from the HDF5 filter registration page.
* New errors are defined: NC_EFILTER and NC_ENOFILTER. The latter
  is returned if an attempt is made to access an unknown filter.
* Internally, the dispatch table has been extended to add a function
  to handle all of the filter functions.
* New, filter-related, tests were added to nc_test4.
* A new plugin was added to the plugins directory to help with testing.

Notes:
1. The shuffle and fletcher32 filters are not part of the multifilter system.

Misc. changes:
1. A debug module was added to libhdf5 to help catch error locations.
2020-02-16 12:59:33 -07:00
Ward Fisher
763750fab5 More copyright stanza updates. 2018-12-06 14:56:42 -07:00
Ed Hartnett
9b0192fe94 moved memfile code to libhdf5 2018-07-19 07:26:27 -06:00
Ed Hartnett
c402b79ed8 merged changes from master 2018-05-08 12:13:03 -06:00
Ed Hartnett
5526ca65d1 created libhdf5, moved some files 2018-05-08 11:58:01 -06:00
Dennis Heimbigner
4739cd3225 Master merge and conflict resolution 2018-04-12 21:51:17 -06:00
Dennis Heimbigner
25f062528b This completes (for now) the refactoring of libsrc4.
The file docs/indexing.dox tries to provide design
information for the refactoring.

The primary change is to replace all walking of linked
lists with the use of the NCindex data structure.
Ncindex is a combination of a hash table (for name-based
lookup) and a vector (for walking the elements in the index).
Additionally, global vectors are added to NC_HDF5_FILE_INFO_T
to support direct mapping of an e.g. dimid to the NC_DIM_INFO_T
object. These global vectors exist for dimensions, types, and groups
because they have globally unique id numbers.

WARNING:
1. since libsrc4 and libsrchdf4 share code, there are also
   changes in libsrchdf4.
2. Any outstanding pull requests that change libsrc4 or libhdf4
   are likely to cause conflicts with this code.
3. The original reason for doing this was for performance improvements,
   but as noted elsewhere, this may not be significant because
   the meta-data read performance apparently is being dominated
   by the hdf5 library because we do bulk meta-data reading rather
   than lazy reading.
2018-03-16 11:46:18 -06:00
Dennis Heimbigner
ccc70d640b re: esupport MQO-415619
and https://github.com/Unidata/netcdf-c/issues/708

Expand the NC_INMEMORY capabilities to support writing and accessing
the final modified memory.

Three new functions have been added:
nc_open_memio, nc_create_mem, and nc_close_memio.

The following new capabilities were added.
1. nc_open_memio() allows the NC_WRITE mode flag
   so a chunk of memory can be passed in and be modified
2. nc_create_mem() allows the NC_INMEMORY flag to be set
   to cause the created file to be kept in memory.
3. nc_close_mem() allows the final in-memory contents to be
   retrieved at the time the file is closed.
4. A special flag, NC_MEMIO_LOCK, is provided to ensure that
   the provided memory will not be freed or reallocated.

Note the following.
1. If nc_open_memio() is called with NC_WRITE, and NC_MEMIO_LOCK is not set,
   then the netcdf-c library will take control of the incoming memory.
   This means that the original memory block should not be freed
   but the block returned by nc_close_mem() must be freed.
2. If nc_open_memio() is called with NC_WRITE, and NC_MEMIO_LOCK is set,
   then modifications to the original memory may fail if the space available
   is insufficient.

Documentation is provided in the file docs/inmemory.md.
A test case is provided: nc_test/tst_inmemory.c driven by
nc_test/run_inmemory.sh

WARNING: changes were made to the dispatch table for
the close entry. From int (*close)(int) to int (*close)(int,void*).
2018-02-25 21:45:31 -07:00
Ed Hartnett
4de61e21f2 more docs, more cleaning 2017-12-04 12:21:14 -07:00
Dennis Heimbigner
3db4f013bf Primary change: add dap4 support
Specific changes:
1. Add dap4 code: libdap4 and dap4_test.
   Note that until the d4ts server problem is solved, dap4 is turned off.
2. Modify various files to support dap4 flags:
	configure.ac, Makefile.am, CMakeLists.txt, etc.
3. Add nc_test/test_common.sh. This centralizes
   the handling of the locations of various
   things in the build tree: e.g. where is
   ncgen.exe located. See nc_test/test_common.sh
   for details.
4. Modify .sh files to use test_common.sh
5. Obsolete separate oc2 by moving it to be part of
   netcdf-c. This means replacing code with netcdf-c
   equivalents.
5. Add --with-testserver to configure.ac to allow
   override of the servers to be used for --enable-dap-remote-tests.
6. There were multiple versions of nctypealignment code. Try to
   centralize in libdispatch/doffset.c and include/ncoffsets.h
7. Add a unit test for the ncuri code because of its complexity.
8. Move the findserver code out of libdispatch and into
   a separate, self contained program in ncdap_test and dap4_test.
9. Move the dispatch header files (nc{3,4}dispatch.h) to
   .../include because they are now shared by modules.
10. Revamp the handling of TOPSRCDIR and TOPBUILDDIR for shell scripts.
11. Make use of MREMAP if available
12. Misc. minor changes e.g.
	- #include <config.h> -> #include "config.h"
	- Add some no-install headers to /include
	- extern -> EXTERNL and vice versa as needed
	- misc header cleanup
	- clean up checking for misc. unix vs microsoft functions
13. Change copyright decls in some files to point to LICENSE file.
14. Add notes to RELEASENOTES.md
2017-03-08 17:01:10 -07:00
Dennis Heimbigner
0cf1e2c49f re: Github issue netcdf-c 300
Modified provenance code to allocate the minimal space
needed for _NCProperties attribute in file.  Basically
required using malloc in the provenance code and in ncdump.
Otherwise should cause no externally visible effects.
Also removed the ENABLE_FILEINFO from configure.ac since
the provenance code is no longer optional.
2016-08-08 09:24:19 -06:00
Dennis Heimbigner
11a259ad86 Add provenance info for netcdf-4 files.
This consists of a persistent attribute named
_NCProperties plus two computed attributes
_IsNetcdf4 and _SuperblockVersion.
See the 'Provenance Attributes' section
of docs/attribute_conventions.md for details.
2016-05-07 14:32:07 -06:00
Quincey Koziol
b2dfacbcfa Big clean up to type handling in libsrc4, which makes fill-values work
correctly for variables with string datatype, plus a few other minor changes.
2014-02-11 17:12:08 -06:00
Ward Fisher
ddf3c31bb0 Corrected a handful of syntax issues in CMake config files,
probably introduced more.  

Added CMake-related files to Makefile.am files for inclusion
when creating a distribution package.
2013-02-20 23:28:28 +00:00
Ed Hartnett
d58c18c623 added diskless files API, subsetting not working, classic model only 2011-08-16 21:04:33 +00:00
Ed Hartnett
e7b3fa19c1 removed mention of pnetcdf and hdf4 link options as they are handled by configure 2011-05-31 15:03:31 +00:00
Ed Hartnett
8d99e0b43a more changes for mingw/DLL builds 2011-05-11 19:51:11 +00:00
Ed Hartnett
2f4cf5382e took out ldflags in Makefile.am files 2011-05-11 17:09:12 +00:00
Dennis Heimbigner
0b477e29cc rebuilt dap constraints 2011-04-16 20:56:36 +00:00
Ed Hartnett
0259975e26 changes in response to user feedback to rc1, plus cleaned up H5 test dependancy 2011-03-21 15:45:17 +00:00
Ed Hartnett
d334706ffe added math library 2011-03-18 12:20:13 +00:00
Ed Hartnett
965a3aac70 minor refactor of the build system to work better for cross-compiling 2011-03-15 10:19:08 +00:00
Ed Hartnett
b9358f5890 updated (and fixed) shared library version numbers for all netcdf libraries 2011-02-17 12:58:08 +00:00
Ed Hartnett
668ed2e0a5 fixed chunking bug: default chunks must always be under 4GB in size 2011-02-02 14:09:15 +00:00
Ed Hartnett
cdd254c983 added -lm to please fedore netcdf package maintainer 2011-01-31 11:40:56 +00:00
Ed Hartnett
03f63a5f1c many changes for memory fixes 2010-11-29 22:23:16 +00:00
Ed Hartnett
a658960034 added more valgrind testing 2010-10-13 22:53:25 +00:00
Ed Hartnett
c9033810e1 added many valgrind tests in libsrc4, fixed some memory problems 2010-10-12 09:12:08 +00:00
Ed Hartnett
d8cdc2085d more testing of HDF5 vlen type 2010-10-11 14:52:29 +00:00
Ed Hartnett
54297ac014 changed way that netcdf-4 scans HDF5 file on open for big performance boost 2010-09-22 13:38:15 +00:00
Dennis Heimbigner
6edff3f964 cleanup pre-dispatch stuff 2010-07-30 22:16:15 +00:00
Ed Hartnett
f1bea72f66 fixed position of test in list, which matters for extra-test builds, and fixed its error handling 2010-06-29 13:03:46 +00:00
Ed Hartnett
f0a72cbefb fixed broken classic-only build 2010-06-28 17:17:43 +00:00
Ed Hartnett
4220dca07a broke out test into libsrc/tst_interops6.c to help find memory problem 2010-06-25 14:51:16 +00:00
Ed Hartnett
663e344099 changes 2010-06-23 18:43:18 +00:00
Ed Hartnett
9da0b67627 fixed some automake issues to get distcheck working 2010-06-23 17:48:10 +00:00
Ed Hartnett
f6447ee94c fixed extra-test libsrc4/tst_h_vl2 problem and valgrind test problems 2010-06-23 13:57:33 +00:00
Ed Hartnett
bb57cf2e4b moved functions out of m4 into c files in libdispatch/netcdf.m4 2010-06-18 14:01:51 +00:00
Ed Hartnett
0b54b49991 moved some tests which used v2 API from libsrc4 to nc_test4, to allow the library to be built before the tests 2010-06-15 15:24:28 +00:00
Ed Hartnett
552151d61a fix libnetcdf2 problems with enable-c-only build in libsrc4 and libncdap4 2010-06-14 18:41:52 +00:00
Ed Hartnett
57d743dddb fixed build problems for libcf/gridspec, pnetcdf, and hdf4 2010-06-10 17:18:48 +00:00
Ed Hartnett
bb0035c95d fixed Makefile.am includes to find new netcdf.h location 2010-06-07 15:40:31 +00:00
Ed Hartnett
1ba136e4cd fixed makefile problems to allow make dist to work 2010-06-04 12:23:10 +00:00
Ed Hartnett
cdabf7bbf4 build system clean-up 2010-06-03 20:33:02 +00:00
Ed Hartnett
92ccf1c5fa moved headers to include directory 2010-06-03 20:22:55 +00:00