Commit Graph

64 Commits

Author SHA1 Message Date
Dennis Heimbigner
59e04ae071 This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".

The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.

More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).

WARNING: this code has had limited testing, so do use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:

Platform | Build System | S3 support
------------------------------------
Linux+gcc      | Automake     | yes
Linux+gcc      | CMake        | yes
Visual Studio  | CMake        | no

Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future.  Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.

In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*.  The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
   and the version bumped.
4. An overly complex set of structs was created to support funnelling
   all of the filterx operations thru a single dispatch
   "filter_actions" entry.
5. Move common code to from libhdf5 to libsrc4 so that it is accessible
   to nczarr.

Changes directly related to Zarr:
1. Modified CMakeList.txt and configure.ac to support both C and C++
   -- this is in support of S3 support via the awd-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
   support zarr and to regularize the structure of the fragments
   section of a URL.

Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
   e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
   * Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
   and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.

Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-28 18:02:47 -06:00
Ward Fisher
16e83896c6 Disable byterange tests if DAP remote tests are disabled. 2020-01-07 17:40:03 -07:00
Dennis Heimbigner
c9d16d82d6 Fix cmake X mmap
supercede PR: https://github.com/Unidata/netcdf-c/pull/1384

Since we have an mmap user, undeprecate it and make sure
it works. Other changes:

* fix test cases to work with make -j
* fix exposed ncgen error.
2019-04-19 20:32:26 -06:00
Dennis Heimbigner
0c59e13bf7 Master merge, conflict resolution, cleanup 2019-02-24 16:54:13 -07:00
Dennis Heimbigner
bf2746b8ea Provide byte-range reading of remote datasets
re: issue https://github.com/Unidata/netcdf-c/issues/1251

Assume that you have the URL to a remote dataset
which is a normal netcdf-3 or netcdf-4 file.

This PR allows the netcdf-c to read that dataset's
contents as a netcdf file using HTTP byte ranges
if the remote server supports byte-range access.

Originally, this PR was set up to access Amazon S3 objects,
but it can also access other remote datasets such as those
provided by a Thredds server via the HTTPServer access protocol.
It may also work for other kinds of servers.

Note that this is not intended as a true production
capability because, as is known, this kind of access to
can be quite slow. In addition, the byte-range IO drivers
do not currently do any sort of optimization or caching.

An additional goal here is to gain some experience with
the Amazon S3 REST protocol.

This architecture and its use documented in
the file docs/byterange.dox.

There are currently two test cases:

1. nc_test/tst_s3raw.c - this does a simple open, check format, close cycle
   for a remote netcdf-3 file and a remote netcdf-4 file.
2. nc_test/test_s3raw.sh - this uses ncdump to investigate some remote
   datasets.

This PR also incorporates significantly changed model inference code
(see the superceded PR https://github.com/Unidata/netcdf-c/pull/1259).

1. It centralizes the code that infers the dispatcher.
2. It adds support for byte-range URLs

Other changes:

1. NC_HDF5_finalize was not being properly called by nc_finalize().
2. Fix minor bug in ncgen3.l
3. fix memory leak in nc4info.c
4. add code to walk the .daprc triples and to replace protocol=
   fragment tag with a more general mode= tag.

Final Note:
Th inference code is still way too complicated. We need to move
to the validfile() model used by netcdf Java, where each
dispatcher is asked if it can process the file. This decentralizes
the inference code. This will be done after all the major new
dispatchers (PIO, Zarr, etc) have been implemented.
2019-01-01 18:27:36 -07:00
Ward Fisher
763750fab5 More copyright stanza updates. 2018-12-06 14:56:42 -07:00
Dennis Heimbigner
571cf6b933 Fix test run_diskless2.sh with LARGE_FILE_TESTS
ret: https://github.com/Unidata/netcdf-c/issues/1162

The test nc_test/run_diskless2.sh fails
when LARGE_FILE_TESTS is enabled.
Since the goal of the test was to test out
diskless+persist on a reasonably large file,
I fixed by just limiting the file size to
1000000000L bytes.
2018-11-03 11:51:10 -06:00
Dennis Heimbigner
245961de00 re: github issues
https://github.com/Unidata/netcdf-c/issues/1168
    https://github.com/Unidata/netcdf-c/issues/1163
    https://github.com/Unidata/netcdf-c/issues/1162

This PR partially fixes memory leaks in the netcdf-c library,
in the ncdump utility, and in some test cases.

The netcdf-c library now runs memory clean with the assumption
that the --disable-utilities option is used. The primary remaining
problem is ncgen. Once that is fixed, I believe the netcdf-c library
will run memory clean with no limitations.

Notes
-----------
1. Memory checking was performed using gcc -fsanitize=address.
   Valgrind-based testing has yet to be performed.
2. The pnetcdf, hdf4, and examples code has not been tested.

Misc. Non-leak changes
1. Make tst_diskless2 only run when netcdf4 is enabled (issue 1162)
2. Fix CmakeLists.txt to turn off logging if ENABLE_NETCDF_4 is OFF
3. Isolated all my debug scripts into a single top-level directory
   called debug
4. Fix some USE_NETCDF4 dependencies in nc_test and nc_test4 Makefile.am
2018-10-30 20:48:12 -06:00
Dennis Heimbigner
553de966d6 Add some visual studio bug fixes 2018-10-11 12:09:42 -06:00
Dennis Heimbigner
4636584d5b Revert/Improve nc_create + NC_DISKLESS behavior
re: https://github.com/Unidata/netcdf-c/issues/1154

Inadvertently, the behavior of NC_DISKLESS with nc_create() was
changed in release 4.6.1. Previously, the NC_WRITE flag needed
to be explicitly used with NC_DISKLESS in order to cause the
created file to be persisted to disk.

Additional analyis indicated that the current NC_DISKLESS
implementation was seriously flawed.

This PR attempts to clean up and regularize the situation with
respect to NC_DISKLESS control. One important aspect of diskless
operation is that there are two different notions of write.

1. The file is read-write vs read-only when using the netcdf API.
2. The file is persisted or not to disk at nc_close().

Previously, these two were conflated. The rules now are
as follows.

1. NC_DISKLESS + NC_WRITE means that the file is read/write using the netcdf API
2. NC_DISKLESS + NC_PERSIST means that the file is persisted to a disk file at nc_close.
3. NC_DISKLESS + NC_PERSIST + NC_WRITE means both 1 and 2.

The NC_PERSIST flag is new and takes over the obsolete NC_MPIPOSIX flag.
NC_MPIPOSIX is still defined, but is now an alias for the NC_MPIIO flag.

It is also now the case that for netcdf-4, NC_DISKLESS is independent
of NC_INMEMORY and in fact it is an error to specify both flags
simultaneously.

Finally, the MMAP code was fixed to use NC_PERSIST as well.
Also marked MMAP as deprecated.

Also added a test case to test various combinations of NC_DISKLESS,
NC_PERSIST, and NC_WRITE.

This PR affects a number of files and especially test cases
that used NC_DISKLESS.

Misc. Unrelated fixes
1. fixed some warnings in ncdump/dumplib.c
2018-10-10 13:32:17 -06:00
Wei-keng Liao
0ee68a3263 This commit fixes the logical problem of using the default file formats.
The fix includes the following changes.
1. Checking and using the default file format at file create time is now
   done only when the create mode (argument cmode) does not include any
   format related flags, i.e. NC_64BIT_OFFSET, NC_64BIT_DATA,
   NC_CLASSIC_MODEL, and NC_NETCDF4.
2. Adjustment of cmode based on the default format is now done in
   NC_create() only. The idea is to adjust cmode before entering the
   dispatcher's file create subroutine.
3. Any adjustment of cmode is removed from all I/O dispatchers, i.e.
   NC4_create(), NC3_create(), and NCP_create().
4. Checking for illegal cmode has been done in check_create_mode() called
   in NC_create(). This commit removes the redundant checking from
   NCP_create().
5. Remove PnetCDF tests in nc_test/tst_names.c, so it can focus on testing
   all classic formats and netCDF4 formats.

Two new test programs are added. They can be used to test netCDF with and
without this commit.
1. nc_test/tst_default_format.c
2. nc_test/tst_default_format_pnetcdf.c (use when PnetCDF is enabled).
2018-07-28 11:18:28 -05:00
Dennis Heimbigner
4b936ee26a Fix use of 'int' to represent 'hid_t' that caused HDF5 1.10 to fail.
re: github issue https://github.com/Unidata/netcdf-fortran/issues/82

This was originally discovered in the Fortran tests, but is
a problem in the C library.

The problem only occurred when using HDF5-1.10.x.  The reason it
failed is that starting with 1.10, the hid_t type was changed
from 32 bits to 64 bits.
The function libsrc4/nc4memcb.c#NC4_image_init was using type int (doh!)
to return the hdf fileid instead of hid_t type.  This, of course,
caused the id to be truncated and in turn later use of the id
caused hdf5 to fail.

Fix is trivial: replace int with hid_t. This also requires a related
change in nc4mem.c.

Also added the test case derived from the original Fortran code.

You would think I would learn...
2018-05-30 14:47:37 -06:00
Dennis Heimbigner
3c7ffcc6d1 Fix https://github.com/Unidata/netcdf-c/issues/963
Fix https://github.com/Unidata/netcdf-c/issues/962

1. remove the --disable-diskless option since it is no
   longer needed. Similarly for CMakeLists.txt.
2. Fixed nc4files.c where BAIL and return were mixed
   leading to situation where cleanup code was not
   being invoked. This probably occurs elsewhere,
   but I did not find any specifically.
2018-05-11 15:30:19 -06:00
Ward Fisher
9f716a228f
Merge branch 'master' into inmemory.dmh 2018-05-04 16:15:22 -06:00
Dennis Heimbigner
0a44d9ae3a Merge branch 'master' into inmemory.dmh 2018-04-23 11:30:14 -06:00
Ed Hartnett
b5e282996c moved tests 2018-04-23 03:19:11 -06:00
Ward Fisher
46e39c3647 Merge branch 'err_endef_close' of https://github.com/wkliao/netcdf-c into gh479-conflict-resolution 2018-04-18 10:22:53 -06:00
Dennis Heimbigner
ccc70d640b re: esupport MQO-415619
and https://github.com/Unidata/netcdf-c/issues/708

Expand the NC_INMEMORY capabilities to support writing and accessing
the final modified memory.

Three new functions have been added:
nc_open_memio, nc_create_mem, and nc_close_memio.

The following new capabilities were added.
1. nc_open_memio() allows the NC_WRITE mode flag
   so a chunk of memory can be passed in and be modified
2. nc_create_mem() allows the NC_INMEMORY flag to be set
   to cause the created file to be kept in memory.
3. nc_close_mem() allows the final in-memory contents to be
   retrieved at the time the file is closed.
4. A special flag, NC_MEMIO_LOCK, is provided to ensure that
   the provided memory will not be freed or reallocated.

Note the following.
1. If nc_open_memio() is called with NC_WRITE, and NC_MEMIO_LOCK is not set,
   then the netcdf-c library will take control of the incoming memory.
   This means that the original memory block should not be freed
   but the block returned by nc_close_mem() must be freed.
2. If nc_open_memio() is called with NC_WRITE, and NC_MEMIO_LOCK is set,
   then modifications to the original memory may fail if the space available
   is insufficient.

Documentation is provided in the file docs/inmemory.md.
A test case is provided: nc_test/tst_inmemory.c driven by
nc_test/run_inmemory.sh

WARNING: changes were made to the dispatch table for
the close entry. From int (*close)(int) to int (*close)(int,void*).
2018-02-25 21:45:31 -07:00
Wei-keng Liao
0f4a85b9f2 a clean commit for #383 2017-12-20 20:53:30 -06:00
Ward Fisher
1b4ab683b5 Added mmap to tests that run serially 2017-12-01 15:23:26 -07:00
Wei-keng Liao
2f239d34df add test program, tst_err_enddef.c 2017-09-07 20:29:40 -05:00
Ward Fisher
85e9aaf368 Wiring in a large test to check against a regression for the issue described in https://github.com/Unidata/netcdf-c/pull/457 2017-08-11 18:18:11 -06:00
Dennis Heimbigner
5a0c7abb8f Add testcase 2017-05-03 09:58:49 -06:00
Ward Fisher
8dddd222a3 Merged master, DAP4 support into branch. 2017-04-19 09:29:35 -06:00
Ward Fisher
b5b99cbebb Merge branch 'master' into ghpull-375 2017-04-05 16:24:55 -06:00
Dennis Heimbigner
5f15c9e777 1) master merge 2)fix uname -o problem 2017-03-30 16:21:31 -06:00
Ward Fisher
16169a27af Wiring in test in support of https://github.com/Unidata/netcdf-c/issues/388 2017-03-29 15:25:38 -06:00
Ward Fisher
6b036cfc28 Adding test in support of https://github.com/Unidata/netcdf-c/issues/384 2017-03-29 11:23:47 -06:00
Dennis Heimbigner
3db4f013bf Primary change: add dap4 support
Specific changes:
1. Add dap4 code: libdap4 and dap4_test.
   Note that until the d4ts server problem is solved, dap4 is turned off.
2. Modify various files to support dap4 flags:
	configure.ac, Makefile.am, CMakeLists.txt, etc.
3. Add nc_test/test_common.sh. This centralizes
   the handling of the locations of various
   things in the build tree: e.g. where is
   ncgen.exe located. See nc_test/test_common.sh
   for details.
4. Modify .sh files to use test_common.sh
5. Obsolete separate oc2 by moving it to be part of
   netcdf-c. This means replacing code with netcdf-c
   equivalents.
5. Add --with-testserver to configure.ac to allow
   override of the servers to be used for --enable-dap-remote-tests.
6. There were multiple versions of nctypealignment code. Try to
   centralize in libdispatch/doffset.c and include/ncoffsets.h
7. Add a unit test for the ncuri code because of its complexity.
8. Move the findserver code out of libdispatch and into
   a separate, self contained program in ncdap_test and dap4_test.
9. Move the dispatch header files (nc{3,4}dispatch.h) to
   .../include because they are now shared by modules.
10. Revamp the handling of TOPSRCDIR and TOPBUILDDIR for shell scripts.
11. Make use of MREMAP if available
12. Misc. minor changes e.g.
	- #include <config.h> -> #include "config.h"
	- Add some no-install headers to /include
	- extern -> EXTERNL and vice versa as needed
	- misc header cleanup
	- clean up checking for misc. unix vs microsoft functions
13. Change copyright decls in some files to point to LICENSE file.
14. Add notes to RELEASENOTES.md
2017-03-08 17:01:10 -07:00
Wei-keng Liao
beac085c7d more updates due to renaming test_read.c and test_write.c to m4 files 2016-10-13 02:56:50 -05:00
Wei-keng Liao
ac86d8c9fb Update CMakeLists.txt due to test_read.c and test_write.c have been renamed to m4 files 2016-10-13 02:25:21 -05:00
Ward Fisher
486f69522c Merged utility-based tests fix into cmake-based builds. 2016-08-31 15:48:10 -06:00
Ward Fisher
93641eb2d6 Refactored types test. 2016-03-28 18:29:47 +00:00
Ward Fisher
90541e6c17 Cleanup: Removing temporary tests. 2015-12-28 21:17:25 +00:00
Ward Fisher
5cb4efe02a Tweaked new test. 2015-12-22 15:48:05 -07:00
Ward Fisher
2ea988f84d Added a check for signed char. 2015-12-22 12:02:18 -07:00
Ward Fisher
b7a4f1c7b5 Updated tests so that tst_addvar.c is run after tst_pnetcdf.c 2015-10-29 10:27:31 -06:00
Ward Fisher
ccad3765ca Adding a new test for a particular pnetcdf bug. See YRZ-543552. 2015-10-28 10:51:53 -06:00
Ward Fisher
5349c99e8a Removed tst_swap4b for now, it is causing difficulties that are proving problematic to work around. 2015-08-17 16:57:31 -06:00
Ward Fisher
2664eca9e5 Wired the refactored tst_swap4b into autotools and cmake. 2015-07-28 12:24:13 -06:00
Ward Fisher
da82e6dbbb Refactoring of tst_swap4b 2015-07-28 11:14:07 -06:00
Ward Fisher
b2bdd83a7f Added skeleton of new test for [NCF338]. 2015-07-27 11:20:44 -06:00
Ward Fisher
017f196628 Added the [NCF-326] test to CMake build system. 2015-03-10 09:53:13 -06:00
Ward Fisher
fe1b96cdd9 Updated CMakelists to remove debug, release subdirectories on Windows. Updating shell scripts to work with MSYS paths. 2015-02-02 14:46:50 -07:00
Nico Schlömer
16ce3507f4 nc_test explicitly uses libm 2014-10-08 12:41:48 +02:00
Ward Fisher
069820158d Added a test to catch errors introduced into netcdf_meta.h 2014-09-11 15:44:56 -06:00
Ward Fisher
7f812b367e Manual merge of pull request https://github.com/Unidata/netcdf-c/pull/64 contributed by nschloe. Assorted CMake improvements. 2014-06-11 15:51:31 -06:00
Ward Fisher
44fae42214 Cleaned up indentation, white space in multiple CMakeLists.txt files. 2014-04-21 11:15:33 -06:00
Ward Fisher
78d19f69fd Updating for Windows Large File Support. 2013-10-16 16:44:11 -06:00
Ward Fisher
a595c62d7f Updated RELEASE_NOTES with reference to LFS Jira ticket. Temporarily removed an LFS test on Windows. 2013-08-28 16:30:41 -06:00