Commit Graph

36 Commits

Author SHA1 Message Date
Edward Hartnett
4103faf0d5 added some ncdump tests for szip when it is present 2020-07-02 13:59:37 -06:00
Dennis Heimbigner
59e04ae071 This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".

The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.

More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).

WARNING: this code has had limited testing, so do not use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:

Platform       | Build System | S3 support
---------------|--------------|-----------
Linux+gcc      | Automake     | yes
Linux+gcc      | CMake        | yes
Visual Studio  | CMake        | no

Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future.  Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.

In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*.  The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
   and the version bumped.
4. An overly complex set of structs was created to support funnelling
   all of the filterx operations thru a single dispatch
   "filter_actions" entry.
5. Move common code from libhdf5 to libsrc4 so that it is accessible
   to nczarr.

Changes directly related to Zarr:
1. Modified CMakeLists.txt and configure.ac to support both C and C++
   -- this is in support of S3 support via the aws-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
   support zarr and to regularize the structure of the fragments
   section of a URL.

Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
   e.g. NC_CREAT, NC_INDEF, etc.
3. Clean up include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
   * Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
   and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.

Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-28 18:02:47 -06:00
Dennis Heimbigner
751300ec59 Fix more memory leaks in netcdf-c library
This is a follow up to PR https://github.com/Unidata/netcdf-c/pull/1173

Sorry that it is so big, but leak suppression can be complex.

This PR fixes all remaining memory leaks -- as determined by
-fsanitize=address, and with the exceptions noted below.

Unfortunately, there remains a significant leak that I cannot
solve. It involves vlens, and it is unclear if the leak is
occurring in the netcdf-c library or the HDF5 library.

I have added a check_PROGRAM to the ncdump directory to show the
problem.  The program is called tst_vlen_demo.c. To exercise it,
build the netcdf library with -fsanitize=address enabled. Then
go into ncdump and do a "make clean check".  This should build
tst_vlen_demo without actually executing it.  Then do the
command "./tst_vlen_demo" to see the output of the memory
checker.  Note that the lost malloc is deep in the HDF5 library
(in H5Tvlen.c).

I am temporarily working around this error in the following way.
1. I modified several test scripts to not execute known vlen tests
   that fail as described above.
2. Added an environment variable called NC_VLEN_NOTEST.
   If set, then those specific tests are suppressed.

This should mean that the --disable-utilities option to
./configure should not need to be set to get a memory leak clean
build.  This should allow for detection of any new leaks.

Note: I used an environment variable rather than a ./configure
option to control the vlen tests. This is because it is
temporary (I hope) and because it is a bit tricky for shell
scripts to access ./configure options.

Finally, as before, this has only been tested with netcdf-4 and hdf5 support.
2018-11-15 10:00:38 -07:00
Ward Fisher
6efa25e7e4 Updated and removed a check for _NCProperties during file comparison, as this test fails if the environment is different from the environment the reference file was built on. 2018-09-06 15:53:25 -06:00
Dennis Heimbigner
2ea1cf5f1b There was a request to extend the provenance information
stored in the _NCProperties attribute to allow two things:
1. capture of additional library dependencies (over and above
   hdf5)
2. Recognition of non-netcdf libraries that create netcdf-4 format
   files.

To this end, the _NCProperties format has been extended to be
an arbitrary set of key=value pairs separated by commas.
This new format has version = 2, and uses commas as the pair separator.
Thus the general form is:
    _NCProperties = "version=2,key1=value,key2=value2..." ;

This new version is accompanied by a new ./configure option of the form
    --with-ncproperties="key1=value1,key2=value2..."
that specifies pairs to add to the _NCProperties attribute for all
files created with that netcdf library.

At this point, what is missing is some programmatic way to
specify either all the pairs or additional pairs
to the _NCProperties attribute. Not sure of the best way
to do this.

Builders using non-netcdf libraries can specify
whatever they want in the key value pairs (as long
as the version=2 is specified first).

By convention, the primary library is expected to be the
first pair after the leading version=2 pair, but this
is convention only and is neither required nor enforced.
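
As a rough illustration of the version 2 format described above, the
following Python sketch splits an _NCProperties value into ordered
key/value pairs and checks that version=2 comes first. The function name
parse_ncproperties and the sample keys are hypothetical and are not part
of the netcdf-c library; the sketch also assumes that keys and values
contain no commas or escaped characters.

```python
def parse_ncproperties(value):
    """Parse a version 2 _NCProperties string into ordered (key, value) pairs.

    Hypothetical helper for illustration only; no escaping is handled.
    """
    pairs = []
    for item in value.split(","):
        key, sep, val = item.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed pair: {item!r}")
        pairs.append((key, val))
    # The commit message requires version=2 to be the first pair.
    if pairs[0] != ("version", "2"):
        raise ValueError("version=2 must be the first pair")
    return pairs


# Sample keys below are illustrative, not taken from a real file.
props = parse_ncproperties("version=2,netcdf=4.6.2,hdf5=1.10.1")
primary_library = props[1][0]  # by convention, the pair right after version=2
```

A list of pairs is used rather than a dict so that the order of the
pairs, which carries the "primary library" convention, is preserved
explicitly.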

Related changes:
1. Fixed the tests that check _NCProperties to properly operate with version=2.
2. When reading a version 1 _NCProperties attribute, convert it to look
   like a version 2 attribute.
3. Added some version 2 tests to ncdump/tst_fileinfo.c and
   ncdump/tst_fileinfo.sh

Misc Changes:
1. Fix minor problem in ncdap_test/testurl.sh where a parameter to
   buildurl needed to be quoted.
2. Minor fix to ncgen to swap switches -H and -h to be consistent
   with other utilities.
3. Document the -M flag in nccopy usage() and the nccopy man page.
4. Modify a test case to use the nccopy -M flag.
2018-08-25 21:44:41 -06:00
Ward Fisher
40a929575c sh compatibility. 2018-08-13 00:01:53 -06:00
Ward Fisher
77c3fd023e Removed 'function' keyword for sh compatibility. 2018-08-12 23:23:59 -06:00
Ward Fisher
a996ed554e Swapped /bin/bash for /bin/sh to test on osx. 2018-08-12 23:01:08 -06:00
Ed Hartnett
4adc7e5cc6 fixed comment 2018-05-15 08:24:27 -06:00
Ed Hartnett
d167786796 moved irish rover ncdump test to netcdf-4 test script 2018-05-15 08:13:25 -06:00
Ward Fisher
98dd736a40 Oops. Corrected error properly, should check before I commit. 2017-12-18 14:57:25 -07:00
Ward Fisher
1703e33c61 Removed 'function' keyword, non-portable on ARM. 2017-12-18 14:56:32 -07:00
Ward Fisher
84a77be0c4 Added explicit error checking to tst_netcdf4.sh 2017-12-18 15:26:33 -06:00
Ward Fisher
58972c8cc0 Working out issues on Windows. 2017-12-12 10:44:22 -06:00
Ed Hartnett
c8605e07c6 removing dependencies between tests 2017-11-25 05:19:11 -07:00
Ed Hartnett
e064855a10 removed ncdump test dependencies for tst_netcdf4.sh 2017-11-25 04:31:58 -07:00
Ed Hartnett
87d59b80fb Makefile.am cleanup 2017-11-18 14:20:04 -07:00
Ed Hartnett
693a47ad04 move C program invocations to scripts to clean up dependencies in build 2017-11-16 13:03:35 -07:00
Dennis Heimbigner
9983b9d911 re e-support UBS-599337
re pull request https://github.com/Unidata/netcdf-c/pull/405
re pull request https://github.com/Unidata/netcdf-c/pull/446

Notes:
1. This branch is a cleanup of the magic.dmh branch.
2. magic.dmh was originally merged, but caused problems with parallel IO.
   It was re-issued as pull request https://github.com/Unidata/netcdf-c/pull/446.
3. This branch + pull request replace any previous pull requests and magic.dmh branch.

Given an otherwise valid netCDF file that has a corrupted header,
the netcdf library currently crashes. Instead, it should return
NC_ENOTNC.

Additionally, the NC_check_file_type code does not do the
forward search required by hdf5 files. It currently only looks
at file position 0 instead of 512, 1024, 2048,... Also, it turns
out that the HDF4 magic number is assumed to always be at the
beginning of the file (unlike HDF5).
The change is localized to libdispatch/dfile.c. See
https://support.hdfgroup.org/release4/doc/DSpec_html/DS.pdf

Also, it turns out that the code in NC_check_file_type is duplicated
(mostly) in the function libsrc4/nc4file.c#nc_check_for_hdf.

This branch does the following.
1. Make NC_check_file_type return NC_ENOTNC instead of crashing.
2. Remove nc_check_for_hdf and centralize all file format checking
   in NC_check_file_type.
3. Add proper forward search for HDF5 files (but not HDF4 files)
   to look for the magic number at offsets of 0, 512, 1024...
4. Add test tst_hdf5_offset.sh. This tests that hdf5 files with
   an offset are properly recognized. It does so by prefixing
   a legal file with some number of zero bytes: 512, 1024, etc.
5. Off-topic: Added -N flag to ncdump to force a specific output dataset name.
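
The forward search in item 3 can be sketched in Python as follows. This
is not the code in libdispatch/dfile.c. It assumes the standard 8-byte
HDF5 signature and that candidate offsets double at each step (0, 512,
1024, 2048, ...). The function name find_hdf5_offset and the max_offset
limit are hypothetical.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # 8-byte HDF5 superblock signature


def find_hdf5_offset(data, max_offset=1 << 20):
    """Return the offset of the HDF5 signature in data, or None if absent.

    Only offsets 0, 512, 1024, 2048, ... are examined, up to max_offset
    (an arbitrary cap chosen for this sketch).
    """
    size = len(HDF5_SIGNATURE)
    offset = 0
    while offset <= max_offset and offset + size <= len(data):
        if data[offset:offset + size] == HDF5_SIGNATURE:
            return offset
        # Step from 0 to 512, then double: 1024, 2048, ...
        offset = 512 if offset == 0 else offset * 2
    return None


# Mimic the test in item 4: prefix a stand-in file with zero bytes.
stand_in = HDF5_SIGNATURE + b"rest of file"
found = [find_hdf5_offset(bytes(pad) + stand_in) for pad in (0, 512, 1024)]
```

A signature placed at an offset outside that sequence (for example 100)
is not found, which matches the idea that only the listed offsets are
probed.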
2017-10-24 16:25:09 -06:00
Dennis Heimbigner
3db4f013bf Primary change: add dap4 support
Specific changes:
1. Add dap4 code: libdap4 and dap4_test.
   Note that until the d4ts server problem is solved, dap4 is turned off.
2. Modify various files to support dap4 flags:
	configure.ac, Makefile.am, CMakeLists.txt, etc.
3. Add nc_test/test_common.sh. This centralizes
   the handling of the locations of various
   things in the build tree: e.g. where is
   ncgen.exe located. See nc_test/test_common.sh
   for details.
4. Modify .sh files to use test_common.sh
5. Obsolete separate oc2 by moving it to be part of
   netcdf-c. This means replacing code with netcdf-c
   equivalents.
6. Add --with-testserver to configure.ac to allow
   override of the servers to be used for --enable-dap-remote-tests.
7. There were multiple versions of nctypealignment code. Try to
   centralize in libdispatch/doffset.c and include/ncoffsets.h
8. Add a unit test for the ncuri code because of its complexity.
9. Move the findserver code out of libdispatch and into
   a separate, self contained program in ncdap_test and dap4_test.
10. Move the dispatch header files (nc{3,4}dispatch.h) to
   .../include because they are now shared by modules.
11. Revamp the handling of TOPSRCDIR and TOPBUILDDIR for shell scripts.
12. Make use of MREMAP if available
13. Misc. minor changes e.g.
	- #include <config.h> -> #include "config.h"
	- Add some no-install headers to /include
	- extern -> EXTERNL and vice versa as needed
	- misc header cleanup
	- clean up checking for misc. unix vs microsoft functions
14. Change copyright decls in some files to point to LICENSE file.
15. Add notes to RELEASENOTES.md
2017-03-08 17:01:10 -07:00
Dennis Heimbigner
11a259ad86 Add provenance info for netcdf-4 files.
This consists of a persistent attribute named
_NCProperties plus two computed attributes
_IsNetcdf4 and _SuperblockVersion.
See the 'Provenance Attributes' section
of docs/attribute_conventions.md for details.
2016-05-07 14:32:07 -06:00
dmh
47e10591b4 ckp 2015-11-19 13:44:55 -07:00
Russ Rew
9a60dc612f Use short "-k" codes instead of deprecated version numbers for ncgen and nccopy tests 2014-12-28 22:42:05 -07:00
dmh
7e582ad3f2 re: Jira NCF-309
The code for handling character constants
in datalists in ncgen has some problems.
1. It failed on large constants
2. It did not handle e.g. var = 'a', 'b', ...
   in the same way that ncgen3 did.
3. The code for generate.c and genchar.c needed
   some refactoring to make it a little simpler
   (but not simple).
2014-09-18 18:26:06 -06:00
dmh
cc95bd3d47 1. [NCF-276]/XXI-796914
Columbia server does not serve up proper
   opendap DDS replies. The Dataset {...} name
   changes depending on if the request has certain
   kinds of constraints.
   Code for a hack was not being used, so restore it.
   The fix is to effectively ignore differences in
   Dataset node names if the code is coming from
   columbia.edu.

2. [NCF-278]
   The ncgen code is improperly typing int64 integer constants
   as uint64.

3. [NCF-279]
   Empty string constants were not being properly
   filled when their target array is length 1 or more.
2013-11-17 14:26:14 -07:00
dmh
e7414e16d0 [NCF-265] again.
1. Updated the ncgen manual (ncgen.1)
   to discuss handling of ambiguous
   enumeration constant references.
2. Fixed the test case. It is currently
   XFAIL'd until such time as ncdump
   is modified to output properly
   disambiguated enumeration constant
   references.
2013-09-22 12:08:27 -06:00
Dennis Heimbigner
b083b9e758 fix == in shell scripts 2012-07-17 20:13:17 +00:00
Ward Fisher
2e96987a41 Merged latest changes from trunk, including deletion of win32 directory. 2012-06-13 19:29:01 +00:00
Russ Rew
79cde861ac Delete obsolete libdiskless directory, replaced by new diskless
implementation.  Deleted obsolete win32, soon to be replaced by Ward's
Windows 32- and 64-bit fixes for building with MSYS/MinGW.  Made
cosmetic cleanup to output of "make check" to make it easier for users
to interpret.  Fixed bug NCF-175: ncdump -t incorrectly interpreting
units attribute (such as "days") without a base time (such as "since
2007-01-01") as a time unit.

Changed name to 4.2.1-beta.
2012-06-12 21:50:02 +00:00
Ward Fisher
a5c4bf581f o Moved a couple more scripts around to use a uniform
naming convention.

o Modified ncdump shell, test scripts so that the extra
  digit in an exponent on windows wouldn't be a problem.

o Modified configure.ac to check for the zlib dll provided
  by the zlib group; they recommend using the official dll for
  windows builds.
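
One possible way to neutralize the extra exponent digit when comparing
ncdump output is sketched below in Python. It assumes the Windows C
runtime of that era printed three exponent digits (e.g. 1.5e+007) where
other platforms print two. The helper name normalize_exponent is
hypothetical, and this is not necessarily how the scripts in this commit
handle the difference.

```python
import re

# Matches an exponent marker and sign followed by a padding zero and
# exactly two more digits, e.g. "e+007" or "E-012".
_PADDED_EXPONENT = re.compile(r"([eE][+-])0(\d{2})\b")


def normalize_exponent(text):
    """Drop the leading zero from three-digit exponents such as e+007."""
    return _PADDED_EXPONENT.sub(r"\1\2", text)


normalized = normalize_exponent("x = 1.5e+007, 2.25e-003 ;")
```

Genuine three-digit exponents such as 1e+100 are left alone because the
pattern only matches when the first exponent digit is zero.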
2012-05-25 21:22:09 +00:00
Ward Fisher
ed3f676414 Corrected type in previous commit. 2012-05-11 20:27:34 +00:00
Ward Fisher
f620046772 Excluded absolute path ncdump test from running on windows platforms for the time being. See notes in tst_netcdf4.sh 2012-05-11 20:26:11 +00:00
Ward Fisher
08c29d0f06 o ncdump.c: set PRINTF_EXPONENT_DIGITS=2 when in windows to control the number of digits in the exponent.
o *.sh: Added stanzas to ensure that srcdir is set if it's not already set.
2012-05-08 22:37:56 +00:00
Ward Fisher
9fc359f29b Added -b flag to diff in shell scripts, to ignore whitespace, newlines, carriage returns. 2012-05-08 19:36:27 +00:00
Russ Rew
b37a176fa0 Clarified documentation for nc_inq_grp_ncid(), differentiating it from nc_inq_ncid(). Fixed problem returning values from functions declared void in some libsrc4/ test programs. Added test for bug in ncdump for dimensions with same name in nested groups, and fix for bug. Fixed nccopy bug involving dimensions with same name in nested groups. Added code for specifying chunking by dimension to nccopy (but not implemented yet). 2010-12-30 18:17:04 +00:00
Ed Hartnett
18f4bca367 moving to trunk subdir 2010-06-03 13:24:43 +00:00