supercede PR: https://github.com/Unidata/netcdf-c/pull/1384
Since we have an mmap user, undeprecate it and make sure
it works. Other changes:
* fix test cases to work with make -j
* fix exposed ncgen error.
re: issue https://github.com/Unidata/netcdf-c/issues/1251
Assume that you have the URL to a remote dataset
which is a normal netcdf-3 or netcdf-4 file.
This PR allows the netcdf-c to read that dataset's
contents as a netcdf file using HTTP byte ranges
if the remote server supports byte-range access.
Originally, this PR was set up to access Amazon S3 objects,
but it can also access other remote datasets such as those
provided by a Thredds server via the HTTPServer access protocol.
It may also work for other kinds of servers.
Note that this is not intended as a true production
capability because, as is known, this kind of access to
can be quite slow. In addition, the byte-range IO drivers
do not currently do any sort of optimization or caching.
An additional goal here is to gain some experience with
the Amazon S3 REST protocol.
This architecture and its use documented in
the file docs/byterange.dox.
There are currently two test cases:
1. nc_test/tst_s3raw.c - this does a simple open, check format, close cycle
for a remote netcdf-3 file and a remote netcdf-4 file.
2. nc_test/test_s3raw.sh - this uses ncdump to investigate some remote
datasets.
This PR also incorporates significantly changed model inference code
(see the superceded PR https://github.com/Unidata/netcdf-c/pull/1259).
1. It centralizes the code that infers the dispatcher.
2. It adds support for byte-range URLs
Other changes:
1. NC_HDF5_finalize was not being properly called by nc_finalize().
2. Fix minor bug in ncgen3.l
3. fix memory leak in nc4info.c
4. add code to walk the .daprc triples and to replace protocol=
fragment tag with a more general mode= tag.
Final Note:
Th inference code is still way too complicated. We need to move
to the validfile() model used by netcdf Java, where each
dispatcher is asked if it can process the file. This decentralizes
the inference code. This will be done after all the major new
dispatchers (PIO, Zarr, etc) have been implemented.
Primary fixes to get -ansi to work.
1. Convert all '//' C++ style comments to /*...*/ or to use #if 0...#endif
2. It turns out that when -ansi is specified, then a number of
functions no longer are defined in the header -- but they are still
in the .so file.<br>
The big example is strdup(). So, added code to include/ncconfig.h to define
externs for those missing functions that occur in more than one place.
These are enabled if !_WIN32 && __STDC__ == 1 (__STDC__ is supposed to
be the equivalent compile time flag to -ansi). Note that this requires
config.h (which references ncconfig.h) to be included in files where it is
currently not included. Single uses will be only in the file that uses them.
3. Added mmap test for the MAP_ANONYMOUS flag to configure.ac. Apparently
this is not always defined with -ansi.
4. fix some large integer constants in nc_test4/tst_atts3.c and nc_test4/tst_filterparser.c
to avoid compiler complaints.
5. fix a double constant in nc_test4/tst_filterparser.c to avoid compiler complaints.
[Note I suspect #4 and #5 will be a problem on big-endian machines, but we have no way to test]
Misc. Changes:
1. convert more instances of _MSC_VER to _WIN32.
2. added some debugging code to include/nctestserver.h
3. added comment about libdispatch/drc.c always being compiled.
4. modify parser generation in ncgen to remove unneeded files.
re: pull request https://github.com/Unidata/netcdf-c/pull/1219
This pr should go in after or at the same time as 1219.
It updates nc_test/tst_inmemory to add a test to cover
the case fixed in 1219. It also disables a couple of tests
that no longer are relevant.
https://github.com/Unidata/netcdf-c/issues/1168https://github.com/Unidata/netcdf-c/issues/1163https://github.com/Unidata/netcdf-c/issues/1162
This PR partially fixes memory leaks in the netcdf-c library,
in the ncdump utility, and in some test cases.
The netcdf-c library now runs memory clean with the assumption
that the --disable-utilities option is used. The primary remaining
problem is ncgen. Once that is fixed, I believe the netcdf-c library
will run memory clean with no limitations.
Notes
-----------
1. Memory checking was performed using gcc -fsanitize=address.
Valgrind-based testing has yet to be performed.
2. The pnetcdf, hdf4, and examples code has not been tested.
Misc. Non-leak changes
1. Make tst_diskless2 only run when netcdf4 is enabled (issue 1162)
2. Fix CmakeLists.txt to turn off logging if ENABLE_NETCDF_4 is OFF
3. Isolated all my debug scripts into a single top-level directory
called debug
4. Fix some USE_NETCDF4 dependencies in nc_test and nc_test4 Makefile.am
re: https://github.com/Unidata/netcdf-c/issues/1154
Inadvertently, the behavior of NC_DISKLESS with nc_create() was
changed in release 4.6.1. Previously, the NC_WRITE flag needed
to be explicitly used with NC_DISKLESS in order to cause the
created file to be persisted to disk.
Additional analyis indicated that the current NC_DISKLESS
implementation was seriously flawed.
This PR attempts to clean up and regularize the situation with
respect to NC_DISKLESS control. One important aspect of diskless
operation is that there are two different notions of write.
1. The file is read-write vs read-only when using the netcdf API.
2. The file is persisted or not to disk at nc_close().
Previously, these two were conflated. The rules now are
as follows.
1. NC_DISKLESS + NC_WRITE means that the file is read/write using the netcdf API
2. NC_DISKLESS + NC_PERSIST means that the file is persisted to a disk file at nc_close.
3. NC_DISKLESS + NC_PERSIST + NC_WRITE means both 1 and 2.
The NC_PERSIST flag is new and takes over the obsolete NC_MPIPOSIX flag.
NC_MPIPOSIX is still defined, but is now an alias for the NC_MPIIO flag.
It is also now the case that for netcdf-4, NC_DISKLESS is independent
of NC_INMEMORY and in fact it is an error to specify both flags
simultaneously.
Finally, the MMAP code was fixed to use NC_PERSIST as well.
Also marked MMAP as deprecated.
Also added a test case to test various combinations of NC_DISKLESS,
NC_PERSIST, and NC_WRITE.
This PR affects a number of files and especially test cases
that used NC_DISKLESS.
Misc. Unrelated fixes
1. fixed some warnings in ncdump/dumplib.c
re: issue https://github.com/Unidata/netcdf-c/issues/1151
Modify DAP2 and DAP4 code to handle case when _FillValue type is not
same as the parent variable type.
Specifically:
1. Define a parameter [fillmismatch] to allow this mismatch;
default is to disallow.
2. If allowed, forcibly change the type of the _FillValue to match
the parent variable.
3. If allowed Convert the values to match new type
4. Generate a log message
5. if not allowed, then fail
Implementing this required some changes to ncdap_test/dapcvt.c
Also added test cases.
Minor Unrelated Changes:
1. There were a number of warnings about e.g.
assigning a const char* to a char*. Fix these
2. In nccopy.1, replace .NP with .IP "n"
(re PR https://github.com/Unidata/netcdf-c/pull/1144)
3. fix minor error in ncdump/ocprint
re: github issue https://github.com/Unidata/netcdf-c/issues/1111
One of the less common use cases for the in-memory feature is
apparently failing with HDF5-1.10.x. The fix is complicated and
requires significant changes to libhdf5/nc4memcb.c. The current
setup is detailed in the file docs/inmeminternal.dox.
Additionally, it was discovered that the program
nc_test/tst_inmemory.c, which is invoked by
nc_test/run_inmemory.sh, actually was failing because of the
above problem. But the failure is not detected since the script
does not return non-zero value.
Other Changes:
1. Fix nc_test_tst_inmemory to return errors correctly.
2. Make ncdap_tests/findtestserver.c and dap4_tests/findtestserver4.c
be generated from ncdap_test/findtestserver.c.in.
3. Make LOG() print output to stderr instead of stdout to
avoid contaminating e.g. ncdump output.
4. Modify the handling of NC_INMEMORY and NC_DISKLESS flags
to properly handle that NC_DISKLESS => NC_INMEMORY. This
affects a number of code pieces, especially memio.c.
The fix includes the following changes.
1. Checking and using the default file format at file create time is now
done only when the create mode (argument cmode) does not include any
format related flags, i.e. NC_64BIT_OFFSET, NC_64BIT_DATA,
NC_CLASSIC_MODEL, and NC_NETCDF4.
2. Adjustment of cmode based on the default format is now done in
NC_create() only. The idea is to adjust cmode before entering the
dispatcher's file create subroutine.
3. Any adjustment of cmode is removed from all I/O dispatchers, i.e.
NC4_create(), NC3_create(), and NCP_create().
4. Checking for illegal cmode has been done in check_create_mode() called
in NC_create(). This commit removes the redundant checking from
NCP_create().
5. Remove PnetCDF tests in nc_test/tst_names.c, so it can focus on testing
all classic formats and netCDF4 formats.
Two new test programs are added. They can be used to test netCDF with and
without this commit.
1. nc_test/tst_default_format.c
2. nc_test/tst_default_format_pnetcdf.c (use when PnetCDF is enabled).
Fix https://github.com/Unidata/netcdf-c/issues/962
1. remove the --disable-diskless option since it is no
longer needed. Similarly for CMakeLists.txt.
2. Fixed nc4files.c where BAIL and return were mixed
leading to situation where cleanup code was not
being invoked. This probably occurs elsewhere,
but I did not find any specifically.
Fix github issue https://github.com/Unidata/netcdf-c/issues/899
which came from e-support UOY-859712.
The problem was that the vlen_max parameter
to libsrc/var.c#NC_check_vlen was of type size_t.
However, it is being called, sometimes, with values
of size X_INT64_MAX. The resulting truncation was
causing dimension failures as noted in the e-support
report.
Fix is to change the vlen_max argument (and some
local variables in NC_check_vlen) to be of declared
as unsigned long long.
The file docs/indexing.dox tries to provide design
information for the refactoring.
The primary change is to replace all walking of linked
lists with the use of the NCindex data structure.
Ncindex is a combination of a hash table (for name-based
lookup) and a vector (for walking the elements in the index).
Additionally, global vectors are added to NC_HDF5_FILE_INFO_T
to support direct mapping of an e.g. dimid to the NC_DIM_INFO_T
object. These global vectors exist for dimensions, types, and groups
because they have globally unique id numbers.
WARNING:
1. since libsrc4 and libsrchdf4 share code, there are also
changes in libsrchdf4.
2. Any outstanding pull requests that change libsrc4 or libhdf4
are likely to cause conflicts with this code.
3. The original reason for doing this was for performance improvements,
but as noted elsewhere, this may not be significant because
the meta-data read performance apparently is being dominated
by the hdf5 library because we do bulk meta-data reading rather
than lazy reading.