re: https://github.com/Unidata/thredds/issues/1224
[note that this is an issue in thredds, but the fix is in netcdf-c]
A thredds server can encode a netcdf-4 file into DAP2
by flattening names to include the containing group path,
where the group names are separated by '/'.
But the '/' is prohibited in netcdf names even if escaped
(a decision before my time).
So, if the netcdf-c/libdap2 code encounters a DAP2 name with '/'
characters, the '/' characters are converted to the string
%2f. Unfortunately, there is a glitch, namely that converting
the leading '/' produces a name that is still illegal. This PR
modifies the code to just drop the leading '/' character.
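
The name handling described above can be sketched roughly as follows. This is a minimal illustration, not the actual libdap2 code: the function name flatten_dap2_name is hypothetical, and it assumes that only the single leading '/' is dropped and every other '/' becomes %2f.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the actual libdap2 function): return a newly
 * allocated copy of a DAP2 name in which the leading '/' is dropped and
 * every remaining '/' is replaced by the string "%2f".
 * The caller frees the result. Returns NULL on allocation failure. */
static char *flatten_dap2_name(const char *name)
{
    assert(name != NULL);
    if (*name == '/')
        name++; /* drop the leading '/' instead of escaping it */

    /* Worst case: every character is '/' and expands to three characters. */
    char *out = malloc(3 * strlen(name) + 1);
    if (out == NULL)
        return NULL;

    char *q = out;
    for (const char *p = name; *p != '\0'; p++) {
        if (*p == '/') {
            memcpy(q, "%2f", 3);
            q += 3;
        } else {
            *q++ = *p;
        }
    }
    *q = '\0';
    return out;
}
```

With this sketch, a name such as /g1/g2/var would come out as g1%2fg2%2fvar.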
re: issue https://github.com/Unidata/netcdf-c/issues/1278
re: issue https://github.com/Unidata/netcdf-c/issues/876
re: issue https://github.com/Unidata/netcdf-c/issues/806
* Major change to the handling of 8-byte parameters for nc_def_var_filter.
The old code was not well thought out.
* The new algorithm is documented in docs/filters.md.
* Added new utility file plugins/H5Zutil.c to support
the new algorithm.
* Modified plugins/H5Zmisc.c to use new algorithm
* Renamed include/ncfilter.h to include/netcdf_filter.h
and made it an installed header so clients can access the
new algorithm utility.
* Fixed nc_test4/tst_filterparser.c and nc_test4/test_filter_misc.c
to use the new algorithm
* libdap4/ fixes:
* d4swap.c had an error in the endian pre-processing such
that record counts were not being swapped correctly.
* d4data.c had an error in that checksums were being computed
after endian swapping rather than before.
* ocinitialize() was never being called, so xxdr bigendian handling
was never set correctly.
* Required adding debug statements to occompile
* Found and fixed memory leak in ncdump.c
Not tested:
* HDF4
* Pnetcdf
* parallel HDF5
Primary fixes to get -ansi to work.
1. Convert all '//' C++ style comments to /*...*/ or to use #if 0...#endif
2. It turns out that when -ansi is specified, then a number of
functions no longer are defined in the header -- but they are still
in the .so file.
The big example is strdup(). So, added code to include/ncconfig.h to define
externs for those missing functions that occur in more than one place.
These are enabled if !_WIN32 && __STDC__ == 1 (__STDC__ is supposed to
be the equivalent compile time flag to -ansi). Note that this requires
config.h (which references ncconfig.h) to be included in files where it is
currently not included. Single uses will be only in the file that uses them.
3. Added mmap test for the MAP_ANONYMOUS flag to configure.ac. Apparently
this is not always defined with -ansi.
4. fix some large integer constants in nc_test4/tst_atts3.c and nc_test4/tst_filterparser.c
to avoid compiler complaints.
5. fix a double constant in nc_test4/tst_filterparser.c to avoid compiler complaints.
[Note I suspect #4 and #5 will be a problem on big-endian machines, but we have no way to test]
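
Item 2 of the -ansi fixes (supplying externs for functions the headers hide) can be illustrated with a small sketch. The guard condition is the one quoted above; the caller copy_name is hypothetical, and the real declarations live in include/ncconfig.h rather than in a source file like this.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of the guard described above. Under -ansi, <string.h> may not
 * declare strdup() even though the C library still provides it, so the
 * declaration is supplied by hand. */
#if !defined(_WIN32) && __STDC__ == 1
extern char *strdup(const char *);
#endif

/* Hypothetical caller that relies on the declaration above. */
static char *copy_name(const char *name)
{
    assert(name != NULL);
    return strdup(name); /* caller frees the copy */
}
```

Because the extern matches the library's own prototype, the redeclaration should be harmless when the header does declare strdup().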
Misc. Changes:
1. convert more instances of _MSC_VER to _WIN32.
2. added some debugging code to include/nctestserver.h
3. added comment about libdispatch/drc.c always being compiled.
4. modify parser generation in ncgen to remove unneeded files.
This is a follow up to PR https://github.com/Unidata/netcdf-c/pull/1173
Sorry that it is so big, but leak suppression can be complex.
This PR fixes all remaining memory leaks -- as determined by
-fsanitize=address, and with the exceptions noted below.
Unfortunately, there remains a significant leak that I cannot
solve. It involves vlens, and it is unclear if the leak is
occurring in the netcdf-c library or the HDF5 library.
I have added a check_PROGRAM to the ncdump directory to show the
problem. The program is called tst_vlen_demo.c. To exercise it,
build the netcdf library with -fsanitize=address enabled. Then
go into ncdump and do a "make clean check". This should build
tst_vlen_demo without actually executing it. Then do the
command "./tst_vlen_demo" to see the output of the memory
checker. Note that the lost malloc is deep in the HDF5 library
(in H5Tvlen.c).
I am temporarily working around this error in the following way.
1. I modified several test scripts to not execute known vlen tests
that fail as described above.
2. Added an environment variable called NC_VLEN_NOTEST.
If set, then those specific tests are suppressed.
This should mean that the --disable-utilities option to
./configure should not need to be set to get a memory leak clean
build. This should allow for detection of any new leaks.
Note: I used an environment variable rather than a ./configure
option to control the vlen tests. This is because it is
temporary (I hope) and because it is a bit tricky for shell
scripts to access ./configure options.
Finally, as before, this has only been tested with netcdf-4 and hdf5 support.
https://github.com/Unidata/netcdf-c/issues/1168
https://github.com/Unidata/netcdf-c/issues/1163
https://github.com/Unidata/netcdf-c/issues/1162
This PR partially fixes memory leaks in the netcdf-c library,
in the ncdump utility, and in some test cases.
The netcdf-c library now runs memory clean with the assumption
that the --disable-utilities option is used. The primary remaining
problem is ncgen. Once that is fixed, I believe the netcdf-c library
will run memory clean with no limitations.
Notes
-----------
1. Memory checking was performed using gcc -fsanitize=address.
Valgrind-based testing has yet to be performed.
2. The pnetcdf, hdf4, and examples code has not been tested.
Misc. Non-leak changes
1. Make tst_diskless2 only run when netcdf4 is enabled (issue 1162)
2. Fix CMakeLists.txt to turn off logging if ENABLE_NETCDF_4 is OFF
3. Isolated all my debug scripts into a single top-level directory
called debug
4. Fix some USE_NETCDF4 dependencies in nc_test and nc_test4 Makefile.am
re: issue https://github.com/Unidata/netcdf-c/issues/1151
Modify DAP2 and DAP4 code to handle the case when the _FillValue type is
not the same as the parent variable type.
Specifically:
1. Define a parameter [fillmismatch] to allow this mismatch;
default is to disallow.
2. If allowed, forcibly change the type of the _FillValue to match
the parent variable.
3. If allowed, convert the values to match the new type.
4. Generate a log message.
5. If not allowed, then fail.
Implementing this required some changes to ncdap_test/dapcvt.c
Also added test cases.
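
The five steps above can be sketched as follows. This is only an illustration under simplifying assumptions: the types basetype and fillvalue and the function reconcile_fill are hypothetical, the fill value is held as a double, and the value is assumed to be in range for the target type. The actual conversion code is in ncdap_test/dapcvt.c and the DAP libraries.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, reduced set of types for illustration only. */
typedef enum { T_SHORT, T_INT, T_DOUBLE } basetype;

typedef struct {
    basetype type;
    double value; /* the fill value, held as a double for simplicity */
} fillvalue;

/* Reconcile a _FillValue with the type of its parent variable.
 * fillmismatch models the [fillmismatch] parameter (0 = disallow).
 * Returns 0 on success, -1 if the types differ and mismatch is disallowed. */
static int reconcile_fill(fillvalue *fill, basetype vartype, int fillmismatch)
{
    assert(fill != NULL);
    if (fill->type == vartype)
        return 0;  /* types already agree */
    if (!fillmismatch)
        return -1; /* step 5: not allowed, so fail */

    /* Steps 2 and 3: force the type and convert the value.
     * Assumes the value is representable in the target type. */
    fill->type = vartype;
    switch (vartype) {
    case T_SHORT:  fill->value = (double)(short)fill->value; break;
    case T_INT:    fill->value = (double)(int)fill->value;   break;
    case T_DOUBLE: break;
    }
    /* Step 4: log the event. */
    fprintf(stderr, "_FillValue type mismatch: converted to variable type\n");
    return 0;
}
```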
Minor Unrelated Changes:
1. There were a number of warnings about e.g.
assigning a const char* to a char*. Fix these
2. In nccopy.1, replace .NP with .IP "n"
(re PR https://github.com/Unidata/netcdf-c/pull/1144)
3. fix minor error in ncdump/ocprint
re: github issue https://github.com/Unidata/netcdf-c/issues/1111
One of the less common use cases for the in-memory feature is
apparently failing with HDF5-1.10.x. The fix is complicated and
requires significant changes to libhdf5/nc4memcb.c. The current
setup is detailed in the file docs/inmeminternal.dox.
Additionally, it was discovered that the program
nc_test/tst_inmemory.c, which is invoked by
nc_test/run_inmemory.sh, actually was failing because of the
above problem. But the failure is not detected since the script
does not return a non-zero value.
Other Changes:
1. Fix nc_test/tst_inmemory to return errors correctly.
2. Make ncdap_test/findtestserver.c and dap4_test/findtestserver4.c
be generated from ncdap_test/findtestserver.c.in.
3. Make LOG() print output to stderr instead of stdout to
avoid contaminating e.g. ncdump output.
4. Modify the handling of NC_INMEMORY and NC_DISKLESS flags
to properly handle that NC_DISKLESS => NC_INMEMORY. This
affects a number of code pieces, especially memio.c.
Add the ability to set some additional curlopt values via .daprc (aka .dodsrc).
This affects both DAP2 and DAP4 protocols.
Related issues:
[1] re: esupport: KOZ-821332
[2] re: github issue https://github.com/Unidata/netcdf4-python/issues/836
[3] re: github issue https://github.com/Unidata/netcdf-c/issues/1074
1. CURLOPT_BUFFERSIZE: Relevant to [1]. Allow user to set the read/write
buffersizes used by curl.
This is done by adding the following to .daprc (aka .dodsrc):
HTTP.READ.BUFFERSIZE=n
where n is the buffersize in bytes. There is a built-in (to curl)
limit of 512k for this value.
2. CURLOPT_TCP_KEEPALIVE (and CURLOPT_TCP_KEEPIDLE and CURLOPT_TCP_KEEPINTVL):
Relevant (maybe) to [2] and [3]. Allow the user to turn on KEEPALIVE
This is done by adding the following to .daprc (aka .dodsrc):
HTTP.KEEPALIVE=on|n/m
If the value is "on", then simply enable default KEEPALIVE. If the value
is n/m, then enable KEEPALIVE and set KEEPIDLE to n and KEEPINTVL to m.
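
Parsing the HTTP.KEEPALIVE value could look roughly like this. It is a sketch, not the code in the DAP libraries: the struct keepalive_opts and the function parse_keepalive are hypothetical, and rejecting negative numbers or trailing characters is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the parsed setting; not a libcurl or netcdf-c type. */
typedef struct {
    int enabled;   /* 1 if keepalive should be turned on */
    int has_times; /* 1 if n and m were given explicitly */
    long idle;     /* n: intended for CURLOPT_TCP_KEEPIDLE */
    long interval; /* m: intended for CURLOPT_TCP_KEEPINTVL */
} keepalive_opts;

/* Parse the value of HTTP.KEEPALIVE, which is either "on" or "n/m".
 * Returns 0 on success, -1 if the value is malformed. */
static int parse_keepalive(const char *value, keepalive_opts *out)
{
    long n, m;
    char trailing;

    assert(value != NULL && out != NULL);
    memset(out, 0, sizeof(*out));
    if (strcmp(value, "on") == 0) {
        out->enabled = 1; /* default keepalive, no explicit times */
        return 0;
    }
    /* Exactly two numbers separated by '/', with nothing after them. */
    if (sscanf(value, "%ld/%ld%c", &n, &m, &trailing) != 2 || n < 0 || m < 0)
        return -1;
    out->enabled = 1;
    out->has_times = 1;
    out->idle = n;
    out->interval = m;
    return 0;
}
```

The parsed values would then be handed to curl_easy_setopt() for CURLOPT_TCP_KEEPALIVE, CURLOPT_TCP_KEEPIDLE and CURLOPT_TCP_KEEPINTVL.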
After a long discussion, I implemented the rules at the end of that issue.
They are documented in nccopy.1.
Additionally, I added a new, per-variable, -c flag that allows
for the direct setting of the chunking parameters for a variable.
The form is
-c var:c1,c2,...ck
where var is the name of the variable (possibly a fully qualified name)
and the ci are the chunksizes for that variable. It must be the case
that the rank of the variable is k. If the new form is used as well
as the old form, then the new form overrides the old form for the
specified variable. Note that multiple occurrences of the new form
-c flag may be specified.
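
As an illustration of the new form, the argument var:c1,c2,...ck could be split along these lines. This is a sketch only, not the nccopy implementation: chunkspec, MAX_RANK and parse_chunkspec are hypothetical names, and rejecting a chunksize of zero is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_RANK 8 /* hypothetical limit for this sketch */

/* Hypothetical result type for one "-c var:c1,c2,...ck" argument. */
typedef struct {
    char var[64];            /* variable name, possibly fully qualified */
    size_t rank;             /* k: number of chunksizes given */
    size_t chunks[MAX_RANK]; /* c1..ck */
} chunkspec;

/* Parse "var:c1,c2,...ck". Returns 0 on success, -1 if malformed. */
static int parse_chunkspec(const char *arg, chunkspec *out)
{
    assert(arg != NULL && out != NULL);

    /* The chunk list contains no ':', so the last ':' ends the name. */
    const char *colon = strrchr(arg, ':');
    if (colon == NULL || colon == arg)
        return -1;
    size_t namelen = (size_t)(colon - arg);
    if (namelen >= sizeof(out->var))
        return -1;
    memcpy(out->var, arg, namelen);
    out->var[namelen] = '\0';

    out->rank = 0;
    const char *p = colon + 1;
    for (;;) {
        char *end;
        if (*p < '0' || *p > '9')
            return -1; /* rejects signs, blanks and empty entries */
        unsigned long v = strtoul(p, &end, 10);
        if (v == 0 || out->rank == MAX_RANK)
            return -1;
        out->chunks[out->rank++] = (size_t)v;
        if (*end == '\0')
            return 0;
        if (*end != ',')
            return -1;
        p = end + 1;
    }
}
```

A caller would still have to check that the parsed rank equals the rank of the named variable, as required above.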
Misc. Other fixes
1. Added -M <size> option to nccopy to specify the minimum
allowable chunksize.
2. Removed the unused variables from bigmeta.c
(Issue https://github.com/Unidata/netcdf-c/issues/1079)
3. Fixed failure of nc_test4/tst_filter.sh by using the new -M
flag (#1) to allow filter test on a small chunk size.
Fix https://github.com/Unidata/netcdf-c/issues/962
1. remove the --disable-diskless option since it is no
longer needed. Similarly for CMakeLists.txt.
2. Fixed nc4files.c where BAIL and return were mixed
leading to a situation where cleanup code was not
being invoked. This probably occurs elsewhere,
but I did not find any specifically.
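
The BAIL/return problem in item 2 can be shown with a small sketch. The BAIL macro below is a simplified, hypothetical stand-in for the one used in the library, and open_sketch is not a real netcdf-c function.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the library's BAIL macro: record the error
 * code and jump to the shared cleanup label. */
#define BAIL(e) do { retval = (e); goto done; } while (0)

/* Hypothetical function; *cleaned is incremented each time cleanup runs. */
static int open_sketch(int fail_midway, int *cleaned)
{
    int retval = 0;
    char *scratch = NULL;

    assert(cleaned != NULL);
    scratch = malloc(16);
    if (scratch == NULL)
        BAIL(-1);
    if (fail_midway)
        BAIL(-2); /* a bare "return -2;" here would skip the cleanup */

done:
    free(scratch); /* cleanup runs on success and on every BAIL */
    (*cleaned)++;
    return retval;
}
```

Mixing a plain return into such a function bypasses the done label, which is the kind of skipped cleanup described above.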
and https://github.com/Unidata/netcdf-c/issues/708
Expand the NC_INMEMORY capabilities to support writing and accessing
the final modified memory.
Three new functions have been added:
nc_open_memio, nc_create_mem, and nc_close_memio.
The following new capabilities were added.
1. nc_open_memio() allows the NC_WRITE mode flag
so a chunk of memory can be passed in and be modified
2. nc_create_mem() allows the NC_INMEMORY flag to be set
to cause the created file to be kept in memory.
3. nc_close_memio() allows the final in-memory contents to be
retrieved at the time the file is closed.
4. A special flag, NC_MEMIO_LOCK, is provided to ensure that
the provided memory will not be freed or reallocated.
Note the following.
1. If nc_open_memio() is called with NC_WRITE, and NC_MEMIO_LOCK is not set,
then the netcdf-c library will take control of the incoming memory.
This means that the original memory block should not be freed
but the block returned by nc_close_memio() must be freed.
2. If nc_open_memio() is called with NC_WRITE, and NC_MEMIO_LOCK is set,
then modifications to the original memory may fail if the space available
is insufficient.
Documentation is provided in the file docs/inmemory.md.
A test case is provided: nc_test/tst_inmemory.c driven by
nc_test/run_inmemory.sh
WARNING: changes were made to the dispatch table for
the close entry. From int (*close)(int) to int (*close)(int,void*).
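
To make the dispatch-table change concrete, here is a minimal sketch of a close entry with the new signature. Everything in it is hypothetical (dispatch_sketch, close_result, inmemory_close); in particular, using the void* argument to hand back the final memory is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result record that a close implementation may fill in. */
typedef struct {
    size_t size;
    void *memory;
} close_result;

/* Hypothetical one-entry dispatch table with the new close signature. */
typedef struct {
    int (*close)(int ncid, void *params);
} dispatch_sketch;

static char final_image[4] = "CDF"; /* stand-in for the final file contents */

/* Close implementation: params may be NULL when the caller wants nothing back. */
static int inmemory_close(int ncid, void *params)
{
    assert(ncid >= 0);
    if (params != NULL) {
        close_result *r = params;
        r->size = sizeof(final_image);
        r->memory = final_image;
    }
    return 0;
}
```

Existing dispatch implementations that do not need the extra argument can simply ignore it.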
(I hope) metadata mechanism. This mostly just adds new pieces of
code (e.g. nclistmap) and does some minor fixes.
It should be transparent to everything else.
The next set of changes will be the big step.
2. Fixed plugin building (nc_test4/hdf5plugins)
to be done properly by cmake and automake.
4. Duplicated part of the nc_test4 filter test code
in examples/C
An incomplete and untested set of hooks exist
for OS-X in nc_test4/findplugins.in. They need testing.
strlcat provides better protection against buffer overflows.
Code is taken from the FreeBSD project source code. Specifically:
https://github.com/freebsd/freebsd/blob/master/lib/libc/string/strlcat.c
License appears to be acceptable, but needs to be checked by e.g. Debian.
Step 1:
1. Add to netcdf-c/include/ncconfigure.h to use our version
if not already available as determined by HAVE_STRLCAT in config.h.
2. Add the strlcat code to libdispatch/dstring.c
3. Turns out that strlcat was already defined in several places.
So remove it from:
ncgen3/genlib.c
ncdump/dumplib.c
4. Define strlcat extern declaration in ncconfigure.h.
5. Modify the following directories to use strlcat:
libdap2 libdap4 ncdap_test dap4_test
Will do others in subsequent steps.
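
For readers unfamiliar with strlcat, the following sketch shows its semantics. It is written from the documented behavior and is not the FreeBSD code referenced above; the name sketch_strlcat is hypothetical and is used to avoid clashing with a system strlcat.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append src to dst without writing more than size bytes in total
 * (including the terminating NUL). Returns the length the result would
 * have had with unlimited space; a return value >= size means the
 * output was truncated. */
static size_t sketch_strlcat(char *dst, const char *src, size_t size)
{
    assert(dst != NULL && src != NULL);

    size_t dlen = 0;
    while (dlen < size && dst[dlen] != '\0')
        dlen++;
    size_t slen = strlen(src);
    if (dlen == size)
        return size + slen; /* dst is not terminated within size */

    size_t room = size - dlen - 1; /* bytes available before the NUL */
    size_t n = slen < room ? slen : room;
    memcpy(dst + dlen, src, n);
    dst[dlen + n] = '\0';
    return dlen + slen;
}
```

Unlike strcat, the caller can detect truncation by comparing the return value with the buffer size.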
The nc_test/util.c error is a typo. The libdap4/d4meta.c error only is shown
when using a 64 bit machine because then |size_t| == 64 bits
and |int| == 32 bits.
2. Factored out the parameter string parsing for ncgen and nccopy
into libdispatch/dfilter.c + include/ncfilter.h
3. Allow a parameter string to use constant types other than
unsigned int. See docs/filters.md for details.
4. Moved the old content of include/netcdf_filter.h into include/netcdf.h
and removed include/netcdf_filter.h as no longer needed.
5. Force the test filter (bzip2) in nc_test4/filter_test to
be built using BUILT_SOURCES.
The use of the following version-specific curl flags
is not always properly wrapped or aliased using
config.h HAVE_CURL... ifdefs.
# CURLOPT_USERNAME is not defined until curl version 7.19.1
# CURLOPT_PASSWORD is not defined until curl version 7.19.1
# CURLOPT_KEYPASSWD is not defined until curl version 7.16.4
-- aliased as needed to CURLOPT_SSLKEYPASSWD
# CURLINFO_RESPONSE_CODE is not defined until curl version 7.10.7
-- aliased as needed to CURLINFO_HTTP_CODE
# CURLOPT_CHUNK_BGN_FUNCTION is not defined until curl version 7.21.0
-- not used in our code
v4.5-release-candidate branch and master branch ASAP.
The bug occurs in d4rc.c where strcmp is being applied to NULL.
Also, the code in which it occurs is debugging code, so it needs
to be #ifdef'd. This fix may cause minor conflicts with other
outstanding pull requests that fix the same bug. But the
conflicts should be minor and easy to resolve.
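
The kind of guard needed to avoid applying strcmp to NULL can be sketched as below. The helper value_equals is hypothetical and is not the actual d4rc.c fix, which also wraps the debugging code in an #ifdef.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does a looked-up value (which may be NULL when
 * the key is absent) equal the expected string? */
static int value_equals(const char *value, const char *expected)
{
    assert(expected != NULL); /* supplied by the programmer, never NULL */
    if (value == NULL)
        return 0;             /* absent value never matches; strcmp is skipped */
    return strcmp(value, expected) == 0;
}
```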
Primary change is to cleanup code and remove duplicated code.
1. Unify the rc file reading into libdispatch/drc.c. Eventually extend
if we need rc file for netcdf itself as opposed to the dap code.
2. Unify the extraction from the rc file of DAP authorization info.
3. Misc. other small unifications: make temp file, read file.
4. Avoid use of libcurl when reading file:// because
there is some kind of problem with the Visual Studio version.
Might be related to the winpath problem.
In any case, do direct read instead.
5. Add new error code NC_ERCFILE for errors in reading RC file.
6. Complete documentation cleanup as indicated in this comment
https://github.com/Unidata/netcdf-c/pull/472#issuecomment-325926426
7. Convert some occurrences of #ifdef _WIN32 to #ifdef _MSC_VER
generates garbage. This in turn interferes with using .netrc
because the garbage user+pwd can override the
.netrc. Note that this may work ok sometimes
if the garbage happens to start with a nul character.
2. It turns out that the user:pwd combination needs to support
character escaping. One reason is the user may contain an '@' character.
The other is that modern password rules make it not unlikely that
the password will contain characters that interfere with url parsing.
So, the rule I have implemented is that all occurrences of the user:pwd
format must escape any dodgy characters. The escape format is URL escaping
of the form %XX. This applies both to user:pwd
embedded in a URL as well as the use of HTTP.CREDENTIALS.USERPASSWORD
in a .dodsrc/.daprc file. The user and password in .netrc must not
be escaped. This is now documented in docs/auth.md
The fix for #2 actually obviated #1. Now, internally, the user and pwd
are stored separately and not in the user:pwd format. They are combined
(and escaped) only when needed.
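
Combining a separately stored user and password into an escaped user:pwd string might look like the following. This is a sketch under assumptions: the function names are hypothetical, and the set of characters left unescaped (ASCII letters, digits and "-._~") is chosen for illustration; docs/auth.md is the authority on what must be escaped.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Append the %XX-escaped form of s to the string already in out.
 * outsize is the full capacity of out, including the NUL.
 * Returns 0 on success, -1 if out is too small. */
static int escape_into(char *out, size_t outsize, const char *s)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t used = strlen(out);

    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (isalnum(c) || strchr("-._~", c) != NULL) {
            if (used + 1 >= outsize)
                return -1;
            out[used++] = (char)c;
        } else {
            if (used + 3 >= outsize)
                return -1;
            out[used++] = '%';
            out[used++] = hex[c >> 4];
            out[used++] = hex[c & 0x0F];
        }
    }
    out[used] = '\0';
    return 0;
}

/* Build "user:pwd" with both parts escaped. Returns 0 or -1. */
static int make_userpwd(char *out, size_t outsize,
                        const char *user, const char *pwd)
{
    assert(out != NULL && outsize > 0 && user != NULL && pwd != NULL);
    out[0] = '\0';
    if (escape_into(out, outsize, user) != 0)
        return -1;
    if (strlen(out) + 1 >= outsize)
        return -1;
    strcat(out, ":"); /* the separator itself is not escaped */
    return escape_into(out, outsize, pwd);
}
```

Because '@' and ':' inside the user or password are escaped, the only literal ':' in the result is the separator.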
where a null user+pwd generates
garbage. This in turn interferes
with using .netrc because the garbage
user+pwd can (sometimes) override
the .netrc.
Not entirely sure what is going on
because it works as is under e.g. cygwin.
In any case it needs fixing.
dap code will create a real temporary file in which to store the
converted metadata for the DAP .dds or .dmr.
It was assumed that the nc_close code would reclaim the
temporary file. For DAP2, reclamation occurs in the ncio
code. For DAP4, it was assumed that the libsrc4 code would do
the reclamation, but for whatever reason, this is not happening.
Thus, in this situation, a temporary file is left in the file
system. Aside from being irritating to users, this screws up
'make distcheck'.
So the DAP4 code is fixed to ensure that the temporary file is
properly reclaimed independent of the libsrc4 code.
Some temporary files are being left in a tempdir (e.g. /tmp
under *nix*).
The situation is described tersely in
netcdf-c/docs/auth.html#REDIR. Basically, when a url is used that
requires redirection, a physical cookiejar file is required
to exist in the file system in order for this to work.
Since it was difficult to figure out when redirection was
being used (it was internal to libcurl) I needed to be prepared for that
eventuality. The result was that I always created a cookiejar file if one
was not specified in the rc file. This actually occurs in two places:
one inside oc2 and one inside libdap4.
The solution was two-fold:
1. do not use a cookiejar directory -- create cookiejar file directly
2. ensure that all cookiejar related files are reclaimed by nc_close().
Note that if nc_close (or nc_abort) is not called for whatever reason,
then reclamation will not occur.
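
The create-then-reclaim pattern for the cookiejar file can be sketched with POSIX calls. The struct cookiejar, both function names and the /tmp/nccookiesXXXXXX template are hypothetical; the real code lives in oc2 and libdap4 and is driven by nc_close().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical record of a cookiejar file created on the user's behalf. */
typedef struct {
    char path[64];
    int created; /* 1 if we created the file and must remove it */
} cookiejar;

/* Create the cookiejar file directly (no cookiejar directory). */
static int cookiejar_create(cookiejar *jar)
{
    assert(jar != NULL);
    strcpy(jar->path, "/tmp/nccookiesXXXXXX"); /* hypothetical template */
    int fd = mkstemp(jar->path);
    if (fd < 0) {
        jar->created = 0;
        return -1;
    }
    close(fd);
    jar->created = 1;
    return 0;
}

/* Called from the close path: remove the file if we created it. */
static void cookiejar_reclaim(cookiejar *jar)
{
    assert(jar != NULL);
    if (jar->created) {
        unlink(jar->path);
        jar->created = 0;
    }
}
```

As noted above, if the close path never runs, nothing calls the reclaim step and the file stays behind.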
This relies on the HDF5 capability to
dynamically load compression filters.
Note that a compression filter is just
a subcase of filters.
The primary user-visible changes are as follows:
1. Add a standard header "netcdf_filter.h" that defines
the necessary API extensions
2. Modify ncgen to support two new special attributes
"_Filter_ID" and "_Filter_Parameters" so that compression
can be turned on when creating a file using ncgen.
4. Add a detailed description of filtering support
to the user's guide; see the file filters.md
5. Add a test case directory for this: nc_test4/filter_test.
It is fragile, so a ./configure flag (-enable-filter-test)
is defined (default disabled) to keep this test turned off
and avoid spurious 'make check' failures.
Note that the HDF5 documentation is not up-to-date, so
much of what is encoded here comes from examining the
actual code in the file H5PL.c in the HDF5 source code.
1. When running under windows (as opposed to cygwin)
we need to make sure to not use /cygdrive/ file paths.
This was occurring in libdap4/d4read.c, but may occur
elsewhere.
2. Shell scripts in the git repo are not being checked-out
with the executable mode set. Had core.filemode set to false.
Was a major hassle to fix.
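
For item 1, one way to avoid /cygdrive/ paths under native Windows is to rewrite them into drive-letter form before use. The sketch below assumes that approach; the function cygdrive_to_windows is hypothetical and is not the code in libdap4/d4read.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Rewrite "/cygdrive/<letter>/rest" as "<letter>:/rest"; copy any other
 * path unchanged. Returns 0 on success, -1 if out is too small. */
static int cygdrive_to_windows(const char *path, char *out, size_t outsize)
{
    static const char prefix[] = "/cygdrive/";
    const size_t plen = sizeof(prefix) - 1;
    int n;

    assert(path != NULL && out != NULL && outsize > 0);
    if (strncmp(path, prefix, plen) == 0
        && isalpha((unsigned char)path[plen])
        && (path[plen + 1] == '/' || path[plen + 1] == '\0')) {
        /* Keep the drive letter, then the remainder starting at its '/'. */
        n = snprintf(out, outsize, "%c:%s", path[plen], path + plen + 1);
    } else {
        n = snprintf(out, outsize, "%s", path);
    }
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```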
Specific changes:
1. Add dap4 code: libdap4 and dap4_test.
Note that until the d4ts server problem is solved, dap4 is turned off.
2. Modify various files to support dap4 flags:
configure.ac, Makefile.am, CMakeLists.txt, etc.
3. Add nc_test/test_common.sh. This centralizes
the handling of the locations of various
things in the build tree: e.g. where is
ncgen.exe located. See nc_test/test_common.sh
for details.
4. Modify .sh files to use test_common.sh
5. Obsolete separate oc2 by moving it to be part of
netcdf-c. This means replacing code with netcdf-c
equivalents.
5. Add --with-testserver to configure.ac to allow
override of the servers to be used for --enable-dap-remote-tests.
6. There were multiple versions of nctypealignment code. Try to
centralize in libdispatch/doffset.c and include/ncoffsets.h
7. Add a unit test for the ncuri code because of its complexity.
8. Move the findserver code out of libdispatch and into
a separate, self contained program in ncdap_test and dap4_test.
9. Move the dispatch header files (nc{3,4}dispatch.h) to
.../include because they are now shared by modules.
10. Revamp the handling of TOPSRCDIR and TOPBUILDDIR for shell scripts.
11. Make use of MREMAP if available
12. Misc. minor changes e.g.
- #include <config.h> -> #include "config.h"
- Add some no-install headers to /include
- extern -> EXTERNL and vice versa as needed
- misc header cleanup
- clean up checking for misc. unix vs microsoft functions
13. Change copyright decls in some files to point to LICENSE file.
14. Add notes to RELEASENOTES.md