re pull request https://github.com/Unidata/netcdf-c/pull/405
re pull request https://github.com/Unidata/netcdf-c/pull/446
Notes:
1. This branch is a cleanup of the magic.dmh branch.
2. magic.dmh was originally merged, but caused problems with parallel IO.
It was re-issued as pull request https://github.com/Unidata/netcdf-c/pull/446.
3. This branch + pull request replace any previous pull requests and magic.dmh branch.
Given an otherwise valid netCDF file that has a corrupted header,
the netcdf library currently crashes. Instead, it should return
NC_ENOTNC.
Additionally, the NC_check_file_type code does not do the
forward search required by hdf5 files. It currently only looks
at file position 0 instead of 512, 1024, 2048,... Also, it turns
out that the HDF4 magic number is assumed to always be at the
beginning of the file (unlike HDF5).
The change is localized to libdispatch/dfile.c See
https://support.hdfgroup.org/release4/doc/DSpec_html/DS.pdf
Also, it turns out that the code in NC_check_file_type is duplicated
(mostly) in the function libsrc4/nc4file.c#nc_check_for_hdf.
This branch does the following.
1. Make NC_check_file_type return NC_ENOTNC instead of crashing.
2. Remove nc_check_for_hdf and centralize all file format checking
NC_check_file_type.
3. Add proper forward search for HDF5 files (but not HDF4 files)
to look for the magic number at offsets of 0, 512, 1024...
4. Add test tst_hdf5_offset.sh. This tests that hdf5 files with
an offset are properly recognized. It does so by prefixing
a legal file with some number of zero bytes: 512, 1024, etc.
5. Off-topic: Added -N flag to ncdump to force a specific output dataset name.
dap code will create a real temporary file in which to store the
converted metadata for the DAP .dds or .dmr.
It was assumed that the nc_close code would reclaim the
temporary file. For DAP2, reclamation occurs in the ncio
code. For DAP4, it was assumed that the libsrc4 code would do
the reclamation, but for whatever reason, this is not happening.
Thus, in this situation, a temporary file is left in the file
system. Aside from being irritating to users, this screws up
'make distcheck'.
So the DAP4 code is fixed to ensure that the temporary file is
properly reclaimed independent of the libsrc4 code.
nc4_check_name() checks that the provided string doesn't exceed NC_MAX_NAME,
but fails to do so after calling nc_utf8_normalize(). This extra check is
needed since a caller of nc4_check_name(), like NC4_def_dim, allocates
norm_name as char norm_name[NC_MAX_NAME + 1]
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=2840
Credit to OSS-Fuzz
were added to provide a path name converter from e.g. cygwin
paths to e.g. windows paths. This is necessary because
the shell scripts may produce cygwin paths, but the code
may have been compiled with Visual Studio. Similar issues
arise with Mingw.
At appropriate places, and if using Visual Studio or Mingw,
I added calls to the path conversion code.
Apparently I forgot to find all the places where this
conversion was needed. So this pr does the following:
1. Push the calls to the converter to the various libXXX
directories and out of libdispatch/dfile.c.
2. Add conversion calls to other parts of the code like oc2.
I also turns out that conversion code in dapcvt.c
had a bug when handling DAP Byte type under visual studio.
Notes:
1. there may still be places I missed that need to do path conversion.
2. need to make sure that calls to e.g. H5open also use converted path.
Current code only frees char* text in error cases. It should
also free it in success case.
Otherwise Valgrind reports a leak:
==28536== 64 bytes in 1 blocks are definitely lost in loss record 4 of 13
==28536== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==28536== by 0xE673496: NC4_buildpropinfo (nc4info.c:239)
==28536== by 0xE67313B: NC4_put_propattr (nc4info.c:162)
==28536== by 0xE65BF35: nc4_create_file (nc4file.c:468)
==28536== by 0xE65C0AC: NC4_create (nc4file.c:564)
==28536== by 0xE608A08: NC_create (dfile.c:1773)
==28536== by 0xE607E6A: nc__create (dfile.c:511)
==28536== by 0xE607E23: nc_create (dfile.c:440)
Credit to OSS Fuzz
This relies on the HDF5 capability to
dynamically load compression filters.
Note that a compression filter is just
a subcase of filters.
The primary user-visible changes are as follows:
1. Add a standard header "netcdf_filter.h" that defines
the necessary API extensions
2. Modify ncgen to support two new special attributes
"_Filter_ID" and "_Filter_Parameters" so that compression
can be turned on when creating a file using ncgen.
4. Add a detailed description of filtering support
to the user's guide; see the file filters.md
5. Add a test case directory for this: nc_test4/filter_test.
It is fragile and a ./configure flags (-enable-filter-test)
is defined (default disabled) to shut this off this test
to avoid spurious 'make check' failures.
Note that the HDF5 documentation is not up-to-date, so
much of what is encoded here comes from examining the
actual code in the file H5PL.c in the HDF5 source code.
Specific changes:
1. Add dap4 code: libdap4 and dap4_test.
Note that until the d4ts server problem is solved, dap4 is turned off.
2. Modify various files to support dap4 flags:
configure.ac, Makefile.am, CMakeLists.txt, etc.
3. Add nc_test/test_common.sh. This centralizes
the handling of the locations of various
things in the build tree: e.g. where is
ncgen.exe located. See nc_test/test_common.sh
for details.
4. Modify .sh files to use test_common.sh
5. Obsolete separate oc2 by moving it to be part of
netcdf-c. This means replacing code with netcdf-c
equivalents.
5. Add --with-testserver to configure.ac to allow
override of the servers to be used for --enable-dap-remote-tests.
6. There were multiple versions of nctypealignment code. Try to
centralize in libdispatch/doffset.c and include/ncoffsets.h
7. Add a unit test for the ncuri code because of its complexity.
8. Move the findserver code out of libdispatch and into
a separate, self contained program in ncdap_test and dap4_test.
9. Move the dispatch header files (nc{3,4}dispatch.h) to
.../include because they are now shared by modules.
10. Revamp the handling of TOPSRCDIR and TOPBUILDDIR for shell scripts.
11. Make use of MREMAP if available
12. Misc. minor changes e.g.
- #include <config.h> -> #include "config.h"
- Add some no-install headers to /include
- extern -> EXTERNL and vice versa as needed
- misc header cleanup
- clean up checking for misc. unix vs microsoft functions
13. Change copyright decls in some files to point to LICENSE file.
14. Add notes to RELEASENOTES.md
The nvars, ndims, and natts fields on the NC_HDF5_FILE_INFO struct are
never set. The nvars field is read, but since it is never written,
the value is always zero.
Update utf8proc.[ch] to use the version now
maintained by the Julia Language project
(https://github.com/JuliaLang/utf8proc/blob/master/LICENSE.md).
The license for the previous version was
unacceptable for the Debian and Ubuntu release
systems. The new version both updates the code
and addresses the license issue.
It turns out that the utf8proc software we are using
was turned over to the Julia Language developers
and the license terms changed to allow modification.
(https://github.com/JuliaLang/utf8proc/blob/master/LICENSE.md).
So the fix here is as follows:
1. Wrap the library with a fixed interface: libdispatch/dutf8.c
and include/ncutf8.h.
2. Replace the existing utf8proc code with the new version
from https://github.com/JuliaLang/utf8proc.
3. Add a couple more test cases: nc_test/tst_utf8_validate.c
and nc_test_utf8_phrases.c. If/when I can find a usable
normalization test, I will incorporate that later.
HDF5 does not permit a variable to have no_fill == TRUE if the variable type is variable length. This includes types NC_STRING and NC_VLEN.
The test above also excludes user-defined types which I'm not sure is needed or not.
In nc4 mode, the variables were ignoring the default "database" fill/no_fill mode set via a call to nc_set_fill(). This sets the no_fill mode on the variable to match the default database setting at the time the variable is defined.