33 Commits

Author SHA1 Message Date
Dennis Heimbigner
211538cf25 Modify ncdump to print char-valued variables as utf8.
re: Issue https://github.com/Unidata/netcdf-c/issues/2916

Currently, ncdump prints char-valued variables as a mix
of ascii and octal characters. The octal format is used
for non-printable ascii character values.

This PR changes this to print the char variable values
as raw binary. This means in practice that utf-8 tags
are properly interpreted and printed as utf-8.
2024-05-07 10:36:14 -06:00
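
The difference between the two output styles described in this commit can be sketched as follows. This is only an illustration of the idea, not ncdump's actual code: the helper names `format_octal_escaped` and `format_raw` are hypothetical, and the real output path handles quoting and other escapes that are ignored here.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write `len` bytes of `src` into `dst`, replacing
 * every byte that is not printable ASCII with a backslash followed by
 * three octal digits. `dst` must hold at least 4*len + 1 bytes. */
static void format_octal_escaped(const char *src, size_t len, char *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)src[i];
        if (c < 0x80 && isprint(c)) {
            dst[out++] = (char)c;
        } else {
            out += (size_t)sprintf(dst + out, "\\%03o", (unsigned int)c);
        }
    }
    dst[out] = '\0';
}

/* Hypothetical helper: pass the bytes through unchanged, so a UTF-8
 * capable terminal can render multi-byte sequences itself.
 * `dst` must hold at least len + 1 bytes. */
static void format_raw(const char *src, size_t len, char *dst)
{
    memcpy(dst, src, len);
    dst[len] = '\0';
}
```

For the UTF-8 encoding of "café" (bytes `c`, `a`, `f`, 0xC3, 0xA9), the first helper produces `caf\303\251`, while the second leaves the two-byte sequence intact.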
Peter Hill
b6eb730684
Silence various conversion warnings in ncdump 2024-01-15 15:46:13 +00:00
Peter Hill
472b30f313
Remove some unused variables 2024-01-15 15:46:13 +00:00
Peter Hill
d07dac918c
Silence conversion warnings from malloc arguments
Mostly just add an explicit cast when calling `malloc` and its
variants. Sometimes instead change the type of a local variable if
this would silence multiple warnings.
2023-11-24 18:20:52 +00:00
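
A minimal sketch of the kind of change this commit describes, assuming the warning in question is of the sign-conversion family (for example GCC's `-Wsign-conversion`). The helper name `alloc_ints` is hypothetical and does not come from the ncdump sources.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate an array of `n` ints, where `n` arrives
 * as a signed int. Returns NULL for non-positive counts. */
static int *alloc_ints(int n)
{
    if (n <= 0)
        return NULL;
    /* Without the cast, the signed `n` is converted to size_t
     * implicitly in the multiplication, which is what the compiler
     * warns about. The explicit cast states the intent. */
    return malloc((size_t)n * sizeof(int));
}
```

The alternative mentioned in the message, changing the local variable's type, would correspond to declaring the count as `size_t` in the first place so that no cast is needed at any call site.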
Dennis Heimbigner
8b9253fef2 Fix various problems around VLENs
re: https://github.com/Unidata/netcdf-c/issues/541
re: https://github.com/Unidata/netcdf-c/issues/1208
re: https://github.com/Unidata/netcdf-c/issues/2078
re: https://github.com/Unidata/netcdf-c/issues/2041
re: https://github.com/Unidata/netcdf-c/issues/2143

For a long time, there have been known problems with the
management of complex types containing VLENs.  This also
involves the string type because it is stored as a VLEN of
chars.

This PR (mostly) fixes this problem. But note that it adds new
functions to netcdf.h (see below) and this may require bumping
the .so number.  These new functions can be removed, if desired,
in favor of functions in netcdf_aux.h, but netcdf.h seems the
better place for them because they are intended as alternatives
to the nc_free_vlen and nc_free_string functions already in
netcdf.h.

The term complex type refers to any type that directly or
transitively references a VLEN type. So an array of VLENS, a
compound with a VLEN field, and so on.

In order to properly handle instances of these complex types, it
is necessary to have functions that can recursively walk
instances of such types to perform various actions on them.  The
term "deep" is also used to mean recursive.

At the moment, the two operations needed by the netcdf library are:
* free'ing an instance of the complex type
* copying an instance of the complex type.

The current library does only shallow free and shallow copy of
complex types. This means that only the top level is properly
free'd or copied, but deep internal blocks in the instance are
not touched.

Note that the term "vector" will be used to mean a contiguous (in
memory) sequence of instances of some type. Given an array with,
say, dimensions 2 X 3 X 4, this will be stored in memory as a
vector of length 2*3*4=24 instances.
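
As a small illustration of the "vector" view, the position of one array element in the flattened vector can be computed as below. This assumes the usual C row-major layout (last dimension varying fastest); the function name `vector_offset` is hypothetical.

```c
#include <stddef.h>

/* Row-major offset of element (i, j, k) in an array with dimensions
 * d0 x d1 x d2 that is stored as one contiguous vector. The first
 * dimension d0 is not needed to compute the offset. */
static size_t vector_offset(size_t i, size_t j, size_t k,
                            size_t d1, size_t d2)
{
    return (i * d1 + j) * d2 + k;
}
```

For the 2 X 3 X 4 example, the last element (1, 2, 3) lands at offset 23, the final slot of the 24-instance vector.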

The use cases are primarily these.

## nc_get_vars
Suppose one is reading a vector of instances using nc_get_vars
(or nc_get_vara or nc_get_var, etc.).  These functions will
return the vector in the top-level memory provided.  All
interior blocks (from nested VLENs or strings) will have been
dynamically allocated.

After using this vector of instances, it is necessary to free
(aka reclaim) the dynamically allocated memory, otherwise a
memory leak occurs.  So, the recursive reclaim function is used
to walk the returned instance vector and do a deep reclaim of
the data.

Currently functions are defined in netcdf.h that are supposed to
handle this: nc_free_vlen(), nc_free_vlens(), and
nc_free_string().  Unfortunately, these functions only do a
shallow free, so deeply nested instances are not properly
handled by them.
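
The gap between a shallow free and a deep reclaim can be sketched with a self-contained model. Everything here is a stand-in: `demo_vlen_t` only mimics the length-plus-pointer shape of a VLEN, and the code handles one fixed nesting ("VLEN of VLEN of float") rather than walking arbitrary types as the library functions must.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a VLEN instance. */
typedef struct {
    size_t len;
    void  *p;
} demo_vlen_t;

/* Build one "VLEN of VLEN of float" instance with `outer_len` inner
 * VLENs of `inner_len` floats each (both assumed nonzero).
 * Returns 0 on success, -1 on allocation failure. */
static int demo_build(demo_vlen_t *inst, size_t outer_len, size_t inner_len)
{
    demo_vlen_t *inner = calloc(outer_len, sizeof *inner);
    if (inner == NULL)
        return -1;
    for (size_t i = 0; i < outer_len; i++) {
        inner[i].p = calloc(inner_len, sizeof(float));
        if (inner[i].p == NULL) {
            while (i-- > 0)
                free(inner[i].p);
            free(inner);
            return -1;
        }
        inner[i].len = inner_len;
    }
    inst->len = outer_len;
    inst->p = inner;
    return 0;
}

/* Deep reclaim of a vector of such instances: free every innermost
 * float block, then each block of inner VLENs. The top-level vector
 * `vec` stays with the caller. Returns the number of blocks freed.
 * A shallow free would release only one block per instance. */
static size_t demo_deep_reclaim(demo_vlen_t *vec, size_t count)
{
    size_t freed = 0;
    for (size_t i = 0; i < count; i++) {
        demo_vlen_t *inner = vec[i].p;
        if (inner == NULL)
            continue;
        for (size_t j = 0; j < vec[i].len; j++) {
            free(inner[j].p);   /* innermost float block */
            freed++;
        }
        free(inner);            /* block of inner VLENs */
        freed++;
        vec[i].p = NULL;
        vec[i].len = 0;
    }
    return freed;
}
```

With two instances holding 3 and 2 inner VLENs, the deep reclaim releases (3 + 1) + (2 + 1) = 7 blocks, whereas a shallow free would release 2 and leak the other 5.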

Note that when writing a vector (for example with nc_put_vars),
the provided data is written immediately, so there is no need to
copy it internally. But the caller may need to reclaim the data it
passed into the function.

## nc_put_att
Suppose one is writing a vector of instances as the data of an attribute
using, say, nc_put_att.

Internally, the incoming attribute data must be copied and stored
so that changes/reclamation of the input data will not affect
the attribute.

Again, the code inside the netcdf library does only shallow copying
rather than deep copy. As a result, one sees effects such as described
in Github Issue https://github.com/Unidata/netcdf-c/issues/2143.

Also, after defining the attribute, it may be necessary for the user
to free the data that was provided as input to nc_put_att().

## nc_get_att
Suppose one is reading a vector of instances as the data of an attribute
using, say, nc_get_att.

Internally, the existing attribute data must be copied and returned
to the caller, and the caller is responsible for reclaiming
the returned data.

Again, the code inside the netcdf library does only shallow copying
rather than deep copy. So this can lead to memory leaks and errors
because the deep data is shared between the library and the user.
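
The sharing hazard described for attributes can be reproduced in a few lines. This is a self-contained sketch with hypothetical names (`demo_rec_t`, `demo_shallow_copy`, `demo_deep_copy`); it models a compound type with one string field and is not the library's implementation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical compound type with one dynamically allocated field. */
typedef struct {
    int   id;
    char *name;
} demo_rec_t;

/* Shallow copy: the struct bytes are duplicated, but each `name`
 * still points at the block owned by the source. */
static void demo_shallow_copy(const demo_rec_t *src, size_t count,
                              demo_rec_t *dst)
{
    memcpy(dst, src, count * sizeof *src);
}

/* Deep copy: every `name` block is duplicated as well.
 * Returns 0 on success, -1 on allocation failure (partial copies
 * are released before returning). */
static int demo_deep_copy(const demo_rec_t *src, size_t count,
                          demo_rec_t *dst)
{
    for (size_t i = 0; i < count; i++) {
        dst[i].id = src[i].id;
        dst[i].name = NULL;
        if (src[i].name != NULL) {
            size_t n = strlen(src[i].name) + 1;
            dst[i].name = malloc(n);
            if (dst[i].name == NULL) {
                while (i-- > 0)
                    free(dst[i].name);
                return -1;
            }
            memcpy(dst[i].name, src[i].name, n);
        }
    }
    return 0;
}
```

After a shallow copy, changing or freeing the source's `name` also changes or invalidates the copy; after a deep copy the two instances are independent and each must be reclaimed separately.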

# Solution

The solution is to build properly recursive reclaim and copy
functions and use those as needed.
These recursive functions are defined in libdispatch/dinstance.c
and their signatures are defined in include/netcdf.h.
For back compatibility, corresponding "ncaux_XXX" functions
are defined in include/netcdf_aux.h.
````
int nc_reclaim_data(int ncid, nc_type xtypeid, void* memory, size_t count);
int nc_reclaim_data_all(int ncid, nc_type xtypeid, void* memory, size_t count);
int nc_copy_data(int ncid, nc_type xtypeid, const void* memory, size_t count, void* copy);
int nc_copy_data_all(int ncid, nc_type xtypeid, const void* memory, size_t count, void** copyp);
````
There are two variants. The first two, nc_reclaim_data() and
nc_copy_data(), assume the top-level vector is managed by the
caller. For reclaim, this is so the user can use, for example, a
statically allocated vector. For copy, it assumes the user
provides the space into which the copy is stored.

The second two, nc_reclaim_data_all() and
nc_copy_data_all(), allow the functions to manage the
top-level.  So for nc_reclaim_data_all, the top level is
assumed to be dynamically allocated and will be free'd by
nc_reclaim_data_all().  The nc_copy_data_all() function
will allocate the top level and return a pointer to it to the
user. The user can later pass that pointer to
nc_reclaim_data_all() to reclaim the instance(s).
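
The ownership split between the two variants can be illustrated with stand-ins that operate on a vector of C strings. The `demo_*` functions below are hypothetical and deliberately do not call the netCDF API; they only mirror who owns the top-level vector in each case.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for the caller-managed variant: free each string but leave
 * the top-level vector alone, so it may live on the stack or in
 * static storage. */
static void demo_reclaim(char **vec, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        free(vec[i]);
        vec[i] = NULL;
    }
}

/* Stand-in for the "_all" variant: also free the top-level vector,
 * which must therefore have been dynamically allocated. */
static void demo_reclaim_all(char **vec, size_t count)
{
    demo_reclaim(vec, count);
    free(vec);
}

/* Stand-in for the "_all" copy: allocate the top-level vector, deep
 * copy into it, and hand it back through `copyp`.
 * Returns 0 on success, -1 on allocation failure. */
static int demo_copy_all(char *const *src, size_t count, char ***copyp)
{
    char **copy = calloc(count ? count : 1, sizeof *copy);
    if (copy == NULL)
        return -1;
    for (size_t i = 0; i < count; i++) {
        if (src[i] == NULL)
            continue;
        size_t n = strlen(src[i]) + 1;
        copy[i] = malloc(n);
        if (copy[i] == NULL) {
            demo_reclaim_all(copy, count);
            return -1;
        }
        memcpy(copy[i], src[i], n);
    }
    *copyp = copy;
    return 0;
}
```

A stack-allocated vector would be cleaned up with `demo_reclaim`, while a vector returned by `demo_copy_all` would be passed to `demo_reclaim_all`, matching the pairing described above.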

# Internal Changes
The netcdf-c library internals are changed to use the proper
reclaim and copy functions.  It turns out that the places where
these functions are needed are quite pervasive in the netcdf-c
library code.  Using these functions also allows some
simplification of the code since the stdata and vldata fields of
NC_ATT_INFO are no longer needed.  Currently this is commented
out using the SEPDATA #define macro.  When any bugs are largely
fixed, all this code will be removed.

# Known Bugs

1. There is still one known failure that has not been solved.
   All the failures revolve around some variant of this .cdl file.
   The proximate cause of failure is the use of a VLEN FillValue.
````
        netcdf x {
        types:
          float(*) row_of_floats ;
        dimensions:
          m = 5 ;
        variables:
          row_of_floats ragged_array(m) ;
              row_of_floats ragged_array:_FillValue = {-999} ;
        data:
          ragged_array = {10, 11, 12, 13, 14}, {20, 21, 22, 23}, {30, 31, 32},
                         {40, 41}, _ ;
        }
````
When a solution is found, I will either add it to this PR or post a new PR.

# Related Changes

* Mark nc_free_vlen(s) as deprecated in favor of ncaux_reclaim_data.
* Remove the --enable-unfixed-memory-leaks option.
* Remove the NC_VLENS_NOTEST code that suppresses some vlen tests.
* Document this change in docs/internal.md
* Disable the tst_vlen_data test in ncdump/tst_nccopy4.sh.
* Mark types as fixed size or not (transitively) to optimize the reclaim
  and copy functions.
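
The last item, marking types as fixed size transitively, might look roughly like the sketch below. The type descriptor `demo_type_t` and the function `demo_is_fixed_size` are hypothetical; the library's own metadata structures are different, and a real implementation would presumably cache the result on the type rather than recompute it.

```c
#include <stddef.h>

typedef enum {
    DEMO_ATOMIC,    /* int, float, ... */
    DEMO_STRING,    /* owns a dynamically allocated block */
    DEMO_VLEN,      /* owns a dynamically allocated block */
    DEMO_COMPOUND   /* fixed size only if every field is */
} demo_kind_t;

typedef struct demo_type {
    demo_kind_t kind;
    size_t nfields;                          /* DEMO_COMPOUND only */
    const struct demo_type *const *fields;   /* DEMO_COMPOUND only */
} demo_type_t;

/* Returns 1 if no instance of `t` can own interior blocks, so that
 * reclaim and copy can treat it as plain bytes; returns 0 otherwise. */
static int demo_is_fixed_size(const demo_type_t *t)
{
    switch (t->kind) {
    case DEMO_ATOMIC:
        return 1;
    case DEMO_STRING:
    case DEMO_VLEN:
        return 0;
    case DEMO_COMPOUND:
        for (size_t i = 0; i < t->nfields; i++) {
            if (!demo_is_fixed_size(t->fields[i]))
                return 0;
        }
        return 1;
    }
    return 0;
}
```

A compound of atomic fields is fixed size, while a compound that contains another compound with a VLEN field is not, which is the transitive part of the rule.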

# Misc. Changes

* Make Doxygen process libdispatch/daux.c
* Make sure the NC_ATT_INFO_T.container field is set.
2022-01-08 18:30:00 -07:00
Ward Fisher
ffa30d21c2 Correcting a formatting error for scalars when dumping with ncdump -f 2020-04-28 15:49:03 -06:00
Ward Fisher
02937d2d0e ncdump, other directories updated with copyright stanza. 2018-12-06 15:36:53 -07:00
Dennis Heimbigner
751300ec59 Fix more memory leaks in netcdf-c library
This is a follow up to PR https://github.com/Unidata/netcdf-c/pull/1173

Sorry that it is so big, but leak suppression can be complex.

This PR fixes all remaining memory leaks -- as determined by
-fsanitize=address, and with the exceptions noted below.

Unfortunately, there remains a significant leak that I cannot
solve. It involves vlens, and it is unclear if the leak is
occurring in the netcdf-c library or the HDF5 library.

I have added a check_PROGRAM to the ncdump directory to show the
problem.  The program is called tst_vlen_demo.c. To exercise it,
build the netcdf library with -fsanitize=address enabled. Then
go into ncdump and do a "make clean check".  This should build
tst_vlen_demo without actually executing it.  Then do the
command "./tst_vlen_demo" to see the output of the memory
checker.  Note that the lost malloc is deep in the HDF5 library
(in H5Tvlen.c).

I am temporarily working around this error in the following way.
1. I modified several test scripts to not execute known vlen tests
   that fail as described above.
2. Added an environment variable called NC_VLEN_NOTEST.
   If set, then those specific tests are suppressed.

This should mean that the --disable-utilities option to
./configure should not need to be set to get a memory leak clean
build.  This should allow for detection of any new leaks.

Note: I used an environment variable rather than a ./configure
option to control the vlen tests. This is because it is
temporary (I hope) and because it is a bit tricky for shell
scripts to access ./configure options.

Finally, as before, this has only been tested with netcdf-4 and hdf5 support.
2018-11-15 10:00:38 -07:00
Dennis Heimbigner
245961de00 re: github issues
https://github.com/Unidata/netcdf-c/issues/1168
    https://github.com/Unidata/netcdf-c/issues/1163
    https://github.com/Unidata/netcdf-c/issues/1162

This PR partially fixes memory leaks in the netcdf-c library,
in the ncdump utility, and in some test cases.

The netcdf-c library now runs memory clean with the assumption
that the --disable-utilities option is used. The primary remaining
problem is ncgen. Once that is fixed, I believe the netcdf-c library
will run memory clean with no limitations.

Notes
-----------
1. Memory checking was performed using gcc -fsanitize=address.
   Valgrind-based testing has yet to be performed.
2. The pnetcdf, hdf4, and examples code has not been tested.

Misc. Non-leak changes
1. Make tst_diskless2 only run when netcdf4 is enabled (issue 1162)
2. Fix CMakeLists.txt to turn off logging if ENABLE_NETCDF_4 is OFF
3. Isolated all my debug scripts into a single top-level directory
   called debug
4. Fix some USE_NETCDF4 dependencies in nc_test and nc_test4 Makefile.am
2018-10-30 20:48:12 -06:00
Ed Hartnett
0c0d066927 changed macro STREQ to NCSTREQ to avoid name collision with HDF4 library 2018-05-12 08:55:51 -06:00
Ed Hartnett
93cb051b72 found a few more easy warnings 2017-11-09 06:39:43 -07:00
Dennis Heimbigner
3db4f013bf Primary change: add dap4 support
Specific changes:
1. Add dap4 code: libdap4 and dap4_test.
   Note that until the d4ts server problem is solved, dap4 is turned off.
2. Modify various files to support dap4 flags:
	configure.ac, Makefile.am, CMakeLists.txt, etc.
3. Add nc_test/test_common.sh. This centralizes
   the handling of the locations of various
   things in the build tree: e.g. where is
   ncgen.exe located. See nc_test/test_common.sh
   for details.
4. Modify .sh files to use test_common.sh
5. Obsolete separate oc2 by moving it to be part of
   netcdf-c. This means replacing code with netcdf-c
   equivalents.
5. Add --with-testserver to configure.ac to allow
   override of the servers to be used for --enable-dap-remote-tests.
6. There were multiple versions of nctypealignment code. Try to
   centralize in libdispatch/doffset.c and include/ncoffsets.h
7. Add a unit test for the ncuri code because of its complexity.
8. Move the findserver code out of libdispatch and into
   a separate, self contained program in ncdap_test and dap4_test.
9. Move the dispatch header files (nc{3,4}dispatch.h) to
   .../include because they are now shared by modules.
10. Revamp the handling of TOPSRCDIR and TOPBUILDDIR for shell scripts.
11. Make use of MREMAP if available
12. Misc. minor changes e.g.
	- #include <config.h> -> #include "config.h"
	- Add some no-install headers to /include
	- extern -> EXTERNL and vice versa as needed
	- misc header cleanup
	- clean up checking for misc. unix vs microsoft functions
13. Change copyright decls in some files to point to LICENSE file.
14. Add notes to RELEASENOTES.md
2017-03-08 17:01:10 -07:00
Greg Sjaardema
2d059f4580 Fix invalid array access
If an empty line is passed to this routine, then there is an invalid memory access at the cp[nn-1] line.  This becomes cp[-1] which is invalid.
2016-08-25 11:32:49 -06:00
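
The empty-line problem fixed in 2d059f4580 follows a common pattern, sketched here under assumptions: the routine name `demo_chomp` and its purpose (stripping a trailing newline) are hypothetical, and only the `cp[nn-1]` access comes from the commit message.

```c
#include <string.h>

/* Hypothetical routine: strip one trailing newline from `cp` in place. */
static void demo_chomp(char *cp)
{
    size_t nn = strlen(cp);
    /* Guard against the empty string: with nn == 0, cp[nn - 1] would
     * index outside the buffer. */
    if (nn > 0 && cp[nn - 1] == '\n')
        cp[nn - 1] = '\0';
}
```

Checking the length before indexing keeps the routine safe for empty input while leaving its behavior on non-empty lines unchanged.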
Ward Fisher
ac9f02d01a Modified index passed for last row when printing with annotations, in support of fixing https://github.com/Unidata/netcdf-c/issues/181 2015-12-31 21:20:36 +00:00
tbeu
e2820e4d8a Fix common typos
Detected by https://github.com/vlajos/misspell_fixer
2015-08-20 11:42:05 +02:00
Russ Rew
374a7550c4 Better fix for NCF-336 2015-07-14 10:00:40 -06:00
Ward Fisher
6c071be031 Corrected issues where functions were not available in Windows. Specifically strndup. Also accommodated an include needed for getcwd. 2014-08-07 17:03:27 -06:00
Russ Rew
720e4ea82c Fixed bug NCF-310 (ncdump char vars with multiple unlimited dims).
Added associated tests and entry in RELEASE_NOTES.
2014-08-07 14:35:29 -06:00
Russ Rew
911bdab962 Fix for bug NCF-275, ncdump -b annotation 2013-11-13 11:03:18 -07:00
Russ Rew
8e18ede2fe Fix full annotation bug described in NCF-275 2013-11-12 06:40:03 -07:00
Russ Rew
3ef3b35a94 Refactor to share functions between ncdump and nccopy. Merge nccopy
enhancements, based on contributed code from Martin van Driel, to
support -v, -g, -V, and -G options for selecting groups and variables
in output.  Fix all clang warnings from nccopy and ncdump sources, as
well as a few other cleanup changes to testing code.
2013-01-23 17:45:29 +00:00
Russ Rew
8b63afda70 Fix portability bug, no arithmetic on void * pointers 2012-11-19 20:02:22 +00:00
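
The portability point in 8b63afda70 can be shown with a short sketch. ISO C does not define arithmetic on `void *` (some compilers accept it as an extension), so the usual fix is to cast to `char *` first. The helper name `demo_element_at` is hypothetical.

```c
#include <stddef.h>

/* Return a pointer to element `index` of a buffer whose elements are
 * `elem_size` bytes each. The cast to `char *` makes the byte-wise
 * arithmetic valid ISO C. */
static void *demo_element_at(void *base, size_t index, size_t elem_size)
{
    return (char *)base + index * elem_size;
}
```

Called on an array of `double` with `elem_size` set to `sizeof(double)`, this yields the same address as ordinary array indexing.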
Russ Rew
8e98e3727d Fixed bug NCF-144 (ncdump of variables with multiple unlimited
dimensions).  Added comprehensive tests that include variables with
lots of combinations of 0 through 4 fixed and unlimited dimensions.
2012-11-16 21:37:43 +00:00
Ward Fisher
18d507c00d Changed 'boolean' to 'boolen' to avoid a name conflict under windows. 2012-09-10 22:07:04 +00:00
Russ Rew
c6f399c731 Fix non-portable test that depends on nonstandardized floating-point format using e+08, not e+008. Fixes for some problems reported in scan-build static analysis. 2012-04-23 23:59:24 +00:00
Russ Rew
97c954a859 Add -g option to ncdump and test it 2011-09-21 23:10:03 +00:00
Russ Rew
51906ca254 Support Dave Allured's ncdump mods for time-valued attributes. Refactor all ncdump time code into nctime.c. Add '-i' option for strict ISO-8601 time notation. Fix backward compatibility test to not use DBL_MAX, which is the same as IEEE infinity when using only 15 significant digits of precision. Document ncdump time functionality in man page. 2011-09-12 21:31:08 +00:00
Russ Rew
50115604c8 Fixes for NCF-70, ncdump -t printing of time-valued attributes. Haven't fixed printing of cell bounds for time-valued variables yet. 2011-08-31 18:28:08 +00:00
Russ Rew
52e525c19c Factor out common functionality in ncdump and nccopy and share between them. 2011-01-05 23:48:47 +00:00
Russ Rew
a46ef6091e another try at fixing MALLOC_CHECK_=3 error on spock 2010-08-11 04:58:14 +00:00
Russ Rew
817502e6e1 Fix a malloc bug 2010-08-03 16:09:01 +00:00
Russ Rew
86cfc908ad Get rid of uses of NC_MAX_DIMS in ncdump that are easy to eliminate. Get rid of some uses of NC_MAX_DIMS from nccopy. Add libsrc/pstdint.h for systems that have no stdint.h. 2010-07-29 22:41:05 +00:00
Ed Hartnett
18f4bca367 moving to trunk subdir 2010-06-03 13:24:43 +00:00