netcdf-c

mirror of https://github.com/Unidata/netcdf-c.git synced 2024-12-21 08:39:46 +08:00

Author	SHA1	Message	Date
Dennis Heimbigner	d37147d0ce	Suppress nczarr_test/tst_unknown filter test (and its nczarr_test equivalent) The test case nc_test4/tst_unknown.sh deletes and then restores a filter in the plugins directory. The test nczarr_test/run_unknown.sh also does this. However, if both are running at the same time in a parallel build, they can apparently interfere with each other and cause a race condition failure. The solution is to suppress one of them. Since the nczarr code is more unstable, we need to run this test. So suppress the corresponding test in nc_test4.	2022-11-15 16:15:32 -07:00
Dennis Heimbigner	517cb6e816	Fix master conflicts	2022-08-05 13:09:03 -06:00
Dennis Heimbigner	2b45c7ec84	Fix support for reading arrays of HDF5 fixed size strings re: https://github.com/Unidata/netcdf-c/issues/2159 There was an error in libhdf5 that only allowed reading a single value HDF5 fixed string. Fix to allow reading an array of such strings. Also make sure it still works for scalars and for attributes. Add a testcase: nc_test4/tst_fixedstring.sh.	2022-07-29 14:47:07 -06:00
Dennis Heimbigner	3623e17920	Fix some bugs in the blosc filter wrapper re: Issue https://github.com/Unidata/netcdf-c/issues/2458 The above Github Issue revealed some bugs in the file netcdf-c/plugins/H5Zblosc.c. Fixed and added a testcase. Also discovered that the Blosc LZ sub-compressors do not work well with small datasets. Misc. Other Change(s): I noticed that the file "dap4_test/baselinethredds/GOES16_CONUS_20170821_020218_0.47_1km_33.3N_91.4W.nc4.thredds" is still causing tar errors during "make distcheck", so I made some changes to do rename at test-time.	2022-07-12 15:19:07 -06:00
Dennis Heimbigner	d16a894458	conflicts	2022-05-24 14:40:54 -06:00
Dennis Heimbigner	7b09290a3a	Improve filter installation process to avoid use of an extra shell script re: https://github.com/Unidata/netcdf-c/issues/2338 re: https://github.com/Unidata/netcdf-c/issues/2294 In issue https://github.com/Unidata/netcdf-c/issues/2338, Ed Hartnett suggested a better way to install filters to a user defined location -- for Automake, anyway. This PR implements that suggestion. It turns out to be more complicated than it appears, so there are a fair number of changes, mostly to shell scripts. Most of the change is in plugins/Makefile.am. NOTE: this PR still does NOT address the use of HDF5_PLUGIN_PATH as the default; this turns out to be complex when dealing with NCZarr. So this will be addressed in a subsequent post 4.9.0 PR. ## Misc. Changes 1. Record the occurrences of incomplete codecs in libnczarr so that they can be included in the _Codecs attribute correctly. This allows users to see what missing filters are referenced in the Zarr file. Primarily affects libnczarr/zfilter.[ch]. Also required creating a new no-effect filter: H5Zunknown.c. 2. Move the unknown filter test to a separate test file. 3. Incorporates PR https://github.com/Unidata/netcdf-c/pull/2343	2022-05-14 16:05:48 -06:00
Edward Hartnett	15fb4aba9b	added tst_parallel6 to CMake parallel build	2022-05-02 10:40:22 -06:00
Ward Fisher	cd0f1690e8	Merge pull request #2245 from DennisHeimbigner/filterenhance.dmh	2022-03-21 16:45:27 -06:00
Ward Fisher	d281be2333	Merge branch 'main' into open_mem_truncated_file	2022-03-14 15:02:02 -06:00
Dennis Heimbigner	3ffe7be446	Enhance/Fix filter support re: Discussion https://github.com/Unidata/netcdf-c/discussions/2214 The primary change is to support so-called "standard filters". A standard filter is one that is defined by the following netcdf-c API: ```` int nc_def_var_XXX(int ncid, int varid, size_t nparams, unsigned* params); int nc_inq_var_XXX(int ncid, int varid, int* usefilterp, unsigned* params); ```` So for example, zstandard would be a standard filter by defining the functions nc_def_var_zstandard and nc_inq_var_zstandard. In order to define these functions, we need a new dispatch function: ```` int nc_inq_filter_avail(int ncid, unsigned filterid); ```` This function, combined with the existing filter API, can be used to implement arbitrary standard filters using a simple code pattern. Note that I would have preferred that this function return a list of all available filters, but HDF5 does not support that functionality. So this PR implements the dispatch function and implements the following standard functions: + bzip2 + zstandard + blosc Specific test cases are also provided for HDF5 and NCZarr. Over time, other specific standard filters will be defined. ## Primary Changes * Add nc_inq_filter_avail() to netcdf-c API. * Add standard filter implementations to test use of nc_inq_filter_avail. * Bump the dispatch table version number and add to all the relevant dispatch tables (libsrc, libsrcp, etc). * Create a program to invoke nc_inq_filter_avail so that it is accessible to shell scripts. * Cleanup szip support to properly support szip when HDF5 is disabled. This involves detecting libsz separately from testing if HDF5 supports szip. * Integrate shuffle and fletcher32 into the existing filter API. This means that, for example, nc_def_var_fletcher32 is now a wrapper around nc_def_var_filter. * Extend the Codec defaulting to allow multiple default shared libraries. ## Misc.
Changes * Modify configure.ac/CMakeLists.txt to look for the relevant libraries implementing standard filters. * Modify libnetcdf.settings to list available standard filters (including deflate and szip). * Add CMake test modules to locate libbz2 and libzstd. * Cleanup the HDF5 memory manager function use in the plugins. * Remove unused file include/ncfilter.h * Remove tests for the HDF5 memory operations e.g. H5allocate_memory. * Add flag to ncdump to force use of _Filter instead of _Deflate or _Shuffle or _Fletcher32. Used for testing.	2022-03-14 12:39:37 -06:00
Ward Fisher	7c113cfae4	Merge branch 'h5align.dmh' of https://github.com/DennisHeimbigner/netcdf-c into gh2206.wif	2022-03-01 09:49:13 -08:00
Dennis Heimbigner	9b7202bf06	Explicitly disallow variable length type compression re: https://github.com/Unidata/netcdf-c/issues/2189 Compression of a variable whose type is variable length fails for all current filters. This is because at some point, the compression buffer will contain pointers to data instead of the actual data. Compression of pointers of course is meaningless. The PR changes the behavior of nc_def_var_filter so that it will fail with error NC_EFILTER if an attempt is made to add a filter to a variable whose type is variable-length. A variable is variable-length if it is of type string or VLEN or transitively (via a compound type) contains a string or VLEN. Also added a test case for this. ## Misc Changes 1. Turn off a number of debugging statements	2022-02-19 16:47:31 -07:00
Dennis Heimbigner	f3e711e2b8	Add support for setting HDF5 alignment property when creating a file re: https://github.com/Unidata/netcdf-c/issues/2177 re: https://github.com/Unidata/netcdf-c/pull/2178 Provide get/set functions to store global data alignment information and apply it when a file is created. The API is as follows: ```` int nc_set_alignment(int threshold, int alignment); int nc_get_alignment(int* thresholdp, int* alignmentp); ```` If defined, then for every file created or opened after the call to nc_set_alignment, for every new variable added to the file, the most recently set threshold and alignment values will be applied to that variable. The nc_get_alignment function returns the last values set by nc_set_alignment. If nc_set_alignment has not been called, then it returns the value 0 for both threshold and alignment. The alignment parameters are stored in the NCglobalstate object (see below) for use as needed. Repeated calls to nc_set_alignment will overwrite any existing values in NCglobalstate. The alignment parameters are applied in libhdf5/hdf5create.c and libhdf5/hdf5open.c. The set/get alignment functions are defined in libsrc4/nc4internal.c. A test program was added as nc_test4/tst_alignment.c. ## Misc. Changes Unrelated to Alignment * The NCRCglobalstate type was renamed to NCglobalstate to indicate that it represented more general global state than just .rc data. It was also moved to nc4internal.h. This led to a large number of small changes: mostly renaming. The global state management functions were moved to nc4internal.c. * The global chunk cache variables have been moved into NCglobalstate. As warranted, other global state will be moved as well. * Some misc. problems with the nczarr performance tests were corrected.	2022-01-29 15:27:52 -07:00
Dennis Heimbigner	8b9253fef2	Fix various problems around VLENs re: https://github.com/Unidata/netcdf-c/issues/541 re: https://github.com/Unidata/netcdf-c/issues/1208 re: https://github.com/Unidata/netcdf-c/issues/2078 re: https://github.com/Unidata/netcdf-c/issues/2041 re: https://github.com/Unidata/netcdf-c/issues/2143 For a long time, there have been known problems with the management of complex types containing VLENs. This also involves the string type because it is stored as a VLEN of chars. This PR (mostly) fixes this problem. But note that it adds new functions to netcdf.h (see below) and this may require bumping the .so number. These new functions can be removed, if desired, in favor of functions in netcdf_aux.h, but netcdf.h seems the better place for them because they are intended as alternatives to the nc_free_vlen and nc_free_string functions already in netcdf.h. The term complex type refers to any type that directly or transitively references a VLEN type. So an array of VLENs, a compound with a VLEN field, and so on. In order to properly handle instances of these complex types, it is necessary to have a function that can recursively walk instances of such types to perform various actions on them. The term "deep" is also used to mean recursive. At the moment, the two operations needed by the netcdf library are: * free'ing an instance of the complex type * copying an instance of the complex type. The current library does only shallow free and shallow copy of complex types. This means that only the top level is properly free'd or copied, but deep internal blocks in the instance are not touched. Note that the term "vector" will be used to mean a contiguous (in memory) sequence of instances of some type. Given an array with, say, dimensions 2 X 3 X 4, this will be stored in memory as a vector of length 2 X 3 X 4 = 24 instances. The use cases are primarily these.
## nc_get_vars Suppose one is reading a vector of instances using nc_get_vars (or nc_get_vara or nc_get_var, etc.). These functions will return the vector in the top-level memory provided. All interior blocks (from nested VLENs or strings) will have been dynamically allocated. After using this vector of instances, it is necessary to free (aka reclaim) the dynamically allocated memory, otherwise a memory leak occurs. So, the recursive reclaim function is used to walk the returned instance vector and do a deep reclaim of the data. Currently functions are defined in netcdf.h that are supposed to handle this: nc_free_vlen(), nc_free_vlens(), and nc_free_string(). Unfortunately, these functions only do a shallow free, so deeply nested instances are not properly handled by them. Note that internally, the provided data is immediately written so there is no need to copy it. But the caller may need to reclaim the data it passed into the function. ## nc_put_att Suppose one is writing a vector of instances as the data of an attribute using, say, nc_put_att. Internally, the incoming attribute data must be copied and stored so that changes/reclamation of the input data will not affect the attribute. Again, the code inside the netcdf library does only shallow copying rather than deep copy. As a result, one sees effects such as described in Github Issue https://github.com/Unidata/netcdf-c/issues/2143. Also, after defining the attribute, it may be necessary for the user to free the data that was provided as input to nc_put_att(). ## nc_get_att Suppose one is reading a vector of instances as the data of an attribute using, say, nc_get_att. Internally, the existing attribute data must be copied and returned to the caller, and the caller is responsible for reclaiming the returned data. Again, the code inside the netcdf library does only shallow copying rather than deep copy. So this can lead to memory leaks and errors because the deep data is shared between the library and the user.
# Solution The solution is to build properly recursive reclaim and copy functions and use those as needed. These recursive functions are defined in libdispatch/dinstance.c and their signatures are defined in include/netcdf.h. For back compatibility, corresponding "ncaux_XXX" functions are defined in include/netcdf_aux.h. ```` int nc_reclaim_data(int ncid, nc_type xtypeid, void* memory, size_t count); int nc_reclaim_data_all(int ncid, nc_type xtypeid, void* memory, size_t count); int nc_copy_data(int ncid, nc_type xtypeid, const void* memory, size_t count, void* copy); int nc_copy_data_all(int ncid, nc_type xtypeid, const void* memory, size_t count, void** copyp); ```` There are two variants. The first two, nc_reclaim_data() and nc_copy_data(), assume the top-level vector is managed by the caller. For reclaim, this is so the user can use, for example, a statically allocated vector. For copy, it assumes the user provides the space into which the copy is stored. The second two, nc_reclaim_data_all() and nc_copy_data_all(), allow the functions to manage the top level. So for nc_reclaim_data_all, the top level is assumed to be dynamically allocated and will be free'd by nc_reclaim_data_all(). The nc_copy_data_all() function will allocate the top level and return a pointer to it to the user. The user can later pass that pointer to nc_reclaim_data_all() to reclaim the instance(s). # Internal Changes The netcdf-c library internals are changed to use the proper reclaim and copy functions. It turns out that the places where these functions are needed are quite pervasive in the netcdf-c library code. Using these functions also allows some simplification of the code since the stdata and vldata fields of NC_ATT_INFO are no longer needed. Currently this is commented out using the SEPDATA #define macro. When any bugs are largely fixed, all this code will be removed. # Known Bugs 1. There is still one known failure that has not been solved.
All the failures revolve around some variant of this .cdl file. The proximate cause of failure is the use of a VLEN FillValue. ```` netcdf x { types: float(*) row_of_floats ; dimensions: m = 5 ; variables: row_of_floats ragged_array(m) ; row_of_floats ragged_array:_FillValue = {-999} ; data: ragged_array = {10, 11, 12, 13, 14}, {20, 21, 22, 23}, {30, 31, 32}, {40, 41}, _ ; } ```` When a solution is found, I will either add it to this PR or post a new PR. # Related Changes * Mark nc_free_vlen(s) as deprecated in favor of ncaux_reclaim_data. * Remove the --enable-unfixed-memory-leaks option. * Remove the NC_VLENS_NOTEST code that suppresses some vlen tests. * Document this change in docs/internal.md * Disable the tst_vlen_data test in ncdump/tst_nccopy4.sh. * Mark types as fixed size or not (transitively) to optimize the reclaim and copy functions. # Misc. Changes * Make Doxygen process libdispatch/daux.c * Make sure the NC_ATT_INFO_T.container field is set.	2022-01-08 18:30:00 -07:00
Edward Hartnett	0ce463761c	Merge branch 'main' into ejh_quantize_2	2021-09-07 10:44:45 -06:00
Dennis Heimbigner	11fe00ea05	Add filter support to NCZarr Filter support has three goals: 1. Use the existing HDF5 filter implementations, 2. Allow filter metadata to be stored in the NumCodecs metadata format used by Zarr, 3. Allow filters to be used even when HDF5 is disabled Detailed usage directions are defined in docs/filters.md. For now, the existing filter API is left in place. So filters are defined using ''nc_def_var_filter'' using the HDF5 style where the id and parameters are unsigned integers. This is a big change since filters affect many parts of the code. In the following, the terms "compressor" and "filter" and "codec" are generally used synonymously. ### Filter-Related Changes: * In order to support dynamic loading of shared filter libraries, a new library was added in the libncpoco directory; it helps to isolate dynamic loading across multiple platforms. * Provide a json parsing library for use by plugins; this is created by merging libdispatch/ncjson.c with include/ncjson.h. * Add a new _Codecs attribute to allow clients to see what codecs are being used; let ncdump -s print it out. * Provide special headers to help support compilation of HDF5 filters when HDF5 is not enabled: netcdf_filter_hdf5_build.h and netcdf_filter_build.h. * Add a number of new tests to test the new nczarr filters. * Let ncgen parse _Codecs attribute, although it is ignored. ### Plugin directory changes: * Add support for the Blosc compressor; this is essential because it is the most common compressor used in Zarr datasets. This also necessitated adding a CMake FindBlosc.cmake file * Add NCZarr support for the big-four filters provided by HDF5: shuffle, fletcher32, deflate (zlib), and szip * Add a Codec defaulter (see docs/filters.md) for the big four filters. * Make plugins work with Windows by properly adding __declspec declaration. ### Misc. Non-Filter Changes * Replace most uses of USE_NETCDF4 (deprecated) with USE_HDF5.
* Improve support for caching * More fixes for path conversion code * Fix misc. memory leaks * Add new utility -- ncdump/ncpathcvt -- that does more or less the same thing as cygpath. * Add a number of new tests to test the non-filter fixes. * Update the parsers * Convert most instances of '#ifdef _MSC_VER' to '#ifdef _WIN32'	2021-09-02 17:04:26 -06:00
Edward Hartnett	f880a63f73	added parallel I/O quantize test	2021-09-02 10:18:42 -06:00
Tobias Kölling	b47f61bc25	added test for opening a truncated file using in-memory open This test currently fails. When trying to open an in-memory file which is only partially available (i.e. by accidental truncation), the library should fail with an error instead of an assertion, so that user code can react to this.	2021-08-24 17:04:40 +02:00
Edward Hartnett	9a18689ffa	getting ready for next try at quantization code	2021-08-24 00:45:38 -06:00
Ward Fisher	9f798e2ed6	Merge branch 'virtual_datasets' of https://github.com/d70-t/netcdf-c into gh1983.wif	2021-07-19 09:44:35 -07:00
Dennis Heimbigner	c984b3a428	fix CMake error	2021-04-16 18:54:35 -06:00
Dennis Heimbigner	e07e32d552	Add test case	2021-04-16 16:12:53 -06:00
Tobias Kölling	a80d473ca8	Merge remote-tracking branch 'upstream/master' into virtual_datasets	2020-09-15 09:50:25 +02:00
Tobias Kölling	b6b9f2fd6f	enable test for virtual datasets only for HDF5 >= 1.10.0	2020-09-14 18:06:34 +02:00
Dennis Heimbigner	2f0a6d22e9	Fix error where not converting fill data re: Github Issue https://github.com/Unidata/netcdf-c/issues/1826 It turns out that the common get code (NC4_get_vars) in libhdf5 (and libnczarr) has an optimization where it does not attempt to read from the file if the file is all fill values. Rather it just fills the output buffer with the fill value. The problem is that -- in that case -- it forgets that conversion might still be needed. So the conversion never occurs and the raw bits of the fill data are stored directly into the memory space. Solution: move some code around to properly do the conversion no matter how the data was obtained. Added test cases nc_test4/test_fillonly.sh and nczarr_test/test_fillonlyz.sh	2020-09-12 14:49:59 -06:00
Tobias Kölling	1a7dd2332d	added test for reading virtual datasets	2020-07-22 17:55:28 +02:00
Greg Sjaardema	102758d3ce	Remove test since file was moved to nc_perf In commit ba6ab3, the `tst_gfs_data_1.c` file was moved from `nc_test4` to `nc_perf`, but the test/executable that uses that file was not removed from nc_test4 CMakeLists.txt	2020-07-10 15:14:12 -06:00
Edward Hartnett	b13fcf96e7	added tst_gfs_data1	2020-06-28 19:38:18 -06:00
Dennis Heimbigner	84c69afca7	Allow redefinition of variable filters re: Github issue https://github.com/Unidata/netcdf-c/issues/1713 If nc_def_var_filter or nc_def_var_deflate or nc_def_var_szip is called multiple times with the same filter id, but possibly with different sets of parameters, then the first invocation is sticky and later invocations are ignored. The desired behavior is to have the last invocation be used. This PR implements that desired behavior, with some special cases. If you call nc_def_var_deflate multiple times, then the last invocation rule applies with respect to deflate. However, the shuffle filter, if enabled, is always applied just before applying deflate. Misc unrelated changes: 1. Make client-side filters be disabled by default 2. Fix the definition of uintptr_t and use in oc2 and libdap4 3. Add some test cases 4. modify filter order tests to use plugin filters rather than client-side filters	2020-05-11 09:42:31 -06:00
Dennis Heimbigner	44d0dcaad2	Add support for multiple filters per variable. re: https://github.com/Unidata/netcdf-c/issues/1584 Support has been added for multiple filters per variable. This affects a number of components in netcdf. The new APIs are documented in NUG/filters.md. The primary changes are: * A set of new functions are provided (see __include/netcdf_filter.h__). - Obtain a list of the filters associated with a variable - Obtain the parameters for a specific filter. * The existing __nc_inq_var_filter__ function now returns info about the first defined filter. * The utilities (ncgen, ncdump, and nccopy) now support an extended format for specifying a sequence of filters. The general form is __<filter>|<filter>...__. * The ncdump _Filter attribute now dumps a list of all the filters associated with a variable using the above new format. * Filter specifications can now use a filter name instead of number for filters known to the netcdf library, which in turn is taken from the HDF5 filter registration page. * New errors are defined: NC_EFILTER and NC_ENOFILTER. The latter is returned if an attempt is made to access an unknown filter. * Internally, the dispatch table has been extended to add a function to handle all of the filter functions. * New, filter-related, tests were added to nc_test4. * A new plugin was added to the plugins directory to help with testing. Notes: 1. The shuffle and fletcher32 filters are not part of the multifilter system. Misc. changes: 1. A debug module was added to libhdf5 to help catch error locations.	2020-02-16 12:59:33 -07:00
Edward Hartnett	cf74f49fdb	moved tst_parallel_zlib2 to tst_parallel_compress	2020-02-05 17:40:31 -07:00
Edward Hartnett	6d2d92ddec	added new tests to cmake, also additional test development	2019-12-20 06:30:23 -07:00
Edward Hartnett	cde58d23c1	allowing parallel write of gzipped data, plus added test	2019-12-19 09:20:20 -07:00
edwardhartnett	cebe84157b	adding test for anonymous dims in HDF5 file	2019-11-13 12:51:34 -07:00
Ward Fisher	9a92201c94	Wiring unit test directory into cmake-based builds.	2019-08-21 14:50:09 -06:00
Even Rouault	0c7be1d278	Add test case for bugfix of #1442	2019-07-18 02:23:43 +02:00
Ward Fisher	44fce0904e	Merge pull request #1387 from Unidata/fixffilter.dmh Minor config.h changes to support filters in Fortran	2019-05-01 14:44:53 -06:00
Ward Fisher	5410967b00	Merge branch 'master' into addfilter.dmh	2019-04-30 14:51:25 -06:00
Dennis Heimbigner	62feacee00	missing cmake file	2019-04-29 20:55:28 -06:00
Dennis Heimbigner	8d0bced60d	Allow in-line definition of filters Priority: Low re: issue https://github.com/Unidata/netcdf-c/issues/1329 HDF5 has the ability to programmatically define new filters, as opposed to using HDF5_PLUGIN_PATH env variable. This PR adds support for that feature. Not clear how useful this is, though. See docs/filters.md for details.	2019-03-21 11:33:27 -06:00
Ed Hartnett	94d9cd7c8f	don't allow benchmarks for classic only builds	2019-03-18 08:40:18 -06:00
Ed Hartnett	51b8ba10b4	attempting to get cmake build working with new nc_perf directory	2019-03-18 08:09:46 -06:00
Ward Fisher	404f87b8c2	Turned off filterparser test when building static library.	2019-02-20 15:11:06 -07:00
Ed Hartnett	1dd76c996e	added tst_rename3.c for more rename testing	2019-02-02 05:53:45 -07:00
Ward Fisher	462ec93913	Whew! Updated copyright stanza in nc_test4.	2018-12-06 15:27:32 -07:00
Ed Hartnett	8847c843fb	further cleanup for benchmark builds	2018-08-24 12:48:42 -06:00
Ed Hartnett	e1bd6f2c20	fixing cmake benchmarks, also removing unneeded run_bm.sh	2018-08-24 09:04:01 -06:00
Ed Hartnett	77d3a6db22	removed tst_ar5 from Makefile.am and cmake build	2018-08-24 07:11:51 -06:00
Ed Hartnett	1b318a01fb	getting automake build working	2018-08-16 10:55:11 -06:00
Greg Sjaardema	c3f63b6ffe	Enable metadata_perf test in CMake build Currently defaults to enabled, can change top-level CMakeLists.txt file to change default to disabled	2018-07-11 13:40:31 -04:00
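
The shallow versus deep reclaim distinction described in commit 8b9253fef2 can be illustrated with a small standalone sketch. This is not the netcdf-c implementation and does not call the library: `vlen_t`, `make_ragged`, `shallow_reclaim`, `deep_reclaim`, and the block counter are hypothetical names invented for this illustration. It assumes a two-level structure (a VLEN whose elements are themselves VLENs of floats) and a caller-owned top-level vector, which mirrors the convention the commit message describes for nc_reclaim_data().

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a variable-length instance (NOT the library's type):
 * a length plus a heap-allocated payload. */
typedef struct {
    size_t len; /* number of elements in the payload */
    void  *p;   /* heap-allocated payload */
} vlen_t;

/* Number of heap blocks currently allocated, so that leaks are observable. */
static size_t live_blocks = 0;

static void *counted_alloc(size_t nbytes) {
    void *m = calloc(1, nbytes);
    if (m != NULL) live_blocks++;
    return m;
}

static void counted_free(void *m) {
    if (m != NULL) { live_blocks--; free(m); }
}

size_t live_block_count(void) { return live_blocks; }

/* Fill a caller-owned vector: entry i holds i+1 inner vlens of 2 floats each.
 * Returns 0 on success, -1 on allocation failure (cleanup omitted for brevity). */
int make_ragged(vlen_t *vec, size_t count) {
    for (size_t i = 0; i < count; i++) {
        vlen_t *inner = counted_alloc((i + 1) * sizeof(vlen_t));
        if (inner == NULL) return -1;
        vec[i].len = i + 1;
        vec[i].p = inner;
        for (size_t j = 0; j <= i; j++) {
            inner[j].len = 2;
            inner[j].p = counted_alloc(2 * sizeof(float));
            if (inner[j].p == NULL) return -1;
        }
    }
    return 0;
}

/* Deep reclaim: 'depth' is the number of nested vlen levels below this one.
 * Nested payloads are reclaimed first, then the payload itself is freed.
 * The top-level vector 'vec' belongs to the caller and is never freed here. */
void deep_reclaim(vlen_t *vec, size_t count, int depth) {
    for (size_t i = 0; i < count; i++) {
        if (depth > 0 && vec[i].p != NULL)
            deep_reclaim((vlen_t *)vec[i].p, vec[i].len, depth - 1);
        counted_free(vec[i].p);
        vec[i].p = NULL;
        vec[i].len = 0;
    }
}

/* Shallow reclaim: frees only the first-level payloads, so any blocks that
 * those payloads pointed to become unreachable and leak. */
void shallow_reclaim(vlen_t *vec, size_t count) {
    deep_reclaim(vec, count, 0);
}
```

Calling `make_ragged` on a three-element vector and then `shallow_reclaim` leaves the inner float blocks allocated, while `deep_reclaim(vec, 3, 1)` returns the block count to its starting value. The explicit `depth` parameter is a simplification; a real implementation would presumably walk type metadata to decide where nested variable-length data lives.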


142 Commits