netcdf-c

mirror of https://github.com/Unidata/netcdf-c.git synced 2024-12-21 08:39:46 +08:00

Author	SHA1	Message	Date
Dennis Heimbigner	231ae96c4b	Add support for Zarr string type to NCZarr * re: https://github.com/Unidata/netcdf-c/pull/2278 * re: https://github.com/Unidata/netcdf-c/issues/2485 * re: https://github.com/Unidata/netcdf-c/issues/2474 This PR subsumes PR https://github.com/Unidata/netcdf-c/pull/2278. It is actually a bit of an omnibus covering several issues. ## PR https://github.com/Unidata/netcdf-c/pull/2278 Add support for the Zarr string type. Zarr strings are currently restricted to be of fixed size. The primary issue to be addressed is to provide a way for the user to specify the size of the fixed length strings. This is handled by providing the following new special attributes: 1. _nczarr_default_maxstrlen: This is an attribute of the root group. It specifies the default maximum string length for string types. If not specified, then it has the value of 64 characters. 2. _nczarr_maxstrlen: This is a per-variable attribute. It specifies the maximum string length for the string type associated with the variable. If not specified, then it is assigned the value of _nczarr_default_maxstrlen. This PR also requires some hacking to handle the existing netcdf-c NC_CHAR type, which does not exist in zarr. The goal was to choose numpy types for both the netcdf-c NC_STRING type and the netcdf-c NC_CHAR type such that if a pure zarr implementation read them, it would still work and an NC_CHAR type would be handled by zarr as a string of length 1. For writing variables and NCZarr attributes, the type mapping is as follows: * "|S1" for NC_CHAR. * ">S1" for NC_STRING && MAXSTRLEN==1 * ">Sn" for NC_STRING && MAXSTRLEN==n Note that it is a bit of a hack to use endianness, but it should be ok since for string/char, the endianness has no meaning. For reading attributes with pure zarr (i.e. with no nczarr attribute types defined), they will always be interpreted as of type NC_CHAR. 
## Issue: https://github.com/Unidata/netcdf-c/issues/2474 This PR partly fixes this issue because it provides more comprehensive support for Zarr attributes that are JSON valued expressions. This PR still does not address the problem in that issue where the _ARRAY_DIMENSION attribute is incorrectly set. That can only be fixed by the creator of the datasets. ## Issue: https://github.com/Unidata/netcdf-c/issues/2485 This PR also fixes the scalar failure shown in this issue. It generally cleans up scalar handling. It also adds a note to the documentation describing that NCZarr supports scalars while Zarr does not and also how scalar interoperability is achieved. ## Misc. Other Changes 1. Convert the nczarr special attributes and keys to be all lower case. So "_NCZARR_ATTR" is now "_nczarr_attr". Support back compatibility for the upper case names. 2. Cleanup my too-clever-by-half handling of scalars in libnczarr.	2022-08-27 20:21:13 -06:00
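The type mapping and the max-string-length defaulting described in commit 231ae96c4b can be sketched as follows. This is an illustration only, not the libnczarr code: `sketch_type`, `sketch_maxstrlen`, and `sketch_dtype` are hypothetical names, and a value of 0 is assumed to mean "attribute not specified".

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for NC_CHAR and NC_STRING; illustration only. */
typedef enum { SKETCH_CHAR, SKETCH_STRING } sketch_type;

#define SKETCH_DEFAULT_MAXSTRLEN 64 /* default stated in the commit message */

/* Resolve the effective length: per-variable value, else root default, else 64.
   A value of 0 is assumed to mean "attribute not specified". */
static size_t sketch_maxstrlen(size_t var_maxstrlen, size_t root_default)
{
    if (var_maxstrlen != 0) return var_maxstrlen;
    if (root_default != 0) return root_default;
    return SKETCH_DEFAULT_MAXSTRLEN;
}

/* Write the numpy dtype string into buf; return 0 on success, -1 if it does not fit. */
static int sketch_dtype(sketch_type t, size_t maxstrlen, char *buf, size_t buflen)
{
    int n;
    if (t == SKETCH_CHAR)
        n = snprintf(buf, buflen, "|S1");            /* NC_CHAR */
    else
        n = snprintf(buf, buflen, ">S%zu", maxstrlen); /* NC_STRING of length n */
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

The two string cases in the commit message (MAXSTRLEN==1 and MAXSTRLEN==n) collapse into one branch here, since ">S1" is just ">Sn" with n equal to 1.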
Ward Fisher	0164512b0f	Merge branch 'tinyxml2.dmh' of https://github.com/DennisHeimbigner/netcdf-c into gh2170.wif	2022-03-29 11:31:31 -06:00
Dennis Heimbigner	6d44ec39f6	1. Fix conflicts with current master. 2. There is a bug in building tinyxml2 under OSX, so as a hack, the absence of an installed libxml2 under OSX will disable libxml2 and DAP4.	2022-03-15 15:33:13 -06:00
Dennis Heimbigner	3ffe7be446	Enhance/Fix filter support re: Discussion https://github.com/Unidata/netcdf-c/discussions/2214 The primary change is to support so-called "standard filters". A standard filter is one that is defined by the following netcdf-c API: ```` int nc_def_var_XXX(int ncid, int varid, size_t nparams, unsigned* params); int nc_inq_var_XXXX(int ncid, int varid, int* usefilterp, unsigned* params); ```` So for example, zstandard would be a standard filter by defining the functions nc_def_var_zstandard and nc_inq_var_zstandard. In order to define these functions, we need a new dispatch function: ```` int nc_inq_filter_avail(int ncid, unsigned filterid); ```` This function, combined with the existing filter API can be used to implement arbitrary standard filters using a simple code pattern. Note that I would have preferred that this function return a list of all available filters, but HDF5 does not support that functionality. So this PR implements the dispatch function and implements the following standard functions: + bzip2 + zstandard + blosc Specific test cases are also provided for HDF5 and NCZarr. Over time, other specific standard filters will be defined. ## Primary Changes * Add nc_inq_filter_avail() to netcdf-c API. * Add standard filter implementations to test use of nc_inq_filter_avail. * Bump the dispatch table version number and add to all the relevant dispatch tables (libsrc, libsrcp, etc). * Create a program to invoke nc_inq_filter_avail so that it is accessible to shell scripts. * Cleanup szip support to properly support szip when HDF5 is disabled. This involves detecting libsz separately from testing if HDF5 supports szip. * Integrate shuffle and fletcher32 into the existing filter API. This means that, for example, nc_def_var_fletcher32 is now a wrapper around nc_def_var_filter. * Extend the Codec defaulting to allow multiple default shared libraries. ## Misc. 
Changes * Modify configure.ac/CMakeLists.txt to look for the relevant libraries implementing standard filters. * Modify libnetcdf.settings to list available standard filters (including deflate and szip). * Add CMake test modules to locate libbz2 and libzstd. * Cleanup the HDF5 memory manager function use in the plugins. * remove unused file include//ncfilter.h * remove tests for the HDF5 memory operations e.g. H5allocate_memory. * Add flag to ncdump to force use of _Filter instead of _Deflate or _Shuffle or _Fletcher32. Used for testing.	2022-03-14 12:39:37 -06:00
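The "simple code pattern" mentioned in commit 3ffe7be446 (check availability first, then delegate to the generic filter API) might look roughly like the sketch below. The functions here are local stand-ins so the example compiles without netcdf-c; they are not the library's implementations, and the single-parameter layout is an assumption made for illustration.

```c
#include <stddef.h>

#define SKETCH_ZSTD_ID 32015u /* HDF5-registered filter id for Zstandard */

/* Stand-in for an availability query such as nc_inq_filter_avail():
   returns 0 when the filter can be used, nonzero otherwise.
   In this sketch only Zstandard is "installed". */
static int sketch_filter_avail(unsigned id)
{
    return id == SKETCH_ZSTD_ID ? 0 : -1;
}

/* Stand-in for the generic filter definition call; it only records the request. */
static unsigned sketch_last_id;
static unsigned sketch_last_param;
static int sketch_def_var_filter(int varid, unsigned id, size_t nparams, const unsigned *params)
{
    (void)varid;
    sketch_last_id = id;
    sketch_last_param = nparams > 0 ? params[0] : 0u;
    return 0;
}

/* The pattern: a per-filter wrapper that refuses early if the filter is missing. */
static int sketch_def_var_standard(int varid, unsigned id, unsigned level)
{
    unsigned params[1];
    int stat = sketch_filter_avail(id);
    if (stat != 0) return stat; /* filter not available: nothing is defined */
    params[0] = level;
    return sketch_def_var_filter(varid, id, 1, params);
}
```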
Dennis Heimbigner	4077594a55	Remove conflicts; does not work with OSX	2022-01-31 17:16:23 -07:00
Milton Woods	b33a6348f1	Merge branch 'main' into mingw-w64-strcasecmp	2022-01-11 10:45:15 +11:00
Dennis Heimbigner	55a2643cac	Fix a number of OS specific bugs 1. Issue https://github.com/Unidata/netcdf-c/issues/2043 * FreeBSD build fails because of conflicts in defining the fileno() function. So removed all extern declarations of fileno. 2. Issue https://github.com/Unidata/netcdf-c/issues/2124 * There were a couple of problems here. * I was conflating msys with mingw and they need separate handling of paths. So treat mingw like windows. * memio.c was not always writing the full content of the memory to file. Untested fix by properly accounting for zero size writes. * Fix bug when skipping white space in tst_xcache.c 3. Issue https://github.com/Unidata/netcdf-c/pull/2105 * On MINGW, bash and other POSIX utilities use a mounted root directory, but executables compiled for Windows do not recognise the mount point. Ensure that Windows paths are used in tests of Windows executables. 4. Issue https://github.com/Unidata/netcdf-c/issues/2132 * Apparently the Intel C compiler on OSX defines isnan etc. So disable declaration in dutil.c under that condition. 5. Fix and re-enable test_rcmerge.sh by allowing override of where to look for .rc files 6. CMakeLists.txt suppresses certain ncdump directory tests because of differences in printing floats/doubles. * Extend the list to include those that also fail under mingw. * Suppress the mingw tests in ncdump/Makefile.am	2021-11-03 12:49:54 -06:00
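Commit 55a2643cac mentions that memio.c did not always write the full memory content to file. The actual fix is not shown in this log; the sketch below is only one common way to write a whole buffer with POSIX `write()`, looping over partial writes and treating a zero-byte result as an error instead of spinning. `sketch_write_all` is a hypothetical name.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write all len bytes of buf to fd. Returns 0 on success, -1 on failure. */
static int sketch_write_all(int fd, const void *buf, size_t len)
{
    const char *p = (const char *)buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue; /* interrupted: retry */
            return -1;
        }
        if (n == 0) return -1; /* no progress: report it rather than loop forever */
        p += n;                /* partial write: advance past what was written */
        len -= (size_t)n;
    }
    return 0;
}
```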
Ward Fisher	7ec0ac0a08	Merge branch 'main' into mingw-w64-strcasecmp	2021-10-01 17:07:37 -05:00
Dennis Heimbigner	ca3dfe43b7	Fix FreeBSD fileno problem in the ncgen parsers	2021-09-28 14:03:19 -06:00
Milton Woods	7b3d71b718	Define strcasecmp for Windows in ncconfigure.h	2021-09-05 17:13:23 +10:00
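A definition like the one named in commit 7b3d71b718 is commonly written as shown below. This is a sketch of the usual idiom, assuming the Windows C runtime's `_stricmp`; it is not a copy of what ncconfigure.h contains, and `sketch_same_name` is a hypothetical helper.

```c
#include <string.h>

#ifdef _WIN32
/* The Windows C runtime provides _stricmp rather than POSIX strcasecmp. */
#ifndef strcasecmp
#define strcasecmp _stricmp
#endif
#else
#include <strings.h> /* POSIX home of strcasecmp */
#endif

/* Case-insensitive name comparison that builds on either platform. */
static int sketch_same_name(const char *a, const char *b)
{
    return strcasecmp(a, b) == 0;
}
```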
Dennis Heimbigner	11fe00ea05	Add filter support to NCZarr Filter support has three goals: 1. Use the existing HDF5 filter implementations, 2. Allow filter metadata to be stored in the NumCodecs metadata format used by Zarr, 3. Allow filters to be used even when HDF5 is disabled Detailed usage directions are defined in docs/filters.md. For now, the existing filter API is left in place. So filters are defined using ''nc_def_var_filter'' using the HDF5 style where the id and parameters are unsigned integers. This is a big change since filters affect many parts of the code. In the following, the terms "compressor" and "filter" and "codec" are generally used synonymously. ### Filter-Related Changes: * In order to support dynamic loading of shared filter libraries, a new library was added in the libncpoco directory; it helps to isolate dynamic loading across multiple platforms. * Provide a json parsing library for use by plugins; this is created by merging libdispatch/ncjson.c with include/ncjson.h. * Add a new _Codecs attribute to allow clients to see what codecs are being used; let ncdump -s print it out. * Provide special headers to help support compilation of HDF5 filters when HDF5 is not enabled: netcdf_filter_hdf5_build.h and netcdf_filter_build.h. * Add a number of new tests to test the new nczarr filters. * Let ncgen parse _Codecs attribute, although it is ignored. ### Plugin directory changes: * Add support for the Blosc compressor; this is essential because it is the most common compressor used in Zarr datasets. This also necessitated adding a CMake FindBlosc.cmake file * Add NCZarr support for the big-four filters provided by HDF5: shuffle, fletcher32, deflate (zlib), and szip * Add a Codec defaulter (see docs/filters.md) for the big four filters. * Make plugins work with windows by properly adding __declspec declaration. ### Misc. Non-Filter Changes * Replace most uses of USE_NETCDF4 (deprecated) with USE_HDF5. 
* Improve support for caching * More fixes for path conversion code * Fix misc. memory leaks * Add new utility -- ncdump/ncpathcvt -- that does more or less the same thing as cygpath. * Add a number of new tests to test the non-filter fixes. * Update the parsers * Convert most instances of '#ifdef _MSC_VER' to '#ifdef _WIN32'	2021-09-02 17:04:26 -06:00
Dennis Heimbigner	0428c38b1e	Regularize the scoping of types re: Github issue https://github.com/Unidata/netcdf-c/issues/1956 The function NC_compare_nc_types in libdispatch/dcopy.c uses an incorrect algorithm to search for types. The core of this is the function NC_rec_find_nc_type in libdispatch/dcopy.c. Currently it searches the current group and its subtree. Additionally, the function NC4_inq_typeid in libsrc4/nc4internal.c has been extended to handle fully qualified names. It was originally designed to do this, but for some reason never completed. The NC_rec_find_nc_type algorithm has been altered to match the algorithm used by NC4_inq_typeid. It operates as follows. Given a file F, group G and a type T. It searches file F2, group G2, for another type T2 that is equivalent to T. The search order is as follows. 1. Search G2 for a type T2 equivalent to T. 2. Search upwards in the ancestor groups of G2 for a type T2 equivalent to T. 3. Search the complete group tree of F2 in pre-order, breadth-first order to locate T2 equivalent to T. Also add a test case to validate algorithm: ncdump/test_scope.sh. Note, this change may cause compatibility problems, though it is unlikely because two different equivalent type declarations in one dataset is unlikely.	2021-03-06 14:09:37 -07:00
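The three-step search order in commit 0428c38b1e can be sketched on a toy group tree. Everything here is hypothetical (`sketch_group`, `sketch_find_type`), matching by type name stands in for the real type-equivalence test, and step 3 is written as a plain pre-order walk from the root because the exact traversal used by the library is not visible in this log.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAXKIDS 4
#define SKETCH_MAXTYPES 4

/* Toy group node; type names stand in for "equivalent type" checks. */
typedef struct sketch_group {
    const char *name;
    struct sketch_group *parent;
    struct sketch_group *kids[SKETCH_MAXKIDS];
    size_t nkids;
    const char *types[SKETCH_MAXTYPES];
    size_t ntypes;
} sketch_group;

static int sketch_has_type(const sketch_group *g, const char *t)
{
    size_t i;
    for (i = 0; i < g->ntypes; i++)
        if (strcmp(g->types[i], t) == 0) return 1;
    return 0;
}

/* Step 3 helper: pre-order walk of the subtree rooted at g. */
static const sketch_group *sketch_search_tree(const sketch_group *g, const char *t)
{
    size_t i;
    if (sketch_has_type(g, t)) return g;
    for (i = 0; i < g->nkids; i++) {
        const sketch_group *hit = sketch_search_tree(g->kids[i], t);
        if (hit != NULL) return hit;
    }
    return NULL;
}

/* Return the group that declares a type matching t, or NULL if none does. */
static const sketch_group *sketch_find_type(const sketch_group *g2, const char *t)
{
    const sketch_group *g, *root = g2;
    if (sketch_has_type(g2, t)) return g2;      /* 1. the group itself */
    for (g = g2->parent; g != NULL; g = g->parent) {
        if (sketch_has_type(g, t)) return g;    /* 2. ancestors, nearest first */
        root = g;
    }
    return sketch_search_tree(root, t);         /* 3. the whole tree */
}
```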
Dennis Heimbigner	eb3d9eb0c9	Provide a Number of fixes/improvements to NCZarr Primary changes: * Add an improved cache system to speed up performance. * Fix NCZarr to properly handle scalar variables. Misc. Related Changes: * Added unit tests for extendible hash and for the generic cache. * Add config parameter to set size of the NCZarr cache. * Add initial performance tests but leave them unused. * Add CRC64 support. * Move location of ncdumpchunks utility from /ncgen to /ncdump. * Refactor auth support. Misc. Unrelated Changes: * More cleanup of the S3 support * Add support for S3 authentication in .rc files: HTTP.S3.ACCESSID and HTTP.S3.SECRETKEY. * Remove the hashkey from the struct OBJHDR since it is never used.	2020-11-19 17:01:04 -07:00
Dennis Heimbigner	62a4cc1ae0	Fix nccopy -c dim/x to actually use the dim/x value. As it was, nccopy -c dim/x was sometimes being ignored. So modify nccopy to properly take it into account. This also required a change to the nczarr code because it was not applying default chunking in the same way as libhdf5. Modify ncdump/tst_nccopy4.sh to test this feature properly. Also add a similar test to nczarr_test. Additionally, fix some other things that were causing Visual Studio builds with testing to not work. * fix curl testing under CMake to properly handle case where DAP is disabled, but byterange support is enabled. * properly test and/or define uintptr_t * Convert _O_XXX to O_XXX flags used by open()	2020-09-01 13:44:24 -06:00
Dennis Heimbigner	59e04ae071	This PR adds EXPERIMENTAL support for accessing data in the cloud using a variant of the Zarr protocol and storage format. This enhancement is generically referred to as "NCZarr". The data model supported by NCZarr is netcdf-4 minus the user-defined types and the String type. In this sense it is similar to the CDF-5 data model. More detailed information about enabling and using NCZarr is described in the document NUG/nczarr.md and in a [Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in). WARNING: this code has had limited testing, so do not use this version for production work. Also, performance improvements are ongoing. Note especially the following platform matrix of successful tests: Platform | Build System | S3 support ------------------------------------ Linux+gcc | Automake | yes Linux+gcc | CMake | yes Visual Studio | CMake | no Additionally, and as a consequence of the addition of NCZarr, major changes have been made to the Filter API. NOTE: NCZarr does not yet support filters, but these changes are enablers for that support in the future. Note that it is possible (probable?) that there will be some accidental reversions if the changes here did not correctly mimic the existing filter testing. In any case, previously filter ids and parameters were of type unsigned int. In order to support the more general zarr filter model, this was all converted to char. The old HDF5-specific, unsigned int operations are still supported but they are wrappers around the new, char based nc_filterx_XXX functions. This entailed at least the following changes: 1. Added the files libdispatch/dfilterx.c and include/ncfilter.h 2. Some filterx utilities have been moved to libdispatch/daux.c 3. A new entry, "filter_actions" was added to the NCDispatch table and the version bumped. 4. 
An overly complex set of structs was created to support funnelling all of the filterx operations thru a single dispatch "filter_actions" entry. 5. Move common code from libhdf5 to libsrc4 so that it is accessible to nczarr. Changes directly related to Zarr: 1. Modified CMakeLists.txt and configure.ac to support both C and C++ -- this is in support of S3 support via the aws-sdk libraries. 2. Define a size64_t type to support nczarr. 3. More reworking of libdispatch/dinfermodel.c to support zarr and to regularize the structure of the fragments section of a URL. Changes not directly related to Zarr: 1. Make client-side filter registration be conditional, with default off. 2. Hack include/nc4internal.h to make some flags added by Ed be unique: e.g. NC_CREAT, NC_INDEF, etc. 3. cleanup include/nchttp.h and libdispatch/dhttp.c. 4. Misc. changes to support compiling under Visual Studio including: * Better testing under windows for dirent.h and opendir and closedir. 5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags and to centralize error reporting. 6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them. 7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible. Changes Left TO-DO: 1. fix provenance code, it is too HDF5 specific.	2020-06-28 18:02:47 -06:00
Dennis Heimbigner	4e37d68cb1	uintptr for VS	2020-05-11 10:35:56 -06:00
Dennis Heimbigner	84c69afca7	Allow redefinition of variable filters re: Github issue https://github.com/Unidata/netcdf-c/issues/1713 If nc_def_var_filter or nc_def_var_deflate or nc_def_var_szip is called multiple times with the same filter id, but possibly with different sets of parameters, then the first invocation is sticky and later invocations are ignored. The desired behavior is to have the last invocation be used. This PR implements that desired behavior, with some special cases. If you call nc_def_var_deflate multiple times, then the last invocation rule applies with respect to deflate. However, the shuffle filter, if enabled, is always applied just before applying deflate. Misc unrelated changes: 1. Make client-side filters be disabled by default 2. Fix the definition of uintptr_t and use in oc2 and libdap4 3. Add some test cases 4. modify filter order tests to use plugin filters rather than client-side filters	2020-05-11 09:42:31 -06:00
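The "last invocation wins" rule from commit 84c69afca7 can be illustrated with a toy filter chain. The types and `sketch_def_filter` are hypothetical and do not mirror the library's internal structures; the special handling of shuffle before deflate is left out of this sketch.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAXFILTERS 8
#define SKETCH_MAXPARAMS 4

typedef struct {
    unsigned id;
    size_t nparams;
    unsigned params[SKETCH_MAXPARAMS];
} sketch_filter;

typedef struct {
    sketch_filter f[SKETCH_MAXFILTERS];
    size_t n;
} sketch_chain;

/* Define (or redefine) a filter on a variable's chain.
   A repeated id keeps its position but takes the newest parameters. */
static int sketch_def_filter(sketch_chain *c, unsigned id, size_t nparams, const unsigned *params)
{
    size_t i;
    sketch_filter *slot = NULL;
    if (nparams > SKETCH_MAXPARAMS) return -1;
    for (i = 0; i < c->n; i++)
        if (c->f[i].id == id) slot = &c->f[i];
    if (slot == NULL) {
        if (c->n == SKETCH_MAXFILTERS) return -1; /* chain is full */
        slot = &c->f[c->n++];
        slot->id = id;
    }
    slot->nparams = nparams;
    if (nparams > 0) memcpy(slot->params, params, nparams * sizeof params[0]);
    return 0;
}
```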
Ben Boeckel	464e2953a0	windows: detect Windows using the correct define name	2019-11-07 07:55:47 -05:00
Dennis Heimbigner	fbb47d50c1	Fix ncconfigure.h to solve a -ansi problem with strdup() re: https://github.com/Unidata/netcdf-c/issues/1408 1. Add some function tests to configure.ac; these are functions not defined with -ansi. 2. When using -ansi, fix include/ncconfigure.h to check for the possibility that certain functions are being defined by macros. Apparently Debian does this for some reason. No idea why. Unrelated: modify the debug/cf.cmake debug shell script.	2019-05-29 14:35:29 -06:00
Dennis Heimbigner	4026323383	Fix minor --ansi warnings in dinfermodel.c and bzlib.c re: Needed to provide centralized definitions of fileno and fdopen; also need to #include sys/types.h	2019-03-22 15:16:47 -06:00
Dennis Heimbigner	45a8a265b8	master merge	2019-02-23 17:14:12 -07:00
Ward Fisher	288c9a7c52	Update to fix an environmental issue in ncconfigure.h	2019-02-19 14:42:34 -07:00
Ward Fisher	1fde39c8d7	Merge branch 'master' into byterange.dmh	2019-02-07 14:28:23 -07:00
Ward Fisher	b27c7d899d	Merge branch 'master' into byterange.dmh	2019-01-25 14:50:23 -07:00
Ward Fisher	688c06d50c	Merge branch 'master' into fix-warnings	2019-01-23 15:12:20 -07:00
Ben Boeckel	eda449b947	ncconfigure: add NC_UNUSED macro This should be used to indicate that a variable is unused in a codeblock.	2019-01-16 15:51:50 -05:00
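A macro of the kind described in commit eda449b947 is usually a cast to void. The definition below is the common idiom and is assumed here, not copied from ncconfigure.h; `sketch_callback` is a hypothetical function whose signature is fixed by some caller even though one argument is not needed.

```c
#include <stddef.h>

/* Assumed definition: evaluate the name as a void expression so that
   "unused parameter/variable" warnings are silenced. */
#ifndef NC_UNUSED
#define NC_UNUSED(var) (void)var
#endif

/* A callback that must accept a context pointer but does not use it. */
static int sketch_callback(int value, void *context)
{
    NC_UNUSED(context);
    return value * 2;
}
```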
Ward Fisher	d66d642e24	Corrected an issue observed on OSX	2019-01-15 14:36:07 -07:00
Dennis Heimbigner	bf2746b8ea	Provide byte-range reading of remote datasets re: issue https://github.com/Unidata/netcdf-c/issues/1251 Assume that you have the URL to a remote dataset which is a normal netcdf-3 or netcdf-4 file. This PR allows netcdf-c to read that dataset's contents as a netcdf file using HTTP byte ranges if the remote server supports byte-range access. Originally, this PR was set up to access Amazon S3 objects, but it can also access other remote datasets such as those provided by a Thredds server via the HTTPServer access protocol. It may also work for other kinds of servers. Note that this is not intended as a true production capability because, as is known, this kind of access can be quite slow. In addition, the byte-range IO drivers do not currently do any sort of optimization or caching. An additional goal here is to gain some experience with the Amazon S3 REST protocol. This architecture and its use are documented in the file docs/byterange.dox. There are currently two test cases: 1. nc_test/tst_s3raw.c - this does a simple open, check format, close cycle for a remote netcdf-3 file and a remote netcdf-4 file. 2. nc_test/test_s3raw.sh - this uses ncdump to investigate some remote datasets. This PR also incorporates significantly changed model inference code (see the superseded PR https://github.com/Unidata/netcdf-c/pull/1259). 1. It centralizes the code that infers the dispatcher. 2. It adds support for byte-range URLs Other changes: 1. NC_HDF5_finalize was not being properly called by nc_finalize(). 2. Fix minor bug in ncgen3.l 3. fix memory leak in nc4info.c 4. add code to walk the .daprc triples and to replace protocol= fragment tag with a more general mode= tag. Final Note: The inference code is still way too complicated. We need to move to the validfile() model used by netcdf Java, where each dispatcher is asked if it can process the file. This decentralizes the inference code. 
This will be done after all the major new dispatchers (PIO, Zarr, etc) have been implemented.	2019-01-01 18:27:36 -07:00
Dennis Heimbigner	735ae80928	merge master and fix conflicts	2018-12-12 11:47:54 -07:00
Ward Fisher	5be0126920	More standardizing of the copyright stanza.	2018-12-06 14:13:56 -07:00
Dennis Heimbigner	1a7531392f	Make the netcdf-c library compile with gcc -ansi. Primary fixes to get -ansi to work. 1. Convert all '//' C++ style comments to /*...*/ or to use #if 0...#endif 2. It turns out that when -ansi is specified, then a number of functions no longer are defined in the header -- but they are still in the .so file. The big example is strdup(). So, added code to include/ncconfig.h to define externs for those missing functions that occur in more than one place. These are enabled if !_WIN32 && __STDC__ == 1 (__STDC__ is supposed to be the equivalent compile time flag to -ansi). Note that this requires config.h (which references ncconfig.h) to be included in files where it is currently not included. Single uses will be only in the file that uses them. 3. Added mmap test for the MAP_ANONYMOUS flag to configure.ac. Apparently this is not always defined with -ansi. 4. fix some large integer constants in nc_test4/tst_atts3.c and nc_test4/tst_filterparser.c to avoid compiler complaints. 5. fix a double constant in nc_test4/tst_filterparser.c to avoid compiler complaints. [Note I suspect #4 and #5 will be a problem on big-endian machines, but we have no way to test] Misc. Changes: 1. convert more instances of _MSC_VER to _WIN32. 2. added some debugging code to include/nctestserver.h 3. added comment about libdispatch/drc.c always being compiled. 4. modify parser generation in ncgen to remove unneeded files.	2018-12-05 19:20:43 -07:00
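The extern workaround from commit 1a7531392f, combined with the macro check added later in fbb47d50c1, might be sketched as below. The guard condition is the one quoted in the commit message; the `#ifndef strdup` test and the helper `sketch_copy` are assumptions for illustration, not the header's actual contents.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Under -ansi the header may omit the prototype although libc still
   exports the symbol, so declare it by hand unless it is a macro. */
#if !defined(_WIN32) && __STDC__ == 1
#ifndef strdup
extern char *strdup(const char *);
#endif
#endif

/* Duplicate a string, passing NULL through unchanged; caller frees the result. */
static char *sketch_copy(const char *s)
{
    return s != NULL ? strdup(s) : NULL;
}
```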
Ward Fisher	9bea284949	Fixed a shell script error on vs15	2018-05-14 14:37:07 -06:00
Ward Fisher	ed6de1be89	Fixed an issue where ssize_t was being either incorrectly type-def'd, or double type-def'd	2018-03-26 15:35:48 -06:00
Dennis Heimbigner	8cb1fc4cfe	This is the second step in refactoring the libsrc4 code. The first was branch newhash0.dmh. As with newhash0.dmh, these changes should be transparent.	2018-02-24 20:36:24 -07:00
Dennis Heimbigner	4db4393e69	Begin changing over to use strlcat instead of strncat because strlcat provides better protection against buffer overflows. Code is taken from the FreeBSD project source code. Specifically: https://github.com/freebsd/freebsd/blob/master/lib/libc/string/strlcat.c License appears to be acceptable, but needs to be checked by e.g. Debian. Step 1: 1. Add to netcdf-c/include/ncconfigure.h to use our version if not already available as determined by HAVE_STRLCAT in config.h. 2. Add the strlcat code to libdispatch/dstring.c 3. Turns out that strlcat was already defined in several places. So remove it from: ncgen3/genlib.c ncdump/dumplib.c 3. Define strlcat extern definition in ncconfigure.h. 4. Modify following directories to use strlcat: libdap2 libdap4 ncdap_test dap4_test Will do others in subsequent steps.	2017-11-23 10:55:24 -07:00
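To show why commit 4db4393e69 prefers strlcat over strncat, here is a sketch of strlcat's semantics: the size argument is the full destination size, the result is always NUL-terminated when there is room, and the return value reveals truncation. `sketch_strlcat` is a hypothetical reimplementation written for this note, not the FreeBSD code the commit imports.

```c
#include <stddef.h>
#include <string.h>

/* Append src to dst, where dstsize is the total size of the dst buffer.
   Returns the length the combined string would have had; a result
   >= dstsize means the output was truncated. */
static size_t sketch_strlcat(char *dst, const char *src, size_t dstsize)
{
    size_t dlen = 0;
    size_t slen = strlen(src);
    size_t ncopy;

    /* Find the end of dst without reading past dstsize. */
    while (dlen < dstsize && dst[dlen] != '\0') dlen++;
    if (dlen == dstsize) return dstsize + slen; /* dst is not terminated: leave it alone */

    ncopy = dstsize - dlen - 1; /* room left, keeping one byte for the NUL */
    if (slen < ncopy) ncopy = slen;
    memcpy(dst + dlen, src, ncopy);
    dst[dlen + ncopy] = '\0';
    return dlen + slen;
}
```

With strncat the caller has to compute the remaining space by hand, which is where off-by-one overflows tend to creep in; here the caller passes `sizeof buf` and compares the result against it.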
dmh	859f105005	merge-squash	2015-08-15 16:26:35 -06:00
Ward Fisher	3b610ec445	Removed two-slash style comments, replaced with legal comment blocks.	2014-08-25 15:14:10 -06:00
Ward Fisher	10807379c3	Merging from CMake branch in preparation for 4.3.0 release.	2013-04-29 20:15:57 +00:00
Ward Fisher	67f96188ff	Merged latest from netCDF-cmake branch in preparation for 4.3.0 release.	2013-04-23 21:50:07 +00:00
Ward Fisher	b113f6f8b6	Merged a handful of changes from netcdf-cmake branch. Addressed the following coverity issues: 711762 711763 711766 711788 711933 711934 711935	2013-04-16 23:02:54 +00:00
Ward Fisher	4a274b9870	Merged latest changes from cmake development branch. Addressed a number of memory-related problems reported by Coverity.	2013-04-02 22:09:31 +00:00
Ed Hartnett	fe653a4333	fixed nulldap problem in include file	2011-09-30 15:56:02 +00:00
Dennis Heimbigner	a7fdbb176d	Fixed nulldup issues: Jira # NCF-94	2011-07-14 22:43:43 +00:00
Dennis Heimbigner	4f30d3694c	cleanup misc. issues	2011-07-14 22:24:02 +00:00

44 Commits