netcdf-c

mirror of https://github.com/Unidata/netcdf-c.git synced 2024-12-27 08:49:16 +08:00

Author	SHA1	Message	Date
Dennis Heimbigner	74b40fd788	Upgrade the nczarr code to match Zarr V2 Re: https://github.com/zarr-developers/zarr-python/pull/716 The Zarr version 2 spec has been extended to include the ability to choose the dimension separator in chunk name keys. The set of legal separators has been extended from {'.'} to {'.' '/'}. So now it is possible to use a key like "0/1/2/0" for chunk names. This PR implements this for NCZarr. The V2 spec now says that this separator can be set on a per-variable basis. For now, I have chosen to allow this to be set only globally by adding a key named "ZARR.DIMENSION_SEPARATOR=<char>" in the .daprc/.dodsrc/.ncrc file. Currently, the only legal separator characters are '.' (the default) and '/'. On writing, this key will only be written if its value is different from the default. This change caused problems because a separator of '/' is difficult to parse when keys/paths use '/' as the path separator. A test case was added for this. Additionally, make nczarr enabled by default. This required some additional changes so that if zip and/or the AWS S3 SDK are unavailable, then they are disabled for NCZarr. In addition the following unrelated changes were made. 1. Tested that pure-zarr mode could read an nczarr formatted store. 2. The .rc file handling now merges all known .rc files (.ncrc, .daprc, and .dodsrc) in that order, using those in HOME first, then those in the current directory. For duplicate entries, the later ones override the earlier ones. This change is to remove some of the conflicts inherent in the current .rc file load process. A set of test cases was also added. 3. Re-order tests in configure.ac and CMakeLists.txt so that if libcurl is not found then the other options that depend upon it are properly disabled. 4. I decided that xarray support should be enabled by default for pure zarr. In order to allow disabling, I added a new mode flag "noxarray". 5. Certain tests in nczarr_test depend on use of .dodsrc. In order for these to work when testing in parallel, some inter-test dependencies needed to be added. 6. Improved authorization testing to use changes in thredds.ucar.edu	2021-04-24 19:48:15 -06:00
Dennis Heimbigner	e7c4e7ead1	add zjson fix	2021-04-01 13:56:04 -06:00
Dennis Heimbigner	0b7a5382e7	Codify cross-platform file paths The netcdf-c code has to deal with a variety of platforms: Windows, OSX, Linux, Cygwin, MSYS, etc. These platforms differ significantly in the kind of file paths that they accept. So in order to handle this, I have created a set of replacements for the most common file system operations such as _open_ or _fopen_ or _access_ to manage the file path differences correctly. A more limited version of this idea was already implemented via the ncwinpath.h and dwinpath.c code. So this can be viewed as a replacement for that code. In many cases, the only change that was required was to replace '#include <ncwinpath.h>' with '#include <ncpathmgt.h>' and then replace file operation calls with the NCxxx equivalent from ncpathmgr.h. Note that recently, the ncwinpath.h was renamed ncpathmgmt.h, so this pull request should not require dealing with winpath. The heart of the change is include/ncpathmgmt.h, which provides alternate operations such as NCfopen or NCaccess and which properly parse and rebuild path arguments to work for the platform on which the code is executing. This mostly matters for Windows because of the way that it uses backslash and drive letters, as compared to *nix. One important feature is that the user can do string manipulations on a file path without having to worry too much about the platform because the path management code will properly handle most mixed cases. So one can for example concatenate a path suffix that uses forward slashes to a Windows path and have it work correctly. The conversion code is in libdispatch/dpathmgr.c, and the important function there is NCpathcvt, which does the proper conversions to the local path format. As a rule, most code should just replace its file operations with the corresponding NCxxx ones defined in include/ncpathmgmt.h. These NCxxx functions all call NCpathcvt on their path arguments before executing the actual file operation. In some rare cases, the client may need to directly use NCpathcvt, but this should be avoided as much as possible. If there is a need for supporting a new file operation not already in ncpathmgmt.h, then use the code in dpathmgr.c as a template. Also please notify Unidata so we can include it as a formal part of our supported operations. Also, if you see an operation in the library that is not using the NCxxx form, then please submit an issue so we can fix it. Misc. Changes: * Clean up the utf8 testing code; it is impossible to get some tests to work under Windows using shell scripts; the args do not pass as utf8 but as some other encoding. * Added an extra utf8 test case: test_unicode_path.sh * Add a true test for HDF5 1.10.6 or later because as noted in PR https://github.com/Unidata/netcdf-c/pull/1794, HDF5 changed its Windows file path handling.	2021-03-04 13:41:31 -07:00
Dennis Heimbigner	2afbdbd18f	Add support for the XArray Zarr _ARRAY_DIMENSIONS attribute The XArray implementation that uses Zarr for storage provides a mechanism to simulate named dimensions. It does this by adding a per-variable attribute called _ARRAY_DIMENSIONS. This attribute contains a list of names to be matched against the shape values of the variable. In effect a named dimension is created with the name _ARRAY_DIMENSIONS(i) and length shape(i) for all i in range 0..rank(variable). Both read and write support is provided. This XArray support is only invoked if the mode value of "xarray" is defined. So for example, as in this URL. ```` https://s3.us-west-1.amazonaws.com/bucket/dataset#mode=nczarr,xarray,s3 ```` Note that the "xarray" mode flag also implies mode flag "zarr", so the above is equivalent to this URL. ```` https://s3.us-west-1.amazonaws.com/bucket/dataset#mode=nczarr,zarr,xarray,s3 ```` The primary change to implement this was to unify the handling of dimension references in libnczarr/zsync. A test for this and other pure-zarr features was added as nczarr_test/run_purezarr.sh Other changes: * Make sure distcheck leaves no files around. * Change the special attribute flag DIMSCALEFLAG to HIDDENATTRFLAG to support the xarray attribute. * Annotate the zmap implementations with feature flags such as WRITEONCE (for zip files).	2021-02-24 13:46:11 -07:00
Dennis Heimbigner	e7d5f24078	Add zip file support The primary change is to support the use of a zip file as a storage format. Simultaneously, the .nz4 support is made obsolete. Use of zip requires the libzip support library, so a number of changes to the build files (Makefile.am, CMakeLists.txt) are necessary to locate and incorporate libzip. The nczarr_test tests are also changed to add zip testing. Other changes: * Make sure distcheck leaves no files around. * Add some functions to netcdf_aux to export some functions of libnetcdf. * Add a new error NC_EFOUND as the complement of NC_EEMPTY. * Add tracing support to nclog and use it in libnczarr. * Modify the zmap interface to support the writeonce semantics of zip. * Create a new s3util.c to support a variety of S3 auxiliary functions. * EXTERNL'ize a number of functions so they can be used in s3util. * Add support for the S3 ListObjects CommonPrefixes mechanism to improve search. * Add experimental support for running nczarr X s3 tests against the actual Amazon S3 cloud.	2021-01-28 20:11:01 -07:00
Ward Fisher	31dee0c4da	Revert "Revert "Fix nczarr-experimental: improve build support, disengage hdf5 vs netcdf4 flags, and find AWS libraries""	2020-08-17 19:15:47 -06:00
Ward Fisher	16c27ca13f	Revert "Fix nczarr-experimental: improve build support, disengage hdf5 vs netcdf4 flags, and find AWS libraries"	2020-08-17 15:51:01 -06:00
Dennis Heimbigner	d538cf38c2	Fix nczarr-experimental to better support CMake and find AWS libraries The primary fix is to improve CMake build support. Specific changes include: * CMake: Provide a better solution to locating the AWS SDK libraries; the new way is the preferred method as described in the aws-cpp-sdk documentation. * CMake (and Automake): allow -DENABLE_S3_SDK (default off) to suppress looking for AWS libraries. * CMake: add the complete set of nczarr tests * CMake: add EXTERNL as needed to various .h files. * Improve support for Windows drive letters in paths. * Add nczarr and s3 flags to nc-config * For VisualStudio X nczarr, clean up the NAN+INFINITY handling * Convert _MSC_VER -> _WIN32 and vice versa as needed * NCZarr - support multiple platform paths including windows, cygwin, mingw, etc. * NCZarr - sort the test outputs because different platforms produce directory contents in different orders. One big change concerns netcdf-c/CMakeLists.txt and netcdf-c/configure.ac. In the current versions, it was the case that --disable-hdf5 disabled netcdf-4 (libsrc4). With nczarr, this can no longer be the case because nczarr requires libsrc4 even if libhdf5 is disabled. So, I modified the above files to move the format options (HDF5, NCZarr, HDF4, etc) to a single place near the front of the files. Now it is the case that: * Enabling any of the formats that require libsrc4 also does an implicit --enable-netcdf4. * --disable-netcdf4 | --disable-netcdf-4 now becomes an alias for --disable-hdf5. There are probably some bugs in this change in terms of dependencies between format options. Problems: * CMake S3 support is still not working for Visual Studio * A recent issue points out that there is work to do on handling UTF8 filenames, but that will be addressed in a separate fix. Notes: * Consider converting all of our includes/.h files to use EXTERNL	2020-07-12 12:21:56 -06:00
Dennis Heimbigner	59e04ae071	This PR adds EXPERIMENTAL support for accessing data in the cloud using a variant of the Zarr protocol and storage format. This enhancement is generically referred to as "NCZarr". The data model supported by NCZarr is netcdf-4 minus the user-defined types and the String type. In this sense it is similar to the CDF-5 data model. More detailed information about enabling and using NCZarr is described in the document NUG/nczarr.md and in a [Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in). WARNING: this code has had limited testing, so do not use this version for production work. Also, performance improvements are ongoing. Note especially the following platform matrix of successful tests (Platform | Build System | S3 support): Linux+gcc | Automake | yes; Linux+gcc | CMake | yes; Visual Studio | CMake | no. Additionally, and as a consequence of the addition of NCZarr, major changes have been made to the Filter API. NOTE: NCZarr does not yet support filters, but these changes are enablers for that support in the future. Note that it is possible (probable?) that there will be some accidental reversions if the changes here did not correctly mimic the existing filter testing. In any case, previously filter ids and parameters were of type unsigned int. In order to support the more general zarr filter model, this was all converted to char. The old HDF5-specific, unsigned int operations are still supported but they are wrappers around the new, char based nc_filterx_XXX functions. This entailed at least the following changes: 1. Added the files libdispatch/dfilterx.c and include/ncfilter.h 2. Some filterx utilities have been moved to libdispatch/daux.c 3. A new entry, "filter_actions" was added to the NCDispatch table and the version bumped. 4. An overly complex set of structs was created to support funnelling all of the filterx operations through a single dispatch "filter_actions" entry. 5. Move common code from libhdf5 to libsrc4 so that it is accessible to nczarr. Changes directly related to Zarr: 1. Modified CMakeLists.txt and configure.ac to support both C and C++ -- this is in support of S3 support via the aws-sdk libraries. 2. Define a size64_t type to support nczarr. 3. More reworking of libdispatch/dinfermodel.c to support zarr and to regularize the structure of the fragments section of a URL. Changes not directly related to Zarr: 1. Make client-side filter registration be conditional, with default off. 2. Hack include/nc4internal.h to make some flags added by Ed be unique: e.g. NC_CREAT, NC_INDEF, etc. 3. Clean up include/nchttp.h and libdispatch/dhttp.c. 4. Misc. changes to support compiling under Visual Studio including: * Better testing under Windows for dirent.h and opendir and closedir. 5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags and to centralize error reporting. 6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them. 7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible. Changes Left TO-DO: 1. Fix provenance code, it is too HDF5 specific.	2020-06-28 18:02:47 -06:00
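
Commit 74b40fd788 describes chunk keys whose indices are joined by a dimension separator of either '.' or '/', and notes that '/' is awkward because it is also the path separator. The following is a minimal Python sketch of that idea, not the NCZarr implementation (which is in C). The function names `chunk_key` and `split_chunk_path` are hypothetical, and the sketch assumes the variable's rank is known when a path is split.

```python
LEGAL_SEPARATORS = (".", "/")  # the two separators named in the commit message


def chunk_key(indices, sep="."):
    """Join chunk indices into a key such as "0.1.2.0" or "0/1/2/0"."""
    if sep not in LEGAL_SEPARATORS:
        raise ValueError(f"illegal dimension separator: {sep!r}")
    return sep.join(str(i) for i in indices)


def split_chunk_path(path, rank, sep="."):
    """Split a '/'-delimited path into (variable path, chunk indices).

    Hypothetical helper: with sep == '/', the chunk indices cannot be told
    apart from path segments by syntax alone, so the last `rank` segments
    are taken to be the indices (rank is assumed to be at least 1).
    """
    if sep not in LEGAL_SEPARATORS:
        raise ValueError(f"illegal dimension separator: {sep!r}")
    segments = path.split("/")
    if sep == "/":
        var_segments, index_segments = segments[:-rank], segments[-rank:]
    else:
        var_segments, index_segments = segments[:-1], segments[-1].split(sep)
    if len(index_segments) != rank:
        raise ValueError("chunk key does not match the variable's rank")
    return "/".join(var_segments), tuple(int(s) for s in index_segments)
```

With the default separator the last path segment is the whole chunk key; with '/' the split depends on the rank, which is one way to see why the commit calls this case difficult to parse.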

9 Commits
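
Commit 2afbdbd18f describes two rules: the `_ARRAY_DIMENSIONS` attribute pairs each name with the matching entry of the variable's shape, and the "xarray" mode flag implies "zarr". The sketch below illustrates both rules in Python under stated assumptions. The helper names `named_dimensions` and `mode_flags` are hypothetical, and the fragment parsing handles only the `mode=a,b,c` form shown in the commit message; the actual logic lives in the C library.

```python
from urllib.parse import urlsplit


def named_dimensions(array_dimensions, shape):
    """Pair name i from _ARRAY_DIMENSIONS with length shape[i]."""
    if len(array_dimensions) != len(shape):
        raise ValueError("_ARRAY_DIMENSIONS must have one name per dimension")
    return dict(zip(array_dimensions, shape))


def mode_flags(url):
    """Return the set of mode flags in a URL fragment like '#mode=nczarr,xarray,s3'."""
    flags = set()
    for pair in urlsplit(url).fragment.split("&"):
        key, _, value = pair.partition("=")
        if key == "mode":
            flags.update(f for f in value.split(",") if f)
    if "xarray" in flags:
        flags.add("zarr")  # per the commit message, "xarray" implies "zarr"
    return flags
```

Under this sketch, the two example URLs in the commit message produce the same flag set, which matches the commit's statement that they are equivalent.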