netcdf-c

mirror of https://github.com/Unidata/netcdf-c.git synced 2024-12-15 08:30:11 +08:00

Author	SHA1	Message	Date
Ward Fisher	5973f3d683	Merge pull request #2847 from K20shores/packaging Use cmake netCDF with target_* for many options	2024-03-11 15:55:36 -06:00
Ward Fisher	cc1494d988	Merge branch 'main' into awsdfalt.dmh	2024-03-05 12:50:08 -07:00
Kyle Shores	c29db073eb	setting dll export on each target	2024-02-29 11:36:47 -06:00
Peter Hill	907e5cc43f	CMake: Use `target_link_libraries` with `HDF5::HDF5` target	2024-02-16 10:51:20 +00:00
Kyle Shores	f9e3247164	merging main, addressing some PR comments	2024-02-07 09:53:45 -06:00
Kyle Shores	f41178f6e0	adding hdf5 to nczarr	2024-01-26 16:48:50 -06:00
Ward Fisher	16bcb1ddb9	Merge branch 'silence-nclist-warnings' of https://github.com/ZedThree/netcdf-c into rebase-gh2812.wif	2024-01-19 11:11:21 -07:00
Ward Fisher	c1fb4b0bae	Merge pull request #2809 from ZedThree/silence-malloc-warnings Silence conversion warnings from `malloc` arguments	2023-12-21 17:20:56 -07:00
Ward Fisher	6a628b9ca7	Rebased PR by hand against main.	2023-12-12 16:49:13 -07:00
wkliao	52198b3f12	count argument in H5Sselect_hyperslab Argument 'count' in NetCDF is not exactly the same as the 'count' in H5Sselect_hyperslab(space_id, op, start, stride, count, block). When the argument 'stride' is NULL, NetCDF's 'count' should be used in argument 'block', for example, H5Sselect_hyperslab(space_id, op, start, NULL, ones, count); where 'ones' is an array of all 1s. Although using NULL 'block' below H5Sselect_hyperslab(space_id, op, start, NULL, count, NULL); has the same effect, HDF5 internally stores the space of a subarray as a list of single elements, instead of a "block", which can affect the performance.	2023-12-12 16:45:19 -07:00
Ward Fisher	2616e2c411	Merge pull request #2745 from e-kwsm/chmod-x chore: unset executable flag	2023-12-11 17:28:46 -07:00
Sean McBride	adc4dc1435	Replaced some sprintf with snprintf with aid of new variable containing size One case required slightly complicated accounting of how much space is left in the buffer.	2023-12-08 13:30:38 -05:00
Dennis Heimbigner	b83ff8fc2b	oops	2023-12-07 15:48:39 -07:00
Dennis Heimbigner	80dd2e95c6	Fix truncate code for S3	2023-12-07 15:38:46 -07:00
Dennis Heimbigner	d81d95a5cc	unthrow	2023-12-02 21:03:25 -07:00
Peter Hill	4e1ff160e1	Change signature of `nczm_sortenvv` to take `size_t` Always called with a `size_t` and passes `n` to `qsort` which expects a `size_t` anyway	2023-11-28 16:28:31 +00:00
Peter Hill	653e09fd6d	Try to more consistently use `size_t` for `nclistget` index argument	2023-11-28 16:28:31 +00:00
Peter Hill	d07dac918c	Silence conversion warnings from `malloc` arguments Mostly just add an explicit cast when calling `malloc` and its variants. Sometimes instead change the type of a local variable if this would silence multiple warnings.	2023-11-24 18:20:52 +00:00
Dennis Heimbigner	87497d79cf	update	2023-11-04 16:08:59 -06:00
Dennis Heimbigner	5fa2defc7e	Improve fetch performance of DAP4 Prior to this PR, DAP4 always fetched the whole (constrained) dataset This PR changes the query processing so 1. It reads data on a per-variable request (equivalent to calling nc_get_var()). 2. It tracks a response for every query. Most of the changes reflect having to do per-variable requests. In any case, doing all this significantly reduces the amount of data transmitted and hence speeds up DAP4 requests.	2023-10-08 19:59:28 -06:00
Dennis Heimbigner	1552d894a2	Cleanup a number of issues. re: Issue https://github.com/Unidata/netcdf-c/issues/2748 This PR fixes a number of issues and bugs. ## s3cleanup fixes * Delete extraneous s3cleanup.sh related files. * Remove duplicate s3cleanup.uids entries. ## Support the Google S3 API * Add code to recognize "storage.googleapis.com" * Add extra code to track the kind of server being accessed: unknown, Amazon, Google. * Add a new mode flag "gs3" (analogous to "s3") to support this API. * Modify the S3 URL code to support this case. * Modify the listobjects result parsing because Google returns some non-standard XML elements. * Change signature and calls for NC_s3urlrebuild. ## Handle corrupt Zarr files where shape is empty for a variable. Modify behavior when a variable's "shape" dictionary entry is empty. Previously it returned an error, but now it suppresses such a variable. This change makes it possible to read non-corrupt data from the file. Also added a test case. ## Misc. Other Changes * Fix the nclog level handling to suppress output by default. * Fix de-duplication code in ncuri.c * Restore testing of iridl.ldeo.columbia.edu. * Fix bug in define_vars() which did not always do a proper reclaim between variables.	2023-10-08 11:22:52 -06:00
Ward Fisher	80c746981d	Merge pull request #2758 from ZedThree/cmake-fix-linking-mpi CMake: Ensure all libraries link against MPI if needed	2023-10-02 16:20:43 -06:00
Peter Hill	18c813b20b	CMake: Ensure all libraries link against MPI if needed	2023-10-02 10:31:24 +01:00
Dennis Heimbigner	df3636b959	Mitigate S3 test interference + Unlimited Dimensions in NCZarr This PR started as an attempt to add unlimited dimensions to NCZarr. It did that, but this exposed significant problems with test interference. So this PR is mostly about fixing -- well mitigating anyway -- test interference. The problem of test interference is now documented in the document docs/internal.md. The solutions implemented here are also described in that document. The solution is somewhat fragile but multiple cleanup mechanisms are provided. Note that this feature requires that the AWS command line utility must be installed. ## Unlimited Dimensions. The existing NCZarr extensions to Zarr are modified to support unlimited dimensions. NCZarr extends the Zarr meta-data for the ".zgroup" object to include netcdf-4 model extensions. This information is stored in ".zgroup" as a dictionary named "_nczarr_group". Inside "_nczarr_group", there is a key named "dims" that stores information about netcdf-4 named dimensions. The value of "dims" is a dictionary whose keys are the named dimensions. The value associated with each dimension name has one of two forms. Form 1 is a special case of form 2, and is kept for backward compatibility. Whenever a new file is written, it uses form 1 if possible, otherwise form 2. * Form 1: An integer representing the size of the dimension, which is used for simple named dimensions. * Form 2: A dictionary with the following keys and values: - "size" with an integer value representing the (current) size of the dimension. - "unlimited" with a value of either "1" or "0" to indicate if this dimension is an unlimited dimension. For Unlimited dimensions, the size is initially zero, and as variables extend the length of that dimension, the size value for the dimension increases. That dimension size is shared by all arrays referencing that dimension, so if one array extends an unlimited dimension, it is implicitly extended for all other arrays that reference that dimension. This is the standard semantics for unlimited dimensions. Adding unlimited dimensions required a number of other changes to the NCZarr code-base. These included the following. * Did a partial refactor of the slice handling code in zwalk.c to clean it up. * Added a number of tests for unlimited dimensions derived from the same test in nc_test4. * Added several NCZarr specific unlimited tests; more are needed. * Add test of endianness. ## Misc. Other Changes * Modify libdispatch/ncs3sdk_aws.cpp to optionally support use of the AWS Transfer Utility mechanism. This is controlled by the `#define TRANSFER` command in that file. It defaults to being disabled. * Parameterize both the standard Unidata S3 bucket (S3TESTBUCKET) and the netcdf-c test data prefix (S3TESTSUBTREE). * Fixed an obscure memory leak in ncdump. * Removed some obsolete unit testing code and test cases. * Uncovered a bug in the netcdf-c handling of big-endian floats and doubles. Have not fixed yet. See tst_h5_endians.c. * Renamed some nczarr_tests testcases to avoid name conflicts with nc_test4. * Modify the semantics of zmap\#ncsmap_write to only allow total rewrite of objects. * Modify the semantics of zodom to properly handle stride > 1. * Add a truncate operation to the libnczarr zmap code.	2023-09-26 16:56:48 -06:00
Eisuke Kawashima	d755c4ff32	chore: unset executable flag	2023-08-23 13:31:42 +09:00
Dennis Heimbigner	9094d25409	Fix major bug in the NCZarr cache management re: PR https://github.com/Unidata/netcdf-c/pull/2734 re: Issue https://github.com/Unidata/netcdf-c/issues/2733 As a result of an investigation by https://github.com/uweschulzweida, I discovered a significant bug in the NCZarr cache management. This PR extends the above PR to fix that bug. ## Change Overview * Insert extra checks for cache overflow. * Added test cases contingent on the --enable-large-file-tests option. * The Columbia server is down, so it has been temporarily disabled.	2023-08-16 23:07:05 -06:00
Dennis Heimbigner	f1a3a64b65	Cleanup the handling of cache parameters. re: https://github.com/Unidata/netcdf-c/issues/2733 When addressing the above issue, I noticed that there was a disconnect in NCZarr between nc_set_chunk_cache and nc_set_var_chunk_cache. Specifically, setting nc_set_chunk_cache had no impact on the per-variable cache parameters when nc_set_var_chunk_cache was not used. So, modified the NCZarr code so that the per-variable cache parameters are set in this order (#1 is first choice): 1. The values set by nc_set_var_chunk_cache 2. The values set by nc_set_chunk_cache 3. The defaults set by configure.ac	2023-08-10 16:57:57 -06:00
Dennis Heimbigner	db772ce34c	Explicitly suppress variable length type compression re: PR https://github.com/Unidata/netcdf-c/pull/2716 re: Issue https://github.com/Unidata/netcdf-c/issues/2189 The basic change is to make use of the fact that HDF5 automatically suppresses optional filters when an attempt is made to apply them to variable-length typed arrays. This means that e.g. ncdump or nccopy will properly see meaningful data. Note that if a filter is defined as HDF5 mandatory, then the corresponding variable will be suppressed and will be invisible to ncdump and nccopy. This functionality is also propagated to NCZarr. This PR makes some minor changes to PR https://github.com/Unidata/netcdf-c/pull/2716 as follows: * Move the test for filter X variable-length from dfilter.c down into the dispatch table functions. * Make all filters for HDF5 optional rather than mandatory so that the built-in HDF5 test for filter X variable-length will be invoked. The test case for this was expanded to verify that the filters are defined, but suppressed.	2023-08-03 15:47:28 -06:00
Dennis Heimbigner	8cab468169	Suppress filters on variables with non-fixed-size types. re: Discussion https://github.com/Unidata/netcdf-c/discussions/2554 re: PR https://github.com/Unidata/netcdf-c/pull/2231 re: Issue https://github.com/Unidata/netcdf-c/issues/2189 After some discussion, the issue of applying filters on variables whose type is not fixed size, was resolved as follows: 1. A call to nc_def_var_filter will ignore such filters, but will issue a log warning. 2. Loading (from an existing file) a variable whose type is not fixed-size and which has filters, will cause the variable to be suppressed. This PR enforces those rules. ### Misc. Other changes * Add a test case to test the vlen change. * Make some minor clean-ups in various cmake and automake files. * Remove unused test	2023-06-21 14:46:22 -06:00
Dennis Heimbigner	fb40a72b45	Improve performance of the nc_reclaim_data and nc_copy_data functions. re: Issue https://github.com/Unidata/netcdf-c/issues/2685 re: PR https://github.com/Unidata/netcdf-c/pull/2179 As noted in PR https://github.com/Unidata/netcdf-c/pull/2179, the old code did not allow for reclaiming instances of types, nor for properly copying them. That PR provided new functions capable of reclaiming/copying instances of arbitrary types. However, as noted by Issue https://github.com/Unidata/netcdf-c/issues/2685, using these most general functions resulted in a significant performance degradation, even for common cases. This PR attempts to mitigate the cost of using the general reclaim/copy functions in two ways. First, the previous functions operated at the top level by using ncid and typeid arguments. These functions were augmented with equivalent versions that used the netcdf-c library internal data structures to allow direct access to needed information. These new functions are used internally to the library. The second mitigation involves optimizing the internal functions by providing early tests for common cases. This avoids unnecessary recursive function calls. The overall result is a significant improvement in speed by a factor of roughly twenty -- your mileage may vary. These optimized functions are still not as fast as the original (more limited) functions, but they are getting close. Additional optimizations are possible. But the cost is a significant "uglification" of the code that I deemed a step too far, at least for now. ## Misc. Changes 1. Added a test case to check the proper reclamation/copy of complex types. 2. Found and fixed some places where nc_reclaim/copy should have been used. 3. Replaced, in the netcdf-c library, (almost all) occurrences of nc_reclaim_copy with calls to NC_reclaim/copy. This plus the optimizations is the primary speed-up mechanism. 4. In DAP4, the metadata is held in a substrate in-memory file; this required some changes so that the reclaim/copy code accessed that substrate dispatcher rather than the DAP4 dispatcher. 5. Re-factored and isolated the code that computes if a type is (transitively) variable-sized or not. 6. Clean up the reclamation code in ncgen; adding the use of nc_reclaim exposed some memory problems.	2023-05-20 17:11:25 -06:00
Ward Fisher	bfb8a31aed	Merge pull request #2644 from ZhipengXue97/main Fix potential dead store	2023-05-15 10:38:59 -06:00
Dennis Heimbigner	aeaf9e4bec	notrace	2023-04-26 14:16:22 -06:00
Dennis Heimbigner	49737888ca	Improve S3 Documentation and Support ## Improvements to S3 Documentation * Create a new document quickstart_paths.md that gives a summary of the legal path formats used by netcdf-c. This includes both file paths and URL paths. * Modify nczarr.md to remove most of the S3 related text. * Move the S3 text from nczarr.md to a new document cloud.md. * Add some S3-related text to the byterange.md document. Hopefully, this will make it easier for users to find the information they want. ## Rebuild NCZarr Testing In order to avoid problems with running make check in parallel, two changes were made: 1. The nczarr_test test system was rebuilt. Now, for each test, any generated files are kept in a test-specific directory, isolated from all other test executions. 2. Similarly, since the S3 test bucket is shared, any generated S3 objects are isolated using a test-specific key path. ## Other S3 Related Changes * Add code to ensure that files created on S3 are reclaimed at end of testing. * Used the bash "trap" command to ensure S3 cleanup even if the test fails. * Cleanup the S3 related configure.ac flag set since S3 is used in several places. So now one should use the option --enable-s3 instead of --enable-nczarr-s3, although the latter is still kept as a deprecated alias for the former. * Get some of the github actions yml to work with S3; required fixing various test scripts and adding a secret to access the Unidata S3 bucket. * Cleanup S3 portion of libnetcdf.settings.in and netcdf_meta.h.in and test_common.in. * Merge partial S3 support into dhttp.c. * Create an experimental s3 access library especially for use with Windows. It is enabled by using the options --enable-s3-internal (automake) or -DENABLE_S3_INTERNAL=ON (CMake). Also add a unit-test for it. * Move some definitions from ncrc.h to ncs3sdk.h ## Other Changes * Provide a default implementation of strlcpy and move this and similar defaults into dmissing.c.	2023-04-25 17:15:06 -06:00
Ward Fisher	dc6e392c9d	Merge branch 'main' into znotnc.dmh	2023-04-12 16:02:34 -06:00
Dennis Heimbigner	d7d216a3f5	Merge branch 'master' into dap4tests2.dmh	2023-03-16 14:03:29 -06:00
Dennis Heimbigner	69b5fa4f4e	fix memory leak	2023-03-13 20:11:54 -06:00
Dennis Heimbigner	5c07ebfd11	Check at nc_open if file appears to be in NCZarr/Zarr format. re: Issue https://github.com/Unidata/netcdf-c/issues/2656 Charlie Zender notes that nc_open() does not immediately detect that the given path refers to a file not in zarr format. Rather it fails later when trying to read the (meta-)data. The reason is that the Zarr format is highly decentralized. There is no easily testable magic number or superblock to look for. In effect the only way to see if a directory is Zarr is to successfully read it. It is possible to heuristically detect that a path refers to an NCZarr/Zarr file by doing a breadth-first search of the file tree starting at the given path. If the search encounters a file whose name starts with ".z", then assume it is a legitimate NCZarr/Zarr file. Of course, this test could be costly. One hopes that in practice it is not. In addition to this fix, a corresponding test case was added. ## Other Changes re: PR https://github.com/Unidata/netcdf-c/pull/2529 There was an error under Cygwin for this PR that is fixed in this PR. The fix was to convert all noinst_ references to check_.	2023-03-13 13:24:14 -06:00
Dennis Heimbigner	69e84fe9f1	Fix byterange handling of some URLs re: Issue The byterange handling of the following URLs fails. ### Problem 1: "https://crudata.uea.ac.uk/cru/data/temperature/HadCRUT.4.6.0.0.median.nc#mode=bytes" It turns out that byterange in hdf5 has two possible targets: S3 and not-S3 (e.g. a thredds server or the crudata URL above). Each uses a different HDF5 Virtual File Driver (VFD). I incorrectly set up the byterange code in libhdf5 so that it would choose one or the other of the two VFD's for any netcdf-c library build. The fix is to allow it to choose either one at run-time. ### Problem 2: "https://noaa-goes16.s3.amazonaws.com/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc#mode=bytes,s3" When given what appears to be an S3-related URL, the netcdf-c library code converts it into a canonical, so-called "path" format. In casing out the possible input URL formats, I missed the case where the host contains the bucket ("noaa-goes16"), but not the region. So the fix was to check for this case. ## Misc. Related Changes 1. Since S3 is used in more than just NCZarr, I changed the automake/cmake options to replace "--enable-nczarr-s3" with "--enable-s3", but keeping the former option as a synonym for the latter. This also entailed cleaning up libnetcdf.settings WRT S3 support 2. Added the above URLs as additional test cases ## Misc. Un-Related Changes 1. CURLOPT_PUT is deprecated in favor of CURLOPT_UPLOAD 2. Fix some minor warnings ## Open Problems * Under Ubuntu, either libcrypto or aws-sdk-cpp has a memory leak.	2023-03-02 19:51:02 -07:00
Zhipeng Xue	a992aadb32	Fix potential dead store	2023-02-28 16:26:27 +08:00
Dennis Heimbigner	9dfafe6c63	Bring up-to-date with main	2023-01-17 16:28:45 -07:00
Dennis Heimbigner	a03bb5e601	Fix infinite loop in file inferencing re: Issue https://github.com/Unidata/netcdf-c/issues/2573 The file type inferencer in libdispatch/dinference.c has a simple forward inference mechanism so that the occurrence of certain mode values in a URL fragment implies inclusion of additional mode values. This kind of inference is notorious for leading to cycles if not careful. Unfortunately, this occurred in the one in dinference.c. This was fixed by providing a more complicated, but more reliable inference mechanism. ## Misc. Other Changes * Found and fixed a couple of memory leaks. * There is a recent problem in building HDF4 support on github actions. Fixed by using the internal HDF4 xdr capability. * Some filter-related code was not being properly ifdef'd with ENABLE_NCZARR_FILTERS.	2022-12-18 13:18:00 -07:00
Dennis Heimbigner	591e6b2f6d	Fix DAP4 remotetest server Warning: This PR is a follow on to PR https://github.com/Unidata/netcdf-c/pull/2555 and should not be merged until that prior PR has been merged. The changeset for this PR is a delta on the PR https://github.com/Unidata/netcdf-c/pull/2555. This PR re-enables the use of the server remotetest.unidata.ucar.edu/d4ts to test several features: 1. Show that access over the Internet to servers using the DAP4 protocol works. 2. Test that DAP4 support in the [Thredds Data Server](https://github.com/Unidata/tds) is operating correctly. 3. Test that the DAP4 support in the [netcdf-java library](https://github.com/Unidata/netcdf-java) and the DAP4 support in the netcdf-c library are consistent and are interoperable. The test inputs (primarily \.nc* files) provided in the netcdf-c library are also used by the DAP4 Test Server (aka d4ts) to present web access to a collection of data files accessible via the DAP4 protocol and which can be used for testing Internet access to a working server. To be precise, this version of d4ts is currently in unmerged branches of the netcdf-java and tds Github repositories and so is not actually in the main repositories yet. However, the d4ts.war file was created from that branch and used to populate the remotetest.unidata.ucar.edu server. The two other remote servers that were used in the past are Hyrax (OPeNDAP.org) and thredds-test. These will continue to remain disabled until those servers can be fixed. ## Primary Changes * Rebuild the baselineremote directory. This directory contains the validation data needed to test the remote servers. * Re-enable using remotetest.unidata.ucar.edu as part of the DAP4 testing process. * Fix the dap4_test/test_remote.sh test script to match the current available test data. * Make some changes to libdap4 to improve the ability to catch malformed data streams [affects a lot of files in libdap4]. ## Misc. Unrelated Changes * Remove a raft of warnings, especially in nc_test4/tst_quantize.c. * Add some additional explanatory information to the NCZarr documentation. * Cleanup some Doxygen errors in the docs file and reorder some files.	2022-11-15 20:29:21 -07:00
Dan Ibanez	6173956790	Rename variable to avoid function name conflict I was getting the following error while compiling: ``` netcdf-c/libnczarr/zutil.c:544:26: error: called object 'strlen' is not a function or function pointer 544 \| if(dnamep) dnamep = strdup(dname); \| ^~~~~~ netcdf-c/libnczarr/zutil.c:533:68: note: declared here 533 \| ncz_nctype2dtype(nc_type nctype, int endianness, int purezarr, int strlen, char* dnamep) \| ~~~~^~~~~~ ``` My interpretation is that strdup() is implemented as a macro which calls strlen() the standard C function, and when that macro is being substituted here the call to strlen tries to "call" the integer variable named strlen. Resolving this by renaming the integer variable to "len" instead of "strlen", avoiding a conflict with a standard C library function name.	2022-11-07 13:24:20 -07:00
Dennis Heimbigner	1a45ee025f	Fix some additional errors in NCZarr re: Issue https://github.com/Unidata/netcdf-c/issues/2502 H/T Charlie Zender * Fix NCZarr handling of endianness value NC_ENDIAN_NATIVE. This now matches how it is handled in libhdf5 * Fix NCZarr handling of char typed attribute with value "". This now matches how it is handled in libhdf5 * Add test for various char attribute values * Change the mapping of NC_CHAR and NC_STRING to dtype; requires changing some test files also. * Optimize the testing for NC_ENOTBUILT in NC_open. * Turn off debugging left on accidentally * Fix memory leak in tst_pnetcdf.c * Fix blosc test	2022-09-09 14:25:24 -06:00
Ward Fisher	c489aad975	Merge branch 'main' into bloscfix.dmh	2022-09-06 15:50:01 -06:00
Ward Fisher	0d17edf2ea	Merge branch 'main' into bloscfix.dmh	2022-09-06 13:49:18 -06:00
Dennis Heimbigner	00a80ec8f9	Catch Xarray dimension inconsistencies	2022-09-04 13:45:29 -06:00
Dennis Heimbigner	6abaab967b	Fix some problems with PR https://github.com/Unidata/netcdf-c/pull/2492 re: PR https://github.com/Unidata/netcdf-c/pull/2492 re: Issue https://github.com/Unidata/netcdf-c/issues/2494 This PR fixes some problems with the pull request https://github.com/Unidata/netcdf-c/pull/2492 in response to Issue https://github.com/Unidata/netcdf-c/issues/2494. * Found and fixed more scalar handling problems and add a test case for scalars. * Cleanup nczarr_test/run_string.sh test * Document _nczarr_default_maxstrlen and _nczarr_maxstrlen. * Support both "Nan" and Nan as being floating point constants for attributes. It is unclear from the Zarr V2 spec if unquoted Nan is legal or not, but it is supported for reading. Write the quoted versions when writing an attribute. Similarly for Infinity constants. So NCZarr supports the following constants for use in Attributes * Nan, "Nan", -Nan, "-Nan" * Nanf, "Nanf", -Nanf, "-Nanf" * Infinity, "Infinity", -Infinity, "-Infinity" * Infinityf, "Infinityf", -Infinityf, "-Infinityf"	2022-09-03 14:21:48 -06:00
Dennis Heimbigner	d0aff6ac3a	Fix LGTM alert: too few args	2022-08-27 21:35:04 -06:00
Dennis Heimbigner	231ae96c4b	Add support for Zarr string type to NCZarr * re: https://github.com/Unidata/netcdf-c/pull/2278 * re: https://github.com/Unidata/netcdf-c/issues/2485 * re: https://github.com/Unidata/netcdf-c/issues/2474 This PR subsumes PR https://github.com/Unidata/netcdf-c/pull/2278. It is actually a bit of an omnibus covering several issues. ## PR https://github.com/Unidata/netcdf-c/pull/2278 Add support for the Zarr string type. Zarr strings are restricted currently to be of fixed size. The primary issue to be addressed is to provide a way for the user to specify the size of the fixed length strings. This is handled by providing the following new special attributes: 1. _nczarr_default_maxstrlen: This is an attribute of the root group. It specifies the default maximum string length for string types. If not specified, then it has the value of 64 characters. 2. _nczarr_maxstrlen: This is a per-variable attribute. It specifies the maximum string length for the string type associated with the variable. If not specified, then it is assigned the value of _nczarr_default_maxstrlen. This PR also requires some hacking to handle the existing netcdf-c NC_CHAR type, which does not exist in zarr. The goal was to choose numpy types for both the netcdf-c NC_STRING type and the netcdf-c NC_CHAR type such that if a pure zarr implementation read them, it would still work and an NC_CHAR type would be handled by zarr as a string of length 1. For writing variables and NCZarr attributes, the type mapping is as follows: * "\|S1" for NC_CHAR. * ">S1" for NC_STRING && MAXSTRLEN==1 * ">Sn" for NC_STRING && MAXSTRLEN==n Note that it is a bit of a hack to use endianness, but it should be ok since for string/char, the endianness has no meaning. For reading attributes with pure zarr (i.e. with no nczarr attribute types defined), they will always be interpreted as of type NC_CHAR. ## Issue: https://github.com/Unidata/netcdf-c/issues/2474 This PR partly fixes this issue because it provides more comprehensive support for Zarr attributes that are JSON valued expressions. This PR still does not address the problem in that issue where the _ARRAY_DIMENSION attribute is incorrectly set. That can only be fixed by the creator of the datasets. ## Issue: https://github.com/Unidata/netcdf-c/issues/2485 This PR also fixes the scalar failure shown in this issue. It generally cleans up scalar handling. It also adds a note to the documentation describing that NCZarr supports scalars while Zarr does not and also how scalar interoperability is achieved. ## Misc. Other Changes 1. Convert the nczarr special attributes and keys to be all lower case. So "_NCZARR_ATTR" is now "_nczarr_attr". Support back compatibility for the upper case names. 2. Cleanup my too-clever-by-half handling of scalars in libnczarr.	2022-08-27 20:21:13 -06:00
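
Commit adc4dc1435 in the list above replaces `sprintf` with `snprintf` and notes that one case needed accounting for how much space is left in the buffer. A minimal sketch of that kind of accounting follows. The helper `append_bounded` is hypothetical and not a netcdf-c function, and the actual change in that commit may look different.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of netcdf-c): append `piece` to the
 * NUL-terminated string already in `buf`, whose total capacity is `bufsize`.
 * Returns 0 if the whole piece fit, -1 otherwise. On -1 the buffer is
 * still NUL-terminated but may hold a truncated copy of `piece`. */
int
append_bounded(char *buf, size_t bufsize, const char *piece)
{
    size_t used, left;
    int n;

    assert(buf != NULL && piece != NULL && bufsize > 0);

    used = strlen(buf);              /* bytes already occupied, excluding the NUL */
    if (used >= bufsize) return -1;  /* buffer is not a valid string of this size */
    left = bufsize - used;           /* space left, including room for the NUL */

    /* snprintf writes at most `left` bytes (NUL included) and returns the
     * length it would have written had there been enough room. */
    n = snprintf(buf + used, left, "%s", piece);
    if (n < 0 || (size_t)n >= left) return -1;
    return 0;
}
```

A caller would start from an empty string, for example `char path[256] = "";`, and call `append_bounded(path, sizeof path, "...")` once per component, checking the return value each time. Recomputing the remaining space from `strlen` on every call keeps the sketch simple; code that appends many pieces would more likely carry a running offset instead.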

147 Commits