From 076da97aa4e437524a02553acb708d23c7cfcff5 Mon Sep 17 00:00:00 2001 From: Dennis Heimbigner Date: Wed, 19 Jun 2024 18:09:29 -0600 Subject: [PATCH] Convert NCzarr meta-data to use only Zarr attributes As discussed in a netcdf meeting, convert NCZarr V2 to store all netcdf-4 specific info as attributes. This improves interoperability with other Zarr implementations by no longer using non-standard keys. ## Other Changes * Remove support for older NCZarr formats. * Update anonymous dimension naming * Begin the process of fixing the -Wconversion and -Wsign-compare warnings in libnczarr, nczarr_test, and v3_nczarr_test. * Update docs/nczarr.md * Rebuild using the .y and .l files --- RELEASE_NOTES.md | 1 + docs/nczarr.md | 420 +++-- include/nc4internal.h | 1 - include/ncjson.h | 13 +- include/netcdf_json.h | 49 +- libdispatch/ncjson.c | 36 +- libdispatch/ncs3sdk_h5.c | 2 +- libnczarr/zarr.c | 6 +- libnczarr/zarr.h | 4 +- libnczarr/zattr.c | 14 +- libnczarr/zchunking.c | 2 +- libnczarr/zclose.c | 14 +- libnczarr/zinternal.h | 46 +- libnczarr/zsync.c | 1356 +++++++++-------- libnczarr/zutil.c | 92 +- libnczarr/zxcache.c | 17 +- libsrc4/nc4internal.c | 4 +- nczarr_test/Makefile.am | 2 +- nczarr_test/ncdumpchunks.c | 25 +- nczarr_test/ref_any.cdl | 48 +- nczarr_test/ref_byte.cdl | 4 +- nczarr_test/ref_byte_fill_value_null.cdl | 4 +- nczarr_test/ref_byte_fill_value_null.zarr.zip | Bin 1666 -> 1945 bytes nczarr_test/ref_groups_regular.cdl | 10 +- nczarr_test/ref_jsonconvention.cdl | 4 +- nczarr_test/ref_jsonconvention.zmap | 8 +- nczarr_test/ref_nczarr2zarr.cdl | 4 +- nczarr_test/ref_newformatpure.cdl | 6 +- nczarr_test/ref_purezarr.cdl | 6 +- nczarr_test/ref_scalar_nczarr.cdl | 8 + nczarr_test/ref_t_meta_dim1.cdl | 2 +- nczarr_test/ref_t_meta_var1.cdl | 2 +- nczarr_test/ref_ut_map_create.cdl | 2 +- nczarr_test/ref_ut_map_readmeta2.txt | 2 +- nczarr_test/ref_ut_map_search.txt | 4 +- nczarr_test/ref_ut_map_writedata.cdl | 4 +- nczarr_test/ref_ut_map_writemeta.cdl | 2 +- 
nczarr_test/ref_ut_map_writemeta2.cdl | 4 +- nczarr_test/ref_ut_mapapi_create.cdl | 2 +- nczarr_test/ref_ut_mapapi_data.cdl | 2 +- nczarr_test/ref_ut_mapapi_meta.cdl | 2 +- nczarr_test/ref_ut_mapapi_search.txt | 2 +- nczarr_test/ref_ut_testmap_create.cdl | 2 +- nczarr_test/ref_zarr_test_data_2d.cdl.gz | Bin 368 -> 389 bytes nczarr_test/run_jsonconvention.sh | 14 +- nczarr_test/run_scalar.sh | 2 +- nczarr_test/ut_json.c | 10 +- nczarr_test/ut_map.c | 8 +- nczarr_test/ut_mapapi.c | 8 +- nczarr_test/zisjson.c | 2 +- 50 files changed, 1299 insertions(+), 983 deletions(-) create mode 100644 nczarr_test/ref_scalar_nczarr.cdl diff --git a/RELEASE_NOTES.md b/RELEASE_NOTES.md index 623c0bc85..7ec52c8bb 100644 --- a/RELEASE_NOTES.md +++ b/RELEASE_NOTES.md @@ -7,6 +7,7 @@ This file contains a high-level description of this package's evolution. Release ## 4.9.3 - TBD +* Convert NCZarr V2 to store all netcdf-4 specific info as attributes. This improves interoperability with other Zarr implementations by no longer using non-standard keys. See [Github #????](https://github.com/Unidata/netcdf-c/issues/????) for more information. * Cleanup the option code for NETCDF_ENABLE_SET_LOG_LEVEL\[_FUNC\] See [Github #2931](https://github.com/Unidata/netcdf-c/issues/2931) for more information. * Fix duplicate definition when using aws-sdk-cpp. See [Github #2928](https://github.com/Unidata/netcdf-c/issues/2928) for more information. * Cleanup various obsolete options and do some code refactoring. See [Github #2926](https://github.com/Unidata/netcdf-c/issues/2926) for more information. diff --git a/docs/nczarr.md b/docs/nczarr.md index 4b3f25825..ee41e4ffe 100644 --- a/docs/nczarr.md +++ b/docs/nczarr.md @@ -8,13 +8,15 @@ The NetCDF NCZarr Implementation # NCZarr Introduction {#nczarr_introduction} -Beginning with netCDF version 4.8.0, the Unidata NetCDF group has extended the netcdf-c library to provide access to cloud storage (e.g. Amazon S3 [1] ). 
+Beginning with netCDF version 4.8.0, the Unidata NetCDF group has extended the netcdf-c library to support data stored using the Zarr data model and storage format [4,6]. As part of this support, netCDF adds support for accessing data stored using cloud storage (e.g. Amazon S3 [1] ). -The goal of this project is to provide maximum interoperability between the netCDF Enhanced (netcdf-4) data model and the Zarr version 2 [4] data model. This is embodied in the netcdf-c library so that it is possible to use the netcdf API to read and write Zarr formatted datasets. +The goal of this project, then, is to provide maximum interoperability between the netCDF Enhanced (netcdf-4) data model and the Zarr version 2 [4] data model. This is embodied in the netcdf-c library so that it is possible to use the netcdf API to read and write Zarr formatted datasets. -In order to better support the netcdf-4 data model, the netcdf-c library implements a limited set of extensions to the Zarr data model. +In order to better support the netcdf-4 data model, the netcdf-c library implements a limited set of extensions to the *Zarr* data model. This extended model is referred to as *NCZarr*. -An important goal is that those extensions not interfere with reading of those extended datasets by other Zarr specification conforming implementations. This means that one can write a dataset using the NCZarr extensions and expect that dataset to be readable by other Zarr implementations. +Additionally, another goal is to ensure interoperability between *NCZarr* +formatted files and standard (aka pure) *Zarr* formatted files. +This means that (1) an *NCZarr* file can be read by any other *Zarr* library (and especially the Zarr-python library), and (2) a standard *Zarr* file can be read by netCDF. Of course, there are limitations in that other *Zarr* libraries will not use the extra *NCZarr* meta-data, and netCDF will have to "fake" meta-data not provided by a pure *Zarr* file. 
As a secondary -- but equally important -- goal, it must be possible to use the NCZarr library to read and write datasets that are pure Zarr, @@ -29,14 +31,12 @@ Notes on terminology in this document. # The NCZarr Data Model {#nczarr_data_model} -NCZarr uses a data model [4] that, by design, extends the Zarr Version 2 Specification [6] to add support for the NetCDF-4 data model. +NCZarr uses a data model that, by design, extends the Zarr Version 2 Specification . -__Note Carefully__: a legal _NCZarr_ dataset is also a legal _Zarr_ dataset under a specific assumption. This assumption is that within Zarr meta-data objects, like ''.zarray'', unrecognized dictionary keys are ignored. -If this assumption is true of an implementation, then the _NCZarr_ dataset is a legal _Zarr_ dataset and should be readable by that _Zarr_ implementation. -The inverse is true also. A legal _Zarr_ dataset is also a legal _NCZarr_ -dataset, where "legal" means it conforms to the Zarr version 2 specification. +__Note Carefully__: a legal _NCZarr_ dataset is expected to also be a legal _Zarr_ dataset. +The inverse is true also. A legal _Zarr_ dataset is expected to also be a legal _NCZarr_ dataset, where "legal" means it conforms to the Zarr specification(s). In addition, certain non-Zarr features are allowed and used. -Specifically the XArray ''\_ARRAY\_DIMENSIONS'' attribute is one such. +Specifically the XArray [7] ''\_ARRAY\_DIMENSIONS'' attribute is one such. There are two other, secondary assumption: @@ -45,9 +45,10 @@ There are two other, secondary assumption: filters](./md_filters.html "filters") for details. Briefly, the data model supported by NCZarr is netcdf-4 minus -the user-defined types. However, a restricted form of String type -is supported (see Appendix E). -As with netcdf-4 chunking is supported. Filters and compression +the user-defined types and full String type support. +However, a restricted form of String type +is supported (see Appendix D). 
+As with netcdf-4, chunking is supported. Filters and compression are also [supported](./md_filters.html "filters"). Specifically, the model supports the following. @@ -74,8 +75,8 @@ When specified, they are treated as chunked where the file consists of only one This means that testing for contiguous or compact is not possible; the _nc_inq_var_chunking_ function will always return NC_CHUNKED and the chunksizes will be the same as the dimension sizes of the variable's dimensions. Additionally, it should be noted that NCZarr supports scalar variables, -but Zarr does not; Zarr only supports dimensioned variables. -In order to support interoperability, NCZarr does the following. +but Zarr Version 2 does not; Zarr V2 only supports dimensioned variables. +In order to support interoperability, NCZarr V2 does the following. 1. A scalar variable is recorded in the Zarr metadata as if it has a shape of **[1]**. 2. A note is stored in the NCZarr metadata that this is actually a netCDF scalar variable. @@ -108,55 +109,62 @@ using URLs. There are, however, some details that are important. - Protocol: this should be _https_ or _s3_,or _file_. - The _s3_ scheme is equivalent to "https" plus setting "mode=nczarr,s3" (see below). Specifying "file" is mostly used for testing, but is used to support directory tree or zipfile format storage. + The _s3_ scheme is equivalent to "https" plus setting "mode=s3". + Specifying "file" is mostly used for testing, but also for directory tree or zipfile format storage. ## Client Parameters The fragment part of a URL is used to specify information that is interpreted to specify what data format is to be used, as well as additional controls for that data format. -For NCZarr support, the following _key=value_ pairs are allowed. -- mode=nczarr|zarr|noxarray|file|zip|s3 +For reading, _key=value_ pairs are provided for specifying the storage format. 
+- mode=nczarr|zarr -Typically one will specify two mode flags: one to indicate what format -to use and one to specify the way the dataset is to be stored. -For example, a common one is "mode=zarr,file" +Additional pairs are provided to specify the Zarr version. +- mode=v2 +Additional pairs are provided to specify the storage medium: Amazon S3 vs File tree vs Zip file. +- mode=file|zip|s3 + +Note that when reading, an attempt will be made to infer the +format and Zarr version and storage medium format by probing the +file. If inferencing fails, then it is reported. In this case, +the client may need to add specific mode flags to avoid +inferencing. + +Typically one will specify three mode flags: one to indicate what format +to use and one to specify the way the dataset is to be stored. +For example, a common one is "mode=zarr,file" + + +Obviously, when creating a file, inferring the type of file to create +is not possible so the mode flags must be set specifically. +This means that both the storage medium and the exact storage +format must be specified. Using _mode=nczarr_ causes the URL to be interpreted as a reference to a dataset that is stored in NCZarr format. -The _zarr_ mode tells the library to -use NCZarr, but to restrict its operation to operate on pure -Zarr Version 2 datasets. +The _zarr_ mode tells the library to use NCZarr, but to restrict its operation to operate on pure Zarr. + -The modes _s3_, _file_, and _zip_ tell the library what storage +The modes _s3_, _file_, and _zip_ tell the library what storage medium driver to use. -* The _s3_ driver is the default and indicates using Amazon S3 or some equivalent. -* The _file_ format stores data in a directory tree. -* The _zip_ format stores data in a local zip file. +* The _s3_ driver stores data using Amazon S3 or some equivalent. +* The _file_ driver stores data in a directory tree. +* The _zip_ driver stores data in a local zip file. 
-Note that It should be the case that zipping a _file_ +As an aside, it should be the case that zipping a _file_ format directory tree will produce a file readable by the _zip_ storage format, and vice-versa. -By default, the XArray convention is supported and used for -both NCZarr files and pure Zarr files. This -means that every variable in the root group whose named dimensions +By default, the XArray convention is supported for Zarr Version 2 +and used for both NCZarr files and pure Zarr files. + +This means that every variable in the root group whose named dimensions are also in the root group will have an attribute called *\_ARRAY\_DIMENSIONS* that stores those dimension names. The _noxarray_ mode tells the library to disable the XArray support. -The netcdf-c library is capable of inferring additional mode flags based on the flags it finds. Currently we have the following inferences. -- _zarr_ => _nczarr_ - -So for example: ````...#mode=zarr,zip```` is equivalent to this. -````...#mode=nczarr,zarr,zip -```` - - # NCZarr Map Implementation {#nczarr_mapimpl} Internally, the nczarr implementation has a map abstraction that allows different storage formats to be used. @@ -192,7 +200,7 @@ be a prefix of any other key. There several other concepts of note. 1. __Dataset__ - a dataset is the complete tree contained by the key defining -the root of the dataset. +the root of the dataset. The term __File__ will often be used as a synonym. Technically, the root of the tree is the key \/.zgroup, where .zgroup can be considered the _superblock_ of the dataset. 2. __Object__ - equivalent of the S3 object; Each object has a unique key and "contains" data in the form of an arbitrary sequence of 8-bit bytes. @@ -277,14 +285,15 @@ As with other URLS (e.g. DAP), these kind of URLS can be passed as the path argu # NCZarr versus Pure Zarr. 
{#nczarr_purezarr} -The NCZARR format extends the pure Zarr format by adding extra keys such as ''\_NCZARR\_ARRAY'' inside the ''.zarray'' object. -It is possible to suppress the use of these extensions so that the netcdf library can read and write a pure zarr formatted file. -This is controlled by using ''mode=zarr'', which is an alias for the -''mode=nczarr,zarr'' combination. -The primary effects of using pure zarr are described in the [Translation Section](@ref nczarr_translation). - -There are some constraints on the reading of Zarr datasets using the NCZarr implementation. +The NCZARR format extends the pure Zarr format by adding extra attributes such as ''\_nczarr\_array'' inside the ''.zattr'' object. +It is possible to suppress the use of these extensions so that the netcdf library can write a pure zarr formatted file. But this is probably unnecessary +since these attributes should be readable by any other Zarr implementation. +But these extra attributes might be seen as clutter and so it is possible +to suppress them when writing using *mode=zarr*. +Reading of pure Zarr files created using other implementations is a necessary +compatibility feature of NCZarr. +This requirement imposed some constraints on the reading of Zarr datasets using the NCZarr implementation. 1. Zarr allows some primitive types not recognized by NCZarr. Over time, the set of unrecognized types is expected to diminish. Examples of currently unsupported types are as follows: @@ -333,13 +342,14 @@ The reason for this is that the bucket name forms the initial segment in the key ## Data Model -The NCZarr storage format is almost identical to that of the the standard Zarr version 2 format. +The NCZarr storage format is almost identical to that of the standard Zarr format. The data model differs as follows. 1. Zarr only supports anonymous dimensions -- NCZarr supports only shared (named) dimensions. 2. 
Zarr attributes are untyped -- or perhaps more correctly characterized as of type string. +3. Zarr does not explicitly support unlimited dimensions -- NCZarr does support them. -## Storage Format +## Storage Medium Consider both NCZarr and Zarr, and assume S3 notions of bucket and object. In both systems, Groups and Variables (Array in Zarr) map to S3 objects. @@ -347,8 +357,7 @@ Containment is modeled using the fact that the dataset's key is a prefix of the So for example, if variable _v1_ is contained in top level group g1 -- _/g1 -- then the key for _v1_ is _/g1/v_. Additional meta-data information is stored in special objects whose name start with ".z". -In Zarr, the following special objects exist. - +In Zarr Version 2, the following special objects exist. 1. Information about a group is kept in a special object named _.zgroup_; so for example the object _/g1/.zgroup_. 2. Information about an array is kept as a special object named _.zarray_; @@ -359,45 +368,46 @@ so for example the objects _/g1/.zattr_ and _/g1/v1/.zattr_. The first three contain meta-data objects in the form of a string representing a JSON-formatted dictionary. The NCZarr format uses the same objects as Zarr, but inserts NCZarr -specific key-value pairs in them to hold NCZarr specific information -The value of each of these keys is a JSON dictionary containing a variety +specific attributes in the *.zattr* object to hold NCZarr specific information +The value of each of these attributes is a JSON dictionary containing a variety of NCZarr specific information. -These keys are as follows: +These NCZarr-specific attributes are as follows: -_\_nczarr_superblock\__ -- this is in the top level group -- key _/.zarr_. +_\_nczarr_superblock\__ -- this is in the top level group's *.zattr* object. It is in effect the "superblock" for the dataset and contains any netcdf specific dataset level information. It is also used to verify that a given key is the root of a dataset. 
-Currently it contains the following key(s): -* "version" -- the NCZarr version defining the format of the dataset. +Currently it contains keys that are ignored and exist only to ensure that +older netcdf library versions do not crash. +* "version" -- the NCZarr version defining the format of the dataset (deprecated). -_\_nczarr_group\__ -- this key appears in every _.zgroup_ object. +_\_nczarr_group\__ -- this key appears in every group's _.zattr_ object. It contains any netcdf specific group information. Specifically it contains the following keys: -* "dims" -- the name and size of shared dimensions defined in this group, as well an optional flag indictating if the dimension is UNLIMITED. -* "vars" -- the name of variables defined in this group. +* "dimensions" -- the name and size of shared dimensions defined in this group, as well as an optional flag indicating if the dimension is UNLIMITED. +* "arrays" -- the name of variables defined in this group. * "groups" -- the name of sub-groups defined in this group. These lists allow walking the NCZarr dataset without having to use the potentially costly search operation. -_\_nczarr_array\__ -- this key appears in every _.zarray_ object. +_\_nczarr_array\__ -- this key appears in the *.zattr* object associated +with a _.zarray_ object. It contains netcdf specific array information. Specifically it contains the following keys: -* dimrefs -- the names of the shared dimensions referenced by the variable. -* storage -- indicates if the variable is chunked vs contiguous in the netcdf sense. +* dimension_references -- the fully qualified names of the shared dimensions referenced by the variable. +* storage -- indicates if the variable is chunked vs contiguous in the netcdf sense. Also signals if a variable is scalar. -_\_nczarr_attr\__ -- this key appears in every _.zattr_ object. -This means that technically, it is attribute, but one for which access -is normally surpressed . 
+_\_nczarr_attr\__ -- this attribute appears in every _.zattr_ object. Specifically it contains the following keys: -* types -- the types of all of the other attributes in the _.zattr_ object. +* types -- the types of all attributes in the _.zattr_ object. ## Translation {#nczarr_translation} -With some constraints, it is possible for an nczarr library to read the pure Zarr format and for a zarr library to read the nczarr format. -The latter case, zarr reading nczarr is possible if the zarr library is willing to ignore keys whose name it does not recognize; specifically anything beginning with _\_nczarr\__. +With some loss of netcdf-4 information, it is possible for an nczarr library to read the pure Zarr format and for other zarr libraries to read the nczarr format. -The former case, nczarr reading zarr is also possible if the nczarr can simulate or infer the contents of the missing _\_nczarr\_xxx_ objects. +The latter case, zarr reading nczarr, is trivial because all of the nczarr metadata is stored as ordinary, String valued (but JSON syntax), attributes. + +The former case, nczarr reading zarr, is possible assuming the nczarr code can simulate or infer the contents of the missing _\_nczarr\_xxx_ attributes. As a rule this can be done as follows. 1. _\_nczarr_group\__ -- The list of contained variables and sub-groups can be computed using the search API to list the keys "contained" in the key for a group. The search looks for occurrences of _.zgroup_, _.zattr_, _.zarray_ to infer the keys for the contained groups, attribute sets, and arrays (variables). @@ -405,9 +415,8 @@ Constructing the set of "shared dimensions" is carried out by walking all the variables in the whole dataset and collecting the set of unique integer shapes for the variables. For each such dimension length, a top level dimension is created -named ".zdim_" where len is the integer length. -2. 
_\_nczarr_array\__ -- The dimrefs are inferred by using the shape -in _.zarray_ and creating references to the simulated shared dimension. +named "_Anonymous_Dimension_" where len is the integer length. +2. _\_nczarr_array\__ -- The dimension references are inferred by using the shape in _.zarray_ and creating references to the simulated shared dimensions. netcdf specific information. 3. _\_nczarr_attr\__ -- The type of each attribute is inferred by trying to parse the first attribute value string. @@ -417,13 +426,15 @@ In order to accomodate existing implementations, certain mode tags are provided ## XArray -The Xarray [XArray Zarr Encoding Specification](http://xarray.pydata.org/en/latest/internals.html#zarr-encoding-specification) Zarr implementation uses its own mechanism for specifying shared dimensions. +The Xarray [7] Zarr implementation uses its own mechanism for specifying shared dimensions. It uses a special attribute named ''_ARRAY_DIMENSIONS''. The value of this attribute is a list of dimension names (strings). An example might be ````["time", "lon", "lat"]````. -It is essentially equivalent to the ````_nczarr_array "dimrefs" list````, except that the latter uses fully qualified names so the referenced dimensions can be anywhere in the dataset. +It is almost equivalent to the ````_nczarr_array "dimension_references" list````, except that the latter uses fully qualified names so the referenced dimensions can be anywhere in the dataset. The Xarray dimension list differs from the netcdf-4 shared dimensions in two ways. +1. Specifying Xarray in a non-root group has no meaning in the current Xarray specification. +2. A given name can be associated with different lengths, even within a single array. This is considered an error in NCZarr. -As of _netcdf-c_ version 4.8.2, The Xarray ''_ARRAY_DIMENSIONS'' attribute is supported for both NCZarr and pure Zarr. +The Xarray ''_ARRAY_DIMENSIONS'' attribute is supported for both NCZarr and pure Zarr. 
If possible, this attribute will be read/written by default, but can be suppressed if the mode value "noxarray" is specified. If detected, then these dimension names are used to define shared dimensions. @@ -431,6 +442,8 @@ The following conditions will cause ''_ARRAY_DIMENSIONS'' to not be written. * The variable is not in the root group, * Any dimension referenced by the variable is not in the root group. +Note that this attribute is not needed for Zarr Version 3, and is ignored. + # Examples {#nczarr_examples} Here are a couple of examples using the _ncgen_ and _ncdump_ utilities. @@ -453,34 +466,17 @@ Here are a couple of examples using the _ncgen_ and _ncdump_ utilities. ``` 5. Create an nczarr file using the s3 protocol with a specific profile ``` - ncgen -4 -lb -o 's3://datasetbucket/rootkey\#mode=nczarr,awsprofile=unidata' dataset.cdl + ncgen -4 -lb -o "s3://datasetbucket/rootkey\#mode=nczarr&awsprofile=unidata" dataset.cdl ``` Note that the URL is internally translated to this - ``` - 'https://s2.<region>.amazonaws.com/datasetbucket/rootkey#mode=nczarr,awsprofile=unidata' dataset.cdl - ``` - -# References {#nczarr_bib} - -[1] [Amazon Simple Storage Service Documentation](https://docs.aws.amazon.com/s3/index.html)
-[2] [Amazon Simple Storage Service Library](https://github.com/aws/aws-sdk-cpp)
-[3] [The LibZip Library](https://libzip.org/)
-[4] [NetCDF ZARR Data Model Specification](https://www.unidata.ucar.edu/blogs/developer/en/entry/netcdf-zarr-data-model-specification)
-[5] [Python Documentation: 8.3. -collections — High-performance dataset datatypes](https://docs.python.org/2/library/collections.html)
-[6] [Zarr Version 2 Specification](https://zarr.readthedocs.io/en/stable/spec/v2.html)
-[7] [XArray Zarr Encoding Specification](http://xarray.pydata.org/en/latest/internals.html#zarr-encoding-specification)
-[8] [Dynamic Filter Loading](https://support.hdfgroup.org/HDF5/doc/Advanced/DynamicallyLoadedFilters/HDF5DynamicallyLoadedFilters.pdf)
-[9] [Officially Registered Custom HDF5 Filters](https://portal.hdfgroup.org/display/support/Registered+Filter+Plugins)
-[10] [C-Blosc Compressor Implementation](https://github.com/Blosc/c-blosc)
-[11] [Conda-forge packages / aws-sdk-cpp](https://anaconda.org/conda-forge/aws-sdk-cpp)
-[12] [GDAL Zarr](https://gdal.org/drivers/raster/zarr.html)
- + ```` + "https://s2.<region>.amazonaws.com/datasetbucket/rootkey\#mode=nczarr&awsprofile=unidata" + ```` # Appendix A. Building NCZarr Support {#nczarr_build} Currently the following build cases are known to work. Note that this does not include S3 support. -A separate tabulation of S3 support is in the document cloud.md. +A separate tabulation of S3 support is in the document _cloud.md_.
Operating SystemBuild SystemNCZarr @@ -551,24 +547,9 @@ Some of the relevant limits are as follows: Note that the limit is defined in terms of bytes and not (Unicode) characters. This affects the depth to which groups can be nested because the key encodes the full path name of a group. -# Appendix C. NCZarr Version 1 Meta-Data Representation. {#nczarr_version1} +# Appendix C. JSON Attribute Convention. {#nczarr_json} -In NCZarr Version 1, the NCZarr specific metadata was represented using new objects rather than as keys in existing Zarr objects. -Due to conflicts with the Zarr specification, that format is deprecated in favor of the one described above. -However the netcdf-c NCZarr support can still read the version 1 format. - -The version 1 format defines three specific objects: _.nczgroup_, _.nczarray_,_.nczattr_. -These are stored in parallel with the corresponding Zarr objects. So if there is a key of the form "/x/y/.zarray", then there is also a key "/x/y/.nczarray". -The content of these objects is the same as the contents of the corresponding keys. So the value of the ''_NCZARR_ARRAY'' key is the same as the content of the ''.nczarray'' object. The list of connections is as follows: - -* ''.nczarr'' <=> ''_nczarr_superblock_'' -* ''.nczgroup <=> ''_nczarr_group_'' -* ''.nczarray <=> ''_nczarr_array_'' -* ''.nczattr <=> ''_nczarr_attr_'' - -# Appendix D. JSON Attribute Convention. {#nczarr_json} - -The Zarr V2 specification is somewhat vague on what is a legal +The Zarr V2 specification is somewhat vague on what is a legal value for an attribute. The examples all show one of two cases: 1. A simple JSON scalar atomic values (e.g. int, float, char, etc), or 2. A JSON array of such values. @@ -581,7 +562,7 @@ complex JSON expression. An example is the GDAL Driver convention [12], where the value is a complex JSON dictionary. 
-In order for NCZarr to be as consistent as possible with Zarr Version 2, +In order for NCZarr to be as consistent as possible with Zarr, it is desirable to support this convention for attribute values. This means that there must be some way to handle an attribute whose value is not either of the two cases above. That is, its value @@ -611,12 +592,12 @@ There are mutiple cases to consider. 3. The netcdf attribute **is** of type NC_CHAR and its value – taken as a single sequence of characters – **is** parseable as a legal JSON expression. * Parse to produce a JSON expression and write that expression. - * Use "|U1" as the dtype and store in the NCZarr metadata. + * Use "|J0" as the dtype and store in the NCZarr metadata. 4. The netcdf attribute **is** of type NC_CHAR and its value – taken as a single sequence of characters – **is not** parseable as a legal JSON expression. * Convert to a JSON string and write that expression - * Use "|U1" as the dtype and store in the NCZarr metadata. + * Use ">S1" as the dtype and store in the NCZarr metadata. ## Reading an attribute: @@ -640,10 +621,7 @@ and then store it as the equivalent netcdf vector. * If the dtype is not defined, then infer the dtype based on the first JSON value in the array, and then store it as the equivalent netcdf vector. -3. The JSON expression is an array some of whose values are dictionaries or (sub-)arrays. - * Un-parse the expression to an equivalent sequence of characters, and then store it as of type NC_CHAR. - -3. The JSON expression is a dictionary. +3. The attribute is any other JSON structure. * Un-parse the expression to an equivalent sequence of characters, and then store it as of type NC_CHAR. ## Notes @@ -654,7 +632,7 @@ actions "read-write-read" is equivalent to a single "read" and "write-read-write The "almost" caveat is necessary because (1) whitespace may be added or lost during the sequence of operations, and (2) numeric precision may change. -# Appendix E. 
Support for string types +# Appendix D. Support for string types Zarr supports a string type, but it is restricted to fixed size strings. NCZarr also supports such strings, @@ -702,6 +680,182 @@ the above types should always appear as strings, and the type that signals NC_CHAR (in NCZarr) would be handled by Zarr as a string of length 1. + + +# References {#nczarr_bib} + +[1] [Amazon Simple Storage Service Documentation](https://docs.aws.amazon.com/s3/index.html)
+[2] [Amazon Simple Storage Service Library](https://github.com/aws/aws-sdk-cpp)
+[3] [The LibZip Library](https://libzip.org/)
+[4] [NetCDF ZARR Data Model Specification](https://www.unidata.ucar.edu/blogs/developer/en/entry/netcdf-zarr-data-model-specification)
+[5] [Python Documentation: 8.3. +collections — High-performance dataset datatypes](https://docs.python.org/2/library/collections.html)
+[6] [Zarr Version 2 Specification](https://zarr.readthedocs.io/en/stable/spec/v2.html)
+[7] [XArray Zarr Encoding Specification](http://xarray.pydata.org/en/latest/internals.html#zarr-encoding-specification)
+[8] [Dynamic Filter Loading](https://support.hdfgroup.org/HDF5/doc/Advanced/DynamicallyLoadedFilters/HDF5DynamicallyLoadedFilters.pdf)
+[9] [Officially Registered Custom HDF5 Filters](https://portal.hdfgroup.org/display/support/Registered+Filter+Plugins)
+[10] [C-Blosc Compressor Implementation](https://github.com/Blosc/c-blosc)
+[11] [Conda-forge packages / aws-sdk-cpp](https://anaconda.org/conda-forge/aws-sdk-cpp)
+[12] [GDAL Zarr](https://gdal.org/drivers/raster/zarr.html)
+ + # Change Log {#nczarr_changelog} [Note: minor text changes are not included.] @@ -710,6 +864,12 @@ intended to be a detailed chronology. Rather, it provides highlights that will be of interest to NCZarr users. In order to see exact changes, It is necessary to use the 'git diff' command. +## 03/31/2024 +1. Document the change to V2 to using attributes to hold NCZarr metadata. + +## 01/31/2024 +1. Add description of support for Zarr version 3 as an appendix. + ## 3/10/2023 1. Move most of the S3 text to the cloud.md document. @@ -729,4 +889,4 @@ include arbitrary JSON expressions; see Appendix D for more details. __Author__: Dennis Heimbigner
__Email__: dmh at ucar dot edu
__Initial Version__: 4/10/2020
-__Last Revised__: 3/8/2023 +__Last Revised__: 4/02/2024 diff --git a/include/nc4internal.h b/include/nc4internal.h index 56be31086..9a2aac02b 100644 --- a/include/nc4internal.h +++ b/include/nc4internal.h @@ -512,6 +512,5 @@ extern void NC_initialize_reserved(void); #define NC_NCZARR_GROUP "_nczarr_group" #define NC_NCZARR_ARRAY "_nczarr_array" #define NC_NCZARR_ATTR "_nczarr_attr" -#define NC_NCZARR_ATTR_UC "_NCZARR_ATTR" /* deprecated */ #endif /* _NC4INTERNAL_ */ diff --git a/include/ncjson.h b/include/ncjson.h index df24c0a56..1ff9967f4 100644 --- a/include/ncjson.h +++ b/include/ncjson.h @@ -57,7 +57,7 @@ typedef struct NCjson { int sort; /* of this object */ char* string; /* sort != DICT|ARRAY */ struct NCjlist { - int len; + size_t len; struct NCjson** contents; } list; /* sort == DICT|ARRAY */ } NCjson; @@ -108,7 +108,14 @@ OPTEXPORT int NCJaddstring(NCjson* json, int sort, const char* s); OPTEXPORT int NCJappend(NCjson* object, NCjson* value); /* Insert key-value pair into a dict object. key will be copied */ -OPTEXPORT int NCJinsert(NCjson* object, char* key, NCjson* value); +OPTEXPORT int NCJinsert(NCjson* object, const char* key, NCjson* value); + +/* Insert key-value pair as strings into a dict object. + key and value will be copied */ +OPTEXPORT int NCJinsertstring(NCjson* object, const char* key, const char* value); + +/* Insert key-value pair where value is an int */ +OPTEXPORT int NCJinsertint(NCjson* object, const char* key, long long ivalue); /* Unparser to convert NCjson object to text in buffer */ OPTEXPORT int NCJunparse(const NCjson* json, unsigned flags, char** textp); @@ -131,8 +138,10 @@ OPTEXPORT const char* NCJtotext(const NCjson* json); #define NCJsort(x) ((x)->sort) #define NCJstring(x) ((x)->string) #define NCJlength(x) ((x)==NULL ? 0 : (x)->list.len) +#define NCJdictlength(x) ((x)==NULL ? 
0 : (x)->list.len/2) #define NCJcontents(x) ((x)->list.contents) #define NCJith(x,i) ((x)->list.contents[i]) +#define NCJdictith(x,i) ((x)->list.contents[2*i]) /* Setters */ #define NCJsetsort(x,s) (x)->sort=(s) diff --git a/include/netcdf_json.h b/include/netcdf_json.h index 6879edf89..384587265 100644 --- a/include/netcdf_json.h +++ b/include/netcdf_json.h @@ -57,7 +57,7 @@ typedef struct NCjson { int sort; /* of this object */ char* string; /* sort != DICT|ARRAY */ struct NCjlist { - int len; + size_t len; struct NCjson** contents; } list; /* sort == DICT|ARRAY */ } NCjson; @@ -108,7 +108,14 @@ OPTEXPORT int NCJaddstring(NCjson* json, int sort, const char* s); OPTEXPORT int NCJappend(NCjson* object, NCjson* value); /* Insert key-value pair into a dict object. key will be copied */ -OPTEXPORT int NCJinsert(NCjson* object, char* key, NCjson* value); +OPTEXPORT int NCJinsert(NCjson* object, const char* key, NCjson* value); + +/* Insert key-value pair as strings into a dict object. + key and value will be copied */ +OPTEXPORT int NCJinsertstring(NCjson* object, const char* key, const char* value); + +/* Insert key-value pair where value is an int */ +OPTEXPORT int NCJinsertint(NCjson* object, const char* key, long long ivalue); /* Unparser to convert NCjson object to text in buffer */ OPTEXPORT int NCJunparse(const NCjson* json, unsigned flags, char** textp); @@ -131,8 +138,10 @@ OPTEXPORT const char* NCJtotext(const NCjson* json); #define NCJsort(x) ((x)->sort) #define NCJstring(x) ((x)->string) #define NCJlength(x) ((x)==NULL ? 0 : (x)->list.len) +#define NCJdictlength(x) ((x)==NULL ? 
0 : (x)->list.len/2) #define NCJcontents(x) ((x)->list.contents) #define NCJith(x,i) ((x)->list.contents[i]) +#define NCJdictith(x,i) ((x)->list.contents[2*i]) /* Setters */ #define NCJsetsort(x,s) (x)->sort=(s) @@ -278,7 +287,9 @@ static int NCJnewstring(int sort, const char* value, NCjson** jsonp); static int NCJnewstringn(int sort, size_t len, const char* value, NCjson** jsonp); static int NCJclone(const NCjson* json, NCjson** clonep); static int NCJaddstring(NCjson* json, int sort, const char* s); -static int NCJinsert(NCjson* object, char* key, NCjson* jvalue); +static int NCJinsert(NCjson* object, const char* key, NCjson* jvalue); +static int NCJinsertstring(NCjson* object, const char* key, const char* value); +static int NCJinsertint(NCjson* object, const char* key, long long ivalue); static int NCJappend(NCjson* object, NCjson* value); static int NCJunparse(const NCjson* json, unsigned flags, char** textp); #else /*!NETCDF_JSON_H*/ @@ -1050,7 +1061,7 @@ done: /* Insert key-value pair into a dict object. key will be strdup'd */ OPTSTATIC int -NCJinsert(NCjson* object, char* key, NCjson* jvalue) +NCJinsert(NCjson* object, const char* key, NCjson* jvalue) { int stat = NCJ_OK; NCjson* jkey = NULL; @@ -1063,6 +1074,36 @@ done: return NCJTHROW(stat); } +/* Insert key-value pair as strings into a dict object. 
+ key and value will be strdup'd */ +OPTSTATIC int +NCJinsertstring(NCjson* object, const char* key, const char* value) +{ + int stat = NCJ_OK; + NCjson* jvalue = NULL; + if(value == NULL) + NCJnew(NCJ_NULL,&jvalue); + else + NCJnewstring(NCJ_STRING,value,&jvalue); + NCJinsert(object,key,jvalue); +done: + return NCJTHROW(stat); +} + +/* Insert key-value pair with value being an integer */ +OPTSTATIC int +NCJinsertint(NCjson* object, const char* key, long long ivalue) +{ + int stat = NCJ_OK; + NCjson* jvalue = NULL; + char digits[128]; + snprintf(digits,sizeof(digits),"%lld",ivalue); + NCJnewstring(NCJ_STRING,digits,&jvalue); + NCJinsert(object,key,jvalue); +done: + return NCJTHROW(stat); +} + /* Append value to an array or dict object. */ OPTSTATIC int NCJappend(NCjson* object, NCjson* value) diff --git a/libdispatch/ncjson.c b/libdispatch/ncjson.c index 363b24ffe..349292d13 100644 --- a/libdispatch/ncjson.c +++ b/libdispatch/ncjson.c @@ -128,7 +128,9 @@ static int NCJnewstring(int sort, const char* value, NCjson** jsonp); static int NCJnewstringn(int sort, size_t len, const char* value, NCjson** jsonp); static int NCJclone(const NCjson* json, NCjson** clonep); static int NCJaddstring(NCjson* json, int sort, const char* s); -static int NCJinsert(NCjson* object, char* key, NCjson* jvalue); +static int NCJinsert(NCjson* object, const char* key, NCjson* jvalue); +static int NCJinsertstring(NCjson* object, const char* key, const char* value); +static int NCJinsertint(NCjson* object, const char* key, long long ivalue); static int NCJappend(NCjson* object, NCjson* value); static int NCJunparse(const NCjson* json, unsigned flags, char** textp); #else /*!NETCDF_JSON_H*/ @@ -900,7 +902,7 @@ done: /* Insert key-value pair into a dict object. 
key will be strdup'd */ OPTSTATIC int -NCJinsert(NCjson* object, char* key, NCjson* jvalue) +NCJinsert(NCjson* object, const char* key, NCjson* jvalue) { int stat = NCJ_OK; NCjson* jkey = NULL; @@ -913,6 +915,36 @@ done: return NCJTHROW(stat); } +/* Insert key-value pair as strings into a dict object. + key and value will be strdup'd */ +OPTSTATIC int +NCJinsertstring(NCjson* object, const char* key, const char* value) +{ + int stat = NCJ_OK; + NCjson* jvalue = NULL; + if(value == NULL) + NCJnew(NCJ_NULL,&jvalue); + else + NCJnewstring(NCJ_STRING,value,&jvalue); + NCJinsert(object,key,jvalue); +done: + return NCJTHROW(stat); +} + +/* Insert key-value pair with value being an integer */ +OPTSTATIC int +NCJinsertint(NCjson* object, const char* key, long long ivalue) +{ + int stat = NCJ_OK; + NCjson* jvalue = NULL; + char digits[128]; + snprintf(digits,sizeof(digits),"%lld",ivalue); + NCJnewstring(NCJ_STRING,digits,&jvalue); + NCJinsert(object,key,jvalue); +done: + return NCJTHROW(stat); +} + /* Append value to an array or dict object. 
*/ OPTSTATIC int NCJappend(NCjson* object, NCjson* value) diff --git a/libdispatch/ncs3sdk_h5.c b/libdispatch/ncs3sdk_h5.c index f8263293b..0f99dfb47 100644 --- a/libdispatch/ncs3sdk_h5.c +++ b/libdispatch/ncs3sdk_h5.c @@ -122,7 +122,7 @@ NC_s3sdkinitialize(void) } /* Get environment information */ - NC_s3sdkenvironment(void); + NC_s3sdkenvironment(); return NC_NOERR; } diff --git a/libnczarr/zarr.c b/libnczarr/zarr.c index 832b0d7c4..a37a1d024 100644 --- a/libnczarr/zarr.c +++ b/libnczarr/zarr.c @@ -269,8 +269,8 @@ ncz_open_rootgroup(NC_FILE_INFO_T* dataset) if((stat=nczm_concat(NULL,ZGROUP,&rootpath))) goto done; - if((stat = NCZ_downloadjson(zfile->map, rootpath, &json))) - goto done; + if((stat = NCZ_downloadjson(zfile->map, rootpath, &json))) goto done; + if(json == NULL) goto done; /* Process the json */ for(i=0;icontents);i+=2) { const NCjson* key = nclistget(json->contents,i); @@ -315,7 +315,7 @@ applycontrols(NCZ_FILE_INFO_T* zinfo) int stat = NC_NOERR; const char* value = NULL; NClist* modelist = nclistnew(); - int noflags = 0; /* track non-default negative flags */ + size64_t noflags = 0; /* track non-default negative flags */ if((value = controllookup(zinfo->controllist,"mode")) != NULL) { if((stat = NCZ_comma_parse(value,modelist))) goto done; diff --git a/libnczarr/zarr.h b/libnczarr/zarr.h index 22dd2d1cf..9eedc3bff 100644 --- a/libnczarr/zarr.h +++ b/libnczarr/zarr.h @@ -49,7 +49,7 @@ EXTERNL int NCZ_stringconvert(nc_type typid, size_t len, void* data0, NCjson** j /* zsync.c */ EXTERNL int ncz_sync_file(NC_FILE_INFO_T* file, int isclose); EXTERNL int ncz_sync_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, int isclose); -EXTERNL int ncz_sync_atts(NC_FILE_INFO_T*, NC_OBJ* container, NCindex* attlist, int isclose); +EXTERNL int ncz_sync_atts(NC_FILE_INFO_T*, NC_OBJ* container, NCindex* attlist, NCjson* jatts, NCjson* jtypes, int isclose); EXTERNL int ncz_read_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp); EXTERNL int ncz_read_atts(NC_FILE_INFO_T* file, 
NC_OBJ* container); EXTERNL int ncz_read_vars(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp); @@ -62,8 +62,6 @@ EXTERNL int NCZ_grpkey(const NC_GRP_INFO_T* grp, char** pathp); EXTERNL int NCZ_varkey(const NC_VAR_INFO_T* var, char** pathp); EXTERNL int NCZ_dimkey(const NC_DIM_INFO_T* dim, char** pathp); EXTERNL int ncz_splitkey(const char* path, NClist* segments); -EXTERNL int NCZ_readdict(NCZMAP* zmap, const char* key, NCjson** jsonp); -EXTERNL int NCZ_readarray(NCZMAP* zmap, const char* key, NCjson** jsonp); EXTERNL int ncz_nctypedecode(const char* snctype, nc_type* nctypep); EXTERNL int ncz_nctype2dtype(nc_type nctype, int endianness, int purezarr,int len, char** dnamep); EXTERNL int ncz_dtype2nctype(const char* dtype, nc_type typehint, int purezarr, nc_type* nctypep, int* endianp, int* typelenp); diff --git a/libnczarr/zattr.c b/libnczarr/zattr.c index 29d8e693f..7f3ef5545 100644 --- a/libnczarr/zattr.c +++ b/libnczarr/zattr.c @@ -51,7 +51,7 @@ ncz_getattlist(NC_GRP_INFO_T *grp, int varid, NC_VAR_INFO_T **varp, NCindex **at { NC_VAR_INFO_T *var; - if (!(var = (NC_VAR_INFO_T *)ncindexith(grp->vars, varid))) + if (!(var = (NC_VAR_INFO_T *)ncindexith(grp->vars, (size_t)varid))) return NC_ENOTVAR; assert(var->hdr.id == varid); @@ -120,7 +120,7 @@ ncz_get_att_special(NC_FILE_INFO_T* h5, NC_VAR_INFO_T* var, const char* name, /* The global reserved attributes */ if(strcmp(name,NCPROPS)==0) { - int len; + size_t len; if(h5->provenance.ncproperties == NULL) {stat = NC_ENOTATT; goto done;} if(mem_type == NC_NAT) mem_type = NC_CHAR; @@ -138,7 +138,7 @@ ncz_get_att_special(NC_FILE_INFO_T* h5, NC_VAR_INFO_T* var, const char* name, if(strcmp(name,SUPERBLOCKATT)==0) iv = (unsigned long long)h5->provenance.superblockversion; else /* strcmp(name,ISNETCDF4ATT)==0 */ - iv = NCZ_isnetcdf4(h5); + iv = (unsigned long long)NCZ_isnetcdf4(h5); if(mem_type == NC_NAT) mem_type = NC_INT; if(data) switch (mem_type) { @@ -279,8 +279,8 @@ NCZ_del_att(int ncid, int varid, const char *name) 
NC_FILE_INFO_T *h5; NC_ATT_INFO_T *att; NCindex* attlist = NULL; - int i; - size_t deletedid; + size_t i; + int deletedid; int retval; /* Name must be provided. */ @@ -516,7 +516,7 @@ ncz_put_att(NC_GRP_INFO_T* grp, int varid, const char *name, nc_type file_type, /* For an existing att, if we're not in define mode, the len must not be greater than the existing len for classic model. */ if (!(h5->flags & NC_INDEF) && - len * nc4typelen(file_type) > (size_t)att->len * nc4typelen(att->nc_typeid)) + len * (size_t)nc4typelen(file_type) > (size_t)att->len * (size_t)nc4typelen(att->nc_typeid)) { if (h5->cmode & NC_CLASSIC_MODEL) return NC_ENOTINDEFINE; @@ -980,7 +980,7 @@ int ncz_create_fillvalue(NC_VAR_INFO_T* var) { int stat = NC_NOERR; - int i; + size_t i; NC_ATT_INFO_T* fv = NULL; /* Have the var's attributes been read? */ diff --git a/libnczarr/zchunking.c b/libnczarr/zchunking.c index da0da5951..442f53e0c 100644 --- a/libnczarr/zchunking.c +++ b/libnczarr/zchunking.c @@ -258,7 +258,7 @@ NCZ_compute_all_slice_projections( NCZSliceProjections* results) { int stat = NC_NOERR; - size64_t r; + int r; for(r=0;rrank;r++) { /* Compute each of the rank SliceProjections instances */ diff --git a/libnczarr/zclose.c b/libnczarr/zclose.c index 7515bfcce..7d82ceeed 100644 --- a/libnczarr/zclose.c +++ b/libnczarr/zclose.c @@ -72,7 +72,7 @@ zclose_group(NC_GRP_INFO_T *grp) { int stat = NC_NOERR; NCZ_GRP_INFO_T* zgrp; - int i; + size_t i; assert(grp && grp->format_grp_info != NULL); LOG((3, "%s: grp->name %s", __func__, grp->hdr.name)); @@ -123,7 +123,7 @@ zclose_gatts(NC_GRP_INFO_T* grp) { int stat = NC_NOERR; NC_ATT_INFO_T *att; - int a; + size_t a; for(a = 0; a < ncindexsize(grp->att); a++) { NCZ_ATT_INFO_T* zatt = NULL; att = (NC_ATT_INFO_T* )ncindexith(grp->att, a); @@ -149,7 +149,7 @@ NCZ_zclose_var1(NC_VAR_INFO_T* var) int stat = NC_NOERR; NCZ_VAR_INFO_T* zvar; NC_ATT_INFO_T* att; - int a; + size_t a; assert(var && var->format_var_info); zvar = var->format_var_info;; @@ 
-191,7 +191,7 @@ zclose_vars(NC_GRP_INFO_T* grp) { int stat = NC_NOERR; NC_VAR_INFO_T* var; - int i; + size_t i; for(i = 0; i < ncindexsize(grp->vars); i++) { var = (NC_VAR_INFO_T*)ncindexith(grp->vars, i); @@ -215,7 +215,7 @@ zclose_dims(NC_GRP_INFO_T* grp) { int stat = NC_NOERR; NC_DIM_INFO_T* dim; - int i; + size_t i; for(i = 0; i < ncindexsize(grp->dim); i++) { NCZ_DIM_INFO_T* zdim; @@ -265,7 +265,7 @@ static int zclose_types(NC_GRP_INFO_T* grp) { int stat = NC_NOERR; - int i; + size_t i; NC_TYPE_INFO_T* type; for(i = 0; i < ncindexsize(grp->type); i++) @@ -289,7 +289,7 @@ static int zwrite_vars(NC_GRP_INFO_T *grp) { int stat = NC_NOERR; - int i; + size_t i; assert(grp && grp->format_grp_info != NULL); LOG((3, "%s: grp->name %s", __func__, grp->hdr.name)); diff --git a/libnczarr/zinternal.h b/libnczarr/zinternal.h index 3c3f706f9..54fb3e877 100644 --- a/libnczarr/zinternal.h +++ b/libnczarr/zinternal.h @@ -22,7 +22,6 @@ #define NCZ_CHUNKSIZE_FACTOR (10) #define NCZ_MIN_CHUNK_SIZE (2) - /**************************************************/ /* Constants */ @@ -39,56 +38,43 @@ # endif #endif -/* V1 reserved objects */ -#define NCZMETAROOT "/.nczarr" -#define NCZGROUP ".nczgroup" -#define NCZARRAY ".nczarray" -#define NCZATTRS ".nczattrs" -/* Deprecated */ -#define NCZVARDEP ".nczvar" -#define NCZATTRDEP ".nczattr" - #define ZMETAROOT "/.zgroup" +#define ZMETAATTR "/.zattrs" #define ZGROUP ".zgroup" #define ZATTRS ".zattrs" #define ZARRAY ".zarray" -/* Pure Zarr pseudo names */ -#define ZDIMANON "_zdim" - /* V2 Reserved Attributes */ /* -Inserted into /.zgroup +For nczarr version 2.x.x, the following (key,value) +pairs are stored in .zgroup and/or .zarray. + +Inserted into /.zattrs in root group _nczarr_superblock: {"version": "2.0.0"} -Inserted into any .zgroup + +Inserted into any group level .zattrs "_nczarr_group": "{ -\"dimensions\": {\"d1\": \"1\", \"d2\": \"1\",...} -\"variables\": [\"v1\", \"v2\", ...] 
+\"dimensions\": [{name: , size: , unlimited: 1|0},...], +\"arrays\": [\"v1\", \"v2\", ...] \"groups\": [\"g1\", \"g2\", ...] }" -Inserted into any .zarray + +Inserted into any array level .zattrs "_nczarr_array": "{ -\"dimensions\": [\"/g1/g2/d1\", \"/d2\",...] -\"storage\": \"scalar\"|\"contiguous\"|\"compact\"|\"chunked\" +\"dimension_references\": [\"/g1/g2/d1\", \"/d2\",...] +\"storage\": \"scalar\"|\"contiguous\"|\"chunked\" }" -Inserted into any .zattrs ? or should it go into the container? + +Inserted into any .zattrs "_nczarr_attr": "{ \"types\": {\"attr1\": \" NC_CHAR. -+ */ #define NCZ_V2_SUPERBLOCK "_nczarr_superblock" #define NCZ_V2_GROUP "_nczarr_group" #define NCZ_V2_ARRAY "_nczarr_array" -#define NCZ_V2_ATTR NC_NCZARR_ATTR - -#define NCZ_V2_SUPERBLOCK_UC "_NCZARR_SUPERBLOCK" -#define NCZ_V2_GROUP_UC "_NCZARR_GROUP" -#define NCZ_V2_ARRAY_UC "_NCZARR_ARRAY" -#define NCZ_V2_ATTR_UC NC_NCZARR_ATTR_UC +#define NCZ_V2_ATTR "_nczarr_attr" /* Must match value in include/nc4internal.h */ #define NCZARRCONTROL "nczarr" #define PUREZARRCONTROL "zarr" diff --git a/libnczarr/zsync.c b/libnczarr/zsync.c index 4d8ee9d9c..0f382c62e 100644 --- a/libnczarr/zsync.c +++ b/libnczarr/zsync.c @@ -8,7 +8,7 @@ #include #ifndef nulldup - #define nulldup(x) ((x)?strdup(x):(x)) +#define nulldup(x) ((x)?strdup(x):(x)) #endif #undef FILLONCLOSE @@ -21,15 +21,16 @@ static int ncz_collect_dims(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NCjson** jdimsp); static int ncz_sync_var(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose); -static int load_jatts(NCZMAP* map, NC_OBJ* container, int nczarrv1, NCjson** jattrsp, NClist** atypes); +static int download_jatts(NCZMAP* map, NC_OBJ* container, NCjson** jattsp, const NCjson** jtypesp, const NCjson** jnczgrpp, const NCjson** jnczarrayp); static int zconvert(NCjson* src, nc_type typeid, size_t typelen, int* countp, NCbytes* dst); -static int computeattrinfo(const char* name, NClist* atypes, nc_type typehint, int purezarr, NCjson* 
values, +static int computeattrinfo(const char* name, const NCjson* jtypes, nc_type typehint, int purezarr, NCjson* values, nc_type* typeidp, size_t* typelenp, size_t* lenp, void** datap); static int parse_group_content(NCjson* jcontent, NClist* dimdefs, NClist* varnames, NClist* subgrps); static int parse_group_content_pure(NCZ_FILE_INFO_T* zinfo, NC_GRP_INFO_T* grp, NClist* varnames, NClist* subgrps); static int define_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp); static int define_dims(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* diminfo); static int define_vars(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* varnames); +static int define_var1(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, const char* varname); static int define_subgrps(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* subgrpnames); static int searchvars(NCZ_FILE_INFO_T*, NC_GRP_INFO_T*, NClist*); static int searchsubgrps(NCZ_FILE_INFO_T*, NC_GRP_INFO_T*, NClist*); @@ -40,9 +41,11 @@ static int decodeints(NCjson* jshape, size64_t* shapes); static int computeattrdata(nc_type typehint, nc_type* typeidp, NCjson* values, size_t* typelenp, size_t* lenp, void** datap); static int computedimrefs(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int purezarr, int xarray, int ndims, NClist* dimnames, size64_t* shapes, NC_DIM_INFO_T** dims); static int json_convention_read(NCjson* jdict, NCjson** jtextp); -static int jtypes2atypes(NCjson* jtypes, NClist* atypes); - static int ncz_validate(NC_FILE_INFO_T* file); +static int insert_attr(NCjson* jatts, NCjson* jtypes, const char* aname, NCjson* javalue, const char* atype); +static int insert_nczarr_attr(NCjson* jatts, NCjson* jtypes); +static int upload_attrs(NC_FILE_INFO_T* file, NC_OBJ* container, NCjson* jatts); +static int readdict(NCZMAP* zmap, const char* key, NCjson** jsonp); /**************************************************/ /**************************************************/ @@ -93,7 +96,8 @@ done: static int ncz_collect_dims(NC_FILE_INFO_T* file, 
NC_GRP_INFO_T* grp, NCjson** jdimsp) { - int i, stat=NC_NOERR; + int stat=NC_NOERR; + size_t i; NCjson* jdims = NULL; NCjson* jdimsize = NULL; NCjson* jdimargs = NULL; @@ -144,7 +148,8 @@ done: int ncz_sync_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, int isclose) { - int i,stat = NC_NOERR; + int stat = NC_NOERR; + size_t i; NCZ_FILE_INFO_T* zinfo = NULL; char version[1024]; int purezarr = 0; @@ -156,8 +161,11 @@ ncz_sync_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, int isclose) NCjson* jdims = NULL; NCjson* jvars = NULL; NCjson* jsubgrps = NULL; + NCjson* jnczgrp = NULL; NCjson* jsuper = NULL; NCjson* jtmp = NULL; + NCjson* jatts = NULL; + NCjson* jtypes = NULL; LOG((3, "%s: dims: %s", __func__, key)); ZTRACE(3,"file=%s grp=%s isclose=%d",file->controller->path,grp->hdr.name,isclose); @@ -171,7 +179,29 @@ ncz_sync_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, int isclose) if((stat = NCZ_grpkey(grp,&fullpath))) goto done; + /* build ZGROUP contents */ + if((stat = NCJnew(NCJ_DICT,&jgroup))) + goto done; + snprintf(version,sizeof(version),"%d",zinfo->zarr.zarr_version); + if((stat = NCJaddstring(jgroup,NCJ_STRING,"zarr_format"))) goto done; + if((stat = NCJaddstring(jgroup,NCJ_INT,version))) goto done; + /* build ZGROUP path */ + if((stat = nczm_concat(fullpath,ZGROUP,&key))) + goto done; + /* Write to map */ + if((stat=NCZ_uploadjson(map,key,jgroup))) goto done; + nullfree(key); key = NULL; + if(!purezarr) { + if(grp->parent == NULL) { /* Root group */ + /* create superblock */ + snprintf(version,sizeof(version),"%lu.%lu.%lu", + zinfo->zarr.nczarr_version.major, + zinfo->zarr.nczarr_version.minor, + zinfo->zarr.nczarr_version.release); + if((stat = NCJnew(NCJ_DICT,&jsuper))) goto done; + if((stat = NCJinsertstring(jsuper,"version",version))) goto done; + } /* Create dimensions dict */ if((stat = ncz_collect_dims(file,grp,&jdims))) goto done; @@ -191,54 +221,43 @@ ncz_sync_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, int isclose) if((stat = 
NCJaddstring(jsubgrps,NCJ_STRING,g->hdr.name))) goto done; } /* Create the "_nczarr_group" dict */ - if((stat = NCJnew(NCJ_DICT,&json))) + if((stat = NCJnew(NCJ_DICT,&jnczgrp))) goto done; /* Insert the various dicts and arrays */ - if((stat = NCJinsert(json,"dims",jdims))) goto done; + if((stat = NCJinsert(jnczgrp,"dimensions",jdims))) goto done; jdims = NULL; /* avoid memory problems */ - if((stat = NCJinsert(json,"vars",jvars))) goto done; + if((stat = NCJinsert(jnczgrp,"arrays",jvars))) goto done; jvars = NULL; /* avoid memory problems */ - if((stat = NCJinsert(json,"groups",jsubgrps))) goto done; + if((stat = NCJinsert(jnczgrp,"groups",jsubgrps))) goto done; jsubgrps = NULL; /* avoid memory problems */ } - /* build ZGROUP contents */ - if((stat = NCJnew(NCJ_DICT,&jgroup))) - goto done; - snprintf(version,sizeof(version),"%d",zinfo->zarr.zarr_version); - if((stat = NCJaddstring(jgroup,NCJ_STRING,"zarr_format"))) goto done; - if((stat = NCJaddstring(jgroup,NCJ_INT,version))) goto done; - if(!purezarr && grp->parent == NULL) { /* Root group */ - snprintf(version,sizeof(version),"%lu.%lu.%lu", - zinfo->zarr.nczarr_version.major, - zinfo->zarr.nczarr_version.minor, - zinfo->zarr.nczarr_version.release); - if((stat = NCJnew(NCJ_DICT,&jsuper))) goto done; - if((stat-NCJnewstring(NCJ_STRING,version,&jtmp))) goto done; - if((stat = NCJinsert(jsuper,"version",jtmp))) goto done; - jtmp = NULL; - if((stat = NCJinsert(jgroup,NCZ_V2_SUPERBLOCK,jsuper))) goto done; + /* Build the .zattrs object */ + assert(grp->att); + NCJnew(NCJ_DICT,&jatts); + NCJnew(NCJ_DICT,&jtypes); + if((stat = ncz_sync_atts(file, (NC_OBJ*)grp, grp->att, jatts, jtypes, isclose))) goto done; + + if(!purezarr && jnczgrp != NULL) { + /* Insert _nczarr_group */ + if((stat=insert_attr(jatts,jtypes,NCZ_V2_GROUP,jnczgrp,"|J0"))) goto done; + jnczgrp = NULL; + } + + if(!purezarr && jsuper != NULL) { + /* Insert superblock */ + if((stat=insert_attr(jatts,jtypes,NCZ_V2_SUPERBLOCK,jsuper,"|J0"))) goto done; 
jsuper = NULL; } - if(!purezarr) { - /* Insert the "_NCZARR_GROUP" dict */ - if((stat = NCJinsert(jgroup,NCZ_V2_GROUP,json))) goto done; - json = NULL; + /* As a last mod to jatts, insert the jtypes as an attribute */ + if(!purezarr && jtypes != NULL) { + if((stat = insert_nczarr_attr(jatts,jtypes))) goto done; + jtypes = NULL; } - /* build ZGROUP path */ - if((stat = nczm_concat(fullpath,ZGROUP,&key))) - goto done; - /* Write to map */ - if((stat=NCZ_uploadjson(map,key,jgroup))) - goto done; - nullfree(key); key = NULL; - - /* Build the .zattrs object */ - assert(grp->att); - if((stat = ncz_sync_atts(file,(NC_OBJ*)grp, grp->att, isclose))) - goto done; + /* Write out the .zattrs */ + if((stat = upload_attrs(file,(NC_OBJ*)grp,jatts))) goto done; /* Now synchronize all the variables */ for(i=0; ivars); i++) { @@ -260,6 +279,9 @@ done: NCJreclaim(jdims); NCJreclaim(jvars); NCJreclaim(jsubgrps); + NCJreclaim(jnczgrp); + NCJreclaim(jtypes); + NCJreclaim(jatts); nullfree(fullpath); nullfree(key); return ZUNTRACE(THROW(stat)); @@ -292,6 +314,8 @@ ncz_sync_var_meta(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose) NCjson* jdimrefs = NULL; NCjson* jtmp = NULL; NCjson* jfill = NULL; + NCjson* jatts = NULL; + NCjson* jtypes = NULL; char* dtypename = NULL; int purezarr = 0; size64_t shape[NC_MAX_VAR_DIMS]; @@ -465,6 +489,15 @@ ncz_sync_var_meta(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose) jtmp = NULL; } + /* build .zarray path */ + if((stat = nczm_concat(fullpath,ZARRAY,&key))) + goto done; + + /* Write to map */ + if((stat=NCZ_uploadjson(map,key,jvar))) + goto done; + nullfree(key); key = NULL; + /* Capture dimref names as FQNs */ if(var->ndims > 0) { if((dimrefs = nclistnew())==NULL) {stat = NC_ENOMEM; goto done;} @@ -504,28 +537,30 @@ ncz_sync_var_meta(NC_FILE_INFO_T* file, NC_VAR_INFO_T* var, int isclose) if((stat = NCJnewstring(NCJ_STRING,"chunked",&jtmp)))goto done; if((stat = NCJinsert(jncvar,"storage",jtmp))) goto done; jtmp = NULL; - - 
if(!(zinfo->controls.flags & FLAG_PUREZARR)) { - if((stat = NCJinsert(jvar,NCZ_V2_ARRAY,jncvar))) goto done; - jncvar = NULL; - } } - /* build .zarray path */ - if((stat = nczm_concat(fullpath,ZARRAY,&key))) - goto done; - - /* Write to map */ - if((stat=NCZ_uploadjson(map,key,jvar))) - goto done; - nullfree(key); key = NULL; - - var->created = 1; - /* Build .zattrs object */ assert(var->att); - if((stat = ncz_sync_atts(file,(NC_OBJ*)var, var->att, isclose))) - goto done; + NCJnew(NCJ_DICT,&jatts); + NCJnew(NCJ_DICT,&jtypes); + if((stat = ncz_sync_atts(file,(NC_OBJ*)var, var->att, jatts, jtypes, isclose))) goto done; + + if(!purezarr && jncvar != NULL) { + /* Insert _nczarr_array */ + if((stat=insert_attr(jatts,jtypes,NCZ_V2_ARRAY,jncvar,"|J0"))) goto done; + jncvar = NULL; + } + + /* As a last mod to jatts, optionally insert the jtypes as an attribute and add _nczarr_attr as attribute*/ + if(!purezarr && jtypes != NULL) { + if((stat = insert_nczarr_attr(jatts,jtypes))) goto done; + jtypes = NULL; + } + + /* Write out the .zattrs */ + if((stat = upload_attrs(file,(NC_OBJ*)var,jatts))) goto done; + + var->created = 1; done: nclistfreeall(dimrefs); @@ -537,6 +572,8 @@ done: NCJreclaim(jncvar); NCJreclaim(jtmp); NCJreclaim(jfill); + NCJreclaim(jatts); + NCJreclaim(jtypes); return ZUNTRACE(THROW(stat)); } @@ -653,25 +690,25 @@ done: /** * @internal Synchronize attribute data from memory to map. * + * @param file * @param container Pointer to grp|var struct containing the attributes - * @param key the name of the map entry + * @param attlist + * @param jattsp + * @param jtypesp * * @return ::NC_NOERR No error. 
* @author Dennis Heimbigner */ int -ncz_sync_atts(NC_FILE_INFO_T* file, NC_OBJ* container, NCindex* attlist, int isclose) +ncz_sync_atts(NC_FILE_INFO_T* file, NC_OBJ* container, NCindex* attlist, NCjson* jatts, NCjson* jtypes, int isclose) { - int i,stat = NC_NOERR; + int stat = NC_NOERR; + size_t i; NCZ_FILE_INFO_T* zinfo = NULL; - NCjson* jatts = NULL; - NCjson* jtypes = NULL; - NCjson* jtype = NULL; NCjson* jdimrefs = NULL; NCjson* jdict = NULL; NCjson* jint = NULL; NCjson* jdata = NULL; - NCZMAP* map = NULL; char* fullpath = NULL; char* key = NULL; char* content = NULL; @@ -684,6 +721,8 @@ ncz_sync_atts(NC_FILE_INFO_T* file, NC_OBJ* container, NCindex* attlist, int isc int purezarr = 0; int endianness = (NC_isLittleEndian()?NC_ENDIAN_LITTLE:NC_ENDIAN_BIG); + NC_UNUSED(isclose); + LOG((3, "%s", __func__)); ZTRACE(3,"file=%s container=%s |attlist|=%u",file->controller->path,container->name,(unsigned)ncindexsize(attlist)); @@ -696,46 +735,33 @@ ncz_sync_atts(NC_FILE_INFO_T* file, NC_OBJ* container, NCindex* attlist, int isc } zinfo = file->format_file_info; - map = zinfo->map; - purezarr = (zinfo->controls.flags & FLAG_PUREZARR)?1:0; if(zinfo->controls.flags & FLAG_XARRAYDIMS) isxarray = 1; - /* Create the attribute dictionary */ - if((stat = NCJnew(NCJ_DICT,&jatts))) goto done; - if(ncindexsize(attlist) > 0) { - /* Create the jncattr.types object */ - if((stat = NCJnew(NCJ_DICT,&jtypes))) - goto done; /* Walk all the attributes convert to json and collect the dtype */ for(i=0;ihdr.name); - /* If reserved and hidden, then ignore */ - if(ra && (ra->flags & HIDDENATTRFLAG)) continue; -#endif if(a->nc_typeid > NC_MAX_ATOMIC_TYPE) {stat = (THROW(NC_ENCZARR)); goto done;} if(a->nc_typeid == NC_STRING) - typesize = NCZ_get_maxstrlen(container); + typesize = (size_t)NCZ_get_maxstrlen(container); else {if((stat = NC4_inq_atomic_type(a->nc_typeid,NULL,&typesize))) goto done;} /* Convert to storable json */ if((stat = NCZ_stringconvert(a->nc_typeid,a->len,a->data,&jdata))) 
goto done; - if((stat = NCJinsert(jatts,a->hdr.name,jdata))) goto done; - jdata = NULL; /* Collect the corresponding dtype */ - { - if((stat = ncz_nctype2dtype(a->nc_typeid,endianness,purezarr,typesize,&tname))) goto done; - if((stat = NCJnewstring(NCJ_STRING,tname,&jtype))) goto done; - nullfree(tname); tname = NULL; - if((stat = NCJinsert(jtypes,a->hdr.name,jtype))) goto done; /* add {name: type} */ - jtype = NULL; - } + if((stat = ncz_nctype2dtype(a->nc_typeid,endianness,purezarr,typesize,&tname))) goto done; + + /* Insert the attribute; consumes jdata */ + if((stat = insert_attr(jatts,jtypes,a->hdr.name, jdata, tname))) goto done; + + /* cleanup */ + nullfree(tname); tname = NULL; + jdata = NULL; + } } @@ -805,36 +831,12 @@ ncz_sync_atts(NC_FILE_INFO_T* file, NC_OBJ* container, NCindex* attlist, int isc } } - if(NCJlength(jatts) > 0) { - if(!(zinfo->controls.flags & FLAG_PUREZARR)) { - /* Insert the _NCZARR_ATTR attribute */ - if((stat = NCJnew(NCJ_DICT,&jdict))) - goto done; - if(jtypes != NULL) - {if((stat = NCJinsert(jdict,"types",jtypes))) goto done;} - jtypes = NULL; - if(jdict != NULL) - {if((stat = NCJinsert(jatts,NCZ_V2_ATTR,jdict))) goto done;} - jdict = NULL; - } - /* write .zattrs path */ - if((stat = nczm_concat(fullpath,ZATTRS,&key))) - goto done; - /* Write to map */ - if((stat=NCZ_uploadjson(map,key,jatts))) - goto done; - nullfree(key); key = NULL; - } - done: nullfree(fullpath); nullfree(key); nullfree(content); nullfree(dimpath); nullfree(tname); - NCJreclaim(jatts); - NCJreclaim(jtypes); - NCJreclaim(jtype); NCJreclaim(jdimrefs); NCJreclaim(jdict); NCJreclaim(jint); @@ -850,97 +852,86 @@ done: the corresponding NCjson dict. 
@param map - [in] the map object for storage @param container - [in] the containing object -@param jattrsp - [out] the json for .zattrs -@param jtypesp - [out] the json for .ztypes +@param jattsp - [out] the json for .zattrs || NULL if not found +@param jtypesp - [out] the json attribute type dict || NULL +@param jnczgrp - [out] the json for _nczarr_group || NULL +@param jnczarray - [out] the json for _nczarr_array || NULL @return NC_NOERR +@return NC_EXXX @author Dennis Heimbigner */ static int -load_jatts(NCZMAP* map, NC_OBJ* container, int nczarrv1, NCjson** jattrsp, NClist** atypesp) +download_jatts(NCZMAP* map, NC_OBJ* container, NCjson** jattsp, const NCjson** jtypesp, const NCjson** jnczgrpp, const NCjson** jnczarrayp) { int stat = NC_NOERR; char* fullpath = NULL; char* key = NULL; - NCjson* jnczarr = NULL; - NCjson* jattrs = NULL; - NCjson* jncattr = NULL; - NClist* atypes = NULL; /* envv list */ + NCjson* jatts = NULL; + const NCjson* jtypes = NULL; + const NCjson* jnczgrp = NULL; + const NCjson* jnczarray = NULL; + const NCjson* jnczattr = NULL; + NC_GRP_INFO_T* grp = NULL; + NC_VAR_INFO_T* var = NULL; - ZTRACE(3,"map=%p container=%s nczarrv1=%d",map,container->name,nczarrv1); - - /* alway return (possibly empty) list of types */ - atypes = nclistnew(); + ZTRACE(3,"map=%p container=%s ",map,container->name); if(container->sort == NCGRP) { - NC_GRP_INFO_T* grp = (NC_GRP_INFO_T*)container; + grp = (NC_GRP_INFO_T*)container; /* Get grp's fullpath name */ - if((stat = NCZ_grpkey(grp,&fullpath))) - goto done; + if((stat = NCZ_grpkey(grp,&fullpath))) goto done; } else { - NC_VAR_INFO_T* var = (NC_VAR_INFO_T*)container; + var = (NC_VAR_INFO_T*)container; /* Get var's fullpath name */ - if((stat = NCZ_varkey(var,&fullpath))) - goto done; + if((stat = NCZ_varkey(var,&fullpath))) goto done; } /* Construct the path to the .zattrs object */ if((stat = nczm_concat(fullpath,ZATTRS,&key))) goto done; - /* Download the .zattrs object: may not exist if not NCZarr V1 */ - 
switch ((stat=NCZ_downloadjson(map,key,&jattrs))) { - case NC_NOERR: break; - case NC_EEMPTY: stat = NC_NOERR; break; /* did not exist */ - default: goto done; /* failure */ - } + /* Download the .zattrs object */ + if((stat=NCZ_downloadjson(map,key,&jatts))) goto done; nullfree(key); key = NULL; - if(jattrs != NULL) { - if(nczarrv1) { - /* Construct the path to the NCZATTRS object */ - if((stat = nczm_concat(fullpath,NCZATTRS,&key))) goto done; - /* Download the NCZATTRS object: may not exist if pure zarr or using deprecated name */ - stat=NCZ_downloadjson(map,key,&jncattr); - if(stat == NC_EEMPTY) { - /* try deprecated name */ - nullfree(key); key = NULL; - if((stat = nczm_concat(fullpath,NCZATTRDEP,&key))) goto done; - stat=NCZ_downloadjson(map,key,&jncattr); - } - } else {/* Get _nczarr_attr from .zattrs */ - stat = NCJdictget(jattrs,NCZ_V2_ATTR,&jncattr); - if(!stat && jncattr == NULL) - {stat = NCJdictget(jattrs,NCZ_V2_ATTR_UC,&jncattr);} - } - nullfree(key); key = NULL; + if(jatts != NULL) { + /* Get _nczarr_attr from .zattrs */ + stat = NCJdictget(jatts,NCZ_V2_ATTR,(NCjson**)&jnczattr); switch (stat) { case NC_NOERR: break; - case NC_EEMPTY: stat = NC_NOERR; jncattr = NULL; break; + case NC_EEMPTY: stat = NC_NOERR; jnczattr = NULL; break; default: goto done; /* failure */ } - if(jncattr != NULL) { - NCjson* jtypes = NULL; - /* jncattr attribute should be a dict */ - if(NCJsort(jncattr) != NCJ_DICT) {stat = (THROW(NC_ENCZARR)); goto done;} - /* Extract "types; may not exist if only hidden attributes are defined */ - if((stat = NCJdictget(jncattr,"types",&jtypes))) goto done; + /* Get _nczarr_array|group from .zattrs */ + if(grp != NULL) { + stat = NCJdictget(jatts,NCZ_V2_GROUP,(NCjson**)&jnczgrp); + } else { + stat = NCJdictget(jatts,NCZ_V2_ARRAY,(NCjson**)&jnczarray); + } + switch (stat) { + case NC_NOERR: break; + case NC_EEMPTY: stat = NC_NOERR; + jnczgrp = NULL; jnczarray = NULL; break; + default: goto done; /* failure */ + } + + if(jnczattr != NULL) { + 
/* jnczattr attribute should be a dict */ + if(NCJsort(jnczattr) != NCJ_DICT) {stat = (THROW(NC_ENCZARR)); goto done;} + /* Extract "types"; may not exist if only hidden attributes are defined */ + if((stat = NCJdictget(jnczattr,"types",(NCjson**)&jtypes))) goto done; if(jtypes != NULL) { if(NCJsort(jtypes) != NCJ_DICT) {stat = (THROW(NC_ENCZARR)); goto done;} - /* Convert to an envv list */ - if((stat = jtypes2atypes(jtypes,atypes))) goto done; } } } - if(jattrsp) {*jattrsp = jattrs; jattrs = NULL;} - if(atypesp) {*atypesp = atypes; atypes = NULL;} + if(jattsp) {*jattsp = jatts; jatts = NULL;} + if(jtypes) {*jtypesp = jtypes; jtypes = NULL;} + if(jnczgrp) {*jnczgrpp = jnczgrp; jnczgrp = NULL;} + if(jnczarray) {*jnczarrayp = jnczarray; jnczarray = NULL;} done: - if(nczarrv1) - NCJreclaim(jncattr); - if(stat) { - NCJreclaim(jnczarr); - nclistfreeall(atypes); - } + NCJreclaim(jatts); nullfree(fullpath); nullfree(key); return ZUNTRACE(THROW(stat)); @@ -950,7 +941,8 @@ done: static int zcharify(NCjson* src, NCbytes* buf) { - int i, stat = NC_NOERR; + int stat = NC_NOERR; + size_t i; struct NCJconst jstr = NCJconst_empty; if(NCJsort(src) != NCJ_ARRAY) { /* singleton */ @@ -1027,7 +1019,7 @@ done: Extract type and data for an attribute */ static int -computeattrinfo(const char* name, NClist* atypes, nc_type typehint, int purezarr, NCjson* values, +computeattrinfo(const char* name, const NCjson* jtypes, nc_type typehint, int purezarr, NCjson* values, nc_type* typeidp, size_t* typelenp, size_t* lenp, void** datap) { int stat = NC_NOERR; @@ -1036,15 +1028,16 @@ computeattrinfo(const char* name, NClist* atypes, nc_type typehint, int purezarr void* data = NULL; nc_type typeid; - ZTRACE(3,"name=%s |atypes|=%u typehint=%d purezarr=%d values=|%s|",name,nclistlength(atypes),typehint,purezarr,NCJtotext(values)); + ZTRACE(3,"name=%s typehint=%d purezarr=%d values=|%s|",name,typehint,purezarr,NCJtotext(values)); /* Get type info for the given att */ typeid = NC_NAT; - 
for(i=0;icontroller->path,grp->hdr.name); zinfo = file->format_file_info; map = zinfo->map; + purezarr = (zinfo->controls.flags & FLAG_PUREZARR)?1:0; /* Construct grp path */ if((stat = NCZ_grpkey(grp,&fullpath))) goto done; - if(zinfo->controls.flags & FLAG_PUREZARR) { + if(purezarr) { if((stat = parse_group_content_pure(zinfo,grp,varnames,subgrps))) goto done; purezarr = 1; } else { /*!purezarr*/ - if(zinfo->controls.flags & FLAG_NCZARR_V1) { - /* build NCZGROUP path */ - if((stat = nczm_concat(fullpath,NCZGROUP,&key))) - goto done; - /* Read */ - jdict = NULL; - stat=NCZ_downloadjson(map,key,&jdict); - v1 = 1; - } else { - /* build ZGROUP path */ - if((stat = nczm_concat(fullpath,ZGROUP,&key))) - goto done; - /* Read */ - switch (stat=NCZ_downloadjson(map,key,&jgroup)) { - case NC_NOERR: /* Extract the NCZ_V2_GROUP dict */ - if((stat = NCJdictget(jgroup,NCZ_V2_GROUP,&jdict))) goto done; - if(!stat && jdict == NULL) - {if((stat = NCJdictget(jgroup,NCZ_V2_GROUP_UC,&jdict))) goto done;} - break; - case NC_EEMPTY: /* does not exist, use search */ - if((stat = parse_group_content_pure(zinfo,grp,varnames,subgrps))) - goto done; - purezarr = 1; - break; - default: goto done; - } + /* build ZGROUP path */ + if((stat = nczm_concat(fullpath,ZGROUP,&key))) goto done; + /* Read */ + if((stat=NCZ_downloadjson(map,key,&jgroup))) goto done; + if(jgroup == NULL) { /* does not exist, use search */ + if((stat = parse_group_content_pure(zinfo,grp,varnames,subgrps))) goto done; + purezarr = 1; } nullfree(key); key = NULL; - if(jdict) { + /* read corresponding ZATTR object */ + if((stat = nczm_concat(fullpath,ZATTRS,&key))) goto done; + if((stat=NCZ_downloadjson(map,key,&jattrs))) goto done; + if(jattrs == NULL) { /* does not exist, use search */ + if((stat = parse_group_content_pure(zinfo,grp,varnames,subgrps))) goto done; + purezarr = 1; + } else { /* Extract the NCZ_V2_GROUP attribute*/ + if((stat = NCJdictget(jattrs,NCZ_V2_GROUP,&jnczgrp))) goto done; + } + nullfree(key); key = 
NULL; + if(jnczgrp) { /* Pull out lists about group content */ - if((stat = parse_group_content(jdict,dimdefs,varnames,subgrps))) + if((stat = parse_group_content(jnczgrp,dimdefs,varnames,subgrps))) goto done; } } @@ -1234,9 +1219,9 @@ define_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp) if((stat = define_subgrps(file,grp,subgrps))) goto done; done: - if(v1) NCJreclaim(jdict); NCJreclaim(json); NCJreclaim(jgroup); + NCJreclaim(jattrs); nclistfreeall(dimdefs); nclistfreeall(varnames); nclistfreeall(subgrps); @@ -1259,7 +1244,7 @@ int ncz_read_atts(NC_FILE_INFO_T* file, NC_OBJ* container) { int stat = NC_NOERR; - int i; + size_t i; char* fullpath = NULL; char* key = NULL; NCZ_FILE_INFO_T* zinfo = NULL; @@ -1269,14 +1254,16 @@ ncz_read_atts(NC_FILE_INFO_T* file, NC_OBJ* container) NCZMAP* map = NULL; NC_ATT_INFO_T* att = NULL; NCindex* attlist = NULL; - NCjson* jattrs = NULL; - NClist* atypes = NULL; nc_type typeid; size_t len, typelen; void* data = NULL; NC_ATT_INFO_T* fillvalueatt = NULL; nc_type typehint = NC_NAT; int purezarr; + NCjson* jattrs = NULL; + const NCjson* jtypes = NULL; + const NCjson* jnczgrp = NULL; + const NCjson* jnczarray = NULL; ZTRACE(3,"file=%s container=%s",file->controller->path,container->name); @@ -1294,13 +1281,7 @@ ncz_read_atts(NC_FILE_INFO_T* file, NC_OBJ* container) attlist = var->att; } - switch ((stat = load_jatts(map, container, (zinfo->controls.flags & FLAG_NCZARR_V1), &jattrs, &atypes))) { - case NC_NOERR: break; - case NC_EEMPTY: /* container has no attributes */ - stat = NC_NOERR; - break; - default: goto done; /* true error */ - } + if((stat = download_jatts(map, container, &jattrs, &jtypes, &jnczgrp, &jnczarray))) goto done; if(jattrs != NULL) { /* Iterate over the attributes to create the in-memory attributes */ @@ -1334,7 +1315,7 @@ ncz_read_atts(NC_FILE_INFO_T* file, NC_OBJ* container) /* case 2: name = _ARRAY_DIMENSIONS, sort==NCVAR, flags & HIDDENATTRFLAG */ if(strcmp(aname,NC_XARRAY_DIMS)==0 && var != NULL && (ra->flags 
& HIDDENATTRFLAG)) { /* store for later */ - int i; + size_t i; assert(NCJsort(value) == NCJ_ARRAY); if((zvar->xarray = nclistnew())==NULL) {stat = NC_ENOMEM; goto done;} @@ -1352,7 +1333,7 @@ ncz_read_atts(NC_FILE_INFO_T* file, NC_OBJ* container) typehint = var->type_info->hdr.id ; /* if unknown use the var's type for _FillValue */ /* Create the attribute */ /* Collect the attribute's type and value */ - if((stat = computeattrinfo(aname,atypes,typehint,purezarr,value, + if((stat = computeattrinfo(aname,jtypes,typehint,purezarr,value, &typeid,&typelen,&len,&data))) goto done; if((stat = ncz_makeattr(container,attlist,aname,typeid,len,data,&att))) @@ -1384,7 +1365,6 @@ done: if(data != NULL) stat = NC_reclaim_data(file->controller,att->nc_typeid,data,len); NCJreclaim(jattrs); - nclistfreeall(atypes); nullfree(fullpath); nullfree(key); return ZUNTRACE(THROW(stat)); @@ -1435,6 +1415,369 @@ done: return ZUNTRACE(THROW(stat)); } + +/** + * @internal Materialize single var into memory; + * Take xarray and purezarr into account. + * + * @param file Pointer to file info struct. + * @param grp Pointer to grp info struct. + * @param varname name of variable in this group + * + * @return ::NC_NOERR No error. 
+ * @author Dennis Heimbigner + */ +static int +define_var1(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, const char* varname) +{ + int stat = NC_NOERR; + size_t j; + NCZ_FILE_INFO_T* zinfo = NULL; + NCZMAP* map = NULL; + int purezarr = 0; + int xarray = 0; + /* per-variable info */ + NC_VAR_INFO_T* var = NULL; + NCZ_VAR_INFO_T* zvar = NULL; + NCjson* jvar = NULL; + NCjson* jatts = NULL; /* corresponding to jvar */ + NCjson* jncvar = NULL; + NCjson* jdimrefs = NULL; + NCjson* jvalue = NULL; + char* varpath = NULL; + char* key = NULL; + size64_t* shapes = NULL; + NClist* dimnames = NULL; + int varsized = 0; + int suppress = 0; /* Abort processing of this variable */ + nc_type vtype = NC_NAT; + int vtypelen = 0; + size_t rank = 0; + size_t zarr_rank = 0; /* Need to watch out for scalars */ +#ifdef NETCDF_ENABLE_NCZARR_FILTERS + NCjson* jfilter = NULL; + int chainindex = 0; +#endif + + ZTRACE(3,"file=%s grp=%s varname=%s",file->controller->path,grp->hdr.name,varname); + + zinfo = file->format_file_info; + map = zinfo->map; + + if(zinfo->controls.flags & FLAG_PUREZARR) purezarr = 1; + if(zinfo->controls.flags & FLAG_XARRAYDIMS) {xarray = 1;} + + dimnames = nclistnew(); + + if((stat = nc4_var_list_add2(grp, varname, &var))) + goto done; + + /* And its annotation */ + if((zvar = calloc(1,sizeof(NCZ_VAR_INFO_T)))==NULL) + {stat = NC_ENOMEM; goto done;} + var->format_var_info = zvar; + zvar->common.file = file; + + /* pretend it was created */ + var->created = 1; + + /* Indicate we do not have quantizer yet */ + var->quantize_mode = -1; + + /* Construct var path */ + if((stat = NCZ_varkey(var,&varpath))) + goto done; + + /* Construct the path to the .zarray object */ + if((stat = nczm_concat(varpath,ZARRAY,&key))) + goto done; + /* Download the zarray object */ + if((stat=readdict(map,key,&jvar))) goto done; + nullfree(key); key = NULL; + assert(NCJsort(jvar) == NCJ_DICT); + + /* Construct the path to the .zattrs object */ + if((stat = nczm_concat(varpath,ZATTRS,&key))) goto 
done; + /* Download object */ + if((stat=readdict(map,key,&jatts))) goto done; + nullfree(key); key = NULL; + if(jatts != NULL) + assert(NCJsort(jatts) == NCJ_DICT); + + /* Verify the format */ + { + int version; + if((stat = NCJdictget(jvar,"zarr_format",&jvalue))) goto done; + sscanf(NCJstring(jvalue),"%d",&version); + if(version != zinfo->zarr.zarr_version) + {stat = (THROW(NC_ENCZARR)); goto done;} + } + + /* Set the type and endianness of the variable */ + { + int endianness; + if((stat = NCJdictget(jvar,"dtype",&jvalue))) goto done; + /* Convert dtype to nc_type + endianness */ + if((stat = ncz_dtype2nctype(NCJstring(jvalue),NC_NAT,purezarr,&vtype,&endianness,&vtypelen))) + goto done; + if(vtype > NC_NAT && vtype <= NC_MAX_ATOMIC_TYPE) { + /* Locate the NC_TYPE_INFO_T object */ + if((stat = ncz_gettype(file,grp,vtype,&var->type_info))) + goto done; + } else {stat = NC_EBADTYPE; goto done;} +#if 0 /* leave native in place */ + if(endianness == NC_ENDIAN_NATIVE) + endianness = zinfo->native_endianness; + if(endianness == NC_ENDIAN_NATIVE) + endianness = (NCZ_isLittleEndian()?NC_ENDIAN_LITTLE:NC_ENDIAN_BIG); + if(endianness == NC_ENDIAN_LITTLE || endianness == NC_ENDIAN_BIG) { + var->endianness = endianness; + } else {stat = NC_EBADTYPE; goto done;} +#else + var->endianness = endianness; +#endif + var->type_info->endianness = var->endianness; /* Propagate */ + if(vtype == NC_STRING) { + zvar->maxstrlen = vtypelen; + vtypelen = sizeof(char*); /* in-memory len */ + if(zvar->maxstrlen <= 0) zvar->maxstrlen = NCZ_get_maxstrlen((NC_OBJ*)var); + } + } + + if(!purezarr) { + if(jatts == NULL) {stat = NC_ENCZARR; goto done;} + /* Extract the _NCZARR_ARRAY values */ + /* Do this first so we know about storage esp. 
scalar */ + /* Extract the NCZ_V2_ARRAY dict */ + if((stat = NCJdictget(jatts,NCZ_V2_ARRAY,&jncvar))) goto done; + if(jncvar == NULL) {stat = NC_ENCZARR; goto done;} + assert((NCJsort(jncvar) == NCJ_DICT)); + /* Extract scalar flag */ + if((stat = NCJdictget(jncvar,"scalar",&jvalue))) + goto done; + if(jvalue != NULL) { + var->storage = NC_CHUNKED; + zvar->scalar = 1; + } + /* Extract storage flag */ + if((stat = NCJdictget(jncvar,"storage",&jvalue))) + goto done; + if(jvalue != NULL) { + var->storage = NC_CHUNKED; + } + /* Extract dimrefs list */ + switch ((stat = NCJdictget(jncvar,"dimrefs",&jdimrefs))) { + case NC_NOERR: /* Extract the dimref names */ + assert((NCJsort(jdimrefs) == NCJ_ARRAY)); + if(zvar->scalar) { + assert(NCJlength(jdimrefs) == 0); + } else { + rank = NCJlength(jdimrefs); + for(j=0;jdimension_separator = 0; + if((stat = NCJdictget(jvar,"dimension_separator",&jvalue))) goto done; + if(jvalue != NULL) { + /* Verify its value */ + if(NCJsort(jvalue) == NCJ_STRING && NCJstring(jvalue) != NULL && strlen(NCJstring(jvalue)) == 1) + zvar->dimension_separator = NCJstring(jvalue)[0]; + } + /* If value is invalid, then use global default */ + if(!islegaldimsep(zvar->dimension_separator)) + zvar->dimension_separator = ngs->zarr.dimension_separator; /* use global value */ + assert(islegaldimsep(zvar->dimension_separator)); /* we are hosed */ + } + + /* fill_value; must precede calls to adjust cache */ + { + if((stat = NCJdictget(jvar,"fill_value",&jvalue))) goto done; + if(jvalue == NULL || NCJsort(jvalue) == NCJ_NULL) + var->no_fill = 1; + else { + size_t fvlen; + nc_type atypeid = vtype; + var->no_fill = 0; + if((stat = computeattrdata(var->type_info->hdr.id, &atypeid, jvalue, NULL, &fvlen, &var->fill_value))) + goto done; + assert(atypeid == vtype); + /* Note that we do not create the _FillValue + attribute here to avoid having to read all + the attributes and thus foiling lazy read.*/ + } + } + + /* shape */ + { + if((stat = 
NCJdictget(jvar,"shape",&jvalue))) goto done; + if(NCJsort(jvalue) != NCJ_ARRAY) {stat = (THROW(NC_ENCZARR)); goto done;} + + /* Process the rank */ + zarr_rank = NCJlength(jvalue); + if(zarr_rank == 0) { + /* suppress variable */ + ZLOG(NCLOGWARN,"Empty shape for variable %s suppressed",var->hdr.name); + suppress = 1; + goto suppressvar; + } + + if(zvar->scalar) { + rank = 0; + zarr_rank = 1; /* Zarr does not support scalars */ + } else + rank = (zarr_rank = NCJlength(jvalue)); + + if(zarr_rank > 0) { + /* Save the rank of the variable */ + if((stat = nc4_var_set_ndims(var, rank))) goto done; + /* extract the shapes */ + if((shapes = (size64_t*)malloc(sizeof(size64_t)*(size_t)zarr_rank)) == NULL) + {stat = (THROW(NC_ENOMEM)); goto done;} + if((stat = decodeints(jvalue, shapes))) goto done; + } + } + + /* chunks */ + { + size64_t chunks[NC_MAX_VAR_DIMS]; + if((stat = NCJdictget(jvar,"chunks",&jvalue))) goto done; + if(jvalue != NULL && NCJsort(jvalue) != NCJ_ARRAY) + {stat = (THROW(NC_ENCZARR)); goto done;} + /* Verify the rank */ + if(zvar->scalar || zarr_rank == 0) { + if(var->ndims != 0) + {stat = (THROW(NC_ENCZARR)); goto done;} + zvar->chunkproduct = 1; + zvar->chunksize = zvar->chunkproduct * var->type_info->size; + /* Create the cache */ + if((stat = NCZ_create_chunk_cache(var,var->type_info->size*zvar->chunkproduct,zvar->dimension_separator,&zvar->cache))) + goto done; + } else {/* !zvar->scalar */ + if(zarr_rank == 0) {stat = NC_ENCZARR; goto done;} + var->storage = NC_CHUNKED; + if(var->ndims != rank) + {stat = (THROW(NC_ENCZARR)); goto done;} + if((var->chunksizes = malloc(sizeof(size_t)*(size_t)zarr_rank)) == NULL) + {stat = NC_ENOMEM; goto done;} + if((stat = decodeints(jvalue, chunks))) goto done; + /* validate the chunk sizes */ + zvar->chunkproduct = 1; + for(j=0;jchunksizes[j] = (size_t)chunks[j]; + zvar->chunkproduct *= chunks[j]; + } + zvar->chunksize = zvar->chunkproduct * var->type_info->size; + /* Create the cache */ + if((stat = 
NCZ_create_chunk_cache(var,var->type_info->size*zvar->chunkproduct,zvar->dimension_separator,&zvar->cache))) + goto done; + } + if((stat = NCZ_adjust_var_cache(var))) goto done; + } + /* Capture row vs column major; currently, column major not used*/ + { + if((stat = NCJdictget(jvar,"order",&jvalue))) goto done; + if(strcmp(NCJstring(jvalue),"C") > 0) + ((NCZ_VAR_INFO_T*)var->format_var_info)->order = 1; + else ((NCZ_VAR_INFO_T*)var->format_var_info)->order = 0; + } + /* filters key */ + /* From V2 Spec: A list of JSON objects providing codec configurations, + or null if no filters are to be applied. Each codec configuration + object MUST contain a "id" key identifying the codec to be used. */ + /* Do filters key before compressor key so final filter chain is in correct order */ + { +#ifdef NETCDF_ENABLE_NCZARR_FILTERS + if(var->filters == NULL) var->filters = (void*)nclistnew(); + if(zvar->incompletefilters == NULL) zvar->incompletefilters = (void*)nclistnew(); + chainindex = 0; /* track location of filter in the chain */ + if((stat = NCZ_filter_initialize())) goto done; + if((stat = NCJdictget(jvar,"filters",&jvalue))) goto done; + if(jvalue != NULL && NCJsort(jvalue) != NCJ_NULL) { + int k; + if(NCJsort(jvalue) != NCJ_ARRAY) {stat = NC_EFILTER; goto done;} + for(k=0;;k++) { + jfilter = NULL; + jfilter = NCJith(jvalue,k); + if(jfilter == NULL) break; /* done */ + if(NCJsort(jfilter) != NCJ_DICT) {stat = NC_EFILTER; goto done;} + if((stat = NCZ_filter_build(file,var,jfilter,chainindex++))) goto done; + } + } +#endif + } + + /* compressor key */ + /* From V2 Spec: A JSON object identifying the primary compression codec and providing + configuration parameters, or ``null`` if no compressor is to be used. 
*/ +#ifdef NETCDF_ENABLE_NCZARR_FILTERS + { + if(var->filters == NULL) var->filters = (void*)nclistnew(); + if((stat = NCZ_filter_initialize())) goto done; + if((stat = NCJdictget(jvar,"compressor",&jfilter))) goto done; + if(jfilter != NULL && NCJsort(jfilter) != NCJ_NULL) { + if(NCJsort(jfilter) != NCJ_DICT) {stat = NC_EFILTER; goto done;} + if((stat = NCZ_filter_build(file,var,jfilter,chainindex++))) goto done; + } + } + /* Suppress variable if there are filters and var is not fixed-size */ + if(varsized && nclistlength((NClist*)var->filters) > 0) + suppress = 1; +#endif + if(zarr_rank > 0) { + if((stat = computedimrefs(file, var, purezarr, xarray, rank, dimnames, shapes, var->dim))) + goto done; + if(!zvar->scalar) { + /* Extract the dimids */ + for(j=0;jdimids[j] = var->dim[j]->hdr.id; + } + } + +#ifdef NETCDF_ENABLE_NCZARR_FILTERS + if(!suppress) { + /* At this point, we can finalize the filters */ + if((stat = NCZ_filter_setup(var))) goto done; + } +#endif + +suppressvar: + if(suppress) { + /* Reclaim NCZarr variable specific info */ + (void)NCZ_zclose_var1(var); + /* Remove from list of variables and reclaim the top level var object */ + (void)nc4_var_list_del(grp, var); + var = NULL; + } + +done: + nclistfreeall(dimnames); dimnames = NULL; + nullfree(varpath); varpath = NULL; + nullfree(shapes); shapes = NULL; + nullfree(key); key = NULL; + NCJreclaim(jvar); + NCJreclaim(jatts); + return THROW(stat); +} + /** * @internal Materialize vars into memory; * Take xarray and purezarr into account. 
@@ -1450,363 +1793,15 @@ static int define_vars(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, NClist* varnames) { int stat = NC_NOERR; - size_t i,j; - NCZ_FILE_INFO_T* zinfo = NULL; - NCZMAP* map = NULL; - int purezarr = 0; - int xarray = 0; - int formatv1 = 0; + size_t i; ZTRACE(3,"file=%s grp=%s |varnames|=%u",file->controller->path,grp->hdr.name,nclistlength(varnames)); - zinfo = file->format_file_info; - map = zinfo->map; - - if(zinfo->controls.flags & FLAG_PUREZARR) purezarr = 1; - if(zinfo->controls.flags & FLAG_NCZARR_V1) formatv1 = 1; - if(zinfo->controls.flags & FLAG_XARRAYDIMS) {xarray = 1;} - /* Load each var in turn */ for(i = 0; i < nclistlength(varnames); i++) { - /* per-variable info */ - NC_VAR_INFO_T* var = NULL; - NCZ_VAR_INFO_T* zvar = NULL; - NCjson* jvar = NULL; - NCjson* jncvar = NULL; - NCjson* jdimrefs = NULL; - NCjson* jvalue = NULL; - char* varpath = NULL; - char* key = NULL; - const char* varname = NULL; - size64_t* shapes = NULL; - NClist* dimnames = NULL; - int varsized = 0; - int suppress = 0; /* Abort processing of this variable */ - nc_type vtype = NC_NAT; - int vtypelen = 0; - int rank = 0; - int zarr_rank = 0; /* Need to watch out for scalars */ -#ifdef NETCDF_ENABLE_NCZARR_FILTERS - NCjson* jfilter = NULL; - int chainindex = 0; -#endif - - dimnames = nclistnew(); + const char* varname = (const char*)nclistget(varnames,i); + if((stat = define_var1(file,grp,varname))) goto done; varname = nclistget(varnames,i); - - if((stat = nc4_var_list_add2(grp, varname, &var))) - goto done; - - /* And its annotation */ - if((zvar = calloc(1,sizeof(NCZ_VAR_INFO_T)))==NULL) - {stat = NC_ENOMEM; goto done;} - var->format_var_info = zvar; - zvar->common.file = file; - - /* pretend it was created */ - var->created = 1; - - /* Indicate we do not have quantizer yet */ - var->quantize_mode = -1; - - /* Construct var path */ - if((stat = NCZ_varkey(var,&varpath))) - goto done; - - /* Construct the path to the zarray object */ - if((stat = 
nczm_concat(varpath,ZARRAY,&key))) - goto done; - /* Download the zarray object */ - if((stat=NCZ_readdict(map,key,&jvar))) - goto done; - nullfree(key); key = NULL; - assert(NCJsort(jvar) == NCJ_DICT); - - /* Extract the .zarray info from jvar */ - - /* Verify the format */ - { - int version; - if((stat = NCJdictget(jvar,"zarr_format",&jvalue))) goto done; - sscanf(NCJstring(jvalue),"%d",&version); - if(version != zinfo->zarr.zarr_version) - {stat = (THROW(NC_ENCZARR)); goto done;} - } - - /* Set the type and endianness of the variable */ - { - int endianness; - if((stat = NCJdictget(jvar,"dtype",&jvalue))) goto done; - /* Convert dtype to nc_type + endianness */ - if((stat = ncz_dtype2nctype(NCJstring(jvalue),NC_NAT,purezarr,&vtype,&endianness,&vtypelen))) - goto done; - if(vtype > NC_NAT && vtype <= NC_MAX_ATOMIC_TYPE) { - /* Locate the NC_TYPE_INFO_T object */ - if((stat = ncz_gettype(file,grp,vtype,&var->type_info))) - goto done; - } else {stat = NC_EBADTYPE; goto done;} -#if 0 /* leave native in place */ - if(endianness == NC_ENDIAN_NATIVE) - endianness = zinfo->native_endianness; - if(endianness == NC_ENDIAN_NATIVE) - endianness = (NCZ_isLittleEndian()?NC_ENDIAN_LITTLE:NC_ENDIAN_BIG); - if(endianness == NC_ENDIAN_LITTLE || endianness == NC_ENDIAN_BIG) { - var->endianness = endianness; - } else {stat = NC_EBADTYPE; goto done;} -#else - var->endianness = endianness; -#endif - var->type_info->endianness = var->endianness; /* Propagate */ - if(vtype == NC_STRING) { - zvar->maxstrlen = vtypelen; - vtypelen = sizeof(char*); /* in-memory len */ - if(zvar->maxstrlen <= 0) zvar->maxstrlen = NCZ_get_maxstrlen((NC_OBJ*)var); - } - } - - if(!purezarr) { - /* Extract the _NCZARR_ARRAY values */ - /* Do this first so we know about storage esp. 
scalar */ - if(formatv1) { - /* Construct the path to the zarray object */ - if((stat = nczm_concat(varpath,NCZARRAY,&key))) - goto done; - /* Download the nczarray object */ - if((stat=NCZ_readdict(map,key,&jncvar))) - goto done; - nullfree(key); key = NULL; - } else {/* format v2 */ - /* Extract the NCZ_V2_ARRAY dict */ - if((stat = NCJdictget(jvar,NCZ_V2_ARRAY,&jncvar))) goto done; - if(!stat && jncvar == NULL) - {if((stat = NCJdictget(jvar,NCZ_V2_ARRAY_UC,&jncvar))) goto done;} - } - if(jncvar == NULL) {stat = NC_ENCZARR; goto done;} - assert((NCJsort(jncvar) == NCJ_DICT)); - /* Extract scalar flag */ - if((stat = NCJdictget(jncvar,"scalar",&jvalue))) - goto done; - if(jvalue != NULL) { - var->storage = NC_CHUNKED; - zvar->scalar = 1; - } - /* Extract storage flag */ - if((stat = NCJdictget(jncvar,"storage",&jvalue))) - goto done; - if(jvalue != NULL) { - var->storage = NC_CHUNKED; - } - /* Extract dimrefs list */ - switch ((stat = NCJdictget(jncvar,"dimrefs",&jdimrefs))) { - case NC_NOERR: /* Extract the dimref names */ - assert((NCJsort(jdimrefs) == NCJ_ARRAY)); - if(zvar->scalar) { - assert(NCJlength(jdimrefs) == 0); - } else { - rank = NCJlength(jdimrefs); - for(j=0;jdimension_separator = 0; - if((stat = NCJdictget(jvar,"dimension_separator",&jvalue))) goto done; - if(jvalue != NULL) { - /* Verify its value */ - if(NCJsort(jvalue) == NCJ_STRING && NCJstring(jvalue) != NULL && strlen(NCJstring(jvalue)) == 1) - zvar->dimension_separator = NCJstring(jvalue)[0]; - } - /* If value is invalid, then use global default */ - if(!islegaldimsep(zvar->dimension_separator)) - zvar->dimension_separator = ngs->zarr.dimension_separator; /* use global value */ - assert(islegaldimsep(zvar->dimension_separator)); /* we are hosed */ - } - - /* fill_value; must precede calls to adjust cache */ - { - if((stat = NCJdictget(jvar,"fill_value",&jvalue))) goto done; - if(jvalue == NULL || NCJsort(jvalue) == NCJ_NULL) - var->no_fill = 1; - else { - size_t fvlen; - nc_type atypeid = 
vtype; - var->no_fill = 0; - if((stat = computeattrdata(var->type_info->hdr.id, &atypeid, jvalue, NULL, &fvlen, &var->fill_value))) - goto done; - assert(atypeid == vtype); - /* Note that we do not create the _FillValue - attribute here to avoid having to read all - the attributes and thus foiling lazy read.*/ - } - } - - /* shape */ - { - if((stat = NCJdictget(jvar,"shape",&jvalue))) goto done; - if(NCJsort(jvalue) != NCJ_ARRAY) {stat = (THROW(NC_ENCZARR)); goto done;} - - /* Process the rank */ - zarr_rank = NCJlength(jvalue); - if(zarr_rank == 0) { - /* suppress variable */ - ZLOG(NCLOGWARN,"Empty shape for variable %s suppressed",var->hdr.name); - suppress = 1; - goto suppressvar; - } - - if(zvar->scalar) { - rank = 0; - zarr_rank = 1; /* Zarr does not support scalars */ - } else - rank = (zarr_rank = NCJlength(jvalue)); - - if(zarr_rank > 0) { - /* Save the rank of the variable */ - if((stat = nc4_var_set_ndims(var, rank))) goto done; - /* extract the shapes */ - if((shapes = (size64_t*)malloc(sizeof(size64_t)*(size_t)zarr_rank)) == NULL) - {stat = (THROW(NC_ENOMEM)); goto done;} - if((stat = decodeints(jvalue, shapes))) goto done; - } - } - - /* chunks */ - { - size64_t chunks[NC_MAX_VAR_DIMS]; - if((stat = NCJdictget(jvar,"chunks",&jvalue))) goto done; - if(jvalue != NULL && NCJsort(jvalue) != NCJ_ARRAY) - {stat = (THROW(NC_ENCZARR)); goto done;} - /* Verify the rank */ - if(zvar->scalar || zarr_rank == 0) { - if(var->ndims != 0) - {stat = (THROW(NC_ENCZARR)); goto done;} - zvar->chunkproduct = 1; - zvar->chunksize = zvar->chunkproduct * var->type_info->size; - /* Create the cache */ - if((stat = NCZ_create_chunk_cache(var,var->type_info->size*zvar->chunkproduct,zvar->dimension_separator,&zvar->cache))) - goto done; - } else {/* !zvar->scalar */ - if(zarr_rank == 0) {stat = NC_ENCZARR; goto done;} - var->storage = NC_CHUNKED; - if(var->ndims != rank) - {stat = (THROW(NC_ENCZARR)); goto done;} - if((var->chunksizes = malloc(sizeof(size_t)*(size_t)zarr_rank)) 
== NULL) - {stat = NC_ENOMEM; goto done;} - if((stat = decodeints(jvalue, chunks))) goto done; - /* validate the chunk sizes */ - zvar->chunkproduct = 1; - for(j=0;jchunksizes[j] = (size_t)chunks[j]; - zvar->chunkproduct *= chunks[j]; - } - zvar->chunksize = zvar->chunkproduct * var->type_info->size; - /* Create the cache */ - if((stat = NCZ_create_chunk_cache(var,var->type_info->size*zvar->chunkproduct,zvar->dimension_separator,&zvar->cache))) - goto done; - } - if((stat = NCZ_adjust_var_cache(var))) goto done; - } - /* Capture row vs column major; currently, column major not used*/ - { - if((stat = NCJdictget(jvar,"order",&jvalue))) goto done; - if(strcmp(NCJstring(jvalue),"C") > 0) - ((NCZ_VAR_INFO_T*)var->format_var_info)->order = 1; - else ((NCZ_VAR_INFO_T*)var->format_var_info)->order = 0; - } - /* filters key */ - /* From V2 Spec: A list of JSON objects providing codec configurations, - or null if no filters are to be applied. Each codec configuration - object MUST contain a "id" key identifying the codec to be used. 
*/ - /* Do filters key before compressor key so final filter chain is in correct order */ - { -#ifdef NETCDF_ENABLE_NCZARR_FILTERS - if(var->filters == NULL) var->filters = (void*)nclistnew(); - if(zvar->incompletefilters == NULL) zvar->incompletefilters = (void*)nclistnew(); - chainindex = 0; /* track location of filter in the chain */ - if((stat = NCZ_filter_initialize())) goto done; - if((stat = NCJdictget(jvar,"filters",&jvalue))) goto done; - if(jvalue != NULL && NCJsort(jvalue) != NCJ_NULL) { - int k; - if(NCJsort(jvalue) != NCJ_ARRAY) {stat = NC_EFILTER; goto done;} - for(k=0;;k++) { - jfilter = NULL; - jfilter = NCJith(jvalue,k); - if(jfilter == NULL) break; /* done */ - if(NCJsort(jfilter) != NCJ_DICT) {stat = NC_EFILTER; goto done;} - if((stat = NCZ_filter_build(file,var,jfilter,chainindex++))) goto done; - } - } -#endif - } - - /* compressor key */ - /* From V2 Spec: A JSON object identifying the primary compression codec and providing - configuration parameters, or ``null`` if no compressor is to be used. 
*/ -#ifdef NETCDF_ENABLE_NCZARR_FILTERS - { - if(var->filters == NULL) var->filters = (void*)nclistnew(); - if((stat = NCZ_filter_initialize())) goto done; - if((stat = NCJdictget(jvar,"compressor",&jfilter))) goto done; - if(jfilter != NULL && NCJsort(jfilter) != NCJ_NULL) { - if(NCJsort(jfilter) != NCJ_DICT) {stat = NC_EFILTER; goto done;} - if((stat = NCZ_filter_build(file,var,jfilter,chainindex++))) goto done; - } - } - /* Suppress variable if there are filters and var is not fixed-size */ - if(varsized && nclistlength((NClist*)var->filters) > 0) - suppress = 1; -#endif - if(zarr_rank > 0) { - if((stat = computedimrefs(file, var, purezarr, xarray, rank, dimnames, shapes, var->dim))) - goto done; - if(!zvar->scalar) { - /* Extract the dimids */ - for(j=0;jdimids[j] = var->dim[j]->hdr.id; - } - } - -#ifdef NETCDF_ENABLE_NCZARR_FILTERS - if(!suppress) { - /* At this point, we can finalize the filters */ - if((stat = NCZ_filter_setup(var))) goto done; - } -#endif - -suppressvar: - if(suppress) { - /* Reclaim NCZarr variable specific info */ - (void)NCZ_zclose_var1(var); - /* Remove from list of variables and reclaim the top level var object */ - (void)nc4_var_list_del(grp, var); - var = NULL; - } - - /* Clean up from last cycle */ - nclistfreeall(dimnames); dimnames = NULL; - nullfree(varpath); varpath = NULL; - nullfree(shapes); shapes = NULL; - nullfree(key); key = NULL; - if(formatv1) {NCJreclaim(jncvar); jncvar = NULL;} - NCJreclaim(jvar); jvar = NULL; - var = NULL; } done: @@ -1862,6 +1857,7 @@ ncz_read_superblock(NC_FILE_INFO_T* file, char** nczarrvp, char** zarrfp) { int stat = NC_NOERR; NCjson* jnczgroup = NULL; + NCjson* jnczattr = NULL; NCjson* jzgroup = NULL; NCjson* jsuper = NULL; NCjson* jtmp = NULL; @@ -1871,40 +1867,24 @@ ncz_read_superblock(NC_FILE_INFO_T* file, char** nczarrvp, char** zarrfp) ZTRACE(3,"file=%s",file->controller->path); - /* See if the V1 META-Root is being used */ - switch(stat = NCZ_downloadjson(zinfo->map, NCZMETAROOT, 
&jnczgroup)) { - case NC_EEMPTY: /* not there */ - stat = NC_NOERR; - break; - case NC_NOERR: - if((stat = NCJdictget(jnczgroup,"nczarr_version",&jtmp))) goto done; - nczarr_version = strdup(NCJstring(jtmp)); - break; - default: goto done; - } /* Get Zarr Root Group, if any */ - switch(stat = NCZ_downloadjson(zinfo->map, ZMETAROOT, &jzgroup)) { - case NC_NOERR: - break; - case NC_EEMPTY: /* not there */ - stat = NC_NOERR; - assert(jzgroup == NULL); - break; - default: goto done; - } - if(jzgroup != NULL) { - /* See if this NCZarr V2 */ - if((stat = NCJdictget(jzgroup,NCZ_V2_SUPERBLOCK,&jsuper))) goto done; - if(!stat && jsuper == NULL) { /* try uppercase name */ - if((stat = NCJdictget(jzgroup,NCZ_V2_SUPERBLOCK_UC,&jsuper))) goto done; - } + if((stat = NCZ_downloadjson(zinfo->map, ZMETAROOT, &jzgroup))) goto done; + + /* Get corresponding .zattr, if any */ + if((stat = NCZ_downloadjson(zinfo->map, ZMETAATTR, &jnczattr))) goto done; + + /* Look for superblock */ + if(jnczattr != NULL) { + if(jnczattr->sort != NCJ_DICT) {stat = NC_ENCZARR; goto done;} + NCJdictget(jnczattr, NCZ_V2_SUPERBLOCK,&jsuper); if(jsuper != NULL) { - /* Extract the equivalent attribute */ - if(jsuper->sort != NCJ_DICT) - {stat = NC_ENCZARR; goto done;} + if(jsuper->sort != NCJ_DICT) {stat = NC_ENCZARR; goto done;} if((stat = NCJdictget(jsuper,"version",&jtmp))) goto done; nczarr_version = nulldup(NCJstring(jtmp)); } + } + if(jzgroup != NULL) { + if(jzgroup->sort != NCJ_DICT) {stat = NC_ENCZARR; goto done;} /* In any case, extract the zarr format */ if((stat = NCJdictget(jzgroup,"zarr_format",&jtmp))) goto done; assert(zarr_format == NULL); @@ -1917,7 +1897,6 @@ ncz_read_superblock(NC_FILE_INFO_T* file, char** nczarrvp, char** zarrfp) if((stat = ncz_validate(file))) goto done; /* ok, assume pure zarr with no groups */ zinfo->controls.flags |= FLAG_PUREZARR; - zinfo->controls.flags &= ~(FLAG_NCZARR_V1); if(zarr_format == NULL) zarr_format = strdup("2"); } else if(jnczgroup != NULL) { 
zinfo->controls.flags |= FLAG_NCZARR_V1; @@ -1933,6 +1912,7 @@ done: nullfree(nczarr_version); NCJreclaim(jzgroup); NCJreclaim(jnczgroup); + NCJreclaim(jnczattr); return ZUNTRACE(THROW(stat)); } @@ -1942,12 +1922,13 @@ done: static int parse_group_content(NCjson* jcontent, NClist* dimdefs, NClist* varnames, NClist* subgrps) { - int i,stat = NC_NOERR; + int stat = NC_NOERR; + size_t i; NCjson* jvalue = NULL; ZTRACE(3,"jcontent=|%s| |dimdefs|=%u |varnames|=%u |subgrps|=%u",NCJtotext(jcontent),(unsigned)nclistlength(dimdefs),(unsigned)nclistlength(varnames),(unsigned)nclistlength(subgrps)); - if((stat=NCJdictget(jcontent,"dims",&jvalue))) goto done; + if((stat=NCJdictget(jcontent,"dimensions",&jvalue))) goto done; if(jvalue != NULL) { if(NCJsort(jvalue) != NCJ_DICT) {stat = (THROW(NC_ENCZARR)); goto done;} /* Extract the dimensions defined in this group */ @@ -1980,7 +1961,7 @@ parse_group_content(NCjson* jcontent, NClist* dimdefs, NClist* varnames, NClist* } } - if((stat=NCJdictget(jcontent,"vars",&jvalue))) goto done; + if((stat=NCJdictget(jcontent,"arrays",&jvalue))) goto done; if(jvalue != NULL) { /* Extract the variable names in this group */ for(i=0;imap,zakey,&jvar))) + if((stat=readdict(zinfo->map,zakey,&jvar))) goto done; assert((NCJsort(jvar) == NCJ_DICT)); nullfree(varkey); varkey = NULL; @@ -2134,7 +2115,8 @@ done: static int decodeints(NCjson* jshape, size64_t* shapes) { - int i, stat = NC_NOERR; + int stat = NC_NOERR; + size_t i; for(i=0;icontroller->path,grp->hdr.name); + + if(jatts == NULL) goto done; + + zinfo = file->format_file_info; + map = zinfo->map; + + if(container->sort == NCVAR) { + var = (NC_VAR_INFO_T*)container; + } else if(container->sort == NCGRP) { + grp = (NC_GRP_INFO_T*)container; + } + + /* Construct container path */ + if(container->sort == NCGRP) + stat = NCZ_grpkey(grp,&fullpath); + else + stat = NCZ_varkey(var,&fullpath); + if(stat) goto done; + + /* write .zattrs*/ + if((stat = nczm_concat(fullpath,ZATTRS,&key))) goto done; + 
if((stat=NCZ_uploadjson(map,key,jatts))) goto done; + nullfree(key); key = NULL; + +done: + nullfree(fullpath); + return ZUNTRACE(THROW(stat)); +} + +#if 0 +/** +@internal Get contents of a meta object; fail if it is not an array +@param zmap - [in] map +@param key - [in] key of the object +@param jsonp - [out] return parsed json || NULL if non-existent +@return NC_NOERR +@return NC_EXXX +@author Dennis Heimbigner +*/ +static int +readarray(NCZMAP* zmap, const char* key, NCjson** jsonp) +{ + int stat = NC_NOERR; + NCjson* json = NULL; + + if((stat = NCZ_downloadjson(zmap,key,&json))) goto done; + if(json != NULL && NCJsort(json) != NCJ_ARRAY) {stat = NC_ENCZARR; goto done;} + if(jsonp) {*jsonp = json; json = NULL;} +done: + NCJreclaim(json); + return stat; +} +#endif + +/** +@internal Get contents of a meta object; fail if it is not a dictionary +@param zmap - [in] map +@param key - [in] key of the object +@param jsonp - [out] return parsed json || NULL if non-existent +@return NC_NOERR +@return NC_EXXX +@author Dennis Heimbigner +*/ +static int +readdict(NCZMAP* zmap, const char* key, NCjson** jsonp) +{ + int stat = NC_NOERR; + NCjson* json = NULL; + + if((stat = NCZ_downloadjson(zmap,key,&json))) goto done; + if(json != NULL) { + if(NCJsort(json) != NCJ_DICT) {stat = NC_ENCZARR; goto done;} + } + if(jsonp) {*jsonp = json; json = NULL;} +done: + NCJreclaim(json); + return stat; +} diff --git a/libnczarr/zutil.c b/libnczarr/zutil.c index 1d4973854..6c4fd8904 100644 --- a/libnczarr/zutil.c +++ b/libnczarr/zutil.c @@ -226,8 +226,9 @@ ncz_splitkey(const char* key, NClist* segments) @internal Down load a .z... structure into memory @param zmap - [in] controlling zarr map @param key - [in] .z... 
object to load -@param jsonp - [out] root of the loaded json +@param jsonp - [out] root of the loaded json (NULL if key does not exist) @return NC_NOERR +@return NC_EXXX @author Dennis Heimbigner */ int @@ -238,17 +239,22 @@ NCZ_downloadjson(NCZMAP* zmap, const char* key, NCjson** jsonp) char* content = NULL; NCjson* json = NULL; - if((stat = nczmap_len(zmap, key, &len))) - goto done; + switch(stat = nczmap_len(zmap, key, &len)) { + case NC_NOERR: break; + case NC_ENOOBJECT: case NC_EEMPTY: + stat = NC_NOERR; + goto exit; + default: goto done; + } if((content = malloc(len+1)) == NULL) {stat = NC_ENOMEM; goto done;} if((stat = nczmap_read(zmap, key, 0, len, (void*)content))) goto done; content[len] = '\0'; - if((stat = NCJparse(content,0,&json)) < 0) {stat = NC_ENCZARR; goto done;} +exit: if(jsonp) {*jsonp = json; json = NULL;} done: @@ -310,13 +316,9 @@ NCZ_createdict(NCZMAP* zmap, const char* key, NCjson** jsonp) NCjson* json = NULL; /* See if it already exists */ - stat = NCZ_downloadjson(zmap,key,&json); - if(stat != NC_NOERR) { - if(stat == NC_EEMPTY) {/* create it */ - if((stat = nczmap_def(zmap,key,NCZ_ISMETA))) - goto done; - } else - goto done; + if((stat = NCZ_downloadjson(zmap,key,&json))) goto done; + if(json == NULL) { + if((stat = nczmap_def(zmap,key,NCZ_ISMETA))) goto done; } else { /* Already exists, fail */ stat = NC_EINVAL; @@ -346,18 +348,14 @@ NCZ_createarray(NCZMAP* zmap, const char* key, NCjson** jsonp) int stat = NC_NOERR; NCjson* json = NULL; - stat = NCZ_downloadjson(zmap,key,&json); - if(stat != NC_NOERR) { - if(stat == NC_EEMPTY) {/* create it */ - if((stat = nczmap_def(zmap,key,NCZ_ISMETA))) - goto done; - /* Create the initial array */ - if((stat = NCJnew(NCJ_ARRAY,&json))) - goto done; - } else { - stat = NC_EINVAL; - goto done; - } + if((stat = NCZ_downloadjson(zmap,key,&json))) goto done; + if(json == NULL) { /* create it */ + if((stat = nczmap_def(zmap,key,NCZ_ISMETA))) goto done; + /* Create the initial array */ + if((stat = 
NCJnew(NCJ_ARRAY,&json))) goto done; + } else { + stat = NC_EINVAL; + goto done; } if(json->sort != NCJ_ARRAY) {stat = NC_ENCZARR; goto done;} if(jsonp) {*jsonp = json; json = NULL;} @@ -367,54 +365,6 @@ done: } #endif /*0*/ -/** -@internal Get contents of a meta object; fail it it does not exist -@param zmap - [in] map -@param key - [in] key of the object -@param jsonp - [out] return parsed json -@return NC_NOERR -@return NC_EEMPTY [object did not exist] -@author Dennis Heimbigner -*/ -int -NCZ_readdict(NCZMAP* zmap, const char* key, NCjson** jsonp) -{ - int stat = NC_NOERR; - NCjson* json = NULL; - - if((stat = NCZ_downloadjson(zmap,key,&json))) - goto done; - if(NCJsort(json) != NCJ_DICT) {stat = NC_ENCZARR; goto done;} - if(jsonp) {*jsonp = json; json = NULL;} -done: - NCJreclaim(json); - return stat; -} - -/** -@internal Get contents of a meta object; fail it it does not exist -@param zmap - [in] map -@param key - [in] key of the object -@param jsonp - [out] return parsed json -@return NC_NOERR -@return NC_EEMPTY [object did not exist] -@author Dennis Heimbigner -*/ -int -NCZ_readarray(NCZMAP* zmap, const char* key, NCjson** jsonp) -{ - int stat = NC_NOERR; - NCjson* json = NULL; - - if((stat = NCZ_downloadjson(zmap,key,&json))) - goto done; - if(NCJsort(json) != NCJ_ARRAY) {stat = NC_ENCZARR; goto done;} - if(jsonp) {*jsonp = json; json = NULL;} -done: - NCJreclaim(json); - return stat; -} - #if 0 /** @internal Given an nc_type, produce the corresponding diff --git a/libnczarr/zxcache.c b/libnczarr/zxcache.c index 957ed1525..f4ab040d7 100644 --- a/libnczarr/zxcache.c +++ b/libnczarr/zxcache.c @@ -78,7 +78,7 @@ NCZ_set_var_chunk_cache(int ncid, int varid, size_t cachesize, size_t nelems, fl assert(grp && h5); /* Find the var. 
*/ - if (!(var = (NC_VAR_INFO_T *)ncindexith(grp->vars, varid))) + if (!(var = (NC_VAR_INFO_T *)ncindexith(grp->vars, (size_t)varid))) {retval = NC_ENOTVAR; goto done;} assert(var && var->hdr.id == varid); @@ -140,7 +140,7 @@ fprintf(stderr,"xxx: adjusting cache for: %s\n",var->hdr.name); zcache->chunksize = zvar->chunksize; zcache->chunkcount = 1; if(var->ndims > 0) { - int i; + size_t i; for(i=0;indims;i++) { zcache->chunkcount *= var->chunksizes[i]; } @@ -184,7 +184,7 @@ NCZ_create_chunk_cache(NC_VAR_INFO_T* var, size64_t chunksize, char dimsep, NCZC cache->chunkcount = 1; if(var->ndims > 0) { - int i; + size_t i; for(i=0;indims;i++) { cache->chunkcount *= var->chunksizes[i]; } @@ -297,7 +297,7 @@ NCZ_read_cache_chunk(NCZChunkCache* cache, const size64_t* indices, void** datap /* Create a new entry */ if((entry = calloc(1,sizeof(NCZCacheEntry)))==NULL) {stat = NC_ENOMEM; goto done;} - memcpy(entry->indices,indices,rank*sizeof(size64_t)); + memcpy(entry->indices,indices,(size_t)rank*sizeof(size64_t)); /* Create the key for this cache */ if((stat = NCZ_buildchunkpath(cache,indices,&entry->key))) goto done; entry->hashkey = hkey; @@ -496,7 +496,8 @@ done: int NCZ_ensure_fill_chunk(NCZChunkCache* cache) { - int i, stat = NC_NOERR; + int stat = NC_NOERR; + size_t i; NC_VAR_INFO_T* var = cache->var; nc_type typeid = var->type_info->hdr.id; size_t typesize = var->type_info->size; @@ -605,7 +606,7 @@ int NCZ_buildchunkkey(size_t R, const size64_t* chunkindices, char dimsep, char** keyp) { int stat = NC_NOERR; - int r; + size_t r; NCbytes* key = ncbytesnew(); if(keyp) *keyp = NULL; @@ -670,7 +671,7 @@ put_chunk(NCZChunkCache* cache, NCZCacheEntry* entry) if((stat = NC_reclaim_data_all(file->controller,tid,entry->data,cache->chunkcount))) goto done; entry->data = NULL; entry->data = strchunk; strchunk = NULL; - entry->size = cache->chunkcount * maxstrlen; + entry->size = (cache->chunkcount * (size64_t)maxstrlen); entry->isfixedstring = 1; } @@ -865,7 +866,7 @@ 
NCZ_dumpxcacheentry(NCZChunkCache* cache, NCZCacheEntry* e, NCbytes* buf) { char s[8192]; char idx[64]; - int i; + size_t i; ncbytescat(buf,"{"); snprintf(s,sizeof(s),"modified=%u isfiltered=%u indices=", diff --git a/libsrc4/nc4internal.c b/libsrc4/nc4internal.c index f7f32c7ca..3274c89a6 100644 --- a/libsrc4/nc4internal.c +++ b/libsrc4/nc4internal.c @@ -49,13 +49,15 @@ static NC_reservedatt NC_reserved[] = { {NC_ATT_FORMAT, READONLYFLAG}, /*_Format*/ {ISNETCDF4ATT, READONLYFLAG|NAMEONLYFLAG|VIRTUALFLAG}, /*_IsNetcdf4*/ {NCPROPS,READONLYFLAG|NAMEONLYFLAG|HIDDENATTRFLAG}, /*_NCProperties*/ - {NC_NCZARR_ATTR_UC, READONLYFLAG|NAMEONLYFLAG|HIDDENATTRFLAG}, /*_NCZARR_ATTR */ {NC_ATT_COORDINATES, READONLYFLAG|HIDDENATTRFLAG}, /*_Netcdf4Coordinates*/ {NC_ATT_DIMID_NAME, READONLYFLAG|HIDDENATTRFLAG}, /*_Netcdf4Dimid*/ {SUPERBLOCKATT, READONLYFLAG|NAMEONLYFLAG|VIRTUALFLAG}, /*_SuperblockVersion*/ {NC_ATT_NC3_STRICT_NAME, READONLYFLAG}, /*_nc3_strict*/ {NC_ATT_NC3_STRICT_NAME, READONLYFLAG}, /*_nc3_strict*/ {NC_NCZARR_ATTR, READONLYFLAG|HIDDENATTRFLAG}, /*_nczarr_attr */ + {NC_NCZARR_GROUP, READONLYFLAG|HIDDENATTRFLAG}, /*_nczarr_group */ + {NC_NCZARR_ARRAY, READONLYFLAG|HIDDENATTRFLAG}, /*_nczarr_array */ + {NC_NCZARR_SUPERBLOCK, READONLYFLAG|HIDDENATTRFLAG}, /*_nczarr_superblock */ }; #define NRESERVED (sizeof(NC_reserved) / sizeof(NC_reservedatt)) /*|NC_reservedatt*/ diff --git a/nczarr_test/Makefile.am b/nczarr_test/Makefile.am index 541526780..17d7bbf19 100644 --- a/nczarr_test/Makefile.am +++ b/nczarr_test/Makefile.am @@ -228,7 +228,7 @@ ref_any.cdl ref_oldformat.cdl ref_oldformat.zip ref_newformatpure.cdl \ ref_groups.h5 ref_byte.zarr.zip ref_byte_fill_value_null.zarr.zip \ ref_groups_regular.cdl ref_byte.cdl ref_byte_fill_value_null.cdl \ ref_jsonconvention.cdl ref_jsonconvention.zmap \ -ref_string.cdl ref_string_nczarr.baseline ref_string_zarr.baseline ref_scalar.cdl \ +ref_string.cdl ref_string_nczarr.baseline ref_string_zarr.baseline ref_scalar.cdl 
ref_scalar_nczarr.cdl \ ref_nulls_nczarr.baseline ref_nulls_zarr.baseline ref_nulls.cdl ref_notzarr.tar.gz # Interoperability files diff --git a/nczarr_test/ncdumpchunks.c b/nczarr_test/ncdumpchunks.c index 0c93ca8c9..758314aa2 100644 --- a/nczarr_test/ncdumpchunks.c +++ b/nczarr_test/ncdumpchunks.c @@ -50,7 +50,7 @@ typedef struct Format { int debug; int linear; int holevalue; - int rank; + size_t rank; size_t dimlens[NC_MAX_VAR_DIMS]; size_t chunklens[NC_MAX_VAR_DIMS]; size_t chunkcounts[NC_MAX_VAR_DIMS]; @@ -60,7 +60,7 @@ typedef struct Format { } Format; typedef struct Odometer { - int rank; /*rank */ + size_t rank; /*rank */ size_t start[NC_MAX_VAR_DIMS]; size_t stop[NC_MAX_VAR_DIMS]; size_t max[NC_MAX_VAR_DIMS]; /* max size of ith index */ @@ -71,11 +71,11 @@ typedef struct Odometer { #define ceildiv(x,y) (((x) % (y)) == 0 ? ((x) / (y)) : (((x) / (y)) + 1)) static char* captured[4096]; -static int ncap = 0; +static size_t ncap = 0; extern int nc__testurl(const char*,char**); -Odometer* odom_new(int rank, const size_t* stop, const size_t* max); +Odometer* odom_new(size_t rank, const size_t* stop, const size_t* max); void odom_free(Odometer* odom); int odom_more(Odometer* odom); int odom_next(Odometer* odom); @@ -120,9 +120,9 @@ cleanup(void) } Odometer* -odom_new(int rank, const size_t* stop, const size_t* max) +odom_new(size_t rank, const size_t* stop, const size_t* max) { - int i; + size_t i; Odometer* odom = NULL; if((odom = calloc(1,sizeof(Odometer))) == NULL) return NULL; @@ -339,12 +339,12 @@ dump(Format* format) { void* chunkdata = NULL; /*[CHUNKPROD];*/ Odometer* odom = NULL; - int r; + size_t r; size_t offset[NC_MAX_VAR_DIMS]; int holechunk = 0; char sindices[64]; #ifdef H5 - int i; + size_t i; hid_t fileid, grpid, datasetid; hid_t dxpl_id = H5P_DEFAULT; /*data transfer property list */ unsigned int filter_mask = 0; @@ -388,7 +388,7 @@ dump(Format* format) if((chunkdata = calloc(sizeof(int),format->chunkprod))==NULL) usage(NC_ENOMEM); - 
printf("rank=%d dims=(%s) chunks=(%s)\n",format->rank,printvector(format->rank,format->dimlens), + printf("rank=%zu dims=(%s) chunks=(%s)\n",format->rank,printvector(format->rank,format->dimlens), printvector(format->rank,format->chunklens)); while(odom_more(odom)) { @@ -506,12 +506,14 @@ done: int main(int argc, char** argv) { - int i,stat = NC_NOERR; + int stat = NC_NOERR; + size_t i; Format format; int ncid, varid, dimids[NC_MAX_VAR_DIMS]; int vtype, storage; int mode; int c; + int r; memset(&format,0,sizeof(format)); @@ -577,7 +579,8 @@ main(int argc, char** argv) /* Get the info about the var */ if((stat=nc_inq_varid(ncid,format.var_name,&varid))) usage(stat); - if((stat=nc_inq_var(ncid,varid,NULL,&vtype,&format.rank,dimids,NULL))) usage(stat); + if((stat=nc_inq_var(ncid,varid,NULL,&vtype,&r,dimids,NULL))) usage(stat); + format.rank = (size_t)r; if(format.rank == 0) usage(NC_EDIMSIZE); if((stat=nc_inq_var_chunking(ncid,varid,&storage,format.chunklens))) usage(stat); if(storage != NC_CHUNKED) usage(NC_EBADCHUNK); diff --git a/nczarr_test/ref_any.cdl b/nczarr_test/ref_any.cdl index bbbc30e86..3486f32e4 100644 --- a/nczarr_test/ref_any.cdl +++ b/nczarr_test/ref_any.cdl @@ -4,39 +4,21 @@ dimensions: dim1 = 4 ; dim2 = 4 ; variables: - int ivar(dim0, dim1, dim2) ; - ivar:_FillValue = -2147483647 ; - ivar:_Storage = @chunked@ ; - ivar:_ChunkSizes = 4, 4, 4 ; - ivar:_Filter = @IH5@ ; - ivar:_Codecs = @ICX@ ; float fvar(dim0, dim1, dim2) ; fvar:_FillValue = 9.96921e+36f ; fvar:_Storage = @chunked@ ; fvar:_ChunkSizes = 4, 4, 4 ; fvar:_Filter = @FH5@ ; fvar:_Codecs = @FCX@ ; + int ivar(dim0, dim1, dim2) ; + ivar:_FillValue = -2147483647 ; + ivar:_Storage = @chunked@ ; + ivar:_ChunkSizes = 4, 4, 4 ; + ivar:_Filter = @IH5@ ; + ivar:_Codecs = @ICX@ ; data: - ivar = - 0, 1, 2, 3, - 4, 5, 6, 7, - 8, 9, 10, 11, - 12, 13, 14, 15, - 16, 17, 18, 19, - 20, 21, 22, 23, - 24, 25, 26, 27, - 28, 29, 30, 31, - 32, 33, 34, 35, - 36, 37, 38, 39, - 40, 41, 42, 43, - 44, 45, 46, 47, - 48, 
49, 50, 51, - 52, 53, 54, 55, - 56, 57, 58, 59, - 60, 61, 62, 63 ; - fvar = 0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, @@ -54,4 +36,22 @@ data: 52.5, 53.5, 54.5, 55.5, 56.5, 57.5, 58.5, 59.5, 60.5, 61.5, 62.5, 63.5 ; + + ivar = + 0, 1, 2, 3, + 4, 5, 6, 7, + 8, 9, 10, 11, + 12, 13, 14, 15, + 16, 17, 18, 19, + 20, 21, 22, 23, + 24, 25, 26, 27, + 28, 29, 30, 31, + 32, 33, 34, 35, + 36, 37, 38, 39, + 40, 41, 42, 43, + 44, 45, 46, 47, + 48, 49, 50, 51, + 52, 53, 54, 55, + 56, 57, 58, 59, + 60, 61, 62, 63 ; } diff --git a/nczarr_test/ref_byte.cdl b/nczarr_test/ref_byte.cdl index 7be7c03d3..62c28a2d2 100644 --- a/nczarr_test/ref_byte.cdl +++ b/nczarr_test/ref_byte.cdl @@ -1,8 +1,8 @@ netcdf ref_byte { dimensions: - _zdim_20 = 20 ; + _Anonymous_Dim_20 = 20 ; variables: - ubyte byte(_zdim_20, _zdim_20) ; + ubyte byte(_Anonymous_Dim_20, _Anonymous_Dim_20) ; byte:_Storage = "chunked" ; byte:_ChunkSizes = 20, 20 ; diff --git a/nczarr_test/ref_byte_fill_value_null.cdl b/nczarr_test/ref_byte_fill_value_null.cdl index 6afd2ef37..93bcad298 100644 --- a/nczarr_test/ref_byte_fill_value_null.cdl +++ b/nczarr_test/ref_byte_fill_value_null.cdl @@ -1,8 +1,8 @@ netcdf ref_byte_fill_value_null { dimensions: - _zdim_20 = 20 ; + _Anonymous_Dim_20 = 20 ; variables: - ubyte byt(_zdim_20, _zdim_20) ; + ubyte byt(_Anonymous_Dim_20, _Anonymous_Dim_20) ; byt:_Storage = "chunked" ; byt:_ChunkSizes = 20, 20 ; byt:_NoFill = "true" ; diff --git a/nczarr_test/ref_byte_fill_value_null.zarr.zip b/nczarr_test/ref_byte_fill_value_null.zarr.zip index a548c912e16fd782da86beba404114579cb8256d..0717576450274be6f172507ab0cdec4de03862f8 100644 GIT binary patch delta 486 zcmZqToykAJRq?WAURr4dHv=Qf3uXoeFcIL*%)-S00{S!0M@;5qH0SWL&P$5{;>jM2 zvXkF4I&aA=I1mXgNhmDPo0~ly=z^GJZ0l5$W$AFDb delta 410 zcmbQq-^4q?Rs7jrwZzg2ZU#n{7t9O{U?RYqnPsvZyTs&q%mys{K>5j+m_sKrNlYx# zHhQ~9EfFMyMT-an0|N(xbG>A6okadGbs$d5lV^I(jRGz?8YE{kPTUdzv5$kXZn6TClN1)4e*do*;Zb9lEWqpwQJcuDJXxJ3 zZ?X!z9Am`f02al``mdS-I 
za%wpI0M)<<)V859oN;m>i^9YY?2`{Lu}v;!k(vCRMTO}hRFrqJ7pv0b`>b+IE14kb uSSBm8DS{j_c_*73lK?YFbh0ndjm0byBCKp6r?3OzYbFMUMqm^$FaQAV9%XL; diff --git a/nczarr_test/ref_groups_regular.cdl b/nczarr_test/ref_groups_regular.cdl index fd1875427..93ad0384b 100644 --- a/nczarr_test/ref_groups_regular.cdl +++ b/nczarr_test/ref_groups_regular.cdl @@ -1,15 +1,15 @@ netcdf tmp_groups_regular { dimensions: - _zdim_3 = 3 ; - _zdim_2 = 2 ; - _zdim_10 = 10 ; + _Anonymous_Dim_3 = 3 ; + _Anonymous_Dim_2 = 2 ; + _Anonymous_Dim_10 = 10 ; // global attributes: :_Format = "netCDF-4" ; group: MyGroup { variables: - int dset1(_zdim_3, _zdim_3) ; + int dset1(_Anonymous_Dim_3, _Anonymous_Dim_3) ; dset1:_Storage = "chunked" ; dset1:_ChunkSizes = 3, 3 ; dset1:_NoFill = "true" ; @@ -24,7 +24,7 @@ group: MyGroup { group: Group_A { variables: - int dset2(_zdim_2, _zdim_10) ; + int dset2(_Anonymous_Dim_2, _Anonymous_Dim_10) ; dset2:_Storage = "chunked" ; dset2:_ChunkSizes = 2, 10 ; dset2:_NoFill = "true" ; diff --git a/nczarr_test/ref_jsonconvention.cdl b/nczarr_test/ref_jsonconvention.cdl index 187fffd99..306730e42 100644 --- a/nczarr_test/ref_jsonconvention.cdl +++ b/nczarr_test/ref_jsonconvention.cdl @@ -5,8 +5,8 @@ variables: int v(d1) ; v:varjson1 = "{\"key1\": [1,2,3], \"key2\": {\"key3\": \"abc\"}}" ; v:varjson2 = "[[1.0,0.0,0.0],[0.0,1.0,0.0],[0.0,0.0,1.0]]" ; - v:varvec1 = "1.0, 0.0, 0.0" ; - v:varvec2 = "[0.,0.,1.]" ; + v:varjson3 = "[0.,0.,1.]" ; + v:varchar1 = "1.0, 0.0, 0.0" ; // global attributes: :globalfloat = 1. 
; diff --git a/nczarr_test/ref_jsonconvention.zmap b/nczarr_test/ref_jsonconvention.zmap index 4687f6c67..8d8e2ffa7 100644 --- a/nczarr_test/ref_jsonconvention.zmap +++ b/nczarr_test/ref_jsonconvention.zmap @@ -1,5 +1,5 @@ -[0] /.zattrs : () |{"globalfloat": 1, "globalfloatvec": [1,2], "globalchar": "abc", "globalillegal": "[ [ 1.0, 0.0, 0.0 ], [ 0.0, 1.0, 0.0 ], [ 0.0, 0.0, 1.0 ", "_nczarr_attr": {"types": {"globalfloat": "S1", "globalillegal": ">S1", "_NCProperties": ">S1"}}}| -[1] /.zgroup : () |{"zarr_format": 2, "_nczarr_superblock": {"version": "2.0.0"}, "_nczarr_group": {"dims": {"d1": 1}, "vars": ["v"], "groups": []}}| -[3] /v/.zarray : () |{"zarr_format": 2, "shape": [1], "dtype": "S1", "varjson2": ">S1", "varvec1": ">S1", "varvec2": ">S1"}}}| +[0] /.zattrs : () |{"globalfloat": 1, "globalfloatvec": [1,2], "globalchar": "abc", "globalillegal": "[ [ 1.0, 0.0, 0.0 ], [ 0.0, 1.0, 0.0 ], [ 0.0, 0.0, 1.0 ", "_nczarr_group": {"dimensions": {"d1": 1}, "arrays": ["v"], "groups": []}, "_nczarr_superblock": {"version": "2.0.0"}, "_nczarr_attr": {"types": {"globalfloat": "S1", "globalillegal": ">S1", "_NCProperties": ">S1", "_nczarr_group": "|J0", "_nczarr_superblock": "|J0", "_nczarr_attr": "|J0"}}}| +[1] /.zgroup : () |{"zarr_format": 2}| +[3] /v/.zarray : () |{"zarr_format": 2, "shape": [1], "dtype": "S1", "varjson2": ">S1", "varjson3": ">S1", "varchar1": ">S1", "_nczarr_array": "|J0", "_nczarr_attr": "|J0"}}}| [5] /v/0 : (4) (ubyte) |...| diff --git a/nczarr_test/ref_nczarr2zarr.cdl b/nczarr_test/ref_nczarr2zarr.cdl index 814201c81..68bf86c46 100644 --- a/nczarr_test/ref_nczarr2zarr.cdl +++ b/nczarr_test/ref_nczarr2zarr.cdl @@ -1,8 +1,8 @@ netcdf nczarr2zarr { dimensions: - _zdim_8 = 8 ; + _Anonymous_Dim_8 = 8 ; variables: - int v(_zdim_8, _zdim_8) ; + int v(_Anonymous_Dim_8, _Anonymous_Dim_8) ; v:_FillValue = -1 ; data: diff --git a/nczarr_test/ref_newformatpure.cdl b/nczarr_test/ref_newformatpure.cdl index 51058889f..210da3c02 100644 --- 
a/nczarr_test/ref_newformatpure.cdl +++ b/nczarr_test/ref_newformatpure.cdl @@ -1,8 +1,8 @@ netcdf ref_oldformat { dimensions: lat = 8 ; - _zdim_8 = 8 ; - _zdim_10 = 10 ; + _Anonymous_Dim_8 = 8 ; + _Anonymous_Dim_10 = 10 ; variables: int lat(lat) ; lat:_FillValue = -1 ; @@ -13,7 +13,7 @@ data: group: g1 { variables: - int pos(_zdim_8, _zdim_10) ; + int pos(_Anonymous_Dim_8, _Anonymous_Dim_10) ; pos:_FillValue = -1 ; string pos:pos_attr = "latXlon" ; diff --git a/nczarr_test/ref_purezarr.cdl b/nczarr_test/ref_purezarr.cdl index edc00790f..d0cfb4f0c 100644 --- a/nczarr_test/ref_purezarr.cdl +++ b/nczarr_test/ref_purezarr.cdl @@ -1,9 +1,9 @@ netcdf tmp_purezarr { dimensions: - _zdim_2 = 2 ; - _zdim_5 = 5 ; + _Anonymous_Dim_2 = 2 ; + _Anonymous_Dim_5 = 5 ; variables: - int i(_zdim_2, _zdim_5) ; + int i(_Anonymous_Dim_2, _Anonymous_Dim_5) ; data: i = diff --git a/nczarr_test/ref_scalar_nczarr.cdl b/nczarr_test/ref_scalar_nczarr.cdl new file mode 100644 index 000000000..dab2c44c8 --- /dev/null +++ b/nczarr_test/ref_scalar_nczarr.cdl @@ -0,0 +1,8 @@ +netcdf ref_scalar { +variables: + int v ; + v:_FillValue = -1 ; +data: + + v = 17 ; +} diff --git a/nczarr_test/ref_t_meta_dim1.cdl b/nczarr_test/ref_t_meta_dim1.cdl index c18b87c3a..be39caab9 100644 --- a/nczarr_test/ref_t_meta_dim1.cdl +++ b/nczarr_test/ref_t_meta_dim1.cdl @@ -15,7 +15,7 @@ group: _zgroup { group: _nczgroup { // group attributes: - :data = "{\"dims\": {\"dim1\": 1},\"vars\": [],\"groups\": []}" ; + :data = "{\"dimensions\": {\"dim1\": 1},\"arrays\": [],\"groups\": []}" ; } // group _nczgroup group: _nczattr { diff --git a/nczarr_test/ref_t_meta_var1.cdl b/nczarr_test/ref_t_meta_var1.cdl index 87b0421bd..becdb35b4 100644 --- a/nczarr_test/ref_t_meta_var1.cdl +++ b/nczarr_test/ref_t_meta_var1.cdl @@ -15,7 +15,7 @@ group: _zgroup { group: _nczgroup { // group attributes: - :data = "{\"dims\": {},\"vars\": [\"var1\"],\"groups\": []}" ; + :data = "{\"dimensions\": {},\"arrays\": [\"var1\"],\"groups\": []}" ; } 
// group _nczgroup group: _nczattr { diff --git a/nczarr_test/ref_ut_map_create.cdl b/nczarr_test/ref_ut_map_create.cdl index 092f26a21..68d2ed637 100644 --- a/nczarr_test/ref_ut_map_create.cdl +++ b/nczarr_test/ref_ut_map_create.cdl @@ -1 +1 @@ -[0] /.nczarr : (0) || +[0] /.zgroup : (0) || diff --git a/nczarr_test/ref_ut_map_readmeta2.txt b/nczarr_test/ref_ut_map_readmeta2.txt index eacaf3045..ec4c5ba98 100644 --- a/nczarr_test/ref_ut_map_readmeta2.txt +++ b/nczarr_test/ref_ut_map_readmeta2.txt @@ -1,4 +1,4 @@ -/meta2/.nczarray: |{ +/meta2/.zarray: |{ "foo": 42, "bar": "apples", "baz": [1, 2, 3, 4], diff --git a/nczarr_test/ref_ut_map_search.txt b/nczarr_test/ref_ut_map_search.txt index 65cf9bc98..23757a978 100644 --- a/nczarr_test/ref_ut_map_search.txt +++ b/nczarr_test/ref_ut_map_search.txt @@ -1,8 +1,8 @@ [0] / -[1] /.nczarr +[1] /.zgroup [2] /data1 [3] /data1/0 [4] /meta1 [5] /meta1/.zarray [6] /meta2 -[7] /meta2/.nczarray +[7] /meta2/.zarray diff --git a/nczarr_test/ref_ut_map_writedata.cdl b/nczarr_test/ref_ut_map_writedata.cdl index bf3f2780c..802bff04b 100644 --- a/nczarr_test/ref_ut_map_writedata.cdl +++ b/nczarr_test/ref_ut_map_writedata.cdl @@ -1,10 +1,10 @@ -[0] /.nczarr : (0) || +[0] /.zgroup : (0) || [2] /data1/0 : (25) (int) |0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24| [4] /meta1/.zarray : (50) |{ "foo": 42, "bar": "apples", "baz": [1, 2, 3, 4]}| -[6] /meta2/.nczarray : (64) |{ +[6] /meta2/.zarray : (64) |{ "foo": 42, "bar": "apples", "baz": [1, 2, 3, 4], diff --git a/nczarr_test/ref_ut_map_writemeta.cdl b/nczarr_test/ref_ut_map_writemeta.cdl index c82087637..f79e06aff 100644 --- a/nczarr_test/ref_ut_map_writemeta.cdl +++ b/nczarr_test/ref_ut_map_writemeta.cdl @@ -1,4 +1,4 @@ -[0] /.nczarr : (0) || +[0] /.zgroup : (0) || [2] /meta1/.zarray : (50) |{ "foo": 42, "bar": "apples", diff --git a/nczarr_test/ref_ut_map_writemeta2.cdl b/nczarr_test/ref_ut_map_writemeta2.cdl index 66d92407a..3af0d1c36 100644 --- 
a/nczarr_test/ref_ut_map_writemeta2.cdl +++ b/nczarr_test/ref_ut_map_writemeta2.cdl @@ -1,9 +1,9 @@ -[0] /.nczarr : (0) || +[0] /.zgroup : (0) || [2] /meta1/.zarray : (50) |{ "foo": 42, "bar": "apples", "baz": [1, 2, 3, 4]}| -[4] /meta2/.nczarray : (64) |{ +[4] /meta2/.zarray : (64) |{ "foo": 42, "bar": "apples", "baz": [1, 2, 3, 4], diff --git a/nczarr_test/ref_ut_mapapi_create.cdl b/nczarr_test/ref_ut_mapapi_create.cdl index 092f26a21..68d2ed637 100644 --- a/nczarr_test/ref_ut_mapapi_create.cdl +++ b/nczarr_test/ref_ut_mapapi_create.cdl @@ -1 +1 @@ -[0] /.nczarr : (0) || +[0] /.zgroup : (0) || diff --git a/nczarr_test/ref_ut_mapapi_data.cdl b/nczarr_test/ref_ut_mapapi_data.cdl index c34703065..a4dbb8f51 100644 --- a/nczarr_test/ref_ut_mapapi_data.cdl +++ b/nczarr_test/ref_ut_mapapi_data.cdl @@ -1,4 +1,4 @@ -[0] /.nczarr : (50) |{ +[0] /.zgroup : (50) |{ "foo": 42, "bar": "apples", "baz": [1, 2, 3, 4]}| diff --git a/nczarr_test/ref_ut_mapapi_meta.cdl b/nczarr_test/ref_ut_mapapi_meta.cdl index 4eb123557..cbd85cf34 100644 --- a/nczarr_test/ref_ut_mapapi_meta.cdl +++ b/nczarr_test/ref_ut_mapapi_meta.cdl @@ -1,4 +1,4 @@ -[0] /.nczarr : (50) |{ +[0] /.zgroup : (50) |{ "foo": 42, "bar": "apples", "baz": [1, 2, 3, 4]}| diff --git a/nczarr_test/ref_ut_mapapi_search.txt b/nczarr_test/ref_ut_mapapi_search.txt index e216d8c1c..d78bb421c 100644 --- a/nczarr_test/ref_ut_mapapi_search.txt +++ b/nczarr_test/ref_ut_mapapi_search.txt @@ -1,5 +1,5 @@ [0] / -[1] /.nczarr +[1] /.zgroup [2] /data1 [3] /meta1 [4] /meta1/.zarray diff --git a/nczarr_test/ref_ut_testmap_create.cdl b/nczarr_test/ref_ut_testmap_create.cdl index 5e7ce154e..75636ee80 100644 --- a/nczarr_test/ref_ut_testmap_create.cdl +++ b/nczarr_test/ref_ut_testmap_create.cdl @@ -15,7 +15,7 @@ group: _zgroup { group: _nczgroup { // group attributes: - :data = "{\"dims\": {},\"vars\": [],\"groups\": []}" ; + :data = "{\"dimensions\": {},\"arrays\": [],\"groups\": []}" ; } // group _nczgroup group: _nczattr { diff --git 
a/nczarr_test/ref_zarr_test_data_2d.cdl.gz b/nczarr_test/ref_zarr_test_data_2d.cdl.gz index e8d2e5eed0b400b710b6a0e20136e9aeabccd52e..9f4ce1e2aa64a59948c2c7219051600d72b3c8fe 100644 GIT binary patch delta 370 zcmV-&0ge9f0)+zzABzYGTWAPokq91tUr&QD6vdzEr?}aZCXHJG|A5(p$>K-wwdn+0 zlhQ72(Kxg3?gbVjlc6j=@&csyoZm^fgrjr+oE5MW)^eXa&oki#e2`3*xpvZMmy&=# zXrtdtQ#rnpB@Y*{1iai6@?uT3O`*B*H|aMVjSl?)ufj^PRyhIy0;zp{!d4f5KVX*# zLPgMwQ~oHGdJ?M2F=Ww>4(71OOks!@4)UVE@rEO!!HD?mh~d<@H?`hfO7Nk5Td>4O zPJvMfDMS=v3fB~FDBM!GqmbYTa~RGU3K%LFN*HPwiWsUG%2+}t&cvBG6KCT5E2sHs z?_oY~3-q`~`|4GAZ~OYyZH)C{ps~U8Y*b6OmZLUt?XXZrc-RW-rBJ*;1slcgIJ*%^ zl3W$1Qo^%S!e3Gv) ziE(3&{`bA9mNL(VEd_6e_F*v`IQ+eX!1xL8U66PNgWV^i)abwJb%U z%aZ>GPX2-$qm2%@ZYR?Ky;^lGrqErTHNJ0Zly;pD+o4A&v1?EmULcMa*WrxDhKHb2 zv%?PGF9${RhGaioXOFDMkSTaf^E*6r>x#vq%Y}AW0=oPTjP4lSGkRe3$Y={wFvHFP va1a~_2g3nzP#hTFvRP-=nRRBJS?8)cr$c-EW(pTje2l*V?-OHJjR*h$D8!|L diff --git a/nczarr_test/run_jsonconvention.sh b/nczarr_test/run_jsonconvention.sh index 64b629d85..9f9724cb8 100755 --- a/nczarr_test/run_jsonconvention.sh +++ b/nczarr_test/run_jsonconvention.sh @@ -23,15 +23,13 @@ deletemap $zext $file ${NCGEN} -4 -b -o "$fileurl" $srcdir/ref_jsonconvention.cdl ${NCDUMP} $fileurl > tmp_jsonconvention_${zext}.cdl ${ZMD} -h $fileurl > tmp_jsonconvention_${zext}.txt -# | sed -e 's/,key1=value1|key2=value2//' -e '/"_NCProperties"/ s/(378)/(354)/' # Clean up extraneous changes so comparisons work -# remove '\n' from ref file before comparing -#sed -e 's|\\n||g' < ${srcdir}/ref_jsonconvention.cdl > tmp_jsonconvention_clean.cdl -cat < ${srcdir}/ref_jsonconvention.cdl > tmp_jsonconvention_clean.cdl -cat < tmp_jsonconvention_${zext}.cdl > tmp_jsonconvention_clean_${zext}.cdl -sed -e 's|\(.z[a-z][a-z]*\) : ([0-9][0-9]*)|\1 : ()|g' < tmp_jsonconvention_${zext}.txt >tmp1.tmp -sed -e 's|"_NCProperties": "version=[0-9],[^"]*",||' tmp_jsonconvention_clean_${zext}.txt -diff -b tmp_jsonconvention_clean.cdl tmp_jsonconvention_clean_${zext}.cdl +cat < tmp_jsonconvention_${zext}.cdl > 
tmp_jsonconvention_clean_${zext}.cdl +cat < tmp_jsonconvention_${zext}.txt > tmp_jsonconvention_clean_${zext}.txt +sed -i.bak -e 's|"_NCProperties": "version=[0-9],[^"]*",||' tmp_jsonconvention_clean_${zext}.txt +sed -i.bak -e 's|\(.z[a-z][a-z]*\) : ([0-9][0-9]*)|\1 : ()|g' tmp_jsonconvention_clean_${zext}.txt +# compare +diff -b ${srcdir}/ref_jsonconvention.cdl tmp_jsonconvention_clean_${zext}.cdl diff -b ${srcdir}/ref_jsonconvention.zmap tmp_jsonconvention_clean_${zext}.txt } diff --git a/nczarr_test/run_scalar.sh b/nczarr_test/run_scalar.sh index b7c268ee5..3e09303ef 100755 --- a/nczarr_test/run_scalar.sh +++ b/nczarr_test/run_scalar.sh @@ -50,7 +50,7 @@ ${NCDUMP} -n ref_scalar $nczarrurl > tmp_scalar_nczarr_${zext}.cdl ${ZMD} -h $nczarrurl > tmp_scalar_nczarr_${zext}.txt echo "*** verify" -diff -bw $top_srcdir/nczarr_test/ref_scalar.cdl tmp_scalar_nczarr_${zext}.cdl +diff -bw $top_srcdir/nczarr_test/ref_scalar_nczarr.cdl tmp_scalar_nczarr_${zext}.cdl # Fixup zarrscalar tmp_scalar_zarr_${zext}.cdl tmp_rescale_zarr_${zext}.cdl diff --git a/nczarr_test/ut_json.c b/nczarr_test/ut_json.c index 37ab65d23..9dd4d3fee 100644 --- a/nczarr_test/ut_json.c +++ b/nczarr_test/ut_json.c @@ -159,7 +159,8 @@ done: static int cloneArray(NCjson* array, NCjson** clonep) { - int i, stat=NC_NOERR; + int stat=NC_NOERR; + size_t i; NCjson* clone = NULL; if((stat=NCJnew(NCJ_ARRAY,&clone))) goto done; for(i=0;i