Convert NCZarr meta-data to use only Zarr attributes

As discussed in a netcdf meeting, convert NCZarr V2 to store all netcdf-4 specific info as attributes. This improves interoperability with other Zarr implementations by no longer using non-standard keys.

## Other Changes
* Remove support for older NCZarr formats.
* Update anonymous dimension naming.
* Begin the process of fixing the -Wconversion and -Wsign-compare warnings in libnczarr, nczarr_test, and v3_nczarr_test.
* Update docs/nczarr.md.
* Rebuild using the .y and .l files.
Dennis Heimbigner 2024-06-19 18:09:29 -06:00
parent be009ed741
commit 076da97aa4
50 changed files with 1299 additions and 983 deletions

View File

@ -7,6 +7,7 @@ This file contains a high-level description of this package's evolution. Release
## 4.9.3 - TBD
* Convert NCZarr V2 to store all netcdf-4 specific info as attributes. This improves interoperability with other Zarr implementations by no longer using non-standard keys. See [Github #????](https://github.com/Unidata/netcdf-c/issues/????) for more information.
* Cleanup the option code for NETCDF_ENABLE_SET_LOG_LEVEL\[_FUNC\]. See [Github #2931](https://github.com/Unidata/netcdf-c/issues/2931) for more information.
* Fix duplicate definition when using aws-sdk-cpp. See [Github #2928](https://github.com/Unidata/netcdf-c/issues/2928) for more information.
* Cleanup various obsolete options and do some code refactoring. See [Github #2926](https://github.com/Unidata/netcdf-c/issues/2926) for more information.

View File

@ -8,13 +8,15 @@ The NetCDF NCZarr Implementation
# NCZarr Introduction {#nczarr_introduction}
Beginning with netCDF version 4.8.0, the Unidata NetCDF group has extended the netcdf-c library to provide access to cloud storage (e.g. Amazon S3 <a href="#ref_aws">[1]</a> ).
Beginning with netCDF version 4.8.0, the Unidata NetCDF group has extended the netcdf-c library to support data stored using the Zarr data model and storage format [4,6]. As part of this work, netCDF adds support for accessing data stored using cloud storage (e.g. Amazon S3 <a href="#ref_aws">[1]</a>).
The goal of this project is to provide maximum interoperability between the netCDF Enhanced (netcdf-4) data model and the Zarr version 2 <a href="#ref_zarrv2">[4]</a> data model. This is embodied in the netcdf-c library so that it is possible to use the netcdf API to read and write Zarr formatted datasets.
The goal of this project, then, is to provide maximum interoperability between the netCDF Enhanced (netcdf-4) data model and the Zarr version 2 <a href="#ref_zarr">[4]</a><!-- or Version 3 <a href="#ref_zarrv3">[13]</a>--> data model. This is embodied in the netcdf-c library so that it is possible to use the netcdf API to read and write Zarr formatted datasets.
In order to better support the netcdf-4 data model, the netcdf-c library implements a limited set of extensions to the Zarr data model.
In order to better support the netcdf-4 data model, the netcdf-c library implements a limited set of extensions to the *Zarr* data model.
This extended model is referred to as *NCZarr*.
An important goal is that those extensions not interfere with reading of those extended datasets by other Zarr specification conforming implementations. This means that one can write a dataset using the NCZarr extensions and expect that dataset to be readable by other Zarr implementations.
Additionally, another goal is to ensure interoperability between *NCZarr*
formatted files and standard (aka pure) *Zarr* formatted files.
This means that (1) an *NCZarr* file can be read by any other *Zarr* library (and especially the Zarr-python library), and (2) a standard *Zarr* file can be read by netCDF. Of course, there are limitations: other *Zarr* libraries will not use the extra *NCZarr* meta-data, and netCDF will have to "fake" meta-data not provided by a pure *Zarr* file.
As a secondary -- but equally important -- goal, it must be possible to use
the NCZarr library to read and write datasets that are pure Zarr,
@ -29,14 +31,12 @@ Notes on terminology in this document.
# The NCZarr Data Model {#nczarr_data_model}
NCZarr uses a data model <a href="#ref_nczarr">[4]</a> that, by design, extends the Zarr Version 2 Specification <a href="#ref_zarrv2">[6]</a> to add support for the NetCDF-4 data model.
NCZarr uses a data model that, by design, extends the Zarr Version 2 Specification <!--or Version 3 Specification-->.
__Note Carefully__: a legal _NCZarr_ dataset is also a legal _Zarr_ dataset under a specific assumption. This assumption is that within Zarr meta-data objects, like ''.zarray'', unrecognized dictionary keys are ignored.
If this assumption is true of an implementation, then the _NCZarr_ dataset is a legal _Zarr_ dataset and should be readable by that _Zarr_ implementation.
The inverse is true also. A legal _Zarr_ dataset is also a legal _NCZarr_
dataset, where "legal" means it conforms to the Zarr version 2 specification.
__Note Carefully__: a legal _NCZarr_ dataset is expected to also be a legal _Zarr_ dataset.
The inverse is true also. A legal _Zarr_ dataset is expected to also be a legal _NCZarr_ dataset, where "legal" means it conforms to the Zarr specification(s).
In addition, certain non-Zarr features are allowed and used.
Specifically the XArray ''\_ARRAY\_DIMENSIONS'' attribute is one such.
Specifically the XArray [7] ''\_ARRAY\_DIMENSIONS'' attribute is one such.
There are two other, secondary assumptions:
@ -45,9 +45,10 @@ There are two other, secondary assumption:
filters](./md_filters.html "filters") for details.
Briefly, the data model supported by NCZarr is netcdf-4 minus
the user-defined types. However, a restricted form of String type
is supported (see Appendix E).
As with netcdf-4 chunking is supported. Filters and compression
the user-defined types and full String type support.
However, a restricted form of String type
is supported (see Appendix D).
As with netcdf-4, chunking is supported. Filters and compression
are also [supported](./md_filters.html "filters").
Specifically, the model supports the following.
@ -74,8 +75,8 @@ When specified, they are treated as chunked where the file consists of only one
This means that testing for contiguous or compact is not possible; the _nc_inq_var_chunking_ function will always return NC_CHUNKED and the chunksizes will be the same as the dimension sizes of the variable's dimensions.
Additionally, it should be noted that NCZarr supports scalar variables,
but Zarr does not; Zarr only supports dimensioned variables.
In order to support interoperability, NCZarr does the following.
but Zarr Version 2 does not; Zarr V2 only supports dimensioned variables.
In order to support interoperability, NCZarr V2 does the following.
1. A scalar variable is recorded in the Zarr metadata as if it has a shape of **[1]**.
2. A note is stored in the NCZarr metadata that this is actually a netCDF scalar variable.
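For illustration, here is a hypothetical sketch of the metadata that might be written for a netCDF scalar variable under this scheme; the variable's dtype is invented, and the attribute layout follows the _\_nczarr_array\__ convention described later in this document.
````
".zarray": { "shape": [1], "chunks": [1], "dtype": "<f8" }
".zattrs": { "_nczarr_array": { "dimension_references": [], "storage": "scalar" } }
````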
@ -108,55 +109,62 @@ using URLs.
There are, however, some details that are important.
- Protocol: this should be _https_, _s3_, or _file_.
The _s3_ scheme is equivalent to "https" plus setting "mode=nczarr,s3" (see below). Specifying "file" is mostly used for testing, but is used to support directory tree or zipfile format storage.
The _s3_ scheme is equivalent to "https" plus setting "mode=s3".
Specifying "file" is mostly used for testing, but also for directory tree or zipfile format storage.
## Client Parameters
The fragment part of a URL specifies what data format is to be used, as well as additional controls for that data format.
For NCZarr support, the following _key=value_ pairs are allowed.
- mode=nczarr|zarr|noxarray|file|zip|s3
For reading, _key=value_ pairs are provided for specifying the storage format.
- mode=nczarr|zarr
Typically one will specify two mode flags: one to indicate what format
to use and one to specify the way the dataset is to be stored.
For example, a common one is "mode=zarr,file"
Additional pairs are provided to specify the Zarr version.
- mode=v2<!--|v3-->
Additional pairs are provided to specify the storage medium: Amazon S3 vs File tree vs Zip file.
- mode=file|zip|s3
Note that when reading, an attempt will be made to infer the
format, Zarr version, and storage medium by probing the
file. If inference fails, an error is reported. In this case,
the client may need to add specific mode flags rather than
relying on inference.
Typically one will specify two mode flags: one to indicate what format
to use and one to specify the way the dataset is to be stored<!--, and one to specify the Zarr format version-->.
For example, a common one is "mode=zarr,file<!--,v2-->".
<!--If not specified, the version will be the default specified when
the netcdf-c library was built.-->
Obviously, when creating a file, inferring the type of file to create
is not possible so the mode flags must be set specifically.
This means that both the storage medium and the exact storage
format must be specified.
Using _mode=nczarr_ causes the URL to be interpreted as a
reference to a dataset that is stored in NCZarr format.
The _zarr_ mode tells the library to
use NCZarr, but to restrict its operation to operate on pure
Zarr Version 2 datasets.
The _zarr_ mode tells the library to use NCZarr, but to restrict its operation to pure Zarr.
<!--The _v2_ mode specifies Version 2 and _v3_mode specifies Version 3.
If the version is not specified, it will default to the value specified when the netcdf-c library was built.-->
The modes _s3_, _file_, and _zip_ tell the library what storage
The modes _s3_, _file_, and _zip_ tell the library what storage medium
driver to use.
* The _s3_ driver is the default and indicates using Amazon S3 or some equivalent.
* The _file_ format stores data in a directory tree.
* The _zip_ format stores data in a local zip file.
* The _s3_ driver stores data using Amazon S3 or some equivalent.
* The _file_ driver stores data in a directory tree.
* The _zip_ driver stores data in a local zip file.
Note that It should be the case that zipping a _file_
As an aside, it should be the case that zipping a _file_
format directory tree will produce a file readable by the
_zip_ storage format, and vice-versa.
By default, the XArray convention is supported and used for
both NCZarr files and pure Zarr files. This
means that every variable in the root group whose named dimensions
By default, the XArray convention is supported for Zarr Version 2
and used for both NCZarr files and pure Zarr files.
<!--It is not needed for Version 3 and is ignored.-->
This means that every variable in the root group whose named dimensions
are also in the root group will have an attribute called
*\_ARRAY\_DIMENSIONS* that stores those dimension names.
The _noxarray_ mode tells the library to disable the XArray support.
The netcdf-c library is capable of inferring additional mode flags based on the flags it finds. Currently we have the following inferences.
- _zarr_ => _nczarr_
So for example: ````...#mode=zarr,zip```` is equivalent to this:
````...#mode=nczarr,zarr,zip````
<!--
- log=&lt;output-stream&gt;: this control turns on logging output,
which is useful for debugging and testing.
If just _log_ is used
then it is equivalent to _log=stderr_.
-->
# NCZarr Map Implementation {#nczarr_mapimpl}
Internally, the nczarr implementation has a map abstraction that allows different storage formats to be used.
@ -192,7 +200,7 @@ be a prefix of any other key.
There are several other concepts of note.
1. __Dataset__ - a dataset is the complete tree contained by the key defining
the root of the dataset.
the root of the dataset. The term __File__ will often be used as a synonym.
Technically, the root of the tree is the key \<dataset\>/.zgroup, where .zgroup can be considered the _superblock_ of the dataset.
2. __Object__ - equivalent of the S3 object; Each object has a unique key
and "contains" data in the form of an arbitrary sequence of 8-bit bytes.
@ -277,14 +285,15 @@ As with other URLS (e.g. DAP), these kind of URLS can be passed as the path argu
# NCZarr versus Pure Zarr. {#nczarr_purezarr}
The NCZARR format extends the pure Zarr format by adding extra keys such as ''\_NCZARR\_ARRAY'' inside the ''.zarray'' object.
It is possible to suppress the use of these extensions so that the netcdf library can read and write a pure zarr formatted file.
This is controlled by using ''mode=zarr'', which is an alias for the
''mode=nczarr,zarr'' combination.
The primary effects of using pure zarr are described in the [Translation Section](@ref nczarr_translation).
There are some constraints on the reading of Zarr datasets using the NCZarr implementation.
The NCZARR format extends the pure Zarr format by adding extra attributes such as ''\_nczarr\_array'' inside the ''.zattr'' object.
It is possible to suppress the use of these extensions so that the netcdf library can write a pure zarr formatted file. But this is probably unnecessary
since these attributes should be readable by any other Zarr implementation.
But these extra attributes might be seen as clutter and so it is possible
to suppress them when writing using *mode=zarr*.
Reading of pure Zarr files created using other implementations is a necessary
compatibility feature of NCZarr.
This requirement imposes some constraints on the reading of Zarr datasets using the NCZarr implementation.
1. Zarr allows some primitive types not recognized by NCZarr.
Over time, the set of unrecognized types is expected to diminish.
Examples of currently unsupported types are as follows:
@ -333,13 +342,14 @@ The reason for this is that the bucket name forms the initial segment in the key
## Data Model
The NCZarr storage format is almost identical to that of the standard Zarr version 2 format.
The NCZarr storage format is almost identical to that of the standard Zarr format.
The data model differs as follows.
1. Zarr only supports anonymous dimensions -- NCZarr supports only shared (named) dimensions.
2. Zarr attributes are untyped -- or perhaps more correctly characterized as of type string.
3. Zarr does not explicitly support unlimited dimensions -- NCZarr does support them.
## Storage Format
## Storage Medium
Consider both NCZarr and Zarr, and assume S3 notions of bucket and object.
In both systems, Groups and Variables (Array in Zarr) map to S3 objects.
@ -347,8 +357,7 @@ Containment is modeled using the fact that the dataset's key is a prefix of the
So for example, if variable _v1_ is contained in top level group g1 -- _/g1_ -- then the key for _v1_ is _/g1/v1_.
Additional meta-data information is stored in special objects whose name start with ".z".
In Zarr, the following special objects exist.
In Zarr Version 2, the following special objects exist.
1. Information about a group is kept in a special object named _.zgroup_;
so for example the object _/g1/.zgroup_.
2. Information about an array is kept as a special object named _.zarray_;
@ -359,45 +368,46 @@ so for example the objects _/g1/.zattr_ and _/g1/v1/.zattr_.
The first three contain meta-data objects in the form of a string representing a JSON-formatted dictionary.
The NCZarr format uses the same objects as Zarr, but inserts NCZarr
specific key-value pairs in them to hold NCZarr specific information
The value of each of these keys is a JSON dictionary containing a variety
specific attributes in the *.zattr* object to hold NCZarr specific information.
The value of each of these attributes is a JSON dictionary containing a variety
of NCZarr specific information.
These keys are as follows:
These NCZarr-specific attributes are as follows:
_\_nczarr_superblock\__ -- this is in the top level group -- key _/.zarr_.
_\_nczarr_superblock\__ -- this is in the top level group's *.zattr* object.
It is in effect the "superblock" for the dataset and contains
any netcdf specific dataset level information.
It is also used to verify that a given key is the root of a dataset.
Currently it contains the following key(s):
* "version" -- the NCZarr version defining the format of the dataset.
Currently it contains keys that are ignored and exist only to ensure that
older netcdf library versions do not crash.
* "version" -- the NCZarr version defining the format of the dataset (deprecated).
_\_nczarr_group\__ -- this key appears in every _.zgroup_ object.
_\_nczarr_group\__ -- this key appears in every group's _.zattr_ object.
It contains any netcdf specific group information.
Specifically it contains the following keys:
* "dims" -- the name and size of shared dimensions defined in this group, as well an optional flag indictating if the dimension is UNLIMITED.
* "vars" -- the name of variables defined in this group.
* "dimensions" -- the name and size of shared dimensions defined in this group, as well an optional flag indictating if the dimension is UNLIMITED.
* "arrays" -- the name of variables defined in this group.
* "groups" -- the name of sub-groups defined in this group.
These lists allow walking the NCZarr dataset without having to use the potentially costly search operation.
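As an illustration, a hypothetical _\_nczarr_group\__ value for a group containing one shared dimension, one variable, and one sub-group might look like this (all names invented):
````
"_nczarr_group": {
    "dimensions": [{"name": "d1", "size": 10, "unlimited": 0}],
    "arrays": ["v1"],
    "groups": ["g1"]
}
````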
_\_nczarr_array\__ -- this key appears in every _.zarray_ object.
_\_nczarr_array\__ -- this key appears in the *.zattr* object associated
with a _.zarray_ object.
It contains netcdf specific array information.
Specifically it contains the following keys:
* dimrefs -- the names of the shared dimensions referenced by the variable.
* storage -- indicates if the variable is chunked vs contiguous in the netcdf sense.
* dimension_references -- the fully qualified names of the shared dimensions referenced by the variable.
* storage -- indicates if the variable is chunked vs contiguous in the netcdf sense. Also signals if a variable is scalar.
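For example, a hypothetical _\_nczarr_array\__ value for a chunked variable referencing two shared dimensions (the FQNs are invented):
````
"_nczarr_array": {
    "dimension_references": ["/g1/d1", "/d2"],
    "storage": "chunked"
}
````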
_\_nczarr_attr\__ -- this key appears in every _.zattr_ object.
This means that technically, it is an attribute, but one for which access
is normally suppressed.
_\_nczarr_attr\__ -- this attribute appears in every _.zattr_ object.
Specifically it contains the following keys:
* types -- the types of all of the other attributes in the _.zattr_ object.
* types -- the types of all attributes in the _.zattr_ object.
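A hypothetical _\_nczarr_attr\__ value recording the types of two ordinary attributes might look as follows; the attribute names are invented and the dtype codes follow the JSON attribute convention in Appendix C:
````
"_nczarr_attr": {
    "types": {"units": ">S1", "valid_min": "<i4"}
}
````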
## Translation {#nczarr_translation}
With some constraints, it is possible for an nczarr library to read the pure Zarr format and for a zarr library to read the nczarr format.
The latter case, zarr reading nczarr is possible if the zarr library is willing to ignore keys whose name it does not recognize; specifically anything beginning with _\_nczarr\__.
With some loss of netcdf-4 information, it is possible for an nczarr library to read the pure Zarr format and for other zarr libraries to read the nczarr format.
The former case, nczarr reading zarr is also possible if the nczarr can simulate or infer the contents of the missing _\_nczarr\_xxx_ objects.
The latter case, zarr reading nczarr, is trivial because all of the nczarr metadata is stored as ordinary, string-valued (but JSON syntax) attributes.
The former case, nczarr reading zarr, is possible assuming the nczarr code can simulate or infer the contents of the missing _\_nczarr\_xxx_ attributes.
As a rule this can be done as follows.
1. _\_nczarr_group\__ -- The list of contained variables and sub-groups can be computed using the search API to list the keys "contained" in the key for a group.
The search looks for occurrences of _.zgroup_, _.zattr_, _.zarray_ to infer the keys for the contained groups, attribute sets, and arrays (variables).
@ -405,9 +415,8 @@ Constructing the set of "shared dimensions" is carried out
by walking all the variables in the whole dataset and collecting
the set of unique integer shapes for the variables.
For each such dimension length, a top level dimension is created
named ".zdim_<len>" where len is the integer length.
2. _\_nczarr_array\__ -- The dimrefs are inferred by using the shape
in _.zarray_ and creating references to the simulated shared dimension.
named "_Anonymous_Dimension_<len>" where len is the integer length.
2. _\_nczarr_array\__ -- The dimension references are inferred by using the shape in _.zarray_ and creating references to the simulated shared dimensions (see the sketch after this list).
3. _\_nczarr_attr\__ -- The type of each attribute is inferred by trying to parse the first attribute value string.
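To make the preceding rules concrete: given a pure Zarr array whose _.zarray_ contains (a hypothetical sketch)
````
"shape": [2, 5]
````
the simulated shared dimensions would be "_Anonymous_Dimension_2" and "_Anonymous_Dimension_5", created at the top level and then referenced from the inferred _\_nczarr_array\__ information.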
@ -417,13 +426,15 @@ In order to accommodate existing implementations, certain mode tags are provided.
## XArray
The Xarray [XArray Zarr Encoding Specification](http://xarray.pydata.org/en/latest/internals.html#zarr-encoding-specification) Zarr implementation uses its own mechanism for specifying shared dimensions.
The Xarray [7] Zarr implementation uses its own mechanism for specifying shared dimensions.
It uses a special attribute named ''_ARRAY_DIMENSIONS''.
The value of this attribute is a list of dimension names (strings).
An example might be ````["time", "lon", "lat"]````.
It is essentially equivalent to the ````_nczarr_array "dimrefs" list````, except that the latter uses fully qualified names so the referenced dimensions can be anywhere in the dataset.
It is almost equivalent to the ````_nczarr_array "dimension_references" list````, except that the latter uses fully qualified names so the referenced dimensions can be anywhere in the dataset. The Xarray dimension list differs from the netcdf-4 shared dimensions in two ways.
1. Specifying Xarray in a non-root group has no meaning in the current Xarray specification.
2. A given name can be associated with different lengths, even within a single array. This is considered an error in NCZarr.
As of _netcdf-c_ version 4.8.2, The Xarray ''_ARRAY_DIMENSIONS'' attribute is supported for both NCZarr and pure Zarr.
The Xarray ''_ARRAY_DIMENSIONS'' attribute is supported for both NCZarr and pure Zarr.
If possible, this attribute will be read/written by default,
but can be suppressed if the mode value "noxarray" is specified.
If detected, then these dimension names are used to define shared dimensions.
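To make this concrete, a variable's _.zattrs_ might contain both forms side by side when both conventions are written (a hypothetical sketch reusing the dimension names above):
````
"_ARRAY_DIMENSIONS": ["time", "lon", "lat"],
"_nczarr_array": {"dimension_references": ["/time", "/lon", "/lat"], "storage": "chunked"}
````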
@ -431,6 +442,8 @@ The following conditions will cause ''_ARRAY_DIMENSIONS'' to not be written.
* The variable is not in the root group,
* Any dimension referenced by the variable is not in the root group.
Note that this attribute is not needed for Zarr Version 3, and is ignored.
# Examples {#nczarr_examples}
Here are a couple of examples using the _ncgen_ and _ncdump_ utilities.
@ -453,34 +466,17 @@ Here are a couple of examples using the _ncgen_ and _ncdump_ utilities.
```
5. Create an nczarr file using the s3 protocol with a specific profile
```
ncgen -4 -lb -o 's3://datasetbucket/rootkey\#mode=nczarr,awsprofile=unidata' dataset.cdl
ncgen -4 -lb -o "s3://datasetbucket/rootkey\#mode=nczarr&awsprofile=unidata" dataset.cdl
```
Note that the URL is internally translated to this
```
'https://s2.&lt;region&gt.amazonaws.com/datasetbucket/rootkey#mode=nczarr,awsprofile=unidata' dataset.cdl
```
# References {#nczarr_bib}
<a name="ref_aws">[1]</a> [Amazon Simple Storage Service Documentation](https://docs.aws.amazon.com/s3/index.html)<br>
<a name="ref_awssdk">[2]</a> [Amazon Simple Storage Service Library](https://github.com/aws/aws-sdk-cpp)<br>
<a name="ref_libzip">[3]</a> [The LibZip Library](https://libzip.org/)<br>
<a name="ref_nczarr">[4]</a> [NetCDF ZARR Data Model Specification](https://www.unidata.ucar.edu/blogs/developer/en/entry/netcdf-zarr-data-model-specification)<br>
<a name="ref_python">[5]</a> [Python Documentation: 8.3.
collections — High-performance dataset datatypes](https://docs.python.org/2/library/collections.html)<br>
<a name="ref_zarrv2">[6]</a> [Zarr Version 2 Specification](https://zarr.readthedocs.io/en/stable/spec/v2.html)<br>
<a name="ref_xarray">[7]</a> [XArray Zarr Encoding Specification](http://xarray.pydata.org/en/latest/internals.html#zarr-encoding-specification)<br>
<a name="dynamic_filter_loading">[8]</a> [Dynamic Filter Loading](https://support.hdfgroup.org/HDF5/doc/Advanced/DynamicallyLoadedFilters/HDF5DynamicallyLoadedFilters.pdf)<br>
<a name="official_hdf5_filters">[9]</a> [Officially Registered Custom HDF5 Filters](https://portal.hdfgroup.org/display/support/Registered+Filter+Plugins)<br>
<a name="blosc-c-impl">[10]</a> [C-Blosc Compressor Implementation](https://github.com/Blosc/c-blosc)<br>
<a name="ref_awssdk_conda">[11]</a> [Conda-forge packages / aws-sdk-cpp](https://anaconda.org/conda-forge/aws-sdk-cpp)<br>
<a name="ref_gdal">[12]</a> [GDAL Zarr](https://gdal.org/drivers/raster/zarr.html)<br>
````
"https://s2.&lt;region&gt.amazonaws.com/datasetbucket/rootkey\#mode=nczarr&awsprofile=unidata"
````
# Appendix A. Building NCZarr Support {#nczarr_build}
Currently the following build cases are known to work.
Note that this does not include S3 support.
A separate tabulation of S3 support is in the document cloud.md.
A separate tabulation of S3 support is in the document _cloud.md_.
<table>
<tr><td><u>Operating System</u><td><u>Build System</u><td><u>NCZarr</u>
@ -551,24 +547,9 @@ Some of the relevant limits are as follows:
Note that the limit is defined in terms of bytes and not (Unicode) characters.
This affects the depth to which groups can be nested because the key encodes the full path name of a group.
# Appendix C. NCZarr Version 1 Meta-Data Representation. {#nczarr_version1}
# Appendix C. JSON Attribute Convention. {#nczarr_json}
In NCZarr Version 1, the NCZarr specific metadata was represented using new objects rather than as keys in existing Zarr objects.
Due to conflicts with the Zarr specification, that format is deprecated in favor of the one described above.
However the netcdf-c NCZarr support can still read the version 1 format.
The version 1 format defines three specific objects: _.nczgroup_, _.nczarray_, _.nczattr_.
These are stored in parallel with the corresponding Zarr objects. So if there is a key of the form "/x/y/.zarray", then there is also a key "/x/y/.nczarray".
The content of these objects is the same as the contents of the corresponding keys. So the value of the ''_NCZARR_ARRAY'' key is the same as the content of the ''.nczarray'' object. The list of connections is as follows:
* ''.nczarr'' <=> ''_nczarr_superblock_''
* ''.nczgroup'' <=> ''_nczarr_group_''
* ''.nczarray'' <=> ''_nczarr_array_''
* ''.nczattr'' <=> ''_nczarr_attr_''
# Appendix D. JSON Attribute Convention. {#nczarr_json}
The Zarr V2 specification is somewhat vague on what is a legal
The Zarr V2 <!--(and V3)--> specification is somewhat vague on what is a legal
value for an attribute. The examples all show one of two cases:
1. A simple JSON scalar atomic value (e.g. int, float, char, etc.), or
2. A JSON array of such values.
@ -581,7 +562,7 @@ complex JSON expression. An example is the GDAL Driver
convention <a href='#ref_gdal'>[12]</a>, where the value is a complex
JSON dictionary.
In order for NCZarr to be as consistent as possible with Zarr Version 2,
In order for NCZarr to be as consistent as possible with Zarr,
it is desirable to support this convention for attribute values.
This means that there must be some way to handle an attribute
whose value is not either of the two cases above. That is, its value
@ -611,12 +592,12 @@ There are multiple cases to consider.
3. The netcdf attribute **is** of type NC_CHAR and its value &ndash; taken as a single sequence of characters &ndash;
**is** parseable as a legal JSON expression.
* Parse to produce a JSON expression and write that expression.
* Use "|U1" as the dtype and store in the NCZarr metadata.
* Use "|J0" as the dtype and store in the NCZarr metadata.
4. The netcdf attribute **is** of type NC_CHAR and its value &ndash; taken as a single sequence of characters &ndash;
**is not** parseable as a legal JSON expression.
* Convert to a JSON string and write that expression
* Use "|U1" as the dtype and store in the NCZarr metadata.
* Use ">S1" as the dtype and store in the NCZarr metadata.
## Reading an attribute:
@ -640,10 +621,7 @@ and then store it as the equivalent netcdf vector.
* If the dtype is not defined, then infer the dtype based on the first JSON value in the array,
and then store it as the equivalent netcdf vector.
3. The JSON expression is an array some of whose values are dictionaries or (sub-)arrays.
* Un-parse the expression to an equivalent sequence of characters, and then store it as of type NC_CHAR.
3. The JSON expression is a dictionary.
3. The attribute is any other JSON structure.
* Un-parse the expression to an equivalent sequence of characters, and then store it as of type NC_CHAR.
## Notes
@ -654,7 +632,7 @@ actions "read-write-read" is equivalent to a single "read" and "write-read-write
The "almost" caveat is necessary because (1) whitespace may be added or lost during the sequence of operations,
and (2) numeric precision may change.
# Appendix E. Support for string types
# Appendix D. Support for string types
Zarr supports a string type, but it is restricted to
fixed size strings. NCZarr also supports such strings,
@ -702,6 +680,182 @@ the above types should always appear as strings,
and the type that signals NC_CHAR (in NCZarr)
would be handled by Zarr as a string of length 1.
<!--
# Appendix E. Zarr Version 3: NCZarr Version 3 Meta-Data Representation. {#nczarr_version3}
For Zarr version 3, the added NCZarr specific metadata is stored
as attributes pretty much the same as for Version 2.
Specifically, the following Netcdf-4 meta-data information needs to be captured by NCZarr:
1. Shared dimensions: name and size.
2. Unlimited dimensions: which dimensions are unlimited.
3. Attribute types.
4. Netcdf types not included in Zarr: currently "char" and "string".
5. Zarr types not included in Netcdf: currently only "complex(32|64)"
This extra netcdf-4 meta-data is stored as attributes so as not to interfere with existing implementations.
## Supported Types
Zarr version 3 supports the following "atomic" types:
bool, int8, uint8, int16, uint16, int32, uint32, int64, uint64, float32, float64.
It also defines two structured types: complex64 and complex128.
NCZarr supports all of the atomic types.
Specialized support is provided for the following
Netcdf types: char, string.
The Zarr types bool and complex64 are not yet supported, but will be added shortly.
The type complex128 is not supported at all.
The Zarr type "bool" will appear in the netcdf types as
the enum type "_bool" whose netcdf declaration is as follows:
````
ubyte enum _bool_t {FALSE=0, TRUE=1};
````
The type complex64 will be supported by defining this compound type:
````
compound _Complex64_t { float64 i; float64 j;}
````
Strings present a problem because there is a proposal
to add variable length strings to the Zarr version 3 specification;
fixed-length strings would not be supported at all.
But strings are important in Netcdf, so a forward compatible
representation is provided where the type is string
and its maximum size is specified.
For arrays, the Netcdf types "char" and "string" are stored
in the Zarr file as the types "uint8" and "r<8*n>", respectively,
where _n_ is the maximum length of the string in bytes (not characters).
The fact that they represent "char" and "string" is encoded in the "_nczarr_array" attribute (see below).
## NCZarr Superblock
The *_nczarr_superblock* attribute serves as a marker to signal that a file is in fact NCZarr as opposed to Zarr.
This attribute is stored in the *zarr.json* attributes in the root group of the Zarr file.
The relevant attribute has the following format:
````
"_nczarr_superblock": {
"version": "3.0.0",
format": 3
}
````
## Group Annotations
The optional *_nczarr_group* attribute is stored in the attributes of a Zarr group within
the *zarr.json* object in that group.
The relevant attribute has the following format:
````
"_nczarr_group": {
\"dimensions\": [{name: <dimname>, size: <integer>, unlimited: 1|0},...],
\"arrays\": ["<name>",...],
\"subgroups\": ["<name>",...]
}
````
Its purpose is two-fold:
1. record the objects immediately within that group
2. define netcdf-4 dimension objects within that group.
## Array Annotations
In order to support Netcdf concepts in Zarr, it may be necessary
to annotate a Zarr array with extra information.
The optional *_nczarr_array* attribute is stored in the attributes of a Zarr array within
the *zarr.json* object in that array.
The relevant attribute has the following format:
````
"_nczarr_array": {
\"dimension_references\": [\"/g1/g2/d1\", \"/d2\",...],
\"type_alias\": "<string indicating special type aliasing>" // optional
}
````
The *dimension_references* key is an expansion of the "dimensions" key
found in the *zarr.json* object for an array.
The problem with "dimensions" is that it specifies a simple name for each
dimension, whereas netcdf-4 requires that the array references dimension objects
that may appear in groups anywhere in the file. These references are encoded
as FQNs "pointing" to a specific dimension declaration (see *_nczarr_group* attribute
defined previously).
FQN is an acronym for "Fully Qualified Name".
It is a series of names separated by the "/" character, much
like a file system path.
It identifies the group in which the dimension is ostensibly "defined" in the Netcdf sense.
For example ````/d1```` defines a dimension "d1" defined in the root group.
Similarly ````/g1/g2/d2```` defines a dimension "d2" defined in the
group g2, which in turn is a subgroup of group g1, which is a subgroup
of the root group.
The *type_alias* key is used to annotate the type of an array
to allow discovery of netcdf-4 specific types.
Specifically, there are three current cases:
| dtype | type_alias |
| ----- | ---------- |
| uint8 | char |
| rn | string |
| uint8 | json |
If, for example, an array's dtype is specified as *uint8*, then it may be that
it is actually of unsigned 8-bit integer type. But it may actually be of some
netcdf-4 type that is encoded as *uint8* in order to be recognized by other -- pure zarr --
implementations. So, for example, if the netcdf-4 type is *char*, then the array's
dtype is *uint8*, but its type alias is *char*.
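Continuing this example, a sketch of the relevant fragment of an array's *zarr.json* under this scheme, assuming the Version 3 array metadata names its type field "data_type":
````
"data_type": "uint8",
"attributes": {
    "_nczarr_array": {"type_alias": "char"}
}
````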
## Attribute Type Annotation
In Zarr version 3, group and array attributes are stored inside
the corresponding _zarr.json_ object under the dictionary key "attributes".
Note that this decision is still under discussion and it may be changed
to store attributes in an object separate from _zarr.json_.
Regardless of where the attributes are stored, and in order to
support netcdf-4 typed attributes, the per-attribute information
is stored as a special attribute called _\_nczarr_attrs\__ defined to hold
NCZarr specific attribute information. Currently, it only holds
the attribute typing information.
It can appear in any *zarr.json* object: group or array.
Its form is this:
````
"_nczarr_attrs": {
"attribute_types": [
{"name": "attr1", "configuration": {"type": "<dtype>"}},
...
]
}
````
There is one entry for every attribute (including itself) giving the type
of that attribute.
It should be noted that Zarr allows the value of an attribute to be an arbitrary
JSON-encoded structure. In order to support this in netcdf-4, if such a structure
is encountered as an attribute value, then it is typed as *json* (see the previously
described table).
## Codec Specification
The Zarr version 3 representation of codecs is slightly different
than that used by Zarr version 2.
In version 2, the codec is represented by this JSON template.
````
{"id": "<codec name>" "<param>": "<value>", "<param>": "<value>", ...}
````
In version 3, the codec is represented by this JSON template.
````
{"name": "<codec name>" "configuration": {"<param>": "<value>", "<param>": "<value>", ...}}
````
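As a concrete instance (with hypothetical parameter values), a blosc codec <a href='#blosc-c-impl'>[10]</a> might be represented in the two versions as follows.
````
{"id": "blosc", "cname": "lz4", "clevel": 5}
{"name": "blosc", "configuration": {"cname": "lz4", "clevel": 5}}
````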
-->
# References {#nczarr_bib}
<a name="ref_aws">[1]</a> [Amazon Simple Storage Service Documentation](https://docs.aws.amazon.com/s3/index.html)<br>
<a name="ref_awssdk">[2]</a> [Amazon Simple Storage Service Library](https://github.com/aws/aws-sdk-cpp)<br>
<a name="ref_libzip">[3]</a> [The LibZip Library](https://libzip.org/)<br>
<a name="ref_nczarr">[4]</a> [NetCDF ZARR Data Model Specification](https://www.unidata.ucar.edu/blogs/developer/en/entry/netcdf-zarr-data-model-specification)<br>
<a name="ref_python">[5]</a> [Python Documentation: 8.3.
collections — High-performance dataset datatypes](https://docs.python.org/2/library/collections.html)<br>
<a name="ref_zarrv2">[6]</a> [Zarr Version 2 Specification](https://zarr.readthedocs.io/en/stable/spec/v2.html)<br>
<a name="ref_xarray">[7]</a> [XArray Zarr Encoding Specification](http://xarray.pydata.org/en/latest/internals.html#zarr-encoding-specification)<br>
<a name="dynamic_filter_loading">[8]</a> [Dynamic Filter Loading](https://support.hdfgroup.org/HDF5/doc/Advanced/DynamicallyLoadedFilters/HDF5DynamicallyLoadedFilters.pdf)<br>
<a name="official_hdf5_filters">[9]</a> [Officially Registered Custom HDF5 Filters](https://portal.hdfgroup.org/display/support/Registered+Filter+Plugins)<br>
<a name="blosc-c-impl">[10]</a> [C-Blosc Compressor Implementation](https://github.com/Blosc/c-blosc)<br>
<a name="ref_awssdk_conda">[11]</a> [Conda-forge packages / aws-sdk-cpp](https://anaconda.org/conda-forge/aws-sdk-cpp)<br>
<a name="ref_gdal">[12]</a> [GDAL Zarr](https://gdal.org/drivers/raster/zarr.html)<br>
<!--
<a name="ref_nczarrv3">[13]</a> [NetCDF ZARR Data Model Specification Version 3](https://zarr-specs.readthedocs.io/en/latest/specs.html)
-->
# Change Log {#nczarr_changelog}
[Note: minor text changes are not included.]
@ -710,6 +864,12 @@ intended to be a detailed chronology. Rather, it provides highlights
that will be of interest to NCZarr users. In order to see exact changes,
it is necessary to use the 'git diff' command.
## 03/31/2024
1. Document the change to V2 to using attributes to hold NCZarr metadata.
## 01/31/2024
1. Add description of support for Zarr version 3 as an appendix.
## 3/10/2023
1. Move most of the S3 text to the cloud.md document.
@ -729,4 +889,4 @@ include arbitrary JSON expressions; see Appendix D for more details.
__Author__: Dennis Heimbigner<br>
__Email__: dmh at ucar dot edu<br>
__Initial Version__: 4/10/2020<br>
__Last Revised__: 3/8/2023
__Last Revised__: 4/02/2024

View File

@ -512,6 +512,5 @@ extern void NC_initialize_reserved(void);
#define NC_NCZARR_GROUP "_nczarr_group"
#define NC_NCZARR_ARRAY "_nczarr_array"
#define NC_NCZARR_ATTR "_nczarr_attr"
#define NC_NCZARR_ATTR_UC "_NCZARR_ATTR" /* deprecated */
#endif /* _NC4INTERNAL_ */

View File

@ -57,7 +57,7 @@ typedef struct NCjson {
int sort; /* of this object */
char* string; /* sort != DICT|ARRAY */
struct NCjlist {
int len;
size_t len;
struct NCjson** contents;
} list; /* sort == DICT|ARRAY */
} NCjson;
@ -108,7 +108,14 @@ OPTEXPORT int NCJaddstring(NCjson* json, int sort, const char* s);
OPTEXPORT int NCJappend(NCjson* object, NCjson* value);
/* Insert key-value pair into a dict object. key will be copied */
OPTEXPORT int NCJinsert(NCjson* object, char* key, NCjson* value);
OPTEXPORT int NCJinsert(NCjson* object, const char* key, NCjson* value);
/* Insert key-value pair as strings into a dict object.
key and value will be copied */
OPTEXPORT int NCJinsertstring(NCjson* object, const char* key, const char* value);
/* Insert key-value pair where value is an int */
OPTEXPORT int NCJinsertint(NCjson* object, const char* key, long long ivalue);
/* Unparser to convert NCjson object to text in buffer */
OPTEXPORT int NCJunparse(const NCjson* json, unsigned flags, char** textp);
@ -131,8 +138,10 @@ OPTEXPORT const char* NCJtotext(const NCjson* json);
#define NCJsort(x) ((x)->sort)
#define NCJstring(x) ((x)->string)
#define NCJlength(x) ((x)==NULL ? 0 : (x)->list.len)
#define NCJdictlength(x) ((x)==NULL ? 0 : (x)->list.len/2)
#define NCJcontents(x) ((x)->list.contents)
#define NCJith(x,i) ((x)->list.contents[i])
#define NCJdictith(x,i) ((x)->list.contents[2*i])
/* Setters */
#define NCJsetsort(x,s) (x)->sort=(s)

View File

@ -57,7 +57,7 @@ typedef struct NCjson {
int sort; /* of this object */
char* string; /* sort != DICT|ARRAY */
struct NCjlist {
int len;
size_t len;
struct NCjson** contents;
} list; /* sort == DICT|ARRAY */
} NCjson;
@ -108,7 +108,14 @@ OPTEXPORT int NCJaddstring(NCjson* json, int sort, const char* s);
OPTEXPORT int NCJappend(NCjson* object, NCjson* value);
/* Insert key-value pair into a dict object. key will be copied */
OPTEXPORT int NCJinsert(NCjson* object, char* key, NCjson* value);
OPTEXPORT int NCJinsert(NCjson* object, const char* key, NCjson* value);
/* Insert key-value pair as strings into a dict object.
key and value will be copied */
OPTEXPORT int NCJinsertstring(NCjson* object, const char* key, const char* value);
/* Insert key-value pair where value is an int */
OPTEXPORT int NCJinsertint(NCjson* object, const char* key, long long ivalue);
/* Unparser to convert NCjson object to text in buffer */
OPTEXPORT int NCJunparse(const NCjson* json, unsigned flags, char** textp);
@ -131,8 +138,10 @@ OPTEXPORT const char* NCJtotext(const NCjson* json);
#define NCJsort(x) ((x)->sort)
#define NCJstring(x) ((x)->string)
#define NCJlength(x) ((x)==NULL ? 0 : (x)->list.len)
#define NCJdictlength(x) ((x)==NULL ? 0 : (x)->list.len/2)
#define NCJcontents(x) ((x)->list.contents)
#define NCJith(x,i) ((x)->list.contents[i])
#define NCJdictith(x,i) ((x)->list.contents[2*i])
/* Setters */
#define NCJsetsort(x,s) (x)->sort=(s)
@ -278,7 +287,9 @@ static int NCJnewstring(int sort, const char* value, NCjson** jsonp);
static int NCJnewstringn(int sort, size_t len, const char* value, NCjson** jsonp);
static int NCJclone(const NCjson* json, NCjson** clonep);
static int NCJaddstring(NCjson* json, int sort, const char* s);
static int NCJinsert(NCjson* object, char* key, NCjson* jvalue);
static int NCJinsert(NCjson* object, const char* key, NCjson* jvalue);
static int NCJinsertstring(NCjson* object, const char* key, const char* value);
static int NCJinsertint(NCjson* object, const char* key, long long ivalue);
static int NCJappend(NCjson* object, NCjson* value);
static int NCJunparse(const NCjson* json, unsigned flags, char** textp);
#else /*!NETCDF_JSON_H*/
@ -1050,7 +1061,7 @@ done:
/* Insert key-value pair into a dict object. key will be strdup'd */
OPTSTATIC int
NCJinsert(NCjson* object, char* key, NCjson* jvalue)
NCJinsert(NCjson* object, const char* key, NCjson* jvalue)
{
int stat = NCJ_OK;
NCjson* jkey = NULL;
@ -1063,6 +1074,36 @@ done:
return NCJTHROW(stat);
}
/* Insert key-value pair as strings into a dict object.
key and value will be strdup'd */
OPTSTATIC int
NCJinsertstring(NCjson* object, const char* key, const char* value)
{
int stat = NCJ_OK;
NCjson* jvalue = NULL;
if(value == NULL)
NCJnew(NCJ_NULL,&jvalue);
else
NCJnewstring(NCJ_STRING,value,&jvalue);
NCJinsert(object,key,jvalue);
done:
return NCJTHROW(stat);
}
/* Insert key-value pair with value being an integer */
OPTSTATIC int
NCJinsertint(NCjson* object, const char* key, long long ivalue)
{
int stat = NCJ_OK;
NCjson* jvalue = NULL;
char digits[128];
snprintf(digits,sizeof(digits),"%lld",ivalue);
NCJnewstring(NCJ_STRING,digits,&jvalue);
NCJinsert(object,key,jvalue);
done:
return NCJTHROW(stat);
}
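As a usage sketch (hypothetical), calling NCJinsertstring(jdict,"version","2.0.0") followed by NCJinsertint(jdict,"format",2) on an empty dict would unparse to roughly the following; note that NCJinsertint renders the integer to digits and stores them via a string node, so the value round-trips as text.
````
{"version": "2.0.0", "format": "2"}
````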
/* Append value to an array or dict object. */
OPTSTATIC int
NCJappend(NCjson* object, NCjson* value)

View File

@ -128,7 +128,9 @@ static int NCJnewstring(int sort, const char* value, NCjson** jsonp);
static int NCJnewstringn(int sort, size_t len, const char* value, NCjson** jsonp);
static int NCJclone(const NCjson* json, NCjson** clonep);
static int NCJaddstring(NCjson* json, int sort, const char* s);
static int NCJinsert(NCjson* object, char* key, NCjson* jvalue);
static int NCJinsert(NCjson* object, const char* key, NCjson* jvalue);
static int NCJinsertstring(NCjson* object, const char* key, const char* value);
static int NCJinsertint(NCjson* object, const char* key, long long ivalue);
static int NCJappend(NCjson* object, NCjson* value);
static int NCJunparse(const NCjson* json, unsigned flags, char** textp);
#else /*!NETCDF_JSON_H*/
@ -900,7 +902,7 @@ done:
/* Insert key-value pair into a dict object. key will be strdup'd */
OPTSTATIC int
NCJinsert(NCjson* object, char* key, NCjson* jvalue)
NCJinsert(NCjson* object, const char* key, NCjson* jvalue)
{
int stat = NCJ_OK;
NCjson* jkey = NULL;
@ -913,6 +915,36 @@ done:
return NCJTHROW(stat);
}
/* Insert key-value pair as strings into a dict object.
key and value will be strdup'd */
OPTSTATIC int
NCJinsertstring(NCjson* object, const char* key, const char* value)
{
int stat = NCJ_OK;
NCjson* jvalue = NULL;
if(value == NULL)
NCJnew(NCJ_NULL,&jvalue);
else
NCJnewstring(NCJ_STRING,value,&jvalue);
NCJinsert(object,key,jvalue);
done:
return NCJTHROW(stat);
}
/* Insert key-value pair with value being an integer */
OPTSTATIC int
NCJinsertint(NCjson* object, const char* key, long long ivalue)
{
int stat = NCJ_OK;
NCjson* jvalue = NULL;
char digits[128];
snprintf(digits,sizeof(digits),"%lld",ivalue);
NCJnewstring(NCJ_STRING,digits,&jvalue);
NCJinsert(object,key,jvalue);
done:
return NCJTHROW(stat);
}
/* Append value to an array or dict object. */
OPTSTATIC int
NCJappend(NCjson* object, NCjson* value)

View File

@ -122,7 +122,7 @@ NC_s3sdkinitialize(void)
}
/* Get environment information */
NC_s3sdkenvironment(void);
NC_s3sdkenvironment();
return NC_NOERR;
}

View File

@ -269,8 +269,8 @@ ncz_open_rootgroup(NC_FILE_INFO_T* dataset)
if((stat=nczm_concat(NULL,ZGROUP,&rootpath)))
goto done;
if((stat = NCZ_downloadjson(zfile->map, rootpath, &json)))
goto done;
if((stat = NCZ_downloadjson(zfile->map, rootpath, &json))) goto done;
if(json == NULL) goto done;
/* Process the json */
for(i=0;i<nclistlength(json->contents);i+=2) {
const NCjson* key = nclistget(json->contents,i);
@ -315,7 +315,7 @@ applycontrols(NCZ_FILE_INFO_T* zinfo)
int stat = NC_NOERR;
const char* value = NULL;
NClist* modelist = nclistnew();
int noflags = 0; /* track non-default negative flags */
size64_t noflags = 0; /* track non-default negative flags */
if((value = controllookup(zinfo->controllist,"mode")) != NULL) {
if((stat = NCZ_comma_parse(value,modelist))) goto done;

View File

@ -49,7 +49,7 @@ EXTERNL int NCZ_stringconvert(nc_type typid, size_t len, void* data0, NCjson** j
/* zsync.c */
EXTERNL int ncz_sync_file(NC_FILE_INFO_T* file, int isclose);
EXTERNL int ncz_sync_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp, int isclose);
EXTERNL int ncz_sync_atts(NC_FILE_INFO_T*, NC_OBJ* container, NCindex* attlist, int isclose);
EXTERNL int ncz_sync_atts(NC_FILE_INFO_T*, NC_OBJ* container, NCindex* attlist, NCjson* jatts, NCjson* jtypes, int isclose);
EXTERNL int ncz_read_grp(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp);
EXTERNL int ncz_read_atts(NC_FILE_INFO_T* file, NC_OBJ* container);
EXTERNL int ncz_read_vars(NC_FILE_INFO_T* file, NC_GRP_INFO_T* grp);
@ -62,8 +62,6 @@ EXTERNL int NCZ_grpkey(const NC_GRP_INFO_T* grp, char** pathp);
EXTERNL int NCZ_varkey(const NC_VAR_INFO_T* var, char** pathp);
EXTERNL int NCZ_dimkey(const NC_DIM_INFO_T* dim, char** pathp);
EXTERNL int ncz_splitkey(const char* path, NClist* segments);
EXTERNL int NCZ_readdict(NCZMAP* zmap, const char* key, NCjson** jsonp);
EXTERNL int NCZ_readarray(NCZMAP* zmap, const char* key, NCjson** jsonp);
EXTERNL int ncz_nctypedecode(const char* snctype, nc_type* nctypep);
EXTERNL int ncz_nctype2dtype(nc_type nctype, int endianness, int purezarr,int len, char** dnamep);
EXTERNL int ncz_dtype2nctype(const char* dtype, nc_type typehint, int purezarr, nc_type* nctypep, int* endianp, int* typelenp);

View File

@ -51,7 +51,7 @@ ncz_getattlist(NC_GRP_INFO_T *grp, int varid, NC_VAR_INFO_T **varp, NCindex **at
{
NC_VAR_INFO_T *var;
if (!(var = (NC_VAR_INFO_T *)ncindexith(grp->vars, varid)))
if (!(var = (NC_VAR_INFO_T *)ncindexith(grp->vars, (size_t)varid)))
return NC_ENOTVAR;
assert(var->hdr.id == varid);
@ -120,7 +120,7 @@ ncz_get_att_special(NC_FILE_INFO_T* h5, NC_VAR_INFO_T* var, const char* name,
/* The global reserved attributes */
if(strcmp(name,NCPROPS)==0) {
int len;
size_t len;
if(h5->provenance.ncproperties == NULL)
{stat = NC_ENOTATT; goto done;}
if(mem_type == NC_NAT) mem_type = NC_CHAR;
@ -138,7 +138,7 @@ ncz_get_att_special(NC_FILE_INFO_T* h5, NC_VAR_INFO_T* var, const char* name,
if(strcmp(name,SUPERBLOCKATT)==0)
iv = (unsigned long long)h5->provenance.superblockversion;
else /* strcmp(name,ISNETCDF4ATT)==0 */
iv = NCZ_isnetcdf4(h5);
iv = (unsigned long long)NCZ_isnetcdf4(h5);
if(mem_type == NC_NAT) mem_type = NC_INT;
if(data)
switch (mem_type) {
@ -279,8 +279,8 @@ NCZ_del_att(int ncid, int varid, const char *name)
NC_FILE_INFO_T *h5;
NC_ATT_INFO_T *att;
NCindex* attlist = NULL;
int i;
size_t deletedid;
size_t i;
int deletedid;
int retval;
/* Name must be provided. */
@ -516,7 +516,7 @@ ncz_put_att(NC_GRP_INFO_T* grp, int varid, const char *name, nc_type file_type,
/* For an existing att, if we're not in define mode, the len
must not be greater than the existing len for classic model. */
if (!(h5->flags & NC_INDEF) &&
len * nc4typelen(file_type) > (size_t)att->len * nc4typelen(att->nc_typeid))
len * (size_t)nc4typelen(file_type) > (size_t)att->len * (size_t)nc4typelen(att->nc_typeid))
{
if (h5->cmode & NC_CLASSIC_MODEL)
return NC_ENOTINDEFINE;
@ -980,7 +980,7 @@ int
ncz_create_fillvalue(NC_VAR_INFO_T* var)
{
int stat = NC_NOERR;
int i;
size_t i;
NC_ATT_INFO_T* fv = NULL;
/* Have the var's attributes been read? */

View File

@ -258,7 +258,7 @@ NCZ_compute_all_slice_projections(
NCZSliceProjections* results)
{
int stat = NC_NOERR;
size64_t r;
int r;
for(r=0;r<common->rank;r++) {
/* Compute each of the rank SliceProjections instances */

View File

@ -72,7 +72,7 @@ zclose_group(NC_GRP_INFO_T *grp)
{
int stat = NC_NOERR;
NCZ_GRP_INFO_T* zgrp;
int i;
size_t i;
assert(grp && grp->format_grp_info != NULL);
LOG((3, "%s: grp->name %s", __func__, grp->hdr.name));
@ -123,7 +123,7 @@ zclose_gatts(NC_GRP_INFO_T* grp)
{
int stat = NC_NOERR;
NC_ATT_INFO_T *att;
int a;
size_t a;
for(a = 0; a < ncindexsize(grp->att); a++) {
NCZ_ATT_INFO_T* zatt = NULL;
att = (NC_ATT_INFO_T* )ncindexith(grp->att, a);
@ -149,7 +149,7 @@ NCZ_zclose_var1(NC_VAR_INFO_T* var)
int stat = NC_NOERR;
NCZ_VAR_INFO_T* zvar;
NC_ATT_INFO_T* att;
int a;
size_t a;
assert(var && var->format_var_info);
zvar = var->format_var_info;
@ -191,7 +191,7 @@ zclose_vars(NC_GRP_INFO_T* grp)
{
int stat = NC_NOERR;
NC_VAR_INFO_T* var;
int i;
size_t i;
for(i = 0; i < ncindexsize(grp->vars); i++) {
var = (NC_VAR_INFO_T*)ncindexith(grp->vars, i);
@ -215,7 +215,7 @@ zclose_dims(NC_GRP_INFO_T* grp)
{
int stat = NC_NOERR;
NC_DIM_INFO_T* dim;
int i;
size_t i;
for(i = 0; i < ncindexsize(grp->dim); i++) {
NCZ_DIM_INFO_T* zdim;
@ -265,7 +265,7 @@ static int
zclose_types(NC_GRP_INFO_T* grp)
{
int stat = NC_NOERR;
int i;
size_t i;
NC_TYPE_INFO_T* type;
for(i = 0; i < ncindexsize(grp->type); i++)
@ -289,7 +289,7 @@ static int
zwrite_vars(NC_GRP_INFO_T *grp)
{
int stat = NC_NOERR;
int i;
size_t i;
assert(grp && grp->format_grp_info != NULL);
LOG((3, "%s: grp->name %s", __func__, grp->hdr.name));

View File

@ -22,7 +22,6 @@
#define NCZ_CHUNKSIZE_FACTOR (10)
#define NCZ_MIN_CHUNK_SIZE (2)
/**************************************************/
/* Constants */
@ -39,56 +38,43 @@
# endif
#endif
/* V1 reserved objects */
#define NCZMETAROOT "/.nczarr"
#define NCZGROUP ".nczgroup"
#define NCZARRAY ".nczarray"
#define NCZATTRS ".nczattrs"
/* Deprecated */
#define NCZVARDEP ".nczvar"
#define NCZATTRDEP ".nczattr"
#define ZMETAROOT "/.zgroup"
#define ZMETAATTR "/.zattrs"
#define ZGROUP ".zgroup"
#define ZATTRS ".zattrs"
#define ZARRAY ".zarray"
/* Pure Zarr pseudo names */
#define ZDIMANON "_zdim"
/* V2 Reserved Attributes */
/*
Inserted into /.zgroup
For nczarr version 2.x.x, the following (key,value)
pairs are stored in .zgroup and/or .zarray.
Inserted into /.zattrs in root group
_nczarr_superblock: {"version": "2.0.0"}
Inserted into any .zgroup
Inserted into any group level .zattrs
"_nczarr_group": "{
\"dimensions\": {\"d1\": \"1\", \"d2\": \"1\",...}
\"variables\": [\"v1\", \"v2\", ...]
\"dimensions\": [{name: <dimname>, size: <integer>, unlimited: 1|0},...],
\"arrays\": [\"v1\", \"v2\", ...]
\"groups\": [\"g1\", \"g2\", ...]
}"
Inserted into any .zarray
Inserted into any array level .zattrs
"_nczarr_array": "{
\"dimensions\": [\"/g1/g2/d1\", \"/d2\",...]
\"storage\": \"scalar\"|\"contiguous\"|\"compact\"|\"chunked\"
\"dimension_references\": [\"/g1/g2/d1\", \"/d2\",...]
\"storage\": \"scalar\"|\"contiguous\"|\"chunked\"
}"
Inserted into any .zattrs ? or should it go into the container?
Inserted into any .zattrs
"_nczarr_attr": "{
\"types\": {\"attr1\": \"<i4\", \"attr2\": \"<i1\",...}
}
Note: _nczarr_attr types include a non-standard use of the zarr type "|U1" => NC_CHAR.
*/
#define NCZ_V2_SUPERBLOCK "_nczarr_superblock"
#define NCZ_V2_GROUP "_nczarr_group"
#define NCZ_V2_ARRAY "_nczarr_array"
#define NCZ_V2_ATTR NC_NCZARR_ATTR
#define NCZ_V2_SUPERBLOCK_UC "_NCZARR_SUPERBLOCK"
#define NCZ_V2_GROUP_UC "_NCZARR_GROUP"
#define NCZ_V2_ARRAY_UC "_NCZARR_ARRAY"
#define NCZ_V2_ATTR_UC NC_NCZARR_ATTR_UC
#define NCZ_V2_ATTR "_nczarr_attr" /* Must match value in include/nc4internal.h */
#define NCZARRCONTROL "nczarr"
#define PUREZARRCONTROL "zarr"

File diff suppressed because it is too large

View File

@ -226,8 +226,9 @@ ncz_splitkey(const char* key, NClist* segments)
@internal Down load a .z... structure into memory
@param zmap - [in] controlling zarr map
@param key - [in] .z... object to load
@param jsonp - [out] root of the loaded json
@param jsonp - [out] root of the loaded json (NULL if key does not exist)
@return NC_NOERR
@return NC_EXXX
@author Dennis Heimbigner
*/
int
@ -238,17 +239,22 @@ NCZ_downloadjson(NCZMAP* zmap, const char* key, NCjson** jsonp)
char* content = NULL;
NCjson* json = NULL;
if((stat = nczmap_len(zmap, key, &len)))
goto done;
switch(stat = nczmap_len(zmap, key, &len)) {
case NC_NOERR: break;
case NC_ENOOBJECT: case NC_EEMPTY:
stat = NC_NOERR;
goto exit;
default: goto done;
}
if((content = malloc(len+1)) == NULL)
{stat = NC_ENOMEM; goto done;}
if((stat = nczmap_read(zmap, key, 0, len, (void*)content)))
goto done;
content[len] = '\0';
if((stat = NCJparse(content,0,&json)) < 0)
{stat = NC_ENCZARR; goto done;}
exit:
if(jsonp) {*jsonp = json; json = NULL;}
done:
@ -310,13 +316,9 @@ NCZ_createdict(NCZMAP* zmap, const char* key, NCjson** jsonp)
NCjson* json = NULL;
/* See if it already exists */
stat = NCZ_downloadjson(zmap,key,&json);
if(stat != NC_NOERR) {
if(stat == NC_EEMPTY) {/* create it */
if((stat = nczmap_def(zmap,key,NCZ_ISMETA)))
goto done;
} else
goto done;
if((stat = NCZ_downloadjson(zmap,key,&json))) goto done;
if(json == NULL) {
if((stat = nczmap_def(zmap,key,NCZ_ISMETA))) goto done;
} else {
/* Already exists, fail */
stat = NC_EINVAL;
@ -346,18 +348,14 @@ NCZ_createarray(NCZMAP* zmap, const char* key, NCjson** jsonp)
int stat = NC_NOERR;
NCjson* json = NULL;
stat = NCZ_downloadjson(zmap,key,&json);
if(stat != NC_NOERR) {
if(stat == NC_EEMPTY) {/* create it */
if((stat = nczmap_def(zmap,key,NCZ_ISMETA)))
goto done;
/* Create the initial array */
if((stat = NCJnew(NCJ_ARRAY,&json)))
goto done;
} else {
stat = NC_EINVAL;
goto done;
}
if((stat = NCZ_downloadjson(zmap,key,&json))) goto done;
if(json == NULL) { /* create it */
if((stat = nczmap_def(zmap,key,NCZ_ISMETA))) goto done;
/* Create the initial array */
if((stat = NCJnew(NCJ_ARRAY,&json))) goto done;
} else {
stat = NC_EINVAL;
goto done;
}
if(json->sort != NCJ_ARRAY) {stat = NC_ENCZARR; goto done;}
if(jsonp) {*jsonp = json; json = NULL;}
@ -367,54 +365,6 @@ done:
}
#endif /*0*/
/**
@internal Get contents of a meta object; fail if it does not exist
@param zmap - [in] map
@param key - [in] key of the object
@param jsonp - [out] return parsed json
@return NC_NOERR
@return NC_EEMPTY [object did not exist]
@author Dennis Heimbigner
*/
int
NCZ_readdict(NCZMAP* zmap, const char* key, NCjson** jsonp)
{
int stat = NC_NOERR;
NCjson* json = NULL;
if((stat = NCZ_downloadjson(zmap,key,&json)))
goto done;
if(NCJsort(json) != NCJ_DICT) {stat = NC_ENCZARR; goto done;}
if(jsonp) {*jsonp = json; json = NULL;}
done:
NCJreclaim(json);
return stat;
}
/**
@internal Get contents of a meta object; fail if it does not exist
@param zmap - [in] map
@param key - [in] key of the object
@param jsonp - [out] return parsed json
@return NC_NOERR
@return NC_EEMPTY [object did not exist]
@author Dennis Heimbigner
*/
int
NCZ_readarray(NCZMAP* zmap, const char* key, NCjson** jsonp)
{
int stat = NC_NOERR;
NCjson* json = NULL;
if((stat = NCZ_downloadjson(zmap,key,&json)))
goto done;
if(NCJsort(json) != NCJ_ARRAY) {stat = NC_ENCZARR; goto done;}
if(jsonp) {*jsonp = json; json = NULL;}
done:
NCJreclaim(json);
return stat;
}
#if 0
/**
@internal Given an nc_type, produce the corresponding

View File

@ -78,7 +78,7 @@ NCZ_set_var_chunk_cache(int ncid, int varid, size_t cachesize, size_t nelems, fl
assert(grp && h5);
/* Find the var. */
if (!(var = (NC_VAR_INFO_T *)ncindexith(grp->vars, varid)))
if (!(var = (NC_VAR_INFO_T *)ncindexith(grp->vars, (size_t)varid)))
{retval = NC_ENOTVAR; goto done;}
assert(var && var->hdr.id == varid);
@ -140,7 +140,7 @@ fprintf(stderr,"xxx: adjusting cache for: %s\n",var->hdr.name);
zcache->chunksize = zvar->chunksize;
zcache->chunkcount = 1;
if(var->ndims > 0) {
int i;
size_t i;
for(i=0;i<var->ndims;i++) {
zcache->chunkcount *= var->chunksizes[i];
}
@ -184,7 +184,7 @@ NCZ_create_chunk_cache(NC_VAR_INFO_T* var, size64_t chunksize, char dimsep, NCZC
cache->chunkcount = 1;
if(var->ndims > 0) {
int i;
size_t i;
for(i=0;i<var->ndims;i++) {
cache->chunkcount *= var->chunksizes[i];
}
@ -297,7 +297,7 @@ NCZ_read_cache_chunk(NCZChunkCache* cache, const size64_t* indices, void** datap
/* Create a new entry */
if((entry = calloc(1,sizeof(NCZCacheEntry)))==NULL)
{stat = NC_ENOMEM; goto done;}
memcpy(entry->indices,indices,rank*sizeof(size64_t));
memcpy(entry->indices,indices,(size_t)rank*sizeof(size64_t));
/* Create the key for this cache */
if((stat = NCZ_buildchunkpath(cache,indices,&entry->key))) goto done;
entry->hashkey = hkey;
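The (size_t)rank cast is one instance of the -Wconversion/-Wsign-compare cleanup threaded through this commit: loop indices that range over size_t bounds become size_t, and mixed-sign arithmetic gets an explicit widening cast. A condensed standalone illustration of the pattern, not library code:

#include <stddef.h>

/* Before the cleanup, "int i" compared against a size_t bound drew
 * -Wsign-compare, and int-by-size_t products drew -Wconversion. */
static unsigned long long
chunk_count(const size_t* chunksizes, int ndims)
{
    unsigned long long count = 1;
    size_t i;                      /* was: int i */
    for(i=0;i<(size_t)ndims;i++)   /* size_t vs size_t: warning-free */
        count *= chunksizes[i];
    return count;
}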
@ -496,7 +496,8 @@ done:
int
NCZ_ensure_fill_chunk(NCZChunkCache* cache)
{
int i, stat = NC_NOERR;
int stat = NC_NOERR;
size_t i;
NC_VAR_INFO_T* var = cache->var;
nc_type typeid = var->type_info->hdr.id;
size_t typesize = var->type_info->size;
@ -605,7 +606,7 @@ int
NCZ_buildchunkkey(size_t R, const size64_t* chunkindices, char dimsep, char** keyp)
{
int stat = NC_NOERR;
int r;
size_t r;
NCbytes* key = ncbytesnew();
if(keyp) *keyp = NULL;
@ -670,7 +671,7 @@ put_chunk(NCZChunkCache* cache, NCZCacheEntry* entry)
if((stat = NC_reclaim_data_all(file->controller,tid,entry->data,cache->chunkcount))) goto done;
entry->data = NULL;
entry->data = strchunk; strchunk = NULL;
entry->size = cache->chunkcount * maxstrlen;
entry->size = (cache->chunkcount * (size64_t)maxstrlen);
entry->isfixedstring = 1;
}
@ -865,7 +866,7 @@ NCZ_dumpxcacheentry(NCZChunkCache* cache, NCZCacheEntry* e, NCbytes* buf)
{
char s[8192];
char idx[64];
int i;
size_t i;
ncbytescat(buf,"{");
snprintf(s,sizeof(s),"modified=%u isfiltered=%u indices=",

View File

@ -49,13 +49,15 @@ static NC_reservedatt NC_reserved[] = {
{NC_ATT_FORMAT, READONLYFLAG}, /*_Format*/
{ISNETCDF4ATT, READONLYFLAG|NAMEONLYFLAG|VIRTUALFLAG}, /*_IsNetcdf4*/
{NCPROPS,READONLYFLAG|NAMEONLYFLAG|HIDDENATTRFLAG}, /*_NCProperties*/
{NC_NCZARR_ATTR_UC, READONLYFLAG|NAMEONLYFLAG|HIDDENATTRFLAG}, /*_NCZARR_ATTR */
{NC_ATT_COORDINATES, READONLYFLAG|HIDDENATTRFLAG}, /*_Netcdf4Coordinates*/
{NC_ATT_DIMID_NAME, READONLYFLAG|HIDDENATTRFLAG}, /*_Netcdf4Dimid*/
{SUPERBLOCKATT, READONLYFLAG|NAMEONLYFLAG|VIRTUALFLAG}, /*_SuperblockVersion*/
{NC_ATT_NC3_STRICT_NAME, READONLYFLAG}, /*_nc3_strict*/
{NC_NCZARR_ATTR, READONLYFLAG|HIDDENATTRFLAG}, /*_nczarr_attr */
{NC_NCZARR_GROUP, READONLYFLAG|HIDDENATTRFLAG}, /*_nczarr_group */
{NC_NCZARR_ARRAY, READONLYFLAG|HIDDENATTRFLAG}, /*_nczarr_array */
{NC_NCZARR_SUPERBLOCK, READONLYFLAG|HIDDENATTRFLAG}, /*_nczarr_superblock */
};
#define NRESERVED (sizeof(NC_reserved) / sizeof(NC_reservedatt)) /*|NC_reservedatt*/
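The reserved table gains the per-object names introduced by this commit (_nczarr_attr, _nczarr_group, _nczarr_array, _nczarr_superblock); READONLYFLAG|HIDDENATTRFLAG makes them read-only and keeps them out of ordinary attribute enumeration. A hedged sketch through the public API, with an illustrative path and mode fragment:

#include <netcdf.h>
#include <stdio.h>

int
main(void)
{
    int ncid, natts;
    /* illustrative URL; hidden _nczarr_* attributes are excluded from the count */
    if(nc_open("file:///tmp/example.zarr#mode=nczarr,file", NC_NOWRITE, &ncid) == NC_NOERR) {
        if(nc_inq_natts(ncid, &natts) == NC_NOERR)
            printf("visible global attributes: %d\n", natts);
        nc_close(ncid);
    }
    return 0;
}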

View File

@ -228,7 +228,7 @@ ref_any.cdl ref_oldformat.cdl ref_oldformat.zip ref_newformatpure.cdl \
ref_groups.h5 ref_byte.zarr.zip ref_byte_fill_value_null.zarr.zip \
ref_groups_regular.cdl ref_byte.cdl ref_byte_fill_value_null.cdl \
ref_jsonconvention.cdl ref_jsonconvention.zmap \
ref_string.cdl ref_string_nczarr.baseline ref_string_zarr.baseline ref_scalar.cdl \
ref_string.cdl ref_string_nczarr.baseline ref_string_zarr.baseline ref_scalar.cdl ref_scalar_nczarr.cdl \
ref_nulls_nczarr.baseline ref_nulls_zarr.baseline ref_nulls.cdl ref_notzarr.tar.gz
# Interoperability files

View File

@ -50,7 +50,7 @@ typedef struct Format {
int debug;
int linear;
int holevalue;
int rank;
size_t rank;
size_t dimlens[NC_MAX_VAR_DIMS];
size_t chunklens[NC_MAX_VAR_DIMS];
size_t chunkcounts[NC_MAX_VAR_DIMS];
@ -60,7 +60,7 @@ typedef struct Format {
} Format;
typedef struct Odometer {
int rank; /*rank */
size_t rank; /*rank */
size_t start[NC_MAX_VAR_DIMS];
size_t stop[NC_MAX_VAR_DIMS];
size_t max[NC_MAX_VAR_DIMS]; /* max size of ith index */
@ -71,11 +71,11 @@ typedef struct Odometer {
#define ceildiv(x,y) (((x) % (y)) == 0 ? ((x) / (y)) : (((x) / (y)) + 1))
static char* captured[4096];
static int ncap = 0;
static size_t ncap = 0;
extern int nc__testurl(const char*,char**);
Odometer* odom_new(int rank, const size_t* stop, const size_t* max);
Odometer* odom_new(size_t rank, const size_t* stop, const size_t* max);
void odom_free(Odometer* odom);
int odom_more(Odometer* odom);
int odom_next(Odometer* odom);
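The Odometer enumerates chunk indices in row-major order: each step bumps the rightmost index and carries leftward, like a mechanical odometer. A self-contained sketch of the same technique, not the test's implementation:

#include <stddef.h>

/* Advance a rank-dimensional index set bounded by stop[]; return 0 when done. */
static int
next_index(size_t rank, const size_t* stop, size_t* idx)
{
    size_t r = rank;
    while(r-- > 0) {
        if(++idx[r] < stop[r]) return 1; /* no carry needed; keep iterating */
        idx[r] = 0;                      /* carry into the dimension to the left */
    }
    return 0;                            /* wrapped past the leftmost index */
}

For rank=2 and stop={2,3}, starting from {0,0}, the visit order is (0,0),(0,1),(0,2),(1,0),(1,1),(1,2).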
@ -120,9 +120,9 @@ cleanup(void)
}
Odometer*
odom_new(int rank, const size_t* stop, const size_t* max)
odom_new(size_t rank, const size_t* stop, const size_t* max)
{
int i;
size_t i;
Odometer* odom = NULL;
if((odom = calloc(1,sizeof(Odometer))) == NULL)
return NULL;
@ -339,12 +339,12 @@ dump(Format* format)
{
void* chunkdata = NULL; /*[CHUNKPROD];*/
Odometer* odom = NULL;
int r;
size_t r;
size_t offset[NC_MAX_VAR_DIMS];
int holechunk = 0;
char sindices[64];
#ifdef H5
int i;
size_t i;
hid_t fileid, grpid, datasetid;
hid_t dxpl_id = H5P_DEFAULT; /*data transfer property list */
unsigned int filter_mask = 0;
@ -388,7 +388,7 @@ dump(Format* format)
if((chunkdata = calloc(sizeof(int),format->chunkprod))==NULL) usage(NC_ENOMEM);
printf("rank=%d dims=(%s) chunks=(%s)\n",format->rank,printvector(format->rank,format->dimlens),
printf("rank=%zu dims=(%s) chunks=(%s)\n",format->rank,printvector(format->rank,format->dimlens),
printvector(format->rank,format->chunklens));
while(odom_more(odom)) {
@ -506,12 +506,14 @@ done:
int
main(int argc, char** argv)
{
int i,stat = NC_NOERR;
int stat = NC_NOERR;
size_t i;
Format format;
int ncid, varid, dimids[NC_MAX_VAR_DIMS];
int vtype, storage;
int mode;
int c;
int r;
memset(&format,0,sizeof(format));
@ -577,7 +579,8 @@ main(int argc, char** argv)
/* Get the info about the var */
if((stat=nc_inq_varid(ncid,format.var_name,&varid))) usage(stat);
if((stat=nc_inq_var(ncid,varid,NULL,&vtype,&format.rank,dimids,NULL))) usage(stat);
if((stat=nc_inq_var(ncid,varid,NULL,&vtype,&r,dimids,NULL))) usage(stat);
format.rank = (size_t)r;
if(format.rank == 0) usage(NC_EDIMSIZE);
if((stat=nc_inq_var_chunking(ncid,varid,&storage,format.chunklens))) usage(stat);
if(storage != NC_CHUNKED) usage(NC_EBADCHUNK);

View File

@ -4,39 +4,21 @@ dimensions:
dim1 = 4 ;
dim2 = 4 ;
variables:
int ivar(dim0, dim1, dim2) ;
ivar:_FillValue = -2147483647 ;
ivar:_Storage = @chunked@ ;
ivar:_ChunkSizes = 4, 4, 4 ;
ivar:_Filter = @IH5@ ;
ivar:_Codecs = @ICX@ ;
float fvar(dim0, dim1, dim2) ;
fvar:_FillValue = 9.96921e+36f ;
fvar:_Storage = @chunked@ ;
fvar:_ChunkSizes = 4, 4, 4 ;
fvar:_Filter = @FH5@ ;
fvar:_Codecs = @FCX@ ;
int ivar(dim0, dim1, dim2) ;
ivar:_FillValue = -2147483647 ;
ivar:_Storage = @chunked@ ;
ivar:_ChunkSizes = 4, 4, 4 ;
ivar:_Filter = @IH5@ ;
ivar:_Codecs = @ICX@ ;
data:
ivar =
0, 1, 2, 3,
4, 5, 6, 7,
8, 9, 10, 11,
12, 13, 14, 15,
16, 17, 18, 19,
20, 21, 22, 23,
24, 25, 26, 27,
28, 29, 30, 31,
32, 33, 34, 35,
36, 37, 38, 39,
40, 41, 42, 43,
44, 45, 46, 47,
48, 49, 50, 51,
52, 53, 54, 55,
56, 57, 58, 59,
60, 61, 62, 63 ;
fvar =
0.5, 1.5, 2.5, 3.5,
4.5, 5.5, 6.5, 7.5,
@ -54,4 +36,22 @@ data:
52.5, 53.5, 54.5, 55.5,
56.5, 57.5, 58.5, 59.5,
60.5, 61.5, 62.5, 63.5 ;
ivar =
0, 1, 2, 3,
4, 5, 6, 7,
8, 9, 10, 11,
12, 13, 14, 15,
16, 17, 18, 19,
20, 21, 22, 23,
24, 25, 26, 27,
28, 29, 30, 31,
32, 33, 34, 35,
36, 37, 38, 39,
40, 41, 42, 43,
44, 45, 46, 47,
48, 49, 50, 51,
52, 53, 54, 55,
56, 57, 58, 59,
60, 61, 62, 63 ;
}

View File

@ -1,8 +1,8 @@
netcdf ref_byte {
dimensions:
_zdim_20 = 20 ;
_Anonymous_Dim_20 = 20 ;
variables:
ubyte byte(_zdim_20, _zdim_20) ;
ubyte byte(_Anonymous_Dim_20, _Anonymous_Dim_20) ;
byte:_Storage = "chunked" ;
byte:_ChunkSizes = 20, 20 ;

View File

@ -1,8 +1,8 @@
netcdf ref_byte_fill_value_null {
dimensions:
_zdim_20 = 20 ;
_Anonymous_Dim_20 = 20 ;
variables:
ubyte byt(_zdim_20, _zdim_20) ;
ubyte byt(_Anonymous_Dim_20, _Anonymous_Dim_20) ;
byt:_Storage = "chunked" ;
byt:_ChunkSizes = 20, 20 ;
byt:_NoFill = "true" ;

View File

@ -1,15 +1,15 @@
netcdf tmp_groups_regular {
dimensions:
_zdim_3 = 3 ;
_zdim_2 = 2 ;
_zdim_10 = 10 ;
_Anonymous_Dim_3 = 3 ;
_Anonymous_Dim_2 = 2 ;
_Anonymous_Dim_10 = 10 ;
// global attributes:
:_Format = "netCDF-4" ;
group: MyGroup {
variables:
int dset1(_zdim_3, _zdim_3) ;
int dset1(_Anonymous_Dim_3, _Anonymous_Dim_3) ;
dset1:_Storage = "chunked" ;
dset1:_ChunkSizes = 3, 3 ;
dset1:_NoFill = "true" ;
@ -24,7 +24,7 @@ group: MyGroup {
group: Group_A {
variables:
int dset2(_zdim_2, _zdim_10) ;
int dset2(_Anonymous_Dim_2, _Anonymous_Dim_10) ;
dset2:_Storage = "chunked" ;
dset2:_ChunkSizes = 2, 10 ;
dset2:_NoFill = "true" ;

View File

@ -5,8 +5,8 @@ variables:
int v(d1) ;
v:varjson1 = "{\"key1\": [1,2,3], \"key2\": {\"key3\": \"abc\"}}" ;
v:varjson2 = "[[1.0,0.0,0.0],[0.0,1.0,0.0],[0.0,0.0,1.0]]" ;
v:varvec1 = "1.0, 0.0, 0.0" ;
v:varvec2 = "[0.,0.,1.]" ;
v:varjson3 = "[0.,0.,1.]" ;
v:varchar1 = "1.0, 0.0, 0.0" ;
// global attributes:
:globalfloat = 1. ;

View File

@ -1,5 +1,5 @@
[0] /.zattrs : () |{"globalfloat": 1, "globalfloatvec": [1,2], "globalchar": "abc", "globalillegal": "[ [ 1.0, 0.0, 0.0 ], [ 0.0, 1.0, 0.0 ], [ 0.0, 0.0, 1.0 ", "_nczarr_attr": {"types": {"globalfloat": "<f8", "globalfloatvec": "<f8", "globalchar": ">S1", "globalillegal": ">S1", "_NCProperties": ">S1"}}}|
[1] /.zgroup : () |{"zarr_format": 2, "_nczarr_superblock": {"version": "2.0.0"}, "_nczarr_group": {"dims": {"d1": 1}, "vars": ["v"], "groups": []}}|
[3] /v/.zarray : () |{"zarr_format": 2, "shape": [1], "dtype": "<i4", "chunks": [1], "fill_value": -2147483647, "order": "C", "compressor": null, "filters": null, "_nczarr_array": {"dimrefs": ["/d1"], "storage": "chunked"}}|
[4] /v/.zattrs : () |{"varjson1": {"key1": [1,2,3], "key2": {"key3": "abc"}}, "varjson2": [[1.0,0.0,0.0],[0.0,1.0,0.0],[0.0,0.0,1.0]], "varvec1": "1.0, 0.0, 0.0", "varvec2": [0.,0.,1.], "_ARRAY_DIMENSIONS": ["d1"], "_nczarr_attr": {"types": {"varjson1": ">S1", "varjson2": ">S1", "varvec1": ">S1", "varvec2": ">S1"}}}|
[0] /.zattrs : () |{"globalfloat": 1, "globalfloatvec": [1,2], "globalchar": "abc", "globalillegal": "[ [ 1.0, 0.0, 0.0 ], [ 0.0, 1.0, 0.0 ], [ 0.0, 0.0, 1.0 ", "_nczarr_group": {"dimensions": {"d1": 1}, "arrays": ["v"], "groups": []}, "_nczarr_superblock": {"version": "2.0.0"}, "_nczarr_attr": {"types": {"globalfloat": "<f8", "globalfloatvec": "<f8", "globalchar": ">S1", "globalillegal": ">S1", "_NCProperties": ">S1", "_nczarr_group": "|J0", "_nczarr_superblock": "|J0", "_nczarr_attr": "|J0"}}}|
[1] /.zgroup : () |{"zarr_format": 2}|
[3] /v/.zarray : () |{"zarr_format": 2, "shape": [1], "dtype": "<i4", "chunks": [1], "fill_value": -2147483647, "order": "C", "compressor": null, "filters": null}|
[4] /v/.zattrs : () |{"varjson1": {"key1": [1,2,3], "key2": {"key3": "abc"}}, "varjson2": [[1.0,0.0,0.0],[0.0,1.0,0.0],[0.0,0.0,1.0]], "varjson3": [0.,0.,1.], "varchar1": "1.0, 0.0, 0.0", "_ARRAY_DIMENSIONS": ["d1"], "_nczarr_array": {"dimrefs": ["/d1"], "storage": "chunked"}, "_nczarr_attr": {"types": {"varjson1": ">S1", "varjson2": ">S1", "varjson3": ">S1", "varchar1": ">S1", "_nczarr_array": "|J0", "_nczarr_attr": "|J0"}}}|
[5] /v/0 : (4) (ubyte) |...|
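This baseline captures the central change of the commit: .zgroup now holds only zarr_format, while _nczarr_superblock, _nczarr_group (with dims/vars renamed to dimensions/arrays), and _nczarr_attr travel as ordinary .zattrs keys whose declared dtype "|J0" marks them as JSON. A hedged sketch of recovering one of them with the bundled NCjson API; it assumes NCJdictget works as it is used elsewhere in libnczarr:

/* Return 1 if parsed .zattrs text carries netcdf-4 group metadata. */
static int
has_nczarr_group(const char* zattrs_text)
{
    NCjson* zattrs = NULL;
    NCjson* nczgrp = NULL;
    int found = 0;
    if(NCJparse(zattrs_text,0,&zattrs) < 0) return 0;
    if(NCJdictget(zattrs,"_nczarr_group",&nczgrp) >= 0 && nczgrp != NULL)
        found = 1; /* e.g. {"dimensions": {...}, "arrays": [...], "groups": [...]} */
    NCJreclaim(zattrs);
    return found;
}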

View File

@ -1,8 +1,8 @@
netcdf nczarr2zarr {
dimensions:
_zdim_8 = 8 ;
_Anonymous_Dim_8 = 8 ;
variables:
int v(_zdim_8, _zdim_8) ;
int v(_Anonymous_Dim_8, _Anonymous_Dim_8) ;
v:_FillValue = -1 ;
data:

View File

@ -1,8 +1,8 @@
netcdf ref_oldformat {
dimensions:
lat = 8 ;
_zdim_8 = 8 ;
_zdim_10 = 10 ;
_Anonymous_Dim_8 = 8 ;
_Anonymous_Dim_10 = 10 ;
variables:
int lat(lat) ;
lat:_FillValue = -1 ;
@ -13,7 +13,7 @@ data:
group: g1 {
variables:
int pos(_zdim_8, _zdim_10) ;
int pos(_Anonymous_Dim_8, _Anonymous_Dim_10) ;
pos:_FillValue = -1 ;
string pos:pos_attr = "latXlon" ;

View File

@ -1,9 +1,9 @@
netcdf tmp_purezarr {
dimensions:
_zdim_2 = 2 ;
_zdim_5 = 5 ;
_Anonymous_Dim_2 = 2 ;
_Anonymous_Dim_5 = 5 ;
variables:
int i(_zdim_2, _zdim_5) ;
int i(_Anonymous_Dim_2, _Anonymous_Dim_5) ;
data:
i =

View File

@ -0,0 +1,8 @@
netcdf ref_scalar {
variables:
int v ;
v:_FillValue = -1 ;
data:
v = 17 ;
}

View File

@ -15,7 +15,7 @@ group: _zgroup {
group: _nczgroup {
// group attributes:
:data = "{\"dims\": {\"dim1\": 1},\"vars\": [],\"groups\": []}" ;
:data = "{\"dimensions\": {\"dim1\": 1},\"arrays\": [],\"groups\": []}" ;
} // group _nczgroup
group: _nczattr {

View File

@ -15,7 +15,7 @@ group: _zgroup {
group: _nczgroup {
// group attributes:
:data = "{\"dims\": {},\"vars\": [\"var1\"],\"groups\": []}" ;
:data = "{\"dimensions\": {},\"arrays\": [\"var1\"],\"groups\": []}" ;
} // group _nczgroup
group: _nczattr {

View File

@ -1 +1 @@
[0] /.nczarr : (0) ||
[0] /.zgroup : (0) ||

View File

@ -1,4 +1,4 @@
/meta2/.nczarray: |{
/meta2/.zarray: |{
"foo": 42,
"bar": "apples",
"baz": [1, 2, 3, 4],

View File

@ -1,8 +1,8 @@
[0] /
[1] /.nczarr
[1] /.zgroup
[2] /data1
[3] /data1/0
[4] /meta1
[5] /meta1/.zarray
[6] /meta2
[7] /meta2/.nczarray
[7] /meta2/.zarray

View File

@ -1,10 +1,10 @@
[0] /.nczarr : (0) ||
[0] /.zgroup : (0) ||
[2] /data1/0 : (25) (int) |0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24|
[4] /meta1/.zarray : (50) |{
"foo": 42,
"bar": "apples",
"baz": [1, 2, 3, 4]}|
[6] /meta2/.nczarray : (64) |{
[6] /meta2/.zarray : (64) |{
"foo": 42,
"bar": "apples",
"baz": [1, 2, 3, 4],

View File

@ -1,4 +1,4 @@
[0] /.nczarr : (0) ||
[0] /.zgroup : (0) ||
[2] /meta1/.zarray : (50) |{
"foo": 42,
"bar": "apples",

View File

@ -1,9 +1,9 @@
[0] /.nczarr : (0) ||
[0] /.zgroup : (0) ||
[2] /meta1/.zarray : (50) |{
"foo": 42,
"bar": "apples",
"baz": [1, 2, 3, 4]}|
[4] /meta2/.nczarray : (64) |{
[4] /meta2/.zarray : (64) |{
"foo": 42,
"bar": "apples",
"baz": [1, 2, 3, 4],

View File

@ -1 +1 @@
[0] /.nczarr : (0) ||
[0] /.zgroup : (0) ||

View File

@ -1,4 +1,4 @@
[0] /.nczarr : (50) |{
[0] /.zgroup : (50) |{
"foo": 42,
"bar": "apples",
"baz": [1, 2, 3, 4]}|

View File

@ -1,4 +1,4 @@
[0] /.nczarr : (50) |{
[0] /.zgroup : (50) |{
"foo": 42,
"bar": "apples",
"baz": [1, 2, 3, 4]}|

View File

@ -1,5 +1,5 @@
[0] /
[1] /.nczarr
[1] /.zgroup
[2] /data1
[3] /meta1
[4] /meta1/.zarray

View File

@ -15,7 +15,7 @@ group: _zgroup {
group: _nczgroup {
// group attributes:
:data = "{\"dims\": {},\"vars\": [],\"groups\": []}" ;
:data = "{\"dimensions\": {},\"arrays\": [],\"groups\": []}" ;
} // group _nczgroup
group: _nczattr {

View File

@ -23,15 +23,13 @@ deletemap $zext $file
${NCGEN} -4 -b -o "$fileurl" $srcdir/ref_jsonconvention.cdl
${NCDUMP} $fileurl > tmp_jsonconvention_${zext}.cdl
${ZMD} -h $fileurl > tmp_jsonconvention_${zext}.txt
# | sed -e 's/,key1=value1|key2=value2//' -e '/"_NCProperties"/ s/(378)/(354)/'
# Clean up extraneous changes so comparisons work
# remove '\n' from ref file before comparing
#sed -e 's|\\n||g' < ${srcdir}/ref_jsonconvention.cdl > tmp_jsonconvention_clean.cdl
cat < ${srcdir}/ref_jsonconvention.cdl > tmp_jsonconvention_clean.cdl
cat < tmp_jsonconvention_${zext}.cdl > tmp_jsonconvention_clean_${zext}.cdl
sed -e 's|\(.z[a-z][a-z]*\) : ([0-9][0-9]*)|\1 : ()|g' < tmp_jsonconvention_${zext}.txt >tmp1.tmp
sed -e 's|"_NCProperties": "version=[0-9],[^"]*",||' <tmp1.tmp > tmp_jsonconvention_clean_${zext}.txt
diff -b tmp_jsonconvention_clean.cdl tmp_jsonconvention_clean_${zext}.cdl
cat < tmp_jsonconvention_${zext}.cdl > tmp_jsonconvention_clean_${zext}.cdl
cat < tmp_jsonconvention_${zext}.txt > tmp_jsonconvention_clean_${zext}.txt
sed -i.bak -e 's|"_NCProperties": "version=[0-9],[^"]*",||' tmp_jsonconvention_clean_${zext}.txt
sed -i.bak -e 's|\(.z[a-z][a-z]*\) : ([0-9][0-9]*)|\1 : ()|g' tmp_jsonconvention_clean_${zext}.txt
# compare
diff -b ${srcdir}/ref_jsonconvention.cdl tmp_jsonconvention_clean_${zext}.cdl
diff -b ${srcdir}/ref_jsonconvention.zmap tmp_jsonconvention_clean_${zext}.txt
}

View File

@ -50,7 +50,7 @@ ${NCDUMP} -n ref_scalar $nczarrurl > tmp_scalar_nczarr_${zext}.cdl
${ZMD} -h $nczarrurl > tmp_scalar_nczarr_${zext}.txt
echo "*** verify"
diff -bw $top_srcdir/nczarr_test/ref_scalar.cdl tmp_scalar_nczarr_${zext}.cdl
diff -bw $top_srcdir/nczarr_test/ref_scalar_nczarr.cdl tmp_scalar_nczarr_${zext}.cdl
# Fixup
zarrscalar tmp_scalar_zarr_${zext}.cdl tmp_rescale_zarr_${zext}.cdl

View File

@ -159,7 +159,8 @@ done:
static int
cloneArray(NCjson* array, NCjson** clonep)
{
int i, stat=NC_NOERR;
int stat=NC_NOERR;
size_t i;
NCjson* clone = NULL;
if((stat=NCJnew(NCJ_ARRAY,&clone))) goto done;
for(i=0;i<NCJlength(array);i++) {
@ -276,7 +277,8 @@ dump(NCjson* json)
static void
dumpR(NCjson* json, int depth)
{
int ok, count, i;
int ok, count;
size_t i;
long long int64v;
double float64v;
@ -285,12 +287,12 @@ dumpR(NCjson* json, int depth)
case NCJ_STRING: printf("\"%s\"",NCJstring(json)); break;
case NCJ_INT:
ok = sscanf(NCJstring(json),"%lld%n",&int64v,&count);
if(ok != 1 || count != strlen(NCJstring(json))) goto fail;
if(ok != 1 || count != (int)strlen(NCJstring(json))) goto fail;
printf("%lld",int64v);
break;
case NCJ_DOUBLE:
ok = sscanf(NCJstring(json),"%lg%n",&float64v,&count);
if(ok != 1 || count != strlen(NCJstring(json))) goto fail;
if(ok != 1 || count != (int)strlen(NCJstring(json))) goto fail;
printf("%lg",float64v);
break;
case NCJ_BOOLEAN:
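The (int) casts above silence -Wsign-compare where sscanf's %n output (an int) meets strlen's size_t; the %n idiom itself verifies that the whole token was consumed. A standalone illustration:

#include <stdio.h>
#include <string.h>

/* Return 1 iff 's' is entirely one decimal integer, stored in *vp. */
static int
parse_full_int(const char* s, long long* vp)
{
    int count = 0;
    if(sscanf(s,"%lld%n",vp,&count) != 1) return 0;
    return (size_t)count == strlen(s); /* did %n reach the terminator? */
}

For example, "123" passes, while "123x" fails because %n stops at 3 but strlen is 4.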

View File

@ -95,7 +95,7 @@ simplecreate(void)
if((stat = nczmap_create(impl,url,0,0,NULL,&map)))
goto done;
if((stat=nczm_concat(NULL,NCZMETAROOT,&path)))
if((stat=nczm_concat(NULL,ZMETAROOT,&path)))
goto done;
/* Write empty metadata content */
@ -158,7 +158,7 @@ writemeta2(void)
if((stat = nczmap_open(impl,url,NC_WRITE,0,NULL,&map)))
goto done;
if((stat=nczm_concat(META2,NCZARRAY,&path)))
if((stat=nczm_concat(META2,ZARRAY,&path)))
goto done;
if((stat = nczmap_write(map, path, strlen(metadata2), metadata2)))
goto done;
@ -228,7 +228,7 @@ readmeta2(void)
if((stat = nczmap_open(impl,url,0,0,NULL,&map)))
goto done;
if((stat = readkey(map,META2,NCZARRAY)))
if((stat = readkey(map,META2,ZARRAY)))
goto done;
done:
@ -309,7 +309,7 @@ readdata(void)
/* Validate */
for(i=0;i<DATA1LEN;i++) {
if(data1[i] != i) {
if(i != (size_t)data1[i]) {
fprintf(stderr,"data mismatch: is: %d should be: %llu\n",data1[i],i);
stat = NC_EINVAL;
goto done;

View File

@ -99,7 +99,7 @@ simplecreate(void)
printf("Pass: create: create: %s\n",url);
truekey = makekey(NCZMETAROOT);
truekey = makekey(ZMETAROOT);
if((stat = nczmap_write(map, truekey, 0, NULL)))
goto done;
printf("Pass: create: defineobj: %s\n",truekey);
@ -184,7 +184,7 @@ simplemeta(void)
report(PASS,"open",map);
/* Make sure .nczarr exists (from simplecreate) */
truekey = makekey(NCZMETAROOT);
truekey = makekey(ZMETAROOT);
if((stat = nczmap_exists(map,truekey)))
goto done;
report(PASS,".nczarr: exists",map);
@ -199,7 +199,7 @@ simplemeta(void)
report(PASS,".zarray: def",map);
free(truekey); truekey = NULL;
truekey = makekey(NCZMETAROOT);
truekey = makekey(ZMETAROOT);
if((stat = nczmap_write(map, truekey, strlen(metadata1), metadata1)))
goto done;
report(PASS,".nczarr: writemetadata",map);
@ -225,7 +225,7 @@ simplemeta(void)
report(PASS,"re-open",map);
/* Read previously written */
truekey = makekey(NCZMETAROOT);
truekey = makekey(ZMETAROOT);
if((stat = nczmap_exists(map, truekey)))
goto done;
report(PASS,".nczarr: exists",map);

View File

@ -58,7 +58,7 @@ sortname(int thesort)
static void
jsontrace(NCjson* json, int depth)
{
int i;
size_t i;
if(json == NULL) goto done;
printf("[%d] sort=%s",depth,sortname(NCJsort(json)));
switch(NCJsort(json)) {