Commit Graph

3 Commits

Author SHA1 Message Date
Dennis Heimbigner
74b40fd788 Upgrade the nczarr code to match Zarr V2
Re: https://github.com/zarr-developers/zarr-python/pull/716

The Zarr version 2 spec has been extended to include the ability
to choose the dimension separator in chunk name keys. The legal
separators has been extended from {'.'} to {'.' '/'}.  So now it
is possible to use a key like "0/1/2/0" for chunk names.

This PR implements this for NCZarr. The V2 spec now says that
this separator can be set on a per-variable basis. For now, I
have chosen to allow this be set only globally by adding a key
named "ZARR.DIMENSION_SEPARATOR=<char>" in the
.daprc/.dodsrc/ncrc file. Currently, the only legal separator
characters are '.' (the default) and '/'. On writing, this key
will only be written if its value is different than the default.
This change caused problems because supporting a separator of '/'
is difficult to parse when keys/paths use '/' as the path separator.
A test case was added for this.

Additionally, make nczarr be enabled default by default. This required
some additional changes so that if zip and/or AWS S3 sdk are unavailable,
then they are disabled for NCZarr.

In addition the following unrelated changes were made.

1. Tested that pure-zarr mode could read an nczarr formatted store.
1. The .rc file handling now merges all known .rc files (.ncrc,.daprc, and .dodsrc) in that order and using those in HOME first, then in current directory. For duplicate entries, the later ones override the earlier ones. This change is to remove some of the conflicts inherent in the current .rc file load process. A set of test cases was also added.
1. Re-order tests in configure.ac and CMakeLists.txt so that if libcurl
   is not found then the other options that depend upon it properly
   are disabled.
1. I decided that xarray support should be enabled by default for pure
   zarr. In order to allow disabling, I added a new mode flag "noxarray".
1. Certain test in nczarr_test depend on use of .dodsrc. In order for these
   to work when testing in parallel, some inter-test dependencies needed to
   be added.
1. Improved authorization testing to use changes in thredds.ucar.edu
2021-04-24 19:48:15 -06:00
Dennis Heimbigner
0454d8e235 Addendum: This PR has been extended to include
interoperability fixed. We were given a Zarr format dataset
stored as a directory+file tree. This dataset uses the XArray
conventions and was generated by some non-Unidata Zarr implementation.
In attempting to process it with NCZarr, several interoperability
problems were discovered and fixed. This gives us more confidence
that NCZarr -- using pure zarr -- can interoperate with other
Zarr implementations.

Specific changes:
* Add test nczarr_test/run_interop.sh
* Support attributes with single value not enclosed in JSON array tags.
* Add mode inferencing and use it in nczarr_test/run_purezarr.sh
* Reduce size of tst_err_enddef.nc because it is more than 3 GB.
2021-04-02 18:39:50 -06:00
Dennis Heimbigner
2afbdbd18f Add support for the XArray Zarr _ARRAY_DIMENSIONS attribute
The XArray implementation that uses Zarr for storage
provides a mechanism to simulate named dimensions.
It does this by adding a per-variable attribute called
_ARRAY_DIMENSIONS. This attribute contains a list of names
to be matched against the shape values of the variable.
In effect a named dimension is created with the name
_ARRAY_DIMENSIONS(i) and length shape(i) for all i
in range 0..rank(variable).
Both read and write support is provided.

This XArray support is only invoked if the mode value
of "xarray" is defined. So for example, as in this URL.
````
https://s3.us-west-1.amazonaws.com/bucket/dataset#mode=nczarr,xarray,s3
````
Note that the "xarray" mode flag also implies mode flag "zarr", so the above
is equivalent to this URL.
````
https://s3.us-west-1.amazonaws.com/bucket/dataset#mode=nczarr,zarr,xarray,s3
````

The primary change to implement this was to unify the handling
of dimension references in libnczarr/zsync.

A test for this and other pure-zarr features was added as
nczarr_test/run_purezarr.sh

Other changes:
* Make sure distcheck leaves no files around.
* Change the special attribute flag DIMSCALEFLAG to HIDDENATTRFLAG
  to support the xarray attribute.
* Annotate the zmap implementations with feature flags such as
  WRITEONCE (for zip files).
2021-02-24 13:46:11 -07:00