/*********************************************************************
 * Copyright 2018, University Corporation for Atmospheric Research
 * See netcdf/README file for copying and redistribution conditions.
 * Thanks to Philippe Poilbarbe and Antonio S. Cofiño for
 * compression additions.
 * $Id: nccopy.c 400 2010-08-27 21:02:52Z russ $
 *********************************************************************/

#include "config.h" /* for USE_NETCDF4 macro */
#include <stddef.h>
#include <stdlib.h>
#include <stdio.h>
#ifdef HAVE_GETOPT_H
#include <getopt.h>
#endif

#if defined(_WIN32) && !defined(__MINGW32__)
#include "XGetopt.h"
#endif

#ifdef HAVE_UNISTD_H
#include <unistd.h>
#endif
#include <string.h>
#include "netcdf.h"
#include "netcdf_filter.h"
#include "netcdf_aux.h"
#include "nciter.h"
#include "utils.h"
#include "chunkspec.h"
#include "dimmap.h"
#include "nccomps.h"
#include "list.h"
#include "ncpathmgr.h"

#undef DEBUGFILTER
#undef DEBUGCHUNK

/* default bytes of memory we are willing to allocate for variable
 * values during copy */
#define COPY_BUFFER_SIZE (5000000)
#define COPY_CHUNKCACHE_PREEMPTION (1.0f) /* for copying, can eject fully read chunks */
#define SAME_AS_INPUT (-1)                /* default, if kind not specified */
#define CHUNK_THRESHOLD (8192)            /* non-record variables with fewer bytes don't get chunked */

#ifndef USE_NETCDF4
#define NC_CLASSIC_MODEL 0x0100 /* Enforce classic model if netCDF-4 not available. */
#endif

/* Ascii characters requiring escaping as lead */
#define ESCAPESD "0123456789"
#define ESCAPES " !\"#$%&'()*,:;<=>?[]\\^`{|}~"

#define DFALTUNLIMSIZE (4*1<<20) /* 4 megabytes */

#ifdef USE_NETCDF4

/* The unique id for a variable requires also the enclosing group id */
typedef struct VarID {
    int grpid;
    int varid;
} VarID;
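
/* Illustrative note: a VarID pairs the group id with the variable id,
 * e.g. (sketch) VarID vid = { grpid, varid }; both are needed because
 * variable ids are only unique within their enclosing group. */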

#define MAX_FILTER_SPECS 64
#define MAX_FILTER_PARAMS 256
struct FilterOption {
    char* fqn; /* Of variable */
    int nofilter; /* 1=> do not apply any filters to this variable */
    NC_H5_Filterspec pfs;
};
static List* filteroptions = NULL;
static int suppressfilters = 0; /* 1 => do not apply any output filters unless specified */
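
/* Illustrative sketch of how the two globals above are used (assumption:
 * the filter spec arrives on the command line and is parsed by
 * parsefilterspec() below): a spec of the form
 *     var1&var2,<filterid>,<param>,...
 * yields one FilterOption per listed variable, each carrying the parsed
 * NC_H5_Filterspec, while
 *     *,none
 * yields a single FilterOption with fqn "*" and nofilter set. */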

/* Forward declaration, because copy_type, copy_vlen_type call each other */
static int copy_type(int igrp, nc_type typeid, int ogrp);
static void freefilteroptlist(List* specs);
static void freefilterlist(size_t nfilters, NC_H5_Filterspec** filters);
#endif

/* table of formats for legal -k values */
static struct Kvalues {
    char* name;
    int kind;
} legalkinds[] = {
    /* NetCDF-3 classic format (32-bit offsets) */
    {"classic", NC_FORMAT_CLASSIC}, /* canonical format name */
    {"nc3", NC_FORMAT_CLASSIC},     /* short format name */
    {"1", NC_FORMAT_CLASSIC},       /* deprecated, use "-3" or "-k nc3" instead */

    /* NetCDF-3 64-bit offset format */
    {"64-bit offset", NC_FORMAT_64BIT_OFFSET}, /* canonical format name */
    {"nc6", NC_FORMAT_64BIT_OFFSET},           /* short format name */
    {"2", NC_FORMAT_64BIT_OFFSET},             /* deprecated, use "-6" or "-k nc6" instead */
    {"64-bit-offset", NC_FORMAT_64BIT_OFFSET}, /* deprecated alias */

    /* NetCDF-4 HDF5-based format */
    {"netCDF-4", NC_FORMAT_NETCDF4}, /* canonical format name */
    {"nc4", NC_FORMAT_NETCDF4},      /* short format name */
    {"3", NC_FORMAT_NETCDF4},        /* deprecated, use "-4" or "-k nc4" instead */
    {"netCDF4", NC_FORMAT_NETCDF4},  /* deprecated aliases */
    {"hdf5", NC_FORMAT_NETCDF4},
    {"enhanced", NC_FORMAT_NETCDF4},

    /* NetCDF-4 HDF5-based format, restricted to classic data model */
    {"netCDF-4 classic model", NC_FORMAT_NETCDF4_CLASSIC}, /* canonical format name */
    {"nc7", NC_FORMAT_NETCDF4_CLASSIC},                    /* short format name */
    {"4", NC_FORMAT_NETCDF4_CLASSIC},                      /* deprecated, use "-7" or "-k nc7" instead */
    {"netCDF-4-classic", NC_FORMAT_NETCDF4_CLASSIC},       /* deprecated aliases */
    {"netCDF-4_classic", NC_FORMAT_NETCDF4_CLASSIC},
    {"netCDF4_classic", NC_FORMAT_NETCDF4_CLASSIC},
    {"hdf5-nc3", NC_FORMAT_NETCDF4_CLASSIC},
    {"enhanced-nc3", NC_FORMAT_NETCDF4_CLASSIC},

    /* The 64-bit data (CDF5) kind (5) */
    {"5", NC_FORMAT_CDF5},
    {"64-bit-data", NC_FORMAT_CDF5},
    {"64-bit data", NC_FORMAT_CDF5},
    {"nc5", NC_FORMAT_CDF5},
    {"cdf5", NC_FORMAT_CDF5},

    /* null terminate */
    {NULL,0}
};
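
/* Example: with the table above, "-k nc7", "-k 4", and
 * "-k 'netCDF-4 classic model'" all select NC_FORMAT_NETCDF4_CLASSIC;
 * the trailing {NULL,0} entry lets callers scan the table linearly. */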

/* Global variables for command-line requests */
char *progname; /* for error messages */
static int option_kind = SAME_AS_INPUT;
static int option_deflate_level = -1;          /* default, compress output only if input compressed */
static int option_shuffle_vars = NC_NOSHUFFLE; /* default, no shuffling on compression */
static int option_fix_unlimdims = 0;           /* default, preserve unlimited dimensions */
static List* option_chunkspecs = NULL;         /* default, no chunk specification */
static size_t option_copy_buffer_size = COPY_BUFFER_SIZE;
static size_t option_chunk_cache_size = CHUNK_CACHE_SIZE;     /* default from config.h */
static size_t option_chunk_cache_nelems = CHUNK_CACHE_NELEMS; /* default from config.h */
static int option_read_diskless = 0;  /* default, don't read input into memory on open */
static int option_write_diskless = 0; /* default, don't write output to diskless file */
#ifdef USE_NETCDF4
static size_t option_min_chunk_bytes = CHUNK_THRESHOLD; /* default, don't chunk variable if prod of
                                                         * chunksizes of its dimensions is smaller
                                                         * than this */
#endif
static int option_nlgrps = 0;              /* Number of groups specified with -g
                                            * option on command line */
static char** option_lgrps = 0;            /* list of group names specified with -g
                                            * option on command line */
static idnode_t* option_grpids = 0;        /* list of grpids matching list specified with -g option */
static bool_t option_grpstruct = false;    /* if -g set, copy structure for non-selected groups */
static int option_nlvars = 0;              /* Number of variables specified with -v
                                            * option on command line */
static char** option_lvars = 0;            /* list of variable names specified with -v
                                            * option on command line */
static bool_t option_varstruct = false;    /* if -v set, copy structure for non-selected vars */
static int option_compute_chunkcaches = 0; /* default, don't try still flaky estimate of
                                            * chunk cache for each variable */

/* get group id in output corresponding to group igrp in input,
 * given parent group id (or root group id) parid in output. */
static int
get_grpid(int igrp, int parid, int *ogrpp) {
    int stat = NC_NOERR;
    int ogid = parid; /* like igrp but in output file */
#ifdef USE_NETCDF4
    int inparid;

    /* if not root group, get corresponding output groupid from group name */
    stat = nc_inq_grp_parent(igrp, &inparid);
    if(stat == NC_NOERR) { /* not root group */
        char grpname[NC_MAX_NAME + 1];
        NC_CHECK(nc_inq_grpname(igrp, grpname));
        NC_CHECK(nc_inq_grp_ncid(parid, grpname, &ogid));
    } else if(stat == NC_ENOGRP) { /* root group */
        stat = NC_NOERR;
    } else {
        NC_CHECK(stat);
    }
#endif /* USE_NETCDF4 */
    *ogrpp = ogid;
    return stat;
}
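
/* Usage sketch (hypothetical ids): when copying input group /g1/g2 whose
 * parent has already been created in the output with id "oparid",
 *     get_grpid(ig2, oparid, &og2);
 * looks up the output group named "g2" under oparid; for the root group
 * the output parent id is returned unchanged. */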

/* Return size in bytes of a variable value */
static size_t
val_size(int grpid, int varid) {
    nc_type vartype;
    size_t value_size;
    NC_CHECK(nc_inq_vartype(grpid, varid, &vartype));
    NC_CHECK(nc_inq_type(grpid, vartype, NULL, &value_size));
    return value_size;
}

#ifdef USE_NETCDF4
/* Get parent id needed to define a new group from its full name in an
 * open file identified by ncid. Assumes all intermediate groups are
 * already defined. */
static int
nc_inq_parid(int ncid, const char *fullname, int *locidp) {
    char *parent = strdup(fullname);
    char *slash = "/"; /* groupname separator */
    char *last_slash;
    if(parent == NULL) {
        return NC_ENOMEM; /* exits */
    }
    last_slash = strrchr(parent, '/');
    if(last_slash == parent || last_slash == NULL) { /* parent is root */
        free(parent);
        parent = strdup(slash);
    } else {
        *last_slash = '\0'; /* truncate to get parent name */
    }
    NC_CHECK(nc_inq_grp_full_ncid(ncid, parent, locidp));
    free(parent);
    return NC_NOERR;
}
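
/* Usage sketch: for fullname "/g1/g2/v" this returns the ncid of "/g1/g2";
 * for a top-level name such as "/v" (or "v" with no slash) it returns the
 * ncid of the root group "/". */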

/* Compute the fully qualified name of a (grpid,varid) pair; caller must free */
static int
computeFQN(VarID vid, char** fqnp)
{
    int stat = NC_NOERR;
    size_t len;
    char* fqn = NULL;
    char vname[NC_MAX_NAME+1];
    char escname[(2*NC_MAX_NAME)+1];
    int first;
    char *p, *q;

    if((stat = nc_inq_grpname_full(vid.grpid,&len,NULL))) goto done;
    fqn = (char*)malloc(len+1+(2*NC_MAX_NAME)+1);
    if(fqn == NULL) {stat = NC_ENOMEM; goto done;}
    if((stat=nc_inq_grpname_full(vid.grpid,&len,fqn))) goto done;
    fqn[len] = '\0'; /* guarantee */
    if((stat=nc_inq_varname(vid.grpid,vid.varid,vname))) goto done;
    vname[NC_MAX_NAME] = '\0';
    if(strlen(fqn) > 1) strcat(fqn,"/");
    p = vname;
    q = escname;
    for(first=1;*p;first=0) {
        if((first && strchr(ESCAPESD,*p) != NULL)
           || strchr(ESCAPES,*p) != NULL) *q++ = '\\';
        *q++ = *p++;
    }
    *q++ = '\0'; /* guarantee */
    strcat(fqn,escname);
done:
    if(stat == NC_NOERR && fqnp != NULL) {*fqnp = fqn; fqn = NULL;}
    return stat;
}
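
/* Escaping sketch: a variable "x y" in group "/g1" yields "/g1/x\ y"
 * (the space is in ESCAPES), and a variable "2d" in the root group yields
 * "/\2d" (a leading digit is escaped because of ESCAPESD). */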

static int
parsevarlist(char* vars, List* vlist)
{
    int stat = NC_NOERR;
    char* q = NULL;
    int nvars = 0;

    /* Special case 1: empty set of vars */
    if(vars == NULL || strlen(vars)==0) {stat = NC_EINVAL; goto done;}

    /* Special case 2: "*" */
    if(strcmp(vars,"*")==0) {
        listpush(vlist,strdup("*"));
        goto done;
    }

    /* Walk, delimiting on '&' separators */
    for(q=vars;*q;q++) {
        if(*q == '\\') q++;
        else if(*q == '&') {*q = '\0'; nvars++;}
        /* else continue */
    }
    nvars++; /* for last var */
    /* Rewalk to capture the variables */
    for(q=vars;nvars > 0; nvars--) {
        listpush(vlist,strdup(q));
        q += (strlen(q)+1); /* move to next */
    }

done:
    return stat;
}
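
/* Example: parsevarlist() on "v1&/g1/v2" pushes "v1" and "/g1/v2" onto
 * vlist; "*" is passed through as the single entry "*"; an escaped '&'
 * does not split, so "a\&b" remains one name. */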
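
/* Format handled by parsefilterspec() (as implemented below): the argument is
 *     <varlist>,<remainder>
 * where <varlist> is an '&'-separated list of (possibly escaped) variable
 * names handed to parsevarlist(), and <remainder> is either the keyword
 * "none" or a filter list passed to ncaux_h5filterspec_parselist().
 * Hypothetical example: "v1&v2,307,9" would request filter id 307 with a
 * single parameter for v1 and v2, assuming the usual "<id>,<param>,..."
 * HDF5 filter spec syntax. */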
static int
parsefilterspec(const char* optarg0, List* speclist)
{
    int stat = NC_NOERR;
    char* optarg = NULL;
    char* p = NULL;
    char* remainder = NULL;
    List* vlist = NULL;
    int i;
    int isnone = 0;
    size_t nfilters = 0;
    NC_H5_Filterspec** filters = NULL;

    if(optarg0 == NULL || strlen(optarg0) == 0 || speclist == NULL) return 0;
    optarg = strdup(optarg0);

    /* Delimit the initial set of variables, taking escapes into account */
    p = optarg;
    remainder = NULL;
    for(;;p++) {
        if(*p == '\0') {remainder = p; break;}
        else if(*p == ',') {*p = '\0'; remainder = p+1; break;}
        else if(*p == '\\') p++;
        /* else continue */
    }

    /* Parse the variable list */
    if((vlist = listnew()) == NULL) {stat = NC_ENOMEM; goto done;}
    if((stat=parsevarlist(optarg,vlist))) goto done;

    if(strcasecmp(remainder,"none") != 0) {
        /* Collect the id+parameters */
        if((stat=ncaux_h5filterspec_parselist(remainder,NULL,&nfilters,&filters))) goto done;
    } else {
        isnone = 1;
        if(nfilters == 0) {
            /* Add a fake filter */
            NC_H5_Filterspec* nilspec = (NC_H5_Filterspec*)calloc(1,sizeof(NC_H5_Filterspec));
            if(nilspec == NULL) {stat = NC_ENOMEM; goto done;}
            nfilters = 1;
            filters = calloc(1,sizeof(NC_H5_Filterspec**));
            if(filters == NULL) {free(nilspec); stat = NC_ENOMEM; goto done;}
            filters[0] = nilspec; nilspec = NULL;
        }
    }

    /* Construct a spec entry for each element in vlist */
    for(i=0;i<listlength(vlist);i++) {
        int k;
        size_t vlen;
        struct FilterOption* filtopt = NULL;
        const char* var = listget(vlist,i);
        if(var == NULL || strlen(var) == 0) continue;
        vlen = strlen(var);
        for(k=0;k<nfilters;k++) {
            NC_H5_Filterspec* nsf = filters[k];
            if((filtopt = calloc(1,sizeof(struct FilterOption)))==NULL)
                {stat = NC_ENOMEM; goto done;}
            filtopt->fqn = malloc(vlen+1+1); /* make room for nul and possible prefix '/' */
            if(filtopt->fqn == NULL) {stat = NC_ENOMEM; goto done;}
            filtopt->fqn[0] = '\0'; /* for strlcat */
            if(strcmp(var,"*") != 0 && var[0] != '/') strlcat(filtopt->fqn,"/",vlen+2);
            strlcat(filtopt->fqn,var,vlen+2);
            if(isnone)
                filtopt->nofilter = 1;
            else {
                filtopt->pfs = *nsf;
                if(nsf->nparams != 0) {
                    /* Duplicate the params */
                    filtopt->pfs.params = calloc(filtopt->pfs.nparams,sizeof(unsigned int));
if(filtopt->pfs.params == NULL) {stat = NC_ENOMEM; goto done;}
|
2020-09-28 02:43:46 +08:00
|
|
|
memcpy(filtopt->pfs.params,nsf->params,sizeof(unsigned int)*filtopt->pfs.nparams);
|
2020-02-17 03:59:33 +08:00
|
|
|
} else
|
This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".
The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.
More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).
WARNING: this code has had limited testing, so do use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:
Platform | Build System | S3 support
------------------------------------
Linux+gcc | Automake | yes
Linux+gcc | CMake | yes
Visual Studio | CMake | no
Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future. Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.
In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*. The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
and the version bumped.
4. An overly complex set of structs was created to support funnelling
all of the filterx operations thru a single dispatch
"filter_actions" entry.
5. Move common code to from libhdf5 to libsrc4 so that it is accessible
to nczarr.
Changes directly related to Zarr:
1. Modified CMakeList.txt and configure.ac to support both C and C++
-- this is in support of S3 support via the awd-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
support zarr and to regularize the structure of the fragments
section of a URL.
Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
* Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.
Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-29 08:02:47 +08:00
|
|
|
filtopt->pfs.params = NULL;
|
2020-02-17 03:59:33 +08:00
|
|
|
}
|
This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".
The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.
More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).
WARNING: this code has had limited testing, so do use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:
Platform | Build System | S3 support
------------------------------------
Linux+gcc | Automake | yes
Linux+gcc | CMake | yes
Visual Studio | CMake | no
Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future. Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.
In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*. The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
and the version bumped.
4. An overly complex set of structs was created to support funnelling
all of the filterx operations thru a single dispatch
"filter_actions" entry.
5. Move common code to from libhdf5 to libsrc4 so that it is accessible
to nczarr.
Changes directly related to Zarr:
1. Modified CMakeList.txt and configure.ac to support both C and C++
-- this is in support of S3 support via the awd-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
support zarr and to regularize the structure of the fragments
section of a URL.
Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
* Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.
Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-29 08:02:47 +08:00
|
|
|
listpush(filteroptions,filtopt);
|
|
|
|
filtopt = NULL;
|
2018-01-24 07:00:11 +08:00
|
|
|
}
|
|
|
|
}
|
2020-12-08 02:29:12 +08:00
|
|
|
|
2018-03-03 07:55:58 +08:00
|
|
|
done:
|
2020-02-17 03:59:33 +08:00
|
|
|
freefilterlist(nfilters,filters);
|
2019-02-09 09:48:17 +08:00
|
|
|
if(vlist) listfreeall(vlist);
|
2018-03-03 07:55:58 +08:00
|
|
|
if(optarg) free(optarg);
|
2018-01-24 07:00:11 +08:00
|
|
|
return stat;
|
2017-05-15 08:10:02 +08:00
|
|
|
}
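
/* Illustrative sketch (compiled out): the option-recording code above deep-copies
 * the caller's HDF5-style filter spec (id + unsigned-int parameters) so that the
 * recorded FilterOption owns its own params array.  The struct and function names
 * here (ExampleSpec, example_spec_clone) are hypothetical and not part of nccopy. */
#if 0
#include <stdlib.h>   /* calloc */
#include <string.h>   /* memcpy */

struct ExampleSpec {
    unsigned int filterid;
    size_t nparams;
    unsigned int* params;
};

/* Return 0 on success, -1 on allocation failure; dst gets a private copy of src->params. */
static int
example_spec_clone(struct ExampleSpec* dst, const struct ExampleSpec* src)
{
    *dst = *src;            /* copy id and parameter count */
    dst->params = NULL;
    if(src->nparams != 0) {
        dst->params = calloc(src->nparams, sizeof(unsigned int));
        if(dst->params == NULL) return -1;
        memcpy(dst->params, src->params, src->nparams * sizeof(unsigned int));
    }
    return 0;
}
#endif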

/* Return 1 if variable has only active (ie not none) filters */
static int
varfiltersactive(const char* ofqn)
{
    int i;
    int hasnone = 0;
    int hasactive = 0;
    /* See which output filter options are defined for this output variable */
    for(i=0;i<listlength(filteroptions);i++) {
        struct FilterOption* opt = listget(filteroptions,i);
        if(strcmp(opt->fqn,"*")==0 || strcmp(opt->fqn,ofqn)==0)
            {if(opt->nofilter) hasnone = 1;} else {hasactive = 1;}
    }
    return (hasactive && !hasnone ? 1 : 0);
}

/* Return 1 if variable has "none" filters */
static int
varfilterssuppress(const char* ofqn)
{
    int i;
    int hasnone = 0;
    /* See which output filter options are defined for this output variable */
    for(i=0;i<listlength(filteroptions);i++) {
        struct FilterOption* opt = listget(filteroptions,i);
        if(strcmp(opt->fqn,"*")==0 || strcmp(opt->fqn,ofqn)==0)
            {if(opt->nofilter) hasnone = 1;}
    }
    return hasnone || suppressfilters;
}

/* Return list of active filters */
static List*
filteroptsforvar(const char* ofqn)
{
    int i;
    List* list = listnew();
    /* See which output filter options are defined for this output variable;
       both active and none. */
    for(i=0;i<listlength(filteroptions);i++) {
        struct FilterOption* opt = listget(filteroptions,i);
        if(strcmp(opt->fqn,"*")==0 || strcmp(opt->fqn,ofqn)==0) {
            if(!opt->nofilter) /* Add to the list */
                listpush(list,opt);
        }
    }
    return list;
}
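
/* Illustrative sketch (compiled out): the single matching rule shared by
 * varfiltersactive, varfilterssuppress and filteroptsforvar above.  A recorded
 * option applies to an output variable when its fqn is the wildcard "*" or
 * matches the variable's fully qualified name exactly (e.g. "/g1/temp").
 * The helper name is hypothetical. */
#if 0
#include <string.h>

static int
example_option_applies(const char* optfqn, const char* varfqn)
{
    return strcmp(optfqn, "*") == 0 || strcmp(optfqn, varfqn) == 0;
}
/* example_option_applies("*", "/g1/temp")        -> 1
 * example_option_applies("/g1/temp", "/g1/temp") -> 1
 * example_option_applies("/g1/temp", "/g2/temp") -> 0 */
#endif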

/* Return size of chunk in bytes for a variable varid in a group igrp, or 0 if
 * layout is contiguous|compact */
static int
inq_var_chunksize(int igrp, int varid, size_t* chunksizep) {
    int stat = NC_NOERR;
    int ndims;
    size_t *chunksizes;
    int dim;
    int contig = NC_CONTIGUOUS;
    nc_type vartype;
    size_t value_size;
    size_t prod;

    NC_CHECK(nc_inq_vartype(igrp, varid, &vartype));
    /* from type, get size in memory needed for each value */
    NC_CHECK(nc_inq_type(igrp, vartype, NULL, &value_size));
    prod = value_size;
    NC_CHECK(nc_inq_varndims(igrp, varid, &ndims));
    chunksizes = (size_t *) emalloc((size_t)(ndims + 1) * sizeof(size_t));
    contig = NC_CHUNKED;
    NC_CHECK(nc_inq_var_chunking(igrp, varid, &contig, NULL));
    if(contig != NC_CHUNKED) {
        *chunksizep = 0;
    } else {
        NC_CHECK(nc_inq_var_chunking(igrp, varid, &contig, chunksizes));
        for(dim = 0; dim < ndims; dim++) {
            prod *= chunksizes[dim];
        }
        *chunksizep = prod;
    }
    free(chunksizes);
    return stat;
}
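
/* Illustrative sketch (compiled out): inq_var_chunksize reduces to
 * value_size * prod(chunksizes).  For a float variable (4 bytes per value)
 * chunked as 100 x 100, that is 4 * 100 * 100 = 40000 bytes.  The names here
 * are hypothetical. */
#if 0
#include <stdio.h>

static size_t
example_chunk_bytes(size_t value_size, int ndims, const size_t* chunksizes)
{
    size_t prod = value_size;
    int dim;
    for(dim = 0; dim < ndims; dim++)
        prod *= chunksizes[dim];
    return prod;
}

int main(void)
{
    size_t chunks[2] = {100, 100};
    printf("%zu\n", example_chunk_bytes(4, 2, chunks)); /* prints 40000 */
    return 0;
}
#endif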

/* Return estimated number of elems required in chunk cache and
 * estimated size of chunk cache adequate to efficiently copy input
 * variable ivarid to output variable ovarid, which may have different
 * chunk size and shape */
static int
inq_var_chunking_params(int igrp, int ivarid, int ogrp, int ovarid,
                        size_t* chunkcache_sizep,
                        size_t *chunkcache_nelemsp,
                        float * chunkcache_preemptionp)
{
    int stat = NC_NOERR;
    int ndims;
    size_t *ichunksizes, *ochunksizes;
    int dim;
    int icontig = NC_CONTIGUOUS, ocontig = NC_CONTIGUOUS;
    nc_type vartype;
    size_t value_size;
    size_t prod, iprod, oprod;
    size_t nelems;
    *chunkcache_nelemsp = CHUNK_CACHE_NELEMS;
    *chunkcache_sizep = CHUNK_CACHE_SIZE;
    *chunkcache_preemptionp = COPY_CHUNKCACHE_PREEMPTION;

    NC_CHECK(nc_inq_varndims(igrp, ivarid, &ndims));
    icontig = (ocontig = NC_CHUNKED);
    NC_CHECK(nc_inq_var_chunking(igrp, ivarid, &icontig, NULL));
    NC_CHECK(nc_inq_var_chunking(ogrp, ovarid, &ocontig, NULL));
    if(icontig != NC_CHUNKED && ocontig != NC_CHUNKED) { /* no chunking in input or output */
        *chunkcache_nelemsp = 0;
        *chunkcache_sizep = 0;
        *chunkcache_preemptionp = 0;
        return stat;
    }

    NC_CHECK(nc_inq_vartype(igrp, ivarid, &vartype));
    NC_CHECK(nc_inq_type(igrp, vartype, NULL, &value_size));
    iprod = value_size;

    if(icontig == NC_CHUNKED && ocontig != NC_CHUNKED) { /* chunking only in input */
        *chunkcache_nelemsp = 1; /* read one input chunk at a time */
        *chunkcache_sizep = iprod;
        *chunkcache_preemptionp = 1.0f;
        return stat;
    }

    ichunksizes = (size_t *) emalloc((size_t)(ndims + 1) * sizeof(size_t));
    if(icontig != NC_CHUNKED) { /* if input contiguous|compact, treat as if chunked on
                                 * first dimension */
        ichunksizes[0] = 1;
        for(dim = 1; dim < ndims; dim++) {
            ichunksizes[dim] = dim;
        }
    } else {
        NC_CHECK(nc_inq_var_chunking(igrp, ivarid, &icontig, ichunksizes));
    }

    /* now can pretend chunking in both input and output */
    ochunksizes = (size_t *) emalloc((size_t)(ndims + 1) * sizeof(size_t));
    NC_CHECK(nc_inq_var_chunking(ogrp, ovarid, &ocontig, ochunksizes));

    nelems = 1;
    oprod = value_size;
    for(dim = 0; dim < ndims; dim++) {
        nelems += 1 + (ichunksizes[dim] - 1) / ochunksizes[dim];
        iprod *= ichunksizes[dim];
        oprod *= ochunksizes[dim];
    }
    prod = iprod + oprod * (nelems - 1);
    *chunkcache_nelemsp = nelems;
    *chunkcache_sizep = prod;
    free(ichunksizes);
    free(ochunksizes);
    return stat;
}
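
/* Illustrative sketch (compiled out): the cache estimate above holds one input
 * chunk plus roughly the number of output chunks each input chunk can touch.
 * With 4-byte values, input chunks of 100x100 and output chunks of 50x50:
 * nelems = 1 + (1 + 99/50) + (1 + 99/50) = 5, iprod = 40000, oprod = 10000, so
 * the suggested cache size is 40000 + 10000*(5-1) = 80000 bytes.  Names here
 * are hypothetical. */
#if 0
static void
example_cache_estimate(size_t value_size, int ndims,
                       const size_t* ichunk, const size_t* ochunk,
                       size_t* nelemsp, size_t* bytesp)
{
    size_t nelems = 1, iprod = value_size, oprod = value_size;
    int dim;
    for(dim = 0; dim < ndims; dim++) {
        nelems += 1 + (ichunk[dim] - 1) / ochunk[dim]; /* output chunks per input chunk, per dim */
        iprod *= ichunk[dim];
        oprod *= ochunk[dim];
    }
    *nelemsp = nelems;
    *bytesp = iprod + oprod * (nelems - 1);
}
#endif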

/*
 * copy a user-defined variable length type in the group igrp to the
 * group ogrp
 */
static int
copy_vlen_type(int igrp, nc_type itype, int ogrp)
{
    int stat = NC_NOERR;
    nc_type ibasetype;
    nc_type obasetype; /* base type in target group */
    char name[NC_MAX_NAME];
    size_t size;
    char basename[NC_MAX_NAME];
    size_t basesize;
    nc_type vlen_type;

    NC_CHECK(nc_inq_vlen(igrp, itype, name, &size, &ibasetype));
    /* to get base type id in target group, use name of base type in
     * source group */
    NC_CHECK(nc_inq_type(igrp, ibasetype, basename, &basesize));
    stat = nc_inq_typeid(ogrp, basename, &obasetype);
    /* if no such type, create it now */
    if(stat == NC_EBADTYPE) {
        NC_CHECK(copy_type(igrp, ibasetype, ogrp));
        stat = nc_inq_typeid(ogrp, basename, &obasetype);
    }
    NC_CHECK(stat);

    /* Now we know base type exists in output and we know its type id */
    NC_CHECK(nc_def_vlen(ogrp, name, obasetype, &vlen_type));

    return stat;
}

/*
 * copy a user-defined opaque type in the group igrp to the group ogrp
 */
static int
copy_opaque_type(int igrp, nc_type itype, int ogrp)
{
    int stat = NC_NOERR;
    nc_type otype;
    char name[NC_MAX_NAME];
    size_t size;

    NC_CHECK(nc_inq_opaque(igrp, itype, name, &size));
    NC_CHECK(nc_def_opaque(ogrp, size, name, &otype));

    return stat;
}

/*
 * copy a user-defined enum type in the group igrp to the group ogrp
 */
static int
copy_enum_type(int igrp, nc_type itype, int ogrp)
{
    int stat = NC_NOERR;
    nc_type otype;
    nc_type basetype;
    size_t basesize;
    size_t nmembers;
    char name[NC_MAX_NAME];
    int i;

    NC_CHECK(nc_inq_enum(igrp, itype, name, &basetype, &basesize, &nmembers));
    NC_CHECK(nc_def_enum(ogrp, basetype, name, &otype));
    for(i = 0; i < nmembers; i++) { /* insert enum members */
        char ename[NC_MAX_NAME];
        long long val; /* large enough to hold any integer type */
        NC_CHECK(nc_inq_enum_member(igrp, itype, i, ename, &val));
        NC_CHECK(nc_insert_enum(ogrp, otype, ename, &val));
    }
    return stat;
}

/*
 * copy a user-defined compound type in the group igrp to the group ogrp
 */
static int
copy_compound_type(int igrp, nc_type itype, int ogrp)
{
    int stat = NC_NOERR;
    char name[NC_MAX_NAME];
    size_t size;
    size_t nfields;
    nc_type otype;
    int fid;

    NC_CHECK(nc_inq_compound(igrp, itype, name, &size, &nfields));
    NC_CHECK(nc_def_compound(ogrp, size, name, &otype));

    for (fid = 0; fid < nfields; fid++) {
        char fname[NC_MAX_NAME];
        char ftypename[NC_MAX_NAME];
        size_t foff;
        nc_type iftype, oftype;
        int fndims;

        NC_CHECK(nc_inq_compound_field(igrp, itype, fid, fname, &foff, &iftype, &fndims, NULL));
        /* type ids in source don't necessarily correspond to same
         * typeids in destination, so look up destination typeid by using
         * field type name */
        NC_CHECK(nc_inq_type(igrp, iftype, ftypename, NULL));
        NC_CHECK(nc_inq_typeid(ogrp, ftypename, &oftype));
        if(fndims == 0) {
            NC_CHECK(nc_insert_compound(ogrp, otype, fname, foff, oftype));
        } else { /* field is array type */
            int *fdimsizes;
            fdimsizes = (int *) emalloc((size_t)(fndims + 1) * sizeof(int));
            stat = nc_inq_compound_field(igrp, itype, fid, NULL, NULL, NULL,
                                         NULL, fdimsizes);
            NC_CHECK(nc_insert_array_compound(ogrp, otype, fname, foff, oftype, fndims, fdimsizes));
            free(fdimsizes);
        }
    }
    return stat;
}
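
/* Illustrative sketch (compiled out): the same name/offset mechanics used above,
 * but defining a fresh compound type directly in some already-open netCDF-4
 * group `ncid` (a hypothetical handle).  Offsets come from offsetof on a
 * matching C struct, and the array-valued field uses nc_insert_array_compound.
 * The struct and function names are hypothetical. */
#if 0
#include <stddef.h>   /* offsetof */

struct example_obs {
    int id;
    float samples[3];
};

static int
example_def_obs_type(int ncid, nc_type* typeidp)
{
    int stat;
    int dims[1] = {3};
    if((stat = nc_def_compound(ncid, sizeof(struct example_obs), "example_obs", typeidp)))
        return stat;
    if((stat = nc_insert_compound(ncid, *typeidp, "id",
                                  offsetof(struct example_obs, id), NC_INT)))
        return stat;
    return nc_insert_array_compound(ncid, *typeidp, "samples",
                                    offsetof(struct example_obs, samples),
                                    NC_FLOAT, 1, dims);
}
#endif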

/*
 * copy a user-defined type in the group igrp to the group ogrp
 */
static int
copy_type(int igrp, nc_type typeid, int ogrp)
{
    int stat = NC_NOERR;
    nc_type type_class;

    NC_CHECK(nc_inq_user_type(igrp, typeid, NULL, NULL, NULL, NULL, &type_class));

    switch(type_class) {
    case NC_VLEN:
        NC_CHECK(copy_vlen_type(igrp, typeid, ogrp));
        break;
    case NC_OPAQUE:
        NC_CHECK(copy_opaque_type(igrp, typeid, ogrp));
        break;
    case NC_ENUM:
        NC_CHECK(copy_enum_type(igrp, typeid, ogrp));
        break;
    case NC_COMPOUND:
        NC_CHECK(copy_compound_type(igrp, typeid, ogrp));
        break;
    default:
        NC_CHECK(NC_EBADTYPE);
    }
    return stat;
}
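
/* Illustrative sketch (compiled out): how the copy_type dispatcher above could be
 * driven for every user-defined type in an input root group, mirroring the
 * per-group half of copy_types below without the recursion into subgroups.
 * incid/oncid are hypothetical, already-open input and output file ids. */
#if 0
static int
example_copy_root_types(int incid, int oncid)
{
    int stat = NC_NOERR;
    int ntypes = 0, i;
    nc_type* typeids = NULL;
    if((stat = nc_inq_typeids(incid, &ntypes, NULL)))
        return stat;
    if(ntypes == 0)
        return stat;
    typeids = (nc_type*)emalloc((size_t)ntypes * sizeof(nc_type));
    if((stat = nc_inq_typeids(incid, &ntypes, typeids)) == NC_NOERR) {
        for(i = 0; i < ntypes; i++)
            if((stat = copy_type(incid, typeids[i], oncid)))
                break;
    }
    free(typeids);
    return stat;
}
#endif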

/* Copy a group and all its subgroups, recursively, from iroot to
 * oroot, the ncids of input file and output file. This just creates
 * all the groups in the destination, but doesn't copy anything that's
 * in the groups yet. */
static int
copy_groups(int iroot, int oroot)
{
    int stat = NC_NOERR;
    size_t numgrps;
    int *grpids;
    int i;

    /* get total number of groups and their ids, including all descendants */
    NC_CHECK(nc_inq_grps_full(iroot, &numgrps, NULL));
    if(numgrps > 1) { /* there's always 1 root group */
        grpids = emalloc(numgrps * sizeof(int));
        NC_CHECK(nc_inq_grps_full(iroot, NULL, grpids));
        /* create corresponding new groups in ogrp, except for root group */
        for(i = 1; i < numgrps; i++) {
            char *grpname_full;
            char grpname[NC_MAX_NAME];
            size_t len_name;
            int ogid = 0, oparid = 0, iparid = 0;
            /* get full group name of input group */
            NC_CHECK(nc_inq_grpname(grpids[i], grpname));
            if (option_grpstruct || group_wanted(grpids[i], option_nlgrps, option_grpids)) {
                NC_CHECK(nc_inq_grpname_full(grpids[i], &len_name, NULL));
                grpname_full = emalloc(len_name + 1);
                NC_CHECK(nc_inq_grpname_full(grpids[i], &len_name, grpname_full));
                /* Make sure the parent group is also wanted (the root group is always wanted) */
                NC_CHECK(nc_inq_parid(iroot, grpname_full, &iparid));
                if (!option_grpstruct && !group_wanted(iparid, option_nlgrps, option_grpids)
                    && iparid != iroot) {
                    error("ERROR: trying to copy a group but not the parent: %s", grpname_full);
                }
                /* get id of parent group of corresponding group in output.
                 * Note that this exists, because nc_inq_groups returned
                 * grpids in preorder, so parents are always copied before
                 * their subgroups */
                NC_CHECK(nc_inq_parid(oroot, grpname_full, &oparid));
                NC_CHECK(nc_inq_grpname(grpids[i], grpname));
                /* define corresponding group in output */
                NC_CHECK(nc_def_grp(oparid, grpname, &ogid));
                free(grpname_full);
            }
        }
        free(grpids);
    }
    return stat;
}
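
/* Illustrative sketch (compiled out): the parents-before-children property that
 * copy_groups relies on can also be obtained with a direct recursion using only
 * core API calls: create each output group, then descend into its children.
 * igrp/ogrp are hypothetical, already-open input/output group ids, and error
 * handling is minimal; this is not the nccopy implementation. */
#if 0
static int
example_mirror_groups(int igrp, int ogrp)
{
    int stat = NC_NOERR;
    int numgrps = 0, i;
    int* subgrps = NULL;
    if((stat = nc_inq_grps(igrp, &numgrps, NULL)))
        return stat;
    if(numgrps == 0)
        return stat;
    subgrps = (int*)emalloc((size_t)numgrps * sizeof(int));
    if((stat = nc_inq_grps(igrp, &numgrps, subgrps)) == NC_NOERR) {
        for(i = 0; i < numgrps; i++) {
            char name[NC_MAX_NAME + 1];
            int newgrp;
            if((stat = nc_inq_grpname(subgrps[i], name))) break;
            if((stat = nc_def_grp(ogrp, name, &newgrp))) break; /* parent always exists first */
            if((stat = example_mirror_groups(subgrps[i], newgrp))) break;
        }
    }
    free(subgrps);
    return stat;
}
#endif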

/*
 * Copy the user-defined types in this group (igrp) and all its
 * subgroups, recursively, to corresponding group in output (ogrp)
 */
static int
copy_types(int igrp, int ogrp)
{
    int stat = NC_NOERR;
    int ntypes;
    nc_type *types = NULL;
    int numgrps;
    int *grpids = NULL;
    int i;

    NC_CHECK(nc_inq_typeids(igrp, &ntypes, NULL));

    if(ntypes > 0) {
        types = (nc_type *) emalloc((size_t)ntypes * sizeof(nc_type));
        NC_CHECK(nc_inq_typeids(igrp, &ntypes, types));
        for (i = 0; i < ntypes; i++) {
            NC_CHECK(copy_type(igrp, types[i], ogrp));
        }
        free(types);
    }

    /* Copy types from subgroups */
    NC_CHECK(nc_inq_grps(igrp, &numgrps, NULL));
    if(numgrps > 0) {
        grpids = (int *)emalloc(sizeof(int) * (size_t)numgrps);
        NC_CHECK(nc_inq_grps(igrp, &numgrps, grpids));
        for(i = 0; i < numgrps; i++) {
            if (option_grpstruct || group_wanted(grpids[i], option_nlgrps, option_grpids)) {
                int ogid;
                /* get groupid in output corresponding to grpids[i] in
                 * input, given parent group (or root group) ogrp in
                 * output */
                NC_CHECK(get_grpid(grpids[i], ogrp, &ogid));
                NC_CHECK(copy_types(grpids[i], ogid));
            }
        }
        free(grpids);
    }
    return stat;
}

/* Copy netCDF-4 specific variable filter properties */
/* Watch out if input is netcdf-3 */
static int
copy_var_filter(int igrp, int varid, int ogrp, int o_varid, int inkind, int outkind)
{
    int stat = NC_NOERR;
    VarID vid = {igrp,varid};
    VarID ovid = {ogrp,o_varid};
    /* handle filter parameters, copying from input, overriding with command-line options */
    List* ospecs = NULL;
    List* inspecs = NULL;
    List* actualspecs = NULL;
    struct FilterOption inspec;
    struct FilterOption* tmp = NULL;
    char* ofqn = NULL;
    int inputdefined, outputdefined, unfiltered;
    int innc4 = (inkind == NC_FORMAT_NETCDF4 || inkind == NC_FORMAT_NETCDF4_CLASSIC);
    int outnc4 = (outkind == NC_FORMAT_NETCDF4 || outkind == NC_FORMAT_NETCDF4_CLASSIC);
    int suppressvarfilters = 0;

    if(!outnc4)
        goto done; /* Can only use filters when output is some netcdf-4 variant */

    /* Compute the output vid's FQN */
    if((stat = computeFQN(ovid,&ofqn))) goto done;

    /* Clear the in and out specs */
    inspecs = listnew();
    ospecs = NULL;
    actualspecs = NULL;

    if(varfilterssuppress(ofqn) || option_deflate_level == 0)
        suppressvarfilters = 1;

    /* Are there one or more filters on the output variable? */
    outputdefined = 0; /* default is no filter defined */
    /* See if any output filter spec is defined for this output variable */
    ospecs = filteroptsforvar(ofqn);
    if(listlength(ospecs) > 0 && !suppressfilters && !suppressvarfilters)
        outputdefined = 1;

    /* Is there already a filter on the input variable? */
    inputdefined = 0; /* default is no filter defined */
    /* Only bother to look if input is a netcdf-4 variant */
    if(innc4) {
        size_t nfilters;
        unsigned int* ids = NULL;
        int k;
        if((stat = nc_inq_var_filter_ids(vid.grpid,vid.varid,&nfilters,NULL)))
            goto done;
        if(nfilters > 0) ids = (unsigned int*)calloc(nfilters,sizeof(unsigned int));
        if((stat = nc_inq_var_filter_ids(vid.grpid,vid.varid,&nfilters,ids)))
            goto done;
        memset(&inspec,0,sizeof(inspec));

        for(k=0;k<nfilters;k++) {
            inspec.pfs.filterid = ids[k];
            stat=nc_inq_var_filter_info(vid.grpid,vid.varid,inspec.pfs.filterid,&inspec.pfs.nparams,NULL);
            if(stat && stat != NC_ENOFILTER)
                goto done; /* true error */
            if(inspec.pfs.nparams > 0) {
                inspec.pfs.params = (unsigned int*)calloc(inspec.pfs.nparams,sizeof(unsigned int));
                if((stat=nc_inq_var_filter_info(vid.grpid,vid.varid,inspec.pfs.filterid,NULL,inspec.pfs.params)))
                    goto done;
            }
            tmp = malloc(sizeof(struct FilterOption));
            *tmp = inspec;
            memset(&inspec,0,sizeof(inspec)); /* reset */
            listpush(inspecs,tmp);
            inputdefined = 1;
        }
        nullfree(ids);
    }

    /* Rules for choosing output filter are as follows (Ugh!):

          global    output     input      Actual Output
          suppress  filter(s)  filter(s)  filter
       -----------------------------------------------------------
       1  true      undefined  NA         unfiltered
       2  true      'none'     NA         unfiltered
       3  true      defined    NA         use output filter(s)
       4  false     undefined  defined    use input filter(s)
       5  false     'none'     NA         unfiltered
       6  false     defined    NA         use output filter(s)
       7  false     undefined  undefined  unfiltered
       8  false     defined    defined    use output filter(s)
    */

    unfiltered = 0;
    if(suppressfilters && !outputdefined) /* row 1 */
        unfiltered = 1;
    else if(suppressfilters || suppressvarfilters) /* row 2 */
        unfiltered = 1;
    else if(suppressfilters && outputdefined) /* row 3 */
        actualspecs = ospecs;
    else if(!suppressfilters && !outputdefined && inputdefined) /* row 4 */
        actualspecs = inspecs;
    else if(!suppressfilters && suppressvarfilters) /* row 5 */
        unfiltered = 1;
    else if(!suppressfilters && outputdefined) /* row 6 */
        actualspecs = ospecs;
    else if(!suppressfilters && !outputdefined && !inputdefined) /* row 7 */
        unfiltered = 1;
    else if(!suppressfilters && outputdefined && inputdefined) /* row 8 */
        actualspecs = ospecs;

    /* Apply actual filter spec if any */
    if(!unfiltered) {
        /* add all the actual filters */
        int k;
        for(k=0;k<listlength(actualspecs);k++) {
            struct FilterOption* actual = (struct FilterOption*)listget(actualspecs,k);
            if((stat=nc_def_var_filter(ovid.grpid,ovid.varid,
                                       actual->pfs.filterid,
                                       actual->pfs.nparams,
                                       actual->pfs.params)))
                goto done;
        }
    }
done:
    /* Cleanup */
    if(ofqn != NULL) free(ofqn);
    freefilteroptlist(inspecs); inspecs = NULL;
    listfree(ospecs); ospecs = NULL; /* Contents are also in filteroptions */
    /* Note we do not clean actualspecs because it is a copy of in|out specs */
    return stat;
}
|
|
|
|
|
2018-07-27 10:16:02 +08:00
|
|
|
/* Propagate chunking from input to output taking -c flags into account. */
|
|
|
|
/* Subsumes old set_var_chunked */
|
2019-05-24 06:35:03 +08:00
|
|
|
/* Must make sure we do not override the default chunking when input is classic */
|
2018-07-27 10:16:02 +08:00
|
|
|
static int
|
|
|
|
copy_chunking(int igrp, int i_varid, int ogrp, int o_varid, int ndims, int inkind, int outkind)
|
|
|
|
{
|
|
|
|
int stat = NC_NOERR;
|
|
|
|
int innc4 = (inkind == NC_FORMAT_NETCDF4 || inkind == NC_FORMAT_NETCDF4_CLASSIC);
|
|
|
|
int outnc4 = (outkind == NC_FORMAT_NETCDF4 || outkind == NC_FORMAT_NETCDF4_CLASSIC);
|
2019-05-24 06:35:03 +08:00
|
|
|
VarID ovid;
|
|
|
|
char* ofqn = NULL;
|
2020-03-01 03:06:21 +08:00
|
|
|
int icontig = NC_CONTIGUOUS;
|
|
|
|
int ocontig = NC_CONTIGUOUS;
|
|
|
|
size_t ichunkp[NC_MAX_VAR_DIMS];
|
|
|
|
size_t ochunkp[NC_MAX_VAR_DIMS];
|
|
|
|
size_t dimlens[NC_MAX_VAR_DIMS];
|
2020-09-02 03:44:24 +08:00
|
|
|
size_t perdimchunklen[NC_MAX_VAR_DIMS]; /* the values of relevant -c dim/n specifications */
|
2020-06-25 06:34:32 +08:00
|
|
|
size_t dfaltchunkp[NC_MAX_VAR_DIMS]; /* default chunking for ovarid */
|
2020-03-01 03:06:21 +08:00
|
|
|
int is_unlimited = 0;
|
2018-07-27 10:16:02 +08:00
|
|
|
|
|
|
|
/* First, check the file kinds */
|
|
|
|
if(!outnc4)
|
|
|
|
return stat; /* no chunking */
|
|
|
|
|
2020-03-01 03:06:21 +08:00
|
|
|
memset(ichunkp,0,sizeof(ichunkp));
|
|
|
|
memset(ochunkp,0,sizeof(ochunkp));
|
|
|
|
memset(dimlens,0,sizeof(dimlens));
|
2020-09-02 03:44:24 +08:00
|
|
|
memset(perdimchunklen,0,sizeof(perdimchunklen));
|
2020-06-25 06:34:32 +08:00
|
|
|
memset(dfaltchunkp,0,sizeof(dfaltchunkp));
|
2020-03-01 03:06:21 +08:00
|
|
|
|
|
|
|
/* Get the chunking, if any, on the current input variable */
|
|
|
|
if(innc4) {
|
|
|
|
NC_CHECK(nc_inq_var_chunking(igrp, i_varid, &icontig, ichunkp));
|
|
|
|
/* pretend that this is the same as a -c option */
|
|
|
|
} else { /* !innc4 */
|
|
|
|
icontig = NC_CONTIGUOUS;
|
2020-12-08 02:29:12 +08:00
|
|
|
ichunkp[0] = 0;
|
2020-03-01 03:06:21 +08:00
|
|
|
}
|
2018-07-27 10:16:02 +08:00
|
|
|
|
|
|
|
/* If var specific chunking was specified for this output variable
|
|
|
|
then it overrides all else.
|
|
|
|
*/
|
|
|
|
|
2020-03-01 03:06:21 +08:00
|
|
|
/* Note, using goto done instead of nested if-then-else */
|
2018-07-27 10:16:02 +08:00
|
|
|
|
2020-03-01 03:06:21 +08:00
|
|
|
/* First decide whether the output should be contiguous */
|
|
|
|
/* Note: the chunkspecs are defined in terms of input variable+grp ids.
|
|
|
|
The grp may differ if !innc4 && outnc4 */
|
|
|
|
if(varchunkspec_omit(igrp,i_varid))
|
|
|
|
ocontig = NC_CONTIGUOUS;
|
|
|
|
else if(varchunkspec_exists(igrp,i_varid))
|
|
|
|
ocontig = varchunkspec_kind(igrp,i_varid);
|
|
|
|
else
|
|
|
|
ocontig = icontig;
|
|
|
|
|
|
|
|
/* Compute the chunk sizes even if we end up deciding not to chunk */
|
|
|
|
if(varchunkspec_exists(igrp,i_varid)
|
|
|
|
&& varchunkspec_kind(igrp,i_varid) == NC_CHUNKED)
|
|
|
|
memcpy(ochunkp,varchunkspec_chunksizes(igrp,i_varid),ndims*sizeof(size_t));
|
|
|
|
|
|
|
|
/* If any kind of output filter was specified, then not contiguous */
|
|
|
|
ovid.grpid = ogrp;
|
|
|
|
ovid.varid = o_varid;
|
|
|
|
if((stat=computeFQN(ovid,&ofqn))) goto done;
|
2021-02-01 06:10:39 +08:00
|
|
|
if(!varfilterssuppress(ofqn) && (option_deflate_level > 0 || varfiltersactive(ofqn)))
|
2020-03-01 03:06:21 +08:00
|
|
|
ocontig = NC_CHUNKED;
|
|
|
|
|
2020-09-02 03:44:24 +08:00
|
|
|
/* Consider dim-specific chunking; it does not override a per-variable chunk spec */
|
2019-05-24 06:35:03 +08:00
|
|
|
{
|
2018-07-27 10:16:02 +08:00
|
|
|
int idim;
|
|
|
|
/* size of a chunk: product of dimension chunksizes and size of value */
|
|
|
|
size_t csprod;
|
|
|
|
size_t typesize;
|
|
|
|
int dimids[NC_MAX_VAR_DIMS];
|
|
|
|
|
2019-05-24 06:35:03 +08:00
|
|
|
/* See if dim-specific chunking was suppressed */
|
2020-09-02 03:44:24 +08:00
|
|
|
if(dimchunkspec_omit()) { /* no chunking at all on output, except as overridden by e.g. compression */
|
|
|
|
ocontig = NC_CONTIGUOUS;
|
2020-03-01 03:06:21 +08:00
|
|
|
goto next2;
|
2020-09-02 03:44:24 +08:00
|
|
|
}
|
2018-07-27 10:16:02 +08:00
|
|
|
|
2020-03-01 03:06:21 +08:00
|
|
|
/* Setup for possible output chunking */
|
|
|
|
typesize = val_size(igrp, i_varid);
|
2018-07-27 10:16:02 +08:00
|
|
|
csprod = typesize;
|
|
|
|
memset(&dimids,0,sizeof(dimids));
|
|
|
|
|
|
|
|
/* Prepare to iterate over the dimids of this input variable */
|
|
|
|
NC_CHECK(nc_inq_vardimid(igrp, i_varid, dimids));
|
|
|
|
|
2020-09-02 03:44:24 +08:00
|
|
|
/* Capture dimension lengths for all dimensions of variable */
|
|
|
|
/* Also, capture per-dimension -c specs even if we decide to not chunk */
|
2018-07-27 10:16:02 +08:00
|
|
|
for(idim = 0; idim < ndims; idim++) {
|
|
|
|
int idimid = dimids[idim];
|
|
|
|
int odimid = dimmap_odimid(idimid);
|
2018-11-27 08:56:44 +08:00
|
|
|
|
2018-07-27 10:16:02 +08:00
|
|
|
/* Get input dimension length */
|
2020-03-01 03:06:21 +08:00
|
|
|
NC_CHECK(nc_inq_dimlen(igrp, idimid, &dimlens[idim]));
|
2018-07-27 10:16:02 +08:00
|
|
|
|
|
|
|
/* Check for unlimited */
|
|
|
|
if(dimmap_ounlim(odimid)) {
|
|
|
|
is_unlimited = 1;
|
2020-03-01 03:06:21 +08:00
|
|
|
ocontig = NC_CHUNKED; /* force chunking */
|
2018-07-27 10:16:02 +08:00
|
|
|
}
|
|
|
|
|
2020-03-01 03:06:21 +08:00
|
|
|
if(dimchunkspec_exists(idimid)) {
|
2020-09-02 03:44:24 +08:00
|
|
|
/* If the -c set a chunk size for this dimension, capture it */
|
|
|
|
perdimchunklen[idim] = dimchunkspec_size(idimid); /* Save it */
|
2020-03-01 03:06:21 +08:00
|
|
|
ocontig = NC_CHUNKED; /* force chunking */
|
|
|
|
}
|
2018-07-27 10:16:02 +08:00
|
|
|
|
|
|
|
/* For an unlimited dimension, cap the default at DFALTUNLIMSIZE (4 megabytes) worth of values */
|
|
|
|
if(is_unlimited) {
|
|
|
|
size_t mb4dimsize = DFALTUNLIMSIZE / typesize;
|
2020-03-01 03:06:21 +08:00
|
|
|
if(dimlens[idim] > mb4dimsize)
|
|
|
|
dimlens[idim] = mb4dimsize;
|
2018-07-27 10:16:02 +08:00
|
|
|
}
|
2020-03-01 03:06:21 +08:00
|
|
|
}
|
2020-06-25 06:34:32 +08:00
|
|
|
|
|
|
|
/* Get the current default chunking on the output variable */
|
2020-12-08 02:29:12 +08:00
|
|
|
/* Unfortunately, there is no way to get this info except by
|
2020-06-25 06:34:32 +08:00
|
|
|
forcing chunking */
|
|
|
|
if(ocontig == NC_CHUNKED) {
|
|
|
|
/* this may fail if chunking is not possible, in which case ignore */
|
|
|
|
int ret = nc_def_var_chunking(ogrp, o_varid, NC_CHUNKED, dfaltchunkp);
|
|
|
|
if(ret == NC_NOERR) {
|
|
|
|
int storage;
|
|
|
|
NC_CHECK(nc_inq_var_chunking(ogrp, o_varid, &storage, dfaltchunkp));
|
|
|
|
if(storage != NC_CHUNKED) return NC_EINTERNAL;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2020-09-02 03:44:24 +08:00
|
|
|
/* compute the final ochunksizes: precedence is output, per-dim-spec, input, defaults, dimlen */
|
2020-03-01 03:06:21 +08:00
|
|
|
for(idim = 0; idim < ndims; idim++) {
|
2020-09-02 03:44:24 +08:00
|
|
|
if(ochunkp[idim] == 0) { /* use -c dim/n if specified */
|
|
|
|
if(perdimchunklen[idim] != 0)
|
|
|
|
ochunkp[idim] = perdimchunklen[idim];
|
|
|
|
}
|
|
|
|
if(ochunkp[idim] == 0) { /* use input chunk size */
|
2020-03-01 03:06:21 +08:00
|
|
|
if(ichunkp[idim] != 0)
|
|
|
|
ochunkp[idim] = ichunkp[idim];
|
|
|
|
}
|
2020-09-02 03:44:24 +08:00
|
|
|
if(ochunkp[idim] == 0) { /* use chunk defaults */
|
2020-06-25 06:34:32 +08:00
|
|
|
if(dfaltchunkp[idim] != 0)
|
|
|
|
ochunkp[idim] = dfaltchunkp[idim];
|
|
|
|
}
|
2020-09-02 03:44:24 +08:00
|
|
|
if(ochunkp[idim] == 0) { /* last resort: use full dimension size */
|
2020-03-01 03:06:21 +08:00
|
|
|
if(dimlens[idim] != 0)
|
|
|
|
ochunkp[idim] = dimlens[idim];
|
|
|
|
}
|
|
|
|
if(ochunkp[idim] == 0) {stat = NC_EINTERNAL; goto done;}
|
2018-07-27 10:16:02 +08:00
|
|
|
/* accumulate the running chunk-size product (bytes) */
|
|
|
|
csprod *= ochunkp[idim];
|
|
|
|
}
|
2020-03-01 03:06:21 +08:00
|
|
|
/* if total chunksize is too small (and dim is not unlimited) => do not chunk */
|
|
|
|
if(csprod < option_min_chunk_bytes && !is_unlimited)
|
|
|
|
ocontig = NC_CONTIGUOUS; /* Force contiguous */
|
|
|
|
}
|
2018-07-27 10:16:02 +08:00
|
|
|
|
2019-05-24 06:35:03 +08:00
|
|
|
next2:
|
2020-03-01 03:06:21 +08:00
|
|
|
/* Apply the chunking, if any */
|
|
|
|
switch (ocontig) {
|
|
|
|
case NC_CHUNKED:
|
|
|
|
NC_CHECK(nc_def_var_chunking(ogrp, o_varid, NC_CHUNKED, ochunkp));
|
|
|
|
break;
|
|
|
|
case NC_CONTIGUOUS:
|
|
|
|
case NC_COMPACT:
|
|
|
|
NC_CHECK(nc_def_var_chunking(ogrp, o_varid, ocontig, NULL));
|
|
|
|
break;
|
|
|
|
default: stat = NC_EINVAL; goto done;
|
2019-07-15 05:18:03 +08:00
|
|
|
}
|
2018-07-27 10:16:02 +08:00
|
|
|
|
2019-07-15 05:18:03 +08:00
|
|
|
#ifdef USE_NETCDF4
|
|
|
|
#ifdef DEBUGFILTER
|
|
|
|
{ int d;
|
|
|
|
size_t chunksizes[NC_MAX_VAR_DIMS];
|
|
|
|
char name[NC_MAX_NAME];
unsigned long long totalsize = 1; /* running product of the output chunk lengths */
|
2020-03-01 03:06:21 +08:00
|
|
|
if(ocontig == NC_CONTIGUOUS) {
|
2019-07-15 05:18:03 +08:00
|
|
|
fprintf(stderr,"contig]\n");
|
2020-03-01 03:06:21 +08:00
|
|
|
} else if(ocontig == NC_COMPACT) {
|
|
|
|
fprintf(stderr,"compact]\n");
|
2018-07-27 10:16:02 +08:00
|
|
|
} else {
|
2019-07-15 05:18:03 +08:00
|
|
|
for(d=0;d<ndims;d++) {
|
2020-03-01 03:06:21 +08:00
|
|
|
totalsize *= ochunkp[d];
|
2019-07-15 05:18:03 +08:00
|
|
|
if(d > 0) fprintf(stderr,",");
|
2020-03-01 03:06:21 +08:00
|
|
|
fprintf(stderr,"%lu",(unsigned long)ochunkp[d]);
|
2019-07-15 05:18:03 +08:00
|
|
|
}
|
|
|
|
fprintf(stderr,"]=%llu\n",totalsize);
|
2018-07-27 10:16:02 +08:00
|
|
|
}
|
2019-07-15 05:18:03 +08:00
|
|
|
fflush(stderr);
|
|
|
|
}
|
|
|
|
#endif /*DEBUGFILTER*/
|
|
|
|
#endif /*USE_NETCDF4*/
|
2020-12-08 02:29:12 +08:00
|
|
|
|
2018-07-27 10:16:02 +08:00
|
|
|
done:
|
2019-05-24 06:35:03 +08:00
|
|
|
if(ofqn) free(ofqn);
|
2018-07-27 10:16:02 +08:00
|
|
|
return stat;
|
|
|
|
}
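As an aside, the precedence applied in copy_chunking above (explicit per-variable spec, then a -c dim/n spec, then the input file's chunking, then the library default, then the full dimension length) amounts to "first nonzero candidate wins" for each dimension. A hypothetical sketch, with made-up parameter names:
/* Illustrative only; not part of nccopy. */
static size_t
example_pick_chunklen(size_t varspec, size_t dimspec, size_t inchunk,
                      size_t dfalt, size_t dimlen)
{
    if(varspec != 0) return varspec;  /* per-variable -c var:n,... spec */
    if(dimspec != 0) return dimspec;  /* per-dimension -c dim/n spec */
    if(inchunk != 0) return inchunk;  /* chunk length from the input file */
    if(dfalt != 0) return dfalt;      /* library default chunking */
    return dimlen;                    /* last resort: full dimension length */
}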
|
|
|
|
|
|
|
|
/* Copy all netCDF-4 specific variable properties such as chunking,
|
|
|
|
* endianness, deflation, checksumming, fill, etc. */
|
|
|
|
static int
|
|
|
|
copy_var_specials(int igrp, int varid, int ogrp, int o_varid, int inkind, int outkind)
|
|
|
|
{
|
|
|
|
int stat = NC_NOERR;
|
|
|
|
int innc4 = (inkind == NC_FORMAT_NETCDF4 || inkind == NC_FORMAT_NETCDF4_CLASSIC);
|
|
|
|
int outnc4 = (outkind == NC_FORMAT_NETCDF4 || outkind == NC_FORMAT_NETCDF4_CLASSIC);
|
2019-02-09 09:48:17 +08:00
|
|
|
int deflated = 0; /* true iff deflation is applied */
|
2020-03-03 06:10:54 +08:00
|
|
|
int ndims;
|
2021-02-01 06:10:39 +08:00
|
|
|
char* ofqn = NULL;
|
|
|
|
int nofilters = 0;
|
|
|
|
VarID ovid = {ogrp,o_varid};
|
2018-07-27 10:16:02 +08:00
|
|
|
|
|
|
|
if(!outnc4)
|
|
|
|
return stat; /* Ignore non-netcdf4 files */
|
|
|
|
|
|
|
|
{ /* handle chunking parameters */
|
|
|
|
NC_CHECK(nc_inq_varndims(igrp, varid, &ndims));
|
|
|
|
if (ndims > 0) { /* no chunking for scalar variables */
|
|
|
|
NC_CHECK(copy_chunking(igrp, varid, ogrp, o_varid, ndims, inkind, outkind));
|
|
|
|
}
|
|
|
|
}
|
2021-02-01 06:10:39 +08:00
|
|
|
|
|
|
|
if((stat=computeFQN(ovid,&ofqn))) goto done;
|
|
|
|
nofilters = varfilterssuppress(ofqn);
|
|
|
|
|
|
|
|
if(ndims > 0 && !nofilters)
|
2018-07-27 10:16:02 +08:00
|
|
|
{ /* handle compression parameters, copying from input, overriding
|
|
|
|
* with command-line options */
|
|
|
|
int shuffle_in=0, deflate_in=0, deflate_level_in=0;
|
|
|
|
int shuffle_out=0, deflate_out=0, deflate_level_out=0;
|
|
|
|
if(innc4) { /* See if the input variable has deflation applied */
|
|
|
|
NC_CHECK(nc_inq_var_deflate(igrp, varid, &shuffle_in, &deflate_in, &deflate_level_in));
|
|
|
|
}
|
|
|
|
if(option_deflate_level == -1) {
|
|
|
|
/* not specified by -d flag, copy input compression and shuffling */
|
|
|
|
shuffle_out = shuffle_in;
|
|
|
|
deflate_out = deflate_in;
|
|
|
|
deflate_level_out = deflate_level_in;
|
|
|
|
} else if(option_deflate_level > 0) { /* change to specified compression, shuffling */
|
|
|
|
shuffle_out = option_shuffle_vars;
|
|
|
|
deflate_out=1;
|
|
|
|
deflate_level_out = option_deflate_level;
|
|
|
|
} else if(option_deflate_level == 0) { /* special case; force off */
|
|
|
|
shuffle_out = 0;
|
|
|
|
deflate_out = 0;
|
|
|
|
deflate_level_out = 0;
|
|
|
|
}
|
2021-02-01 06:10:39 +08:00
|
|
|
/* Apply output deflation (unless suppressed) */
|
2018-07-27 10:16:02 +08:00
|
|
|
if(outnc4) {
|
|
|
|
/* Note that calling nc_def_var_deflate with shuffle and deflate flags both 0
|
|
|
|
would still turn on default chunking, so only call it when one of them is nonzero. */
|
|
|
|
if(shuffle_out != 0 || deflate_out != 0)
|
|
|
|
NC_CHECK(nc_def_var_deflate(ogrp, o_varid, shuffle_out, deflate_out, deflate_level_out));
|
2019-02-09 09:48:17 +08:00
|
|
|
deflated = deflate_out;
|
2018-07-27 10:16:02 +08:00
|
|
|
}
|
|
|
|
}
|
2021-02-01 06:10:39 +08:00
|
|
|
if(!nofilters && innc4 && outnc4 && ndims > 0)
|
2018-07-27 10:16:02 +08:00
|
|
|
{ /* handle checksum parameters */
|
|
|
|
int fletcher32 = 0;
|
|
|
|
NC_CHECK(nc_inq_var_fletcher32(igrp, varid, &fletcher32));
|
|
|
|
if(fletcher32 != 0) {
|
|
|
|
NC_CHECK(nc_def_var_fletcher32(ogrp, o_varid, fletcher32));
|
|
|
|
}
|
|
|
|
}
|
|
|
|
if(innc4 && outnc4)
|
|
|
|
{ /* handle endianness */
|
|
|
|
int endianness = 0;
|
|
|
|
NC_CHECK(nc_inq_var_endian(igrp, varid, &endianness));
|
|
|
|
if(endianness != NC_ENDIAN_NATIVE) { /* native is the default */
|
|
|
|
NC_CHECK(nc_def_var_endian(ogrp, o_varid, endianness));
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2021-02-01 06:10:39 +08:00
|
|
|
if(!nofilters && !deflated && ndims > 0) {
|
2019-02-09 09:48:17 +08:00
|
|
|
/* handle other general filters */
|
|
|
|
NC_CHECK(copy_var_filter(igrp, varid, ogrp, o_varid, inkind, outkind));
|
|
|
|
}
|
2021-02-01 06:10:39 +08:00
|
|
|
done:
|
|
|
|
if(ofqn) free(ofqn);
|
2018-07-27 10:16:02 +08:00
|
|
|
return stat;
|
|
|
|
}
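For orientation, the per-variable specials handled above map onto a handful of netCDF-4 define-mode calls. A minimal, hypothetical sketch of issuing them directly (the shuffle flag, deflate level, and endianness are arbitrary example values, and the output file is assumed to be netCDF-4 and still in define mode):
/* Illustrative only; not part of nccopy. */
static int
example_set_var_specials(int ncid, int varid)
{
    int stat;
    /* shuffle + zlib deflation at level 2 (example values) */
    if((stat = nc_def_var_deflate(ncid, varid, 1, 1, 2))) return stat;
    /* fletcher32 checksumming */
    if((stat = nc_def_var_fletcher32(ncid, varid, NC_FLETCHER32))) return stat;
    /* explicit little-endian storage */
    if((stat = nc_def_var_endian(ncid, varid, NC_ENDIAN_LITTLE))) return stat;
    return NC_NOERR;
}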
|
|
|
|
|
|
|
|
#if 0
|
|
|
|
Subsumed into copy_chunking.
|
2011-01-10 05:41:07 +08:00
|
|
|
/* Set output variable o_varid (in group ogrp) to use chunking
|
|
|
|
* specified on command line, only called for classic format input and
|
2011-01-18 06:15:26 +08:00
|
|
|
* netCDF-4 format output, so no existing chunk lengths to override. */
|
2011-01-10 05:41:07 +08:00
|
|
|
static int
|
|
|
|
set_var_chunked(int ogrp, int o_varid)
|
|
|
|
{
|
|
|
|
int stat = NC_NOERR;
|
|
|
|
int ndims;
|
|
|
|
int odim;
|
2011-01-18 06:15:26 +08:00
|
|
|
size_t chunk_threshold = CHUNK_THRESHOLD;
|
2011-01-10 05:41:07 +08:00
|
|
|
|
2018-07-27 10:16:02 +08:00
|
|
|
if(dimchunkspec_ndims() == 0) /* no chunking specified on command line */
|
2011-01-10 05:41:07 +08:00
|
|
|
return stat;
|
|
|
|
NC_CHECK(nc_inq_varndims(ogrp, o_varid, &ndims));
|
|
|
|
|
|
|
|
if (ndims > 0) { /* no chunking for scalar variables */
|
|
|
|
int chunked = 0;
|
|
|
|
int *dimids = (int *) emalloc(ndims * sizeof(int));
|
2011-01-18 06:15:26 +08:00
|
|
|
size_t varsize;
|
|
|
|
nc_type vartype;
|
|
|
|
size_t value_size;
|
|
|
|
int is_unlimited = 0;
|
|
|
|
|
2011-01-10 05:41:07 +08:00
|
|
|
NC_CHECK(nc_inq_vardimid (ogrp, o_varid, dimids));
|
2011-01-18 06:15:26 +08:00
|
|
|
NC_CHECK(nc_inq_vartype(ogrp, o_varid, &vartype));
|
|
|
|
/* from type, get size in memory needed for each value */
|
|
|
|
NC_CHECK(nc_inq_type(ogrp, vartype, NULL, &value_size));
|
|
|
|
varsize = value_size;
|
|
|
|
|
2011-01-10 05:41:07 +08:00
|
|
|
/* Determine if this variable should be chunked. A variable
|
|
|
|
* should be chunked if any of its dims are in command-line
|
2011-07-13 03:06:00 +08:00
|
|
|
* chunk spec. It will also be chunked if any of its
|
2011-01-10 05:41:07 +08:00
|
|
|
* dims are unlimited. */
|
|
|
|
for(odim = 0; odim < ndims; odim++) {
|
|
|
|
int odimid = dimids[odim];
|
2011-01-18 06:15:26 +08:00
|
|
|
int idimid = dimmap_idimid(odimid); /* corresponding dimid in input file */
|
|
|
|
if(dimmap_ounlim(odimid))
|
2015-01-02 22:09:16 +08:00
|
|
|
is_unlimited = 1; /* whether variable is unlimited */
|
2011-01-10 05:41:07 +08:00
|
|
|
if(idimid != -1) {
|
2018-07-27 10:16:02 +08:00
|
|
|
size_t chunksize = dimchunkspec_size(idimid); /* from chunkspec */
|
2011-01-10 05:41:07 +08:00
|
|
|
size_t dimlen;
|
|
|
|
NC_CHECK(nc_inq_dimlen(ogrp, odimid, &dimlen));
|
2011-07-13 03:06:00 +08:00
|
|
|
if( (chunksize > 0) || dimmap_ounlim(odimid)) {
|
2017-09-01 06:16:21 +08:00
|
|
|
chunked = 1;
|
2011-01-10 05:41:07 +08:00
|
|
|
}
|
2015-01-02 22:09:16 +08:00
|
|
|
if(dimlen > 0) { /* dimlen for unlimited dims is still 0 before copying data */
|
|
|
|
varsize *= dimlen;
|
|
|
|
}
|
2011-01-10 05:41:07 +08:00
|
|
|
}
|
|
|
|
}
|
2011-01-18 06:15:26 +08:00
|
|
|
/* Don't chunk small variables that don't use an unlimited
|
|
|
|
* dimension. */
|
|
|
|
if(varsize < chunk_threshold && !is_unlimited)
|
|
|
|
chunked = 0;
|
|
|
|
|
2011-01-10 05:41:07 +08:00
|
|
|
if(chunked) {
|
2011-01-18 06:15:26 +08:00
|
|
|
/* Allocate chunksizes and set defaults to dimsize for any
|
2015-01-02 22:09:16 +08:00
|
|
|
* dimensions not mentioned in chunkspec, except use 1 for unlimited dims. */
|
2011-01-10 05:41:07 +08:00
|
|
|
size_t *chunkp = (size_t *) emalloc(ndims * sizeof(size_t));
|
|
|
|
for(odim = 0; odim < ndims; odim++) {
|
|
|
|
int odimid = dimids[odim];
|
|
|
|
int idimid = dimmap_idimid(odimid);
|
2018-07-27 10:16:02 +08:00
|
|
|
size_t chunksize = dimchunkspec_size(idimid);
|
2011-01-10 05:41:07 +08:00
|
|
|
if(chunksize > 0) {
|
|
|
|
chunkp[odim] = chunksize;
|
|
|
|
} else {
|
2015-01-02 22:09:16 +08:00
|
|
|
if(dimmap_ounlim(odimid)){
|
|
|
|
chunkp[odim] = 1;
|
|
|
|
} else {
|
|
|
|
NC_CHECK(nc_inq_dimlen(ogrp, odimid, &chunkp[odim]));
|
|
|
|
}
|
2011-01-10 05:41:07 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
NC_CHECK(nc_def_var_chunking(ogrp, o_varid, NC_CHUNKED, chunkp));
|
|
|
|
free(chunkp);
|
|
|
|
}
|
|
|
|
free(dimids);
|
|
|
|
}
|
|
|
|
return stat;
|
|
|
|
}
|
2018-07-27 10:16:02 +08:00
|
|
|
#endif
|
2011-01-10 05:41:07 +08:00
|
|
|
|
2018-07-27 10:16:02 +08:00
|
|
|
#if 0
|
2011-01-10 05:41:07 +08:00
|
|
|
/* Set variable to compression specified on command line */
|
2010-08-29 23:08:12 +08:00
|
|
|
static int
|
2010-09-02 03:57:36 +08:00
|
|
|
set_var_compressed(int ogrp, int o_varid)
|
2010-08-29 23:08:12 +08:00
|
|
|
{
|
|
|
|
int stat = NC_NOERR;
|
2014-03-22 02:34:19 +08:00
|
|
|
if (option_deflate_level > 0) {
|
2010-09-03 05:01:27 +08:00
|
|
|
int deflate = 1;
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_def_var_deflate(ogrp, o_varid, option_shuffle_vars, deflate, option_deflate_level));
|
2010-08-29 23:08:12 +08:00
|
|
|
}
|
|
|
|
return stat;
|
|
|
|
}
|
2018-07-27 10:16:02 +08:00
|
|
|
#endif
|
2010-08-29 23:08:12 +08:00
|
|
|
|
2010-06-03 21:24:43 +08:00
|
|
|
/* Release the variable chunk cache allocated for variable varid in
|
2011-07-08 02:36:00 +08:00
|
|
|
* group grp. This is not necessary, but will save some memory when
|
|
|
|
* processing one variable at a time. */
|
2012-01-30 02:56:29 +08:00
|
|
|
#ifdef UNUSED
|
2010-06-03 21:24:43 +08:00
|
|
|
static int
|
2011-07-08 02:36:00 +08:00
|
|
|
free_var_chunk_cache(int grp, int varid)
|
2010-06-03 21:24:43 +08:00
|
|
|
{
|
|
|
|
int stat = NC_NOERR;
|
|
|
|
size_t chunk_cache_size = 1;
|
|
|
|
size_t cache_nelems = 1;
|
|
|
|
float cache_preemp = 0;
|
2011-07-08 02:36:00 +08:00
|
|
|
int kind;
|
|
|
|
NC_CHECK(nc_inq_format(grp, &kind));
|
|
|
|
if(kind == NC_FORMAT_NETCDF4 || kind == NC_FORMAT_NETCDF4_CLASSIC) {
|
2020-03-01 03:06:21 +08:00
|
|
|
int contig = NC_CONTIGUOUS;
|
2011-07-08 02:36:00 +08:00
|
|
|
NC_CHECK(nc_inq_var_chunking(grp, varid, &contig, NULL));
|
2020-03-01 03:06:21 +08:00
|
|
|
if(contig == NC_CHUNKED) { /* chunked */
|
2011-07-08 02:36:00 +08:00
|
|
|
NC_CHECK(nc_set_var_chunk_cache(grp, varid, chunk_cache_size, cache_nelems, cache_preemp));
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
return stat;
|
|
|
|
}
|
2012-01-30 02:56:29 +08:00
|
|
|
#endif
|
|
|
|
|
2010-06-03 21:24:43 +08:00
|
|
|
#endif /* USE_NETCDF4 */
|
|
|
|
|
2011-01-10 05:41:07 +08:00
|
|
|
/* Copy dimensions from group igrp to group ogrp, also associate input
|
|
|
|
* dimids with output dimids (they need not match, because the input
|
2011-01-18 06:15:26 +08:00
|
|
|
* dimensions may have been defined in a different order than we define
|
2011-01-10 05:41:07 +08:00
|
|
|
* the output dimensions here). */
|
2010-06-03 21:24:43 +08:00
|
|
|
static int
|
2011-01-10 05:41:07 +08:00
|
|
|
copy_dims(int igrp, int ogrp)
|
2010-06-03 21:24:43 +08:00
|
|
|
{
|
|
|
|
int stat = NC_NOERR;
|
|
|
|
int ndims;
|
|
|
|
int dgrp;
|
|
|
|
#ifdef USE_NETCDF4
|
2012-01-30 02:56:29 +08:00
|
|
|
int nunlims;
|
2010-08-02 01:16:08 +08:00
|
|
|
int *dimids;
|
|
|
|
int *unlimids;
|
2010-06-03 21:24:43 +08:00
|
|
|
#else
|
|
|
|
int unlimid;
|
2017-09-01 06:16:21 +08:00
|
|
|
#endif /* USE_NETCDF4 */
|
2010-06-03 21:24:43 +08:00
|
|
|
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_ndims(igrp, &ndims));
|
2010-06-03 21:24:43 +08:00
|
|
|
|
|
|
|
#ifdef USE_NETCDF4
|
|
|
|
/* In netCDF-4 files, dimids may not be sequential because they
|
|
|
|
* may be defined in various groups, and we are only looking at one
|
|
|
|
* group at a time. */
|
|
|
|
/* Find the dimension ids in this group, don't include parents. */
|
2023-11-25 02:20:52 +08:00
|
|
|
dimids = (int *) emalloc((size_t)(ndims + 1) * sizeof(int));
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_dimids(igrp, NULL, dimids, 0));
|
2010-06-03 21:24:43 +08:00
|
|
|
/* Find the number of unlimited dimensions and get their IDs */
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_unlimdims(igrp, &nunlims, NULL));
|
2023-11-25 02:20:52 +08:00
|
|
|
unlimids = (int *) emalloc((size_t)(nunlims + 1) * sizeof(int));
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_unlimdims(igrp, NULL, unlimids));
|
2010-06-03 21:24:43 +08:00
|
|
|
#else
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_unlimdim(igrp, &unlimid));
|
2010-06-03 21:24:43 +08:00
|
|
|
#endif /* USE_NETCDF4 */
|
|
|
|
|
|
|
|
/* Copy each dimension to output, including unlimited dimension(s) */
|
|
|
|
for (dgrp = 0; dgrp < ndims; dgrp++) {
|
|
|
|
char name[NC_MAX_NAME];
|
|
|
|
size_t length;
|
2011-01-10 05:41:07 +08:00
|
|
|
int i_is_unlim;
|
|
|
|
int o_is_unlim;
|
2010-12-31 02:17:04 +08:00
|
|
|
int idimid, odimid;
|
2012-01-30 02:56:29 +08:00
|
|
|
#ifdef USE_NETCDF4
|
|
|
|
int uld;
|
|
|
|
#endif
|
2010-06-03 21:24:43 +08:00
|
|
|
|
2011-01-10 05:41:07 +08:00
|
|
|
i_is_unlim = 0;
|
2010-06-03 21:24:43 +08:00
|
|
|
#ifdef USE_NETCDF4
|
2010-12-31 02:17:04 +08:00
|
|
|
idimid = dimids[dgrp];
|
2010-06-03 21:24:43 +08:00
|
|
|
for (uld = 0; uld < nunlims; uld++) {
|
2010-12-31 02:17:04 +08:00
|
|
|
if(idimid == unlimids[uld]) {
|
2011-01-10 05:41:07 +08:00
|
|
|
i_is_unlim = 1;
|
2010-06-03 21:24:43 +08:00
|
|
|
break;
|
2017-09-01 06:16:21 +08:00
|
|
|
}
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
|
|
|
#else
|
2010-12-31 02:17:04 +08:00
|
|
|
idimid = dgrp;
|
|
|
|
if(unlimid != -1 && (idimid == unlimid)) {
|
2011-01-10 05:41:07 +08:00
|
|
|
i_is_unlim = 1;
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
|
|
|
#endif /* USE_NETCDF4 */
|
|
|
|
|
2010-12-31 02:17:04 +08:00
|
|
|
stat = nc_inq_dim(igrp, idimid, name, &length);
|
2010-06-03 21:24:43 +08:00
|
|
|
if (stat == NC_EDIMSIZE && sizeof(size_t) < 8) {
|
2011-01-06 07:48:47 +08:00
|
|
|
error("dimension \"%s\" requires 64-bit platform", name);
|
2017-09-01 06:16:21 +08:00
|
|
|
}
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(stat);
|
2011-01-10 05:41:07 +08:00
|
|
|
o_is_unlim = i_is_unlim;
|
|
|
|
if(i_is_unlim && !option_fix_unlimdims) {
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_def_dim(ogrp, name, NC_UNLIMITED, &odimid));
|
2010-06-03 21:24:43 +08:00
|
|
|
} else {
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_def_dim(ogrp, name, length, &odimid));
|
2011-01-10 05:41:07 +08:00
|
|
|
o_is_unlim = 0;
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
2011-01-10 05:41:07 +08:00
|
|
|
/* Store (idimid, odimid) mapping for later use, also whether unlimited */
|
|
|
|
dimmap_store(idimid, odimid, i_is_unlim, o_is_unlim);
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
2010-08-02 01:16:08 +08:00
|
|
|
#ifdef USE_NETCDF4
|
|
|
|
free(dimids);
|
|
|
|
free(unlimids);
|
2017-09-01 06:16:21 +08:00
|
|
|
#endif /* USE_NETCDF4 */
|
2010-06-03 21:24:43 +08:00
|
|
|
return stat;
|
|
|
|
}
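The dimmap_* helpers used above are defined elsewhere in the nccopy sources; a simplified sketch of the idea they implement, recording each (input dimid, output dimid) pair so later code can translate ids, might look like this (the fixed bound and all names are hypothetical):
/* Illustrative only; not part of nccopy. */
#define EXAMPLE_MAX_DIMMAP 1024
static int example_idims[EXAMPLE_MAX_DIMMAP];
static int example_odims[EXAMPLE_MAX_DIMMAP];
static int example_ndimmap = 0;

static void
example_dimmap_store(int idimid, int odimid)
{
    if(example_ndimmap < EXAMPLE_MAX_DIMMAP) {
        example_idims[example_ndimmap] = idimid;
        example_odims[example_ndimmap] = odimid;
        example_ndimmap++;
    }
}

static int
example_dimmap_odimid(int idimid)   /* returns -1 if no mapping was recorded */
{
    int i;
    for(i = 0; i < example_ndimmap; i++)
        if(example_idims[i] == idimid) return example_odims[i];
    return -1;
}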
|
|
|
|
|
|
|
|
/* Copy the attributes for variable ivar in group igrp to variable
|
|
|
|
* ovar in group ogrp. Global (group) attributes are specified by
|
|
|
|
* using the varid NC_GLOBAL */
|
|
|
|
static int
|
|
|
|
copy_atts(int igrp, int ivar, int ogrp, int ovar)
|
|
|
|
{
|
|
|
|
int natts;
|
|
|
|
int iatt;
|
|
|
|
int stat = NC_NOERR;
|
|
|
|
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_varnatts(igrp, ivar, &natts));
|
2017-09-01 06:16:21 +08:00
|
|
|
|
2010-06-03 21:24:43 +08:00
|
|
|
for(iatt = 0; iatt < natts; iatt++) {
|
|
|
|
char name[NC_MAX_NAME];
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_attname(igrp, ivar, iatt, name));
|
2018-11-27 08:56:44 +08:00
|
|
|
if(!strcmp(name,"_NCProperties"))
|
|
|
|
return stat;
|
|
|
|
|
|
|
|
NC_CHECK(nc_copy_att(igrp, ivar, name, ogrp, ovar));
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
|
|
|
return stat;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* copy the schema for a single variable in group igrp to group ogrp */
|
|
|
|
static int
|
2011-01-10 05:41:07 +08:00
|
|
|
copy_var(int igrp, int varid, int ogrp)
|
2010-06-03 21:24:43 +08:00
|
|
|
{
|
|
|
|
int stat = NC_NOERR;
|
|
|
|
int ndims;
|
2010-08-02 01:16:08 +08:00
|
|
|
int *idimids; /* ids of dims for input variable */
|
|
|
|
int *odimids; /* ids of dims for output variable */
|
2010-06-03 21:24:43 +08:00
|
|
|
char name[NC_MAX_NAME];
|
|
|
|
nc_type typeid, o_typeid;
|
|
|
|
int natts;
|
|
|
|
int i;
|
|
|
|
int o_varid;
|
|
|
|
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_varndims(igrp, varid, &ndims));
|
2023-11-25 02:20:52 +08:00
|
|
|
idimids = (int *) emalloc((size_t)(ndims + 1) * sizeof(int));
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_var(igrp, varid, name, &typeid, NULL, idimids, &natts));
|
2010-06-03 21:24:43 +08:00
|
|
|
o_typeid = typeid;
|
|
|
|
#ifdef USE_NETCDF4
|
|
|
|
if (typeid > NC_STRING) { /* user-defined type */
|
|
|
|
/* type ids in source don't necessarily correspond to same
|
|
|
|
* typeids in destination, so look up destination typeid by
|
|
|
|
* using type name */
|
|
|
|
char type_name[NC_MAX_NAME];
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_type(igrp, typeid, type_name, NULL));
|
|
|
|
NC_CHECK(nc_inq_typeid(ogrp, type_name, &o_typeid));
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
|
|
|
#endif /* USE_NETCDF4 */
|
|
|
|
|
|
|
|
/* get the corresponding dimids in the output file */
|
2023-11-25 02:20:52 +08:00
|
|
|
odimids = (int *) emalloc((size_t)(ndims + 1) * sizeof(int));
|
2010-06-03 21:24:43 +08:00
|
|
|
for(i = 0; i < ndims; i++) {
|
2011-01-10 05:41:07 +08:00
|
|
|
odimids[i] = dimmap_odimid(idimids[i]);
|
|
|
|
if(odimids[i] == -1) {
|
|
|
|
error("Oops, no dimension in output associated with input dimid %d", idimids[i]);
|
|
|
|
}
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/* define the output variable */
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_def_var(ogrp, name, o_typeid, ndims, odimids, &o_varid));
|
2010-06-03 21:24:43 +08:00
|
|
|
/* attach the variable attributes to the output variable */
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(copy_atts(igrp, varid, ogrp, o_varid));
|
2017-09-01 06:16:21 +08:00
|
|
|
#ifdef USE_NETCDF4
|
2010-06-03 21:24:43 +08:00
|
|
|
{
|
|
|
|
int inkind;
|
|
|
|
int outkind;
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_format(igrp, &inkind));
|
|
|
|
NC_CHECK(nc_inq_format(ogrp, &outkind));
|
2018-07-27 10:16:02 +08:00
|
|
|
/* Copy all variable properties such as
|
|
|
|
* chunking, endianness, deflation, checksumming, fill, etc.
|
|
|
|
* Ok to call if outkind is netcdf-3
|
|
|
|
*/
|
|
|
|
NC_CHECK(copy_var_specials(igrp, varid, ogrp, o_varid, inkind, outkind));
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
|
|
|
#endif /* USE_NETCDF4 */
|
2010-08-02 01:16:08 +08:00
|
|
|
free(idimids);
|
|
|
|
free(odimids);
|
2010-06-03 21:24:43 +08:00
|
|
|
return stat;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* copy the schema for all the variables in group igrp to group ogrp */
|
|
|
|
static int
|
2011-01-10 05:41:07 +08:00
|
|
|
copy_vars(int igrp, int ogrp)
|
2010-06-03 21:24:43 +08:00
|
|
|
{
|
|
|
|
int stat = NC_NOERR;
|
|
|
|
int nvars;
|
|
|
|
int varid;
|
2013-01-24 01:45:29 +08:00
|
|
|
|
|
|
|
int iv; /* variable number */
|
|
|
|
idnode_t* vlist = 0; /* list for vars specified with -v option */
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If any vars were specified with -v option, get list of
|
|
|
|
* associated variable ids relative to this group. Assume vars
|
|
|
|
* specified with syntax like "grp1/grp2/varname" or
|
|
|
|
* "/grp1/grp2/varname" if they are in groups.
|
|
|
|
*/
|
|
|
|
vlist = newidlist(); /* list for vars specified with -v option */
|
|
|
|
for (iv=0; iv < option_nlvars; iv++) {
|
|
|
|
if(nc_inq_gvarid(igrp, option_lvars[iv], &varid) == NC_NOERR)
|
|
|
|
idadd(vlist, varid);
|
|
|
|
}
|
2017-09-01 06:16:21 +08:00
|
|
|
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_nvars(igrp, &nvars));
|
2010-06-03 21:24:43 +08:00
|
|
|
for (varid = 0; varid < nvars; varid++) {
|
2013-01-24 01:45:29 +08:00
|
|
|
if (!option_varstruct && option_nlvars > 0 && ! idmember(vlist, varid))
|
|
|
|
continue;
|
2011-01-10 05:41:07 +08:00
|
|
|
NC_CHECK(copy_var(igrp, varid, ogrp));
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
Corrected "BAIL" macros to avoid infinite loop when logging is disabled and an
error occurs after an "exit:" label.
Corrected a dozen Coverity errors (mainly allocation issues, along with a few
other things):
711711, 711802, 711803, 711905, 970825, 996123, 996124, 1025787,
1047274, 1130013, 1130014, 1139538
Refactored internal fill-value code to correctly handle string types, and
especially to allow NULL pointers and null strings (i.e. "") to be
distinguished. The code now avoids partially aliasing the two together
(which only happened on the 'write' side of things and wasn't reflected on
the 'read' side, adding to the previous confusion).
Probably still weak on handling fill-values of variable-length and compound
datatypes.
Refactored the recursive metadata reads a bit more, to process HDF5 named
datatypes and datasets immediately, avoiding chewing up memory for those
types of objects, etc.
Finished uncommenting and updating the nc_test4/tst_fills2.c code (as I'm
proceeding alphabetically through the nc_test4 code files).
2013-12-29 15:12:43 +08:00
|
|
|
freeidlist(vlist);
|
2010-06-03 21:24:43 +08:00
|
|
|
return stat;
|
|
|
|
}
|
|
|
|
|
2020-12-08 02:29:12 +08:00
|
|
|
#if DEBUGCHUNK
|
2020-04-30 04:34:30 +08:00
|
|
|
static void
|
|
|
|
report(int rank, size_t* start, size_t* count, void* buf)
|
|
|
|
{
|
|
|
|
int i;
|
2020-12-08 02:29:12 +08:00
|
|
|
size_t prod = 1;
|
2020-04-30 04:34:30 +08:00
|
|
|
for(i=0;i<rank;i++) prod *= count[i];
|
|
|
|
fprintf(stderr,"start=");
|
2020-12-08 02:29:12 +08:00
|
|
|
for(i=0;i<rank;i++)
|
2020-04-30 04:34:30 +08:00
|
|
|
fprintf(stderr,"%s%ld",(i==0?"(":" "),(long)start[i]);
|
|
|
|
fprintf(stderr,")");
|
|
|
|
fprintf(stderr," count=");
|
2020-12-08 02:29:12 +08:00
|
|
|
for(i=0;i<rank;i++)
|
2020-04-30 04:34:30 +08:00
|
|
|
fprintf(stderr,"%s%ld",(i==0?"(":" "),(long)count[i]);
|
|
|
|
fprintf(stderr,")");
|
|
|
|
fprintf(stderr," data=");
|
2020-12-08 02:29:12 +08:00
|
|
|
for(i=0;i<prod;i++)
|
2020-04-30 04:34:30 +08:00
|
|
|
fprintf(stderr,"%s%d",(i==0?"(":" "),((int*)buf)[i]);
|
|
|
|
fprintf(stderr,"\n");
|
|
|
|
fflush(stderr);
|
|
|
|
}
|
2020-06-25 06:34:32 +08:00
|
|
|
#endif
|
2020-04-30 04:34:30 +08:00
|
|
|
|
2010-06-03 21:24:43 +08:00
|
|
|
/* Copy the schema in a group and all its subgroups, recursively, from
|
2010-12-31 02:17:04 +08:00
|
|
|
* group igrp in input to parent group ogrp in destination. Use
|
|
|
|
* dimmap array to map input dimids to output dimids. */
|
2010-06-03 21:24:43 +08:00
|
|
|
static int
|
2017-09-01 06:16:21 +08:00
|
|
|
copy_schema(int igrp, int ogrp)
|
2010-06-03 21:24:43 +08:00
|
|
|
{
|
|
|
|
int stat = NC_NOERR;
|
|
|
|
int ogid; /* like igrp but in output file */
|
|
|
|
|
|
|
|
/* get groupid in output corresponding to group igrp in input,
|
|
|
|
* given parent group (or root group) ogrp in output */
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(get_grpid(igrp, ogrp, &ogid));
|
|
|
|
|
2011-01-10 05:41:07 +08:00
|
|
|
NC_CHECK(copy_dims(igrp, ogid));
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(copy_atts(igrp, NC_GLOBAL, ogid, NC_GLOBAL));
|
2011-01-10 05:41:07 +08:00
|
|
|
NC_CHECK(copy_vars(igrp, ogid));
|
2017-09-01 06:16:21 +08:00
|
|
|
#ifdef USE_NETCDF4
|
2010-06-03 21:24:43 +08:00
|
|
|
{
|
|
|
|
int numgrps;
|
|
|
|
int *grpids;
|
2012-01-30 02:56:29 +08:00
|
|
|
int i;
|
2010-06-03 21:24:43 +08:00
|
|
|
/* Copy schema from subgroups */
|
|
|
|
stat = nc_inq_grps(igrp, &numgrps, NULL);
|
2023-11-25 02:20:52 +08:00
|
|
|
grpids = (int *)emalloc((size_t)(numgrps + 1) * sizeof(int));
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_grps(igrp, &numgrps, grpids));
|
2017-09-01 06:16:21 +08:00
|
|
|
|
2010-06-03 21:24:43 +08:00
|
|
|
for(i = 0; i < numgrps; i++) {
|
2013-01-24 01:45:29 +08:00
|
|
|
if (option_grpstruct || group_wanted(grpids[i], option_nlgrps, option_grpids)) {
|
|
|
|
NC_CHECK(copy_schema(grpids[i], ogid));
|
|
|
|
}
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
|
|
|
free(grpids);
|
|
|
|
}
|
|
|
|
#endif /* USE_NETCDF4 */
|
2017-09-01 06:16:21 +08:00
|
|
|
return stat;
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
|
|
|
|
2010-08-29 23:08:12 +08:00
|
|
|
/* Return number of values for a variable varid in a group igrp */
|
2010-06-03 21:24:43 +08:00
|
|
|
static int
|
2010-08-02 01:16:08 +08:00
|
|
|
inq_nvals(int igrp, int varid, long long *nvalsp) {
|
2010-06-03 21:24:43 +08:00
|
|
|
int stat = NC_NOERR;
|
|
|
|
int ndims;
|
2010-08-02 01:16:08 +08:00
|
|
|
int *dimids;
|
2010-06-03 21:24:43 +08:00
|
|
|
int dim;
|
|
|
|
long long nvals = 1;
|
|
|
|
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_varndims(igrp, varid, &ndims));
|
2023-11-25 02:20:52 +08:00
|
|
|
dimids = (int *) emalloc((size_t)(ndims + 1) * sizeof(int));
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_vardimid (igrp, varid, dimids));
|
2010-06-03 21:24:43 +08:00
|
|
|
for(dim = 0; dim < ndims; dim++) {
|
|
|
|
size_t len;
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_dimlen(igrp, dimids[dim], &len));
|
2010-06-03 21:24:43 +08:00
|
|
|
nvals *= len;
|
|
|
|
}
|
|
|
|
if(nvalsp)
|
|
|
|
*nvalsp = nvals;
|
2010-08-02 01:16:08 +08:00
|
|
|
free(dimids);
|
2010-06-03 21:24:43 +08:00
|
|
|
return stat;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Copy data from variable varid in group igrp to corresponding group
|
|
|
|
* ogrp. */
|
|
|
|
static int
|
2020-06-29 08:02:47 +08:00
|
|
|
copy_var_data(int igrp, int varid, int ogrp)
|
|
|
|
{
|
2010-06-03 21:24:43 +08:00
|
|
|
int stat = NC_NOERR;
|
|
|
|
nc_type vartype;
|
|
|
|
long long nvalues; /* number of values for this variable */
|
|
|
|
size_t ntoget; /* number of values to access this iteration */
|
|
|
|
size_t value_size; /* size of a single value of this variable */
|
|
|
|
static void *buf = 0; /* buffer for the variable values */
|
|
|
|
char varname[NC_MAX_NAME];
|
|
|
|
int ovarid;
|
2010-08-02 01:16:08 +08:00
|
|
|
size_t *start;
|
|
|
|
size_t *count;
|
2010-08-03 05:31:01 +08:00
|
|
|
nciter_t *iterp; /* opaque structure for iteration status */
|
2010-06-03 21:24:43 +08:00
|
|
|
int do_realloc = 0;
|
2017-09-01 06:16:21 +08:00
|
|
|
#ifdef USE_NETCDF4
|
2011-07-08 02:36:00 +08:00
|
|
|
int okind;
|
2012-01-30 02:56:29 +08:00
|
|
|
size_t chunksize;
|
|
|
|
#endif
|
2010-06-03 21:24:43 +08:00
|
|
|
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(inq_nvals(igrp, varid, &nvalues));
|
2010-06-03 21:24:43 +08:00
|
|
|
if(nvalues == 0)
|
|
|
|
return stat;
|
|
|
|
/* get corresponding output variable */
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_varname(igrp, varid, varname));
|
|
|
|
NC_CHECK(nc_inq_varid(ogrp, varname, &ovarid));
|
|
|
|
NC_CHECK(nc_inq_vartype(igrp, varid, &vartype));
|
2012-12-12 01:49:40 +08:00
|
|
|
value_size = val_size(igrp, varid);
|
2011-07-08 02:36:00 +08:00
|
|
|
if(value_size > option_copy_buffer_size) {
|
|
|
|
option_copy_buffer_size = value_size;
|
2010-06-03 21:24:43 +08:00
|
|
|
do_realloc = 1;
|
|
|
|
}
|
2017-09-01 06:16:21 +08:00
|
|
|
#ifdef USE_NETCDF4
|
2011-07-08 02:36:00 +08:00
|
|
|
NC_CHECK(nc_inq_format(ogrp, &okind));
|
|
|
|
if(okind == NC_FORMAT_NETCDF4 || okind == NC_FORMAT_NETCDF4_CLASSIC) {
|
2017-09-01 06:16:21 +08:00
|
|
|
/* if this variable is chunked, set its variable chunk cache size */
|
2020-03-01 03:06:21 +08:00
|
|
|
int contig = NC_CONTIGUOUS;
|
2011-07-08 02:36:00 +08:00
|
|
|
NC_CHECK(nc_inq_var_chunking(ogrp, ovarid, &contig, NULL));
|
2020-03-01 03:06:21 +08:00
|
|
|
if(contig == NC_CHUNKED) { /* chunked */
|
2011-07-13 03:06:00 +08:00
|
|
|
if(option_compute_chunkcaches) {
|
|
|
|
/* Try to estimate variable-specific chunk cache,
|
|
|
|
* depending on specific size and shape of this
|
|
|
|
* variable's chunks. This doesn't work yet. */
|
2011-07-08 02:36:00 +08:00
|
|
|
size_t chunkcache_size, chunkcache_nelems;
|
|
|
|
float chunkcache_preemption;
|
|
|
|
NC_CHECK(inq_var_chunking_params(igrp, varid, ogrp, ovarid,
|
2017-09-01 06:16:21 +08:00
|
|
|
&chunkcache_size,
|
|
|
|
&chunkcache_nelems,
|
2011-07-08 02:36:00 +08:00
|
|
|
&chunkcache_preemption));
|
2017-09-01 06:16:21 +08:00
|
|
|
NC_CHECK(nc_set_var_chunk_cache(ogrp, ovarid,
|
|
|
|
chunkcache_size,
|
|
|
|
chunkcache_nelems,
|
|
|
|
chunkcache_preemption));
|
|
|
|
} else {
|
2011-07-13 03:06:00 +08:00
|
|
|
/* by default, use same chunk cache for all chunked variables */
|
2017-09-01 06:16:21 +08:00
|
|
|
NC_CHECK(nc_set_var_chunk_cache(ogrp, ovarid,
|
2011-07-13 03:06:00 +08:00
|
|
|
option_chunk_cache_size,
|
|
|
|
option_chunk_cache_nelems,
|
|
|
|
COPY_CHUNKCACHE_PREEMPTION));
|
2011-07-08 02:36:00 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
/* For chunked variables, option_copy_buffer_size must also be at least as large as
|
2011-01-18 06:15:26 +08:00
|
|
|
* the size of a chunk in the input; otherwise resize it. */
|
2010-06-03 21:24:43 +08:00
|
|
|
{
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(inq_var_chunksize(igrp, varid, &chunksize));
|
2011-07-08 02:36:00 +08:00
|
|
|
if(chunksize > option_copy_buffer_size) {
|
|
|
|
option_copy_buffer_size = chunksize;
|
2010-06-03 21:24:43 +08:00
|
|
|
do_realloc = 1;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
#endif /* USE_NETCDF4 */
|
|
|
|
if(buf && do_realloc) {
|
|
|
|
free(buf);
|
|
|
|
buf = 0;
|
|
|
|
}
|
|
|
|
if(buf == 0) { /* first time or needs to grow */
|
2011-07-08 02:36:00 +08:00
|
|
|
buf = emalloc(option_copy_buffer_size);
|
|
|
|
memset((void*)buf,0,option_copy_buffer_size);
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/* initialize variable iteration */
|
2011-07-08 02:36:00 +08:00
|
|
|
NC_CHECK(nc_get_iter(igrp, varid, option_copy_buffer_size, &iterp));
|
2010-06-03 21:24:43 +08:00
|
|
|
|
2023-11-25 02:20:52 +08:00
|
|
|
start = (size_t *) emalloc((size_t)(iterp->rank + 1) * sizeof(size_t));
|
|
|
|
count = (size_t *) emalloc((size_t)(iterp->rank + 1) * sizeof(size_t));
|
2010-08-02 01:16:08 +08:00
|
|
|
/* nc_next_iter() initializes start and count on first call,
|
|
|
|
* changes start and count to iterate through whole variable on
|
|
|
|
* subsequent calls. */
|
2010-08-03 05:31:01 +08:00
|
|
|
while((ntoget = nc_next_iter(iterp, start, count)) > 0) {
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_get_vara(igrp, varid, start, count, buf));
|
2020-06-06 07:03:29 +08:00
|
|
|
#ifdef DEBUGCHUNK
|
|
|
|
report(iterp->rank,start,count,buf);
|
|
|
|
#endif
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_put_vara(ogrp, ovarid, start, count, buf));
|
2010-06-03 21:24:43 +08:00
|
|
|
#ifdef USE_NETCDF4
|
|
|
|
/* we have to explicitly free values for strings and vlens */
|
|
|
|
if(vartype == NC_STRING) {
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_free_string(ntoget, (char **)buf));
|
2010-06-03 21:24:43 +08:00
|
|
|
} else if(vartype > NC_STRING) { /* user-defined type */
|
|
|
|
nc_type vclass;
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_user_type(igrp, vartype, NULL, NULL, NULL, NULL, &vclass));
|
2010-06-03 21:24:43 +08:00
|
|
|
if(vclass == NC_VLEN) {
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_free_vlens(ntoget, (nc_vlen_t *)buf));
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
#endif /* USE_NETCDF4 */
|
|
|
|
} /* end main iteration loop */
|
2011-07-08 02:36:00 +08:00
|
|
|
#ifdef USE_NETCDF4
|
|
|
|
/* We're all done with this input and output variable, so if
|
|
|
|
* either variable is chunked, free up its variable chunk cache */
|
2011-07-13 03:06:00 +08:00
|
|
|
/* NC_CHECK(free_var_chunk_cache(igrp, varid)); */
|
|
|
|
/* NC_CHECK(free_var_chunk_cache(ogrp, ovarid)); */
|
2011-07-08 02:36:00 +08:00
|
|
|
#endif /* USE_NETCDF4 */
|
2010-08-02 01:16:08 +08:00
|
|
|
free(start);
|
|
|
|
free(count);
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_free_iter(iterp));
|
2010-06-03 21:24:43 +08:00
|
|
|
return stat;
|
|
|
|
}
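The chunk-cache tuning above goes through nc_set_var_chunk_cache(); for reference, a minimal hypothetical sketch of calling it directly, with arbitrary example sizes (32 MiB cache, 1009 slots, preemption 0.75):
/* Illustrative only; not part of nccopy. */
static int
example_tune_chunk_cache(int ncid, int varid)
{
    size_t cache_bytes = 32 * 1024 * 1024;
    size_t cache_nelems = 1009;
    float preemption = 0.75f;
    return nc_set_var_chunk_cache(ncid, varid, cache_bytes, cache_nelems, preemption);
}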
|
|
|
|
|
|
|
|
/* Copy data from variables in group igrp to variables in
|
|
|
|
* corresponding group with parent ogrp, and all subgroups
|
|
|
|
* recursively */
|
|
|
|
static int
|
2011-07-08 02:36:00 +08:00
|
|
|
copy_data(int igrp, int ogrp)
|
2010-06-03 21:24:43 +08:00
|
|
|
{
|
|
|
|
int stat = NC_NOERR;
|
|
|
|
int ogid;
|
2012-01-30 02:56:29 +08:00
|
|
|
int nvars;
|
|
|
|
int varid;
|
|
|
|
#ifdef USE_NETCDF4
|
2010-06-03 21:24:43 +08:00
|
|
|
int numgrps;
|
|
|
|
int *grpids;
|
|
|
|
int i;
|
2012-01-30 02:56:29 +08:00
|
|
|
#endif
|
2010-06-03 21:24:43 +08:00
|
|
|
|
2013-01-24 01:45:29 +08:00
|
|
|
int iv; /* variable number */
|
Corrected "BAIL" macros to avoid infinite loop when logging is disabled and an
error occurs after an "exit:" label.
Corrected a dozen Coverity errors (mainly allocation issues, along with a few
other things):
711711, 711802, 711803, 711905, 970825, 996123, 996124, 1025787,
1047274, 1130013, 1130014, 1139538
Refactored internal fill-value code to correctly handle string types, and
especially to allow NULL pointers and null strings (ie. "") to be
distinguished. The code now avoids partially aliasing the two together
(which only happened on the 'write' side of things and wasn't reflected on
the 'read' side, adding to the previous confusion).
Probably still weak on handling fill-values of variable-length and compound
datatypes.
Refactored the recursive metadata reads a bit more, to process HDF5 named
datatypes and datasets immediately, avoiding chewing up memory for those
types of objects, etc.
Finished uncommenting and updating the nc_test4/tst_fills2.c code (as I'm
proceeding alphabetically through the nc_test4 code files).
2013-12-29 15:12:43 +08:00
|
|
|
idnode_t* vlist = NULL; /* list for vars specified with -v option */
|
2013-01-24 01:45:29 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* If any vars were specified with -v option, get list of
|
|
|
|
* associated variable ids relative to this group. Assume vars
|
|
|
|
* specified with syntax like "grp1/grp2/varname" or
|
|
|
|
* "/grp1/grp2/varname" if they are in groups.
|
|
|
|
*/
|
|
|
|
vlist = newidlist(); /* list for vars specified with -v option */
|
|
|
|
for (iv=0; iv < option_nlvars; iv++) {
|
|
|
|
if(nc_inq_gvarid(igrp, option_lvars[iv], &varid) == NC_NOERR)
|
|
|
|
idadd(vlist, varid);
|
|
|
|
}
|
2017-09-01 06:16:21 +08:00
|
|
|
|
2010-06-03 21:24:43 +08:00
|
|
|
/* get groupid in output corresponding to group igrp in input,
|
|
|
|
* given parent group (or root group) ogrp in output */
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(get_grpid(igrp, ogrp, &ogid));
|
2017-09-01 06:16:21 +08:00
|
|
|
|
2010-06-03 21:24:43 +08:00
|
|
|
/* Copy data from this group */
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_nvars(igrp, &nvars));
|
2011-07-08 02:36:00 +08:00
|
|
|
|
2010-06-03 21:24:43 +08:00
|
|
|
for (varid = 0; varid < nvars; varid++) {
|
2013-01-24 01:45:29 +08:00
|
|
|
if (option_nlvars > 0 && ! idmember(vlist, varid))
|
|
|
|
continue;
|
|
|
|
if (!group_wanted(igrp, option_nlgrps, option_grpids))
|
|
|
|
continue;
|
2011-07-08 02:36:00 +08:00
|
|
|
NC_CHECK(copy_var_data(igrp, varid, ogid));
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
|
|
|
#ifdef USE_NETCDF4
|
|
|
|
/* Copy data from subgroups */
|
|
|
|
stat = nc_inq_grps(igrp, &numgrps, NULL);
|
2023-11-25 02:20:52 +08:00
|
|
|
grpids = (int *)emalloc((size_t)(numgrps + 1) * sizeof(int));
|
2011-01-06 07:48:47 +08:00
|
|
|
NC_CHECK(nc_inq_grps(igrp, &numgrps, grpids));
|
2010-06-03 21:24:43 +08:00
|
|
|
|
|
|
|
for(i = 0; i < numgrps; i++) {
|
2013-01-24 01:45:29 +08:00
|
|
|
if (!option_grpstruct && !group_wanted(grpids[i], option_nlgrps, option_grpids))
|
|
|
|
continue;
|
2011-07-08 02:36:00 +08:00
|
|
|
NC_CHECK(copy_data(grpids[i], ogid));
|
2010-06-03 21:24:43 +08:00
|
|
|
}
|
|
|
|
free(grpids);
|
|
|
|
#endif /* USE_NETCDF4 */
|
Corrected "BAIL" macros to avoid infinite loop when logging is disabled and an
error occurs after an "exit:" label.
Corrected a dozen Coverity errors (mainly allocation issues, along with a few
other things):
711711, 711802, 711803, 711905, 970825, 996123, 996124, 1025787,
1047274, 1130013, 1130014, 1139538
Refactored internal fill-value code to correctly handle string types, and
especially to allow NULL pointers and null strings (ie. "") to be
distinguished. The code now avoids partially aliasing the two together
(which only happened on the 'write' side of things and wasn't reflected on
the 'read' side, adding to the previous confusion).
Probably still weak on handling fill-values of variable-length and compound
datatypes.
Refactored the recursive metadata reads a bit more, to process HDF5 named
datatypes and datasets immediately, avoiding chewing up memory for those
types of objects, etc.
Finished uncommenting and updating the nc_test4/tst_fills2.c code (as I'm
proceeding alphabetically through the nc_test4 code files).
2013-12-29 15:12:43 +08:00
|
|
|
freeidlist(vlist);
|
2010-06-03 21:24:43 +08:00
|
|
|
return stat;
|
|
|
|
}
|
|
|
|
|
2012-12-12 01:49:40 +08:00
|
|
|
/* Count total number of dimensions in ncid and all its descendant subgroups */
|
2011-01-10 05:41:07 +08:00
|
|
|
int
|
2015-08-03 07:23:32 +08:00
|
|
|
count_dims(int ncid) {
|
2018-03-21 01:20:14 +08:00
|
|
|
|
2018-02-25 11:36:24 +08:00
|
|
|
#ifdef USE_NETCDF4
|
2010-12-31 02:17:04 +08:00
|
|
|
int numgrps;
|
2018-02-25 11:36:24 +08:00
|
|
|
#endif
|
2018-03-21 01:20:14 +08:00
|
|
|
|
2012-12-12 01:49:40 +08:00
|
|
|
int ndims;
|
|
|
|
NC_CHECK(nc_inq_ndims(ncid, &ndims));
|
2018-03-21 04:41:21 +08:00
|
|
|
|
2012-12-12 01:49:40 +08:00
|
|
|
#ifdef USE_NETCDF4
|
|
|
|
NC_CHECK(nc_inq_grps(ncid, &numgrps, NULL));
|
|
|
|
if(numgrps > 0) {
|
|
|
|
int igrp;
|
2023-11-25 02:20:52 +08:00
|
|
|
int *grpids = emalloc((size_t)numgrps * sizeof(int));
|
2012-12-12 01:49:40 +08:00
|
|
|
NC_CHECK(nc_inq_grps(ncid, &numgrps, grpids));
|
|
|
|
for(igrp = 0; igrp < numgrps; igrp++) {
|
|
|
|
ndims += count_dims(grpids[igrp]);
|
|
|
|
}
|
2017-09-01 06:16:21 +08:00
|
|
|
free(grpids);
|
2010-12-31 02:17:04 +08:00
|
|
|
}
|
2012-12-12 01:49:40 +08:00
|
|
|
#endif /* USE_NETCDF4 */
|
2011-01-10 05:41:07 +08:00
|
|
|
return ndims;
|
2010-12-31 02:17:04 +08:00
|
|
|
}
|
|
|
|
|
2012-03-09 03:48:57 +08:00
|
|
|
/* Test if special case: netCDF-3 file with more than one record
|
|
|
|
* variable. Performance can be very slow for this case when the disk
|
|
|
|
* block size is large, there are many record variables, and a
|
|
|
|
* record's worth of data for some variables is smaller than the disk
|
|
|
|
* block size. In this case, copying the record variables a variable
|
|
|
|
* at a time causes much rereading of record data, so instead we want
|
|
|
|
* to copy data a record at a time. */
|
|
|
|
static int
|
|
|
|
nc3_special_case(int ncid, int kind) {
|
2015-08-16 06:26:35 +08:00
|
|
|
if (kind == NC_FORMAT_CLASSIC || kind == NC_FORMAT_64BIT_OFFSET
|
|
|
|
|| kind == NC_FORMAT_CDF5) {
|
2012-03-09 03:48:57 +08:00
|
|
|
int recdimid = 0;
|
|
|
|
NC_CHECK(nc_inq_unlimdim(ncid, &recdimid));
|
|
|
|
if (recdimid != -1) { /* we have a record dimension */
|
|
|
|
int nvars;
|
2012-04-13 01:18:06 +08:00
|
|
|
int varid;
|
2012-03-09 03:48:57 +08:00
|
|
|
NC_CHECK(nc_inq_nvars(ncid, &nvars));
|
2012-04-13 01:18:06 +08:00
|
|
|
for (varid = 0; varid < nvars; varid++) {
|
|
|
|
int *dimids = 0;
|
|
|
|
int ndims;
|
|
|
|
NC_CHECK( nc_inq_varndims(ncid, varid, &ndims) );
|
|
|
|
if (ndims > 0) {
|
|
|
|
int dimids0;
|
2023-11-25 02:20:52 +08:00
|
|
|
dimids = (int *) emalloc((size_t)(ndims + 1) * sizeof(int));
|
2012-04-13 01:18:06 +08:00
|
|
|
NC_CHECK( nc_inq_vardimid(ncid, varid, dimids) );
|
|
|
|
dimids0 = dimids[0];
|
|
|
|
free(dimids);
|
|
|
|
if(dimids0 == recdimid) {
|
|
|
|
return 1; /* found a record variable */
|
2012-03-09 03:48:57 +08:00
|
|
|
}
|
|
|
|
}
|
2012-04-13 01:18:06 +08:00
|
|
|
}
|
2012-03-09 03:48:57 +08:00
|
|
|
}
|
|
|
|
}
|
2012-04-13 01:18:06 +08:00
|
|
|
return 0;
|
2012-03-09 03:48:57 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Classify variables in ncid as either fixed-size variables (with no
|
|
|
|
* unlimited dimension) or as record variables (with an unlimited
|
|
|
|
* dimension) */
|
|
|
|
static int
|
|
|
|
classify_vars(
|
|
|
|
int ncid, /* netCDF ID */
|
|
|
|
size_t *nf, /* for returning number of fixed-size variables */
|
|
|
|
int **fvars, /* the array of fixed_size variable IDS, caller should free */
|
|
|
|
size_t *nr, /* for returning number of record variables */
|
|
|
|
int **rvars) /* the array of record variable IDs, caller should free */
|
|
|
|
{
|
|
|
|
int varid;
|
2017-09-01 06:16:21 +08:00
|
|
|
int varindex = 0;
|
2012-03-09 03:48:57 +08:00
|
|
|
int nvars;
|
|
|
|
NC_CHECK(nc_inq_nvars(ncid, &nvars));
|
|
|
|
*nf = 0;
|
2023-11-25 02:20:52 +08:00
|
|
|
*fvars = (int *) emalloc((size_t)nvars * sizeof(int));
|
2012-03-09 03:48:57 +08:00
|
|
|
*nr = 0;
|
2023-11-25 02:20:52 +08:00
|
|
|
*rvars = (int *) emalloc((size_t)nvars * sizeof(int));
|
2017-09-01 06:16:21 +08:00
|
|
|
|
|
|
|
if(option_nlvars > 0) {
|
|
|
|
for (varindex = 0; varindex < option_nlvars; varindex++) {
|
|
|
|
nc_inq_varid(ncid,option_lvars[varindex],&varid);
|
|
|
|
|
|
|
|
if (isrecvar(ncid, varid)) {
|
|
|
|
(*rvars)[*nr] = varid;
|
|
|
|
(*nr)++;
|
|
|
|
} else {
|
|
|
|
(*fvars)[*nf] = varid;
|
|
|
|
(*nf)++;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
for (varid = 0; varid < nvars; varid++) {
|
|
|
|
if (isrecvar(ncid, varid)) {
|
|
|
|
(*rvars)[*nr] = varid;
|
|
|
|
(*nr)++;
|
|
|
|
} else {
|
|
|
|
(*fvars)[*nf] = varid;
|
|
|
|
(*nf)++;
|
|
|
|
}
|
|
|
|
}
|
2012-03-09 03:48:57 +08:00
|
|
|
}
|
|
|
|
return NC_NOERR;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Only called for classic format or 64-bit offset format files, to speed up special case */
static int
copy_fixed_size_data(int igrp, int ogrp, size_t nfixed_vars, int *fixed_varids) {
    size_t ivar;
    /* for each fixed-size variable, copy data */
    for (ivar = 0; ivar < nfixed_vars; ivar++) {
        int varid = fixed_varids[ivar];
        NC_CHECK(copy_var_data(igrp, varid, ogrp));
    }
    if (fixed_varids)
        free(fixed_varids);
    return NC_NOERR;
}

/* copy a record's worth of data for a variable from input to output */
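/* Note: the irec argument is informational only; the caller has already stored
 * the record index in start[0] before calling this function. */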
static int
copy_rec_var_data(int ncid,     /* input */
                  int ogrp,     /* output */
                  int irec,     /* record number */
                  int varid,    /* input variable id */
                  int ovarid,   /* output variable id */
                  size_t *start,   /* start indices for record data */
                  size_t *count,   /* edge lengths for record data */
                  void *buf        /* buffer large enough to hold data */
    )
{
    NC_CHECK(nc_get_vara(ncid, varid, start, count, buf));
    NC_CHECK(nc_put_vara(ogrp, ovarid, start, count, buf));
    return NC_NOERR;
}

/* Only called for classic format or 64-bit offset format files, to speed up special case */
static int
copy_record_data(int ncid, int ogrp, size_t nrec_vars, int *rec_varids) {
    int unlimid;
    size_t nrecs = 0;   /* how many records? */
    size_t irec;
    size_t ivar;
    void **buf;         /* space for reading in data for each variable */
    int *rec_ovarids;   /* corresponding varids in output */
    size_t **start;
    size_t **count;
    NC_CHECK(nc_inq_unlimdim(ncid, &unlimid));
    NC_CHECK(nc_inq_dimlen(ncid, unlimid, &nrecs));
    buf = (void **) emalloc(nrec_vars * sizeof(void *));
    rec_ovarids = (int *) emalloc(nrec_vars * sizeof(int));
    start = (size_t **) emalloc(nrec_vars * sizeof(size_t*));
    count = (size_t **) emalloc(nrec_vars * sizeof(size_t*));
    /* get space to hold one record's worth of data for each record variable */
    for (ivar = 0; ivar < nrec_vars; ivar++) {
        int varid;
        int ndims;
        int *dimids;
        size_t value_size;
        int dimid;
        int ii;
        size_t nvals;
        char varname[NC_MAX_NAME + 1];
        varid = rec_varids[ivar];
        NC_CHECK(nc_inq_varndims(ncid, varid, &ndims));
        dimids = (int *) emalloc((size_t)(1 + ndims) * sizeof(int));
        start[ivar] = (size_t *) emalloc((size_t)ndims * sizeof(size_t));
        count[ivar] = (size_t *) emalloc((size_t)ndims * sizeof(size_t));
        NC_CHECK(nc_inq_vardimid(ncid, varid, dimids));
        value_size = val_size(ncid, varid);
        nvals = 1;
        for(ii = 1; ii < ndims; ii++) { /* for rec size, don't include first record dimension */
            size_t dimlen;
            dimid = dimids[ii];
            NC_CHECK(nc_inq_dimlen(ncid, dimid, &dimlen));
            nvals *= dimlen;
            start[ivar][ii] = 0;
            count[ivar][ii] = dimlen;
        }
        start[ivar][0] = 0;
        count[ivar][0] = 1;     /* 1 record */
        buf[ivar] = (void *) emalloc(nvals * value_size);
        NC_CHECK(nc_inq_varname(ncid, varid, varname));
        NC_CHECK(nc_inq_varid(ogrp, varname, &rec_ovarids[ivar]));
        if(dimids)
            free(dimids);
    }

    /* for each record, copy all variable data */
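    /* Keeping the record index as the outermost loop means each record is read
     * and written once, in order, which matches the interleaved record layout
     * of the classic on-disk format and avoids rereading record data. */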
    for(irec = 0; irec < nrecs; irec++) {
        for (ivar = 0; ivar < nrec_vars; ivar++) {
            int varid, ovarid;
            varid = rec_varids[ivar];
            ovarid = rec_ovarids[ivar];
            start[ivar][0] = irec;
            NC_CHECK(copy_rec_var_data(ncid, ogrp, irec, varid, ovarid,
                                       start[ivar], count[ivar], buf[ivar]));
        }
    }
    for (ivar = 0; ivar < nrec_vars; ivar++) {
        if(start[ivar])
            free(start[ivar]);
        if(count[ivar])
            free(count[ivar]);
    }
    if(start)
        free(start);
    if(count)
        free(count);
    for (ivar = 0; ivar < nrec_vars; ivar++) {
        if(buf[ivar]) {
            free(buf[ivar]);
        }
    }
    if (rec_varids)
        free(rec_varids);
    if(buf)
        free(buf);
    if(rec_ovarids)
        free(rec_ovarids);
    return NC_NOERR;
}

/* copy infile to outfile using netCDF API
 */
static int
copy(char* infile, char* outfile)
{
    int stat = NC_NOERR;
    int igrp, ogrp;
    int inkind, outkind;
    int open_mode = NC_NOWRITE;
    int create_mode = NC_CLOBBER;
    size_t ndims;

    if(option_read_diskless) {
        open_mode |= NC_DISKLESS;
    }

    NC_CHECK(nc_open(infile, open_mode, &igrp));

    NC_CHECK(nc_inq_format(igrp, &inkind));

/* option_kind specifies which netCDF format for output, one of
 *
 *   SAME_AS_INPUT, NC_FORMAT_CLASSIC, NC_FORMAT_64BIT,
 *   NC_FORMAT_NETCDF4, NC_FORMAT_NETCDF4_CLASSIC
 *
 * However, if compression or shuffling was specified and kind was SAME_AS_INPUT,
 * option_kind is changed to NC_FORMAT_NETCDF4_CLASSIC, if input format is
 * NC_FORMAT_CLASSIC or NC_FORMAT_64BIT.
 */
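/* Illustrative example of the rule above: "nccopy -d 5 classic.nc out.nc" with
 * no -k option produces a netCDF-4 classic model output file, because the
 * requested deflation requires the netCDF-4 storage format. */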
    outkind = option_kind;
    if (option_kind == SAME_AS_INPUT) { /* default, kind not specified */
        outkind = inkind;
        /* Deduce output kind if netCDF-4 features requested */
        if (inkind == NC_FORMAT_CLASSIC || inkind == NC_FORMAT_64BIT_OFFSET
            || inkind == NC_FORMAT_CDF5) {
            if (option_deflate_level > 0 ||
                option_shuffle_vars == NC_SHUFFLE ||
                listlength(option_chunkspecs) > 0)
            {
                outkind = NC_FORMAT_NETCDF4_CLASSIC;
            }
        }
    }

#ifdef USE_NETCDF4
    if(listlength(option_chunkspecs) > 0) {
        int i;
        /* Now that input is open, can parse option_chunkspecs into binary
         * structure. */
        for(i=0;i<listlength(option_chunkspecs);i++) {
            char* spec = (char*)listget(option_chunkspecs,i);
            NC_CHECK(chunkspec_parse(igrp, spec));
        }
    }
#endif  /* USE_NETCDF4 */

    /* Check if any vars in -v don't exist */
    if(missing_vars(igrp, option_nlvars, option_lvars))
        goto fail;

    if(option_nlgrps > 0) {
        if(inkind != NC_FORMAT_NETCDF4) {
            error("Group list (-g ...) only permitted for netCDF-4 file");
            goto fail;
        }
        /* Check if any grps in -g don't exist */
        if(grp_matches(igrp, option_nlgrps, option_lgrps, option_grpids) == 0)
            goto fail;
    }

    if(option_write_diskless)
        create_mode |= NC_PERSIST | NC_DISKLESS; /* NC_PERSIST keeps the diskless file on close */
    switch(outkind) {
    case NC_FORMAT_CLASSIC:
        /* nothing to do */
        break;
    case NC_FORMAT_64BIT_OFFSET:
        create_mode |= NC_64BIT_OFFSET;
        break;
    case NC_FORMAT_CDF5:
#ifdef ENABLE_CDF5
        create_mode |= NC_64BIT_DATA;
        break;
#else
        error("netCDF library built without CDF5 support, can't create CDF5 files");
        break;
#endif
#ifdef USE_NETCDF4
    case NC_FORMAT_NETCDF4:
        create_mode |= NC_NETCDF4;
        break;
    case NC_FORMAT_NETCDF4_CLASSIC:
        create_mode |= NC_NETCDF4 | NC_CLASSIC_MODEL;
        break;
#else
    case NC_FORMAT_NETCDF4:
    case NC_FORMAT_NETCDF4_CLASSIC:
        error("netCDF library built with --disable-netcdf4, can't create netCDF-4 files");
        break;
#endif /* USE_NETCDF4 */
    default:
        error("bad value for option specifying desired output format, see usage\n");
        break;
    }
    NC_CHECK(nc_create(outfile, create_mode, &ogrp));
    NC_CHECK(nc_set_fill(ogrp, NC_NOFILL, NULL));
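    /* NC_NOFILL: every value in the output will be written explicitly by the
     * copy, so prefilling variables would only add needless writes. */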

#ifdef USE_NETCDF4
    /* Because types in one group may depend on types in a different
     * group, need to create all groups before defining types */
    if(inkind == NC_FORMAT_NETCDF4) {
        NC_CHECK(copy_groups(igrp, ogrp));
        NC_CHECK(copy_types(igrp, ogrp));
    }
#endif  /* USE_NETCDF4 */

    ndims = count_dims(igrp);
    NC_CHECK(dimmap_init(ndims));
    NC_CHECK(copy_schema(igrp, ogrp));
    NC_CHECK(nc_enddef(ogrp));

    /* For performance, special case netCDF-3 input or output file with record
     * variables, to copy a record-at-a-time instead of a
     * variable-at-a-time. */
    /* TODO: check that these special cases work with -v option */
    if(nc3_special_case(igrp, inkind)) {
        size_t nfixed_vars, nrec_vars;
        int *fixed_varids;
        int *rec_varids;
        NC_CHECK(classify_vars(igrp, &nfixed_vars, &fixed_varids, &nrec_vars, &rec_varids));
        NC_CHECK(copy_fixed_size_data(igrp, ogrp, nfixed_vars, fixed_varids));
        NC_CHECK(copy_record_data(igrp, ogrp, nrec_vars, rec_varids));
    } else if (nc3_special_case(ogrp, outkind)) {
        size_t nfixed_vars, nrec_vars;
        int *fixed_varids;
        int *rec_varids;
        /* classifies output vars, but returns input varids */
        NC_CHECK(classify_vars(ogrp, &nfixed_vars, &fixed_varids, &nrec_vars, &rec_varids));
        NC_CHECK(copy_fixed_size_data(igrp, ogrp, nfixed_vars, fixed_varids));
        NC_CHECK(copy_record_data(igrp, ogrp, nrec_vars, rec_varids));
    } else {
        NC_CHECK(copy_data(igrp, ogrp)); /* recursive, to handle nested groups */
    }

    NC_CHECK(nc_close(igrp));
    NC_CHECK(nc_close(ogrp));
    return stat;
fail:
    nc_finalize();
    exit(EXIT_FAILURE);
}


/*
 * For non-negative numeric string with multiplier suffix K, M, G, T,
 * or P (or lower-case equivalent), return corresponding value
 * incorporating multiplier 1000, 1000000, 1.0e9, ... 1.0e15, or -1.0
 * for error.
 */
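/* Examples: "2K" yields 2000.0 and "1.5M" yields 1500000.0, while "abc" or an
 * unrecognized suffix such as "2Q" yields -1.0. */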
static double
double_with_suffix(char *str) {
    double dval;
    char *suffix = 0;
    errno = 0;
    dval = strtod(str, &suffix);
    if(dval < 0 || errno != 0)
        return -1.0;
    if(*suffix) {
        switch (*suffix) {
        case 'k': case 'K':
            dval *= 1000;
            break;
        case 'm': case 'M':
            dval *= 1000000;
            break;
        case 'g': case 'G':
            dval *= 1000000000;
            break;
        case 't': case 'T':
            dval *= 1.0e12;
            break;
        case 'p': case 'P':
            dval *= 1.0e15;
            break;
        default:
            dval = -1.0;        /* error, suffix multiplier must be K, M, G, T, or P */
        }
    }
    return dval;
}

static void
usage(void)
{
#define USAGE   "\
  [-k kind] specify kind of netCDF format for output file, default same as input\n\
            kind strings: 'classic', '64-bit offset', 'cdf5',\n\
                          'netCDF-4', 'netCDF-4 classic model'\n\
  [-3]      netCDF classic output (same as -k 'classic')\n\
  [-6]      64-bit-offset output (same as -k '64-bit offset')\n\
  [-4]      netCDF-4 output (same as -k 'netCDF-4')\n\
  [-7]      netCDF-4-classic output (same as -k 'netCDF-4 classic model')\n\
  [-5]      CDF5 output (same as -k 'cdf5')\n\
  [-d n]    set output deflation compression level, default same as input (0=none 9=max)\n\
  [-s]      add shuffle option to deflation compression\n\
  [-c chunkspec] specify chunking for variable and dimensions, e.g. \"var:N1,N2,...\" or \"dim1/N1,dim2/N2,...\"\n\
  [-u]      convert unlimited dimensions to fixed-size dimensions in output copy\n\
  [-w]      write whole output file from diskless netCDF on close\n\
  [-v var1,...] include data for only listed variables, but definitions for all variables\n\
  [-V var1,...] include definitions and data for only listed variables\n\
  [-g grp1,...] include data for only variables in listed groups, but all definitions\n\
  [-G grp1,...] include definitions and data only for variables in listed groups\n\
  [-m n]    set size in bytes of copy buffer, default is 5000000 bytes\n\
  [-h n]    set size in bytes of chunk_cache for chunked variables\n\
  [-e n]    set number of elements that chunk_cache can hold\n\
  [-r]      read whole input file into diskless file on open (classic or 64-bit offset or cdf5 formats only)\n\
  [-F filterspec] specify a compression algorithm to apply to an output variable (may be repeated).\n\
  [-Ln]     set log level to n (>= 0); ignored if logging isn't enabled.\n\
  [-Mn]     set minimum chunk size to n bytes (n >= 0)\n\
  infile    name of netCDF input file\n\
  outfile   name for netCDF output file\n"

/* Don't document this flaky option until it works better */
/* [-x]      use experimental computed estimates for variable-specific chunk caches\n\ */

    error("%s [-k kind] [-[3|4|5|6|7]] [-d n] [-s] [-c chunkspec] [-u] [-w] [-[v|V] varlist] [-[g|G] grplist] [-m n] [-h n] [-e n] [-r] [-F filterspec] [-Ln] [-Mn] infile outfile\n%s\nnetCDF library version %s",
          progname, USAGE, nc_inq_libvers());
}

int
main(int argc, char**argv)
{
    int exitcode = EXIT_SUCCESS;
    char* inputfile = NULL;
    char* outputfile = NULL;
    int c;

    chunkspecinit();
    option_chunkspecs = listnew();

    progname = argv[0];

    if (argc <= 1)
    {
        usage();
    }

    opterr = 1;
    while ((c = getopt(argc, argv, "k:34567d:sum:c:h:e:rwxg:G:v:V:F:L:M:")) != -1) {
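        /* In the optstring above, a letter followed by ':' takes an argument
         * (e.g. "-k nc7", "-m 10M"); bare letters are boolean flags. */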
        switch(c) {
        case 'k': /* for specifying variant of netCDF format to be generated
                     Format names:
                      "classic" or "nc3"
                      "64-bit offset" or "nc6"
                      "netCDF-4" or "nc4"
                      "netCDF-4 classic model" or "nc7"
                      "64-bit-data" | "64-bit data" | "cdf5" | "nc5"
                     Format version numbers (deprecated):
                      1 (=> classic)
                      2 (=> 64-bit offset)
                      3 (=> netCDF-4)
                      4 (=> netCDF-4 classic model)
                      5 (=> classic 64 bit data, CDF-5)
                   */
            {
                struct Kvalues* kvalue;
                char *kind_name = (char *) emalloc(strlen(optarg)+1);
                (void)strcpy(kind_name, optarg);
                for(kvalue=legalkinds;kvalue->name;kvalue++) {
                    if(strcmp(kind_name,kvalue->name) == 0) {
                        option_kind = kvalue->kind;
                        break;
                    }
                }
                if(kvalue->name == NULL) {
                    error("invalid output format: %s", kind_name);
                }
                nullfree(kind_name);
            }
            break;
        case '3':               /* output format is classic (netCDF-3) */
            option_kind = NC_FORMAT_CLASSIC;
            break;
        case '5':               /* output format is cdf5 */
            option_kind = NC_FORMAT_CDF5;
            break;
        case '6':               /* output format is 64-bit-offset (netCDF-3 version 2) */
            option_kind = NC_FORMAT_64BIT_OFFSET;
            break;
        case '4':               /* output format is netCDF-4 (variant of HDF5) */
            option_kind = NC_FORMAT_NETCDF4;
            break;
        case '7':               /* output format is netCDF-4 (restricted to classic model) */
            option_kind = NC_FORMAT_NETCDF4_CLASSIC;
            break;
        case 'd':               /* non-default compression level specified */
            option_deflate_level = strtol(optarg, NULL, 10);
            if(option_deflate_level < 0 || option_deflate_level > 9) {
                error("invalid deflation level: %d", option_deflate_level);
            }
            break;
        case 's':               /* shuffling, may improve compression */
            option_shuffle_vars = NC_SHUFFLE;
            break;
        case 'u':               /* convert unlimited dimensions to fixed size */
            option_fix_unlimdims = 1;
            break;
        case 'm':               /* non-default size of data copy buffer */
        {
            double dval = double_with_suffix(optarg);   /* "K" for kilobytes, "M" for megabytes, ... */
            if(dval < 0)
                error("Suffix used for '-m' option value must be K, M, G, T, or P");
            option_copy_buffer_size = (size_t)dval;
            break;
        }
        case 'h':               /* non-default size of chunk cache */
        {
            double dval = double_with_suffix(optarg);   /* "K" for kilobytes, "M" for megabytes, ... */
            if(dval < 0)
                error("Suffix used for '-h' option value must be K, M, G, T, or P");
            option_chunk_cache_size = (size_t)dval;
            break;
        }
        case 'e':               /* number of elements chunk cache can hold */
        {
            double dval = double_with_suffix(optarg);   /* "K" for kilobytes, "M" for megabytes, ... */
            if(dval < 0)
                error("Suffix used for '-e' option value must be K, M, G, T, or P");
            option_chunk_cache_nelems = (size_t)dval;
            break;
        }
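        /* Illustrative example: "nccopy -m 10M -h 16M in.nc out.nc" requests a
         * 10,000,000-byte copy buffer and a 16,000,000-byte chunk cache. */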
        case 'r':
            option_read_diskless = 1; /* read into memory on open */
            break;
        case 'w':
            option_write_diskless = 1; /* write to memory, persist on close */
            break;
        case 'x':               /* use experimental variable-specific chunk caches */
            option_compute_chunkcaches = 1;
            break;
        case 'c':               /* optional chunking spec for each dimension in list */
            /* save chunkspec string for parsing later, once we know input ncid */
            listpush(option_chunkspecs,strdup(optarg));
            break;
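        /* Illustrative chunkspec forms (hypothetical dimension/variable names,
         * following the -c entry in usage()):
         *   "-c time/100,lat/45,lon/90"   per-dimension chunk lengths
         *   "-c prcp:100,45,90"           chunk lengths for one variable */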
        case 'g':               /* group names */
            /* make list of names of groups specified */
            make_lgrps (optarg, &option_nlgrps, &option_lgrps, &option_grpids);
            option_grpstruct = true;
            break;
        case 'G':               /* group names */
            /* make list of names of groups specified */
            make_lgrps (optarg, &option_nlgrps, &option_lgrps, &option_grpids);
            option_grpstruct = false;
            break;
        case 'v':               /* variable names */
            /* make list of names of variables specified */
            make_lvars (optarg, &option_nlvars, &option_lvars);
            option_varstruct = true;
            break;
        case 'V':               /* variable names */
            /* make list of names of variables specified */
            make_lvars (optarg, &option_nlvars, &option_lvars);
            option_varstruct = false;
            break;
        case 'L':               /* Set logging, if logging support was compiled in. */
#ifdef LOGGING
            {
                int level = atoi(optarg);
                if(level >= 0)
                    nc_set_log_level(level);
            }
#else
            error("-L specified, but logging support not enabled");
#endif
            break;
        case 'F':               /* optional filter spec for a specified variable */
#ifdef USE_NETCDF4
            /* If the arg is "none" or "*,none" then suppress all filters
               on output unless explicit */
            if(strcmp(optarg,"none")==0
               || strcasecmp(optarg,"*,none")==0) {
                suppressfilters = 1;
            } else {
                if(filteroptions == NULL)
                    filteroptions = listnew();
                NC_CHECK(parsefilterspec(optarg,filteroptions));
                /* Force output to be netcdf-4 */
                option_kind = NC_FORMAT_NETCDF4;
            }
#else
            error("-F requires netcdf-4");
#endif
            break;
        case 'M':               /* set min chunk size */
#ifdef USE_NETCDF4
            if(optarg == NULL)
                option_min_chunk_bytes = 0;
            else
                option_min_chunk_bytes = atol(optarg);
            break;
#else
            error("-M requires netcdf-4");
#endif

        default:
            usage();
        }
    }
    argc -= optind;
    argv += optind;

    if (argc != 2) {
        error("one input file and one output file required");
    }
    /* Canonicalize the input and output file names */
    inputfile = NC_shellUnescape(argv[0]); /* Remove shell added escapes */
    outputfile = NC_shellUnescape(argv[1]);
    if(strcmp(inputfile, outputfile) == 0) {
        error("output would overwrite input");
    }

#ifdef USE_NETCDF4
#ifdef DEBUGFILTER
    { int i,j;
        for(i=0;i<listlength(filterspecs);i++) {
            struct FilterSpec *spec = listget(filterspecs,i);
            fprintf(stderr,"filterspecs[%d]={fqn=|%s| filterid=%u nparams=%lu params=",
                    i,spec->fqn,spec->filterid,(unsigned long)spec->nparams);
            for(j=0;j<spec->nparams;j++) {
                if(j>0) fprintf(stderr,",");
                fprintf(stderr,"%u",spec->params[j]);
            }
            fprintf(stderr,"}\n");
            fflush(stderr);
        }
    }
#endif /*DEBUGFILTER*/
#endif /*USE_NETCDF4*/

    if(copy(inputfile, outputfile) != NC_NOERR)
        exitcode = EXIT_FAILURE;

    nullfree(inputfile);
    nullfree(outputfile);

#ifdef USE_NETCDF4
    /* Clean up */
    freefilteroptlist(filteroptions);
    filteroptions = NULL;
#endif /*USE_NETCDF4*/

    nc_finalize();

    exit(exitcode);
}

#ifdef USE_NETCDF4
static void
freefilteroptlist(List* specs)
{
    int i;
    for(i=0;i<listlength(specs);i++) {
        struct FilterOption* spec = (struct FilterOption*)listget(specs,i);
        if(spec->fqn) free(spec->fqn);
        nullfree(spec->pfs.params);
        free(spec);
    }
    listfree(specs);
}
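
/* Free an array of nfilters NC_H5_Filterspec pointers along with the array
 * itself; a NULL filters argument is a no-op. */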
static void
freefilterlist(size_t nfilters, NC_H5_Filterspec** filters)
{
    size_t i;
    if(filters != NULL) {
        for(i=0;i<nfilters;i++)
            ncaux_h5filterspec_free(filters[i]);
        nullfree(filters);
    }
}

#endif