netcdf-c/include/nc4internal.h


/* Copyright 2018-2018 University Corporation for Atmospheric
Research/Unidata. */
/**
* @file
* @internal This header file contains macros, types and prototypes
* used to build and manipulate the netCDF metadata model.
*
* @author Ed Hartnett, Dennis Heimbigner, Ward Fisher
*/
#ifndef _NC4INTERNAL_
#define _NC4INTERNAL_
#include "netcdf.h"
#include "config.h"
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
#include "nc_logging.h"
#include "ncindex.h"
#include "nc_provenance.h"
#include "nchashmap.h"
#include "netcdf_f.h"
#include "netcdf_mem.h"
#include "netcdf_filter.h"
#ifdef USE_PARALLEL
#include "netcdf_par.h"
#endif /* USE_PARALLEL */
/* Always needed */
#include "nc.h"
/** The file ID is stored in the upper 16 bits of ncid. */
#define FILE_ID_MASK (0xffff0000)
/** The group ID is stored in the lower 16 bits of ncid. */
#define GRP_ID_MASK (0x0000ffff)
/** File and group IDs are each 16 bits of the ncid. */
#define ID_SHIFT (16)
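/**
 * Editorial sketch, not library code: the three macros above imply that an
 * ncid packs a 16-bit file ID above a 16-bit group ID. The EX_ and ex_ names
 * below are hypothetical local copies and helpers, introduced only so the
 * example compiles on its own; the library's own packing code may differ.
 *
```c
#include <assert.h>

// Local copies of the macros above (assumed values, same as the header).
#define EX_FILE_ID_MASK (0xffff0000)
#define EX_GRP_ID_MASK (0x0000ffff)
#define EX_ID_SHIFT (16)

// Combine a file ID and a group ID into one ncid.
// The file ID is limited to 15 bits here so the result always fits in an int.
static int ex_make_ncid(unsigned file_id, unsigned grp_id)
{
    assert(file_id <= 0x7fffU && grp_id <= 0xffffU);
    return (int)((file_id << EX_ID_SHIFT) | grp_id);
}

// Extract the file ID from the upper 16 bits.
static unsigned ex_file_id(int ncid)
{
    return ((unsigned)ncid & EX_FILE_ID_MASK) >> EX_ID_SHIFT;
}

// Extract the group ID from the lower 16 bits.
static unsigned ex_grp_id(int ncid)
{
    return (unsigned)ncid & EX_GRP_ID_MASK;
}
```
 *
 * Under these assumptions, ex_make_ncid(1, 3) gives 0x10003, and the two
 * extractors recover 1 and 3 from that value.
 */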
/* typedef enum {GET, PUT} NC_PG_T; */
/** These are the different objects that can be in our hash-lists. */
typedef enum {NCNAT, NCVAR, NCDIM, NCATT, NCTYP, NCFLD, NCGRP, NCFIL} NC_SORT;
/** The netCDF V2 error code. */
#define NC_V2_ERR (-1)
/** The name of the root group. */
#define NC_GROUP_NAME "/"
/** One mega-byte. */
#define MEGABYTE 1048576
/** The HDF5 ID for the szip filter. */
#define HDF5_FILTER_SZIP 4
#define X_SCHAR_MIN (-128) /**< Minimum signed char value. */
#define X_SCHAR_MAX 127 /**< Maximum signed char value. */
#define X_UCHAR_MAX 255U /**< Maximum unsigned char value. */
#define X_SHORT_MIN (-32768) /**< Minimum short value. */
#define X_SHRT_MIN X_SHORT_MIN /**< This alias is compatible with limits.h. */
#define X_SHORT_MAX 32767 /**< Maximum short value. */
#define X_SHRT_MAX X_SHORT_MAX /**< This alias is compatible with limits.h. */
#define X_USHORT_MAX 65535U /**< Maximum unsigned short value. */
#define X_USHRT_MAX X_USHORT_MAX /**< This alias is compatible with limits.h. */
#define X_INT_MIN (-2147483647-1) /**< Minimum int value. */
#define X_INT_MAX 2147483647 /**< Maximum int value. */
#define X_LONG_MIN X_INT_MIN /**< Minimum long value. */
#define X_LONG_MAX X_INT_MAX /**< Maximum long value. */
#define X_UINT_MAX 4294967295U /**< Maximum unsigned int value. */
#define X_INT64_MIN (-9223372036854775807LL-1LL) /**< Minimum int64 value. */
#define X_INT64_MAX 9223372036854775807LL /**< Maximum int64 value. */
#define X_UINT64_MAX 18446744073709551615ULL /**< Maximum unsigned int64 value. */
#ifdef _WIN32 /* Windows, of course, has to be a *little* different. */
#define X_FLOAT_MAX 3.402823466e+38f
#else
#define X_FLOAT_MAX 3.40282347e+38f /**< Maximum float value. */
#endif /* _WIN32 */
#define X_FLOAT_MIN (-X_FLOAT_MAX) /**< Minimum float value. */
#define X_DOUBLE_MAX 1.7976931348623157e+308 /**< Maximum double value. */
#define X_DOUBLE_MIN (-X_DOUBLE_MAX) /**< Minimum double value. */
/** This is the number of netCDF atomic types. */
#define NUM_ATOMIC_TYPES (NC_MAX_ATOMIC_TYPE + 1)
/** Number of parameters needed for ZLIB filter. */
#define CD_NELEMS_ZLIB 1
/** Get a pointer to the NC_FILE_INFO_T from dispatchdata field. */
#define NC4_DATA(nc) ((NC_FILE_INFO_T *)(nc)->dispatchdata)
/** Set a pointer to the NC_FILE_INFO_T in the dispatchdata field. */
#define NC4_DATA_SET(nc,data) ((nc)->dispatchdata = (void *)(data))
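/**
 * Editorial sketch, not library code: the two macros above store a typed
 * pointer in an untyped dispatchdata slot and cast it back on access. The
 * ex_ types and EX_ macros below are hypothetical stand-ins for NC and
 * NC_FILE_INFO_T, reduced to one field each so the example is self-contained.
 *
```c
#include <assert.h>
#include <stddef.h>

// Stand-in for the per-file metadata struct (hypothetical, one field only).
typedef struct ex_file_info {
    int next_dimid;
} ex_file_info;

// Stand-in for the generic handle that owns an untyped data slot.
typedef struct ex_nc {
    int ext_ncid;
    void *dispatchdata;
} ex_nc;

// Same shape as NC4_DATA / NC4_DATA_SET, applied to the stand-in types.
#define EX_DATA(nc) ((ex_file_info *)(nc)->dispatchdata)
#define EX_DATA_SET(nc,data) ((nc)->dispatchdata = (void *)(data))

// Attach info to the handle and return the typed view of what was stored.
static ex_file_info *ex_attach(ex_nc *nc, ex_file_info *info)
{
    assert(nc != NULL);
    EX_DATA_SET(nc, info);
    return EX_DATA(nc);
}

// Hand out the next dimension ID through the typed view.
static int ex_next_dimid(ex_nc *nc)
{
    assert(nc != NULL && nc->dispatchdata != NULL);
    return EX_DATA(nc)->next_dimid++;
}

// Example objects used to exercise the helpers.
static ex_file_info ex_info = {0};
static ex_nc ex_handle = {0, NULL};
```
 *
 * The cast is only safe when the slot really holds the expected type, which
 * is why this sketch keeps the set and get operations next to each other.
 */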
/* Reserved attribute flags: must be powers of 2. */
/** Hidden attributes; immutable and unreadable thru API. */
#define HIDDENATTRFLAG 1
/** Readonly attributes; readable, but immutable thru the API. */
#define READONLYFLAG 2
/** Subset of readonly flags; readable by name only thru the API. */
#define NAMEONLYFLAG 4
/** Subset of readonly flags; Value is actually in file. */
#define MATERIALIZEDFLAG 8
/** Per-variable attribute, as opposed to global */
#define VARFLAG 16
/** Boolean type, to make the code easier to read. */
typedef enum {NC_FALSE = 0, NC_TRUE = 1} nc_bool_t;
/* Forward declarations. */
struct NC_GRP_INFO;
struct NC_TYPE_INFO;
/**
* This struct provides indexed access to metadata objects. See the
* document docs/indexing.dox for detailed information.
*
* Basically it provides a common header and uses NCindex instances
* instead of linked lists.
*
* WARNING: ALL OBJECTS THAT CAN BE INSERTED INTO AN NCindex MUST HAVE
* AN INSTANCE of NC_OBJ AS THE FIRST FIELD.
*/
typedef struct NC_OBJ
{
NC_SORT sort; /**< Type of object. */
char* name; /**< Name, assumed to be null terminated. */
size_t id; /**< This object's ID. */
} NC_OBJ;
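/**
 * Editorial sketch, not library code: the warning above relies on the C rule
 * that a pointer to a struct can be converted to a pointer to its first
 * member. With a common header placed first, generic code can read the sort,
 * name and ID of any object without knowing its full type. All ex_ names are
 * hypothetical and only mirror the layout of NC_OBJ.
 *
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {EX_NCVAR, EX_NCDIM} ex_sort;

// Common header, mirroring the three fields of NC_OBJ.
typedef struct ex_obj {
    ex_sort sort;
    const char *name;
    size_t id;
} ex_obj;

// Two object kinds; each puts the header first.
typedef struct ex_dim {
    ex_obj hdr;
    size_t len;
} ex_dim;

typedef struct ex_var {
    ex_obj hdr;
    size_t ndims;
} ex_var;

// View any object that starts with ex_obj through its header.
static const ex_obj *ex_header(const void *object)
{
    assert(object != NULL);
    return (const ex_obj *)object;
}

// Compare an object's name without knowing its concrete type.
static int ex_has_name(const void *object, const char *name)
{
    return strcmp(ex_header(object)->name, name) == 0;
}

// Example objects used to exercise the helpers.
static ex_dim ex_time_dim = {{EX_NCDIM, "time", 0}, 10};
static ex_var ex_temp_var = {{EX_NCVAR, "temperature", 3}, 1};
```
 *
 * If the header were not the first field, the cast in ex_header would read
 * unrelated bytes, which is the failure the warning is guarding against.
 */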
/**
* This struct holds information about reserved attributes. These
* attributes cannot be created or read by the user (through the
* netCDF API). */
typedef struct NC_reservedatt
{
const char *name; /**< Name of the reserved attribute. */
int flags; /**< Flags that control handling of reserved attribute. */
} NC_reservedatt;
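/**
 * Editorial sketch, not library code: a reserved-attribute table pairs each
 * name with a bit mask, so one lookup answers several policy questions. The
 * attribute names, the table contents and the ex_ helpers are hypothetical;
 * combining the readonly and name-only bits is an assumption drawn from the
 * "subset of readonly flags" wording above.
 *
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Local copies of three of the flag values defined above.
#define EX_HIDDENATTRFLAG 1
#define EX_READONLYFLAG 2
#define EX_NAMEONLYFLAG 4

// Same shape as NC_reservedatt.
typedef struct ex_reservedatt {
    const char *name;
    int flags;
} ex_reservedatt;

// Illustrative table; these names are made up for the example.
static const ex_reservedatt ex_reserved[] = {
    {"_ExampleHidden", EX_HIDDENATTRFLAG},
    {"_ExampleReadonly", EX_READONLYFLAG},
    {"_ExampleNameOnly", EX_READONLYFLAG | EX_NAMEONLYFLAG},
};

// Linear search by name; returns NULL for ordinary attributes.
static const ex_reservedatt *ex_find_reserved(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof(ex_reserved) / sizeof(ex_reserved[0]); i++)
        if (strcmp(ex_reserved[i].name, name) == 0)
            return &ex_reserved[i];
    return NULL;
}

// An attribute is user-writable unless it is hidden or readonly.
static int ex_is_writable(const char *name)
{
    const ex_reservedatt *ra = ex_find_reserved(name);
    if (ra == NULL)
        return 1;
    return (ra->flags & (EX_HIDDENATTRFLAG | EX_READONLYFLAG)) == 0;
}
```
 *
 * Because each flag is a distinct power of 2, the flags can be OR-ed together
 * in the table and tested independently with a bitwise AND.
 */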
/** This is a struct to handle the dimension metadata. */
typedef struct NC_DIM_INFO
{
NC_OBJ hdr; /**< The hdr contains the name and ID. */
struct NC_GRP_INFO *container; /**< Pointer to containing group. */
size_t len; /**< Length of this dimension. */
nc_bool_t unlimited; /**< True if the dimension is unlimited */
nc_bool_t extended; /**< True if the dimension needs to be extended. */
nc_bool_t too_long; /**< True if len is too big to fit in local size_t. */
void *format_dim_info; /**< Pointer to format-specific dim info. */
struct NC_VAR_INFO *coord_var; /**< The coord var, if it exists. */
} NC_DIM_INFO_T;
/** This is a struct to handle the attribute metadata. */
typedef struct NC_ATT_INFO
{
NC_OBJ hdr; /**< The hdr contains the name and ID. */
struct NC_OBJ *container; /**< Pointer to containing group|var. */
int len; /**< Length of attribute data. */
nc_bool_t dirty; /**< True if attribute modified. */
nc_bool_t created; /**< True if attribute already created. */
nc_type nc_typeid; /**< NetCDF type of attribute's data. */
void *format_att_info; /**< Pointer to format-specific att info. */
void *data; /**< The attribute data. */
nc_vlen_t *vldata; /**< VLEN data (only used for vlen types). */
char **stdata; /**< String data (only for string type). */
} NC_ATT_INFO_T;
/** This is a struct to handle the var metadata. */
typedef struct NC_VAR_INFO
{
NC_OBJ hdr; /**< The hdr contains the name and ID. */
char *alt_name; /**< Used if name in dispatcher must be different from hdr.name. */
struct NC_GRP_INFO *container; /**< Pointer to containing group. */
size_t ndims; /**< Number of dims. */
int *dimids; /**< Dim IDs. */
NC_DIM_INFO_T **dim; /**< Pointer to array of NC_DIM_INFO_T. */
nc_bool_t is_new_var; /**< True if variable is newly created. */
nc_bool_t was_coord_var; /**< True if variable was a coordinate var, but either the dim or var has been renamed. */
nc_bool_t became_coord_var; /**< True if variable _became_ a coordinate var, because either the dim or var has been renamed. */
nc_bool_t fill_val_changed; /**< True if variable's fill value changes after it has been created. */
nc_bool_t attr_dirty; /**< True if variable's attributes are dirty and should be rewritten. */
nc_bool_t created; /**< Variable has already been created (_not_ that it was just created). */
nc_bool_t written_to; /**< True if variable has data written to it. */
struct NC_TYPE_INFO *type_info; /**< Contains info about the variable type. */
int atts_read; /**< If true, the atts have been read. */
nc_bool_t meta_read; /**< True if this var's metadata has been completely read. */
nc_bool_t coords_read; /**< True if this var has hidden coordinates att, and it has been read. */
NCindex *att; /**< List of NC_ATT_INFO_T. */
nc_bool_t no_fill; /**< True if no fill value is defined for var. */
void *fill_value; /**< Pointer to fill value, or NULL. */
size_t *chunksizes; /**< For chunked storage, an array (size ndims) of chunksizes. */
int storage; /**< Storage of this var, compact, contiguous, or chunked. */
int endianness; /**< What endianness for the var? */
int parallel_access; /**< Type of parallel access for I/O on variable (collective or independent). */
nc_bool_t shuffle; /**< True if var has shuffle filter applied. */
nc_bool_t fletcher32; /**< True if var has fletcher32 filter applied. */
size_t chunk_cache_size; /**< Size in bytes of the var chunk cache. */
size_t chunk_cache_nelems; /**< Number of slots in var chunk cache. */
float chunk_cache_preemption; /**< Chunk cache preemption policy. */
int quantize_mode; /**< Quantize mode. NC_NOQUANTIZE is 0, and means no quantization. */
int nsd; /**< Number of significant digits if quantization is used, 0 if not. */
void *format_var_info; /**< Pointer to any binary format info. */
void* filters; /**< Record of the list of filters to be applied to var data; format dependent */
} NC_VAR_INFO_T;
/** This is a struct to handle the field metadata from a user-defined
* type. */
typedef struct NC_FIELD_INFO
{
NC_OBJ hdr; /**< The hdr contains the name and ID. */
nc_type nc_typeid; /**< The type of this field. */
size_t offset; /**< Offset in bytes of field. */
int ndims; /**< Number of dims. */
int *dim_size; /**< Dim sizes. */
void *format_field_info; /**< Pointer to any binary format info for field. */
} NC_FIELD_INFO_T;
/** This is a struct to handle metadata for a user-defined enum
* type. */
typedef struct NC_ENUM_MEMBER_INFO
{
char *name; /**< Name of member. */
void *value; /**< Value of member. */
} NC_ENUM_MEMBER_INFO_T;
/** This is a struct to handle metadata for a user-defined type. */
typedef struct NC_TYPE_INFO
{
NC_OBJ hdr; /**< The hdr contains the name and ID. */
struct NC_GRP_INFO *container; /**< Containing group */
unsigned rc; /**< Ref. count of objects using this type */
int endianness; /**< What endianness for the type? */
size_t size; /**< Size of the type in memory, in bytes */
nc_bool_t committed; /**< True when datatype is committed in the file */
nc_type nc_type_class; /**< NC_VLEN, NC_COMPOUND, NC_OPAQUE, NC_ENUM, NC_INT, NC_FLOAT, or NC_STRING. */
void *format_type_info; /**< HDF5-specific type info. */
/** Information for each type or class */
union {
struct {
NClist* enum_member; /**< <! NClist<NC_ENUM_MEMBER_INFO_T*> */
nc_type base_nc_typeid; /**< Typeid of the base type. */
} e; /**< Enum */
struct Fields {
NClist* field; /**< <! NClist<NC_FIELD_INFO_T*> */
} c; /**< Compound */
struct {
nc_type base_nc_typeid; /**< Typeid of the base type. */
} v; /**< Variable-length. */
} u; /**< Union of structs, for each type/class. */
} NC_TYPE_INFO_T;
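/**
 * Editorial sketch, not library code: the union above is a tagged union,
 * where the class field decides which member is meaningful. The ex_ names
 * below are hypothetical and the members are reduced to plain counts and
 * type IDs; the point is only that readers must check the class before
 * touching a union member.
 *
```c
#include <assert.h>
#include <stddef.h>

typedef enum {EX_ENUM, EX_COMPOUND, EX_VLEN} ex_class;

// Reduced analogue of NC_TYPE_INFO_T: a class tag plus per-class data.
typedef struct ex_type {
    ex_class type_class;
    union {
        struct { size_t nmembers; int base_typeid; } e; // enum
        struct { size_t nfields; } c;                   // compound
        struct { int base_typeid; } v;                  // variable-length
    } u;
} ex_type;

// Build a variable-length type over the given base type ID.
static ex_type ex_make_vlen(int base_typeid)
{
    ex_type t;
    t.type_class = EX_VLEN;
    t.u.v.base_typeid = base_typeid;
    return t;
}

// Build a compound type with the given number of fields.
static ex_type ex_make_compound(size_t nfields)
{
    ex_type t;
    t.type_class = EX_COMPOUND;
    t.u.c.nfields = nfields;
    return t;
}

// Return the base type ID, or -1 for classes that have none.
static int ex_base_typeid(ex_type t)
{
    switch (t.type_class) {
    case EX_ENUM:
        return t.u.e.base_typeid;
    case EX_VLEN:
        return t.u.v.base_typeid;
    default:
        return -1;
    }
}
```
 *
 * Here ex_base_typeid(ex_make_vlen(4)) returns 4, while a compound type has
 * no base type and yields -1.
 */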
/** This holds information for one group. Groups reproduce with
* parthenogenesis. */
typedef struct NC_GRP_INFO
{
NC_OBJ hdr; /**< The hdr contains the name and ID. */
void *format_grp_info; /**< Pointer to binary format info for group. */
struct NC_FILE_INFO *nc4_info; /**< Pointer to containing NC_FILE_INFO_T. */
struct NC_GRP_INFO *parent; /**< Pointer to parent group. */
int atts_read; /**< True if atts have been read for this group. */
NCindex* children; /**< NCindex<struct NC_GRP_INFO*> */
NCindex* dim; /**< NCindex<NC_DIM_INFO_T> * */
NCindex* att; /**< NCindex<NC_ATT_INFO_T> * */
NCindex* type; /**< NCindex<NC_TYPE_INFO_T> * */
/* Note that this is the list of vars with position == varid */
NCindex* vars; /**< NCindex<NC_VAR_INFO_T> * */
} NC_GRP_INFO_T;
/* These constants apply to the cmode parameter in the
* NC_FILE_INFO_T defined below. */
/* Make sure they do not conflict with defined flags in netcdf.h */
#define NC_CREAT 0x10002 /**< in create phase, cleared by ncendef */
#define NC_INDEF 0x10008 /**< in define mode, cleared by ncendef */
#define NC_NSYNC 0x10010 /**< synchronise numrecs on change */
#define NC_HSYNC 0x10020 /**< synchronise whole header on change */
#define NC_NDIRTY 0x10040 /**< numrecs has changed */
#define NC_HDIRTY 0x10080 /**< header info has changed */
/** This is the metadata we need to keep track of for each
* netcdf-4/HDF5 file. */
typedef struct NC_FILE_INFO
{
NC_OBJ hdr;
NC *controller; /**< Pointer to containing NC. */
#ifdef USE_PARALLEL4
MPI_Comm comm; /**< Copy of MPI Communicator used to open the file. */
MPI_Info info; /**< Copy of MPI Information Object used to open the file. */
#endif
int flags; /**< Flags used to open the file. */
int cmode; /**< Create mode used to create the file. */
nc_bool_t parallel; /**< True if file is open for parallel access */
nc_bool_t redef; /**< True if redefining an existing file */
nc_bool_t no_attr_create_order; /**< True if the creation order tracking of attributes is disabled (netcdf-4 only) */
int fill_mode; /**< Fill mode for vars - Unused internally currently */
nc_bool_t no_write; /**< true if nc_open has mode NC_NOWRITE. */
NC_GRP_INFO_T *root_grp; /**< Pointer to root group. */
short next_nc_grpid; /**< Next available group ID. */
int next_typeid; /**< Next available type ID. */
int next_dimid; /**< Next available dim ID. */
/* Provide convenience vectors indexed by the object id. This
allows for direct conversion of e.g. an nc_type to the
corresponding NC_TYPE_INFO_T object. */
NClist *alldims; /**< List of all dims. */
NClist *alltypes; /**< List of all types. */
NClist *allgroups; /**< List of all groups, including root group. */
void *format_file_info; /**< Pointer to binary format info for file. */
NC4_Provenance provenance; /**< File provenance info. */
struct NC4_Memio
{
NC_memio memio; /**< What we sent to image_init and what comes back. */
int locked; /**< Do not copy and do not free. */
int persist; /**< Should file be persisted out on close? */
int inmemory; /**< NC_INMEMORY flag was set. */
int diskless; /**< NC_DISKLESS flag was set => inmemory. */
int created; /**< 1 => create, 0 => open. */
unsigned int imageflags; /**< for H5LTopen_file_image. */
size_t initialsize; /**< Initial size. */
void *udata; /**< Extra memory allocated in NC4_image_init. */
} mem;
} NC_FILE_INFO_T;
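/* Illustrative sketch only (not part of the API): how the convenience
 * vectors in NC_FILE_INFO_T might be used to map an id directly to its
 * metadata object. It assumes nclistget() from nclist.h; "h5" and
 * "xtype" are hypothetical caller-held values (a valid file pointer
 * and a user-defined type id).
 *
 *     NC_TYPE_INFO_T *type = nclistget(h5->alltypes, (size_t)xtype);
 *     if (type == NULL)
 *         return NC_EBADTYPE;   // nothing found under that id
 */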
/** Variable Length Datatype struct in memory. Must be identical to
* HDF5 hvl_t. (This is only used for VL sequences, not VL strings,
* which are stored in char *'s) */
typedef struct
{
size_t len; /**< Length of VL data (in base type units) */
void *p; /**< Pointer to VL data */
} nc_hvl_t;
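/* Illustrative sketch only (not part of the library): filling an
 * nc_hvl_t that describes three ints. The variable names are
 * hypothetical, and the ownership note is an assumption rather than a
 * documented contract.
 *
 *     nc_hvl_t vl;
 *     int *vals = malloc(3 * sizeof(int));
 *     if (vals == NULL)
 *         return NC_ENOMEM;
 *     vals[0] = 1; vals[1] = 2; vals[2] = 3;
 *     vl.len = 3;     // element count in base type units, not bytes
 *     vl.p = vals;    // assumed: the caller still owns and frees vals
 */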
/* Misc functions */
extern int NC4_inq_atomic_type(nc_type typeid1, char *name, size_t *size);
extern int NC4_lookup_atomic_type(const char *name, nc_type* idp, size_t *sizep);
/* These functions convert between netcdf and HDF5 types. */
extern int nc4_get_typelen_mem(NC_FILE_INFO_T *h5, nc_type xtype, size_t *len);
extern int nc4_convert_type(const void *src, void *dest, const nc_type src_type,
const nc_type dest_type, const size_t len, int *range_error,
const void *fill_value, int strict_nc3, int quantize_mode,
int nsd);
/* These functions do HDF5 things. */
extern int nc4_reopen_dataset(NC_GRP_INFO_T *grp, NC_VAR_INFO_T *var);
extern int nc4_read_atts(NC_GRP_INFO_T *grp, NC_VAR_INFO_T *var);
/* Find items in the in-memory lists of metadata. */
extern int nc4_find_nc_grp_h5(int ncid, NC **nc, NC_GRP_INFO_T **grp,
NC_FILE_INFO_T **h5);
extern int nc4_find_grp_h5(int ncid, NC_GRP_INFO_T **grp, NC_FILE_INFO_T **h5);
extern int nc4_find_nc4_grp(int ncid, NC_GRP_INFO_T **grp);
extern int nc4_find_dim(NC_GRP_INFO_T *grp, int dimid, NC_DIM_INFO_T **dim,
NC_GRP_INFO_T **dim_grp);
extern int nc4_find_var(NC_GRP_INFO_T *grp, const char *name, NC_VAR_INFO_T **var);
extern int nc4_find_dim_len(NC_GRP_INFO_T *grp, int dimid, size_t **len);
extern int nc4_find_type(const NC_FILE_INFO_T *h5, int typeid1, NC_TYPE_INFO_T **type);
extern NC_TYPE_INFO_T *nc4_rec_find_named_type(NC_GRP_INFO_T *start_grp, char *name);
extern NC_TYPE_INFO_T *nc4_rec_find_equal_type(NC_GRP_INFO_T *start_grp, int ncid1,
NC_TYPE_INFO_T *type);
extern int nc4_find_nc_att(int ncid, int varid, const char *name, int attnum,
NC_ATT_INFO_T **att);
extern int nc4_find_grp_h5_var(int ncid, int varid, NC_FILE_INFO_T **h5,
NC_GRP_INFO_T **grp, NC_VAR_INFO_T **var);
extern int nc4_find_grp_att(NC_GRP_INFO_T *grp, int varid, const char *name,
int attnum, NC_ATT_INFO_T **att);
extern int nc4_get_typeclass(const NC_FILE_INFO_T *h5, nc_type xtype,
int *type_class);
/* Free various types */
extern int nc4_type_free(NC_TYPE_INFO_T *type);
/* These list functions add and delete vars, atts. */
extern int nc4_nc4f_list_add(NC *nc, const char *path, int mode);
extern int nc4_nc4f_list_del(NC_FILE_INFO_T *h5);
extern int nc4_file_list_add(int ncid, const char *path, int mode,
void **dispatchdata);
extern int nc4_file_list_get(int ncid, char **path, int *mode,
void **dispatchdata);
extern int nc4_file_list_del(int ncid);
extern int nc4_file_change_ncid(int ncid, unsigned short new_ncid_index);
extern int nc4_var_list_add(NC_GRP_INFO_T* grp, const char* name, int ndims,
NC_VAR_INFO_T **var);
extern int nc4_var_list_add2(NC_GRP_INFO_T* grp, const char* name,
NC_VAR_INFO_T **var);
extern int nc4_var_set_ndims(NC_VAR_INFO_T *var, int ndims);
extern int nc4_var_list_del(NC_GRP_INFO_T *grp, NC_VAR_INFO_T *var);
extern int nc4_dim_list_add(NC_GRP_INFO_T *grp, const char *name, size_t len,
int assignedid, NC_DIM_INFO_T **dim);
extern int nc4_dim_list_del(NC_GRP_INFO_T *grp, NC_DIM_INFO_T *dim);
extern int nc4_type_new(size_t size, const char *name, int assignedid,
NC_TYPE_INFO_T **type);
extern int nc4_type_list_add(NC_GRP_INFO_T *grp, size_t size, const char *name,
NC_TYPE_INFO_T **type);
extern int nc4_type_list_del(NC_GRP_INFO_T *grp, NC_TYPE_INFO_T *type);
extern int nc4_type_free(NC_TYPE_INFO_T *type);
extern int nc4_field_list_add(NC_TYPE_INFO_T* parent, const char *name,
size_t offset, nc_type xtype, int ndims,
const int *dim_sizesp);
extern int nc4_att_list_add(NCindex *list, const char *name, NC_ATT_INFO_T **att);
extern int nc4_att_list_del(NCindex *list, NC_ATT_INFO_T *att);
extern int nc4_grp_list_add(NC_FILE_INFO_T *h5, NC_GRP_INFO_T *parent, char *name,
NC_GRP_INFO_T **grp);
extern int nc4_build_root_grp(NC_FILE_INFO_T *h5);
extern int nc4_rec_grp_del(NC_GRP_INFO_T *grp);
extern int nc4_enum_member_add(NC_TYPE_INFO_T *type, size_t size, const char *name,
const void *value);
extern int nc4_att_free(NC_ATT_INFO_T *att);
/* Check and normalize names. */
extern int NC_check_name(const char *name);
extern int nc4_check_name(const char *name, char *norm_name);
extern int nc4_normalize_name(const char *name, char *norm_name);
extern int nc4_check_dup_name(NC_GRP_INFO_T *grp, char *norm_name);
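/* Illustrative sketch only: validating and normalizing a name, then
 * checking it for duplicates in a group. The NC_MAX_NAME + 1 buffer
 * size is an assumption based on the limit in netcdf.h; "name" and
 * "grp" are hypothetical caller-supplied values.
 *
 *     char norm_name[NC_MAX_NAME + 1];
 *     int stat;
 *     if ((stat = nc4_check_name(name, norm_name)))
 *         return stat;
 *     if ((stat = nc4_check_dup_name(grp, norm_name)))
 *         return stat;
 */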
/* Get the fill value for a var. */
extern int nc4_get_fill_value(NC_FILE_INFO_T *h5, NC_VAR_INFO_T *var, void **fillp);
/* Find default fill value. */
extern int nc4_get_default_fill_value(nc_type typecode, void *fill_value);
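/* Illustrative sketch only: fetching the default fill value for
 * NC_INT. The destination must be large enough for the requested
 * type; the variable names are hypothetical.
 *
 *     int fill;
 *     int stat = nc4_get_default_fill_value(NC_INT, &fill);
 *     if (stat != NC_NOERR)
 *         return stat;
 */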
/* Get an att given pointers to file, group, and perhaps var info. */
extern int nc4_get_att_ptrs(NC_FILE_INFO_T *h5, NC_GRP_INFO_T *grp, NC_VAR_INFO_T *var,
const char *name, nc_type *xtype, nc_type mem_type,
size_t *lenp, int *attnum, void *data);
/* Close the file. */
extern int nc4_close_netcdf4_file(NC_FILE_INFO_T *h5, int abort, NC_memio *memio);
/* Compute default chunksizes */
extern int nc4_find_default_chunksizes2(NC_GRP_INFO_T *grp, NC_VAR_INFO_T *var);
extern int nc4_check_chunksizes(NC_GRP_INFO_T* grp, NC_VAR_INFO_T* var, const size_t* chunksizes);
/* HDF5 initialization/finalization */
extern int nc4_hdf5_initialized;
extern void nc4_hdf5_initialize(void);
extern void nc4_hdf5_finalize(void);
/* This is only included if --enable-logging is used for configure; it
prints info about the metadata to stderr. */
#ifdef LOGGING
extern int log_metadata_nc(NC_FILE_INFO_T *h5);
#endif
/** @internal Names of atomic types. */
extern const char* nc4_atomic_name[NUM_ATOMIC_TYPES];
/* Binary searcher for reserved attributes */
extern const NC_reservedatt* NC_findreserved(const char* name);
Add support for multiple filters per variable. re: https://github.com/Unidata/netcdf-c/issues/1584 Support has been added for multiple filters per variable. This affects a number of components in netcdf. The new APIs are documented in NUG/filters.md. The primary changes are: * A set of new functions are provided (see __include/netcdf_filter.h__). - Obtain a list of the filters associated with a variable - Obtain the parameters for a specific filter. * The existing __nc_inq_var_filter__ function now returns info about the first defined filter. * The utilities (ncgen, ncdump, and nccopy) now support an extended format for specifying a sequence of filters. The general form is __<filter>|<filter>..._. * The ncdump **_Filter** attribute now dumps a list of all the filters associated with a variable using the above new format. * Filter specifications can now use a filter name instead of number for filters known to the netcdf library, which in turn is taken from the HDF5 filter registration page. * New errors are defined: NC_EFILTER and NC_ENOFILTER. The latter is returned if an attempt is made to access an unknown filter. * Internally, the dispatch table has been extended to add a function to handle all of the filter functions. * New, filter-related, tests were added to nc_test4. * A new plugin was added to the plugins directory to help with testing. Notes: 1. The shuffle and fletcher32 filters are not part of the multifilter system. Misc. changes: 1. A debug module was added to libhdf5 to help catch error locations.
2020-02-17 03:59:33 +08:00
This PR adds EXPERIMENTAL support for accessing data in the cloud using a variant of the Zarr protocol and storage format. This enhancement is generically referred to as "NCZarr".

The data model supported by NCZarr is netcdf-4 minus the user-defined types and the String type. In this sense it is similar to the CDF-5 data model.

More detailed information about enabling and using NCZarr is described in the document NUG/nczarr.md and in a [Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).

WARNING: this code has had limited testing, so do not use this version for production work. Also, performance improvements are ongoing.

Note especially the following platform matrix of successful tests:

Platform | Build System | S3 support
------------------------------------
Linux+gcc | Automake | yes
Linux+gcc | CMake | yes
Visual Studio | CMake | no

Additionally, and as a consequence of the addition of NCZarr, major changes have been made to the Filter API. NOTE: NCZarr does not yet support filters, but these changes are enablers for that support in the future. Note that it is possible (probable?) that there will be some accidental reversions if the changes here did not correctly mimic the existing filter testing.

In any case, previously filter ids and parameters were of type unsigned int. In order to support the more general zarr filter model, this was all converted to char*. The old HDF5-specific, unsigned int operations are still supported, but they are wrappers around the new, char*-based nc_filterx_XXX functions.

This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h.
2. Some filterx utilities have been moved to libdispatch/daux.c.
3. A new entry, "filter_actions", was added to the NCDispatch table and the version bumped.
4. An overly complex set of structs was created to support funnelling all of the filterx operations through a single dispatch "filter_actions" entry.
5. Moved common code from libhdf5 to libsrc4 so that it is accessible to nczarr.

Changes directly related to Zarr:
1. Modified CMakeLists.txt and configure.ac to support both C and C++; this is needed for S3 support via the aws-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to support zarr and to regularize the structure of the fragments section of a URL.

Changes not directly related to Zarr:
1. Make client-side filter registration conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed unique: e.g. NC_CREAT, NC_INDEF, etc.
3. Clean up include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio, including:
   * Better testing under Windows for dirent.h, opendir, and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add an option to enable them.
7. Make part of the nc_test/test_byterange.sh test contingent on remotetest.unidata.ucar.edu being accessible.

Changes Left TO-DO:
1. Fix the provenance code; it is too HDF5 specific.
2020-06-29 08:02:47 +08:00
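The commit message above says that the unsigned int filter operations became wrappers around char*-based functions. A minimal sketch of the conversion step such a wrapper would need is given here. The names `uint_to_text`, `params_to_text`, and `free_text_params` are hypothetical and are not taken from the netcdf-c sources; the real nc_filterx_XXX signatures are not shown in this file, so they are not reproduced.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a netcdf-c function): render one unsigned value
 * as a newly allocated decimal string. The caller frees the result. */
static char *uint_to_text(unsigned int value)
{
    char buf[32];
    char *out;
    int n = snprintf(buf, sizeof(buf), "%u", value);
    if (n < 0 || (size_t)n >= sizeof(buf))
        return NULL;
    out = malloc((size_t)n + 1);
    if (out != NULL)
        memcpy(out, buf, (size_t)n + 1); /* copy digits plus the terminator */
    return out;
}

/* Hypothetical helper: release an array produced by params_to_text. */
void free_text_params(char **text)
{
    size_t i;
    if (text == NULL)
        return;
    for (i = 0; text[i] != NULL; i++)
        free(text[i]);
    free(text);
}

/* Hypothetical helper: convert HDF5-style unsigned parameters into a
 * NULL-terminated array of strings, the form a char*-based API would take. */
char **params_to_text(size_t nparams, const unsigned int *params)
{
    size_t i;
    char **text = calloc(nparams + 1, sizeof(char *));
    if (text == NULL)
        return NULL;
    for (i = 0; i < nparams; i++) {
        text[i] = uint_to_text(params[i]);
        if (text[i] == NULL) {
            free_text_params(text); /* frees the entries filled so far */
            return NULL;
        }
    }
    return text;
}
```

A wrapper in this style would convert its arguments, call the string-based function, and then release the temporary strings with `free_text_params`.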
/* Generic reserved Attributes */
#define NC_ATT_REFERENCE_LIST "REFERENCE_LIST"
#define NC_ATT_CLASS "CLASS"
#define NC_ATT_DIMENSION_LIST "DIMENSION_LIST"
#define NC_ATT_NAME "NAME"
#define NC_ATT_COORDINATES "_Netcdf4Coordinates" /*see hdf5internal.h:COORDINATES*/
#define NC_ATT_FORMAT "_Format"
#define NC_ATT_DIMID_NAME "_Netcdf4Dimid"
#define NC_ATT_NC3_STRICT_NAME "_nc3_strict"
#define NC_XARRAY_DIMS "_ARRAY_DIMENSIONS"
Add filter support to NCZarr

Filter support has three goals:
1. Use the existing HDF5 filter implementations,
2. Allow filter metadata to be stored in the NumCodecs metadata format used by Zarr,
3. Allow filters to be used even when HDF5 is disabled.

Detailed usage directions are defined in docs/filters.md.

For now, the existing filter API is left in place. So filters are defined using ''nc_def_var_filter'' using the HDF5 style where the id and parameters are unsigned integers.

This is a big change since filters affect many parts of the code.

In the following, the terms "compressor", "filter", and "codec" are generally used synonymously.

### Filter-Related Changes:
* In order to support dynamic loading of shared filter libraries, a new library was added in the libncpoco directory; it helps to isolate dynamic loading across multiple platforms.
* Provide a json parsing library for use by plugins; this is created by merging libdispatch/ncjson.c with include/ncjson.h.
* Add a new _Codecs attribute to allow clients to see what codecs are being used; let ncdump -s print it out.
* Provide special headers to help support compilation of HDF5 filters when HDF5 is not enabled: netcdf_filter_hdf5_build.h and netcdf_filter_build.h.
* Add a number of new tests to test the new nczarr filters.
* Let ncgen parse the _Codecs attribute, although it is ignored.

### Plugin directory changes:
* Add support for the Blosc compressor; this is essential because it is the most common compressor used in Zarr datasets. This also necessitated adding a CMake FindBlosc.cmake file.
* Add NCZarr support for the big-four filters provided by HDF5: shuffle, fletcher32, deflate (zlib), and szip.
* Add a Codec defaulter (see docs/filters.md) for the big four filters.
* Make plugins work with Windows by properly adding __declspec declarations.

### Misc. Non-Filter Changes
* Replace most uses of USE_NETCDF4 (deprecated) with USE_HDF5.
* Improve support for caching.
* More fixes for path conversion code.
* Fix misc. memory leaks.
* Add a new utility, ncdump/ncpathcvt, that does more or less the same thing as cygpath.
* Add a number of new tests to test the non-filter fixes.
* Update the parsers.
* Convert most instances of '#ifdef _MSC_VER' to '#ifdef _WIN32'.
2021-09-03 07:04:26 +08:00
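The second goal in the commit message above, storing filter metadata in the NumCodecs format, can be illustrated for the deflate filter. This is a sketch under stated assumptions, not the library's own translation code: the function `deflate_to_codec_json` and the macro `SKETCH_DEFLATE_ID` are hypothetical names. What is relied on is that HDF5 registers deflate as filter id 1 with a single level parameter, and that NumCodecs describes zlib as a JSON object with an `id` of `"zlib"` and an integer `level`.

```c
#include <stddef.h>
#include <stdio.h>

/* HDF5 registers the deflate (zlib) filter under id 1. */
#define SKETCH_DEFLATE_ID 1u

/* Hypothetical helper (not a netcdf-c function): translate HDF5-style
 * deflate parameters into a NumCodecs-style JSON object written to buf.
 * Returns 0 on success, -1 on unsupported input or a too-small buffer. */
int deflate_to_codec_json(unsigned int id, size_t nparams,
                          const unsigned int *params,
                          char *buf, size_t buflen)
{
    int n;

    /* Deflate takes exactly one parameter: the compression level 0..9. */
    if (id != SKETCH_DEFLATE_ID || nparams != 1 || params == NULL || buf == NULL)
        return -1;
    if (params[0] > 9)
        return -1;

    n = snprintf(buf, buflen, "{\"id\": \"zlib\", \"level\": %u}", params[0]);
    if (n < 0 || (size_t)n >= buflen)
        return -1; /* output was truncated or could not be formatted */
    return 0;
}
```

Other filters would need their own translation, which is the role the commit message assigns to the codec plugins and the Codec defaulter.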
#define NC_ATT_CODECS "_Codecs"
#define NC_NCZARR_ATTR "_NCZARR_ATTR"
#endif /* _NC4INTERNAL_ */
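The reserved attribute names defined in this section lend themselves to a simple membership test. The following is a minimal sketch, assuming only the names visible in this part of the header; `is_reserved_att` and `reserved_names` are hypothetical and are not claimed to match how the library itself looks up reserved attributes. The table is kept in `strcmp` order so that the standard `bsearch` can be used.

```c
#include <stdlib.h>
#include <string.h>

/* String values copied from the NC_ATT_* and related defines in this section.
 * The list must stay sorted in strcmp (byte value) order for bsearch:
 * uppercase letters sort before '_', which sorts before lowercase letters. */
static const char *const reserved_names[] = {
    "CLASS",
    "DIMENSION_LIST",
    "NAME",
    "REFERENCE_LIST",
    "_ARRAY_DIMENSIONS",
    "_Codecs",
    "_Format",
    "_NCZARR_ATTR",
    "_Netcdf4Coordinates",
    "_Netcdf4Dimid",
    "_nc3_strict",
};

/* bsearch comparator: the key is a C string, the element is a pointer to one. */
static int cmp_name(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Hypothetical helper: return 1 if name is one of the reserved names above. */
int is_reserved_att(const char *name)
{
    size_t n = sizeof(reserved_names) / sizeof(reserved_names[0]);
    if (name == NULL)
        return 0;
    return bsearch(name, reserved_names, n, sizeof(reserved_names[0]),
                   cmp_name) != NULL;
}
```

A sorted table with binary search keeps the lookup cheap as the list grows; a linear scan would work equally well for a list this short.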