Commit Graph

169 Commits

Author SHA1 Message Date
Ward Fisher
7a0999ef3f Fix in support of https://github.com/Unidata/netcdf-c/issues/3007 2024-09-30 16:19:38 -06:00
Ward Fisher
a7077c6e6e Add configure option to enable potentially-problematic legacy macro _FillValue 2024-09-19 14:56:35 -05:00
Dennis Heimbigner
f0f0f39950 Cleanup various Zarr-related build issues
# Description
Remove various obsolete build options. Also do some code movement.

## Specific Changes

* The remotetest server is sometimes unstable, so provide a mechanism
  to force disabling calls to remotetest.unidata.ucar.edu.
  This is enabled by adding a repository variable named
  REMOTETESTDOWN with the value "yes".
* Fix CMakeLists.txt to use the uname command as an alternate
  to using the hostname command (which does not work under cygwin).
* Remove the JNA stuff as obsolete
* Remove the ENABLE_CLIENTSIDE_FILTERS options since it has been
  disabled for a while.
* Fix bad option flag in some github action .yml files: change --disable-xml2 to --disable-libxml2
* Collect globalstate definitions into nc4internal.h
* Remove ENABLE_NCZARR_FILTERS_TESTING option as obsolete and replace
  with ENABLE_NCZARR_FILTERS
* Move some dispatcher independent functions from libsrc4/nc4internal.c to libdispatch/ddispatch.c
* As a long term goal, and because it is now the case that --enable-nczarr
    => USE_NETCDF4, make the external options --enable-netcdf-4 and
    --enable-netcdf4 obsolete in favor of --enable-hdf5
    We will do the following for one more release cycle.
        1. Make --enable-netcdf-4 be an alias for --enable-netcdf4.
        2. Make --enable-netcdf4 an alias for --enable-hdf5.
        3. Internally, convert most uses of USE_NETCDF_4 ad USE_NETCDF4 to USE_HDF5
    After the next release, --enable-netcdf-4 and --enable-netcdf4 will
    be removed.
2024-05-15 18:46:25 -06:00
Ward Fisher
c26f7eabf4 Refactor macro _FillValue to NC_FillValue in support of https://github.com/Unidata/netcdf-c/issues/2858 2024-04-24 11:38:07 -06:00
Dennis Heimbigner
fb40a72b45 Improve performance of the nc_reclaim_data and nc_copy_data functions.
re: Issue https://github.com/Unidata/netcdf-c/issues/2685
re: PR https://github.com/Unidata/netcdf-c/pull/2179

As noted in PR https://github.com/Unidata/netcdf-c/pull/2179,
the old code did not allow for reclaiming instances of types,
nor for properly copying them. That PR provided new functions
capable of reclaiming/copying instances of arbitrary types.

However, as noted by Issue https://github.com/Unidata/netcdf-c/issues/2685, using these
most general functions resulted in a significant performance
degradation, even for common cases.

This PR attempts to mitigate the cost of using the general
reclaim/copy functions in two ways.

First, the previous functions operating at the top level by
using ncid and typeid arguments. These functions were augmented
with equivalent versions that used the netcdf-c library internal
data structures to allow direct access to needed information.
These new functions are used internally to the library.

The second mitigation involves optimizing the internal functions
by providing early tests for common cases. This avoids
unnecessary recursive function calls.

The overall result is a significant improvement in speed by a
factor of roughly twenty -- your mileage may vary. These
optimized functions are still not as fast as the original (more
limited) functions, but they are getting close. Additional optimizations are
possible. But the cost is a significant "uglification" of the
code that I deemed a step too far, at least for now.

## Misc. Changes
1. Added a test case to check the proper reclamation/copy of complex types.
2. Found and fixed some places where nc_reclaim/copy should have been used.
3. Replaced, in the netcdf-c library, (almost all) occurrences of nc_reclaim_copy with calls to NC_reclaim/copy. This plus the optimizations is the primary speed-up mechanism.
4. In DAP4, the metadata is held in a substrate in-memory file; this required some changes so that the reclaim/copy code accessed that substrate dispatcher rather than the DAP4 dispatcher.
5. Re-factored and isolated the code that computes if a type is (transitively) variable-sized or not.
6. Clean up the reclamation code in ncgen; adding the use of nc_reclaim exposed some memory problems.
2023-05-20 17:11:25 -06:00
Dennis Heimbigner
65fd9fe1a5 Provide a default enum const when fill value does not match any enum const.
re: https://github.com/Unidata/netcdf-c/issues/982

It is possible to define an enum type that has no enum constant
with value zero. However, HDF5 has a default fill value of zero
that it used to fill all chunks. In the event that this situation
occurs, ncdump, say, will fail because there is no enum const
to print for the value zero.

The solution is to create a special enum constant called "_UNDEFINED"
that has the value zero. It is only used in the case that there is
no constant in the enum that already covers zero.

A test case is added in netcdf-c/ncdump to validate this solution.

Note: the changes occur primarily in libsrc4, so they also work for NCZarr.
2022-07-17 14:32:31 -06:00
Dennis Heimbigner
aabbdbf64c Make public a limited API for programmatic access to internal .rc tables
re: https://github.com/Unidata/netcdf-c/issues/2337
re: https://github.com/Unidata/netcdf-c/issues/2407

Add two functions to netcdf.h to allow programs to get/set
selected entries into the internal .rc tables. This should fix
the above issues by allowing HTTP.CAINFO to be set to the
certificates directory.  Note that the changes should be
performed as early as possible in the program because some of
the .rc table entries may get cached internally and changing the
entry after that caching occurs may have no effect.

The new signatures are as follows:

1. Get the value of a simple .rc entry of the form "key=value".
Note that caller must free the returned value, which might be NULL.
````
char* nc_rc_get(char* const * key);

@param key table entry key
@return value if .rc table has entry of the form key=value
@return NULL if no such entry is found.
````

2. Insert/Overwrite the specified key=value pair in the .rc table.
````
int nc_rc_set(const char* key, const char* value);

@param key table entry key -- may not be NULL
@param value table entry value -- may not be NULL
@return NC_NOERR if no error
@return NC_EINVAL if error
````

Addendum:

re: https://github.com/Unidata/netcdf-c/issues/2407

Modify dhttp.c to use the .rc entry HTTP.CAINFO if defined.
2022-06-17 14:35:12 -06:00
Ward Fisher
0586b64521
Merge pull request #2335 from edwardhartnett/ejh_szip_constants
fixed missing szip constants in netcdf.h
2022-05-17 16:45:47 -06:00
Edward Hartnett
d1cbd60960 fixed missing szip constants in netcdf.h 2022-05-04 09:48:46 -06:00
Edward Hartnett
14e80b4673 fixing doxygen warnings 2022-05-03 09:41:45 -06:00
Ward Fisher
982b258c46 Merge branch 'dimscale_attachement_optional' of https://github.com/gsjaardema/netcdf-c into gh2161.wif 2022-04-19 11:06:34 -06:00
Ward Fisher
2ccdf14697 Merge branch 'csz_bitround' of https://github.com/nco/netcdf-c into gh2232.wif 2022-04-01 10:43:34 -06:00
Charlie Zender
a74d3573e5 First draft of BitRound implementation 2022-02-18 11:00:37 -08:00
Greg Sjaardema
c746c11539
Merge branch 'main' into dimscale_attachement_optional 2022-02-16 11:47:14 -07:00
Dennis Heimbigner
f3e711e2b8 Add support for setting HDF5 alignment property when creating a file
re: https://github.com/Unidata/netcdf-c/issues/2177
re: https://github.com/Unidata/netcdf-c/pull/2178

Provide get/set functions to store global data alignment
information and apply it when a file is created.

The api is as follows:
````
int nc_set_alignment(int threshold, int alignment);
int nc_get_alignment(int* thresholdp, int* alignmentp);
````

If defined, then for every file created opened after the call to
nc_set_alignment, for every new variable added to the file, the
most recently set threshold and alignment values will be applied
to that variable.

The nc_get_alignment function return the last values set by
nc_set_alignment.  If nc_set_alignment has not been called, then
it returns the value 0 for both threshold and alignment.

The alignment parameters are stored in the NCglobalstate object
(see below) for use as needed. Repeated calls to nc_set_alignment
will overwrite any existing values in NCglobalstate.

The alignment parameters are applied in libhdf5/hdf5create.c
and libhdf5/hdf5open.c

The set/get alignment functions are defined in libsrc4/nc4internal.c.

A test program was added as nc_test4/tst_alignment.c.

## Misc. Changes Unrelated to Alignment

* The NCRCglobalstate type was renamed to NCglobalstate to
  indicate that it represented more general global state than
  just .rc data.  It was also moved to nc4internal.h.  This led
  to a large number of small changes: mostly renaming. The
  global state management functions were moved to nc4internal.c.

* The global chunk cache variables have been moved into
  NCglobalstate.  As warranted, other global state will be moved
  as well.

* Some misc. problems with the nczarr performance tests were corrected.
2022-01-29 15:27:52 -07:00
Dennis Heimbigner
89cc20a20d Rename GranularBitGroom to GranularBitRound
As per Charlie Zender's request (https://github.com/Unidata/netcdf-c/pull/2197#issuecomment-1022762863), the GranularBitGroom name is changed to GranularBitRound
with attendant code changes.
2022-01-28 13:04:16 -07:00
Ward Fisher
3980d7616a
Merge pull request #2130 from nco/csz_gbg
Granular BitGroom feature for netcdf-c
2022-01-13 17:35:16 -07:00
Ward Fisher
978a99bbea
Merge branch 'main' into vlenfix.dmh 2022-01-13 16:31:34 -07:00
Dennis Heimbigner
56c549af0f Make sure mode flags are properly defined in netcdf.h
In a number of places in the netcdf-c library, some of the
high order mode flags (the mode argument to nc_open or nc_close)
are being used to save state information. This means that the
description of the defined and open mode flags in netcdf.h
were not accurate.

This PR moves all those hack flags so that the list of mode flags
in netcdf.h is correct.
2022-01-11 19:05:46 -07:00
Dennis Heimbigner
8b9253fef2 Fix various problem around VLEN's
re: https://github.com/Unidata/netcdf-c/issues/541
re: https://github.com/Unidata/netcdf-c/issues/1208
re: https://github.com/Unidata/netcdf-c/issues/2078
re: https://github.com/Unidata/netcdf-c/issues/2041
re: https://github.com/Unidata/netcdf-c/issues/2143

For a long time, there have been known problems with the
management of complex types containing VLENs.  This also
involves the string type because it is stored as a VLEN of
chars.

This PR (mostly) fixes this problem. But note that it adds new
functions to netcdf.h (see below) and this may require bumping
the .so number.  These new functions can be removed, if desired,
in favor of functions in netcdf_aux.h, but netcdf.h seems the
better place for them because they are intended as alternatives
to the nc_free_vlen and nc_free_string functions already in
netcdf.h.

The term complex type refers to any type that directly or
transitively references a VLEN type. So an array of VLENS, a
compound with a VLEN field, and so on.

In order to properly handle instances of these complex types, it
is necessary to have function that can recursively walk
instances of such types to perform various actions on them.  The
term "deep" is also used to mean recursive.

At the moment, the two operations needed by the netcdf library are:
* free'ing an instance of the complex type
* copying an instance of the complex type.

The current library does only shallow free and shallow copy of
complex types. This means that only the top level is properly
free'd or copied, but deep internal blocks in the instance are
not touched.

Note that the term "vector" will be used to mean a contiguous (in
memory) sequence of instances of some type. Given an array with,
say, dimensions 2 X 3 X 4, this will be stored in memory as a
vector of length 2*3*4=24 instances.

The use cases are primarily these.

## nc_get_vars
Suppose one is reading a vector of instances using nc_get_vars
(or nc_get_vara or nc_get_var, etc.).  These functions will
return the vector in the top-level memory provided.  All
interior blocks (form nested VLEN or strings) will have been
dynamically allocated.

After using this vector of instances, it is necessary to free
(aka reclaim) the dynamically allocated memory, otherwise a
memory leak occurs.  So, the recursive reclaim function is used
to walk the returned instance vector and do a deep reclaim of
the data.

Currently functions are defined in netcdf.h that are supposed to
handle this: nc_free_vlen(), nc_free_vlens(), and
nc_free_string().  Unfortunately, these functions only do a
shallow free, so deeply nested instances are not properly
handled by them.

Note that internally, the provided data is immediately written so
there is no need to copy it. But the caller may need to reclaim the
data it passed into the function.

## nc_put_att
Suppose one is writing a vector of instances as the data of an attribute
using, say, nc_put_att.

Internally, the incoming attribute data must be copied and stored
so that changes/reclamation of the input data will not affect
the attribute.

Again, the code inside the netcdf library does only shallow copying
rather than deep copy. As a result, one sees effects such as described
in Github Issue https://github.com/Unidata/netcdf-c/issues/2143.

Also, after defining the attribute, it may be necessary for the user
to free the data that was provided as input to nc_put_att().

## nc_get_att
Suppose one is reading a vector of instances as the data of an attribute
using, say, nc_get_att.

Internally, the existing attribute data must be copied and returned
to the caller, and the caller is responsible for reclaiming
the returned data.

Again, the code inside the netcdf library does only shallow copying
rather than deep copy. So this can lead to memory leaks and errors
because the deep data is shared between the library and the user.

# Solution

The solution is to build properly recursive reclaim and copy
functions and use those as needed.
These recursive functions are defined in libdispatch/dinstance.c
and their signatures are defined in include/netcdf.h.
For back compatibility, corresponding "ncaux_XXX" functions
are defined in include/netcdf_aux.h.
````
int nc_reclaim_data(int ncid, nc_type xtypeid, void* memory, size_t count);
int nc_reclaim_data_all(int ncid, nc_type xtypeid, void* memory, size_t count);
int nc_copy_data(int ncid, nc_type xtypeid, const void* memory, size_t count, void* copy);
int nc_copy_data_all(int ncid, nc_type xtypeid, const void* memory, size_t count, void** copyp);
````
There are two variants. The first two, nc_reclaim_data() and
nc_copy_data(), assume the top-level vector is managed by the
caller. For reclaim, this is so the user can use, for example, a
statically allocated vector. For copy, it assumes the user
provides the space into which the copy is stored.

The second two, nc_reclaim_data_all() and
nc_copy_data_all(), allows the functions to manage the
top-level.  So for nc_reclaim_data_all, the top level is
assumed to be dynamically allocated and will be free'd by
nc_reclaim_data_all().  The nc_copy_data_all() function
will allocate the top level and return a pointer to it to the
user. The user can later pass that pointer to
nc_reclaim_data_all() to reclaim the instance(s).

# Internal Changes
The netcdf-c library internals are changed to use the proper
reclaim and copy functions.  It turns out that the places where
these functions are needed is quite pervasive in the netcdf-c
library code.  Using these functions also allows some
simplification of the code since the stdata and vldata fields of
NC_ATT_INFO are no longer needed.  Currently this is commented
out using the SEPDATA \#define macro.  When any bugs are largely
fixed, all this code will be removed.

# Known Bugs

1. There is still one known failure that has not been solved.
   All the failures revolve around some variant of this .cdl file.
   The proximate cause of failure is the use of a VLEN FillValue.
````
        netcdf x {
        types:
          float(*) row_of_floats ;
        dimensions:
          m = 5 ;
        variables:
          row_of_floats ragged_array(m) ;
              row_of_floats ragged_array:_FillValue = {-999} ;
        data:
          ragged_array = {10, 11, 12, 13, 14}, {20, 21, 22, 23}, {30, 31, 32},
                         {40, 41}, _ ;
        }
````
When a solution is found, I will either add it to this PR or post a new PR.

# Related Changes

* Mark nc_free_vlen(s) as deprecated in favor of ncaux_reclaim_data.
* Remove the --enable-unfixed-memory-leaks option.
* Remove the NC_VLENS_NOTEST code that suppresses some vlen tests.
* Document this change in docs/internal.md
* Disable the tst_vlen_data test in ncdump/tst_nccopy4.sh.
* Mark types as fixed size or not (transitively) to optimize the reclaim
  and copy functions.

# Misc. Changes

* Make Doxygen process libdispatch/daux.c
* Make sure the NC_ATT_INFO_T.container field is set.
2022-01-08 18:30:00 -07:00
Greg Sjaardema
31783b61de Make dimscale attachement to variables optional 2021-12-02 15:13:51 -07:00
Charlie Zender
fb70b4cf46 First draft of Granular BitGroom feature for netcdf-c 2021-10-20 16:00:32 -07:00
Ward Fisher
4086bbd887
Merge pull request #2056 from gsjaardema/WIP-attribute-creation-order-tracking-option
Attribute creation order on/off
2021-10-13 10:18:36 -06:00
Edward Hartnett
8f3da3f375 change name of att to _QuantizeBitgroomNumberOfSignificantDigits 2021-08-26 23:02:03 -06:00
Edward Hartnett
2b3d2c1edf changed name of attribute to _quantize_bitgroom_number_of_significant_digits 2021-08-26 10:24:14 -06:00
Edward Hartnett
b2c0bb9810 more quantize testing 2021-08-25 01:54:25 -06:00
Edward Hartnett
9a18689ffa getting ready for next try at quantization code 2021-08-24 00:45:38 -06:00
Greg Sjaardema
af0fcceae8 Address code review issues 2021-08-10 08:55:21 -06:00
Greg Sjaardema
d2c3165664 WIP: attribute creation order on/off
Work in progress / Proof of concept:

Add a capability to disable the tracking of attribute creation order.
See #2054 for details.

This PR adds a `NC_NOATTCREORD` define which can be passed int the
`mode` argument to `nc_create`.  If it is present, then the
calls to set the attribute creation order tracking is disabled.
This should only be used for files in which you *know* that the
ordering of the attributes does not matter to *any* potential
readers of this database.
2021-08-06 13:30:47 -06:00
Ward Fisher
9f798e2ed6 Merge branch 'virtual_datasets' of https://github.com/d70-t/netcdf-c into gh1983.wif 2021-07-19 09:44:35 -07:00
Dennis Heimbigner
d953899559 Move to Version 2 NCZarr Extended Meta-Data
re: https://github.com/zarr-developers/zarr-specs/issues/41

After discussions with the Zarr community, it was decided to
convert to a new representation of the NCZarr meta-data extensions: version 2.
These extensions store information necessary to mapping the Zarr data model
to the netcdf-4 data model.

The basic change is to remove the NCZarr specific objects: .nczarr, .nczgroup, .nczarray, and .nczattr.
The contents of these objects is moved into the corresponding existing Zarr objects as special keys. The mapping is as follows:

* ''.nczarr'' => ''/.zgroup/_NCZARR_SUPERBLOCK_''
* ''.nczgroup => ''.zgroup/_NCZARR_GROUP_''
* ''.nczarray => ''.zarray/_NCZARR_ARRAY_''
* ''.nczattr => ''.zattr/_NCZARR_ATTR_''

Backward compatibility is maintained by looking for the object ''/.nczarr''
and if found, then assuming that the dataset is in the older version 1 format.
This compatibility only supports reading of such version 1 datasets.

Documentation and test cases are also added.

Misc. Other Changes:
1. The json parsing code was added to the general library instead of nczarr only (ncjson.c, ncjson.h).
2. Improved support for different platform paths by allowing conversion
   to a single common path representation.
3. Add some new error codes.
4. Modify nccopy usage to mention the new chunking specification.
2021-07-17 16:55:30 -06:00
Dennis Heimbigner
ec5b3f9a4f Regularize the scoping of dimensions
This is a follow-on to pull request
````https://github.com/Unidata/netcdf-c/pull/1959````,
which fixed up type scoping.

The primary changes are to _nc\_inq\_dimid()_ and to ncdump.

The _nc\_inq\_dimid()_ function is supposed to allow the name to be
and FQN, but this apparently never got implemented. So if was modified
to support FQNs.

The ncdump program is supposed to output fully qualified dimension names
in its generated CDL file under certain conditions.

Suppose ncdump has a netcdf-4 file F with variable V, and V's parent group
is G. For each dimension id D referenced by V, ncdump needs to determine
whether to print its name as a simple name or as a fully qualified name (FQN).

The algorithm is as follows:

1. Search up the tree of ancestor groups.
2. If one of those ancestor groups contains the dimid, then call it dimgrp.
3. If one of those ancestor groups contains a dim with the same name as the dimid, but with a different dimid, then record that as duplicate=true.
4. If dimgrp is defined and duplicate == false, then we do not need an fqn.
5. If dimgrp is defined and duplicate == true, then we do need an fqn to avoid incorrectly using the duplicate.
6. If dimgrp is undefined, then do a preorder breadth-first search of all the groups looking for the dimid.
7. If found, then use the fqn of the first found such dimension location.
8. If not found, then fail.

Test case ncdump/test_scope.sh was modified to test the proper
operation of ncdump and _nc\_inq\_dimid()_.

Misc. Other Changes:
* Fix nc_inq_ncid (NC4_inq_ncid actually) to return root group id if the name argument is NULL.
* Modify _ncdump/printfqn_ to print out a dimid FQN; this supports verification that the resulting .nc files were properly created.
2021-05-31 15:51:12 -06:00
Dennis Heimbigner
e7d5f24078 Add zip file support
The primary change is to support the use of a zip file as a
storage format. Simultaneously the .nz4 support is made obsolete

Use of zip requires the libzip support library, so a number of
changes to the build files (Makefile.am, CMakeLists.txt) are
necessary to locate and incorporate libzip.  The nczarr_tests
tests are also changed to add zip testing.

Other changes:
* Make sure distcheck leaves no files around.
* Add some functions to netcdf_aux to export some functions of libnetcdf.
* Add a new error NC_EFOUND as the complement of NC_EEMPTY.
* Add tracing support to nclog and use it in libnczarr.
* Modify the zmap interface to support the writeonce semantics of zip.
* Create a new s3util.c to support a variety of S3 auxilliary functions.
* EXTERNL'ize a number of functions so they can be used in s3util.
* Add support for the S3 ListObjects CommonPrefixes mechanism
  to improve search.
* Add experimental support for running nczarr X s3 tests against
  the actual Amazon S3 cloud.
2021-01-28 20:11:01 -07:00
Tobias Kölling
69fb44ec4b added NC_VIRTUAL storage layout 2020-09-03 17:51:46 +02:00
Tobias Kölling
4c27730ae3 hdf5: added unknown storage specification
In case HDF5 adds more storage specifications, netcdf4 should be able to
cope with them by default. Further specializations could be added
nonetheless.
2020-09-02 16:13:23 +02:00
Dennis Heimbigner
59e04ae071 This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".

The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.

More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).

WARNING: this code has had limited testing, so do use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:

Platform | Build System | S3 support
------------------------------------
Linux+gcc      | Automake     | yes
Linux+gcc      | CMake        | yes
Visual Studio  | CMake        | no

Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future.  Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.

In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*.  The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
   and the version bumped.
4. An overly complex set of structs was created to support funnelling
   all of the filterx operations thru a single dispatch
   "filter_actions" entry.
5. Move common code to from libhdf5 to libsrc4 so that it is accessible
   to nczarr.

Changes directly related to Zarr:
1. Modified CMakeList.txt and configure.ac to support both C and C++
   -- this is in support of S3 support via the awd-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
   support zarr and to regularize the structure of the fragments
   section of a URL.

Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed be unique:
   e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
   * Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
   and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.

Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-28 18:02:47 -06:00
Dennis Heimbigner
44d0dcaad2 Add support for multiple filters per variable.
re: https://github.com/Unidata/netcdf-c/issues/1584

Support has been added for multiple filters per variable.  This
affects a number of components in netcdf. The new APIs are
documented in NUG/filters.md.

The primary changes are:
* A set of new functions are provided (see __include/netcdf_filter.h__).
    - Obtain a list of the filters associated with a variable
    - Obtain the parameters for a specific filter.
* The existing __nc_inq_var_filter__ function now returns info
  about the first defined filter.
* The utilities (ncgen, ncdump, and nccopy) now support
  an extended format for specifying a sequence of filters.
  The general form is __<filter>|<filter>..._.
* The ncdump **_Filter** attribute now dumps a list of all the
  filters associated with a variable using the above new format.
* Filter specifications can now use a filter name instead of number
  for filters known to the netcdf library, which in turn is taken
  from the HDF5 filter registration page.
* New errors are defined: NC_EFILTER and NC_ENOFILTER. The latter
  is returned if an attempt is made to access an unknown filter.
* Internally, the dispatch table has been extended to add a function
  to handle all of the filter functions.
* New, filter-related, tests were added to nc_test4.
* A new plugin was added to the plugins directory to help with testing.

Notes:
1. The shuffle and fletcher32 filters are not part of the multifilter system.

Misc. changes:
1. A debug module was added to libhdf5 to help catch error locations.
2020-02-16 12:59:33 -07:00
Edward Hartnett
164de982bd merged changes from master branch 2020-02-11 04:05:35 -07:00
Edward Hartnett
da904f6438 raised NC_MAX_HDF4_NAME length to NC_MAX_NAME 2020-02-08 09:21:01 -07:00
Edward Hartnett
96182a9236 trying to fix windows build 2020-02-07 11:57:56 -07:00
Edward Hartnett
82df2876b6 starting to support compact storage 2019-12-04 07:53:37 -07:00
Greg Sjaardema
56c0d5cf8a Spelling fixes 2019-09-18 08:03:01 -06:00
edwardhartnett
fb371d3c3e added 2019 to netcdf.h copyright notice 2019-08-16 04:38:26 -06:00
edwardhartnett
60f436e7ee starting to remove obsolete _CRAYMPP macros 2019-08-14 06:13:45 -06:00
Ward Fisher
e2b31ffae4
Merge branch 'master' into byterange.dmh 2019-03-19 12:05:44 -06:00
Wei-keng Liao
8ebe73059c revise comments about the deprecated flags in netcdf.h to avoid confusion 2019-02-26 11:16:54 -06:00
Dennis Heimbigner
bf2746b8ea Provide byte-range reading of remote datasets
re: issue https://github.com/Unidata/netcdf-c/issues/1251

Assume that you have the URL to a remote dataset
which is a normal netcdf-3 or netcdf-4 file.

This PR allows the netcdf-c to read that dataset's
contents as a netcdf file using HTTP byte ranges
if the remote server supports byte-range access.

Originally, this PR was set up to access Amazon S3 objects,
but it can also access other remote datasets such as those
provided by a Thredds server via the HTTPServer access protocol.
It may also work for other kinds of servers.

Note that this is not intended as a true production
capability because, as is known, this kind of access to
can be quite slow. In addition, the byte-range IO drivers
do not currently do any sort of optimization or caching.

An additional goal here is to gain some experience with
the Amazon S3 REST protocol.

This architecture and its use documented in
the file docs/byterange.dox.

There are currently two test cases:

1. nc_test/tst_s3raw.c - this does a simple open, check format, close cycle
   for a remote netcdf-3 file and a remote netcdf-4 file.
2. nc_test/test_s3raw.sh - this uses ncdump to investigate some remote
   datasets.

This PR also incorporates significantly changed model inference code
(see the superceded PR https://github.com/Unidata/netcdf-c/pull/1259).

1. It centralizes the code that infers the dispatcher.
2. It adds support for byte-range URLs

Other changes:

1. NC_HDF5_finalize was not being properly called by nc_finalize().
2. Fix minor bug in ncgen3.l
3. fix memory leak in nc4info.c
4. add code to walk the .daprc triples and to replace protocol=
   fragment tag with a more general mode= tag.

Final Note:
Th inference code is still way too complicated. We need to move
to the validfile() model used by netcdf Java, where each
dispatcher is asked if it can process the file. This decentralizes
the inference code. This will be done after all the major new
dispatchers (PIO, Zarr, etc) have been implemented.
2019-01-01 18:27:36 -07:00
Ward Fisher
5be0126920 More standardizing of the copyright stanza. 2018-12-06 14:13:56 -07:00
Ward Fisher
3c59fb860d Updating files to refer to the top-level COPYRIGHT file. 2018-12-04 15:52:43 -07:00
Dennis Heimbigner
4636584d5b Revert/Improve nc_create + NC_DISKLESS behavior
re: https://github.com/Unidata/netcdf-c/issues/1154

Inadvertently, the behavior of NC_DISKLESS with nc_create() was
changed in release 4.6.1. Previously, the NC_WRITE flag needed
to be explicitly used with NC_DISKLESS in order to cause the
created file to be persisted to disk.

Additional analyis indicated that the current NC_DISKLESS
implementation was seriously flawed.

This PR attempts to clean up and regularize the situation with
respect to NC_DISKLESS control. One important aspect of diskless
operation is that there are two different notions of write.

1. The file is read-write vs read-only when using the netcdf API.
2. The file is persisted or not to disk at nc_close().

Previously, these two were conflated. The rules now are
as follows.

1. NC_DISKLESS + NC_WRITE means that the file is read/write using the netcdf API
2. NC_DISKLESS + NC_PERSIST means that the file is persisted to a disk file at nc_close.
3. NC_DISKLESS + NC_PERSIST + NC_WRITE means both 1 and 2.

The NC_PERSIST flag is new and takes over the obsolete NC_MPIPOSIX flag.
NC_MPIPOSIX is still defined, but is now an alias for the NC_MPIIO flag.

It is also now the case that for netcdf-4, NC_DISKLESS is independent
of NC_INMEMORY and in fact it is an error to specify both flags
simultaneously.

Finally, the MMAP code was fixed to use NC_PERSIST as well.
Also marked MMAP as deprecated.

Also added a test case to test various combinations of NC_DISKLESS,
NC_PERSIST, and NC_WRITE.

This PR affects a number of files and especially test cases
that used NC_DISKLESS.

Misc. Unrelated fixes
1. fixed some warnings in ncdump/dumplib.c
2018-10-10 13:32:17 -06:00