Commit Graph

880 Commits

Author SHA1 Message Date
Dennis Heimbigner
8cab468169 Suppress filters on variables with non-fixed-size types.
re: Discussion https://github.com/Unidata/netcdf-c/discussions/2554
re: PR https://github.com/Unidata/netcdf-c/pull/2231
re: Issue https://github.com/Unidata/netcdf-c/issues/2189

After some discussion, the issue of applying filters to variables
whose type is not fixed-size was resolved as follows:
1. A call to nc_def_var_filter will ignore such filters, but will issue a log warning.
2. Loading (from an existing file) a variable whose type is not fixed-size and which has filters, will cause the variable to be suppressed.

This PR enforces those rules.

### Misc. Other changes
* Add a test case to test the vlen change.
* Make some minor clean-ups in various cmake and automake files.
* Remove unused test
2023-06-21 14:46:22 -06:00
Ward Fisher
8b90ffaeef Changed my mind about the #define. We are still using has_multifilters in nc-config, and I can see where people might want to use this define to write code that works with modern and older versions of netCDF. 2023-06-12 15:12:39 -06:00
Dennis Heimbigner
cdbf04956b Provide a single option to disable all network access and testing.
Add the option "--disable-network-access" (automake)
or "-DENABLE_NETWORK_ACCESS=OFF" (cmake).
When disabled, this option transitively disables all
network access capabilities and testing.
If set, this option implies the following:
* --disable-dap
* --disable-byterange
* --disable-s3

This PR answers a request for a feature from Ed Hartnett.

## Misc. Other changes
* Take the opportunity to clean up some old, unused options;
e.g. --enable-multifilters.
* Fix bug in using S3 urls.
2023-06-10 14:08:04 -06:00
Dennis Heimbigner
fb40a72b45 Improve performance of the nc_reclaim_data and nc_copy_data functions.
re: Issue https://github.com/Unidata/netcdf-c/issues/2685
re: PR https://github.com/Unidata/netcdf-c/pull/2179

As noted in PR https://github.com/Unidata/netcdf-c/pull/2179,
the old code did not allow for reclaiming instances of types,
nor for properly copying them. That PR provided new functions
capable of reclaiming/copying instances of arbitrary types.

However, as noted by Issue https://github.com/Unidata/netcdf-c/issues/2685, using these
most general functions resulted in a significant performance
degradation, even for common cases.

This PR attempts to mitigate the cost of using the general
reclaim/copy functions in two ways.

First, the previous functions operated at the top level by
using ncid and typeid arguments. These functions were augmented
with equivalent versions that used the netcdf-c library internal
data structures to allow direct access to needed information.
These new functions are used internally to the library.

The second mitigation involves optimizing the internal functions
by providing early tests for common cases. This avoids
unnecessary recursive function calls.

The overall result is a significant improvement in speed by a
factor of roughly twenty -- your mileage may vary. These
optimized functions are still not as fast as the original (more
limited) functions, but they are getting close. Additional optimizations are
possible. But the cost is a significant "uglification" of the
code that I deemed a step too far, at least for now.

## Misc. Changes
1. Added a test case to check the proper reclamation/copy of complex types.
2. Found and fixed some places where nc_reclaim/copy should have been used.
3. Replaced, in the netcdf-c library, (almost all) occurrences of nc_reclaim/copy with calls to NC_reclaim/copy. This plus the optimizations is the primary speed-up mechanism.
4. In DAP4, the metadata is held in a substrate in-memory file; this required some changes so that the reclaim/copy code accessed that substrate dispatcher rather than the DAP4 dispatcher.
5. Re-factored and isolated the code that computes if a type is (transitively) variable-sized or not.
6. Clean up the reclamation code in ncgen; adding the use of nc_reclaim exposed some memory problems.
2023-05-20 17:11:25 -06:00
Dennis Heimbigner
49737888ca Improve S3 Documentation and Support
## Improvements to S3 Documentation
* Create a new document *quickstart_paths.md* that gives a summary of the legal path formats used by netcdf-c. This includes both file paths and URL paths.
* Modify *nczarr.md* to remove most of the S3 related text.
* Move the S3 text from *nczarr.md* to a new document *cloud.md*.
* Add some S3-related text to the *byterange.md* document.

Hopefully, this will make it easier for users to find the information they want.

## Rebuild NCZarr Testing
In order to avoid problems with running make check in parallel, two changes were made:
1. The *nczarr_test* test system was rebuilt. Now, for each test,
any generated files are kept in a test-specific directory, isolated
from all other test executions.
2. Similarly, since the S3 test bucket is shared, any generated S3 objects
are isolated using a test-specific key path.

## Other S3 Related Changes
* Add code to ensure that files created on S3 are reclaimed at end of testing.
* Used the bash "trap" command to ensure S3 cleanup even if the test fails.
* Cleanup the S3 related configure.ac flag set since S3 is used in several places. So now one should use the option *--enable-s3* instead of *--enable-nczarr-s3*, although the latter is still kept as a deprecated alias for the former.
* Get some of the github actions yml to work with S3; this required fixing various test scripts and adding a secret to access the Unidata S3 bucket.
* Cleanup S3 portion of libnetcdf.settings.in and netcdf_meta.h.in and test_common.in.
* Merge partial S3 support into dhttp.c.
* Create an experimental s3 access library especially for use with Windows. It is enabled by using the options *--enable-s3-internal* (automake) or *-DENABLE_S3_INTERNAL=ON* (CMake). Also add a unit-test for it.
* Move some definitions from ncrc.h to ncs3sdk.h

## Other Changes
* Provide a default implementation of strlcpy and move this and similar defaults into *dmissing.c*.
2023-04-25 17:15:06 -06:00
Dennis Heimbigner
d7d216a3f5 Merge branch 'master' into dap4tests2.dmh 2023-03-16 14:03:29 -06:00
Ward Fisher
8d51666d04
Merge branch 'main' into encode.dmh 2023-03-07 14:18:38 -07:00
Dennis Heimbigner
cf6fcb3b9c Merge branch 'master' into dap4tests2.dmh 2023-03-02 20:00:05 -07:00
Dennis Heimbigner
69e84fe9f1 Fix byterange handling of some URLS
re: Issue

The byterange handling of the following URLS fails.

### Problem 1: "https://crudata.uea.ac.uk/cru/data/temperature/HadCRUT.4.6.0.0.median.nc#mode=bytes"
It turns out that byterange in hdf5 has two possible targets: S3 and not-S3 (e.g. a thredds server or the crudata URL above). Each uses a different HDF5 Virtual File Driver (VFD).
I incorrectly set up the byterange code in libhdf5 so that it would choose one or the other of the two VFD's for any netcdf-c library build. The fix is to allow it to choose either one at run-time.

### Problem 2: "https://noaa-goes16.s3.amazonaws.com/ABI-L1b-RadF/2022/001/18/OR_ABI-L1b-RadF-M6C01_G16_s20220011800205_e20220011809513_c20220011809562.nc#mode=bytes,s3"
When given what appears to be an S3-related URL, the netcdf-c library code converts it into a canonical, so-called "path" format. In casing out the possible input URL formats, I missed the case where the host contains the bucket ("noaa-goes16"), but not the region. So the fix was to check for this case.

## Misc. Related Changes
1. Since S3 is used in more than just NCZarr, I changed the automake/cmake options to replace "--enable-nczarr-s3" with "--enable-s3", but keeping the former option as a synonym for the latter. This also entailed cleaning up libnetcdf.settings WRT S3 support
2. Added the above URLS as additional test cases

## Misc. Un-Related Changes
1. CURLOPT_PUT is deprecated in favor of CURLOPT_UPLOAD
2. Fix some minor warnings

## Open Problems
* Under Ubuntu, either libcrypto or aws-sdk-cpp has a memory leak.
2023-03-02 19:51:02 -07:00
Ward Fisher
b40f8ce367 Invert solution as discussed at https://github.com/Unidata/netcdf-c/pull/2618 2023-02-27 16:07:36 -07:00
Dennis Heimbigner
e31ce10842 Enable ACCEPT_ENCODING on DAP requests
re: PR https://github.com/Unidata/netcdf-c/issues/2622

H/T Nathan Potter for finding this.

Apparently the existing library DAP code for supporting
compressed http responses was disabled.

So:
1. enable CURLOPT_ACCEPT_ENCODING by default
2. Add a new HTTP.ENCODE for .dodsrc to allow it to be disabled.
2023-02-16 20:21:22 -07:00
Dennis Heimbigner
d1d2808919 Additional DAP4 fixes
This change-set modifies PR https://github.com/Unidata/netcdf-c/pull/2555
to add the changes listed below. Most of these changes are required
by changes to the Java remotetest.unidata.ucar.edu server.

## DAP4 Related Changes
* Add tests *dap4_test/test_constraints.sh* and *dap4_test/test_hyrax.sh*.
* Provide explicit list of remotetest files to test.
* Cleanup local checksum computing and verification.
* Define a temporary Hyrax hack flag to deal with the way Hyrax handles checksums and add "#hyrax" fragment flag for it.
* Add a hack to get past an LGTM problem with using "http:".
* Improve debug support.

## Other Changes
* Cleanup the recipe in *docs/nczarr.md* for building *aws-sdk-cpp* library.
2023-01-18 19:47:29 -07:00
Dennis Heimbigner
d06901f7d1 Suppress mistaken LGTM warnings 2022-11-15 20:54:01 -07:00
Dennis Heimbigner
835b81a285 Cleanup DAP4 testing
NOTE: This PR should not be included in 4.9.1 since additional
DAP4 related PRs will be forthcoming.

This PR makes major changes to libdap4 and dap4_test driven by changes to TDS.

* Enable DAP4
* Clean up the test input files and the test baseline comparison files. This entails:
    * Remove a multitude of unused test input and baseline data files; among them are dap4_test/: daptestfiles, dmrtestfiles, nctestfiles, and misctestfiles.
    * Define a canonical set of test input files and record in dap4_test/cdltestfiles.
    * Use the cdltestfiles to generate the .nc test inputs. This set of .nc files is then moved to the d4ts (DAP4 test server) war file in the tds repository. This set then becomes the canonical set of DAP4 test sources.
    * Scrape d4ts to obtain copies of the raw streams of DAP4 encoded data. The .dmr and .dap streams are then stored in dap4_test/rawtestfiles.
    * Disable some remote server tests until those servers are fixed.
* Add an option to ncdump (-XF) that forces the type of the _FillValue attribute; this is primarily to simplify testing of fill mismatch.
* Minor bug fixes to ncgen.
* Changes to libdap4:
    * Replace old checksum hack with the dap4.checksum flag.
    * Support the dap4.XXX controls.
    * Cleanup _FillValue handling, especially var-attribute type mismatches.
    * Fix enum handling based on changes to netcdf-java.
* Changes to dap4_test:
    * Add getopt support to various test support programs.
    * Remove unneeded shell scripts.
    * Add new scripts: test_curlopt.sh
2022-11-13 13:15:11 -07:00
DWesl
4c1a39bb71
BLD: Declare nulldup backup definition static not extern
The use of this function currently runs into problems with multiple definitions: once for each file including ncconfigure.h.  Defining this as static rather than extern should hide the definitions from each other.

static inline would still be closer to the definition as a macro, but that requires a #define to work on all platforms (not all compilers have inline yet).
2022-11-01 08:01:10 -04:00
DWesl
3b74e0bb93 FIX: ifndef requires no parentheses. 2022-10-29 13:27:17 -04:00
DWesl
4ef68740a3 STY: Move nulldup backup definition from cp_win32.c to ncconfigure.h
Allow definition to be used in more places.
Should probably consolidate definition a few places.
2022-10-29 09:48:22 -04:00
Ward Fisher
ec7cc936fa Adding NC_HAS_BLOSC and NC_HAS_BZ2 to netcdf_meta.h in support of https://github.com/Unidata/netcdf-c/issues/2511 2022-09-20 15:11:23 -06:00
Ward Fisher
2f265b7193
Merge branch 'main' into moreszfixes.dmh 2022-09-16 10:51:22 -06:00
Dennis Heimbigner
c9af92df8c conflicts 2022-08-28 13:26:20 -06:00
Dennis Heimbigner
231ae96c4b Add support for Zarr string type to NCZarr
* re: https://github.com/Unidata/netcdf-c/pull/2278
* re: https://github.com/Unidata/netcdf-c/issues/2485
* re: https://github.com/Unidata/netcdf-c/issues/2474

This PR subsumes PR https://github.com/Unidata/netcdf-c/pull/2278.
It is actually a bit of an omnibus covering several issues.

## PR https://github.com/Unidata/netcdf-c/pull/2278
Add support for the Zarr string type.
Zarr strings are restricted currently to be of fixed size.
The primary issue to be addressed is to provide a way for the user to
specify the size of the fixed-length strings. This is handled by providing
the following new special attributes:
1. **_nczarr_default_maxstrlen** —
This is an attribute of the root group. It specifies the default
maximum string length for string types. If not specified, then
it has the value of 64 characters.
2. **_nczarr_maxstrlen** —
This is a per-variable attribute. It specifies the maximum
string length for the string type associated with the variable.
If not specified, then it is assigned the value of
**_nczarr_default_maxstrlen**.

This PR also requires some hacking to handle the existing netcdf-c NC_CHAR
type, which does not exist in zarr. The goal was to choose numpy types for
both the netcdf-c NC_STRING type and the netcdf-c NC_CHAR type such that
if a pure zarr implementation read them, it would still work and an
NC_CHAR type would be handled by zarr as a string of length 1.

For writing variables and NCZarr attributes, the type mapping is as follows:
* "|S1" for NC_CHAR.
* ">S1" for NC_STRING && MAXSTRLEN==1
* ">Sn" for NC_STRING && MAXSTRLEN==n

Note that it is a bit of a hack to use endianness, but it should be ok since for
string/char, the endianness has no meaning.

For reading attributes with pure zarr (i.e. with no nczarr
attribute types defined), they will always be interpreted as of
type NC_CHAR.

## Issue: https://github.com/Unidata/netcdf-c/issues/2474
This PR partly fixes this issue because it provides more
comprehensive support for Zarr attributes that are JSON valued expressions.
This PR still does not address the problem in that issue where the
_ARRAY_DIMENSION attribute is incorrectly set. That can only be
fixed by the creator of the datasets.

## Issue: https://github.com/Unidata/netcdf-c/issues/2485
This PR also fixes the scalar failure shown in this issue.
It generally cleans up scalar handling.
It also adds a note to the documentation describing that
NCZarr supports scalars while Zarr does not and also how
scalar interoperability is achieved.

## Misc. Other Changes
1. Convert the nczarr special attributes and keys to be all lower case. So "_NCZARR_ATTR" is now "_nczarr_attr". Support back compatibility for the upper case names.
2. Cleanup my too-clever-by-half handling of scalars in libnczarr.
2022-08-27 20:21:13 -06:00
Ward Fisher
ba37c0af9f
Merge branch 'main' into enumdfalt.dmh 2022-07-26 15:23:40 -06:00
Ward Fisher
62ae05d6d0
Merge pull request #2457 from edwardhartnett/ejh_test_quantize_3
more quantize testing and adding pre-processor constant NC_MAX_FILENAME to nc_tests.h
2022-07-25 15:59:50 -06:00
Dennis Heimbigner
65fd9fe1a5 Provide a default enum const when fill value does not match any enum const.
re: https://github.com/Unidata/netcdf-c/issues/982

It is possible to define an enum type that has no enum constant
with value zero. However, HDF5 has a default fill value of zero
that it uses to fill all chunks. In the event that this situation
occurs, ncdump, say, will fail because there is no enum const
to print for the value zero.

The solution is to create a special enum constant called "_UNDEFINED"
that has the value zero. It is only used in the case that there is
no constant in the enum that already covers zero.

A test case is added in netcdf-c/ncdump to validate this solution.

Note: the changes occur primarily in libsrc4, so they also work for NCZarr.
2022-07-17 14:32:31 -06:00
Edward Hartnett
4aa319f9dc adding pre-processor constant NC_MAX_FILENAME to nc_tests.h 2022-07-08 07:27:54 -06:00
Dennis Heimbigner
31b24d767a Fix bad cmake install location 2022-07-06 15:01:23 -06:00
Dennis Heimbigner
8b0e1134b4 Ensure that netcdf_json.h does not interfere with ncjson.
re: Issue https://github.com/Unidata/netcdf-c/issues/2419

There are effectively two json subsystems in netcdf-c.
1. ncjson.[ch] in libnetcdf
2. netcdf_json.h for use by plugins so they can be built without need
   for libnetcdf.

The netcdf_json.h file is constructed from the concatenation of
ncjson.h plus ncjson.c. It turned out that in doing this, I was
leaving some symbols externally visible so that if, for some
reason, a plugin was built and needed libnetcdf, then symbol
conflicts arose.

The solution is to prefix the declarations in ncjson.[ch] with a
macro (OPTSTATIC) that can be resolved to either nothing or to
"static". Then in netcdf_json.h, it resolves to "static" and
prevents the symbol conflicts.

Note that netcdf_json.h is constructed once in
netcdf-c/include/Makefile.am with the rule named
"makepluginjson". This means that it is included in the
distribution. However, this also means that if ncjson.[ch] is
changed, then it is necessary to invoke makepluginjson
explicitly to rebuild netcdf_json.h
2022-07-05 22:03:52 -06:00
Dennis Heimbigner
f419af9204 Cleanup szip handling some more
re: https://github.com/Unidata/netcdf-c/issues/2420

nc_test4/tst_vars3.c has the wrong conditional
around the szip tests (did I do that?).
Anyway, the current test is
> #ifdef HAVE_SZ

and it should be
> #ifdef HAVE_H5Z_SZIP

because the only thing that matters is that HDF5 lib has szip support.
2022-06-22 16:40:54 -06:00
Dennis Heimbigner
aabbdbf64c Make public a limited API for programmatic access to internal .rc tables
re: https://github.com/Unidata/netcdf-c/issues/2337
re: https://github.com/Unidata/netcdf-c/issues/2407

Add two functions to netcdf.h to allow programs to get/set
selected entries into the internal .rc tables. This should fix
the above issues by allowing HTTP.CAINFO to be set to the
certificates directory.  Note that the changes should be
performed as early as possible in the program because some of
the .rc table entries may get cached internally and changing the
entry after that caching occurs may have no effect.

The new signatures are as follows:

1. Get the value of a simple .rc entry of the form "key=value".
Note that caller must free the returned value, which might be NULL.
````
char* nc_rc_get(const char* key);

@param key table entry key
@return value if .rc table has entry of the form key=value
@return NULL if no such entry is found.
````

2. Insert/Overwrite the specified key=value pair in the .rc table.
````
int nc_rc_set(const char* key, const char* value);

@param key table entry key -- may not be NULL
@param value table entry value -- may not be NULL
@return NC_NOERR if no error
@return NC_EINVAL if error
````

Addendum:

re: https://github.com/Unidata/netcdf-c/issues/2407

Modify dhttp.c to use the .rc entry HTTP.CAINFO if defined.
2022-06-17 14:35:12 -06:00
Ward Fisher
0586b64521
Merge pull request #2335 from edwardhartnett/ejh_szip_constants
fixed missing szip constants in netcdf.h
2022-05-17 16:45:47 -06:00
Dennis Heimbigner
5b400442ff Merge branch 'master' into jsonconvention.dmh 2022-05-09 12:43:52 -06:00
Edward Hartnett
c3d201a8b9
Merge branch 'main' into ejh_doc_4 2022-05-05 08:16:40 -06:00
Edward Hartnett
7a61c9a8d4 added netcdf_filter.h to doxygen build 2022-05-04 13:12:48 -06:00
Edward Hartnett
d1cbd60960 fixed missing szip constants in netcdf.h 2022-05-04 09:48:46 -06:00
Edward Hartnett
14e80b4673 fixing doxygen warnings 2022-05-03 09:41:45 -06:00
Dennis Heimbigner
444024a7be Merge branch 'master' into jsonconvention.dmh 2022-05-01 13:16:58 -06:00
Dennis Heimbigner
126b3f9423 Support installation of filters into user-specified location
re: https://github.com/Unidata/netcdf-c/issues/2294

Ed Hartnett suggested that the netcdf library installation process
be extended to install the standard filters into a user specified
location. The user can then set HDF5_PLUGIN_PATH to that location.

This PR provides that capability using:
````
configure option: --with-plugin-dir=<absolute directory path>
cmake option: -DPLUGIN_INSTALL_DIR=<absolute directory path>
````

Currently, the following plugins are always installed, if
available: bzip2, zstd, blosc.
If NCZarr is enabled, then additional plugins are installed:
fletcher32, shuffle, deflate, szip.

Additionally, the necessary codec support is installed
for each of the above filters that is installed.

## Changes:
1. Cleanup handling of built-in bzip2.
2. Add documentation to docs/filters.md
3. Re-factor the NCZarr codec libraries
4. Add a test, although it can only be exercised after
   the library is installed, so it cannot be used during
   normal testing.
5. Cleanup use of HDF5_PLUGIN_PATH in the filter test cases.
2022-04-29 14:31:55 -06:00
Dennis Heimbigner
2856ee751d restore 2022-04-29 12:36:33 -06:00
Dennis Heimbigner
94db4d7a56 ckp 2022-04-29 12:04:27 -06:00
Dennis Heimbigner
ad62ed2d41 ckp 2022-04-26 17:58:20 -06:00
Ward Fisher
f37313d1cf
Merge pull request #2309 from edwardhartnett/ejh_summary
added BENCHMARKS to the build summary
2022-04-26 12:37:43 -06:00
Ward Fisher
248e263d0e
Merge pull request #2289 from mjwoods/mingw-w64-static-tests
Fix dll exports for ncxml
2022-04-26 11:03:43 -06:00
Edward Hartnett
e723b1d570 added BENCHMARKS to the summary 2022-04-26 06:18:52 -06:00
Edward Hartnett
e3f305908e fixed parallel functions for netcdf-fortran build 2022-04-24 05:41:14 -06:00
Ward Fisher
982b258c46 Merge branch 'dimscale_attachement_optional' of https://github.com/gsjaardema/netcdf-c into gh2161.wif 2022-04-19 11:06:34 -06:00
Milton Woods
f546d95aa2 Fix dll exports for ncxml 2022-04-12 19:16:58 +10:00
Edward Hartnett
57365d4b47 added ZSTD to netcdf_meta.h and libnetcdf.settings 2022-04-11 08:03:24 -06:00
Dennis Heimbigner
9f78be8bb8 Allow the read/write of JSON-valued Zarr attributes.
A number of other packages that read/write Zarr insert
attributes whose value is a dictionary containing specialized
information.  An example is the GDAL Driver convention (see
https://gdal.org/drivers/raster/zarr.html).

In order to handle such attributes, this PR enforces a special
convention. It applies to both the pure Zarr and NCZarr formats as
written by the netcdf-c library.

The convention is as follows:

## Reading
Suppose an attribute is read from *.zattrs* and it has a JSON
value that is a dictionary.  In this case, the JSON dictionary
is converted to a string value.  It then appears in the netcdf-c
API as if it is a character valued attribute of the same name,
and whose value is the "stringified" dictionary.

## Writing
Suppose an attribute is of type character and its *value* *looks like*
a JSON dictionary. In this case, it is parsed to JSON
and written as the value of the attribute in the NCZarr file.
Here the *value* is the concatenation of all the characters
in the attribute's netcdf-c value.
The term "looks like" means that the *value*'s first character is
"{", its last character is "}", and it can be successfully parsed
by a JSON parser.

A test case, *nczarr_test/run_jsonconventions.sh* was also added.

## Misc. Unrelated Changes

1. Fix an error in nc_test4/tst_broken_files.c
2. Modify the internal JSON parser API.
3. Modify the nczarr_test/zisjson program to support
   this convention.
2022-04-06 18:22:59 -06:00
Ward Fisher
3446aa0c13 Merge branch 'winutf8.dmh' of https://github.com/DennisHeimbigner/netcdf-c into gh2222.wif 2022-04-05 10:46:22 -06:00
wkliao
01efbd79cf avoid type define MPI_Comm and MPI_Info
Also define NC_MPI_INFO only when parallel I/O is enabled.
2022-04-01 23:10:19 -05:00