Commit Graph

28 Commits

Author SHA1 Message Date
Edward Hartnett
5982b1c5ef change to trigger CI 2022-04-27 09:52:00 -06:00
Edward Hartnett
01bbc0471d declared a variable outside for loop 2022-04-26 14:39:51 -06:00
Greg Sjaardema
6f65754357
Avoid infinite loop for finding large prime values
If the `val` passed to `findPrimeGreaterThan` is greater than the largest value (not the sentinel) in the `NC_primes` table, the routine falls into an infinite loop. Modified it to call an external routine that brute-forces finding a prime larger than the value in this case.

The brute-force routine uses the primes in the `NC_primes` table for its primality test, so it will fail if given a `value > 180503 * 180503`. The `isPrime` function could be rewritten to avoid this, but we assume this won't happen for the foreseeable future. If it does happen, `isPrime` will report any value larger than this as prime...
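Purely as illustration, a minimal C sketch of the brute-force fallback described above; it trial-divides by odd numbers rather than by the `NC_primes` table, and the function names are placeholders, not the actual netcdf-c routines.

```c
#include <stdio.h>

/* Trial-division primality test; divisors only need to reach sqrt(n). */
static int isPrimeBrute(unsigned long n)
{
    if (n < 2) return 0;
    if (n % 2 == 0) return n == 2;
    for (unsigned long d = 3; d * d <= n; d += 2)
        if (n % d == 0) return 0;
    return 1;
}

/* Scan upward from val until a prime is found; always terminates. */
static unsigned long primeGreaterThan(unsigned long val)
{
    unsigned long candidate = val + 1;
    while (!isPrimeBrute(candidate))
        candidate++;
    return candidate;
}

int main(void)
{
    /* e.g. the next prime after the largest table entry mentioned above */
    printf("%lu\n", primeGreaterThan(180503UL));
    return 0;
}
```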
2021-10-14 15:55:04 -06:00
Dennis Heimbigner
b5d4afd1d5 Patch errors
## Examine and fix ezxml errors

re: Issue https://github.com/Unidata/netcdf-c/issues/2119

Multiple security issues were found in ezxml (see above Issue).

* CVE-2021-31598
* CVE-2021-31348 / CVE-2021-31347
* CVE-2021-31229
* CVE-2021-30485
* CVE-2021-26222
* CVE-2021-26221
* CVE-2021-26220
* CVE-2019-20202
* CVE-2019-20201
* CVE-2019-20200
* CVE-2019-20199
* CVE-2019-20198
* CVE-2019-20007
* CVE-2019-20006
* CVE-2019-20005

In addition, moved ezxml to libdispatch.

## Examine and fix selected oss-fuzz detected errors

Note that most of these errors are in the libsrc .m4-generated
code, so fixing them is difficult. It would be nice if we could tell
oss-fuzz to skip those files. They are old and crufty and
probably need a complete refactor.

Issue|Status
-----|------
35382|Fixed; old bug
35398|Closed by OSS-Fuzz
35442|Guarantee alloc > 0 or error; Old bug
35721|Assert failure; ok
35992|Fixed; old bug
36038|Fixed; old bug
36129|Unfixed; old bug
36229|Fixed by adding assert; old bug
37476|Unfixed; old bug
37824|Assert Failure; ok
38300|Closed by OSS-Fuzz
38537|Unfixed; old bug
38658|Unfixed; old bug
38699|Fixed maybe; old bug
38772|Nature of error is unclear; suspect it results from using too large a type.
39248|Need more information
39394|Unfixed; old bug
2021-10-12 14:03:48 -06:00
Dennis Heimbigner
eb3d9eb0c9 Provide a number of fixes/improvements to NCZarr
Primary changes:
* Add an improved cache system to speed up performance.
* Fix NCZarr to properly handle scalar variables.

Misc. Related Changes:
* Added unit tests for extendible hash and for the generic cache.
* Add config parameter to set size of the NCZarr cache.
* Add initial performance tests but leave them unused.
* Add CRC64 support.
* Move location of ncdumpchunks utility from /ncgen to /ncdump.
* Refactor auth support.

Misc. Unrelated Changes:
* More cleanup of the S3 support
* Add support for S3 authentication in .rc files: HTTP.S3.ACCESSID and HTTP.S3.SECRETKEY.
* Remove the hashkey from the struct OBJHDR since it is never used.
2020-11-19 17:01:04 -07:00
Dennis Heimbigner
c7cf0d3807 Fix LGTM errors 2020-06-28 19:07:08 -06:00
Dennis Heimbigner
59e04ae071 This PR adds EXPERIMENTAL support for accessing data in the
cloud using a variant of the Zarr protocol and storage
format. This enhancement is generically referred to as "NCZarr".

The data model supported by NCZarr is netcdf-4 minus the user-defined
types and the String type. In this sense it is similar to the CDF-5
data model.

More detailed information about enabling and using NCZarr is
described in the document NUG/nczarr.md and in a
[Unidata Developer's blog entry](https://www.unidata.ucar.edu/blogs/developer/en/entry/overview-of-zarr-support-in).

WARNING: this code has had limited testing, so do not use this version
for production work. Also, performance improvements are ongoing.
Note especially the following platform matrix of successful tests:

Platform | Build System | S3 support
---------------|--------------|-----------
Linux+gcc      | Automake     | yes
Linux+gcc      | CMake        | yes
Visual Studio  | CMake        | no

Additionally, and as a consequence of the addition of NCZarr,
major changes have been made to the Filter API. NOTE: NCZarr
does not yet support filters, but these changes are enablers for
that support in the future.  Note that it is possible
(probable?) that there will be some accidental reversions if the
changes here did not correctly mimic the existing filter testing.

In any case, previously filter ids and parameters were of type
unsigned int. In order to support the more general zarr filter
model, this was all converted to char*.  The old HDF5-specific,
unsigned int operations are still supported but they are
wrappers around the new, char* based nc_filterx_XXX functions.
This entailed at least the following changes:
1. Added the files libdispatch/dfilterx.c and include/ncfilter.h
2. Some filterx utilities have been moved to libdispatch/daux.c
3. A new entry, "filter_actions" was added to the NCDispatch table
   and the version bumped.
4. An overly complex set of structs was created to support funnelling
   all of the filterx operations through a single dispatch
   "filter_actions" entry.
5. Moved common code from libhdf5 to libsrc4 so that it is accessible
   to nczarr.
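A hypothetical sketch of the wrapper idea described above: the legacy unsigned-int call formats its id and parameters as strings and forwards them to a string-based entry point. The function names, signatures, and values here are illustrative stand-ins, not the actual `nc_filterx_XXX` API.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for a string-based filter definition call (stub for the sketch). */
static int filterx_def(int ncid, int varid, const char* id,
                       size_t nparams, const char** params)
{
    (void)ncid; (void)params;
    printf("define filter %s with %zu parameter(s) on var %d\n", id, nparams, varid);
    return 0;
}

/* Legacy-style wrapper: convert the numeric id/params to strings and forward.
   Most error checks are elided for brevity. */
static int filter_def_legacy(int ncid, int varid, unsigned int id,
                             size_t nparams, const unsigned int* params)
{
    char idstr[32];
    char** strs = malloc(nparams * sizeof(char*));
    int stat;

    if (strs == NULL) return -1;
    snprintf(idstr, sizeof(idstr), "%u", id);
    for (size_t i = 0; i < nparams; i++) {
        strs[i] = malloc(32);
        snprintf(strs[i], 32, "%u", params[i]);
    }
    stat = filterx_def(ncid, varid, idstr, nparams, (const char**)strs);
    for (size_t i = 0; i < nparams; i++) free(strs[i]);
    free(strs);
    return stat;
}

int main(void)
{
    unsigned int params[2] = {9u, 4u};          /* illustrative parameter values */
    return filter_def_legacy(1, 0, 32015u, 2, params);
}
```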

Changes directly related to Zarr:
1. Modified CMakeList.txt and configure.ac to support both C and C++
   -- this is needed for S3 support via the aws-sdk libraries.
2. Define a size64_t type to support nczarr.
3. More reworking of libdispatch/dinfermodel.c to
   support zarr and to regularize the structure of the fragments
   section of a URL.

Changes not directly related to Zarr:
1. Make client-side filter registration be conditional, with default off.
2. Hack include/nc4internal.h to make some flags added by Ed unique:
   e.g. NC_CREAT, NC_INDEF, etc.
3. cleanup include/nchttp.h and libdispatch/dhttp.c.
4. Misc. changes to support compiling under Visual Studio including:
   * Better testing under windows for dirent.h and opendir and closedir.
5. Misc. changes to the oc2 code to support various libcurl CURLOPT flags
   and to centralize error reporting.
6. By default, suppress the vlen tests that have unfixed memory leaks; add option to enable them.
7. Make part of the nc_test/test_byterange.sh test be contingent on remotetest.unidata.ucar.edu being accessible.

Changes Left TO-DO:
1. fix provenance code, it is too HDF5 specific.
2020-06-28 18:02:47 -06:00
Dennis Heimbigner
6934aa2e8b Thread safety: step 1: cleanup
re: https://github.com/Unidata/netcdf-c/issues/1373 (partial)

* Mark some global constants as const to make them easier to track.
* Hide direct access to the ncrc_globalstate behind a function call (see the sketch after this list).
* Convert dispatch tables to constants (except the user defined ones)
  This has some consequences in terms of function arguments needing to be marked
  as const also.
* Remove some no longer needed global fields
* Aggregate all the globals in nclog.c
* Uniformly replace nc_sizevector{0,1} with NC_coord_{zero,one}
* Uniformly replace nc_ptrdffvector1 with NC_stride_one
* Remove some obsolete code
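As noted in the list above, a rough sketch of the accessor idea under assumed names (not the actual ncrc_globalstate code): the state is file-static and reachable only through one function, which gives a single choke point where a lock can later be added.

```c
/* Illustrative global-state struct; fields are placeholders. */
typedef struct GlobalState {
    int initialized;
    const char* tempdir;
} GlobalState;

static GlobalState global_state;   /* file scope: not directly visible elsewhere */

/* The only access path to the state; lazy one-time initialization.
   A mutex around this body is the natural next step for thread safety. */
GlobalState* get_global_state(void)
{
    if (!global_state.initialized) {
        global_state.tempdir = "/tmp";
        global_state.initialized = 1;
    }
    return &global_state;
}
```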
2019-03-30 14:06:20 -06:00
Dennis Heimbigner
bf2746b8ea Provide byte-range reading of remote datasets
re: issue https://github.com/Unidata/netcdf-c/issues/1251

Assume that you have the URL to a remote dataset
which is a normal netcdf-3 or netcdf-4 file.

This PR allows the netcdf-c to read that dataset's
contents as a netcdf file using HTTP byte ranges
if the remote server supports byte-range access.

Originally, this PR was set up to access Amazon S3 objects,
but it can also access other remote datasets such as those
provided by a Thredds server via the HTTPServer access protocol.
It may also work for other kinds of servers.

Note that this is not intended as a true production
capability because, as is known, this kind of access
can be quite slow. In addition, the byte-range IO drivers
do not currently do any sort of optimization or caching.

An additional goal here is to gain some experience with
the Amazon S3 REST protocol.

This architecture and its use are documented in
the file docs/byterange.dox.
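For orientation, a minimal client-side sketch of this kind of access. The URL is a placeholder, and the `#mode=bytes` fragment follows the general `mode=` tag mentioned under "Other changes" below; it assumes a server that honors HTTP Range requests.

```c
#include <stdio.h>
#include <netcdf.h>

int main(void)
{
    int ncid, format, stat;
    const char* url = "https://example.com/data/sample.nc#mode=bytes";

    if ((stat = nc_open(url, NC_NOWRITE, &ncid)) != NC_NOERR) {
        fprintf(stderr, "nc_open: %s\n", nc_strerror(stat));
        return 1;
    }
    nc_inq_format(ncid, &format);   /* which netcdf format was inferred remotely */
    printf("format code: %d\n", format);
    nc_close(ncid);
    return 0;
}
```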

There are currently two test cases:

1. nc_test/tst_s3raw.c - this does a simple open, check format, close cycle
   for a remote netcdf-3 file and a remote netcdf-4 file.
2. nc_test/test_s3raw.sh - this uses ncdump to investigate some remote
   datasets.

This PR also incorporates significantly changed model inference code
(see the superseded PR https://github.com/Unidata/netcdf-c/pull/1259).

1. It centralizes the code that infers the dispatcher.
2. It adds support for byte-range URLs

Other changes:

1. NC_HDF5_finalize was not being properly called by nc_finalize().
2. Fix minor bug in ncgen3.l
3. fix memory leak in nc4info.c
4. add code to walk the .daprc triples and to replace protocol=
   fragment tag with a more general mode= tag.

Final Note:
The inference code is still way too complicated. We need to move
to the validfile() model used by netcdf Java, where each
dispatcher is asked if it can process the file. This decentralizes
the inference code. This will be done after all the major new
dispatchers (PIO, Zarr, etc) have been implemented.
2019-01-01 18:27:36 -07:00
Ward Fisher
25417880f7 Updated libdispatch/ files with copyright notice. 2018-12-06 14:29:57 -07:00
Ward Fisher
8b75fbb6d9 Corrected VS2010 compilation issue in support of https://github.com/Unidata/netcdf-c/issues/1184 . Solution provided by Mark Rivers on the NetCDF mailing list. 2018-11-08 12:37:03 -07:00
Dennis Heimbigner
f46b65c602 Clear up coverity complaints
with respect to this coverity report.

https://u2389337.ct.sendgrid.net/wf/click?upn=08onrYu34A-2BWcWUl-2F-2BfV0V05UPxvVjWch-2Bd2MGckcRZApv53Ry-2FReUME-2Fmei1et81WqdBwm5AxSAYBM9iFSmbw-3D-3D_YxTn3nq1kZ9CfXcyYFaB2DtJSVHTn8U-2BK0LV-2FDyLYKcFvnwYJ7pm2zFB5UovsEUh5miIfeCfQzYxpN9HYcdqs9sKVjtklOkfPIjEpg0Tj8G9wiptznJeX-2FO4PrqJA1ViQYtQfpjugq01XGMQ3-2Bll42dUMxrmbdYSqi8T7bCcANC5evyZs4MKvmAWXt1bnPaHcOYJDN-2FUt-2FJTlhwu91HyR7h-2FnkvMY9paYUNB1gxiyUA-3D
2018-08-04 13:22:29 -06:00
Dennis Heimbigner
6f69e5a10d Fix github issue https://github.com/Unidata/netcdf-c/issues/942 2018-04-23 11:14:03 -06:00
Dennis Heimbigner
42e8028726 Re: github issues
https://github.com/Unidata/netcdf-c/issues/917
https://github.com/Unidata/netcdf-c/issues/915

Fix following memory errors:
1. global_buffer_overflow
2. nc4_att_list_add
2018-03-29 14:57:40 -06:00
Dennis Heimbigner
1246dd189b merge master 2018-03-20 21:50:58 -06:00
Dennis Heimbigner
25f062528b This completes (for now) the refactoring of libsrc4.
The file docs/indexing.dox tries to provide design
information for the refactoring.

The primary change is to replace all walking of linked
lists with the use of the NCindex data structure.
Ncindex is a combination of a hash table (for name-based
lookup) and a vector (for walking the elements in the index).
Additionally, global vectors are added to NC_HDF5_FILE_INFO_T
to support direct mapping of, e.g., a dimid to the NC_DIM_INFO_T
object. These global vectors exist for dimensions, types, and groups
because they have globally unique id numbers.
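A compact sketch of that combination under illustrative names (not the actual NCindex code): a vector preserves insertion order for walking, and a small hash table maps names to vector slots for average O(1) lookup. The sketch uses a fixed table size with no deletion or resizing.

```c
#include <stdio.h>
#include <string.h>

#define TABLE_SIZE 64                       /* fixed size; no resizing in this sketch */

typedef struct Obj { char name[32]; } Obj;

typedef struct MyIndex {
    Obj* vector[TABLE_SIZE];                /* objects in insertion order */
    size_t nused;
    int slot[TABLE_SIZE];                   /* name hash -> vector position, -1 = empty */
} MyIndex;

static unsigned long hash_name(const char* s)
{
    unsigned long h = 5381;                 /* djb2 string hash */
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

static void index_init(MyIndex* ix)
{
    ix->nused = 0;
    for (int i = 0; i < TABLE_SIZE; i++) ix->slot[i] = -1;
}

static void index_add(MyIndex* ix, Obj* o)
{
    unsigned long h = hash_name(o->name) % TABLE_SIZE;
    ix->vector[ix->nused] = o;              /* append keeps definition order */
    while (ix->slot[h] != -1) h = (h + 1) % TABLE_SIZE;   /* linear probing */
    ix->slot[h] = (int)ix->nused++;
}

static Obj* index_lookup(MyIndex* ix, const char* name)
{
    unsigned long h = hash_name(name) % TABLE_SIZE;
    while (ix->slot[h] != -1) {             /* probe until an empty slot */
        Obj* o = ix->vector[ix->slot[h]];
        if (strcmp(o->name, name) == 0) return o;
        h = (h + 1) % TABLE_SIZE;
    }
    return NULL;
}

int main(void)
{
    MyIndex ix;
    Obj a = {"time"}, b = {"lat"}, c = {"lon"};
    index_init(&ix);
    index_add(&ix, &a); index_add(&ix, &b); index_add(&ix, &c);
    printf("by name: %s\n", index_lookup(&ix, "lat")->name);
    for (size_t i = 0; i < ix.nused; i++)   /* ordered walk, as an id-based lookup would use */
        printf("by position %zu: %s\n", i, ix.vector[i]->name);
    return 0;
}
```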

WARNING:
1. since libsrc4 and libsrchdf4 share code, there are also
   changes in libsrchdf4.
2. Any outstanding pull requests that change libsrc4 or libhdf4
   are likely to cause conflicts with this code.
3. The original reason for doing this was performance improvement,
   but as noted elsewhere, the gain may not be significant because
   the meta-data read performance apparently is dominated
   by the hdf5 library, since we do bulk meta-data reading rather
   than lazy reading.
2018-03-16 11:46:18 -06:00
Dennis Heimbigner
8cb1fc4cfe This is the second step in refactoring the libsrc4 code.
The first was branch newhash0.dmh.

As with newhash0.dmh, these changes should be transparent.
2018-02-24 20:36:24 -07:00
Dennis Heimbigner
727b613459 This is the initial step in moving to the new higher performance
(I hope) metadata mechanism. This mostly just adds new pieces of
code (e.g. nclistmap) and does some minor fixes.

It should be transparent to everything else.
The next set of changes will be the big step.
2018-02-08 19:53:40 -07:00
Ward Fisher
cfb6549606 Corrected an issue reported by clang. 2017-08-15 12:03:15 -06:00
dmh
19d377e87a conflict resolution 2014-03-10 12:13:16 -06:00
dmh
9d5c8ba308 1. The previous check-in of conversion fixes in libdispatch
   was incomplete; complete the fix.
2. There is an inconsistency in the netcdf.h interface
   about whether rank (# dimensions) is an int or
   size_t. I inadvertently assumed the latter, and that
   breaks some API calls; so revert back.
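For context, a small illustration (not from the check-in): the public API reports rank through an `int*` in `nc_inq_varndims`, so code that carries rank as `size_t` has to convert at the boundary rather than pass its address directly.

```c
#include <netcdf.h>

/* Fetch a variable's rank, converting the API's int to size_t explicitly. */
int get_rank(int ncid, int varid, size_t* rankp)
{
    int ndims, stat;
    if ((stat = nc_inq_varndims(ncid, varid, &ndims)) != NC_NOERR)
        return stat;
    *rankp = (size_t)ndims;
    return NC_NOERR;
}
```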
2014-03-10 12:09:36 -06:00
Ward Fisher
513cb29ea4 The replacement of 'int' with 'size_t' caused a number of errors across multiple platforms in our CI system. The culprit is not obvious, so I am reverting the commit until we can take a closer look at the underlying issue.
Revert "1. Attempted to reduce the number of conversion errors"

This reverts commit 1b941d342d.
2014-03-10 10:04:37 -06:00
dmh
1b941d342d 1. Attempted to reduce the number of conversion errors
when -Wconversion is set. Fixed libdispatch, but
   the rest of netcdf remains to be done.
2014-03-09 15:51:45 -06:00
Ward Fisher
f43bf8f1da Addressed a handful of issues identified by
Coverity static analysis.
2013-08-05 20:36:33 +00:00
Dennis Heimbigner
2a0d68c530 update nchashmap; move some old stuff in libdap2; fix new debug code 2012-08-19 21:12:35 +00:00
Dennis Heimbigner
16dee702b7 fix NCF-120 2011-09-15 16:57:16 +00:00
Ed Hartnett
b600df57c8 fixed file name problems 2011-07-14 21:39:44 +00:00
Dennis Heimbigner
0b477e29cc rebuilt dap constraints 2011-04-16 20:56:36 +00:00