Commit Graph

29 Commits

Author SHA1 Message Date
Dennis Heimbigner
f1506d552e Change (again), and hopefully simplify, the file model inference algorithm.
* For URL paths, the new approach essentially centralizes all information
  in the URL into the "#mode=" fragment key and uses that value
  to determine the dispatcher for (most) URLs.

* The new approach has the following steps:

  1. canonicalize the path if it is a URL.
  2. use the mode= fragment key to determine the dispatcher
  3. if dispatcher still not determined, then use the mode flags
     argument to nc_open/nc_create to determine the dispatcher.
  4. if the path points to something readable, attempt to read the
     magic number at the front, and use that to determine the dispatcher.
     this case may override all previous cases.

* Misc changes.

  1. Update documentation
  2. Moved some unit tests from libdispatch to unit_test directory.
  3. Fixed use of wrong #ifdef macro in test_filter_reg.c
     [I think this may fix an previously reported esupport query].
2019-09-29 12:59:28 -06:00
edwardhartnett
2cd228bcd4 porting changes from other PR 2019-09-16 11:28:18 -06:00
Ed Hartnett
cd79569e45 removed refcount from nc.h 2019-07-04 15:46:47 -06:00
Dennis Heimbigner
6934aa2e8b Thread safety: step 1: cleanup
re: https://github.com/Unidata/netcdf-c/issues/1373 (partial)

* Mark some global constants be const to indicate to make them easier to track.
* Hide direct access to the ncrc_globalstate behind a function call.
* Convert dispatch tables to constants (except the user defined ones)
  This has some consequences in terms of function arguments needing to be marked
  as const also.
* Remove some no longer needed global fields
* Aggregate all the globals in nclog.c
* Uniformly replace nc_sizevector{0,1} with NC_coord_{zero,one}
* Uniformly replace nc_ptrdffvector1 with NC_stride_one
* Remove some obsolete code
2019-03-30 14:06:20 -06:00
Dennis Heimbigner
bf2746b8ea Provide byte-range reading of remote datasets
re: issue https://github.com/Unidata/netcdf-c/issues/1251

Assume that you have the URL to a remote dataset
which is a normal netcdf-3 or netcdf-4 file.

This PR allows the netcdf-c to read that dataset's
contents as a netcdf file using HTTP byte ranges
if the remote server supports byte-range access.

Originally, this PR was set up to access Amazon S3 objects,
but it can also access other remote datasets such as those
provided by a Thredds server via the HTTPServer access protocol.
It may also work for other kinds of servers.

Note that this is not intended as a true production
capability because, as is known, this kind of access to
can be quite slow. In addition, the byte-range IO drivers
do not currently do any sort of optimization or caching.

An additional goal here is to gain some experience with
the Amazon S3 REST protocol.

This architecture and its use documented in
the file docs/byterange.dox.

There are currently two test cases:

1. nc_test/tst_s3raw.c - this does a simple open, check format, close cycle
   for a remote netcdf-3 file and a remote netcdf-4 file.
2. nc_test/test_s3raw.sh - this uses ncdump to investigate some remote
   datasets.

This PR also incorporates significantly changed model inference code
(see the superceded PR https://github.com/Unidata/netcdf-c/pull/1259).

1. It centralizes the code that infers the dispatcher.
2. It adds support for byte-range URLs

Other changes:

1. NC_HDF5_finalize was not being properly called by nc_finalize().
2. Fix minor bug in ncgen3.l
3. fix memory leak in nc4info.c
4. add code to walk the .daprc triples and to replace protocol=
   fragment tag with a more general mode= tag.

Final Note:
Th inference code is still way too complicated. We need to move
to the validfile() model used by netcdf Java, where each
dispatcher is asked if it can process the file. This decentralizes
the inference code. This will be done after all the major new
dispatchers (PIO, Zarr, etc) have been implemented.
2019-01-01 18:27:36 -07:00
Ward Fisher
5be0126920 More standardizing of the copyright stanza. 2018-12-06 14:13:56 -07:00
Wei-keng Liao
2e85316dbb Improve parallel create/open mode logic.
1. When model is detected as NC_FORMATX_NC3 and is called from
   nc_create_par, change the model to NC_FORMATX_PNETCDF.
2. When called from nc_create() or nc_open(), using NC_MPIIO or
   NC_MPIPOSIX is considered invalid.
3. Handle the case when NETCDF4 is not enabled but cmode/omode
   contains NC_NETCDF4.
4. Handle the case when PNETCDF is not enabled but cmode/omode
   contains NC_PNETCDF.
5. Correct comments about PnetCDF only handles CDF-5 files.
6. Add a check for MPI_ERR_NO_SUCH_FILE error class.

Make NC_check_file_type() static, as it is used in dfile.c only.
2018-09-17 17:18:48 -05:00
Dennis Heimbigner
8cb1fc4cfe This is the second step in refactoring the libsrc4 code.
The first was branch newhash0.dmh.

As with newhash0.dmh, these changes should be transparent.
2018-02-24 20:36:24 -07:00
Dennis Heimbigner
9983b9d911 re e-support UBS-599337
re pull request https://github.com/Unidata/netcdf-c/pull/405
re pull request https://github.com/Unidata/netcdf-c/pull/446

Notes:
1. This branch is a cleanup of the magic.dmh branch.
2. magic.dmh was originally merged, but caused problems with parallel IO.
   It was re-issued as pull request https://github.com/Unidata/netcdf-c/pull/446.
3. This branch + pull request replace any previous pull requests and magic.dmh branch.

Given an otherwise valid netCDF file that has a corrupted header,
the netcdf library currently crashes. Instead, it should return
NC_ENOTNC.

Additionally, the NC_check_file_type code does not do the
forward search required by hdf5 files. It currently only looks
at file position 0 instead of 512, 1024, 2048,... Also, it turns
out that the HDF4 magic number is assumed to always be at the
beginning of the file (unlike HDF5).
The change is localized to libdispatch/dfile.c See
https://support.hdfgroup.org/release4/doc/DSpec_html/DS.pdf

Also, it turns out that the code in NC_check_file_type is duplicated
(mostly) in the function libsrc4/nc4file.c#nc_check_for_hdf.

This branch does the following.
1. Make NC_check_file_type return NC_ENOTNC instead of crashing.
2. Remove nc_check_for_hdf and centralize all file format checking
   NC_check_file_type.
3. Add proper forward search for HDF5 files (but not HDF4 files)
   to look for the magic number at offsets of 0, 512, 1024...
4. Add test tst_hdf5_offset.sh. This tests that hdf5 files with
   an offset are properly recognized. It does so by prefixing
   a legal file with some number of zero bytes: 512, 1024, etc.
5. Off-topic: Added -N flag to ncdump to force a specific output dataset name.
2017-10-24 16:25:09 -06:00
Ward Fisher
8dddd222a3 Merged master, DAP4 support into branch. 2017-04-19 09:29:35 -06:00
Wei-keng Liao
08aef9af5d remove special treatments for arm architecture 2017-03-11 13:33:58 -06:00
Dennis Heimbigner
3db4f013bf Primary change: add dap4 support
Specific changes:
1. Add dap4 code: libdap4 and dap4_test.
   Note that until the d4ts server problem is solved, dap4 is turned off.
2. Modify various files to support dap4 flags:
	configure.ac, Makefile.am, CMakeLists.txt, etc.
3. Add nc_test/test_common.sh. This centralizes
   the handling of the locations of various
   things in the build tree: e.g. where is
   ncgen.exe located. See nc_test/test_common.sh
   for details.
4. Modify .sh files to use test_common.sh
5. Obsolete separate oc2 by moving it to be part of
   netcdf-c. This means replacing code with netcdf-c
   equivalents.
5. Add --with-testserver to configure.ac to allow
   override of the servers to be used for --enable-dap-remote-tests.
6. There were multiple versions of nctypealignment code. Try to
   centralize in libdispatch/doffset.c and include/ncoffsets.h
7. Add a unit test for the ncuri code because of its complexity.
8. Move the findserver code out of libdispatch and into
   a separate, self contained program in ncdap_test and dap4_test.
9. Move the dispatch header files (nc{3,4}dispatch.h) to
   .../include because they are now shared by modules.
10. Revamp the handling of TOPSRCDIR and TOPBUILDDIR for shell scripts.
11. Make use of MREMAP if available
12. Misc. minor changes e.g.
	- #include <config.h> -> #include "config.h"
	- Add some no-install headers to /include
	- extern -> EXTERNL and vice versa as needed
	- misc header cleanup
	- clean up checking for misc. unix vs microsoft functions
13. Change copyright decls in some files to point to LICENSE file.
14. Add notes to RELEASENOTES.md
2017-03-08 17:01:10 -07:00
Ward Fisher
7edb08977a Added platform checks for ARM. 2017-02-03 11:19:39 -07:00
Wei-keng Liao
dbb9be59d2 temporally disable __arm__ for testing 2016-10-09 22:21:32 -05:00
Dennis Heimbigner
11a259ad86 Add provenance info for netcdf-4 files.
This consists of a persistent attribute named
_NCProperties plus two computed attributes
_IsNetcdf4 and _SuperblockVersion.
See the 'Provenance Attributes' section
of docs/attribute_conventions.md for details.
2016-05-07 14:32:07 -06:00
Dennis Heimbigner
a8ff523677 ckp 2016-04-06 14:05:58 -06:00
Ward Fisher
bd92caa176 More changes towards fixing the issue where the file is read incorrectly. 2015-12-28 17:36:05 +00:00
dmh
deeca5fb83 disable refcounting until semantics are decided 2014-03-07 14:56:52 -07:00
dmh
e3545bf88d 1. Allow files to be opened by name multiple
times, but returning the same ncid.
2. Separate out the dap auth tests
   and make them disabled by default.
3. turn of ncdap_test/test_varm3 until
   we can find a copy of coads_climatology.nc
2014-03-07 12:04:38 -07:00
dmh
582410a407 [NCF-273]/HZY-708311
Add a new function called nc_inq_format_extended that
returns more detailed format information (vis-a-vis
nc_inq_format) about an open dataset.

Note that the netcdf API will present the file as if it had
the format specified by nc_inq_format.  The true file
format, however, may not even be a netcdf file; it might be
DAP, HDF4, or PNETCDF, for example. This function returns
that true file type.  It also returns the effective mode for
the file.

signature: nc_inq_format_extended(int ncid, int* formatp, int* modep)
where
* ncid is the NetCDF ID from a previous call to nc_open() or
  nc_create().
* formatp is a pointer to a location for returned true format.
* modep is a pointer to a location for returned mode flags.

Refer to the actual list in the file netcdf.h to see the
currently defined set.

Also added test cases (tst_formatx*).
2013-12-22 12:53:20 -07:00
Dennis Heimbigner
5ca78309cc The effect of this change is to make the struct NC structure
contain as little file-type specific info as possible.  It
modifies especially libsrc so that all of the netcdf-3 data
that used to be in struct NC is now kept in a separate chunk
of data pointed to by the struct NC. This makes all of
current protocols consistent: netcdf-3, netcdf-4, and dap.
2012-09-06 19:44:03 +00:00
Russ Rew
4d6f11288b Fix Coverity complaint about passing big parameter by value 2012-08-17 22:00:36 +00:00
Dennis Heimbigner
bc3a732e25 fix e-support YOP-841363 plus a number of coverity found bugs 2012-08-02 18:43:21 +00:00
Dennis Heimbigner
93f722f594 accidental variable declaration in netcdf.h 2012-03-26 17:03:09 +00:00
Dennis Heimbigner
7e27052f87 - Implemented diskless files for both netcdf classic and extended.
The in-memory files can be made persistent if nc_create is called with
  NC_DISKLESS|NC_WRITE flags set. Initial test case also included.
- Modified ncio mechanism to support
  multiple ncio packages; this is so we
  can have posixio and memio operating
  at the same time.
- cleanup up a bunch of lint issues (unused variables, etc).
2012-03-26 01:34:32 +00:00
Dennis Heimbigner
d10a2605ce moved librpc/ from major branch to trunk 2011-10-23 20:17:56 +00:00
Dennis Heimbigner
d7790e7e7e 2011-09-16 18:36:08 +00:00
Ed Hartnett
d58c18c623 added diskless files API, subsetting not working, classic model only 2011-08-16 21:04:33 +00:00
Ed Hartnett
965a3aac70 minor refactor of the build system to work better for cross-compiling 2011-03-15 10:19:08 +00:00