The file docs/indexing.dox tries to provide design
information for the refactoring.
The primary change is to replace all walking of linked
lists with the use of the NCindex data structure.
Ncindex is a combination of a hash table (for name-based
lookup) and a vector (for walking the elements in the index).
Additionally, global vectors are added to NC_HDF5_FILE_INFO_T
to support direct mapping of an e.g. dimid to the NC_DIM_INFO_T
object. These global vectors exist for dimensions, types, and groups
because they have globally unique id numbers.
WARNING:
1. since libsrc4 and libsrchdf4 share code, there are also
changes in libsrchdf4.
2. Any outstanding pull requests that change libsrc4 or libhdf4
are likely to cause conflicts with this code.
3. The original reason for doing this was for performance improvements,
but as noted elsewhere, this may not be significant because
the meta-data read performance apparently is being dominated
by the hdf5 library because we do bulk meta-data reading rather
than lazy reading.
In non-classic netcdf-4 models, it is allowable to have
large numbers of dims and vars. In many operations, the
entire list of dims or vars is searched for a dim/var matching
a specific name which results in *lots* of strncmp or strcmp
calls.
If we add a hash field to the var and dim structs similar to what
has already been done for the netcdf-3 formats, then we can hash the
name being searched for and numerically compare that value with
the var/dim hash value. If they match, then do a more expensive
strncmp call to ensure that the names truly match.
to clean up resources properly on failure.
Refactored doubly-linked list code for objects in the libsrc4 directory,
cleaning up the add/del routines, breaking out the common next/prev
pointers into a struct and extracting the add/del operations on them,
changed the list of dims to add new dims in the same order as the other
types, made all add routines able to optionally return a pointer to the
newly created object.
Removed some dead code (pg_var(), nc4_pg_var1(), nc4_pg_varm(), misc. small
routines, etc)
Fixed fill value handling for string types in nc4_get_vara().
Changed many malloc()+strcpy() pairs into calls to strdup().
Cleaned up misc. other minor Coverity issues.
many cleanups to fix compiler warnings, streamline iteration over objects
in HDF5 file when opening the file, and generally straightening out the code
to be cleaner and simpler.
Tested on Mac OS/X with gcc 4.8 and OpenMPI (which uses clang).
with parallel-netcdf, to fully implement parallel-netcdf support for
other functions, and to prevent a hang in hdf5 from an eary return in
an nc4_put_vara() call. Also fixed an nccopy bug when
nc_inq_var_deflate() returns defalate_level of 0, but says the variable
is deflated.
contain as little file-type specific info as possible. It
modifies especially libsrc so that all of the netcdf-3 data
that used to be in struct NC is now kept in a separate chunk
of data pointed to by the struct NC. This makes all of
current protocols consistent: netcdf-3, netcdf-4, and dap.