1db3d07beb
The current library seems to have some behavior which is N^2 in the number of vars in a file. The `NC4_inq_dim` routine calls down to `nc4_find_dim_len`, which iterates through each `var` in the file/group, calls `find_var_dim_max_length` on each one, and takes the largest length of the dim across those vars. This is done only for unlimited vars.

I have a file with 129 dims and 1630 vars. The unlimited dimension has length 41. In my test program, I am reading data from 4 files which have the same dim and var counts, reading every 4th time step (unlimited dimension). If I run a profile, I see that 98.2% of the program time is in the `nc_get_vara_float` call tree, and most of that is in `find_var_dim_max_length` (94.8%). There are 66,142 calls to `nc_get_vara_float`, resulting in 107,307,290 calls to `find_var_dim_max_length`, with twice that number of calls to `malloc`/`free` and calls to 5 HDF5 routines. All of this, at least in my case, to return the same `41` each time.

The proof-of-concept patch here checks whether the file is read-only (or no_write) and, if so, caches the value of the dim length the first time it is calculated. With this change, my example run is sped up by a factor of 60. The time for `NC4_inq_dim` and below drops from 97.2% down to 2.7%.

I'm not sure whether this is the correct fix, or if there is some behavior that I am overlooking, but my users would definitely like a 10 second run compared to a 10 minute run...

This is on the current NetCDF master branch. I will try to attach some valgrind/callgrind profiles.
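The call counts above work out to roughly 107,307,290 / 66,142 ≈ 1,622 scans per read, close to the 1630 vars in the file, which fits the per-var loop described. The following is a minimal sketch of the caching idea, not the actual patch: every `toy_*` name, the struct layouts, and the size limits are hypothetical stand-ins for the library's internals, and the HDF5 calls are replaced by a simple array lookup. It assumes the cache is safe only when the file is read-only, since a writable file could extend the unlimited dimension between queries.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DIMS 8      /* hypothetical limits, chosen only for this sketch */
#define TOY_MAX_VAR_DIMS 4
#define TOY_MAX_VARS 64

/* Hypothetical, simplified variable: its dim ids and current extents. */
typedef struct {
    int ndims;
    int dimids[TOY_MAX_VAR_DIMS];
    size_t dimlens[TOY_MAX_VAR_DIMS];
} toy_var;

/* Hypothetical, simplified file: variables plus a per-dimension length cache. */
typedef struct {
    int read_only;
    size_t nvars;
    toy_var vars[TOY_MAX_VARS];
    int len_cached[TOY_MAX_DIMS];     /* nonzero once cached_len[d] is valid */
    size_t cached_len[TOY_MAX_DIMS];
    unsigned long scans;              /* counts per-variable scans */
} toy_file;

/* Stand-in for find_var_dim_max_length: one variable's extent along dimid. */
static size_t toy_var_dim_max_length(toy_file *f, const toy_var *v, int dimid)
{
    size_t len = 0;
    f->scans++;
    for (int i = 0; i < v->ndims; i++)
        if (v->dimids[i] == dimid && v->dimlens[i] > len)
            len = v->dimlens[i];
    return len;
}

/* Stand-in for nc4_find_dim_len: scan every variable unless a cached
 * value exists and the file is read-only. */
static size_t toy_find_dim_len(toy_file *f, int dimid)
{
    size_t len = 0;
    assert(dimid >= 0 && dimid < TOY_MAX_DIMS);
    if (f->read_only && f->len_cached[dimid])
        return f->cached_len[dimid];
    for (size_t i = 0; i < f->nvars; i++) {
        size_t l = toy_var_dim_max_length(f, &f->vars[i], dimid);
        if (l > len)
            len = l;
    }
    if (f->read_only) {               /* never cache for writable files */
        f->cached_len[dimid] = len;
        f->len_cached[dimid] = 1;
    }
    return len;
}

/* Build nvars 1-D variables on dim 0 (length 41), query the length
 * nreads times, and return how many per-variable scans were needed. */
static unsigned long toy_count_scans(int read_only, size_t nvars, size_t nreads)
{
    toy_file f = {0};
    assert(nvars <= TOY_MAX_VARS);
    f.read_only = read_only;
    f.nvars = nvars;
    for (size_t i = 0; i < nvars; i++) {
        f.vars[i].ndims = 1;
        f.vars[i].dimids[0] = 0;
        f.vars[i].dimlens[0] = 41;
    }
    for (size_t r = 0; r < nreads; r++) {
        size_t len = toy_find_dim_len(&f, 0);
        assert(len == 41);
        (void)len;
    }
    return f.scans;
}
```

In this toy model, `toy_count_scans(0, nvars, nreads)` grows as `nvars * nreads`, while `toy_count_scans(1, nvars, nreads)` stays at `nvars` regardless of the number of reads, which is the effect the patch aims for on read-only files.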
- .github
- cmake
- conda.recipe
- ctest_scripts
- dap4_test
- debug
- docs
- examples
- h5_test
- hdf4_test
- include
- libdap2
- libdap4
- libdispatch
- libhdf4
- libhdf5
- liblib
- libsrc
- libsrc4
- libsrcp
- nc_perf
- nc_test
- nc_test4
- ncdap_test
- ncdump
- ncgen
- ncgen3
- nctest
- NUG
- oc2
- plugins
- unit_test
- .gitignore
- .travis.yml
- acinclude.m4
- appveyor.yml
- bootstrap
- cmake_uninstall.cmake.in
- CMakeInstallation.cmake
- CMakeLists.txt
- COMPILE.cmake.txt
- config.h.cmake.in
- config.h.cmake.in.old-works
- configure.ac
- COPYRIGHT
- CTestConfig.cmake.in
- CTestCustom.cmake
- dods.m4
- FixBundle.cmake.in
- INSTALL.md
- lib_flags.am
- libnetcdf.settings.in
- Makefile.am
- mclean
- nc-config.cmake.in
- nc-config.in
- netcdf.pc.in
- netCDFConfig.cmake.in
- PostInstall.cmake
- postinstall.sh.in
- README.md
- RELEASE_NOTES.md
- test_common.in
- test_prog.c
- test-driver-verbose
- wjna
Unidata NetCDF
About
The Unidata network Common Data Form (netCDF) is an interface for scientific data access and a freely-distributed software library that provides an implementation of the interface. The netCDF library also defines a machine-independent format for representing scientific data. Together, the interface, library, and format support the creation, access, and sharing of scientific data. The current netCDF software provides C interfaces for applications and data. Separate software distributions available from Unidata provide Java, Fortran, Python, and C++ interfaces. They have been tested on various common platforms.
Properties
NetCDF files are self-describing, network-transparent, directly accessible, and extendible. Self-describing means that a netCDF file includes information about the data it contains. Network-transparent means that a netCDF file is represented in a form that can be accessed by computers with different ways of storing integers, characters, and floating-point numbers. Direct-access means that a small subset of a large dataset may be accessed efficiently, without first reading through all the preceding data. Extendible means that data can be appended to a netCDF dataset without copying it or redefining its structure.
Use
NetCDF is useful for supporting access to diverse kinds of scientific data in heterogeneous networking environments and for writing application software that does not depend on application-specific formats. For information about a variety of analysis and display packages that have been developed to analyze and display data in netCDF form, see
More information
For more information about netCDF, see
Latest releases
You can obtain a copy of the latest released version of netCDF software for various languages:
Copyright
Copyright and licensing information can be found here, as well as in the COPYRIGHT file accompanying the software.
Installation
To install the netCDF-C software, please see the file INSTALL in the netCDF-C distribution, or the (usually more up-to-date) document:
Documentation
A language-independent User's Guide for netCDF, and some other language-specific user-level documents are available from:
- Language-independent User's Guide
- NetCDF-C Tutorial
- Fortran-90 User's Guide
- Fortran-77 User's Guide
- netCDF-Java/Common Data Model library
- netCDF4-python
A mailing list, netcdfgroup@unidata.ucar.edu, exists for discussion of the netCDF interface and announcements about netCDF bugs, fixes, and enhancements. For information about how to subscribe, see the URL
Feedback
We appreciate feedback from users of this package. Please send comments, suggestions, and bug reports to support-netcdf@unidata.ucar.edu.