mirror of https://github.com/Unidata/netcdf-c.git synced 2025-01-30 16:10:44 +08:00

Greg Sjaardema 1db3d07beb Proof-of-Concept: Avoid N^2 behavior in NC4_inq_dim	2020-04-30 11:01:10 -06:00

The current library seems to have some behavior which is N^2 in the number of vars in a file. The `NC4_inq_dim` routine calls down to `nc4_find_dim_len` which iterates through each `var` in the file/group and calls `find_var_dim_max_length` on each var and finds the largest length of the dim on each of those vars. This is done only for unlimited vars.

I have a file with 129 dim and 1630 vars. The unlimited dimension is of length 41. In my test program, I am reading data from 4 files which have the same dim and var count and reading every 4th time step (unlimited dimension).

If I run a profile, I see that 98.2% of the program time is in the `nc_get_vara_float` call tree and most of that is in `find_var_dim_max_length` (94.8%). There are 66,142 calls to `nc_get_vara_float` resulting in 107,307,290 calls to `find_var_dim_max_length` with twice that number of calls to `malloc/free` and calls to 5 HDF5 routines. All of this, at least in my case, to return the same `41` each time.

The proof of concept patch here will check whether the file is read-only (or no_write) and if so, it will cache the value of the dim length the first time it is calculated. With this change, my example run is sped up by a factor of 60. The time for `NC4_inq_dim` and below drops from 97.2% down to 2.7%.

I'm not sure whether this is the correct fix, or if there is some behavior that I am overlooking, but my users would definitely like a 10 second run compared to a 10 minute run...

This is on current Netcdf master branch. I will try to attach some valgrind/callgrind profiles.
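
The caching idea described in the commit message can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the actual patch: `demo_dim`, `demo_file`, `scan_max_len`, and `demo_inq_dim_len` are hypothetical names, and the per-variable lengths are held in a plain array rather than queried from HDF5. It shows the one point the message makes: when the file cannot be written, the length found by the first full scan can be reused instead of rescanning every var on each call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the library's internal structs. */
typedef struct {
    size_t len;         /* cached length; meaningful only if len_cached */
    bool   len_cached;  /* true once a read-only scan has stored len */
} demo_dim;

typedef struct {
    bool          read_only;  /* assumption: file opened without write access */
    size_t        nvars;      /* number of vars using the unlimited dim */
    const size_t *var_len;    /* current extent of each var along that dim */
    unsigned long scans;      /* counts full scans, for demonstration only */
} demo_file;

/* The expensive path: visit every var and keep the largest extent. */
static size_t scan_max_len(demo_file *f)
{
    size_t max = 0;
    f->scans++;
    for (size_t i = 0; i < f->nvars; i++)
        if (f->var_len[i] > max)
            max = f->var_len[i];
    return max;
}

/* Return the dim length, reusing the cached value for read-only files. */
static size_t demo_inq_dim_len(demo_file *f, demo_dim *d)
{
    assert(f != NULL && d != NULL);

    if (f->read_only && d->len_cached)
        return d->len;            /* cheap path: no scan needed */

    size_t len = scan_max_len(f);

    /* Cache only when the length cannot change underneath us. */
    if (f->read_only) {
        d->len = len;
        d->len_cached = true;
    }
    return len;
}
```

With a read-only `demo_file`, repeated calls to `demo_inq_dim_len` perform a single scan; with a writable one, every call rescans, which keeps the result correct when the unlimited dimension may still grow.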
.github
cmake
conda.recipe
ctest_scripts	Added a ctest script with DAP tests enabled.	2020-02-11 15:09:29 -07:00
dap4_test	fixed distclean target in dap4_test	2020-01-23 04:40:37 -07:00
debug	force github checks restart	2020-03-29 14:50:28 -06:00
docs	Tweaked docs to fix dead references introduced as part of separating out NUG from netCDF-C.	2020-03-27 14:21:25 -06:00
examples	Add support for multiple filters per variable.	2020-02-16 12:59:33 -07:00
h5_test	fixed missing declaration	2020-04-23 23:32:29 -05:00
hdf4_test
include	Merge remote-tracking branch 'upstream/master'	2020-04-23 15:36:14 -05:00
libdap2	Fix conflicts with master	2020-02-27 14:06:45 -07:00
libdap4	Use proper CURLOPT values for VERIFYHOST and VERIFYPEER	2020-04-10 13:42:27 -06:00
libdispatch	cleanup	2020-04-15 05:53:59 -06:00
libhdf4	Fix reclamation of the ->format_XXX_info fields	2020-03-29 12:48:59 -06:00
libhdf5	Proof-of-Concept: Avoid N^2 behavior in NC4_inq_dim	2020-04-30 11:01:10 -06:00
liblib	Updated so version info in line with guidelines found at https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html	2020-03-26 11:26:10 -06:00
libsrc	Fix conflicts with master	2020-02-27 14:06:45 -07:00
libsrc4	Fix reclamation of the ->format_XXX_info fields	2020-03-29 12:48:59 -06:00
libsrcp	Fix conflicts with master	2020-02-27 14:06:45 -07:00
nc_perf	fix for memory leak due to HDF5 types	2020-02-09 11:47:13 -07:00
nc_test	whitespace cleanup of test	2020-04-15 06:10:12 -06:00
nc_test4	Fix missing forward declarations	2020-04-03 20:15:34 -06:00
ncdap_test
ncdump	Correcting a formatting error for scalars when dumping with ncdump -f	2020-04-28 15:49:03 -06:00
ncgen	Make utilities support NC_COMPACT	2020-02-29 12:06:21 -07:00
ncgen3
nctest
NUG	Use proper CURLOPT values for VERIFYHOST and VERIFYPEER	2020-04-10 13:42:27 -06:00
oc2	Use proper CURLOPT values for VERIFYHOST and VERIFYPEER	2020-04-10 13:42:27 -06:00
plugins	Fix missing forward declarations	2020-04-03 20:15:34 -06:00
unit_test	fixed warning	2020-03-02 16:29:52 -07:00
.gitignore	Shuffling NUG and documentation.	2020-02-06 16:14:25 -07:00
.travis.yml
acinclude.m4
appveyor.yml
bootstrap
cmake_uninstall.cmake.in
CMakeInstallation.cmake
CMakeLists.txt	Use proper CURLOPT values for VERIFYHOST and VERIFYPEER	2020-04-10 13:42:27 -06:00
COMPILE.cmake.txt
config.h.cmake.in	Use proper CURLOPT values for VERIFYHOST and VERIFYPEER	2020-04-10 13:42:27 -06:00
config.h.cmake.in.old-works
configure.ac	Use proper CURLOPT values for VERIFYHOST and VERIFYPEER	2020-04-10 13:42:27 -06:00
COPYRIGHT
CTestConfig.cmake.in
CTestCustom.cmake
dods.m4
FixBundle.cmake.in
INSTALL.md
lib_flags.am
libnetcdf.settings.in	Merge pull request #1619 from NOAA-GSD/ejh_more_szip	2020-02-06 12:42:27 -07:00
Makefile.am	Add support for multiple filters per variable.	2020-02-16 12:59:33 -07:00
mclean
nc-config.cmake.in
nc-config.in
netcdf.pc.in
netCDFConfig.cmake.in	Correct typo.	2020-01-24 16:53:16 -07:00
PostInstall.cmake
postinstall.sh.in
README.md	Correcting dead link to installation	2020-04-24 16:44:07 -06:00
RELEASE_NOTES.md	Updated release notes.	2020-04-28 15:52:40 -06:00
test_common.in
test_prog.c
test-driver-verbose
wjna

README.md

Unidata NetCDF

About

The Unidata network Common Data Form (netCDF) is an interface for scientific data access and a freely-distributed software library that provides an implementation of the interface. The netCDF library also defines a machine-independent format for representing scientific data. Together, the interface, library, and format support the creation, access, and sharing of scientific data. The current netCDF software provides C interfaces for applications and data. Separate software distributions available from Unidata provide Java, Fortran, Python, and C++ interfaces. They have been tested on various common platforms.

Properties

NetCDF files are self-describing, network-transparent, directly accessible, and extendible. Self-describing means that a netCDF file includes information about the data it contains. Network-transparent means that a netCDF file is represented in a form that can be accessed by computers with different ways of storing integers, characters, and floating-point numbers. Direct-access means that a small subset of a large dataset may be accessed efficiently, without first reading through all the preceding data. Extendible means that data can be appended to a netCDF dataset without copying it or redefining its structure.

Use

NetCDF is useful for supporting access to diverse kinds of scientific data in heterogeneous networking environments and for writing application software that does not depend on application-specific formats. For information about a variety of packages that analyze and display data in netCDF form, see

Software for Manipulating or Displaying NetCDF Data

More information

For more information about netCDF, see

Unidata Network Common Data Form (NetCDF)

Latest releases

You can obtain a copy of the latest released version of netCDF software for various languages:

Copyright

Installation

To install the netCDF-C software, please see the file INSTALL.md in the netCDF-C distribution, or the (usually more up-to-date) document:

Building NetCDF

Documentation

A language-independent User's Guide for netCDF and some language-specific user-level documents are available from:

A mailing list, netcdfgroup@unidata.ucar.edu, exists for discussion of the netCDF interface and announcements about netCDF bugs, fixes, and enhancements. For information about how to subscribe, see the URL

Unidata netCDF Mailing-Lists

Feedback

We appreciate feedback from users of this package. Please send comments, suggestions, and bug reports to support-netcdf@unidata.ucar.edu.