netcdf-c/ncdump/nccopy.1

210 lines
9.4 KiB
Groff

.\" $Id: nccopy.1 400 2010-08-27 21:02:52Z russ $
.TH NCCOPY 1 "$Date$" "Printed: \n(yr-\n(mo-\n(dy" "UNIDATA UTILITIES"
.SH NAME
nccopy \- Copy a netCDF file to specified variant of netCDF format,
optionally compressing or chunking data in the output copy.
.SH SYNOPSIS
.ft B
.HP
nccopy
.nh
\%[-k \fI kind \fP]
\%[-c \fI chunkspec \fP]
\%[-d \fI n \fP]
\%[-s]
\%[-u]
\%[-m \fI bufsize \fP]
\%[-h \fI chunk_cache \fP]
\%[-e \fI cache_elems \fP]
\%\fI infile \fP
\%\fI outfile \fP
.hy
.ft
.SH DESCRIPTION
.LP
\fBnccopy\fP
copies an input netCDF file (in any of the four format variants) to an
output netCDF file, in any of the four format variants, if possible.
For example, if built with the netCDF-3 library, a netCDF classic file
may be copied to a netCDF 64-bit offset file, permitting larger
variables.
If built with the netCDF-4 library, a netCDF classic file may be
copied to a netCDF-4 file or to a netCDF-4 classic
model file as well, permitting data compression, later efficient schema changes, larger variable sizes, and use of other netCDF-4
features in case the output uses the enhanced netCDF model.
.LP
\fB nccopy \fP also serves as an example of a generic netCDF-4 program,
with its ability to read any valid netCDF file and handle nested
groups, strings, and any user-defined types, including arbitrarily
nested compound types, variable-length types, and data of any valid
netCDF-4 type. Other generic utility programs can make use of parts
of \fB nccopy \fP for more complex operations on netCDF data.
.LP
As of netCDF version 4.1, and if DAP support was enabled when
\fB nccopy \fP
was built, the file name may specify a DAP URL. This allows \fB nccopy \fP
to convert data on DAP servers to local netCDF files.
.SH OPTIONS
.IP "\fB -k \fP \fI kind \fP"
Specifies the kind of file to be created (that is, the format variant)
and, by inference,
the data model (i.e. netcdf-3 (classic) versus
netcdf-4 (enhanced)).
The possible arguments are as follows.
.RS
.RS
.IP "'1', 'classic' => netcdf classic file format, netcdf-3 type model."
.IP "'2', '64-bit-offset', '64-bit offset' => netcdf 64 bit classic file format, netcdf-3 type model."
.IP "'3', 'hdf5', 'netCDF-4', 'enhanced' => netcdf-4 file format, netcdf-4 type model."
.IP "'4', 'hdf5-nc3', 'netCDF-4 classic model', 'enhanced-nc3' => netcdf-4 file format, netcdf-3 type model."
.RE
.RE
If no value for -k is specified, then the output will use the same
format as the input, except if the input is classic or 64-bit offset
and either chunking or compression is specified, in which case the output will be netCDF-4 classic
model format.
Note that attempting some kinds of format
conversion will result in an error, if the conversion is not
possible. For example, an attempt to copy a netCDF-4 file that uses
features of the enhanced model to any of the other kinds of netCDF
formats that use the classic model will result in an error.
.IP "\fB -d \fP \fI n \fP"
Specify deflation level (level of compression) for variable data in
output. 0 corresponds to no compression and 9 to maximum compression,
with higher levels of compression requiring marginally more time to
compress or uncompress than lower levels. Compression achieved may
also depend on chunking parameters, which will use default chunking in the current nccopy
implementation. If this option is specified for a classic format or
64-bit offset format input file, it is not necessary to also specify
that the output should be netCDF-4 classic model, as that will
be the default. If this option is not specified and the input file
has compressed variables, the compression will still be preserved
in the output, using the same chunking as in the input.
.P
Note that \fB nccopy \fP requires all variables to be compressed using
the same compression level, but the API has no such restriction. With
a program you can customize compression for each variable independently.
.IP "\fB -s \fP"
Specify shuffling of variable data bytes before compression or after
decompression. This option is ignored unless a non-zero deflation
level is specified. Turning shuffling on sometimes improves
compression.
.IP "\fB -u \fP"
Convert any unlimited size dimensions in the input to fixed size
dimensions in the output.
.IP "\fB -c \fP \fI chunkspec \fP"
Specify chunking (multidimensional tiling) for variable data in
the output, useful to specify the units of disk access, compression, or
other filters such as checksums.
The \fI chunkspec \fP argument is a string of comma-separated
associations, each specifying a dimension name, a `/' character, and
optionally the corresponding chunk length for that dimension. No
blanks should appear in the chunkspec string, except possibly escaped
blanks that are part of a dimension name. A
chunkspec must name at least one dimension, and may omit dimensions
which are not to be chunked or for which the default chunk length is
desired. If a dimension name is followed by a `/' character but no
subsequent chunk length, the actual dimension length is assumed. If
copying a classic model file to a netCDF-4 output file and not naming
all dimensions in the chunkspec, unnamed dimensions will also use the
actual dimension length for the chunk length.
An example of a chunkspec
for variables that use the `m' and `n' dimensions might be
`m/100,n/200' to specify 100 by 200 chunks. To see the chunking
resulting from copying with a chunkspec, use the `-s'
option of ncdump on the output file.
.IP "\fB -m \fP \fI bufsize \fP"
An integer or floating-point number that specifies the size, in bytes,
of the copy buffer used
to copy large variables. A suffix of K, M, G, or T multiplies
the copy buffer size by one thousand, million, billion, or trillion, respectively.
The default is 5,000,000 bytes,
but will be increased if necessary to hold at least one chunk of
netCDF-4 chunked variables in the input file. You may want to specify
a value larger than the default for copying large files over high
latency networks.
.IP "\fB -h \fP \fI chunk_cache \fP"
An integer or floating-point number that specifies the size in bytes
of chunk cache for chunked variables. This is
not a property of the file, but merely a performance tuning parameter
for avoiding compressing or decompressing the same data multiple times
while copying and changing chunk shapes. A suffix of K, M, G, or T multiplies
the chunk cache size by one thousand, million, billion, or trillion, respectively.
The default is 4,194,304 (or whatever was specified for the
configure-time constant CHUNK_CACHE_SIZE when the netCDF library was
built). Ideally, the nccopy utility should accept only one memory
buffer size and divide it optimally between a copy buffer and chunk
cache, but no general algorithm for computing the optimum chunk cache
size has been implemented yet.
.IP "\fB -e \fP \fI cache_elems \fP"
Specifies number of elements that the chunk cache can hold. This is
not a property of the file, but merely a performance tuning parameter
for avoiding compressing or decompressing the same data multiple times
while copying and changing chunk shapes. The default is 1009 (or
whatever was specified for the configure-time constant
CHUNK_CACHE_NELEMS when the netCDF library was built). Ideally, the
nccopy utility should determine an optimum value for this parameter,
but no general algorithm for computing the optimum number of chunk
cache elements has been implemented yet.
.P
Note that \fB nccopy \fP requires variables that share a dimension to
also share the chunk size associated with that dimension, but the API
has no such restriction. If you need to customize chunking
for each variable independently, you will need to use the library API
in a custom utility program.
.SH EXAMPLES
.LP
Make a copy of foo1.nc, a netCDF file of any type, to foo2.nc, a
netCDF file of the same type:
.RS
.HP
nccopy foo1.nc foo2.nc
.RE
Note that the above copy will not be as fast as use of a
simple copy utility, because the file is copied using
only the netCDF
API. If the input file has extra bytes
after the end of the
netCDF data, those will not be copied, because they are not accessible
through the netCDF interface. If the original file was generated in
`No fill' mode so that fill values are not stored for padding for data
alignment, the output file may have different padding bytes.
.LP
Convert a netCDF-4 classic model file, compressed.nc, that uses compression,
to a netCDF-3 file classic.nc:
.RS
.HP
nccopy -k classic compressed.nc classic.nc
.RE
Note that `1' could be used instead of `classic'.
.LP
Download the variable `time_bnds' and it's associated attributes from
an OPeNDAP server and copy the result to a netCDF file named `tb.nc':
.RS
.HP
nccopy 'http://test.opendap.org/opendap/data/nc/sst.mnmean.nc.gz?time_bnds' tb.nc
.RE
Note that URLs that name specific variables as command-line arguments
should generally be quoted, to avoid the shell interpreting special
characters such as `?'.
.LP
Compress all the variables in the input file foo.nc, a netCDF file of any
type, to the output file bar.nc:
.RS
.HP
nccopy -d1 foo.nc bar.nc
.RE
If foo.nc was a classic or 64-bit offset netCDF file, bar.nc will be a
netCDF-4 classic model netCDF file, because the classic and 64-bit
offset format variants don't support compression. If foo.nc was a
netCDF-4 file with some variables compressed using various deflation
levels, the output will also be a netCDF-4 file of the same type, but
all the variables, including any uncompressed variables in the input,
will now use deflation level 1.
.SH "SEE ALSO"
.LP
.BR ncdump(1), ncgen (1),
.BR netcdf (3)