mirror of
https://github.com/Unidata/netcdf-c.git
synced 2024-12-09 08:11:38 +08:00
Merge pull request #2388 from rkouznetsov/lossydoc
Add clarification for the meaning of NSB
This commit is contained in:
commit
4bdb093fca
@ -611,7 +611,99 @@ As part of its testing, the NetCDF build process creates a number of shared libr
|
||||
If you need a filter from that set, you may be able to set *HDF5\_PLUGIN\_PATH*
|
||||
to point to that directory or you may be able to copy the shared libraries out of that directory to your own location.
|
||||
|
||||
## Debugging {#filters_debug}
|
||||
# Lossy One-Way Filters
|
||||
|
||||
As of NetCDF version 4.8.2, the netcdf-c library supports
|
||||
bit-grooming filters.
|
||||
````
|
||||
Bit-grooming is a lossy compression algorithm that removes the
|
||||
bloat due to false-precision, those bits and bytes beyond the
|
||||
meaningful precision of the data. Bit Grooming is statistically
|
||||
unbiased, applies to all floating point numbers, and is easy to
|
||||
use. Bit-Grooming reduces data storage requirements by
|
||||
25-80%. Unlike its best-known competitor Linear Packing, Bit
|
||||
Grooming imposes no software overhead on users, and guarantees
|
||||
its precision throughout the whole floating point range
|
||||
[https://doi.org/10.5194/gmd-9-3199-2016].
|
||||
````
|
||||
The generic term "quantize" is used to refer collectively to the various
|
||||
precision-trimming algorithms. The key thing to note about quantization is that
|
||||
it occurs at the point of writing of data only. Since its output is
|
||||
legal data, it does not need to be "de-quantized" when the data is read.
|
||||
Because of this, quantization is not part of the standard filter
|
||||
mechanism and has a separate API.
|
||||
|
||||
The API for bit-groom is currently as follows.
|
||||
````
|
||||
int nc_def_var_quantize(int ncid, int varid, int quantize_mode, int nsd);
|
||||
int nc_inq_var_quantize(int ncid, int varid, int *quantize_modep, int *nsdp);
|
||||
````
|
||||
The *quantize_mode* argument specifies the particular algorithm.
|
||||
Currently, three are supported: NC_QUANTIZE_BITGROOM, NC_QUANTIZE_GRANULARBR,
|
||||
and NC_QUANTIZE_BITROUND. In addition quantization can be disabled using
|
||||
the value NC_NOQUANTIZE.
|
||||
|
||||
The input to ncgen or the output from ncdump supports special attributes
|
||||
to indicate if quantization was applied to a given variable.
|
||||
These attributes have the following form.
|
||||
````
|
||||
_QuantizeBitGroomNumberOfSignificantDigits = <NSD>
|
||||
or
|
||||
_QuantizeGranularBitRoundNumberOfSignificantDigits = <NSD>
|
||||
or
|
||||
_QuantizeBitRoundNumberOfSignificantBits = <NSB>
|
||||
````
|
||||
The value NSD is the number of significant (decimal) digits to keep.
|
||||
The value NSB is the number of bits to keep in the fraction part of an
|
||||
IEEE754 floating-point number. Note that NSB of QuantizeBitRound is the same as
|
||||
"number of explicit mantissa bits" (https://doi.org/10.5194/gmd-9-3199-2016) and same as
|
||||
the number of "keep-bits" (https://doi.org/10.5194/gmd-14-377-2021), but is not
|
||||
one less than the number of significant bunary figures:
|
||||
`_QuantizeBitRoundNumberOfSignificantBits = 0` means one significant binary figure,
|
||||
`_QuantizeBitRoundNumberOfSignificantBits = 1` means two significant binary figures etc.
|
||||
|
||||
## Distortions introduced by lossy filters
|
||||
|
||||
Any lossy filter introduces distortions to data.
|
||||
The lossy filters implemented in netcdf-c introduce a distortoin
|
||||
that can be quantified in terms of a _relative_ error. The magnitude of
|
||||
distortion introduced to every single value V is guaranteed to be within
|
||||
a certain fraction of V, expressed as 0.5 * V * 2**{-NSB}:
|
||||
i.e. it is 0.5V for NSB=0, 0.25V for NSB=1, 0.125V for NSB=2 etc.
|
||||
|
||||
|
||||
Two other methods use different definitions of _decimal precision_, though both
|
||||
are guaranteed to reproduce NSD decimals when printed.
|
||||
The margin for a relative error introduced by the methods are summarised in the table
|
||||
|
||||
```
|
||||
NSD 1 2 3 4 5 6 7
|
||||
|
||||
BitGroom
|
||||
Error Margin 3.1e-2 3.9e-3 4.9e-4 3.1e-5 3.8e-6 4.7e-7 -
|
||||
|
||||
GranularBitRound
|
||||
Error Margin 1.4e-1 1.9e-2 2.2e-3 1.4e-4 1.8e-5 2.2e-6 -
|
||||
|
||||
```
|
||||
|
||||
|
||||
If one defines decimal precision as in BitGroom, i.e. the introduced relative
|
||||
error must not exceed half of the unit at the decimal place NSD in the
|
||||
worst-case scenario, the following values of NSB should be used for BitRound:
|
||||
|
||||
```
|
||||
NSD 1 2 3 4 5 6 7
|
||||
NSB 3 6 9 13 16 19 23
|
||||
```
|
||||
|
||||
The resulting application of BitRound is as fast as BitGroom, and is free from
|
||||
artifacts in multipoint statistics introduced by BitGroom
|
||||
(see https://doi.org/10.5194/gmd-14-377-2021).
|
||||
|
||||
|
||||
# Debugging {#filters_debug}
|
||||
|
||||
|
||||
Depending on the debugger one uses, debugging plugins can be very difficult.
|
||||
It may be necessary to use the old printf approach for debugging the filter itself.
|
||||
|
Loading…
Reference in New Issue
Block a user