Add doc on errors introduced by lossy compression

Also references to relevant papers
Rostislav Kouznetsov 2022-06-07 10:55:16 +03:00
parent b2097a77aa
commit 7564e7e797


@@ -629,7 +629,8 @@ unbiased, applies to all floating point numbers, and is easy to
use. Bit-Grooming reduces data storage requirements by
25-80%. Unlike its best-known competitor Linear Packing, Bit
Grooming imposes no software overhead on users, and guarantees
-its precision throughout the whole floating point range [9].
+its precision throughout the whole floating point range
+[https://doi.org/10.5194/gmd-9-3199-2016].
````
The generic term "quantize" is used to refer collectively to the various
bitgroom algorithms. The key thing to note about quantization is that
@@ -661,6 +662,46 @@ _QuantizeBitRoundNumberOfSignificantBits = <NSB>
The value NSD is the number of significant (decimal) digits to keep.
The value NSB is the number of significant bits to keep.
## Distortions introduced by lossy filters
Any lossy filter introduces distortions to data.
The lossy filters implemented in netcdf-c introduce a distortion
that can be quantified in terms of a _relative_ error. For BitRound, the magnitude of
the distortion introduced to each value V is guaranteed to be within
a fixed fraction of V, given by 0.5 * V * 2**(-NSB):
i.e. it is 0.5V for NSB=0, 0.25V for NSB=1, 0.125V for NSB=2, etc.
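
The bound above can be checked numerically. The following is a minimal sketch, not the netcdf-c implementation: it assumes 32-bit IEEE 754 floats, rounds to nearest by adding half of the last kept bit before masking, and ignores NaN, infinity and subnormal values. The helper names `to_f32`, `bitround_f32` and `worst_relative_error` are hypothetical and exist only for this illustration.

```python
import random
import struct


def to_f32(x):
    """Round a Python float to the nearest IEEE 754 float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def bitround_f32(x, nsb):
    """Keep `nsb` explicit mantissa bits of a float32, rounding to nearest.

    Illustrative only: ties round away from zero, and NaN, infinity and
    subnormal inputs are not handled.
    """
    drop = 23 - nsb  # float32 stores 23 explicit mantissa bits
    if drop <= 0:
        return to_f32(x)
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    half = 1 << (drop - 1)                 # half of the last kept bit
    keep = 0xFFFFFFFF ^ ((1 << drop) - 1)  # mask that zeroes the dropped bits
    bits = (bits + half) & keep            # a carry may bump the exponent
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def worst_relative_error(nsb, samples=10000, seed=0):
    """Largest relative error seen over random positive float32 samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        v = to_f32(rng.uniform(1.0e-3, 1.0e3))
        worst = max(worst, abs(bitround_f32(v, nsb) - v) / v)
    return worst


if __name__ == "__main__":
    for nsb in (0, 1, 2, 8):
        # Observed worst case next to the guaranteed bound 0.5 * 2**(-NSB)
        print(nsb, worst_relative_error(nsb), 0.5 * 2.0 ** -nsb)
```

For every NSB the observed worst-case relative error should stay at or below 0.5 * 2**(-NSB). The random sampling only probes the bound; it does not prove it.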
The two other methods, BitGroom and GranularBitRound, use different definitions of
_decimal precision_, though both are guaranteed to reproduce NSD decimals when printed.
The margins for the relative error introduced by these methods are summarised in the table:
```
NSD                 1        2        3        4        5        6        7

BitGroom
Error Margin      3.1e-2   3.9e-3   4.9e-4   3.1e-5   3.8e-6   4.7e-7     -

GranularBitRound
Error Margin      1.4e-1   1.9e-2   2.2e-3   1.4e-4   1.8e-5   2.2e-6     -
```
If one defines decimal precision as in BitGroom, i.e. the introduced relative
error must not exceed half of the unit at the decimal place NSD in the
worst-case scenario, the following values of NSB should be used for BitRound:
```
NSD    1    2    3    4    5    6    7
NSB    3    6    9   13   16   19   23
```
The resulting application of BitRound is as fast as BitGroom, and is free from
artifacts in multipoint statistics introduced by BitGroom
(see https://doi.org/10.5194/gmd-14-377-2021).
# Debugging {#filters_debug}
Depending on the debugger one uses, debugging plugins can be very difficult.