Merge branch 'master' into filterexpr.dmh

This commit is contained in:
Ward Fisher 2019-02-12 09:48:03 -07:00 committed by GitHub
commit e32e0b1eb2
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
19 changed files with 1200 additions and 600 deletions

View File

@ -2,13 +2,12 @@ NetCDF-4 Filter Support
============================
<!-- double header is needed to workaround doxygen bug -->
NetCDF-4 Filter Support {#compress}
=================================
NetCDF-4 Filter Support {#filters}
============================
[TOC]
Introduction {#compress_intro}
==================
# Introduction {#filters_intro}
The HDF5 library (1.8.11 and later)
supports a general filter mechanism to apply various
@ -37,8 +36,7 @@ can locate, load, and utilize the compressor.
These libraries are expected to be installed in a specific
directory.
Enabling A Compression Filter {#Enable}
=============================
# Enabling A Compression Filter {#filters_enable}
In order to compress a variable, the netcdf-c library
must be given three pieces of information:
@ -66,8 +64,7 @@ using __ncgen__, via an API call, or via command line parameters to __nccopy__.
In any case, remember that filtering also requires setting chunking, so the
variable must also be marked with chunking information.
Using The API {#API}
-------------
## Using The API {#filters_API}
The necessary API methods are included in __netcdf.h__ by default.
One API method is for setting the filter to be used
when writing a variable. The relevant signature is
@ -90,8 +87,7 @@ __params__. As is usual with the netcdf API, one is expected to call
this function twice. The first time to get __nparams__ and the
second to get the parameters in client-allocated memory.
Using ncgen {#NCGEN}
-------------
## Using ncgen {#filters_NCGEN}
In a CDL file, compression of a variable can be specified
by annotating it with the following attribute:
@ -107,8 +103,8 @@ This is a "special" attribute, which means that
it will normally be invisible when using
__ncdump__ unless the -s flag is specified.
Example CDL File (Data elided)
------------------------------
### Example CDL File (Data elided)
````
netcdf bzip2 {
dimensions:
@ -123,8 +119,8 @@ data:
}
````
Using nccopy {#NCCOPY}
-------------
## Using nccopy {#filters_NCCOPY}
When copying a netcdf file using __nccopy__ it is possible
to specify filter information for any output variable by
using the "-F" option on the command line; for example:
@ -186,8 +182,7 @@ by this table.
<tr><td>false<td>unspecified<td>none<td>unfiltered
</table>
Parameter Encoding {#ParamEncode}
==========
# Parameter Encode/Decode {#filters_paramcoding}
The parameters passed to a filter are encoded internally as a vector
of 32-bit unsigned integers. It may be that the parameters
@ -213,57 +208,83 @@ them between the local machine byte order and network byte
order.
Parameters whose size is larger than 32-bits present a byte order problem.
This typically includes double precision floats and (signed or unsigned)
64-bit integers. For these cases, the machine byte order must be
handled by the compression code. This is because HDF5 will treat,
This specifically includes double precision floats and (signed or unsigned)
64-bit integers. For these cases, the machine byte order issue must be
handled, in part, by the compression code. This is because HDF5 will treat,
for example, an unsigned long long as two 32-bit unsigned integers
and will convert each to network order separately. This means that
on a machine whose byte order is different from that of the machine on which
the parameters were initially created, the two integers are out of order
and must be swapped to get the correct unsigned long long value.
Consider this example. Suppose we have this little endian unsigned long long.
the parameters were initially created, the two integers will be separately
endian converted. But this will be incorrect for 64-bit values.
1000000230000004
So, we have this situation:
In network byte order, it will be stored as two 32-bit integers.
1. the 8 bytes come in as native machine order for the machine
doing the call to *nc_def_var_filter*.
2. HDF5 divides the 8 bytes into 2 four byte pieces and ensures that each piece
is in network (big) endian order.
3. When the filter is called, the two pieces are returned in the same order
but with the bytes in each piece consistent with the native machine order
for the machine executing the filter.
20000001 40000003
## Encoding Algorithms
On a big endian machine, this will be given to the filter in that form.
In order to properly extract the correct 8-byte value, we need to ensure
that the values stored in the HDF5 file have a known format independent of
the native format of the creating machine.
2000000140000003
The idea is to do sufficient manipulation so that HDF5
will store the 8-byte value as a little endian value
divided into two 4-byte integers.
Note that little-endian is used as the standard
because it is the most common machine format.
When read, the filter code needs to be aware of this convention
and do the appropriate conversions.
But note that the proper big endian unsigned long long form is this.
This leads to the following set of rules.
4000000320000001
### Encoding
So, the two words need to be swapped.
1. Encode on little endian (LE) machine: no special action is required.
The 8-byte value is passed to HDF5 as two 4-byte integers. HDF5 byte
swaps each integer and stores it in the file.
2. Encode on a big endian (BE) machine: several steps are required:
But consider the case when both original and final machines are big endian.
1. Do an 8-byte byte swap to convert the original value to little-endian
format.
2. Since the encoding machine is BE, HDF5 will just store the value.
So it is necessary to simulate little endian encoding by byte-swapping
each 4-byte integer separately.
3. This doubly swapped pair of integers is then passed to HDF5 and is stored
unchanged.
1. 4000000320000001
2. 40000003 20000001
3. 40000003 20000001
### Decoding
where #1 is the original number, #2 is the network order and
#3 is what is given to the filter. In this case we do not
want to swap words.
1. Decode on LE machine: no special action is required.
HDF5 will get the two 4-byte values from the file and byte-swap each
separately. The concatenation of those two integers will be the expected
LE value.
2. Decode on a big endian (BE) machine: the inverse of the encode case must
be implemented.
The solution is to forcibly encode the original number using some
specified endianness so that the filter always assumes it is getting
its parameters in that order and will always do swapping as needed.
This is irritating, but one needs to be aware of it. Since most
machines are little-endian, we choose to use that as the endianness
for handling 64 bit entities.
1. HDF5 sends the two 4-byte values to the filter.
2. The filter must then byte-swap each 4-byte value independently.
3. The filter then must concatenate the two 4-byte values into a single
8-byte value. Because of the encoding rules, this 8-byte value will
be in LE format.
4. The filter must finally do an 8-byte byte-swap on that 8-byte value
to convert it to the desired BE format.
Filter Specification Syntax {#Syntax}
==========
To support these rules, some utility programs exist and are discussed in
<a href="#AppendixA">Appendix A</a>.
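The encoding and decoding rules above can be sketched in portable C. This is only an illustration under stated assumptions, not the library's implementation: the names `sketch_is_big_endian`, `sketch_swap`, `sketch_fix8`, `sketch_encode_u64`, and `sketch_decode_u64` are hypothetical, endianness is detected at run time rather than through a configure-time macro, and `unsigned int` and `unsigned long long` are assumed to be 4 and 8 bytes wide.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a big endian machine, else 0. */
static int sketch_is_big_endian(void)
{
    const unsigned int one = 1;
    unsigned char first;
    memcpy(&first, &one, 1);
    return first == 0;
}

/* Reverse the order of n bytes in place. */
static void sketch_swap(unsigned char* mem, size_t n)
{
    size_t i;
    for(i = 0; i < n/2; i++) {
        unsigned char c = mem[i];
        mem[i] = mem[n-1-i];
        mem[n-1-i] = c;
    }
}

/* Apply the 8-byte rules in place: decode=0 encodes, decode=1 decodes. */
static void sketch_fix8(unsigned char* mem, int decode)
{
    if(!sketch_is_big_endian()) return; /* LE: no special action */
    if(decode) {
        sketch_swap(mem, 4);   /* byte-swap each 4-byte piece */
        sketch_swap(mem+4, 4);
        sketch_swap(mem, 8);   /* LE value -> native BE value */
    } else {
        sketch_swap(mem, 8);   /* native BE value -> LE value */
        sketch_swap(mem, 4);   /* simulate LE encoding of each piece */
        sketch_swap(mem+4, 4);
    }
}

/* Encode a 64-bit value as two 32-bit filter parameters. */
static void sketch_encode_u64(unsigned long long value, unsigned int params[2])
{
    unsigned char mem[8];
    memcpy(mem, &value, sizeof(mem));
    sketch_fix8(mem, 0);
    memcpy(params, mem, sizeof(mem));
}

/* Recover the 64-bit value from the two parameters seen by the filter. */
static unsigned long long sketch_decode_u64(const unsigned int params[2])
{
    unsigned char mem[8];
    unsigned long long value;
    memcpy(mem, params, sizeof(mem));
    sketch_fix8(mem, 1);
    memcpy(&value, mem, sizeof(mem));
    return value;
}
```

Under these rules the two parameters always hold the low and then the high 32 bits of the value, whichever kind of machine did the encoding. For example, 0x8000000000000001 becomes the pair (1, 0x80000000), and decoding that pair restores the original value.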
# Filter Specification Syntax {#filters_syntax}
Both of the utilities
<a href="#NCGEN">__ncgen__</a>
and
<a href="#NCCOPY">__nccopy__</a>
allow the specification of filter parameters.
allow the specification of filter parameters in text format.
These specifications consist of a sequence of comma
separated constants. The constants are converted
within the utility to a proper set of unsigned int
@ -272,7 +293,7 @@ constants (see the <a href="#ParamEncode">parameter encoding section</a>).
To simplify things, various kinds of constants can be specified
rather than just simple unsigned integers. The utilities will encode
them properly using the rules specified in
the <a href="#ParamEncode">parameter encoding section</a>.
the section on <a href="#filters_paramcoding">parameter encode/decode</a>.
The currently supported constants are as follows.
<table>
@ -285,9 +306,9 @@ The currently supported constants are as follows.
<tr><td>77<td>implicit unsigned 32-bit integer<td>No tag<td>
<tr><td>93U<td>explicit unsigned 32-bit integer<td>u|U<td>
<tr><td>789f<td>32-bit float<td>f|F<td>
<tr><td>12345678.12345678d<td>64-bit double<td>d|D<td>Network byte order
<tr><td>-9223372036854775807L<td>64-bit signed long long<td>l|L<td>Network byte order
<tr><td>18446744073709551615UL<td>64-bit unsigned long long<td>u|U l|L<td>Network byte order
<tr><td>12345678.12345678d<td>64-bit double<td>d|D<td>LE encoding
<tr><td>-9223372036854775807L<td>64-bit signed long long<td>l|L<td>LE encoding
<tr><td>18446744073709551615UL<td>64-bit unsigned long long<td>u|U l|L<td>LE encoding
</table>
Some things to note.
@ -298,10 +319,10 @@ Some things to note.
2. For signed byte and short, the value is sign extended to 32 bits
and then treated as an unsigned int value.
3. For double, and signed|unsigned long long, they are converted
to network byte order and then treated as two unsigned int values.
This is consistent with the <a href="#ParamEncode">parameter encoding</a>.
as specified in the section on
<a href="#filters_paramcoding">parameter encode/decode</a>.
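As an illustration of how a 32-bit float constant such as `789f` ends up as a single unsigned int parameter, here is a minimal sketch. The helper name `sketch_float_param` is hypothetical, and the sketch assumes an IEEE 754 `float` and a 4-byte `unsigned int`. It copies the bits with `memcpy`, which is a substitute for the pointer cast used in the library source, chosen here to sidestep aliasing concerns.

```c
#include <string.h>

/* Hypothetical helper: the bit pattern of a float as one unsigned int. */
static unsigned int sketch_float_param(float value)
{
    unsigned int bits;
    memcpy(&bits, &value, sizeof(bits));
    return bits;
}
```

For `789.0f` this gives 1145389056 (0x44454000), which matches the value the filter parser test expects for that constant.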
Dynamic Loading Process {#Process}
Dynamic Loading Process {#filters_Process}
==========
The documentation[1,2] for the HDF5 dynamic loading was (at the time
@ -309,7 +330,7 @@ this was written) out-of-date with respect to the actual HDF5 code
(see HDF5PL.c). So, the following discussion is largely derived
from looking at the actual code. This means that it is subject to change.
Plugin directory {#Plugindir}
Plugin directory {#filters_Plugindir}
----------------
The HDF5 loader expects plugins to be in a specified plugin directory.
@ -321,7 +342,7 @@ The default directory is:
The default may be overridden using the environment variable
__HDF5_PLUGIN_PATH__.
Plugin Library Naming {#Pluginlib}
Plugin Library Naming {#filters_Pluginlib}
---------------------
Given a plugin directory, HDF5 examines every file in that
@ -335,7 +356,7 @@ as determined by the platform on which the library is being executed.
<tr halign="left"><td>Windows<td>*<td>.dll
</table>
Plugin Verification {#Pluginverify}
Plugin Verification {#filters_Pluginverify}
-------------------
For each dynamic library located using the previous patterns,
HDF5 attempts to load the library and attempts to obtain information
@ -355,7 +376,7 @@ specified for the variable in __nc_def_var_filter__ in order to be used.
If plugin verification fails, then that plugin is ignored and
the search continues for another, matching plugin.
Debugging {#Debug}
Debugging {#filters_Debug}
-------
Debugging plugins can be very difficult. You will probably
need to use the old printf approach for debugging the filter itself.
@ -371,7 +392,7 @@ Since ncdump is not being asked to access the data (the -h flag), it
can obtain the filter information without failures. Then it can print
out the filter id and the parameters (the -s flag).
Test Case {#TestCase}
Test Case {#filters_TestCase}
-------
Within the netcdf-c source tree, the directory
__netcdf-c/nc_test4__ contains a test case (__test_filter.c__) for
@ -380,7 +401,7 @@ bzip2. Another test (__test_filter_misc.c__) validates
parameter passing. These tests are disabled if __--enable-shared__
is not set or if __--enable-netcdf-4__ is not set.
Example {#Example}
Example {#filters_Example}
-------
A slightly simplified version of the filter test case is also
available as an example within the netcdf-c source tree
@ -459,45 +480,35 @@ has been known to work.
gcc -g -O0 -shared -o libbzip2.so <plugin source files> -L${HDF5LIBDIR} -lhdf5_hl -lhdf5 -L${ZLIBDIR} -lz
````
Appendix A. Byte Swap Code {#AppendixA}
Appendix A. Support Utilities {#filters_AppendixA}
==========
Since in some cases it is necessary for a filter to
byte swap from little-endian to big-endian, this appendix
provides sample code for doing this. It also provides
a code snippet for testing the
endianness of a machine.
Byte swap an 8-byte chunk of memory
-------
````
static void
byteswap8(unsigned char* mem)
{
register unsigned char c;
c = mem[0];
mem[0] = mem[7];
mem[7] = c;
c = mem[1];
mem[1] = mem[6];
mem[6] = c;
c = mem[2];
mem[2] = mem[5];
mem[5] = c;
c = mem[3];
mem[3] = mem[4];
mem[4] = c;
}
Two functions are exported from the netcdf-c library
for use by client programs and by filter implementations.
````
1. ````int NC_parsefilterspec(const char* spec, unsigned int* idp, size_t* nparamsp, unsigned int** paramsp);````
* idp will contain the filter id value from the spec.
* nparamsp will contain the number of 4-byte parameters
* paramsp will contain a pointer to the parsed parameters -- the caller
must free.
This function can parse filter spec strings as defined in
the section on <a href="#filters_syntax">Filter Specification Syntax</a>.
This function parses the first argument and returns several values.
Test for Machine Endianness
-------
````
static const unsigned char b[4] = {0x0,0x0,0x0,0x1}; /* value 1 in big-endian*/
int endianness = (1 == *(unsigned int*)b); /* 1=>big 0=>little endian */
````
References {#References}
========================
2. ````void NC_filterfix8(unsigned char* mem8, int decode);````
* mem8 is a pointer to the 8-byte value to fix.
* decode is 1 if the function should apply the 8-byte decoding algorithm
else apply the encoding algorithm.
This function implements the 8-byte conversion algorithms.
Before calling *nc_def_var_filter* (unless *NC_parsefilterspec* was used),
the client must call this function with the decode argument set to 0.
Inside the filter code, this function should be called with the decode
argument set to 1.
Examples of the use of these functions can be seen in the test program
*nc_test4/tst_filterparser.c*.
# References {#filters_References}
1. https://support.hdfgroup.org/HDF5/doc/Advanced/DynamicallyLoadedFilters/HDF5DynamicallyLoadedFilters.pdf
2. https://support.hdfgroup.org/HDF5/doc/TechNotes/TechNote-HDF5-CompressionTroubleshooting.pdf
@ -505,8 +516,7 @@ References {#References}
4. https://support.hdfgroup.org/services/contributions.html#filters
5. https://support.hdfgroup.org/HDF5/doc/RM/RM_H5.html
Point of Contact
================
# Point of Contact
__Author__: Dennis Heimbigner<br>
__Email__: dmh at ucar dot edu

View File

@ -6,7 +6,7 @@
# Ed Hartnett, Dennis Heimbigner, Ward Fisher
include_HEADERS = netcdf.h netcdf_meta.h netcdf_mem.h netcdf_aux.h
include_HEADERS = netcdf.h netcdf_meta.h netcdf_mem.h netcdf_aux.h netcdf_filter.h
if BUILD_PARALLEL
include_HEADERS += netcdf_par.h
@ -17,7 +17,7 @@ ncuri.h ncutf8.h ncdispatch.h ncdimscale.h netcdf_f.h err_macros.h \
ncbytes.h nchashmap.h ceconstraints.h rnd.h nclog.h ncconfigure.h \
nc4internal.h nctime.h nc3internal.h onstack.h ncrc.h ncauth.h \
ncoffsets.h nctestserver.h nc4dispatch.h nc3dispatch.h ncexternl.h \
ncwinpath.h ncfilter.h ncindex.h hdf4dispatch.h hdf5internal.h \
ncwinpath.h ncindex.h hdf4dispatch.h hdf5internal.h \
nc_provenance.h hdf5dispatch.h
if USE_DAP

View File

@ -1,8 +1,8 @@
/* Copyright 2018, UCAR/Unidata and OPeNDAP, Inc.
See the COPYRIGHT file for more information. */
#ifndef NCFILTER_H
#define NCFILTER_H 1
#ifndef NETCDF_FILTER_H
#define NETCDF_FILTER_H 1
/* API for libdispatch/dfilter.c */
@ -18,10 +18,10 @@ extern "C" {
/* Provide consistent filter spec parser */
EXTERNL int NC_parsefilterspec(const char* spec, unsigned int* idp, size_t* nparamsp, unsigned int** paramsp);
EXTERNL void NC_byteswap8(unsigned char* mem);
EXTERNL void NC_filterfix8(unsigned char* mem, int decode);
#if defined(__cplusplus)
}
#endif
#endif /* NCFILTER_H */
#endif /* NETCDF_FILTER_H */

View File

@ -76,15 +76,8 @@ NCD4_processdata(NCD4meta* meta)
FAIL(ret,"delimit failure");
}
/* Swap the data for each top level variable,
including the checksum (if any)
*/
if(meta->swap) {
if((ret=NCD4_swapdata(meta,toplevel)))
FAIL(ret,"byte swapping failed");
}
/* Compute the checksums of the top variables */
/* must occur before any byte swapping */
if(meta->localchecksumming) {
for(i=0;i<nclistlength(toplevel);i++) {
unsigned int csum = 0;
@ -105,6 +98,14 @@ NCD4_processdata(NCD4meta* meta)
}
}
}
/* Swap the data for each top level variable */
if(meta->swap) {
if((ret=NCD4_swapdata(meta,toplevel)))
FAIL(ret,"byte swapping failed");
}
done:
if(toplevel) nclistfree(toplevel);
return THROW(ret);

View File

@ -27,7 +27,7 @@
#endif
#define D4CATCH /* Warning: significant performance impact */
#undef D4CATCH /* Warning: significant performance impact */
#define PANIC(msg) assert(d4panic(msg));
#define PANIC1(msg,arg) assert(d4panic(msg,arg));

View File

@ -229,10 +229,10 @@ walkSeq(NCD4meta* compiler, NCD4node* topvar, NCD4node* vlentype, void** offsetp
offset = *offsetp;
/* process the record count */
if(compiler->swap)
swapinline64(offset);
recordcount = GETCOUNTER(offset);
SKIPCOUNTER(offset);
if(compiler->swap)
swapinline64(&recordcount);
basetype = vlentype->basetype; /* This may be of any type potentially */
assert(basetype->sort == NCD4_TYPE);

View File

@ -12,6 +12,7 @@
#endif
#include "netcdf.h"
#include "netcdf_filter.h"
/*
Common utilities related to filters.
@ -47,7 +48,7 @@ NC_parsefilterspec(const char* spec, unsigned int* idp, size_t* nparamsp, unsign
size_t len;
int i;
unsigned int* ulist = NULL;
unsigned char mem[8]; /* to convert to network byte order */
unsigned char mem[8];
if(spec == NULL || strlen(spec) == 0) goto fail;
sdata = strdup(spec);
@ -135,14 +136,13 @@ NC_parsefilterspec(const char* spec, unsigned int* idp, size_t* nparamsp, unsign
ulist[nparams++] = *(unsigned int*)&valf;
break;
/* The following are 8-byte values, so NC_filterfix8 is applied to put
them into the standard encoding (it swaps bytes only on a big endian machine) */
case 'd':
sstat = sscanf(p,"%lf",&vald);
if(sstat != 1) goto fail;
/* convert to network byte order */
memcpy(mem,&vald,sizeof(mem));
#ifdef WORDS_BIGENDIAN
NC_byteswap8(mem); /* convert big endian to little endian */
#endif
NC_filterfix8(mem,0);
vector = (unsigned int*)mem;
ulist[nparams++] = vector[0];
ulist[nparams++] = vector[1];
@ -153,12 +153,9 @@ NC_parsefilterspec(const char* spec, unsigned int* idp, size_t* nparamsp, unsign
else
sstat = sscanf(p,"%lld",(long long*)&val64u);
if(sstat != 1) goto fail;
/* convert to network byte order */
memcpy(mem,&val64u,sizeof(mem));
#ifdef WORDS_BIGENDIAN
NC_byteswap8(mem); /* convert big endian to little endian */
#endif
vector = (unsigned int*)mem;
NC_filterfix8(mem,0);
vector = (unsigned int*)&mem;
ulist[nparams++] = vector[0];
ulist[nparams++] = vector[1];
break;
@ -216,9 +213,8 @@ gettype(const int q0, const int q1, int* isunsignedp)
#ifdef WORDS_BIGENDIAN
/* Byte swap an 8-byte integer in place */
EXTERNL
void
NC_byteswap8(unsigned char* mem)
static void
byteswap8(unsigned char* mem)
{
unsigned char c;
c = mem[0];
@ -234,4 +230,35 @@ NC_byteswap8(unsigned char* mem)
mem[3] = mem[4];
mem[4] = c;
}
/* Byte swap a 4-byte integer in place */
static void
byteswap4(unsigned char* mem)
{
unsigned char c;
c = mem[0];
mem[0] = mem[3];
mem[3] = c;
c = mem[1];
mem[1] = mem[2];
mem[2] = c;
}
#endif
EXTERNL void
NC_filterfix8(unsigned char* mem, int decode)
{
#ifdef WORDS_BIGENDIAN
if(decode) { /* Apply inverse of the encode case */
byteswap4(mem); /* step 1: byte-swap each piece */
byteswap4(mem+4);
byteswap8(mem); /* step 2: convert to big endian format */
} else { /* encode */
byteswap8(mem); /* step 1: convert to little endian format */
byteswap4(mem); /* step 2: byte-swap each piece */
byteswap4(mem+4);
}
#else /* Little endian */
/* No action is necessary */
#endif
}

View File

@ -23,13 +23,15 @@ write2(int ncid, int parallel)
int dimid[NDIM2];
char str[NC_MAX_NAME + 1];
int varid[NVARS];
int i;
int j;
/* define dimension */
if (nc_def_dim(ncid, "Y", NC_UNLIMITED, &dimid[0])) ERR;
if (nc_def_dim(ncid, "X", NX, &dimid[1])) ERR;
/* Define vars. */
for (int i = 0; i < NVARS; i++)
for (i = 0; i < NVARS; i++)
{
if (i % 2)
{
@ -46,14 +48,14 @@ write2(int ncid, int parallel)
if (nc_enddef(ncid)) ERR;
/* write all variables */
for (int i = 0; i < NVARS; i++)
for (i = 0; i < NVARS; i++)
{
size_t start[NDIM2] = {0, 0};
size_t count[NDIM2];
int buf[NX];
/* Initialize some data. */
for (int j = 0; j < NX; j++)
for (j = 0; j < NX; j++)
buf[j] = i * 10 + j;
/* Write the data. */
@ -95,7 +97,10 @@ extend(int ncid)
int
read2(int ncid)
{
for (int i = 0; i < NVARS; i++)
int i;
int j;
for (i = 0; i < NVARS; i++)
{
int buf[NX];
size_t start[2] = {0, 0}, count[2];
@ -110,7 +115,7 @@ read2(int ncid)
count[1] = NX;
}
if (nc_get_vara_int(ncid, i, start, count, buf)) ERR;
for (int j = 0; j < NX; j++)
for (j = 0; j < NX; j++)
{
if (buf[j] != i * 10 + j)
{

View File

@ -10,6 +10,7 @@
#include <hdf5.h>
#include "netcdf.h"
#include "netcdf_filter.h"
#undef DEBUG
@ -289,7 +290,13 @@ showparameters(void)
static void
insert(int index, void* src, size_t size)
{
unsigned char src8[8];
void* dst = &baseline[index];
if(size == 8) {
memcpy(src8,src,size);
NC_filterfix8(src8,0);
src = src8;
}
memcpy(dst,src,size);
}

View File

@ -9,7 +9,7 @@
#include <stdlib.h>
#include "netcdf.h"
#include "ncfilter.h"
#include "netcdf_filter.h"
#define PARAMS_ID 32768
@ -19,19 +19,14 @@
*/
#define DBLVAL 12345678.12345678
#define FLTVAL 789.0
#define LONGLONGVAL -9223372036854775807LL
#define ULONGLONGVAL 18446744073709551615ULL
#define MAXPARAMS 32
#define NPARAMS 16 /* # of unsigned ints in params */
static unsigned int baseline[NPARAMS];
/* Expected contents of baseline:
id = 32768
params = 4294967279, 23, 4294967271, 27, 77, 93, 1145389056, 3287505826, 1097305129, 1, 2147483647, 4294967295U, 4294967295U
*/
static const char* spec =
"32768, -17b, 23ub, -25S, 27US, 77, 93U, 789f, 12345678.12345678d, -9223372036854775807L, 18446744073709551615UL, 2147483647, -2147483648, 4294967295U";
/* Test support for the conversions */
/* Not sure if this kind of casting via union is legal C99 */
static union {
@ -50,6 +45,79 @@ static union {
long long ll;
} ul;
/* Expected contents of baseline:
id = 32768
index spec item Value
----------------------------------------
0 -17b 4294967279
1 23ub 23
2 -25S 4294967271
3 27US 27
4 77 77
5 93U 93
6 2147483647 ?
7 -2147483648 ?
8 4294967295U ?
9 789f 1145389056
<8-bytes start here >
10 -9223372036854775807L 1, 2147483648
12 18446744073709551615UL 4294967295U, 4294967295U
14 12345678.12345678d 3287505826, 1097305129
expected (LE):
{239, 23, 65511, 27, 77, 93, 2147483647, 2147483648, 4294967295, 1145389056,
1, 2147483648, 4294967295, 4294967295, 3287505826, 1097305129}
params (LE):
0x000000ef
0x00000017
0x0000ffe7
0x0000001b
0x0000004d
0x0000005d
0x7fffffff
0x80000000
0xffffffff
0x44454000
0x00000001 .ll
0x80000000
0xffffffff .ull
0xffffffff
0xc3f35ba2 .d
0x41678c29
expected (BE):
{239, 23, 65511, 27, 77, 93, 2147483647, 2147483648, 4294967295, 1145389056,
16777216, 128, 4294967295, 4294967295, 2723935171, 697067329}
params (BE):
0x000000ef
0x00000017
0x0000ffe7
0x0000001b
0x0000004d
0x0000005d
0x7fffffff
0x80000000
0xffffffff
0x44454000
0x01000000 .ll
0x00000080
0xffffffff .ull
0xffffffff
0xa25bf3c3 .d
0x298c6741
*/
static unsigned int baseline[MAXPARAMS]; /* Expected */
static const char* spec =
"32768, -17b, 23ub, -25S, 27US, 77, 93U, 2147483647, -2147483648, 4294967295U, 789f, -9223372036854775807L, 18446744073709551615UL, 12345678.12345678d";
/* Define the type strings for each spec entry */
static const char* spectype[] = {"i", "b", "ub", "s", "us", "i", "ui", "i", "i", "ui", "f", "ll", "ull", "d"};
static int nerrs = 0;
static void
@ -85,31 +153,31 @@ buildbaseline(void)
double float8;
val4 = ((unsigned int)-17) & 0xff;
insert(0,&val4,sizeof(val4)); /* 0 signed int*/
insert(0,&val4,sizeof(val4)); /* signed int*/
val4 = (unsigned int)23;
insert(1,&val4,sizeof(val4)); /* 1 unsigned int*/
insert(1,&val4,sizeof(val4)); /* unsigned int*/
val4 = ((unsigned int)-25) & 0xffff;
insert(2,&val4,sizeof(val4)); /* 3 signed int*/
insert(2,&val4,sizeof(val4)); /* signed int*/
val4 = (unsigned int)27;
insert(3,&val4,sizeof(val4)); /* 4 unsigned int*/
insert(3,&val4,sizeof(val4)); /* unsigned int*/
val4 = (unsigned int)77;
insert(4,&val4,sizeof(val4)); /* 4 signed int*/
insert(4,&val4,sizeof(val4)); /* signed int*/
val4 = (unsigned int)93;
insert(5,&val4,sizeof(val4)); /* 5 unsigned int*/
insert(5,&val4,sizeof(val4)); /* unsigned int*/
val4 = 2147483647; /*0x7fffffff*/
insert(6,&val4,sizeof(val4)); /* signed int */
val4 = (-2147483647)-1; /*0x80000000*/
insert(7,&val4,sizeof(val4)); /* signed int */
val4 = 4294967295U; /*0xffffffff*/
insert(8,&val4,sizeof(val4)); /* unsigned int */
float4 = 789.0f;
insert(6,&float4,sizeof(float4)); /* 6 float */
float8 = DBLVAL;
insert(7,&float8,sizeof(float8)); /* 7 double */
insert(9,&float4,sizeof(float4)); /*float */
val8 = -9223372036854775807L;
insert(9,&val8,sizeof(val8)); /* 9 signed long long */
insert(10,&val8,sizeof(val8)); /* signed long long */
val8 = 18446744073709551615UL;
insert(11,&val8,sizeof(val8)); /* 11 unsigned long long */
val4 = 2147483647;
insert(13,&val4,sizeof(val4)); /* 13 signed int */
val4 = (-2147483647)-1;
insert(14,&val4,sizeof(val4)); /* 14 signed int */
val4 = 4294967295U;
insert(15,&val4,sizeof(val4)); /* 15 unsigned int */
insert(12,&val8,sizeof(val8)); /* unsigned long long */
float8 = DBLVAL;
insert(14,&float8,sizeof(float8)); /* double */
}
/**************************************************/
@ -132,46 +200,224 @@ main(int argc, char **argv)
}
if(id != PARAMS_ID)
fprintf(stderr,"mismatch: id: expected=%u actual=%u\n",PARAMS_ID,id);
for(i=0;i<nparams;i++) {
if(baseline[i] != params[i])
mismatch(i,params,"N.A.");
}
/* Now some specialized tests */
uf.ui = params[6];
if(uf.f != (float)789.0)
mismatch(6,params,"uf.f");
ud.ui[0] = params[7];
ud.ui[1] = params[8];
#ifdef WORD_BIGENDIAN
byteswap8((unsigned char*)&ud.d);
#endif
if(ud.d != (double)DBLVAL)
mismatch2(7,params,"ud.d");
ul.ui[0] = params[9];
ul.ui[1] = params[10];
#ifdef WORD_BIGENDIAN
byteswap8((unsigned char*)&ul.ll);
#endif
if(ul.ll != -9223372036854775807LL)
mismatch2(9,params,"ul.ll");
ul.ui[0] = params[11];
ul.ui[1] = params[12];
#ifdef WORD_BIGENDIAN
byteswap8((unsigned char*)&ul.ull);
#endif
if(ul.ull != 18446744073709551615ULL)
mismatch2(11,params,"ul.ull");
if (params)
free(params);
/* Do all the 32 bit tests */
for(i=0;i<=8;i++) {
if(baseline[i] != params[i])
mismatch(i,params,spectype[i]);
}
/* float */
uf.ui = params[9];
if(uf.f != (float)FLTVAL)
mismatch(9,params,"uf.f");
/* signed long long */
ul.ui[0] = params[10];
ul.ui[1] = params[11];
NC_filterfix8((unsigned char*)&ul.ll,1);
if(ul.ll != LONGLONGVAL)
mismatch2(10,params,"ul.ll");
/* unsigned long long */
ul.ui[0] = params[12];
ul.ui[1] = params[13];
NC_filterfix8((unsigned char*)&ul.ull,1);
if(ul.ull != ULONGLONGVAL)
mismatch2(12,params,"ul.ull");
/* double */
ud.ui[0] = params[14];
ud.ui[1] = params[15];
NC_filterfix8((unsigned char*)&ud.d,1);
if(ud.d != (double)DBLVAL)
mismatch2(14,params,"ud.d");
if (!nerrs)
printf("SUCCESS!!\n");
if(params) free(params);
return (nerrs > 0 ? 1 : 0);
}
#ifdef WORD_BIGENDIAN
#if 0
/* Look at q0 and q1 to determine type */
static int
gettype(const int q0, const int q1, int* isunsignedp)
{
int type = 0;
int isunsigned = 0;
char typechar;
isunsigned = (q0 == 'u' || q0 == 'U');
if(q1 == '\0')
typechar = q0; /* we were given only a single char */
else if(isunsigned)
typechar = q1; /* we have something like Ux as the tag */
else
typechar = q1; /* look at last char for tag */
switch (typechar) {
case 'f': case 'F': case '.': type = 'f'; break; /* float */
case 'd': case 'D': type = 'd'; break; /* double */
case 'b': case 'B': type = 'b'; break; /* byte */
case 's': case 'S': type = 's'; break; /* short */
case 'l': case 'L': type = 'l'; break; /* long long */
case '0': case '1': case '2': case '3': case '4':
case '5': case '6': case '7': case '8': case '9': type = 'i'; break;
case 'u': case 'U': type = 'i'; isunsigned = 1; break; /* unsigned int */
case '\0': type = 'i'; break;
default: break;
}
if(isunsignedp) *isunsignedp = isunsigned;
return type;
}
static int
parsefilterspec(const char* spec, unsigned int* idp, size_t* nparamsp, unsigned int** paramsp)
{
int stat = NC_NOERR;
int sstat; /* for scanf */
char* p;
char* sdata = NULL;
unsigned int id;
size_t count; /* no. of comma delimited params */
size_t nparams; /* final no. of unsigned ints */
size_t len;
int i;
unsigned int* ulist = NULL;
unsigned char mem[8];
if(spec == NULL || strlen(spec) == 0) goto fail;
sdata = strdup(spec);
/* Count number of parameters + id and delimit */
p=sdata;
for(count=0;;count++) {
char* q = strchr(p,',');
if(q == NULL) break;
*q++ = '\0';
p = q;
}
count++; /* for final piece */
if(count == 0)
goto fail; /* no id and no parameters */
/* Extract the filter id */
p = sdata;
sstat = sscanf(p,"%u",&id);
if(sstat != 1) goto fail;
/* skip past the filter id */
p = p + strlen(p) + 1;
count--;
/* Allocate the max needed space; *2 in case the params are all doubles */
ulist = (unsigned int*)malloc(sizeof(unsigned int)*(count)*2);
if(ulist == NULL) goto fail;
/* walk and convert */
nparams = 0; /* actual count */
for(i=0;i<count;i++) { /* step thru param strings */
unsigned long long val64u;
unsigned int val32u;
double vald;
float valf;
unsigned int *vector;
int isunsigned = 0;
int isnegative = 0;
int type = 0;
char* q;
len = strlen(p);
/* skip leading white space */
while(strchr(" ",*p) != NULL) {p++; len--;}
/* Get leading sign character, if any */
if(*p == '-') isnegative = 1;
/* Get trailing type tag characters */
switch (len) {
case 0:
goto fail; /* empty parameter */
case 1:
case 2:
q = (p + len) - 1; /* point to last char */
type = gettype(*q,'\0',&isunsigned);
break;
default: /* > 2 => we might have a two letter tag */
q = (p + len) - 2;
type = gettype(*q,*(q+1),&isunsigned);
break;
}
/* Now parse */
switch (type) {
case 'b':
case 's':
case 'i':
/* special case for a positive integer;for back compatibility.*/
if(!isnegative)
sstat = sscanf(p,"%u",&val32u);
else
sstat = sscanf(p,"%d",(int*)&val32u);
if(sstat != 1) goto fail;
switch(type) {
case 'b': val32u = (val32u & 0xFF); break;
case 's': val32u = (val32u & 0xFFFF); break;
}
ulist[nparams++] = val32u;
break;
case 'f':
sstat = sscanf(p,"%lf",&vald);
if(sstat != 1) goto fail;
valf = (float)vald;
ulist[nparams++] = *(unsigned int*)&valf;
break;
/* The following are 8-byte values, so NC_filterfix8 is applied to put
them into the standard encoding (it swaps bytes only on a big endian machine) */
case 'd':
sstat = sscanf(p,"%lf",&vald);
if(sstat != 1) goto fail;
memcpy(mem,&vald,sizeof(mem));
NC_filterfix8(mem,0);
vector = (unsigned int*)mem;
ulist[nparams++] = vector[0];
ulist[nparams++] = vector[1];
break;
case 'l': /* long long */
if(isunsigned)
sstat = sscanf(p,"%llu",&val64u);
else
sstat = sscanf(p,"%lld",(long long*)&val64u);
if(sstat != 1) goto fail;
memcpy(mem,&val64u,sizeof(mem));
NC_filterfix8(mem,0);
vector = (unsigned int*)&mem;
ulist[nparams++] = vector[0];
ulist[nparams++] = vector[1];
break;
default:
goto fail;
}
p = p + strlen(p) + 1; /* move to next param */
}
/* Now return results */
if(idp) *idp = id;
if(nparamsp) *nparamsp = nparams;
if(paramsp) {
*paramsp = ulist;
ulist = NULL; /* avoid duplicate free */
}
done:
if(sdata) free(sdata);
if(ulist) free(ulist);
return stat;
fail:
stat = NC_EFILTER;
goto done;
}
#ifdef WORDS_BIGENDIAN
/* Byte swap an 8-byte integer in place */
static void
byteswap8(unsigned char* mem)
@ -190,4 +436,38 @@ byteswap8(unsigned char* mem)
mem[3] = mem[4];
mem[4] = c;
}
/* Byte swap a 4-byte integer in place */
static void
byteswap4(unsigned char* mem)
{
unsigned char c;
c = mem[0];
mem[0] = mem[3];
mem[3] = c;
c = mem[1];
mem[1] = mem[2];
mem[2] = c;
}
#endif
static void
NC_filterfix8(unsigned char* mem, int decode)
{
#ifdef WORDS_BIGENDIAN
if(decode) { /* Apply inverse of the encode case */
byteswap4(mem); /* step 1: byte-swap each piece */
byteswap4(mem+4);
byteswap8(mem); /* step 2: convert to little endian format */
} else { /* encode */
byteswap8(mem); /* step 1: convert to little endian format */
byteswap4(mem); /* step 2: byte-swap each piece */
byteswap4(mem+4);
}
#else /* Little endian */
/* No action is necessary */
#endif
}
#endif /*0*/
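
Reviewer note: the 'd' and 'l' cases split an 8-byte value across two `unsigned int` parameters, and the filter side has to reassemble it. Below is a rough round-trip sketch, assuming an 8-byte `double` and a 4-byte `unsigned int`. All names (`host_is_bigendian`, `fix8_sketch`, `encode_double`, `decode_double`) are hypothetical. The sketch detects endianness at run time, whereas the source relies on the `WORDS_BIGENDIAN` macro, and it exchanges the two 4-byte halves directly, which is the net effect of the byteswap8-then-byteswap4 sequence.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical run-time check; the source uses WORDS_BIGENDIAN instead. */
static int
host_is_bigendian(void)
{
    const unsigned int one = 1;
    unsigned char first;
    memcpy(&first, &one, 1);
    return first == 0;
}

/* Hypothetical stand-in for the fix-up step: on a big-endian host,
   exchange the two 4-byte halves; on little-endian, do nothing. */
static void
fix8_sketch(unsigned char mem[8])
{
    unsigned char tmp[4];
    if (!host_is_bigendian()) return;
    memcpy(tmp, mem, 4);
    memcpy(mem, mem + 4, 4);
    memcpy(mem + 4, tmp, 4);
}

/* Split a double into two 32-bit parameter slots. */
static void
encode_double(double d, unsigned int out[2])
{
    unsigned char mem[8];
    /* Assumption: 8-byte double, 4-byte unsigned int. */
    assert(sizeof(d) == 8 && sizeof(unsigned int) == 4);
    memcpy(mem, &d, sizeof(mem));
    fix8_sketch(mem);
    memcpy(out, mem, sizeof(mem));
}

/* Rebuild the double from two 32-bit parameter slots. */
static double
decode_double(const unsigned int in[2])
{
    unsigned char mem[8];
    double d;
    memcpy(mem, in, sizeof(mem));
    fix8_sketch(mem);
    memcpy(&d, mem, sizeof(d));
    return d;
}
```

With this arrangement the low-order 32 bits land in the first slot on either kind of host, so the parameter list looks the same regardless of where the file was written.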


@ -17,12 +17,12 @@
#endif
#include <string.h>
#include "netcdf.h"
#include "netcdf_filter.h"
#include "nciter.h"
#include "utils.h"
#include "chunkspec.h"
#include "dimmap.h"
#include "nccomps.h"
#include "ncfilter.h"
#include "list.h"
#undef DEBUGFILTER


@ -17,7 +17,7 @@ static char SccsId[] = "$Id: ncgen.y,v 1.42 2010/05/18 21:32:46 dmh Exp $";
#include "ncgeny.h"
#include "ncgen.h"
#ifdef USE_NETCDF4
#include "ncfilter.h"
#include "netcdf_filter.h"
#endif
/* Following are in ncdump (for now)*/

File diff suppressed because it is too large


@ -80,7 +80,7 @@ static char SccsId[] = "$Id: ncgen.y,v 1.42 2010/05/18 21:32:46 dmh Exp $";
#include "ncgeny.h"
#include "ncgen.h"
#ifdef USE_NETCDF4
#include "ncfilter.h"
#include "netcdf_filter.h"
#endif
/* Following are in ncdump (for now)*/
@ -2981,7 +2981,7 @@ makeprimitivetype(nc_type nctype)
sym->typ.typecode = nctype;
sym->typ.size = ncsize(nctype);
sym->typ.nelems = 1;
sym->typ.alignment = ncaux_class_alignment(nctype);
sym->typ.alignment = ncaux_class_alignment(nctype);
/* Make the basetype circular so we can always ask for it */
sym->typ.basetype = sym;
sym->prefix = listnew();
@ -3284,7 +3284,7 @@ makespecial(int tag, Symbol* vsym, Symbol* tsym, void* data, int isconst)
else if(tag == _NCPROPS_FLAG) {
globalspecials._NCProperties = sdata;
sdata = NULL;
}
}
} else {
Specialdata* special;
/* Set up special info */
@ -3361,7 +3361,7 @@ makespecial(int tag, Symbol* vsym, Symbol* tsym, void* data, int isconst)
break;
case _CHUNKSIZES_FLAG: {
int i;
list = (isconst ? const2list(con) : list);
list = (isconst ? const2list(con) : list);
special->nchunks = list->length;
special->_ChunkSizes = (size_t*)ecalloc(sizeof(size_t)*special->nchunks);
for(i=0;i<special->nchunks;i++) {


@ -108,7 +108,7 @@ occompile1(OCstate* state, OCnode* xnode, XXDR* xxdrs, OCdata** datap)
if(!xxdr_uint(xxdrs,&xdrcount))
{ocstat = OC_EXDR; goto fail;}
if(xdrcount != nelements)
{ocstat=OC_EINVALCOORDS; goto fail;}
{ocstat=OCTHROW(OC_EINVALCOORDS); goto fail;}
/* allocate space to capture all the element instances */
data->instances = (OCdata**)malloc(nelements*sizeof(OCdata*));
@ -160,7 +160,7 @@ occompile1(OCstate* state, OCnode* xnode, XXDR* xxdrs, OCdata** datap)
break; /* we are done with this sequence instance */
} else {
nclog(NCLOGERR,"missing/invalid begin/end record marker\n");
ocstat = OC_EINVALCOORDS;
ocstat = OCTHROW(OC_EINVALCOORDS);
goto fail;
}
}
@ -301,11 +301,11 @@ occompileatomic(OCstate* state, OCdata* data, XXDR* xxdrs)
nelements = octotaldimsize(xnode->array.rank,xnode->array.sizes);
/* Get first copy of the dimension count */
if(!xxdr_uint(xxdrs,&xxdrcount)) {ocstat = OC_EXDR; goto fail;}
if(xxdrcount != nelements) {ocstat=OC_EINVALCOORDS; goto fail;}
if(xxdrcount != nelements) {ocstat=OCTHROW(OC_EINVALCOORDS); goto fail;}
if(xnode->etype != OC_String && xnode->etype != OC_URL) {
/* Get second copy of the dimension count */
if(!xxdr_uint(xxdrs,&xxdrcount)) {ocstat = OC_EXDR; goto fail;}
if(xxdrcount != nelements) {ocstat=OC_EINVALCOORDS; goto fail;}
if(xxdrcount != nelements) {ocstat=OCTHROW(OC_EINVALCOORDS); goto fail;}
}
} else { /*scalar*/
nelements = 1;


@ -102,6 +102,9 @@ ocopen(OCstate** statep, const char* url)
NCURI* tmpurl = NULL;
CURL* curl = NULL; /* curl handle*/
if(!ocinitialized)
ocinternalinitialize();
if(ncuriparse(url,&tmpurl) != NCU_OK) {
OCTHROWCHK(stat=OC_EBADURL);
goto fail;


@ -6,6 +6,7 @@
#include <hdf5.h>
/* Older versions of the hdf library may define H5PL_type_t here */
#include <H5PLextern.h>
#include "h5misc.h"
#ifndef DLL_EXPORT
#define DLL_EXPORT
@ -27,8 +28,6 @@ will generate an error.
*/
#include "h5misc.h"
#undef DEBUG
/* The C standard apparently defines all floating point constants as double;
@ -37,9 +36,10 @@ will generate an error.
#define DBLVAL 12345678.12345678
static int paramcheck(size_t nparams, const unsigned int* params);
static void byteswap8(unsigned char* mem);
static void mismatch(size_t i, const char* which);
extern void NC_filterfix8(void* mem, int decode);
const H5Z_class2_t H5Z_TEST[1] = {{
H5Z_CLASS_T_VERS, /* H5Z_class_t version */
(H5Z_filter_t)(H5Z_FILTER_TEST), /* Filter id number */
@ -138,12 +138,10 @@ static int
paramcheck(size_t nparams, const unsigned int* params)
{
size_t i;
/* Test endianness of this machine */
const unsigned char b[4] = {0x0,0x0,0x0,0x1}; /* value 1 in big-endian*/
int bigendian = (1 == *(unsigned int*)b); /* 1=>big 0=>little*/
unsigned char mem[8];
if(nparams != 14) {
fprintf(stderr,"Too few parameters: need=16 sent=%ld\n",(unsigned long)nparams);
fprintf(stderr,"Wrong number of parameters: need=14 sent=%lu\n",(unsigned long)nparams);
goto fail;
}
@ -190,33 +188,37 @@ paramcheck(size_t nparams, const unsigned int* params)
{mismatch(i,"float"); goto fail; };
break;
case 8: {/*double*/
double x = *(double*)&params[i];
double x;
memcpy(mem,&params[i],sizeof(mem));
NC_filterfix8(mem,1); /* Fix up endianness */
x = *(double*)mem;
dval = DBLVAL;
i++; /* takes two parameters */
if(bigendian)
byteswap8((unsigned char*)&x);
if(dval != x) {
mismatch(i,"double");
goto fail;
}
}; break;
case 10: {/*signed long long*/
signed long long x = *(signed long long*)&params[i];
signed long long x;
memcpy(mem,&params[i],sizeof(mem));
NC_filterfix8(mem,1); /* Fix up endianness */
x = *(signed long long*)mem;
NC_filterfix8(&x,1); /* Fix up endianness */
lval = -9223372036854775807L;
i++; /* takes two parameters */
if(bigendian)
byteswap8((unsigned char*)&x);
if(lval != x) {
mismatch(i,"signed long long");
goto fail;
}
}; break;
case 12: {/*unsigned long long*/
unsigned long long x = *(unsigned long long*)&params[i];
unsigned long long x;
memcpy(mem,&params[i],sizeof(mem));
NC_filterfix8(mem,1); /* Fix up endianness */
x = *(unsigned long long*)mem;
lval = 18446744073709551615UL;
i++; /* takes two parameters */
if(bigendian)
byteswap8((unsigned char*)&x);
if(lval != x) {
mismatch(i,"unsigned long long");
goto fail;
@ -245,24 +247,6 @@ fail:
return 0;
}
static void
byteswap8(unsigned char* mem)
{
unsigned char c;
c = mem[0];
mem[0] = mem[7];
mem[7] = c;
c = mem[1];
mem[1] = mem[6];
mem[6] = c;
c = mem[2];
mem[2] = mem[5];
mem[5] = c;
c = mem[3];
mem[3] = mem[4];
mem[4] = c;
}
static void
mismatch(size_t i, const char* which)
{

plugins/H5Zutil.c Normal file

@ -0,0 +1,65 @@
/*
* Copyright 2018, University Corporation for Atmospheric Research
* See netcdf/COPYRIGHT file for copying and redistribution conditions.
*/
#include <hdf5.h>
/*
Common utilities related to filters.
Taken from libdispatch/dfilters.c.
*/
#ifdef WORDS_BIGENDIAN
/* Byte swap an 8-byte integer in place */
static void
byteswap8(unsigned char* mem)
{
unsigned char c;
c = mem[0];
mem[0] = mem[7];
mem[7] = c;
c = mem[1];
mem[1] = mem[6];
mem[6] = c;
c = mem[2];
mem[2] = mem[5];
mem[5] = c;
c = mem[3];
mem[3] = mem[4];
mem[4] = c;
}
/* Byte swap a 4-byte integer in place */
static void
byteswap4(unsigned char* mem)
{
unsigned char c;
c = mem[0];
mem[0] = mem[3];
mem[3] = c;
c = mem[1];
mem[1] = mem[2];
mem[2] = c;
}
#endif /*WORDS_BIGENDIAN*/
void
NC_filterfix8(void* mem0, int decode)
{
#ifdef WORDS_BIGENDIAN
unsigned char* mem = mem0;
if(decode) { /* Apply inverse of the encode case */
byteswap4(mem); /* step 1: byte-swap each piece */
byteswap4(mem+4);
byteswap8(mem); /* step 2: convert to little endian format */
} else { /* encode */
byteswap8(mem); /* step 1: convert to little endian format */
byteswap4(mem); /* step 2: byte-swap each piece */
byteswap4(mem+4);
}
#else /* Little endian */
/* No action is necessary */
#endif
}
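
Reviewer note: it may help to see what the two-step sequences in `NC_filterfix8` do to a concrete buffer. The sketch below compiles the steps unconditionally, without the `WORDS_BIGENDIAN` guard, purely for illustration. `reverse_bytes`, `encode_steps`, `decode_steps`, and `check_half_exchange` are hypothetical names, with `reverse_bytes` standing in for `byteswap8` and `byteswap4`.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical generic helper: reverse n bytes in place.
   With n=8 and n=4 it plays the role of byteswap8 and byteswap4. */
static void
reverse_bytes(unsigned char* mem, size_t n)
{
    size_t i;
    for (i = 0; i < n / 2; i++) {
        unsigned char c = mem[i];
        mem[i] = mem[n - 1 - i];
        mem[n - 1 - i] = c;
    }
}

/* The encode sequence: reverse all 8 bytes, then each 4-byte piece. */
static void
encode_steps(unsigned char* mem)
{
    reverse_bytes(mem, 8);
    reverse_bytes(mem, 4);
    reverse_bytes(mem + 4, 4);
}

/* The decode sequence: reverse each 4-byte piece, then all 8 bytes. */
static void
decode_steps(unsigned char* mem)
{
    reverse_bytes(mem, 4);
    reverse_bytes(mem + 4, 4);
    reverse_bytes(mem, 8);
}

/* Check that both sequences simply exchange the two 4-byte halves. */
static void
check_half_exchange(void)
{
    const unsigned char start[8]    = {0,1,2,3,4,5,6,7};
    const unsigned char expected[8] = {4,5,6,7,0,1,2,3};
    unsigned char mem[8];

    memcpy(mem, start, 8);
    encode_steps(mem);
    assert(memcmp(mem, expected, 8) == 0);

    memcpy(mem, start, 8);
    decode_steps(mem);
    assert(memcmp(mem, expected, 8) == 0);

    /* Encode followed by decode restores the original bytes. */
    memcpy(mem, start, 8);
    encode_steps(mem);
    decode_steps(mem);
    assert(memcmp(mem, start, 8) == 0);
}
```

Both sequences leave the bytes inside each half in their original order and only exchange the halves, so the encode and decode branches perform the same transformation and each undoes the other.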


@ -8,7 +8,7 @@ PLUGINSRC=H5Zbzip2.c
PLUGINHDRS=h5bzip2.h
EXTRA_DIST=${PLUGINSRC} ${BZIP2SRC} ${PLUGINHDRS} ${BZIP2HDRS} \
H5Ztemplate.c H5Zmisc.c CMakeLists.txt
H5Ztemplate.c H5Zmisc.c H5Zutil.c CMakeLists.txt
# WARNING: This list must be kept consistent with the corresponding
# AC_CONFIG_LINK commands near the end of configure.ac.
@ -21,7 +21,7 @@ lib_LTLIBRARIES = libbzip2.la libmisc.la
libbzip2_la_SOURCES = ${HDF5PLUGINSRC}
libbzip2_la_LDFLAGS = -module -avoid-version -shared -export-dynamic -no-undefined
libmisc_la_SOURCES = H5Zmisc.c h5misc.h
libmisc_la_SOURCES = H5Zmisc.c H5Zutil.c h5misc.h
libmisc_la_LDFLAGS = -module -avoid-version -shared -export-dynamic -no-undefined -rpath ${abs_builddir}
endif #ENABLE_FILTER_TESTING