Fixing Bug 2092 - h5dump does not display index for a dataset
Description:
h5dump skips displaying array indices at certain intervals when the
array-type dataset is relatively big. The interval varies
according to the size of each array.
This checkin fixes the skipped array indices problem. This fix also
corrects the indentation of the dataset data output.
Tested:
jam (linux32-LE), amani (linux64-LE), heiwa (linuxppc64-BE), tejeda (mac32-LE), linew (solaris-BE)
h5dump: add a dangling external link test case as part of the Chicago project.
Tested:
jam (linux32-LE), amani (linux64-LE), heiwa (linuxppc64-BE), tejeda (mac32-LE), linew (solaris-BE)
Bring Coverity changes from branch to trunk:
r19161:
Fixed the code that matches the subset info with the dataset
r19189:
BZ1646: h5dump does not check the number of dimensions of subsetting parameters against the dataset
Changed the subset_t structure from holding hsize_t pointers to holding new subset_d pointers, which hold the original hsize_t pointer plus a len. This len is then checked against the dataset ndims in the handle_dataset function of h5dump (see the sketch below).
Changed all references to use the new data structure.
Added tests for each subset parameter.
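A minimal sketch of the revised structures, assuming the names mentioned above; the exact field layout is illustrative:

    typedef struct subset_d {
        hsize_t     *data;  /* the original hsize_t pointer */
        unsigned int len;   /* number of values; checked against the dataset ndims */
    } subset_d;

    typedef struct subset_t {
        subset_d start;     /* one subset_d per subsetting parameter */
        subset_d stride;
        subset_d count;
        subset_d block;
    } subset_t;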
r19190:
Added new h5dump ddl files
Tested on:
Mac OS X/32 10.6.4 (amazon) w/debug & production
(h5committested on branch)
Bring changes from file free space branch back to the trunk. *yay!*
Tested on:
FreeBSD/32 6.3 (duty) in debug mode
FreeBSD/64 6.3 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (jam) w/PGI compilers, w/default API=1.8.x,
w/C++ & FORTRAN, w/threadsafe, in debug mode
Linux/64-amd64 2.6 (smirom) w/Intel compilers, w/default API=1.6.x,
w/C++ & FORTRAN, in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in debug mode
Linux/64-amd64 2.6 (abe) w/parallel, w/FORTRAN, in production mode
Mac OS X/32 10.5.8 (amazon) in debug mode
Mac OS X/32 10.5.8 (amazon) w/C++ & FORTRAN, w/threadsafe,
in production mode
ISSUE: the tools use the following formula to read by hyperslabs: hyperslab_size[i] = MIN(dim_size[i], H5TOOLS_BUFSIZE / datum_size), where H5TOOLS_BUFSIZE is a constant defined as 1024K. This is OK as long as datum_size does not exceed 1024K; otherwise we get a hyperslab size of 0 (since 1024K / (something greater than 1024K) = 0). This affects h5dump, h5repack, and h5diff.
SOLUTION: add a check for a 0 size and define it as 1 if so.
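A minimal sketch of the check, reusing the names from the formula above; the surrounding code is illustrative:

    hsize_t size = H5TOOLS_BUFSIZE / datum_size;

    if (size == 0)  /* datum_size exceeds H5TOOLS_BUFSIZE */
        size = 1;   /* read at least one element per hyperslab */
    hyperslab_size[i] = MIN(dim_size[i], size);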
TEST FOR H5DUMP: Defined a case in the h5dump test generator program of such a type (an array type of doubles with a large array dimension, which was the case the user reported). Since the written file committed in svn would be around 1024K, opted for not writing the data (the part of the code where the hyperslab is defined is still executed, since h5dump always reads the files). Defined a macro WRITE_ARRAY to enable such writing if needed. Added a run to the h5dump shell script. Added 2 new files to svn: tools/testfiles/tarray8.ddl, tools/testfiles/tarray8.h5. NOTE: while doing this I thought of adding this dataset case to an existing file, but that would add the large array output to those files (the ddls). The issue is that the file list keeps increasing.
TEST FOR H5DIFF: for h5diff the check for reading by hyperslabs is H5TOOLS_MALLOCSIZE (128 * H5TOOLS_BUFSIZE), or 128 MB. This makes it impossible to add such a file to svn, so used the same method as h5dump (only write the dataset if WRITE_ARRAY is defined). As opposed to h5dump, the hyperslab code is NOT executed when the dataset is empty (the dataset is not read). Added the new dataset to the existing files and shell run (tools/h5diff/testfiles/h5diff_dset1.h5 and tools/h5diff/testfiles/h5diff_dset2.h5, with output in tools/h5diff/testfiles/h5diff_80.txt).
TEST FOR H5REPACK: similar issue to h5diff, with the difference that the hyperslab code is run. Added a run to the shell script (with a filter; otherwise the code uses H5Ocopy).
tested: linux (h5committest failed; apparently it did not detect the code changes in /tools/lib that fix the bug: the error is an assertion on the hyperslab of 0. I am sure that running h5committest --distclean would detect the new code, but I don't want to wait 3 more hours :-) )
Introduced a new feature in the tools library regarding command line parsing.
In the definition of arguments, an "*" means that the switch may or may not have an optional argument. This "*" is put in the code at the letter definition and is transparent to the user (e.g. "b*" instead of the previous "b:"), where ":" denotes a required argument after the letter (and a letter with no ":" or "*" takes no argument at all).
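A hedged illustration of the convention in a get_option-style definition string; the letters other than b are illustrative:

    static const char *s_opts = "hd:o:b*";
    /* "d:" -d requires an argument
       "b*" -b may or may not have an argument
       "h"  -h takes no argument */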
Used for the h5dump binary option -b
It can now be
1) -b (defaults to NATIVE)
2) -b NATIVE
3) -b FILE
4) -b LE
5) -b BE
Note: the keyword NATIVE replaces MEMORY
This feature (-b with no argument) was tested with a sequence of h5dump to binary (NATIVE), then h5import to generate an HDF5 file from the binary file, and h5diff to compare the 2 HDF5 files
Tested: linux
Description: Improved external link traversal of h5dump. h5dump will now
properly avoid all cycles, even those spanning multiple files. Improved
the output of committed datatypes: committed datatypes are now checked
for uniqueness (like other objects). Tests added for these cases.
Tested: kagiso, linew, smirom (h5committest)
Usage is
-m T, --format=T
where T is a string containing the floating point format, e.g. '%.3f'
The test consists of writing a number with 7 fractional digits (the default display precision of %f is 6 digits) and checking that all 7 digits are displayed with
-m %.7f fpformat.h5
Tested: windows, linux, solaris
Note: the output file was generated in linux; platforms other than the ones tested may have a different representation of the number
add a check for block overlap after the command line parsing
* Algorithm
*
* In an inner loop, the parameters from SSET are translated into temporary
* variables so that 1 row is printed at a time (getting the coordinate indices
* at each row).
* We define the stride, count and block to be 1 in the row dimension to achieve
* this and advance until all points are printed.
* An outer loop is added for cases where the dimensionality is greater than 2D.
* In each iteration, the 2D block is displayed in the inner loop. The remaining
* slower dimensions above the first 2 are incremented one at a time in the outer loop.
*
* The element position is obtained from the matrix according to:
* Given an index I(z,y,x) its position from the beginning of an array
* of sizes A(size_z, size_y,size_x) is given by
* Position of I(z,y,x) = index_z * size_y * size_x
* + index_y * size_x
* + index_x
*
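A minimal sketch of that computation for the 3D case; the function name is illustrative:

    static hsize_t
    element_position(hsize_t iz, hsize_t iy, hsize_t ix,
                     hsize_t size_y, hsize_t size_x)
    {
        /* position of I(z,y,x) in a row-major array A(size_z, size_y, size_x) */
        return iz * size_y * size_x + iy * size_x + ix;
    }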
tested: windows, linux
Added support for displaying several iteration orders for dataset attributes; 4 new tests in the test script (name ascending, name descending, creation order ascending, creation order descending)
A new h5 file is made in the generator program
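For example (hedged: the --sort_by/--sort_order spellings below are those of released h5dump; the file name is illustrative):
h5dump --sort_by=creation_order --sort_order=descending tattrs.h5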
Tested: windows, linux
bug fix
the binary option expects a full path in -o
TOOLTEST tbin1.ddl -d integer -o $TESTDIR/out1.bin -b LE tbinary.h5
and it prints that path in the expected output, making it absolutely not portable
Solution: made a special macro function TOOLTEST1, identical to TOOLTEST except that it does not print the "Expected output" header:
#############################
Expected output for 'h5dump -d integer -o /home/pvn/kagiso/build_hdf5/tools/h5dump/../testfiles/out1.bin -b LE tbinary.h5'
#############################
Tested : linux
1) added 5 new tests for the group creation order
2) modified the h5dump test script to automatically generate non-existing (new) output files
3) cleanup of unused DDL files
4) new modified DDL files include tcomp-3.ddl (new form of named datatype) and the binary output files
tested : linux
Bug fix.
Description:
The "h5dump -o ..." test generates temporay files in the testfiles of the
source code and later on remove them. This could cuase a racing condition
if more than one --srcdir build is using the same copy of the source code.
Since they use the same file name in the testfiles, they may conflict with
each other.
Solution:
Changed to generate the temporary files in the build-dir's own testfiles
directory. Since the build-dir can have different names, the CMP of expected
output now skips the first three lines, which are label lines that contain
the location of the temporary files.
Also removed the CREATE code, since actual files created now cannot be
blindly copied to the expected files. Expected files should be
created by explicit action and careful inspection of the files generated.
Tested platform:
Done in kagiso, both by --srcdir and in-place build.
Modified the current h5dump test script to use h5import/h5diff calls to validate the binary output. At this moment it can only be used with the native test, since h5import does not handle input endianness.
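A hedged sketch of the validation sequence; file, dataset, and configuration names are illustrative, -b MEMORY follows the binary-flag usage documented below (later renamed NATIVE), and h5import needs a configuration file describing the binary layout:

    h5dump -d /dset -b MEMORY -o out.bin file.h5
    h5import out.bin -c out.conf -o imported.h5
    h5diff file.h5 imported.h5 /dset /dset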
tested: linux, sunos 5.10
2 tests that were previously incorporated inside the array indices test file were separated from it. These are a test with a dataset with dimensions greater than 4GB and a test that reads by hyperslabs.
New version of the function h5tools_dump_simple_subset, to display subsetting. The new algorithm is:
Introduced an outer loop for cases where dimensionality is greater
than 2D. In each iteration a 2D block is displayed by rows in an inner
loop. The remaining slower dimensions above the first 2 are incremented
one at a time in the outer loop.
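A minimal sketch of the loop structure, assuming a row-major selection of rank ndims >= 2; all names are illustrative and this is not the actual h5tools_dump_simple_subset code:

    hsize_t  idx[H5S_MAX_RANK] = {0};  /* indices of the slower dimensions */
    hsize_t  nblocks = 1;
    unsigned i;

    for (i = 0; i < ndims - 2; i++)
        nblocks *= dims[i];            /* one 2D block per slow-index tuple */

    while (nblocks--) {
        /* inner loop: display the 2D block dims[ndims-2] x dims[ndims-1],
           selecting and printing one row at a time */

        /* increment the slower dimensions one at a time, with carry */
        for (i = ndims - 2; i-- > 0;) {
            if (++idx[i] < dims[i])
                break;
            idx[i] = 0;
        }
    }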
Note: when blocks are introduced, the display is not correct. This is a bug that requires an improvement of the algorithm.
Fixed #720, h5dump: improve how region references are displayed. h5dump now uses the new API function H5Rget_name to display the name of the referenced dataset instead of its ID. Added a case to the script test file.
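A hedged sketch of the call, using the 1.8 signature of H5Rget_name; the surrounding variables are illustrative:

    char    name[1024];
    ssize_t len;

    /* ref is an hdset_reg_ref_t previously read from the dataset */
    len = H5Rget_name(dset_id, H5R_DATASET_REGION, &ref, name, sizeof(name));
    if (len >= 0)
        printf("%s", name);  /* the referenced dataset's name, not its ID */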
Fix several bugs
1) the parsing of subsetting was using atoi to convert the parameter to an int, which caused problems for numbers greater than int. Substituted it with atof (see the sketch after this list)
2) the printing of indices in the subsetting case was not being done. Solution: calculate the element position at the start of the subsetting using the algorithm
Given an index I(z,y,x) its position from the beginning of an array of sizes A(size_z, size_y,size_x) is given by
Position of I(z,y,x) = index_z * size_y * size_x
+ index_y * size_x
+ index_x
And pass that position to the function that dumps data, h5tools_dump_simple_data.
3) several index counters were declared as int; changed them to hsize_t
4) modified the test generation program so that it includes test cases for subsetting of 1d, 2d, 3d, and 4d arrays, and added these tests to the shell script
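A hedged sketch of fix 1); the helper name is illustrative:

    #include <stdlib.h>  /* atof */
    #include "hdf5.h"    /* hsize_t */

    hsize_t
    parse_hsize(const char *s)
    {
        /* atoi truncates values larger than INT_MAX; atof keeps the magnitude */
        return (hsize_t)atof(s);
    }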
revised binary flags, added a new file to the test generator program to
be used in the binary tests
usage is now
-o F, --output=F Output raw data into file F
-b F, --binary=F Binary output, of form F (into file -o F).
Recommended usage is with --dataset=P
Form F of binary output is: MEMORY for memory type,
FILE for the disk file type, LE or BE for pre-existing
little or big endian types
example
./h5dump -d integer -b MEMORY -o out.bin tbinary.h5
Clean up compiler warnings/failures in test/links.c, especially when the
--disable-production flag is used with --enable-group-revision
Modify binary dumping in h5dump to clean up files created [a band-aid
solution to not actually creating the files in the srcdir, but better than
just leaving the files around... :-/ ]
Tested:
FreeBSD 4.11 (sleipnir) (w/ configure flags above)
Too minor to require h5committest
Users can create external links using H5L_create_external(). These links
point to an object in another HDF5 file. Users can alter the behavior of
external links or create new kinds of links by registering callbacks
using the H5L interface.
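A hedged example, using the public API name as spelled in released HDF5 (H5Lcreate_external); file and link names are illustrative:

    #include "hdf5.h"

    void
    make_external_link(void)
    {
        hid_t fid = H5Fcreate("source.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

        /* "link" in source.h5 resolves to object "/target" in target.h5 */
        H5Lcreate_external("target.h5", "/target", fid, "link",
                           H5P_DEFAULT, H5P_DEFAULT);
        H5Fclose(fid);
    }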
Added tests, tools support, etc.
Also a number of other, minor changes have been made (some restructuring of
the H5L interface, for instance).
Additional documentation and examples are forthcoming.
1. changed the -F flag option names to "BE" and "LE" for big and little endian
2. added a more verbose usage message for these options
3. added a new test
4. added a make clean instruction for *.bin
new feature
Description:
added support for h5dump to dump binary data using the file type format
added one test to the test script that tests this
Solution:
Platforms tested:
mir
shanti
copper
Misc. update:
new feature. h5dump output of binary data
Description:
a new switch -b FILE_NAME that dumps the contents of memory data to the file FILE_NAME in binary form
new program binread.c that reads the contents of this file and outputs it to stdout
added a test to the h5dump shell script that does a run of -b
the binread.c program reads the data used in this run; usage is ./binread FILE_NAME
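A minimal sketch of what a binread-style reader could look like, assuming the run dumps native ints (the actual binread.c may differ):

    #include <stdio.h>

    int
    main(int argc, char *argv[])
    {
        FILE *f;
        int   v;

        if (argc < 2 || (f = fopen(argv[1], "rb")) == NULL)
            return 1;
        while (fread(&v, sizeof v, 1, f) == 1)  /* echo each value to stdout */
            printf("%d\n", v);
        fclose(f);
        return 0;
    }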
Solution:
Platforms tested:
linux
solaris
AIX
Misc. update:
bug fixes
Description:
h5dump/h5ls were not displaying long doubles correctly
Solution:
1) the print datatype functions were incorrectly testing for the valid return value from H5Tequal
(TRUE), causing the display of an incorrect datatype name in error cases from H5Tequal (see the sketch after this list)
2) h5tools_print_str did not have a case for native long double
3) added a file generator for a long double dataset
4) added one script test for the long double data (commented out; some systems don't have a native long double match, and the output differs)
5) added a vms file and h5dump script test
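A hedged illustration of fix 1): H5Tequal returns an htri_t that is negative on failure, so testing it as a plain boolean treats errors as a match; the comparison below is illustrative:

    if (H5Tequal(type, H5T_NATIVE_LDOUBLE) > 0)  /* match only on TRUE, not on error */
        printf("H5T_NATIVE_LDOUBLE");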
Platforms tested:
linux 32, 64
solaris
AIX
Misc. update: