Bug Fix
For parallel Raw I/O, when multiple processes open the same file with
truncation, one process could open the file and start writing to it
while another process then opens the file with truncation, wiping out
everything the first process wrote to the file.
This is bad.
Also added some garbage collection to the pio_perf routine to reclaim
the space taken by some of the tables.
Placed an MPI_Barrier() call after the Raw open()/create() call
so that all processes are synced up before they start writing to the
file.
Added free() calls for the tables which weren't being freed.
Platforms tested:
Linux-pp (eirene)
Bone-headed Bug Fix
There were blanks being put into the output. The cause: the
"print_indent()" routine was printing indents for all of the
processes, but only process 0 should have been printing them out at
all (since process 0 is the one which prints out the reports).
Check to make sure that we're process 0 before printing the indents.
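The rank check could look roughly like the sketch below. The name print_indent_sketch and its parameters are hypothetical; in the real program the rank would come from MPI_Comm_rank(), and it is passed in here only so the sketch compiles without MPI.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for print_indent(): emit `indent` spaces,
 * but only on process 0. Returns the number of spaces written.
 */
int
print_indent_sketch(FILE *out, int rank, int indent)
{
    int i;

    assert(out != NULL);

    if (rank != 0)
        return 0;   /* only process 0 prints the reports */

    for (i = 0; i < indent; i++)
        fputc(' ', out);

    return indent > 0 ? indent : 0;
}
```

Every other process returns immediately, so no stray blanks reach the output.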
Platforms tested:
Remove perf and mpi-perf from the parallel test targets since their
functions are replaced by pio_perf.
Platforms tested:
modi4 and eirene, both parallel modes.
Change default actions.
Change the default maximum number of processes (-P) to use all processes
instead of just 1 (old default). Someone most likely wants to test
the I/O performance with all processes involved.
Also start the performance measurement with the maximum number of
processes and decrement it with each loop. If the measurement has to
be restarted, it can then be rerun with fewer processes, skipping the
larger process counts whose loops have already completed.
Platforms tested:
modi4 and eirene.
Bug Fix
Throughput wasn't being calculated correctly.
We were using a value other than the actual elapsed time. Changed the
code to read the time from the correct structure.
Platforms tested:
Feature add and algorithm reworking.
Added a "--debug" flag so we can print out various extra debugging
information.
Reworked the algorithm so that it's printing the correct throughput.
Here's how it's supposed to work:
T_0 T_1 T_2 T_3 ... T_n
iteration 1
For each iteration, take the maximum time over all of the processes.
(So, if T_i had the maximum time in iteration j, then use that time.)
Calculate the "Throughput" of iteration j:
S_j = (raw_size / T_i)
Collect that information over all of the iterations. Then output the
Max, Min, and Ave of all of the S_j's.
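A minimal sketch of that calculation, assuming the per-process times are stored in a row-major table (one row per iteration). The struct, the function name, and the table layout are illustrative assumptions, not the actual pio_perf code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result record; field names are illustrative only. */
typedef struct {
    double min;
    double max;
    double avg;
} throughput_stats;

/*
 * times[j * nprocs + i] is T_i for iteration j (assumed layout).
 * raw_size is the amount of data moved in one iteration.
 */
throughput_stats
compute_throughput(const double *times, size_t niters, size_t nprocs,
                   double raw_size)
{
    throughput_stats s = {0.0, 0.0, 0.0};
    double sum = 0.0;
    size_t i, j;

    assert(times != NULL && niters > 0 && nprocs > 0);

    for (j = 0; j < niters; j++) {
        double slowest = times[j * nprocs];
        double s_j;

        /* The slowest process determines the time of the iteration. */
        for (i = 1; i < nprocs; i++)
            if (times[j * nprocs + i] > slowest)
                slowest = times[j * nprocs + i];

        s_j = raw_size / slowest;   /* S_j = raw_size / T_i */

        if (j == 0 || s_j < s.min)
            s.min = s_j;
        if (j == 0 || s_j > s.max)
            s.max = s_j;
        sum += s_j;
    }

    s.avg = sum / (double)niters;
    return s;
}
```

The sketch does not guard against a zero time; a real implementation would need to decide how to report that case.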
Platforms tested:
Linux (pp)
Small Fix
Fixed the Min/Max/Average accumulation stuff...
Actually thought about the code and made it accumulate the
information in the correct way.
Platforms tested:
Feature Fix
Added timer from open to close for write. Changed reporting of time
for write and read to only measure the actual read/write instead of
the open/close as well...
Platforms tested:
Feature Fix
Instead of outputting the # of kilobytes in a transfer size, output
the total bytes. This helps when you're interested in just doing a
copy-paste type of thing for the transfer buffer size.
Changed the output report
Platforms tested:
Feature Add
Added output which tells how large the file is (that is, the number
of dsets * number of elements in a dset * sizeof(int)).
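The file size formula above, written out as a small sketch. The function name and the unsigned long long types are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>

/*
 * File size in bytes: number of dsets * number of elements in a dset
 * * sizeof(int). The assert guards against overflow of the product.
 */
unsigned long long
file_size_bytes(unsigned long long ndsets, unsigned long long nelmts)
{
    assert(nelmts == 0 || ndsets <= ULLONG_MAX / sizeof(int) / nelmts);

    return ndsets * nelmts * (unsigned long long)sizeof(int);
}
```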
Platforms tested:
The default minimum xfer size of 1K was way too slow for a
parallel file system like the PFS of Tflops. Set it to
128K so that a run with the default settings completes sooner.
Platforms tested:
Tflops, modi4.
int64_t type is not available everywhere.
#include H5private.h which has platform dependent hooks
to define int64_t to something available.
Platforms tested:
Removing the DPSS (gridstorage) driver source code.
The DPSS (using Grid-Storage) driver is retired.
Removed the configure option with-gridstorage from configure.in.
Cvs remove the following files
Regenerated Dependencies files (some had to be hand-edited since
'make depend' did not cover them.)
Removed reference to DPSS Virtual file driver from H5F.c.
Platforms tested:
modi4 (Parallel; -with-gass=...), eirene, arabica (fortran, cxx).
Feature Fix
Added the minimum, maximum, and average time and MB/s for the write
and read operations. It now prints the report out in a pretty clear
format. It also includes how many iterations were done for the
write/read operation.
Platforms tested:
Bug fix
All processes, including those that are not part of the PIO test
sub-communicator, attempted to run the PIO test. This resulted
in failures for the processes that were not supposed to be involved.
The function that creates the sub-communicator also returns a
parameter indicating if the process is included in the PIO test
sub-communicator. Then only those processes will really do the
PIO test.
Platforms tested:
eirene (pp) and Modi4 (pp)
Feature Fix
Changed so the "pio_perf" module handles creating and destroying the
MPI Comm. Worked it so we get the minimum, maximum, and average times
over a set of iterations.
Lots. Had to pull the MPI Comm code from the "pio_engine" module and
place it in the "pio_perf" module. Then worked on a way to have all
processes send their time output to process 0, who collects it and
gives back the min, max, and avg times for the iterations.
Platforms tested:
Linux. Doesn't work if you use more than 1 processor...*hrmph*
Bug Fix
The --raw, --hdf5, --mpiio options weren't being done correctly.
I had to change some of the tests for the io_type to & instead of |
so that they'd work how I wanted them to work.
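To show why | fails as a test here, a small sketch with made-up flag values (the real io_type constants are not shown in this log, so the names and values below are assumptions):

```c
#include <assert.h>

/* Hypothetical flag values; the real definitions may differ. */
enum {
    IO_RAW   = 1 << 0,
    IO_MPIIO = 1 << 1,
    IO_HDF5  = 1 << 2
};

/* Buggy test: OR-ing in a nonzero flag is always nonzero. */
int
wants_io_buggy(unsigned io_type, unsigned flag)
{
    return (io_type | flag) != 0;
}

/* Corrected test: AND is nonzero only if the flag is set in io_type. */
int
wants_io(unsigned io_type, unsigned flag)
{
    assert(flag != 0);
    return (io_type & flag) != 0;
}
```

With io_type set to IO_RAW only, the buggy version still reports that MPI/IO was requested, while the & version does not.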
Platforms tested:
Feature Changes
Okay, I needed to add in more parameters so that the user can modify
how things are supposed to work with the PIO programs. Also needed
to change the algorithm a bit to make these work. And needed to add
in timing for the READ option.
Added the above things. The parameters took a major rewrite of the
command-line parsing stuff. Here's the usage statement:
usage: pio_perf [OPTIONS]
    -h, --help                   Print a usage message and exit
    -d N, --num-dsets=N          Number of datasets per file [default: 1]
    -f S, --file-size=S          Size of a single file [default: 64M]
    -F N, --num-files=N          Number of files [default: 1]
    -H, --hdf5                   Run HDF5 performance test
    -i, --num-iterations         Number of iterations to perform [default: 1]
    -m, --mpiio                  Run MPI/IO performance test
    -o F, --output=F             Output raw data into file F [default: none]
    -P N, --max-num-processes=N  Maximum number of processes to use [default: 1]
    -p N, --min-num-processes=N  Minimum number of processes to use [default: 1]
    -r, --raw                    Run raw (UNIX) performance test
    -X S, --max-xfer-size=S      Maximum transfer buffer size [default: 1M]
    -x S, --min-xfer-size=S      Minimum transfer buffer size [default: 1K]

    F - is a filename.
    N - is an integer >=0.
    S - is a size specifier, an integer >=0 followed by a size indicator:
            K - Kilobyte
            M - Megabyte
            G - Gigabyte
        Example: 37M = 37 Megabytes
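A size specifier like the ones above could be parsed along these lines. This is a sketch, not the parser in pio_perf: the function name is hypothetical, and it assumes that K, M, and G are powers of 1024 and that a bare integer means bytes, neither of which the usage text states.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Parse "<integer>[K|M|G]" into a byte count.
 * Returns -1 if the string is malformed. Overflow is not checked.
 */
long long
parse_size_sketch(const char *str)
{
    char *end;
    long long value;

    assert(str != NULL);

    /* Must start with a digit: rejects "", "M", and negative numbers. */
    if (!isdigit((unsigned char)str[0]))
        return -1;

    value = strtoll(str, &end, 10);

    switch (*end) {
    case 'K':
        value *= 1024LL;
        end++;
        break;
    case 'M':
        value *= 1024LL * 1024LL;
        end++;
        break;
    case 'G':
        value *= 1024LL * 1024LL * 1024LL;
        end++;
        break;
    default:
        break;      /* no size indicator: treat as bytes (assumption) */
    }

    /* Anything left over after the indicator is an error. */
    return (*end == '\0') ? value : -1;
}
```

Under these assumptions, "37M" parses to 37 * 1024 * 1024 bytes, and a string such as "12X" is rejected.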
Platforms tested:
Linux, but not fully finished...
Added checking of the nfiles and ndsets parameters.
Removed the iteration variable since the iterations are done
in the Control routine.
Platforms tested:
Eirene(pp) and modi4(pp)
Bug fix (or more like feature)
MPI_File_open does not truncate the file if it already exists.
This created confusion during debugging as to what the real file size
is. It also interferes with measuring the real write bandwidth, since
the time required to allocate new disk space vanishes for subsequent
writes at offsets shorter than the previous file size.
Added a MPI_File_set_size to reset the file size to 0 for every new
Another bug is that the 'remove()' call may not work for MPIO/PHDF5
files (e.g., the filename may have an MPI prefix like "pfs:filename").
Replaced "remove" with MPI_File_delete for those cases.
Platforms tested:
modi4(pp) and eirene (pp)
bug fix, new feature
Added HDF5 write in do_write.
Added a complete do_read.
Still need timing code for the read part.
Platforms tested:
eirene (pp), modi4(pp).
Did not test serial since all changes were done in parallel area.
Ugh! I forgot to add the -m flag to the list of "short" parameter
codes. It wasn't even looking for them. *sigh*
Added it
Platforms tested:
Bug fixes, new features
There was a coding error in handling file open flags. Changed it
to use &.
Added do_cleanup to cleanup temporary test files but only if
$HDF5_NOCLEANUP is not set. This is consistent with other test programs.
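The environment check could be sketched as follows. The function names and the file-list interface are hypothetical, and the sketch assumes that HDF5_NOCLEANUP counts as "set" whenever it exists in the environment, even with an empty value.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Cleanup is allowed only when the variable is not set at all. */
int
cleanup_allowed(const char *nocleanup_value)
{
    return nocleanup_value == NULL;
}

/*
 * Hypothetical cleanup helper: remove the given temporary files
 * unless $HDF5_NOCLEANUP is set. Returns the number of files removed.
 */
int
do_cleanup_sketch(const char *const *names, int count)
{
    int i, removed = 0;

    assert(names != NULL || count == 0);

    if (!cleanup_allowed(getenv("HDF5_NOCLEANUP")))
        return 0;   /* user asked to keep the test files */

    for (i = 0; i < count; i++)
        if (remove(names[i]) == 0)
            removed++;

    return removed;
}
```

For MPIO/PHDF5 files the remove() call here would need to be swapped for something that understands MPI file name prefixes.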
Added logic so that each process is writing its own slabs of data only.
Moved the number of processes, the rank of the process, and the
communicator used for the PIO run into global variables. This makes
the coding easier (but it is not thread-safe).
Platforms tested:
modi4(pp) and eirene(pp).
Code cleanup
Tweaked internal error handling macros to reduce the size of the library's
object code by about 10-20%.
Also cleaned up some compiler warnings...
Platforms tested:
FreeBSD 4.4 (sleipnir)
Feature Fix
Changed default size of file to 512MB. The "-m" flag is now in
megabytes as well. This makes running things a bit faster.
Platforms tested:
Bug Fix
Fixed so that it will display the correct timing data. It will also
write to the correct file (which it wasn't before).
Put the code in for displaying the time. Had to change the way I was
passing an object to the pio_fopen() function from just being a
structure to being a pointer so that the changes could be propagated
Platforms tested:
Bug Fix
I wasn't calculating the total time correctly.
I had to subtract the previous time from the current time. This
wasn't being done...DOH
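The fix amounts to accumulating (current - previous) instead of the raw clock value. A minimal sketch, with a hypothetical timer struct and with the clock reading passed in as a parameter (in an MPI program it could come from MPI_Wtime()):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accumulating timer; not the actual pio_perf type. */
typedef struct {
    double start;    /* clock reading at the last start */
    double total;    /* accumulated elapsed time */
    int    running;
} sketch_timer;

void
sketch_timer_start(sketch_timer *t, double now)
{
    assert(t != NULL);
    t->start = now;
    t->running = 1;
}

void
sketch_timer_stop(sketch_timer *t, double now)
{
    assert(t != NULL);
    if (t->running) {
        /* Elapsed time is the difference, not the clock value itself. */
        t->total += now - t->start;
        t->running = 0;
    }
}
```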
Platforms tested:
Small Fixes
After conversation with Albert, here are some small fixes for the
performance stuff. Not too significant. Though, we did add the
"buffer size" as a parameter I pass to the engine.
Feature Fix
Added code so that it measures the time it takes to do I/O and returns
that to the calling function.
This code doesn't quite work yet. There is something wrong with the
MPI code in the "pio_engine.c" file...I don't know what's up with
Platforms tested:
Bug fix
Cleaned up obvious syntax errors to make it compile.
Changed unnecessarily unsigned variables to signed to avoid
messy compiler warnings.
Platforms tested:
eirene (parallel)
Backward Compatibility Fix
One of H5P[gs]et_cache's parameters changed between v1.4 and the development
Added v1.4 compat stuff around H5P[gs]et_cache implementation and testing
to allow v1.4.x users to continue to use their source code without
These changes are for everything except the FORTRAN wrappers - I spoke with
Elena and she will make the FORTRAN wrapper changes.
Platforms tested:
FreeBSD 4.4 (hawkwind)
Code Cleanup and Feature Add
Finally checking in the changes I made to the performance code. It
just modularizes it a bit more and performs some more checks, etc. I
also renamed the timer functions to be more in line with how other
things are named here...
Platforms tested:
bug Fix
Changed the code so that if parallel stuff isn't enabled, then we
don't compile the parallel code.
Cleaned up the code and put #ifdef's around it checking for parallel
Platforms tested:
Bug patch
pio_xxx.c fails to compile in serial mode.
I temporarily disabled the compile of the pio-perform code in
the Makefile. Will fix it after Sunday.
Platforms tested:
eirene (serial).
New addition
Initial version of the Parallel I/O performance measurement program.
Not fully implemented yet, but checking the files in before I
destroy them by accident.
Don't run this on a small file system (like AFS or eirene) since it
generates gigabytes of test files.
Platforms tested:
modi4 64bits. It compiled and ran but took a long time because
the current test parameters are too "wild".
Since we're only about halfway through converting the internal use of
property lists from the "old way" to the generic property lists, we turned
off snapshots to avoid exposing lots of API changes to users, until the
APIs settled down.
Getting the snapshots rolling again seems to have become a priority, so
some changes are going to have to be made now that were going to be
postponed until we were completely finished with the conversion. This
requires that the old API functions be able to deal with both the old
and new property lists smoothly.
Kludged together the property list code so that it can transparently
handle both the old and new property lists.
Platforms tested:
FreeBSD 4.4 (hawkwind)
Code cleanup for better compatibility with C++ compilers
C++ compilers are choking on our C code, for various reasons:
we used our UNUSED macro incorrectly when referring to pointer types
we used various C++ keywords as variables, etc.
we incremented enums with the ++ operator.
Changed variables, etc. to avoid C++ keywords (new, class, typename, typeid,
Fixed usage of UNUSED macro from this:
char UNUSED *c
to this:
char * UNUSED c
Switched the enums from x++ to x=x+1
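A short sketch of the enum change. The enum and function names are made up for illustration, and the explicit cast is an addition of this sketch: C accepts x=x+1 as is, but C++ compilers also require the cast back to the enum type.

```c
#include <assert.h>

/* Hypothetical enum; the values are illustrative only. */
typedef enum {
    IO_KIND_RAW = 0,
    IO_KIND_MPIIO,
    IO_KIND_HDF5,
    IO_KIND_COUNT   /* sentinel, not a real kind */
} io_kind;

int
count_io_kinds(void)
{
    io_kind k;
    int n = 0;

    /* k++ is rejected by C++ compilers for enums; step explicitly. */
    for (k = IO_KIND_RAW; k < IO_KIND_COUNT; k = (io_kind)(k + 1))
        n++;

    assert(n == (int)IO_KIND_COUNT);
    return n;
}
```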
Platforms tested:
FreeBSD 4.4 (hawkwind)
Bug fix
Added the condition that Parallel programs are dependent on
the hdf5 library too.
Platforms tested:
eirene (parallel), modi4(serial and parallel).
New feature.
Test programs were assumed to be serial programs only.
There was no provision to test parallel programs automatically.
Added $(TEST_PARA_PROGS) to hold parallel test programs and
added appropriate action entry to test them if defined.
Platforms tested:
Eirene (parallel).