In recent years, it has become increasingly clear that the overall time to completion of parallel applications may depend
to a large extent on the time taken to perform I/O in the program. This is because many parallel applications need to access
large amounts of data, and although great advances have been made in the CPU and communication performance of parallel machines,
similar advances have not been made in their I/O performance. The densities and capacities of disks have increased significantly,
but the performance of individual disks has not improved at the same pace. For parallel computers to be truly usable
for solving real, large-scale problems, the I/O performance must be scalable and balanced with respect to the CPU and
communication performance of the system.