Duality between Prefetching and Queued Writing with Parallel Disks
David A. Hutchinson (5), Peter Sanders (6), and Jeffrey Scott Vitter (5)
(5) Department of Computer Science, Duke University, Durham, NC 27708-0129
(6) Max-Planck-Institute for Computer Science, Stuhlsatzenhausweg 85, 66123 Saarbrücken, Germany
Abstract
Parallel disks promise to be a cost-effective means for achieving high bandwidth in applications involving massive data sets,
but algorithms for parallel disks can be difficult to devise. To combat this problem, we define a useful and natural duality
between writing to parallel disks and the seemingly more difficult problem of prefetching. We first explore this duality for
applications involving read-once accesses using parallel disks. We get a simple linear-time algorithm for computing optimal
prefetch schedules and analyze the efficiency of the resulting schedules for randomly placed data and for arbitrary interleaved
accesses to striped sequences. Duality also provides an optimal schedule for the integrated caching and prefetching problem,
in which blocks can be accessed multiple times. Another application of this duality gives us the first parallel disk sorting
algorithms that are provably optimal up to lower-order terms. One of these algorithms is a simple and practical variant of
multiway merge sort, addressing a question that has been open for some time.
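
The duality between queued writing and prefetching can be illustrated with a toy sketch. It assumes a greedy queued-writing rule (when the buffer is full, write one block from every non-empty disk queue) and derives a candidate prefetch order by running that rule on the reversed access sequence and then reversing the resulting steps. The function names, the parameter m (buffer capacity in blocks), and the exact buffer accounting are illustrative assumptions, not definitions taken from the paper.

```python
from collections import deque


def greedy_write_schedule(disks, m):
    """Toy greedy queued writing (illustrative assumption, not the paper's code).

    disks[i] is the disk that block i must be written to; m is the buffer
    capacity in blocks. Returns a list of output steps, each a list of block
    indices with at most one block per disk.
    """
    if m < 1:
        raise ValueError("buffer capacity m must be at least 1")
    queues = {}      # disk id -> FIFO queue of buffered block indices
    buffered = 0     # number of blocks currently held in the buffer
    steps = []

    def output_step():
        nonlocal buffered
        # Write the oldest buffered block of every non-empty disk queue.
        step = [queues[d].popleft() for d in sorted(queues) if queues[d]]
        buffered -= len(step)
        steps.append(step)

    for i, d in enumerate(disks):
        if buffered == m:          # no room for the incoming block
            output_step()
        queues.setdefault(d, deque()).append(i)
        buffered += 1
    while buffered > 0:            # flush what is left at the end
        output_step()
    return steps


def dual_prefetch_schedule(disks, m):
    """Reverse the write schedule of the reversed sequence (hypothetical helper).

    Returns a list of fetch steps over the original block indices.
    """
    n = len(disks)
    write_steps = greedy_write_schedule(disks[::-1], m)
    # Position j in the reversed sequence is block n - 1 - j in the original.
    return [sorted(n - 1 - j for j in step) for step in reversed(write_steps)]


# Usage: six blocks on two disks, buffer of two blocks.
schedule = dual_prefetch_schedule([0, 0, 1, 0, 1, 1], m=2)
```

Each fetch step touches every disk at most once, and every block is fetched exactly once. The sketch only shows the reversal idea and does not by itself establish the optimality claimed in the abstract.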
Supported in part by the NSF through research grant CCR-0082986.
Partially supported by the IST Programme of the EU under contract number IST-1999-14186 (ALCOM-FT).
Supported in part by the NSF through research grants CCR-9877133 and EIA-9870724 and by the ARO through MURI grant DAAH04-96-1-0013.