Parallel disks provide a cost effective way of speeding up
I/Os in applications that work with large amounts of data.
The main challenge is to achieve as much parallelism as
possible, using prefetching to avoid bottlenecks in disk
access.
Efficient algorithms have been developed for some particular
patterns of accessing the disk blocks. In this paper, we
consider general request sequences. When the request
sequence consists of unique block requests, the problem is
called prefetching and is a well-solved problem for
arbitrary request sequences. When the reference sequence can
have repeated references to the same block, we need to
devise an effective caching policy as well. While optimum
offline algorithms have been recently designed for the
problem, in the online case, no effective algorithm was
previously known.
Our main contribution is a deterministic online algorithm
threshold-LRU which achieves
O((MD/L)2/3) competitive
ratio and a randomized online algorithm threshold-MARK which
achieves
competitive ratio for
the caching/prefetching problem on the parallel disk model
(PDM), where D is the number of disks, M is the size of fast
memory buffer, and M + L is the amount of lookahead
available in the request sequence. The best-known lower
bound on the competitive ratio is
for
lookahead L >= M in both models. We also show that if the
deterministic online algorithm is allowed to have twice the
memory of the offline then a tight competitive ratio of
can be achieved. This problem generalizes the
well-known paging problem on a single disk to the parallel
disk model.