The I/O-Model

Typical disks have the seek time much larger than the time to transfer a single record; hence, to amortize this difference, data is transferred between main memory and disks in blocks. The standard two-level I/O-model with one (logical) disk captures the main properties of disks and defines the following parameters:

An Input/Output operation (or simply an I/O) is the operation of transferring one block between disk and internal memory. Because disks are much slower than CPUs, the I/O-model measures performance as the number of I/Os performed.

Sorting and scanning are fundamental problems in the field of I/O algorithms. Optimal algorithms and bounds for these problems are known.

Overall, we see that

Scan(N) < Sort(N) << N.

In practice the difference between an algorithm doing N I/Os and one doing Sort(N) I/Os can be very significant: Consider for example a problem involving N=1GB, and a typical system with block size B=32KB and internal memory size M=256MB. Then the speedup is more than three orders of magnitude. If the external memory algorithm takes 10 minutes to complete, the internal memory algorithm could use more than 150 hours, or about 6 days!


<laura@cs.duke.edu>
Last modified: Tue Apr 17 11:25:59 2001 by laura.