There has recently been much productive work in the algorithms community on techniques for efficient use of external memory in large-scale applications. In order to implement I/O-optimal algorithms efficiently, the machines they run on must support fundamental external-memory operations. Unfortunately, existing file systems generally do not support the necessary semantics or provide useful tools. There are three basic approaches to supporting development of I/O-efficient code: array-oriented systems (such as PASSION and ViC*), access-oriented systems (such as the UNIX file system and Panda), and framework-oriented systems (such as TPIE, a Transparent Parallel I/O Programming Environment). In this position statement, we discuss the advantages and potential of the TPIE approach in enabling I/O-efficient computation.
Categories and Subject Descriptors: D.4.2 [Operating Systems]: Storage Management - Secondary Storage; D.4.4 [Operating Systems]: Communications Management - Input/Output; E.2 [Data Storage Representations]: Contiguous representations; E.5 [Files]: Sorting/searching; F.2.1 [Analysis of Algorithms and Problem Complexity: Numerical Algorithms and Problems - Computations on matrices; F.2.2 [Analysis of Algorithms and Problem Complexity: Nonnumerical Algorithms and Problems - Computations on discrete structures, geometrical problems and computations, sorting and searching; B.4.4 [Input/Output and Data Communications: Performance Analysis and Design Aids] - Formal models, Worst-case analysis;
General Terms: Algorithms, Design, Languages, Performance, Theory. Additional Key Words and Phrases: I/O, external memory, secondary memory, communication, disk drive, parallel disks, sorting.