TPIE is a software environment (written in C++) that facilitates the implementation of
external memory algorithms.
The goal of theoretical work in the area of external memory algorithms
(also called I/O algorithms or out-of-core algorithms) has
been to develop algorithms that minimize the Input/Output communication
(or just I/O) performed when solving problems on very large data
sets. The area was
effectively started in the late eighties by Aggarwal and Vitter and subsequently
I/O algorithms have been developed for several problem domains.
See below for recent surveys of theoretical results in the area
of I/O-efficient algorithms.
TPIE is designed to bridge the gap between the theory and practice of
parallel I/O systems. It is intended to demonstrate all of the following:
Abstract away the I/O details through a simple high-level interface.
Implement I/O-efficient paradigms to show their practical viability.
Be flexible, allowing
a wide variety of algorithms to be implemented within the system.
Be portable across a variety of hardware platforms.
Be extensible, so that new features can be easily added later.
The TPIE library consists of a kernel and a set of I/O-efficient algorithms and data
structures implemented on top of the kernel. Most of the functionality
is provided as templated classes and functions in C++. In addition,
small programs are provided for testing and illustrating the usage of
the application interface.
The latest release includes the routines for solving fundamental batched
problems such as sorting, merging, and permuting. These routines enable the programmer to write
efficient and portable implementations of algorithms that make use of
the fundamental streaming primitives which have been identified
in theoretical work. It also includes support for random access
to disk blocks, which is the foundation for implementing external
memory data structures. The classical external memory structure,
the B-tree, is included in the distribution.
A short summary of changes relative to previous versions can
be found in the CHANGES file. Installation instructions for the current release are provided
in the TPIE README file and in
the TPIE manual. The TPIE project is work in progress and users
of TPIE are encouraged to send bug reports, comments, and suggestions
Work on extending and improving TPIE is in
progress. Projects include further performance improvements, support
for parallel disks, addition of the distribution sweeping primitive,
addition of various data structure implementations, and addition of
several application examples.
Users interested in obtaining/testing preliminary versions of the
above extensions are encouraged to send a request to firstname.lastname@example.org.
A mailing list
for TPIE users interested in information about new releases and bug
reports exists. Requests to be added to the list should be sent to email@example.com.
L. Arge, J. Chase, L. Toma, J. S. Vitter,
R. Wickremesinghe, P. Halpin, and D. Urban,
Flow Computation on Massive Grids.
Proceedings of the 9th ACM International Symposium
on Advances in Geographic Information Systems (ACM-GIS '01),
Atlanta, GA, November 2001.
L. Arge, K. H. Hinrichs, J. Vahrenhold, and J. S. Vitter.
Bulk Operations on Dynamic R-trees. Proc. 1st Workshop on Algorithm
Engineering and Experimentation (ALENEX '99), Lecture Notes in Computer
Science, Springer-Verlag, 1999.
L. Arge, O. Procopiuc, S. Ramaswamy, T. Suel, and J. S. Vitter.
Sweep-Based Spatial Join. Proc. International Conference on
Very Large Databases (VLDB '98), 1998, pp. 570-581.
J. S. Vitter. Online Data
Structures in External MemoryProc. 26th Annual International
Colloquium on Automata, Languages, and Programming (ICALP '99), Lecture
Notes in Computer Science, Springer-Verlag, 1999, 119--133.