Many data sets to be sorted consist of a limited number of
distinct keys. Sorting such data sets can be thought of as
bundling together identical keys and having the bundles placed in
order; we therefore denote this as bundle sorting. We
describe an efficient algorithm for bundle sorting in external
memory that requires at most
disk
accesses, where N is the number of keys, M is the size of
internal memory, k is the number of distinct keys, B is the
transfer block size, and 2<c<4. For moderately sized k, this
bound circumvents the
I/O lower
bound known for general sorting. We show that our algorithm is
optimal by proving a matching lower bound for bundle sorting. The
improved running time of bundle sorting over general sorting can
be significant in practice, as demonstrated by experimentation. An
important feature of the new algorithm is that it is executed
``in-place'', requiring no additional disk space.