We propose measures for compressed data
structures, in which space usage is measured in a data-aware manner. In
particular, we consider the fundamental dictionary problem on
set data, where the task is to construct a data structure to
represent a set S of n items out of a universe
and support various queries on S. We use a well-known
data-aware measure for set data called gap to bound the space of
our data structures. We describe a novel dictionary structure taking
bits. Under the RAM
model, our dictionary supports membership, rank, select, and predecessor
queries in nearly optimal time, matching the time bound of Andersson and
Thorup's predecessor structure, while simultaneously improving upon their
space usage. Our dictionary structure uses exactly gap bits in the
leading term (i.e., the constant factor is 1) and answers queries in
near-optimal time. When seen from the worst case perspective, we present
the first
-bit dictionary structure which supports these
queries in near-optimal time under RAM model. We also build a dictionary
which requires the same space and supports membership, select, and
partial rank queries even more quickly in
time. To the
best of our knowledge, this is the first of a kind result which achieves
data-aware space usage and retains near-optimal time.