Efficient Maintenance of Materialized Top-k Views

Written with Ke Yi, Jun Yang, Gangqiang Xia, and Yuguo Chen.

In Proc. 19th International Conference on Data Engineering, pages 189-200, 2003.

Downloads: PDF, PS

Abstract: We tackle the problem of maintaining materialized top-k views in this paper. Top-k queries, including MIN and MAX as important special cases, occur frequently in common database workloads. A top-k view can be materialized to improve query performance, but in general it is not self-maintainable unless it contains all tuples in the base table. Deletions and updates on the base table may cause tuples to leave the top-k view, resulting in expensive queries over the base table to "refill" the view. In this paper, we propose an algorithm that reduces the frequency of refills by maintaining a top-k' view instead of a top-k view, where k' changes at runtime between k and some kmax>k. We show that in most practical cases, our algorithm can reduce the expected amortized cost of refill queries to O(1) while still keeping the view small. Compared with the simple approach of maintaining either the top-k view itself or a copy of the base table, our algorithm can provide orders-of-magnitude improvements in performance with appropriate kmax values. We show how to choose kmax dynamically to adapt to the actual system workload and performance at runtime, without requiring accurate prior knowledge.