Perturbation Analysis of Database Queries

Ph.D. Defense
Speaker Name
Brett Walenz
Date and Time

Data-driven decision making plays a dominant role across domains ranging from health and business to government and sports. These decisions are often ad hoc and resource-intensive: a bank may have to compare and analyze all of its users, and organizers of a sporting event might use previous events to estimate an acceptable ticket sales rate. In this dissertation, I describe efficient methods for optimizing complex analytic queries.

I begin by modeling certain complex queries as perturbation analysis, in which the same query template is instantiated and evaluated with a large number of different parameter settings. I then show how to tackle this problem from three distinct angles: parallel and distributed execution, database query optimization and processing, and approximation methods. For each angle, I provide empirical results that show the effectiveness of these techniques for perturbation analysis and how they benefit a wide range of analytic queries in diverse settings.
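
To make the idea of perturbation analysis concrete, here is a minimal sketch of the naive baseline: one query template evaluated once for every parameter setting in a grid. It is an illustration only and is not taken from the dissertation; the `games` table, its columns, the toy rows, and the function name `run_perturbations` are all hypothetical.

```python
import sqlite3
from itertools import product

# Hypothetical toy data: (player, points, rebounds) for one game each.
ROWS = [
    ("a", 30, 10),
    ("b", 22, 12),
    ("c", 18, 5),
    ("d", 25, 8),
    ("e", 10, 15),
]

# One query template; the two "?" placeholders are the perturbed parameters.
TEMPLATE = "SELECT COUNT(*) FROM games WHERE points >= ? AND rebounds >= ?"


def run_perturbations(rows, point_thresholds, rebound_thresholds):
    """Naively evaluate TEMPLATE once per parameter setting."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE games (player TEXT, points INTEGER, rebounds INTEGER)"
    )
    conn.executemany("INSERT INTO games VALUES (?, ?, ?)", rows)

    results = {}
    # Every (points, rebounds) pair is a separate instantiation of the template.
    for p, r in product(point_thresholds, rebound_thresholds):
        (count,) = conn.execute(TEMPLATE, (p, r)).fetchone()
        results[(p, r)] = count

    conn.close()
    return results


results = run_perturbations(ROWS, [10, 20, 30], [5, 10])
for setting, count in sorted(results.items()):
    print(setting, count)
```

The number of query evaluations grows with the product of the parameter ranges, which is why this baseline becomes expensive for large grids and why parallel execution, query optimization, and approximation are natural angles of attack.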

Advisor: Jun Yang
Committee: Pankaj Agarwal, Sudeepa Roy, Ashwin Machanavajjhala