Research Projects
Automating the Process of SQL Tuning
The need to improve a suboptimal execution plan picked by the query optimizer for a repeatedly run SQL query arises routinely. Complex expressions, skewed or correlated data, and changing conditions can cause the optimizer to make mistakes. For example, the optimizer may pick a poor join order, overlook an important index, use a nested-loop join when a hash join would have done better, or cause an expensive, but avoidable, sort to happen. SQL tuning, the attempt to improve poorly-performing execution plans, is a critical aspect of database performance tuning. Ironically, as commercial databases strive to improve on the manageability front, SQL tuning is becoming more of a black art. It requires a high level of expertise in areas like:
- query optimization, run-time execution of query plan operators, configuration parameter settings, and other database internals
- identification of missing indexes and other access structures
- statistics maintained about the data
- characteristics of the underlying storage system
Since database systems, their workloads, and the data that they manage are not getting any simpler, database users and administrators often rely on intuition and trial-and-error for SQL tuning. The main idea of this work is that the trial and error (or, experiment-driven) process of SQL tuning can be automated by the database system itself in an efficient manner; freeing the user or administrator from this burden in most cases.
zTuned: Automated SQL Tuning through Trial and (Sometimes) Error
zTuned is a new system that formulates and automates the process of SQL tuning using an experiment-driven approach, similar to the typical tuning process performed by a Database Administrator. Experiments are conducted online with almost zero impact on the user-facing production database. zTuned faces three nontrivial challenges:
- how is the SQL tuning logic integrated with the regular query optimizer
- how to plan the experiments to conduct so that a satisfactory (new) plan can be found quickly
- how to conduct experiments with minimal impact on the user-facing production workload
Xplus: A SQL-Tuning-Aware Query Optimizer
The difficulty of SQL tuning can be lessened considerably if users and higher-level tuning tools can tell the optimizer: "I am not satisfied with the performance of the plan p being used for the query Q that runs repeatedly. Can you generate a (δ%) better plan?" A novel feature of zTuned is a new SQL-tuning-aware query optimizer, called Xplus, that goes beyond the traditional plan-first-execute-next approach; Xplus is able to run some (sub)plans proactively, collect monitoring data from the runs, and iterate. Xplus guides the SQL tuning process efficiently using an extensible architecture that incorporates SQL-tuning experts with different goals, and a policy to arbitrate among the experts. Xplus has been prototyped using PostgreSQL.
The .eX Framework
Xplus and zTuned address the challenge of where to run plans and
conduct experiments with minimal impact on the user-facing production
workload, by leveraging recent solutions that let the database system
run query plans noninvasively in sandboxed and standby settings for tuning.
In the same context, I am part of a research group that is in the process
of designing a new framework, called
.eX
,
for conducting automated,
online experiments. We are taking advantage of emerging mechanisms like
virtualization and cloud computing to provide the foundations for such a
workbench. The paradigm of experiment-driven management encapsulates
mechanisms and policies that identify as well as collect missing
instrumentation data; the missing data is essential to generate the
knowledge required to address certain administrative tasks satisfactorily
and efficiently. We understand that experiments will not fully replace
current model-based management practices; rather, there is interesting
synergy between them that needs to be explored.
Software Release
The source code is distributed using the Software License Agreement for academic and research (non-commercial) purposes from Duke University. You may also license this software under a specially-negotiated non-exclusive commercial use license. The term "commercial use" is defined broadly: if the software is used for commercial gain or to further any commercial purpose, a commercial use license is required. If you have any question about whether your use would be considered commercial, or if you would like to negotiate a non-exclusive commercial use license, please contact me at: hero at cs dot duke dot edu.
The software release consists of three components:
- PostgreSQL 8.3.4 with ExplainPlan (Download:
tar.gz,
zip)
This file contains modified source code for PostgreSQL 8.3.4. We implemented a new server command, called EXPLAIN_PLAN, to expose an interface for costing and running physical plans. - Xplus (Download:
tar.gz,
zip)
This file contains the implementation of Xplus in Java. Xplus uses the EXPLAIN_PLAN interface from above for costing and runing the (sub)plans selected for execution. - Sample Database (Download:
tar.gz,
zip)
This file contains scripts to create and populate a toy database used for testing the EXPLAIN_PLAN and Xplus (though JUnit tests).
Each software component includes a detailed README file explaining how to install and use the software.
Publications
-
H. Herodotou and S. Babu.
Xplus: A SQL-Tuning-Aware Query Optimizer. (Slides)
In Proc. of the 36th Intl. Conf. on Very Large Data Bases (VLDB '10), September 2010. -
H. Herodotou and S. Babu.
Automated SQL Tuning through Trial and (Sometimes) Error. (Slides)
In Proc. of the Second Intl. Workshop on Testing Database Systems (DBTest '09), June 2009. -
S. Babu, N. Borisov, S. Duan, H. Herodotou, and V. Thummala.
Automated Experiment-Driven Management of (Database) Systems.
In Proc. of the 12th Workshop on Hot Topics in Operating Systems (HotOS-XII), May 2009. -
H. Herodotou.
zTuned: Automated SQL Tuning through Trial and (Sometimes) Error.
Master's Thesis, Duke University, April 2009.