Databases are typically used as a subsystem in a larger system that contains Web servers, application servers, and network-attached storage servers. Such complex systems experience some form of change all the time, e.g., an update to a Java module in the application server, a statistics update in the database, or a RAID rebuild in a storage volume. Such changes in different subsystems can cause an overall performance degradation whose cause is hard to diagnose. The diagnosis task is all the more daunting because enterprise environments have isolated administration teams and tools for each subsystem.
This project is developing an integrated tool called DIADS that automates complex administrative tasks such as problem diagnosis, what-if analysis, disaster-recovery orchestration, and online tuning when a database is used as a subsystem in a larger system. DIADS contains two technical innovations. The first addresses a limitation of monitoring data. Problem diagnosis involves reconstructing system behavior at various points in time from historical and current monitoring data collected from the system, but the amount and quality of monitoring data available from production systems are constrained by the need to keep monitoring overhead low. DIADS therefore uses an abstraction called the Annotated Plan Graph to represent and reason about database behavior in the context of a larger system. Annotated Plan Graphs are generated from lightweight monitoring data.
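To make the idea of a plan graph carrying annotations more concrete, here is a minimal sketch in Python. It is an illustration only and not the DIADS data structure: it assumes that each node is a query-plan operator and that an annotation maps a system component to the monitoring metrics observed for it. The names `PlanNode`, `annotate`, and `components_touched`, as well as the component and metric names, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class PlanNode:
    """One operator in a query plan (hypothetical structure)."""
    operator: str  # e.g. "SeqScan" or "HashJoin"
    children: List["PlanNode"] = field(default_factory=list)
    # Assumed annotation layout: component name -> {metric name -> value}
    annotations: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def annotate(self, component: str, metric: str, value: float) -> None:
        """Attach one monitoring observation to this operator."""
        self.annotations.setdefault(component, {})[metric] = value

    def walk(self) -> Iterator["PlanNode"]:
        """Yield this node and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def components_touched(root: PlanNode) -> List[str]:
    """Return the sorted names of all components annotated anywhere in the plan."""
    return sorted({c for node in root.walk() for c in node.annotations})


# Hypothetical example: a scan that reads from a storage volume feeds a join.
scan = PlanNode("SeqScan")
scan.annotate("storage_volume_v1", "read_latency_ms", 12.5)
join = PlanNode("HashJoin", children=[scan])
join.annotate("db_server", "cpu_utilization", 0.7)
```

In this sketch, walking the graph from the root links a slow operator to the components outside the database that it depends on, which is the kind of cross-subsystem reasoning the abstraction is meant to support.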
The other innovation in DIADS is a suite of workflows for administrative tasks that combine machine-learning techniques with domain knowledge from system experts. For example, for problem diagnosis, the machine-learning part of the workflow provides core techniques to handle large and noisy streams of monitoring data, while the domain-knowledge part acts as a set of checks and balances that guides the diagnosis in the right direction. This design enables DIADS to function effectively even in the presence of multiple concurrent problems and the noisy monitoring data prevalent in production environments. DIADS is being prototyped for research and educational purposes in a datacenter setting with PostgreSQL databases and an enterprise-level storage area network.
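The division of labor in such a workflow can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the DIADS algorithm: a plain z-score against a baseline stands in for the machine-learning step, and a single hand-written rule stands in for expert knowledge. The function names, the `component.metric` naming scheme, and the threshold are all hypothetical.

```python
from statistics import mean, stdev


def anomaly_scores(baseline, current):
    """Statistical step (stand-in for machine learning): score each metric
    by how many baseline standard deviations its current value deviates."""
    scores = {}
    for metric, history in baseline.items():
        if metric not in current:
            continue  # no current observation for this metric
        sigma = stdev(history)
        if sigma == 0:
            continue  # constant baseline gives no scale for deviation
        scores[metric] = abs(current[metric] - mean(history)) / sigma
    return scores


def diagnose(baseline, current, plan_components, threshold=3.0):
    """Flag anomalous metrics, then apply a domain-knowledge check: a
    component can only explain a slowdown if the query plan uses it."""
    scores = anomaly_scores(baseline, current)
    candidates = [m for m, s in scores.items() if s >= threshold]
    # Assumed naming scheme: "<component>.<metric>"
    return sorted(m for m in candidates if m.split(".")[0] in plan_components)


# Hypothetical monitoring data: five baseline samples per metric.
baseline = {
    "v1.read_latency_ms": [10, 11, 9, 10, 10],
    "v2.read_latency_ms": [5, 5.5, 4.5, 5, 5],
    "db.cpu": [0.5, 0.6, 0.4, 0.5, 0.5],
}
current = {"v1.read_latency_ms": 30, "v2.read_latency_ms": 20, "db.cpu": 0.52}

# The query plan reads only from volume v1 and runs on the database server.
suspects = diagnose(baseline, current, plan_components={"v1", "db"})
```

Here both volumes look anomalous to the statistical step, but the rule discards `v2` because the plan never reads from it. That pruning is the role the domain-knowledge part plays when several problems occur at once.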
DIADS is supported generously by NSF, startup funds from Duke, and three faculty awards from IBM.
Collaborators