CPS 216: Advanced Database Systems
(Fall 2007, Shivnath Babu)

Project Schedule and Guidelines

See this document for the project schedule and guidelines.

Guidelines for the project proposal: The project proposal is due on Oct 5 by noon. The proposal will be graded and should include (i) a description of the problem, (ii) the motivation for the problem (e.g., why the problem is interesting, why it is challenging, and who will benefit from a solution), (iii) your initial ideas on how to attack the problem, and (iv) a brief discussion of previous work related to the problem. There is no page limit for the proposal.

Further Readings

Here are some further readings for each of the project topics. To access some of the following links (e.g., papers in the ACM digital library), you need to be on the Duke Network.

Query Optimization in Database Systems

  • Guy Lohman's talk on Self-Managing DB2 with an overview of their recent work on query optimization.
  • The Picasso project and a related paper.
  • As we discussed in class, the goals of query optimization have changed over the years. Here is a paper on robust query optimization.
  • The following paper is the first technical paper on the LEO system that Volker Markl talked about. Michael Stillger, Guy M. Lohman, Volker Markl, Mokhtar Kandil: LEO - DB2's LEarning Optimizer. Available here. A new and improved version of this paper is available here.
  • A less technical but more forward-looking paper on the LEO project appeared in the IBM Systems Journal. Available here.

Adaptive Query Processing in Database Systems

  • A recent paper on changing query plans if a problem is detected when a query is running: Volker Markl, Vijayshankar Raman, David E. Simmen, Guy M. Lohman, Hamid Pirahesh: Robust Query Processing through Progressive Optimization. Available here.
  • An attempt by Shivnath and colleagues to correct some problems with the above approach: Proactive Re-optimization.

Query Execution in Database Systems

  • A paper on Interaction-Aware Query Processing and Scheduling.
  • A paper on query suspension and resumption.
  • A paper on estimating time to completion of a query plan.

Data Stream Systems

  • Two recent projects on building data stream management systems: STREAM and Aurora. Here are two overview papers: from STREAM and from Aurora.
  • Adaptive query processing in a data stream management system: Shivnath's slides on adaptive query processing and an overview paper.
  • Work on load shedding, which gracefully handles high stream arrival rates by reducing the accuracy of query results: paper 1, paper 2, paper 3.

Configuration of Database Systems: Physical Design (e.g., Indexes and Materialized Views)

Configuration of Database Systems: Resources and Configuration Parameters

  • A paper from IBM on automated configuration of application servers.
  • A paper on our project at Duke on Active and Accelerated Learning of Cost Models for Optimizing Scientific Applications, with extensions to web services, database servers, storage servers, etc.
  • IBM DB2's Configuration Advisor.

Databases + Information Retrieval (DB+IR)

  • A paper on Google's system architecture. The paper is outdated, but the basic principles remain.
  • Some papers from IBM on the DB+IR problem: paper 1, paper 2.

Self-Healing Database Systems

  • A paper from Oracle on quick identification of performance problems.
  • Work from IBM on automated scheduling of statistics updates for DB2: paper 1, paper 2 (a non-technical article).
  • A paper from IBM on identifying distinct symptoms for different causes of DB2 failures.

Project Resources

Here are installation instructions for DB2 on the Duke CS research cluster.
Some useful information on running DB2 on the Duke CS research cluster is available from the CPS116 web site.