CPS 216: Advanced Database Systems
Course Information


Course Description

This course covers advanced database management system design principles and techniques. The course materials will be drawn from both classic and recent research literature. Possible topics include access methods, query processing and optimization, transaction processing, distributed databases, object-oriented and object-relational databases, data warehousing, data mining, Web and semistructured data, and search engines. Programming projects are required.

Prerequisites: An introductory database course or consent of instructor.


Time and Place

MW 2:20pm-3:35pm, D243 LSRC

Recitation and programming help sessions may be scheduled on Fridays 2:20pm-3:35pm, D243 LSRC.


Books

Required:

Recommended for reference:


Staff

Instructor: Jun Yang
Web: http://www.cs.duke.edu/~junyang/
Email: junyang@cs.duke.edu
Office hours: Wednesdays 3:35pm-4:35pm and Fridays 2:00pm-3:00pm in D327 LSRC, or by email appointment

TA: Junyi Xie
Web: http://www.cs.duke.edu/~junyi/
Email: junyi@cs.duke.edu
Office hours: Tuesdays and Thursdays 12:00pm-1:00pm in D328 LSRC


Web, Newsgroup, and Blackboard

Most of the course materials, including the syllabus, lecture notes, reading assignments, homeworks, and programming FAQs, will be available through the course Web page (http://www.cs.duke.edu/courses/spring03/cps216/).

The newsgroup duke.cs.cps216 is useful for posting questions that are likely to be of interest to the rest of the class. We very much encourage students in the class to post responses to questions. We will monitor the newsgroup regularly and post responses to questions that have not previously been asked or answered. Before posting a question, please make sure that you have read all previous messages and that your question has not yet been discussed.

We will use the Blackboard course management system (https://courses.duke.edu/bin/common/course.pl?course_id=_6454_1&frame=top) for grades. Log on to Blackboard and verify that it has your correct email address. Please check your email regularly, as important course announcements will be sent via email.


Grading

Homeworks  30%
Project    30%
Midterm    20%
Final      20%

There are four homeworks, with a mix of written and programming problems. Late homeworks will not be graded.

There is one course project, details of which will be available in the third week of the class.


Honor Code

Under the Duke Honor Code, you are expected to submit your own work in this course, including homeworks, projects, and exams. On many occasions when working on homeworks and projects, it is useful to ask others (the instructor, the TA, or other students) for hints or debugging help, or to talk generally about the written problems or programming strategies. Such activity is both acceptable and encouraged, but you must indicate in your submission any assistance you received. Any assistance received that is not given proper citation will be considered a violation of the Honor Code. In any event, you are responsible for understanding and being able to explain on your own all written and programming solutions that you submit. The course staff will pursue aggressively all suspected cases of Honor Code violations, and they will be handled through official University channels.


Tentative Syllabus

Week  Date        Topic                                                        Reference*
1     2003-01-08  Introduction
2     2003-01-13  Relational model and algebra                                 GMUW 3.1, 5.1, 5.2
      2003-01-15  Relational database design                                   GMUW 3.4-3.6
3     2003-01-20  Martin Luther King, Jr. Day holiday
      2003-01-22  SQL schema definition and query basics                       GMUW 6.6.1, 6.6.2, 6.1.1-6.1.4, 6.1.7, 6.2, 6.4.1, 6.4.2
4     2003-01-27  SQL subqueries and aggregates                                GMUW 6.3, 6.4.3-6.4.6
      2003-01-29  SQL NULL's, modifications, constraints, triggers             GMUW 6.1.5, 6.1.6, 6.5, 7.1, 7.2, 7.4
5     2003-02-03  SQL index, views, transactions, and application programming  GMUW 6.6.5, 6.6.6, 6.7, 8.1-8.6
      2003-02-05  Physical data organization                                   GMUW 11.2, 11.3, 12
6     2003-02-10  Indexing: basics, ISAM, B-tree                               GMUW 13.1-13.3, except 13.2.4
      2003-02-12  Indexing: R-tree, GiST                                       GMUW 14.3.7, 14.3.8
7     2003-02-17  Special guest lecture
      2003-02-19  Indexing: hashing, inverted lists, suffix arrays             GMUW 13.4, 13.2.4
8     2003-02-24  Main-memory indexing
      2003-02-26  Query processing: scan, sort, hash                           GMUW 15.1-15.5, 15.8
9     2003-03-03  Midterm exam
      2003-03-05  No class
10    2003-03-10  Spring recess
      2003-03-12  Spring recess
11    2003-03-17  Query processing using indexes                               GMUW 15.6
      2003-03-19  Buffer management                                            GMUW 15.7
12    2003-03-24  Query optimization: query rewrite                            GMUW 16.2, 16.3
      2003-03-26  Query optimization: cost estimation                          GMUW 16.4
13    2003-03-31  Query optimization: plan selection                           GMUW 16.5, 16.6
      2003-04-02  Advanced query optimization techniques
14    2003-04-07  Concurrency control                                          GMUW 18
      2003-04-09  Recovery                                                     GMUW 17
15    2003-04-14  Distributed databases                                        GMUW 19.4-19.6
      2003-04-16  XML
16    2003-04-21  XML indexing
      2003-04-23  Graduate reading period
17    2003-05-01  Final exam

* GMUW refers to the required textbook by Garcia-Molina, Ullman, and Widom.