CPS 296.1: Topics in Database Systems
(Spring 2002)


Course Information

Please check CourseInfo for grades.


Reading Assignments

Week Paper Review Due
2 "Searching the Web," by Arasu et al., ACM Transactions on Internet Technology, 2001. Not due
3 "Optimal Aggregation Algorithms for Middleware," by Fagin et al., PODS, 2001. 2002-01-20
"Proximity Search in Databases," by Goldman et al., VLDB, 1998. 2002-01-20
4 "WSQ/DSQ: A Practical Approach for Combined Querying of Databases and the Web," by Goldman and Widom, SIGMOD, 2000. 2002-01-27
"Incremental Maintenance of Views with Duplicates," by Griffin and Libkin, SIGMOD, 1995. 2002-01-27
5 "How To Roll a Join: Asynchronous Incremental View Maintenance," by Salem et al., SIGMOD, 2000. 2002-02-03
"Making Views Self-Maintainable for Data Warehousing," by Quass et al., PDIS, 1996. 2002-02-03
6 "DynaMat: A Dynamic View Management System for Data Warehouses," by Kotidis and Roussopoulos, SIGMOD, 1999. 2002-02-10
"Answering Queries Using Views: A Survey," by Halevy, VLDB Journal, 2001. 2002-02-12
7 "Semantic Data Caching and Replacement," by Dar et al., VLDB, 1996. 2002-02-17
"Loading a Cache with Query Results," by Haas et al., VLDB, 1999. 2002-02-17
8 "WebView Materialization," by Labrinidis and Roussopoulos, SIGMOD, 2000. Not due
"Update Propagation Strategies for Improving the Quality of Data on the Web," by Labrinidis and Roussopoulos, VLDB, 2001. Not due
9 "A Publishing System for Efficiently Creating Dynamic Web Content," by Challenger et al., INFOCOM, 2000. 2002-03-03
"Caching Strategies for Data-Intensive Web Sites," by Yagoub et al., VLDB, 2000. 2002-03-03
11 "Relational Databases for Querying XML Documents: Limitations and Opportunities," by Shanmugasundaram et al., VLDB, 1999. 2002-03-20
"Query Optimization for XML," by McHugh and Widom, VLDB, 1999. 2002-03-20
12 "Index Structures for Path Expressions," by Milo and Suciu, ICDT, 1997. 2002-03-24
"A Fast Index for Semistructured Data," by Cooper et al., VLDB, 2001. 2002-03-24
13 "View Maintenance for Hierarchical Semistructured Data," by Liefke and Davidson, DAWAK, 2000. 2002-03-31
"Efficient Evaluation of XML Middle-ware Queries," by Fernandez et al., SIGMOD, 2001. 2002-03-31
14 "Fast Algorithms for Mining Association Rules," by Agrawal and Srikant, VLDB, 1994. Not due
"Mining Frequent Patterns without Candidate Generation," by Han et al., SIGMOD, 2000. 2002-04-07
15 "Online Association Rule Mining," by Hidber, SIGMOD, 1999. 2002-04-14
"Discovering Typical Structures of Documents: A Road Map Approach," by Wang and Liu, SIGIR, 1998. 2002-04-14


Course Project

If you are taking CPS 296.1 as a regular student, the course project constitutes 60% of your total grade. The important project milestones are listed below. Please see the project description for details.

Milestone: Date
Proposal meetings: By Thursday, 2002-02-28
Proposal due: Friday, 2002-03-01
Progress meetings: Thursday, 2002-04-04 to Thursday, 2002-04-11
Final presentation: Thursday, 2002-05-02
Final report due: Thursday, 2002-05-02

Final reports:

Group: Project
Anagha Gupte and Rahul Lakhotia: Study and Evaluation of Document Comparing Mechanisms
Andy Huang and Qiang Xue: Exploring Implicit Relationships In a Relational Database
Sara Sprenkle: An Architecture for Scaling Database-backed Web Applications
Dazhi Wang and Junyi Xie: Batch Mode Update For View Maintenance Over Semi-structured Data
Zhihui Wang and Ke Yi: Workload Aware B+-Trees


Lecture Notes

Week Date Topic Slides
1 2002-01-10 Introduction and review of basic concepts PDF
2 2002-01-15 Web search: ranking Web pages PDF
2002-01-17 Web search: indexing Web pages PDF
3 2002-01-22 Web search: crawling the Web PDF
2002-01-24 Integrating Web and database searches: rank aggregation PDF
4 2002-01-29 Integrating Web and database searches: proximity search and WSQ PDF
2002-01-31 Views: incremental maintenance PDF
5 2002-02-05 Views: practical incremental maintenance PDF
2002-02-07 Views: self maintenance PDF
6 2002-02-12 Views: selecting views to materialize PDF
2002-02-14 Views: answering queries using views PDF
7 2002-02-19 Views: answering queries using views / Datalog primer PDF
2002-02-21 Views: answering queries using views PDF
8 2002-02-26 Caching: query caching PDF
2002-02-28 Caching: query caching for Web PDF
9 2002-03-05 Caching dynamic Web content PDF, PDF
2002-03-07 XML primer PDF
11 2002-03-19 XML primer PDF
2002-03-21 XML storage PDF
12 2002-03-26 XML query processing PDF
2002-03-28 XML indexing PDF
13 2002-04-02 XML indexing PDF
2002-04-04 XML views PDF
14 2002-04-09 XML publishing PDF
2002-04-11 Data mining PDF, PDF
15 2002-04-16 Data mining PDF, PDF
2002-04-18 Data mining PDF


Tentative Schedule of Student Presentations

If you are taking CPS 296.1 as a regular student, you need to present at least one research paper in class. Depending on the material, the presentation can take anywhere from 30 minutes to one entire lecture. Below is the list of papers to be presented by students. Please note that the schedule is still tentative at this point.

Tentative Date Paper Presenter
Week 9 (03-05) Yagoub et al. "Caching Strategies for Data-Intensive Web Sites." VLDB, 2000. Sara Sprenkle
Week 9 (03-05) Challenger et al. "A Publishing System for Efficiently Creating Dynamic Web Content." INFOCOM, 2000. Anagha Gupte
Week 12 (03-26) McHugh and Widom. "Query Optimization for XML." VLDB, 1999. Dazhi Wang
Week 12 (03-28) Cooper et al. "A Fast Index for Semistructured Data." VLDB, 2001. Zhihui Wang
Week 13 (04-04) Liefke and Davidson. "View Maintenance for Hierarchical Semistructured Data." DAWAK, 2000. Junyi Xie
Week 14 (04-11) Han et al. "Mining Frequent Patterns without Candidate Generation." SIGMOD, 2000. Ke Yi
Week 15 (04-16) Hidber. "Online Association Rule Mining." SIGMOD, 1999. Andy Huang