# |
Date |
Topic |
Readings |
Sampling |
|||
1 |
Wed, 8/26 |
Reservoir Sampling (slides) |
J. Vitter, “Random Sampling with a Reservoir”, ACM Transactions on Mathematical Software, 1985 |
2 |
Fri, 8/28 |
Sampling from databases (slides) |
N. Dalvi, R. Kumar, A. Machanavajjhala, V. Rastogi,
"Sampling hidden objects using nearest-neighbor oracles",
KDD 2011. Optional: Frank Olken "Random Sampling from Databases", PhD Thesis |
3 |
Wed,9/2 |
Monte Carlo Estimation, DNF counting (slides) | R. Karp, M. Luby, N. Madras, "Monte Carlo Estimation Algorithm for Enumeration Problems", Journal of Algorithms 10(3) 1989 |
4 |
Fri, 9/4 |
Markov Chain Monte Carlo (slides) Assignment 1 posted |
L. Lovasz "Random Walks on Graphs: A Survey", Combinatorics 1993 |
5 |
Wed, 9/9 |
Coupling and Mixing Times (slides) (notes) |
|
Streaming Algorithms | |||
6 |
Fri, 9/11 |
Filtering & Estimating the number of distinct objects (slides) Assignment 1 DUE |
Chapter 4: Section 4.3, 4.4, Ullman & Rajaraman textbook |
7 |
Wed, 9/16 |
Frequency Moments (notes) |
Chapter 3: Section 4.5, Ullman & Rajaraman textbook Alon, Mathias, Szegedy, "The space complexity of approximating the frequency moments", STOC 96 |
8 |
Fri, 9/18 |
k-wise independence & Count min sketch (Andrew McGregor's Slides) Assignment 2 posted |
Cormode, Muthukrishnan, "An improved data stream summary: The count min sketch and its applications" |
9 |
Wed, 9/23 |
Count min sketch and Heavy Hitters |
|
10 |
Fri, 9/25 |
Online Prediction and Decision Making (slides) Assignment 2 DUE Monday 9/28 |
|
Parallel Architechtures |
|||
11 |
Wed, 9/30 |
Map Reduce basics (slides) | Chapter 2: Sections 2.1, 2.2, 2.3, Ullman & Rajaram textbook |
12 |
Fri, 10/2 |
Map Reduce (continued) Locality Sensitive Hashing (slides) Project Proposal is Due |
Chapter 2: Sections 2.1, 2.2, 2.3, Ullman & Rajaram textbook Chapter 3: Ullman & Rajaraman textbook |
13 |
Wed, 10/7 |
Graphs on Map Reduce (slides) | Rastogi, Machanavajjhala, Chitnis, Das Sarma, "Finding Connected Components in Map Reduce in Logarithmic Rounds", ICDE 2013 Kang, Tsourakis, Faloutsos, "Pegasus: a Peta Scale Graph Mining System", ICDM 2009 |
14 |
Fri, 10/9 |
Graph Processing & Iterations (slides) |
Malewicz et al, "Pregel: A system for large scale graph processing", SIGMOD 2010 |
Wed 10/14 |
NO CLASS Assignment 3 posted |
||
15 |
Fri, 10/16 |
Asynchronous Processing |
Low, Gonzales, Kyrola, Bickson, Guestrin, Hellerstein, "Distributed Graphlab", VLDB 2012 Wang, Xie, Demers, Gehrke, "Asynchronous Large Scale Graph Processing Made Easy", CIDR 2013 |
16 |
Wed, 10/21 |
Resilient Distributed Datasets (slides) |
Zaharia et al "Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing", NSDI 2012 |
17 |
Fri, 10/23 |
Feed Following
(slides) |
Silberstein, Terrace, Cooper, Ramakrishnan, "Feeding Frency: Selectively Materializing User's Event Feeds", SIGMOD 2010 Silberstein, Machanavajjhala, Ramakrishnan, "Feed Following", DBSocial 2011 |
Joins | |||
18 |
Wed 10/28 |
Basic Join Algorithms (slides) Assignment 3 DUE |
|
19 |
Fri 10/30 |
Worst Case Optimal Join Algorithms (slides) |
|
20 |
Wed 11/4 |
Record Linkage and Fuzzy Joins Mid-term report DUE |
Chapter 3: Ullman & Rajaraman textbook |
Computing with Error |
|||
21 |
Fri, 11/6 |
Motivation: Differential privacy |
"Calibrating Noise to Sensitivity in Private Date Analysis" C. Dwork, F. McSherry, K. Nissim, A.Smith TCC 2006 "Differential Privacy" C. Dwork, ICALP 2006 |
22 |
Wed, 11/11 |
Answering sets of counting queries |
"Boosting the Accuracy of Differentially Private Histograms Through Consistency", M. Hay, V. Rastogi, G. Miklau, D. Suciu, PVLDB 2010 "Optimizing Linear Counting Queries Under Differential Privacy" C. Li, M. Hay, V. Rastogi, G. Miklau, A. McGregor PODS 2010 |
23 |
Fri, 11/13 |
Computing Histograms
|
"A Data- and Workload- aware Algorithm for Range Queries under Differential privacy", Li, Hay, Miklau, Wang, VLDB 2014 "A simple and practical algorithm for differentially private data release", F. McSherry, K. Ligett, M. Hardt, NIPS 2012 |
24 |
Wed 11/18 |
Sorting under Error | "Computing with Noisy Information", Fiege, Raghavan, Peleg, Upfal, SIAM Journal of Computing 1994 |
25 |
Fri, 11/20 |
Project Presentations |