The course schedule will be posted here.
Week | Date | Topic | Lecture slides and reference |
1 | 08-30 | Introduction and overview | Notes 1: ppt, pdf |
| 09-01 | Introduction to MapReduce and Hadoop | Notes 2: ppt, pdf |
2 | 09-06 | Introduction to MapReduce and Hadoop (contd.) | Notes 2: ppt, pdf |
| 09-08 | Some MapReduce Algorithms | Notes 3: pdf |
3 | 09-13 | Some MapReduce Algorithms (contd.) | Notes 3: pdf |
| 09-15 | How Hadoop Works | Notes 4: ppt, pdf |
4 | 09-20 | How Hadoop Works (contd.) | Notes 4: ppt, pdf |
| 09-22 | Overview of query processing | Notes 5: ppt, pdf |
5 | 09-27 | Pipelining (iterators) and Materialization | Notes 6: ppt, pdf |
| 09-29 | Rule-based optimization | Notes 6: ppt, pdf |
6 | 10-04 | Block-based data storage | Notes 7: ppt, pdf |
| 10-06 | Index-based access | Notes 8: ppt, pdf |
7 | 10-11 | Fall break (no class) | |
| 10-13 | Index-based access (contd.) | Notes 9: ppt, pdf |
8 | 10-18 | Sort processing | Notes 10: ppt, pdf |
| 10-20 | Introduction to Join processing | Notes 10: ppt, pdf |
9 | 10-25 | Sort-merge joins, Block and Index nested-loop joins | Notes 10: ppt, pdf |
| 10-27 | Midterm | |
10 | 11-01 | Introduction to Pig and Pig Latin | Notes 11: ppt, pdf |
| 11-03 | Hash joins | Notes 10: ppt, pdf |
11 | 11-08 | Cost-based Query Optimization | Notes 12: ppt, pdf |
| 11-10 | Failure recovery, Logging | Notes 13: ppt, pdf |
12 | 11-15 | HBase: The Hadoop Database | Chapter 12 in textbook by Tom White |
| 11-17 | Programming and Debugging Large-Scale Data Processing Workflows | Colloquium by Chris Olston, Yahoo! Research |
13 | 11-22 | HBase, Concurrency control, Serializability | Notes 14: ppt, pdf, Exercises |
14 | 11-29 | Scalable key-value stores, Yahoo! Cloud Serving Benchmark (YCSB) | YCSB page |