CPS 590:
Rethinking Networking Paradigms for Cloud Computing and Big Data Analytics:
Infrastructure for Big Data

Fall '13: Course Home Page

Overview, Syllabus, Structure

In this class, we will explore the design principles for architecting software-defined clouds for big data analytics. Of particular importance will be the implications of various design choices for latency, both between applications within the cloud and between externally facing services and the users they serve. The goal is to touch upon the relevant dimensions of the design space, ranging from networking, storage, virtualization, and big data application frameworks to security and reliability.

Toward this goal, the class will cover key topics in big data analytics frameworks, Cloud systems, networking, and security, such as the architecture of various Cloud computing frameworks; big data workload characteristics; popular and emerging storage paradigms; the internals of data center networks; the promise of, and challenges in, Software Defined Networking; and state-of-the-art schemes for Cloud security and fault tolerance. The hope is to extract key lessons for designing infrastructure for big data at various points during the course.

Emphasis: The course is somewhat networks-oriented in that we will cover both network abstractions and related software systems and "lower-level" issues such as hardware and the impact of protocols. For other aspects, e.g., big data applications, we will mainly discuss abstractions and related software design and implementation issues. Future versions of this class may place more emphasis on aspects other than networking.

Note that the list of topics covered is, of course, not complete; e.g., it does not include core virtualization technologies and Cloud programming languages, both of which are central to software-defined clouds. These may be covered in detail in future special-topics classes.

Readings: The course will be based on reading research papers. See the reading list here.

Project: While the readings will cover the "theory" behind infrastructure for big data analytics, a course project spanning 5-7 weeks will help students explore the "practical" side.



Admin Details

Course prerequisites: The prerequisites for this course are CS 114 and CS 214, or equivalent undergraduate courses. Both graduate and undergraduate students are welcome to take this class. Feel free to talk to me first if you feel you may not be able to "handle" it.

Text: There is no required text for this course. The lectures will be based on discussions of research papers. The entire paper reading list is available here.

Grading: The course project carries 40% of the grade. The final will count for 40% of the grade. Participation in class and in HotCRP reviews counts for 20% of the grade.

Class Time: MW 3:05PM to 4:30PM

Location: LSRC D106.

Instructor: Theophilus Benson
Email: tbenson@cs.duke.edu
Office: LSRC D342
Office Hours: 1:00pm-2:00pm, Monday and Wednesday. Also by appointment.