Towards Efficient Cloud Systems for Data-Intensive Applications
Part of the IBM Back to School Series. This is a continuing series held at IBM where university professors share their knowledge and experience with the IBM technical community. Duke CS is invited to attend as colleagues of the speaker.
Moving data-intensive applications (e.g., deep learning, data analytics, stream processing) into the cloud is one of the most important trends in the industry today. How to run these applications efficiently has become a primary question in cloud system design.
In this talk, I will focus on the communication aspect of data-intensive applications in the cloud. I will cover two projects: (1) Slim, an efficient network stack design for container virtualization. Unlike traditional container networking approaches that rely on packet-based network virtualization, Slim virtualizes the network at a per-connection level, reducing operating system overhead. Slim reduces CPU utilization by 11-66% on popular cloud applications, such as Memcached, Nginx, PostgreSQL, and Apache Kafka. (2) Hoplite, an efficient and fault-tolerant collective communication layer for distributed serverless frameworks. Hoplite computes data transfer schedules on the fly and executes data transfers efficiently using fine-grained pipelining. Hoplite speeds up asynchronous stochastic gradient descent, reinforcement learning, and model serving by up to 7.8x, 3.9x, and 3.3x, respectively.
Danyang Zhuo is an assistant professor in the Computer Science Department at Duke University. His work focuses on building efficient and reliable cloud systems, including operating systems, data center networks, and serverless frameworks. He completed his Ph.D. at the University of Washington, advised by Tom Anderson and Arvind Krishnamurthy. Before joining Duke, he was a postdoctoral scholar at the University of California, Berkeley, advised by Ion Stoica.