Diagnosing Performance Stragglers and Dynamically Allocating Executors for Spark
zyzhang at cs.duke.edu
Thursday, April 13, 2017
2:00pm - 3:00pm
D344 LSRC, Duke
Spark is a useful framework for running machine learning and graph applications. However, its performance is hard to manage and improve. In this paper, we present studies on profiling and resource management for Spark. First, we implement a framework that profiles both hardware and software metrics at task granularity as Spark runs its applications. We use correlation analysis and elastic net regression to provide insight into the root causes of straggling Spark tasks. Second, we implement a scheduler that dynamically manages executors using a max-min allocation policy. The resulting system gives users better performance and makes them more willing to share their executor resources.
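For readers unfamiliar with max-min allocation, the following is a minimal sketch of the general idea, not the scheduler described in this talk. It assumes each user states an integer executor demand and that a fixed pool of executors is divided by progressive filling; the function name `max_min_allocate` and the example users are hypothetical.

```python
def max_min_allocate(demands, capacity):
    """Divide `capacity` executors among users by max-min fairness.

    `demands` maps a user name to the number of executors requested.
    This is an illustrative sketch, not the scheduler from the talk.
    """
    alloc = {user: 0 for user in demands}
    remaining = capacity
    # Users whose demand is not yet met, in a fixed order for determinism.
    active = sorted(user for user, d in demands.items() if d > 0)

    while remaining > 0 and active:
        share = remaining // len(active)
        if share == 0:
            # Fewer executors than unsatisfied users: hand out one each
            # until the pool is empty.
            for user in active[:remaining]:
                alloc[user] += 1
            remaining = 0
            break
        still_active = []
        for user in active:
            # Give an equal share, capped at the user's unmet demand.
            give = min(share, demands[user] - alloc[user])
            alloc[user] += give
            remaining -= give
            if alloc[user] < demands[user]:
                still_active.append(user)
        # Executors left unclaimed by satisfied users are redistributed
        # among the remaining users in the next round.
        active = still_active
    return alloc


if __name__ == "__main__":
    print(max_min_allocate({"a": 2, "b": 4, "c": 10}, 12))
    # → {'a': 2, 'b': 4, 'c': 6}
```

In this example, users with small demands are fully satisfied, and the executors they leave unclaimed go to the user with the largest demand, so no user can gain an executor without taking one from a user who already has fewer or the same number.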
Advisor(s): Benjamin Lee
Committee: Jun Yang, Debmalya Panigrahi