Publications
On Starfish's Vision
- H. Herodotou, H. Lim, G. Luo, N. Borisov, L. Dong, F. B. Cetin, and S. Babu.
Starfish: A Self-tuning System for Big Data Analytics
In Proc. of the 5th Conference on Innovative Data Systems Research (CIDR '11), January 2011
{ Paper | Presentation | Poster }
On Optimizing MapReduce Programs / Hadoop Jobs
- H. Lim, H. Herodotou, and S. Babu.
Stubby: A Transformation-based Optimizer for MapReduce Workflows.
In Proc. of the 38th International Conference on Very Large Data Bases (VLDB '12), August 2012
{ Paper | Technical Report | Presentation }
- H. Herodotou and S. Babu.
Profiling, What-if Analysis, and Cost-based Optimization of MapReduce Programs.
In Proc. of the 37th International Conference on Very Large Data Bases (VLDB '11), August 2011
{ Paper | Presentation }
- H. Herodotou, F. Dong, and S. Babu.
MapReduce Programming and Cost-based Optimization? Crossing this Chasm with Starfish.
Demonstration at the 37th International Conference on Very Large Data Bases (VLDB '11), August 2011
{ Paper | Poster }
- H. Herodotou.
Hadoop Performance Models
Technical Report CS-2011-05, Duke University, February 2011
{ Paper }
- S. Babu.
Towards Automatic Optimization of MapReduce Programs
In Proc. of the ACM Symposium on Cloud Computing 2010 (SOCC '10), June 2010
{ Paper | Presentation (PDF) | Presentation (PPT) }
On Automatic Cluster Sizing on the Cloud
- H. Herodotou, F. Dong, and S. Babu.
No One (Cluster) Size Fits All: Automatic Cluster Sizing for Data-intensive Analytics
In Proc. of the ACM Symposium on Cloud Computing 2011 (SOCC '11), October 2011
{ Paper | Presentation }
On Elastic Storage
- H. Lim, S. Babu, and J. Chase.
Automated Control for Elastic Storage
In Proc. of the Intl. Conference on Autonomic Computing (ICAC '10), June 2010
{ Paper | Presentation }
On Partitioned Join Processing
- H. Herodotou, N. Borisov, and S. Babu.
Query Optimization Techniques for Partitioned Tables
In Proc. of the 2011 ACM International Conference on Management of Data (SIGMOD '11), June 2011
{ Paper | Presentation | Poster }
Relevant Past Work
On Efficient Cost Modeling
- P. Shivam, V. Marupadi, J. Chase, and S. Babu.
Cutting Corners: Workbench Automation for Server Benchmarking
In Proc. of the 2008 USENIX Annual Technical Conference, June 2008
- P. Shivam, S. Babu, and J. Chase.
Active and Accelerated Learning of Cost Models for Optimizing Scientific Applications
In Proc. of the 32nd International Conference on Very Large Data Bases (VLDB '06), September 2006
On Robust and Adaptive Query Processing
- S. Babu, P. Bizarro, and D. DeWitt.
Proactive Re-optimization
In Proc. of the 2005 ACM Intl. Conf. on Management of Data (SIGMOD '05), June 2005
The Rio system described in this paper was demonstrated at SIGMOD 2005, June 2005
- S. Babu and P. Bizarro.
Adaptive Query Processing in the Looking Glass
In Proc. of the Second Biennial Conference on Innovative Data Systems Research (CIDR '05), January 2005