Rethinking the Role of Optimization in Learning
In this talk, I will give an overview of our recent progress towards understanding how we learn large-capacity machine learning models. In the modern practice of machine learning, especially deep learning, many successful models have far more trainable parameters than training examples. Consequently, the optimization objective for training such models has multiple minimizers that perfectly fit the training data. More problematically, while some of these minimizers generalize well to new examples, most of them simply overfit or memorize the training data and perform poorly on new examples. In practice, though, when such ill-posed objectives are minimized using local search algorithms like (stochastic) gradient descent ((S)GD), the "special" minimizers returned by these algorithms perform remarkably well on new examples. In this talk, we will explore the role of optimization algorithms like (S)GD in learning overparameterized models, in the simpler setting of learning linear predictors.
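As a rough illustration of the linear setting mentioned above (not taken from the talk itself), the sketch below runs plain gradient descent on an underdetermined least-squares problem. It assumes synthetic Gaussian data, squared loss, and initialization at zero; under those assumptions, a standard fact is that gradient descent converges to the minimum L2-norm solution among all the weight vectors that fit the training data exactly. All names and sizes here are hypothetical choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_features = 5, 20  # assumed sizes: far more parameters than examples
X = rng.standard_normal((n_samples, n_features))
y = rng.standard_normal(n_samples)

# Plain gradient descent on 0.5 * ||X w - y||^2, started from w = 0.
w_gd = np.zeros(n_features)
step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / (largest singular value of X)^2
for _ in range(20000):
    w_gd -= step * X.T @ (X @ w_gd - y)

# Minimum L2-norm solution that fits the data exactly, via the pseudoinverse.
w_min_norm = np.linalg.pinv(X) @ y

# A different exact fit: add a direction from the null space of X.
null_dir = rng.standard_normal(n_features)
null_dir -= np.linalg.pinv(X) @ (X @ null_dir)  # remove the row-space component
w_other = w_min_norm + null_dir

train_err_gd = np.linalg.norm(X @ w_gd - y)
train_err_other = np.linalg.norm(X @ w_other - y)
gap = np.linalg.norm(w_gd - w_min_norm)

print("training error (GD):            ", train_err_gd)
print("training error (other solution):", train_err_other)
print("distance from GD to min-norm:   ", gap)
print("norms (min-norm, other):        ",
      np.linalg.norm(w_min_norm), np.linalg.norm(w_other))
```

Both `w_gd` and `w_other` fit the training data, yet gradient descent lands on one particular minimizer. This is the kind of algorithm-dependent selection among many perfect fits that the abstract refers to, shown here only for this simple least-squares case.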
Suriya Gunasekar is a research assistant professor at the Toyota Technological Institute at Chicago (TTIC). Prior to joining TTIC, she completed her PhD at the University of Texas at Austin, advised by Prof. Joydeep Ghosh. Her research interests are broadly driven by statistical, algorithmic, and societal aspects of machine learning, including optimization, high-dimensional learning, and algorithmic fairness.