Instructor: Rong Ge, D226 LSRC
Email: rongge@cs.duke.edu
Lectures: Tuesdays and Thursdays 4:40-5:55 pm in LSRC D106
Textbook: Ankur Moitra's monograph. Lecture notes for new topics will be provided.
Prerequisites: Familiarity with algorithms, probability and linear algebra.
Evaluation: The course grades will be determined based on:
Date | Contents | References | Future Reading |
8/27 | Intro: Machine Learning Problems | Slides | |
Nonnegative Matrix Factorization and Topic Models
9/1 | Matrix Factorizations, Nonnegative rank | Section 2.1 in textbook; Short Slides / illustrative example | Section 2.2 in textbook |
9/3 | New algorithms for separable NMF | Section 2.3 in textbook | [3] |
9/8 | Topic Models | Section 2.4 in textbook | [4] |
9/10 | Implementing the Topic Modeling Algorithm | [5] Slides | See links in the slides |
[1] D. Lee and S. Seung. Learning the Parts of Objects by Nonnegative Matrix Factorization, Nature 1999.
[2] S. Vavasis. On the Complexity of Nonnegative Matrix Factorization, SIOPT 2009.
[3] S. Arora, R. Ge, R. Kannan and A. Moitra. Computing a Nonnegative Matrix Factorization -- Provably, STOC 2012.
[4] S. Arora, R. Ge and A. Moitra. Learning Topic Models -- Going Beyond SVD, FOCS 2012.
[5] S. Arora et al. A Practical Algorithm for Topic Modeling with Provable Guarantees, ICML 2013.
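The multiplicative update rule of Lee and Seung [1] is the classical heuristic that the provable algorithms above are measured against. Below is a minimal numpy sketch; the rank-2 test matrix and iteration count are illustrative choices, and unlike the separable-NMF algorithm of [3] this method carries no global guarantee:

```python
import numpy as np

def nmf_multiplicative(M, r, iters=500, seed=0):
    """Approximate a nonnegative n x m matrix M as A @ W (A: n x r, W: r x m)
    with the Lee-Seung multiplicative updates, which keep all entries >= 0."""
    rng = np.random.default_rng(seed)
    n, m = M.shape
    A = rng.random((n, r)) + 0.1   # strictly positive initialization
    W = rng.random((r, m)) + 0.1
    eps = 1e-10                    # guard against division by zero
    for _ in range(iters):
        W *= (A.T @ M) / (A.T @ A @ W + eps)
        A *= (M @ W.T) / (A @ W @ W.T + eps)
    return A, W

# Usage: factor a small matrix with an exact rank-2 nonnegative factorization.
ground_A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
ground_W = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]])
M = ground_A @ ground_W
A, W = nmf_multiplicative(M, r=2)
print(np.linalg.norm(M - A @ W))   # reconstruction error shrinks toward 0
```

The updates monotonically decrease the Frobenius objective but can converge to a local optimum; that gap is exactly what motivates the separability assumption in [3].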
Spectral Clustering
9/15 | Stochastic block model, McSherry's algorithm | [1] | |
9/17 | Provable guarantee for Lloyd's algorithm | [2] | [2], especially Section 6; [3] (semi-random models) |
[1] F. McSherry. Spectral Partitioning of Random Graphs, FOCS 2001.
[2] A. Kumar, R. Kannan. Clustering with spectral norm and the k-means algorithm, FOCS 2010.
[3] U. Feige, J. Kilian. Heuristics for finding large independent sets, with applications to coloring semi-random graphs, FOCS 1998.
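Spectral partitioning in the style of [1] fits in a few lines of numpy: sample a two-community stochastic block model and split vertices by the sign of the second eigenvector of the adjacency matrix. The parameters `n`, `p`, `q` below are illustrative, chosen well inside the regime where recovery succeeds:

```python
import numpy as np

# Two-block stochastic block model: within-community edge probability p,
# cross-community probability q, with p > q.
rng = np.random.default_rng(0)
n, p, q = 200, 0.7, 0.1
labels = np.array([0] * (n // 2) + [1] * (n // 2))
P = np.where(labels[:, None] == labels[None, :], p, q)
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                     # symmetric adjacency matrix, no self-loops

# The top eigenvector tracks average degree; for two blocks, the sign
# pattern of the second eigenvector recovers the planted partition.
vals, vecs = np.linalg.eigh(A)  # eigenvalues in ascending order
v2 = vecs[:, -2]
guess = (v2 > 0).astype(int)

# Accuracy up to relabeling the two clusters (eigenvector sign is arbitrary).
acc = max(np.mean(guess == labels), np.mean(guess != labels))
print(acc)
```

For k > 2 blocks one projects onto the top-k eigenvectors and runs k-means on the rows, which is where the Lloyd's-algorithm guarantee of [2] enters.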
Tensor Decompositions
9/22 | Tensor basics | Section 3.1 in textbook | |
9/24 | Tensor Decompositions | Section 3.1 in textbook | Section 3.2 in textbook |
9/29 | Power Method | [4] | [4] |
10/1 | Applications: topic models, mixture of Gaussians | Section 3.5 in textbook | [3] |
10/6 | HMMs, Phylogenetic tree reconstruction, stochastic block model | Sections 3.3, 3.4 in textbook | [2] |
10/8 | Independent Component Analysis, sample projects | Section 3.6 in textbook | [5] |
[1] C. Hillar and L. Lim. Most Tensor Problems are NP-hard, JACM 2013.
[2] E. Mossel and S. Roch. Learning Nonsingular Phylogenies and Hidden Markov Models, STOC 2005.
[3] A. Anandkumar, D. Foster, D. Hsu, S. Kakade and Y. Liu. A Spectral Algorithm for Latent Dirichlet Allocation, NIPS 2012.
[4] A. Anandkumar, R. Ge, D. Hsu, S. Kakade and M. Telgarsky. Tensor decompositions for learning latent variable models, JMLR 2014.
[5] N. Goyal, S. Vempala and Y. Xiao. Fourier PCA, STOC 2014.
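The tensor power method analyzed in [4] generalizes matrix power iteration: contract the tensor against the current vector in two of its three modes, then normalize. Here is a minimal numpy sketch on a hand-built orthogonally decomposable tensor (the components and weights are made up for illustration):

```python
import numpy as np

def tensor_apply(T, x):
    """Contract a symmetric 3-tensor with x in two modes: T(I, x, x)."""
    return np.einsum('ijk,j,k->i', T, x, x)

def tensor_power_method(T, iters=100, seed=0):
    """Power iteration for an orthogonally decomposable tensor
    T = sum_i lam_i v_i (x) v_i (x) v_i; converges to one (v_i, lam_i)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(T.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(iters):
        x = tensor_apply(T, x)
        x /= np.linalg.norm(x)
    lam = np.einsum('ijk,i,j,k->', T, x, x, x)   # eigenvalue estimate T(x, x, x)
    return lam, x

# Usage: T = 2*e1(x)e1(x)e1 + 1*e2(x)e2(x)e2, then recover a component.
e1, e2 = np.eye(3)[0], np.eye(3)[1]
T = 2 * np.einsum('i,j,k->ijk', e1, e1, e1) + np.einsum('i,j,k->ijk', e2, e2, e2)
lam, v = tensor_power_method(T)
print(lam, np.abs(v))   # lam is one of the component weights (2 or 1)
```

Which component is found depends on the random initialization; in practice one deflates (subtracts `lam * v (x) v (x) v`) and repeats to recover all components, as in [4].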
(Non-convex) Optimization
10/13 | Fall break | | |
10/15 | Basic optimization techniques | [1] Chapter 2 | |
10/20 | Strong convexity, approximate gradient descent | [3] Section 2 | [2] |
10/22 | Approximate gradient descent for sparse coding | [3] Section 4 | [3] |
10/27 | Second-order optimization, cubic regularization | [4] | [4], Section 5 |
10/29 | Saddle point problem, stochastic gradient | [5] | |
[1] Y. Nesterov. Introductory Lectures on Convex Optimization.
[2] A. Rakhlin, O. Shamir and K. Sridharan. Making gradient descent optimal for strongly convex stochastic optimization, ICML 2012.
[3] S. Arora, R. Ge, T. Ma and A. Moitra. Simple, Efficient and Neural Algorithms for Sparse Coding, COLT 2015.
[4] Y. Nesterov and B.T. Polyak. Cubic regularization of Newton method and its global performance, Mathematical Programming 2006.
[5] R. Ge, F. Huang, C. Jin and Y. Yuan. Escaping from Saddle Points -- Online Stochastic Gradient for Tensor Decomposition, COLT 2015.
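The saddle-point phenomenon studied in [5] can be illustrated on a toy function with a strict saddle at the origin: plain gradient descent initialized exactly there never moves, while adding isotropic noise to each step lets the iterate fall into one of the two minima. The test function, step size, and noise level below are illustrative choices, not the parameters from the paper:

```python
import numpy as np

def grad(w):
    # f(x, y) = x^4/4 - x^2/2 + y^2/2: strict saddle at (0, 0),
    # local (and global) minima at (+1, 0) and (-1, 0).
    x, y = w
    return np.array([x**3 - x, y])

def noisy_gd(w0, eta=0.1, sigma=0.1, steps=500, seed=0):
    """Perturbed gradient descent: the injected noise pushes the iterate
    off the saddle, after which the negative-curvature direction takes over."""
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        w = w - eta * (grad(w) + sigma * rng.standard_normal(2))
    return w

w = noisy_gd([0.0, 0.0])   # start exactly at the saddle point
print(w)                   # ends near one of the minima (+-1, 0)
```

In [5] the noise comes for free from the stochastic gradient; the analysis shows polynomially many such steps suffice to reach a local minimum for strict-saddle functions.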
Matrix Completion
11/3 | Matrix Completion, nuclear norm, Rademacher complexity | [1] | |
11/5 | Rademacher complexity (continued), matrix concentrations | [1] | [6] |
11/10 | Matrix Completion via Convex Optimization | [2], Section 7 in textbook | |
11/12 | Matrix Completion via Convex Optimization (continued) | [2], Section 7 in textbook | [3] |
11/17 | Matrix Completion via Alternating Minimization | [5] | |
11/19 | Matrix Completion via Alternating Minimization (continued) | [5] | [4] |
[1] N. Srebro and A. Shraibman. Rank, Trace-Norm and Max-Norm, COLT 2005.
[2] E. Candes and B. Recht. Exact Matrix Completion via Convex Optimization, FOCM 2009.
[3] V. Chandrasekaran, P. Parrilo, B. Recht and A. Willsky. The Convex Geometry of Linear Inverse Problems, FOCM 2012.
[4] P. Jain, P. Netrapalli and S. Sanghavi. Low-rank Matrix Completion using Alternating Minimization, STOC 2013.
[5] M. Hardt. Understanding Alternating Minimization for Matrix Completion, FOCS 2014.
[6] J. Tropp. User-friendly tail bounds for sums of random matrices, FOCM 2012.
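Alternating minimization for matrix completion ([4], [5]) alternates two least-squares solves: fix the column factor V and fit each row of U against its observed entries, then swap the roles. A numpy sketch on a synthetic rank-2 matrix with roughly half the entries observed (matrix size, rank, and sampling rate are illustrative; the cited analyses also require a careful spectral initialization that this sketch replaces with a random one):

```python
import numpy as np

def altmin_complete(M_obs, mask, r, iters=50, seed=0):
    """Alternating least squares: M_obs holds observed entries (zeros
    elsewhere), mask is a boolean matrix marking which entries are seen."""
    rng = np.random.default_rng(seed)
    n, m = M_obs.shape
    U = rng.standard_normal((n, r))
    V = rng.standard_normal((m, r))
    for _ in range(iters):
        for i in range(n):                  # fit row i of U on observed entries
            idx = mask[i]
            U[i] = np.linalg.lstsq(V[idx], M_obs[i, idx], rcond=None)[0]
        for j in range(m):                  # fit row j of V symmetrically
            idx = mask[:, j]
            V[j] = np.linalg.lstsq(U[idx], M_obs[idx, j], rcond=None)[0]
    return U @ V.T

# Usage: complete a random rank-2 matrix from about 50% of its entries.
rng = np.random.default_rng(1)
M = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
mask = rng.random((30, 30)) < 0.5
M_hat = altmin_complete(M * mask, mask, r=2)
print(np.linalg.norm(M_hat - M) / np.linalg.norm(M))   # small relative error
```

Each inner solve is an ordinary least-squares problem, which is why the method is fast in practice even though the joint objective over (U, V) is non-convex, connecting this unit back to the optimization material above.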
11/22 | Summary: what we learned and what we didn't cover | Slides | |