Understanding Deflation Process in Over-parametrized Tensor Decomposition

Algorithms Seminar
Speaker
Xiang Wang
Location
The talk will be virtual on Zoom. The Zoom link will be emailed to CS faculty and grad students; to request it, contact Tatiana Phillips (tatiana.phillips at duke.edu).

In this paper we study the training dynamics of gradient flow on over-parametrized tensor decomposition problems. Empirically, such a training process often fits larger components first and then discovers smaller components, mirroring the tensor deflation process commonly used in tensor decomposition algorithms. We prove that for orthogonally decomposable tensors, a slightly modified version of gradient flow follows a tensor deflation process and recovers all the tensor components. Our proof suggests that for orthogonal tensors, the gradient flow dynamics behaves similarly to greedy low-rank learning in the matrix setting; this is a first step towards understanding the implicit regularization effect of over-parametrized models for low-rank tensors.
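For readers unfamiliar with it, the classical tensor deflation process the abstract refers to can be sketched as follows. This is a minimal illustrative implementation (the function names `rank1_power_iteration` and `deflate` are my own, and this is the textbook algorithm, not the paper's modified gradient flow): repeatedly extract a dominant rank-1 component of a symmetric orthogonally decomposable 3-tensor via tensor power iteration, subtract it, and recurse on the residual.

```python
import numpy as np

def rank1_power_iteration(T, iters=200, seed=0):
    # Approximate a rank-1 component lam * (u ⊗ u ⊗ u) of a symmetric
    # 3-tensor T via tensor power iteration: u <- T(·, u, u) / ||T(·, u, u)||.
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(iters):
        u = np.einsum('ijk,j,k->i', T, u, u)
        u /= np.linalg.norm(u)
    lam = np.einsum('ijk,i,j,k->', T, u, u, u)
    return lam, u

def deflate(T, rank):
    # Classical deflation: extract a component, subtract it, repeat.
    # With a random start, power iteration typically locks onto a larger
    # component before smaller ones.
    components = []
    residual = T.copy()
    for _ in range(rank):
        lam, u = rank1_power_iteration(residual)
        components.append((lam, u))
        residual -= lam * np.einsum('i,j,k->ijk', u, u, u)
    return components

# Build an orthogonally decomposable 3-tensor with weights 3.0 and 1.0.
n, weights = 4, [3.0, 1.0]
Q = np.linalg.qr(np.random.default_rng(1).standard_normal((n, n)))[0]
T = sum(w * np.einsum('i,j,k->ijk', Q[:, r], Q[:, r], Q[:, r])
        for r, w in enumerate(weights))
comps = deflate(T, rank=2)
```

The paper's point, as the abstract states, is that (a slightly modified) gradient flow on an over-parametrized objective implicitly performs a process like this, rather than it being run explicitly.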

Short Biography

Xiang Wang is a fifth-year Ph.D. student in the Computer Science Department at Duke University, advised by Prof. Rong Ge. He is broadly interested in deep learning theory. He completed his undergraduate studies at Shanghai Jiao Tong University (SJTU).