Gromov-Wasserstein Learning: A New Machine Learning Framework for Structured Data Analysis

Duke Computer Science/Biostatistics & Bioinformatics Colloquium
Speaker Name
Hongteng Xu
Date and Time
Lunch served at 11:45 am.

Many biomedical data types, such as protein-protein interaction (PPI) networks and biological molecules, are structured data represented as graphs, optionally accompanied by node attributes. From a machine learning viewpoint, tasks on these structured data, such as network alignment and molecule analysis, can often be formulated as graph representation, matching, partitioning, and clustering problems. Unfortunately, due to their NP-completeness, we have to rely on heuristic methods to solve these problems in practice, often without theoretical guarantees of stability or rationality.

In this talk, I will introduce Gromov-Wasserstein Learning (GWL), a novel machine learning framework that I proposed as a systematic solution for structured data analysis. First, I will introduce the theoretical foundations of GWL and link it to learning tasks on structured data. Next, I will describe the optimization algorithms in GWL, analyzing their convergence, computational complexity, and scalability in detail. Finally, I will show that GWL unifies graph matching, partitioning, and representation in a single algorithmic framework, which outperforms existing methods on PPI network analysis and on molecule clustering and classification.

Short Biography

Hongteng Xu is a senior research scientist at Infinia ML Inc. He is also a visiting researcher in the Department of Electrical and Computer Engineering at Duke University. In 2018, he was a postdoctoral researcher at Duke University. He obtained his Ph.D. in Electrical and Computer Engineering from the Georgia Institute of Technology in 2017. His research interests are machine learning theory and methodology and their applications to health and biomedical data analysis. Specifically, his work focuses on leveraging optimal transport theory and temporal point process (TPP) theory to develop practical models and learning algorithms.

Hongteng has built two TPP toolboxes, one for educational and one for industrial purposes, which provide practical solutions to disease network modeling, ICU transition prediction, and smoking behavior prediction. Recently, he has integrated his work on Gromov-Wasserstein learning into software packages and applied them to PPI network alignment and molecule clustering. For details of his research, publications, and software, see:

David Page