Scalable Learning Over Distributions
A great deal of attention has been devoted to studying new and better ways to perform learning tasks involving static finite vectors. Indeed, over the past century the fields of statistics and machine learning have amassed a vast understanding of learning tasks such as clustering, classification, and regression using simple real-valued vectors. However, we do not live in a world of simple objects. From the contact lists we keep, to the sound waves we hear, to the distributions of cells in our bodies, complex objects such as sets, distributions, sequences, and functions are all around us. Furthermore, with ever-increasing data collection capacities at our disposal, we are not only collecting more data; richer and more complex data are becoming the norm.
In this presentation we analyze regression problems where input covariates, and possibly output responses, are probability distribution functions from a nonparametric function class. Such problems cover a large range of interesting applications including learning the dynamics of cosmological particles and general tasks like parameter estimation.
Previous nonparametric estimators for functional regression problems, however, scale poorly in computational cost as the number of input/output pairs in a dataset grows. This is a serious limitation because, given the complexity of distributional data, it may be necessary to consider large datasets in order to achieve a low estimation risk.
To address this issue, we present two novel scalable nonparametric estimators: the Double-Basis Estimator (2BE) for distribution-to-real regression problems, and the Triple-Basis Estimator (3BE) for distribution-to-distribution regression problems. Both the 2BE and 3BE can scale to massive datasets. We show an improvement of several orders of magnitude in prediction speed, along with a reduction in error, over previous estimators on various synthetic and real-world datasets.
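The abstract does not spell out how the estimators are constructed. Purely as an illustration, here is a minimal sketch of one way a distribution-to-real regressor can be made to predict at a cost that does not grow with the number of training pairs. It assumes densities supported on [0, 1], a cosine basis for summarizing each sample set, random Fourier features applied to the resulting coefficient vectors, and ridge regression on top. These choices, all function names, the hyperparameters, and the synthetic Beta data are assumptions made for this example and are not taken from the talk.

```python
import numpy as np

def cosine_coeffs(sample, n_basis):
    """Estimate the first n_basis cosine-basis coefficients of a density on [0, 1]
    from an i.i.d. sample, using empirical means of the basis functions."""
    k = np.arange(n_basis)
    phi = np.sqrt(2.0) * np.cos(np.pi * k[None, :] * sample[:, None])
    phi[:, 0] = 1.0  # the constant basis function
    return phi.mean(axis=0)

def random_features(A, W, b):
    """Random Fourier features approximating a Gaussian kernel on the rows of A."""
    n_feats = W.shape[1]
    return np.sqrt(2.0 / n_feats) * np.cos(A @ W + b)

def fit_ridge(Z, y, lam):
    """Solve the ridge-regression normal equations for the feature weights."""
    return np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ y)

rng = np.random.default_rng(0)
n_train, n_test, n_samples = 500, 100, 200          # hypothetical sizes
n_basis, n_feats, bandwidth, lam = 5, 200, 1.0, 1e-2  # hypothetical settings

def make_data(n):
    """Synthetic task: each input is a sample from a Beta density,
    and the response is that density's mean."""
    alphas = rng.uniform(1.0, 5.0, size=n)
    betas = rng.uniform(1.0, 5.0, size=n)
    samples = [rng.beta(a, b, size=n_samples) for a, b in zip(alphas, betas)]
    return samples, alphas / (alphas + betas)

train_samples, y_train = make_data(n_train)
test_samples, y_test = make_data(n_test)

# First basis: summarize every sample set by a short coefficient vector.
A_train = np.stack([cosine_coeffs(s, n_basis) for s in train_samples])
A_test = np.stack([cosine_coeffs(s, n_basis) for s in test_samples])

# Second basis: random Fourier features over the coefficient vectors.
W = rng.normal(scale=1.0 / bandwidth, size=(n_basis, n_feats))
b = rng.uniform(0.0, 2.0 * np.pi, size=n_feats)
Z_train = random_features(A_train, W, b)
Z_test = random_features(A_test, W, b)

# Linear model in feature space: prediction is one dot product per test input.
weights = fit_ridge(Z_train, y_train, lam)
pred = Z_test @ weights

mse = float(np.mean((pred - y_test) ** 2))
baseline = float(np.mean((y_test - y_train.mean()) ** 2))  # predict-the-mean error
print(f"test MSE: {mse:.5f}  baseline MSE: {baseline:.5f}")
```

In this sketch, predicting for a new distribution only requires computing its coefficient vector and one product with a fixed-length weight vector, whereas a kernel smoother would compare the new input against every training distribution.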
Junier Oliva is a Ph.D. candidate in the Machine Learning Department at the School of Computer Science, Carnegie Mellon University. His main research interest is to build algorithms that understand data at an aggregate, holistic level. Currently, he is working to push machine learning past the realm of operating over static finite vectors, and to start reasoning ubiquitously with complex, dynamic collections like sets and sequences. Moreover, he is interested in exporting concepts from learning on distributional and functional inputs to modern techniques in deep learning, and vice versa. He is also developing methods for analyzing massive datasets, both in terms of instances and covariates. Prior to beginning his Ph.D. program, he received his B.S. and M.S. in Computer Science from Carnegie Mellon University. He also spent a year as a software engineer for Yahoo!, and a summer as a machine learning intern at Uber ATG.