Rethinking Text Generation Models and How to Train Them
Much of the recent empirical success in natural language processing has relied on the use of neural networks to compute expressive, global representations of sentences and documents. However, incorporating such representations into systems that produce outputs with combinatorial structure, such as text, may require rethinking both our models and how we train them. In terms of training, I will argue for training text generation models to search for optimal outputs, which addresses some of the shortcomings of standard maximum-likelihood-based training. In terms of modeling, I will argue that standard text generation models are difficult to interpret and control, and I will suggest a model that automatically induces discrete, template-like objects, which can be used to control and interpret generation.
Sam Wiseman is a research assistant professor at TTIC. He obtained his PhD in Computer Science from Harvard University in 2018; during his PhD he also spent two summers at Facebook AI Research. Sam is broadly interested in deep learning approaches to structured prediction for natural language processing, and in structured approaches to text generation in particular. He received an honorable mention for best paper at EMNLP 2016, and is a Siebel Scholar.