Upcoming Colloquia Events

Geometry of Bias Mitigation Techniques for Word Representations

Duke Computer Science Colloquium
Speaker Name
Jeff M. Phillips
Date and Time

Vectorized representations of textual data have revolutionized natural language processing, first with methods like Word2Vec and GloVe and then with contextual variants like BERT and RoBERTa. Similar representations are useful for learning on other structured data types such as images, trajectories, and business transactions. However, because these representations are trained on enormous quantities of real-world data (in the case of language, large amounts of text from the internet), they encode some of the biases present in that data.