Weijia Shi (U of Washington) - Breaking the Language Model Monolith
Abstract: Language models (LMs) are typically monolithic: a single model stores all knowledge and serves every use case. This design presents significant challenges: such models often generate factually incorrect statements, require costly retraining to add or remove information, and raise serious privacy and copyright concerns. In this talk, I will discuss how to break this monolith by introducing modular architectures and training algorithms that separate capabilities across composable components. I'll cover two forms of modularity: (1) external modularity, which augments LMs with external tools such as retrievers to improve factuality and reasoning; and (2) internal modularity, which builds inherently modular LMs from decentrally trained components to enable flexible composition and an unprecedented level of control.
Speaker
Weijia Shi
Weijia Shi is a Ph.D. candidate at the University of Washington, where she is advised by Luke Zettlemoyer and Noah Smith. Her research focuses on developing augmented and modular architectures and training algorithms that make language models more controllable, collaboratively developed, and factual. She received an Outstanding Paper Award at ACL 2024 and was recognized as a Rising Star in Machine Learning in 2023 and in Data Science in 2024.