Volume 79, Numbers 1-2, 123-149, DOI: 10.1007/s10994-009-5148-0

Multi-domain learning by confidence-weighted parameter combination

Mark Dredze, Alex Kulesza and Koby Crammer

From the issue entitled "Special Issue on Learning from Multiple Sources; Guest Editors: Nicolò Cesa-Bianchi, David R. Hardoon, and Gayle Leen"

Abstract

State-of-the-art statistical NLP systems for a variety of tasks learn from labeled training data that is often domain specific. However, there may be multiple domains or sources of interest on which the system must perform. For example, a spam filtering system must give high-quality predictions for many users, each of whom receives emails from different sources and may make slightly different decisions about what is or is not spam. Rather than learning separate models for each domain, we explore systems that learn across multiple domains. We develop a new multi-domain online learning framework based on parameter combination from multiple classifiers. Our algorithms draw from multi-task learning and domain adaptation to adapt multiple source-domain classifiers to a new target domain, learn across multiple similar domains, and learn across a large number of disparate domains. We evaluate our algorithms on two popular NLP domain adaptation tasks: sentiment classification and spam filtering.

Keywords  Online learning - Domain adaptation - Classifier combination - Transfer learning - Multi-task learning

Editors: Nicolò Cesa-Bianchi, David R. Hardoon, and Gayle Leen.
Preliminary versions of the work contained in this article appeared in the proceedings of the conference on Empirical Methods in Natural Language Processing (Dredze and Crammer 2008).
K. Crammer is a Horev Fellow, supported by the Taub Foundations.
