Lecture Notes in Computer Science, 2006, Volume 4209/2006, 13-24, DOI: 10.1007/11880561_2

TreeBoost.MH: A Boosting Algorithm for Multi-label Hierarchical Text Categorization

Andrea Esuli, Tiziano Fagni and Fabrizio Sebastiani

Abstract

In this paper we propose TreeBoost.MH, an algorithm for multi-label Hierarchical Text Categorization (HTC) that is a hierarchical variant of AdaBoost.MH. TreeBoost.MH embodies several intuitions that had previously arisen within HTC, e.g. that both feature selection and the selection of negative training examples should be performed “locally”, i.e. by paying attention to the topology of the classification scheme. It also embodies the novel intuition that the weight distribution that boosting algorithms update at every boosting round should likewise be updated “locally”. We present the results of experiments with TreeBoost.MH on two HTC benchmarks, and analytically discuss its computational cost.
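
To make the idea of "local" selection of negative training examples concrete, the sketch below shows one common reading of it in HTC: the negatives for a category are the training documents that belong to its parent but not to the category itself. This is an illustrative assumption, not necessarily the exact policy used by TreeBoost.MH; the toy hierarchy, the document labels and the function names are all hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical toy hierarchy: child -> parent (None marks a top-level category).
PARENT: Dict[str, Optional[str]] = {
    "sports": None,
    "science": None,
    "football": "sports",
    "tennis": "sports",
    "physics": "science",
}

# Hypothetical training labels: document id -> categories assigned to it.
DOC_LABELS: Dict[str, Set[str]] = {
    "d1": {"football"},
    "d2": {"tennis"},
    "d3": {"physics"},
    "d4": {"football", "physics"},
}


def is_ancestor_or_self(candidate: str, category: str) -> bool:
    """True if `candidate` is `category` or one of its ancestors."""
    node: Optional[str] = category
    while node is not None:
        if node == candidate:
            return True
        node = PARENT[node]
    return False


def positives(category: str) -> Set[str]:
    """Documents labelled with `category` or with any of its descendants."""
    return {
        doc
        for doc, labels in DOC_LABELS.items()
        if any(is_ancestor_or_self(category, label) for label in labels)
    }


def local_training_set(category: str) -> Tuple[Set[str], Set[str]]:
    """Return (positives, local negatives) for `category`.

    Assumed policy: negatives are drawn only from documents that are
    positive for the parent category (or from all documents at the top level).
    """
    pos = positives(category)
    parent = PARENT[category]
    pool = set(DOC_LABELS) if parent is None else positives(parent)
    return pos, pool - pos
```

Under these assumptions, the classifier for "football" is trained against "tennis" documents only (here, just d2), rather than against every document outside "football"; the physics-only document d3 never reaches that node.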
