Serial Analysis of Gene Expression (SAGE) is a relatively new method for monitoring gene expression levels and is expected
to contribute significantly to progress in cancer treatment by enabling precise and early diagnosis. A promising application
of SAGE gene expression data is classification of tumors. In this paper, we build three event models (the multivariate Bernoulli
model, the multinomial model and the normalized multinomial model) for SAGE data classification. Both binary classification
and multicategory classification are investigated. Experiments on two SAGE datasets show that the multivariate Bernoulli model
performs well with small feature sizes, the normalized multinomial model performs well with medium feature sizes, and the
multinomial model performs better with large feature sizes. The multinomial model achieves the highest overall accuracy.
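
As a rough illustration of how the first two event models differ, the sketch below scores a single SAGE library under each one. It assumes the models are used as class-conditional likelihoods in a naive Bayes style classifier, which the abstract does not state explicitly. The function names, the toy tag counts, and the parameter values are all hypothetical. The normalized multinomial model is left out because its normalization is not defined here.

```python
import math

def bernoulli_log_likelihood(counts, p):
    """Multivariate Bernoulli event model (illustrative sketch).

    Only the presence or absence of each tag is used; p[t] is the
    assumed probability that tag t appears in a library of this class.
    """
    total = 0.0
    for c, p_t in zip(counts, p):
        total += math.log(p_t) if c > 0 else math.log(1.0 - p_t)
    return total

def multinomial_log_likelihood(counts, theta):
    """Multinomial event model (illustrative sketch).

    Tag counts are used directly; theta[t] is the assumed probability of
    drawing tag t. The multinomial coefficient is omitted because it does
    not depend on the class and so does not affect classification.
    """
    return sum(c * math.log(t) for c, t in zip(counts, theta))

# Hypothetical toy library with three tags.
counts = [3, 0, 1]
p = [0.5, 0.25, 0.5]        # assumed per-tag presence probabilities
theta = [0.5, 0.25, 0.25]   # assumed per-tag draw probabilities (sum to 1)

bern = bernoulli_log_likelihood(counts, p)
multi = multinomial_log_likelihood(counts, theta)
```

In a classifier, each class would have its own `p` or `theta` estimated from training libraries, and a library would be assigned to the class with the largest log-likelihood plus log prior. Note that the Bernoulli score is unchanged if a nonzero count is replaced by any other nonzero count, whereas the multinomial score is not.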