Volume 17, Number 1, 39-56, DOI: 10.1007/s10618-008-0107-0

The Boolean column and column-row matrix decompositions

Pauli Miettinen

From the issue entitled "Special issue: Selected Papers from ECML PKDD 2008. Guest editors: Walter Daelemans, Bart Goethals, Katharina Morik"

View Related Documents

Abstract

Matrix decompositions are used for many data mining purposes. One of these purposes is to find a concise but interpretable representation of a given data matrix. Different decomposition formulations have been proposed for this task, many of which assume a certain property of the input data (e.g., nonnegativity) and aim at preserving that property in the decomposition. In this paper we propose new decomposition formulations for binary matrices, namely the Boolean CX and CUR decompositions. They are natural combinations of two previously presented decomposition formulations. We consider also two subproblems of these decompositions and present a rigorous theoretical study of the subproblems. We give algorithms for the decompositions and for the subproblems, and study their performance via extensive experimental evaluation. We show that even simple algorithms can give accurate and intuitive decompositions of real data, thus demonstrating the power and usefulness of the proposed decompositions.

Keywords  Matrix decompositions - Approximation - CX decomposition - CUR decomposition - Boolean decompositions

Responsible editors: Walter Daelemans, Bart Goethals, and Katharina Morik.

Fulltext Preview

Image of the first page of the fulltext document