We propose a mining framework that supports the identification of useful knowledge based on data clustering. With the recent
advances in microarray technologies, we focus on mining gene expression datasets. In particular, given that
genes are often co-expressed under subsets of experimental conditions, we present a novel subspace clustering algorithm. In
contrast to previous approaches, our method is based on the observation that the number of subspace clusters is related to
the number of maximal subspace clusters to which any gene pair can belong. By discretizing gene expression
profiles, we transform the similarity between two genes into a sequence of symbols that represents the maximal subspace cluster
for the gene pair. This domain transformation (from genes into gene-gene relations) allows us to make the number of possible
subspace clusters dependent on the number of genes. Based on the symbolic representations of genes, we present an efficient
subspace clustering algorithm that scales with the number of dimensions. In addition, the running time can be drastically
reduced by using an inverted index and pruning uninteresting subspaces. Experimental results indicate that the proposed
method efficiently identifies subspace clusters of co-expressed genes in a yeast cell cycle dataset.
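
The pairwise transformation described above can be illustrated with a minimal sketch. It is not the algorithm itself, only one plausible reading of it under stated assumptions: the three-symbol discretization ('U', 'D', 'N'), the threshold value, the wildcard '*', the min_dims pruning rule, and all function names are hypothetical choices made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def discretize(profile, threshold=0.5):
    """Map each value to 'U' (up), 'D' (down), or 'N' (no change).

    Assumption: a fixed symmetric threshold; the real scheme may differ.
    """
    return "".join(
        "U" if v > threshold else "D" if v < -threshold else "N"
        for v in profile
    )


def pair_pattern(sym_a, sym_b):
    """Keep the shared symbol where two genes agree, '*' elsewhere.

    The non-wildcard positions form the largest subspace on which the
    pair has identical symbols.
    """
    return "".join(a if a == b else "*" for a, b in zip(sym_a, sym_b))


def build_inverted_index(symbols):
    """Index genes by (condition, symbol) to avoid scanning every gene."""
    index = defaultdict(set)
    for gene, seq in symbols.items():
        for cond, s in enumerate(seq):
            index[(cond, s)].add(gene)
    return index


def genes_matching(pattern, index):
    """Return the genes that match every non-wildcard position."""
    postings = [index[(c, s)] for c, s in enumerate(pattern) if s != "*"]
    return set.intersection(*postings) if postings else set()


def pairwise_clusters(profiles, min_dims=2, threshold=0.5):
    """Collect one candidate cluster per distinct gene-pair pattern."""
    symbols = {g: discretize(p, threshold) for g, p in profiles.items()}
    index = build_inverted_index(symbols)
    clusters = {}
    for a, b in combinations(sorted(symbols), 2):
        pattern = pair_pattern(symbols[a], symbols[b])
        # Prune subspaces with too few agreeing conditions (assumed rule).
        if sum(ch != "*" for ch in pattern) < min_dims:
            continue
        if pattern not in clusters:
            clusters[pattern] = genes_matching(pattern, index)
    return clusters


if __name__ == "__main__":
    # Toy profiles, invented purely for illustration.
    toy = {
        "g1": [1.2, -0.9, 0.1, 0.8],
        "g2": [1.0, -1.1, 0.9, 0.7],
        "g3": [0.9, 0.2, -0.8, 1.1],
    }
    for pattern, genes in pairwise_clusters(toy).items():
        print(pattern, sorted(genes))
```

In this sketch, candidate patterns are generated only from gene pairs, so their number is bounded by the number of pairs rather than by the number of possible condition subsets. This is the sense in which the cluster count depends on the number of genes.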