Applying Biclustering to Text Mining: An Immune-Inspired Approach
Pablo A. D. de Castro (1), Fabrício O. de França (1), Hamilton M. Ferreira (1) and Fernando J. Von Zuben (1)
(1) Laboratory of Bioinformatics and Bio-Inspired Computing (LBIC), School of Electrical and Computer Engineering (FEEC),
University of Campinas (UNICAMP), Campinas-SP, Brazil
Abstract
With the rapid development of information technology, computers are proving to be a fundamental tool for the organization
and classification of electronic texts, given the huge amount of available information. Existing methodologies for text
mining apply standard clustering algorithms to group similar texts. However, these algorithms generally take into account
only the global similarities between the texts and assign each one to only one cluster, limiting the amount of information
that can be extracted from the texts. Biclustering is an alternative technique capable of overcoming these drawbacks.
It clusters the rows and columns of the data matrix simultaneously, allowing a more comprehensive analysis
of the texts. The main contribution of this paper is an immune-inspired biclustering algorithm for
text mining, denoted BIC-aiNet. BIC-aiNet interprets the biclustering problem as several two-way bipartition problems,
instead of considering a single two-way permutation framework. The experimental results indicate that the proposal
groups similar texts efficiently and extracts useful implicit information from groups of texts.
Keywords: Artificial Immune System, Biclustering, Two-way Bipartition, Text mining
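As an illustration of the idea summarized in the abstract, the sketch below shows what a bicluster looks like on a toy binary document-term matrix and how, unlike in standard clustering, one text can belong to more than one bicluster. It is not BIC-aiNet: the matrix, the two biclusters, and the helper names (`submatrix`, `density`, `memberships`) are hypothetical and chosen by hand, and the density score is only an assumed, simple way to inspect a bicluster, not the quality measure used in the paper.

```python
# Toy document-term matrix (hypothetical data): rows are texts, columns are terms.
# An entry of 1 means the term occurs in the text.
TERMS = ["immune", "antibody", "cluster", "matrix", "goal", "match"]
DOCS = ["d0", "d1", "d2", "d3"]
MATRIX = [
    [1, 1, 0, 0, 0, 0],  # d0: immunology vocabulary only
    [1, 1, 1, 1, 0, 0],  # d1: immunology and data-analysis vocabulary
    [0, 0, 1, 1, 0, 0],  # d2: data-analysis vocabulary only
    [0, 0, 0, 0, 1, 1],  # d3: sports vocabulary only
]


def submatrix(matrix, rows, cols):
    """Return the entries of `matrix` restricted to the given rows and columns."""
    return [[matrix[r][c] for c in cols] for r in rows]


def density(matrix, rows, cols):
    """Fraction of non-zero entries inside the bicluster (assumed toy score)."""
    cells = [value for row in submatrix(matrix, rows, cols) for value in row]
    return sum(cells) / len(cells)


# A bicluster is a pair (subset of rows, subset of columns).
# These two were picked by hand for illustration; no algorithm found them.
BICLUSTERS = {
    "immunology": ([0, 1], [0, 1]),
    "data_analysis": ([1, 2], [2, 3]),
}


def memberships(doc_index, biclusters):
    """Names of all biclusters whose row subset contains the given text."""
    return sorted(
        name for name, (rows, _cols) in biclusters.items() if doc_index in rows
    )


if __name__ == "__main__":
    for name, (rows, cols) in BICLUSTERS.items():
        print(
            name,
            [DOCS[r] for r in rows],
            [TERMS[c] for c in cols],
            density(MATRIX, rows, cols),
        )
    # d1 shares terms with two groups, so it appears in both biclusters.
    print("d1 belongs to:", memberships(1, BICLUSTERS))
```

In this toy setting, each bicluster is fully dense on its own term subset even though the texts it groups differ elsewhere, which is the kind of local similarity that a global, one-cluster-per-text assignment would miss.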