Microarray gene expression profiles can be used to predict tumor types accurately, which is of great value for providing
better treatment and minimizing toxicity for patients. However, classifying tumor types from microarray data is difficult
because the number of samples is much smaller than the number of genes. It has been shown that a small feature
gene subset can improve classification accuracy, so feature gene selection and extraction algorithms are very important in tumor
classification. In this paper, a novel hybrid gene selection method is proposed to find a feature gene subset in which the
genes related to a certain cancer are kept and the redundant genes are left out. In the proposed method, we combine
the advantages of principal component analysis (PCA) and linear discriminant analysis (LDA) in a novel feature gene extraction
scheme. We also compare several parametric and non-parametric feature gene selection methods. We use a support vector machine
(SVM) as the classifier in the experiments and compare and analyze the performance of three common SVM kernels. Using n-fold
cross-validation, the proposed
algorithm is evaluated on three published benchmark tumor datasets, and the experimental results show that it achieves
better classification performance than the other methods.
Keywords Feature Gene Selection - Tumor Classification - PCA - LDA - SVM - k-NN
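
As an illustration of the n-fold cross-validation protocol mentioned above, the following minimal sketch partitions sample indices into folds, each of which serves once as the test set. The abstract does not state the value of n or how the folds are drawn, so the function name n_fold_splits, the shuffling, and the seed are assumptions made only for this example, not the authors' procedure.

```python
import random


def n_fold_splits(n_samples, n_folds, seed=0):
    """Yield (train_indices, test_indices) pairs for n-fold cross-validation.

    Hypothetical helper for illustration; the fold assignment used in the
    paper is not specified in the abstract.
    """
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Shuffle once with a fixed seed so the partition is reproducible.
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_folds-th shuffled index, so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        # All samples outside the held-out fold form the training set.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield sorted(train), sorted(test)


if __name__ == "__main__":
    for train, test in n_fold_splits(n_samples=10, n_folds=3):
        print("train:", train, "test:", test)
```

In a gene selection setting, the selection and extraction steps would normally be fitted on the training indices of each fold only, so that the held-out samples do not influence which genes are kept.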