This paper applies a novel feature subset selection method, based on recent Bayesian network learning techniques, to
high-dimensional genomic microarray data on type 2 diabetes. We report experiments on a data set that comprises
22,283 genes and only 143 patients. The method searches for the genes that are jointly most strongly associated with diabetes
status. This is achieved in the context of learning the Markov boundary of the class variable. Since the selected genes are
subsequently analyzed by biologists, which requires much time and effort, not only model performance but also the robustness
of the gene selection process is crucial. Therefore, we assess the variability of our results and propose an ensemble technique
to yield more robust results. We compare our findings with the genes that have been associated with an increased risk of diabetes
in the recent medical literature. The main outcomes of the present research are an improved understanding of the pathophysiology
of obesity, and a clear appreciation of the applicability and limitations of Markov boundary learning techniques to human
gene expression data.
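
The abstract does not specify how the ensemble is built, so the following is only a minimal sketch of one common way to make a gene selection more robust: rerun a base selector on bootstrap resamples of the patients and keep the genes that are selected in a large fraction of the runs. Everything here is an assumption made for illustration. The names `ensemble_select` and `top_k_by_correlation`, the parameters `n_runs` and `min_freq`, and the bootstrap-and-threshold scheme are hypothetical, and the correlation-based base selector is a simple stand-in, not the Markov boundary learner discussed in the paper.

```python
import numpy as np


def top_k_by_correlation(X, y, k):
    """Stand-in base selector (NOT a Markov boundary learner).

    Returns the indices of the k features with the largest absolute
    Pearson correlation with the class variable y.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant features (or a constant resampled y) get a correlation of 0.
    corr = np.abs(Xc.T @ yc) / np.where(denom > 0, denom, np.inf)
    return set(np.argsort(corr)[-k:].tolist())


def ensemble_select(X, y, selector, n_runs=50, min_freq=0.6, seed=0):
    """Assumed ensemble scheme: bootstrap the samples, rerun the selector,
    and keep the features selected in at least min_freq of the runs.

    Returns (indices of retained features, per-feature selection frequency).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    counts = np.zeros(n_features)
    for _ in range(n_runs):
        # Draw n_samples patients with replacement.
        idx = rng.integers(0, n_samples, size=n_samples)
        for j in selector(X[idx], y[idx]):
            counts[j] += 1
    freq = counts / n_runs
    return np.flatnonzero(freq >= min_freq), freq
```

A call such as `ensemble_select(X, y, lambda X, y: top_k_by_correlation(X, y, k=20))` returns the retained gene indices together with their selection frequencies. The frequencies themselves give a rough picture of how variable the selection is across resamples, and any other base selector with the same `(X, y)` signature could be substituted.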