We present a case study on the discovery of clinically relevant domain knowledge in the field of HIV drug resistance. Novel
mutations in the HIV genome associated with treatment failure were identified by mining a relational clinical database. Hierarchical
cluster analysis suggests that two of these mutations form a novel mutational complex, while all others are involved in known
resistance-conferring evolutionary pathways. The clustering is shown to be highly stable in a bootstrap procedure. Multidimensional
scaling in mutation space indicates that certain mutations can occur within multiple pathways. Feature ranking based on support
vector machines and matched genotype-phenotype pairs comprehensively reproduces current domain knowledge. Moreover, it indicates
a prominent role of the novel mutations in determining phenotypic resistance and in resensitization effects. Such resensitization may
be exploited deliberately to reopen lost treatment options. Together, these findings provide valuable insight into the interpretation
of genotypic resistance tests.
Keywords: HIV, clustering, multidimensional scaling, support vector machines, feature ranking