Knowledge Discovery in Multi-label Phenotype Data

Amanda Clare and Ross D. King

Department of Computer Science, University of Wales Aberystwyth, SY23 3DB, UK
Abstract
The biological sciences are undergoing an explosion in the amount of available data, and new data analysis methods are needed to deal with it. We present work using knowledge discovery in databases (KDD) to analyse data from mutant phenotype growth experiments with the yeast S. cerevisiae in order to predict novel gene functions. The analysis of the data presented a number of challenges: multi-class labels, a large number of sparsely populated classes, the need to learn a set of accurate rules (not a complete classification), and a very large number of missing values. We developed resampling strategies and modified the C4.5 algorithm to deal with these problems. The rules learnt are accurate and biologically meaningful. They predict the function of 83 putative genes of currently unknown function at an estimated accuracy of > 80%.
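The abstract does not say how C4.5 was modified, so the following is only an illustrative sketch of one way a decision-tree learner could score a node when each example carries a set of labels rather than a single class. It assumes an entropy that sums, over every class, the binary entropy of "has this class" versus "does not have this class". The function name multilabel_entropy and the toy class names are hypothetical and are not taken from the authors' work.

```python
import math
from typing import Iterable, Sequence, Set


def multilabel_entropy(examples: Iterable[Set[str]], classes: Sequence[str]) -> float:
    """Sum of per-class binary entropies over a collection of label sets.

    Assumption (not from the source): each example is a set of class labels,
    and a node's impurity is the sum over classes of
    -(p * log2(p) + (1 - p) * log2(1 - p)), where p is the fraction of
    examples that carry that class.
    """
    label_sets = list(examples)
    n = len(label_sets)
    if n == 0:
        return 0.0

    total = 0.0
    for c in classes:
        # Fraction of examples annotated with class c.
        p = sum(1 for labels in label_sets if c in labels) / n
        for prob in (p, 1.0 - p):
            # A probability of 0 contributes nothing to the entropy.
            if prob > 0.0:
                total -= prob * math.log2(prob)
    return total


# Hypothetical toy data: each gene has one or more functional classes.
toy_genes = [{"metabolism"}, {"transport"}]
toy_classes = ["metabolism", "transport"]
node_entropy = multilabel_entropy(toy_genes, toy_classes)
```

With a measure of this kind, a candidate split can be ranked by how much it reduces the summed entropy, in the same way single-label C4.5 ranks splits by information gain.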

Amanda Clare
Email: ajc99@aber.ac.uk

Ross D. King
Email: rdk@aber.ac.uk