Book Chapter
Knowledge Discovery in Multi-label Phenotype Data
Book Series: Lecture Notes in Computer Science
Publisher: Springer Berlin / Heidelberg
ISSN: 0302-9743 (Print), 1611-3349 (Online)
Volume: 2168/2001
Book: Principles of Data Mining and Knowledge Discovery
Book DOI: 10.1007/3-540-44794-6
Copyright: 2001
ISBN: 978-3-540-42534-2
Chapter DOI: 10.1007/3-540-44794-6_4
Pages: 42-53
Subject Collection: Computer Science
SpringerLink Date: Monday, January 01, 2001
Amanda Clare and Ross D. King
Department of Computer Science, University of Wales Aberystwyth, SY23 3DB, UK
Abstract
The biological sciences are undergoing an explosion in the amount of available data, and new data analysis methods are needed to deal with it. We present work using KDD to analyse data from mutant phenotype growth experiments with the yeast S. cerevisiae to predict novel gene functions. The analysis of the data presented a number of challenges: multi-class labels, a large number of sparsely populated classes, the need to learn a set of accurate rules (not a complete classification), and a very large number of missing values. We developed resampling strategies and modified the algorithm C4.5 to deal with these problems. The rules learnt are accurate and biologically meaningful. They predict the function of 83 putative genes of currently unknown function at an estimated accuracy of > 80%.
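The abstract does not say how C4.5 was modified, so the following is only an illustrative sketch of one common way to adapt a decision-tree splitting criterion to multi-label data. It sums, over all classes, the binary entropy of membership in each class. The function name `multilabel_entropy` and the example labels are hypothetical and are not taken from the chapter.

```python
import math
from typing import Iterable, Sequence, Set


def multilabel_entropy(examples: Sequence[Set[str]], classes: Iterable[str]) -> float:
    """Sum of per-class binary entropies (in bits) over a set of examples.

    Each example is represented only by its set of class labels.
    This is an assumed, illustrative criterion, not necessarily the
    one used by the authors.
    """
    n = len(examples)
    if n == 0:
        return 0.0
    total = 0.0
    for c in classes:
        # Fraction of examples that carry label c.
        p = sum(1 for labels in examples if c in labels) / n
        # Add the entropy of "has c" versus "does not have c".
        for prob in (p, 1.0 - p):
            if prob > 0.0:
                total -= prob * math.log2(prob)
    return total


if __name__ == "__main__":
    # Hypothetical functional-class labels for four genes.
    genes = [{"metabolism"}, {"metabolism", "transport"}, {"transport"}, set()]
    print(multilabel_entropy(genes, ["metabolism", "transport"]))
```

A tree learner using such a measure would choose the attribute test that most reduces this quantity, so that a leaf can predict a set of classes and not just one.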
Amanda Clare, Email: ajc99@aber.ac.uk
Ross D. King, Email: rdk@aber.ac.uk