Lecture Notes in Computer Science, 2006, Volume 4216/2006, 107-118, DOI: 10.1007/11875741_11

High-Throughput Identification of Chemistry in Life Science Texts

Peter Corbett and Peter Murray-Rust

Abstract

OSCAR3 is an open, extensible system for the automated annotation of chemistry in scientific articles, and it can process thousands of articles per hour. Its XML annotation supports applications such as interactive browsing and chemically aware searching, and has been designed for integration with larger text-analysis systems. We report its application to the high-throughput analysis of the small-molecule chemistry content of life-science texts such as PubMed abstracts.
