Lecture Notes in Computer Science, 2005, Volume 3642/2005, 420-429, DOI: 10.1007/11548706_44

An Ontology-Based Pattern Mining System for Extracting Information from Biological Texts

Muhammad Abulaish and Lipika Dey

Abstract

Biological information embedded in large repositories of unstructured or semi-structured text documents can be extracted more efficiently by combining semantic analysis of the texts with structured domain knowledge. The GENIA corpus supports this purpose: it contains tagged MEDLINE abstracts, manually annotated according to the GENIA ontology. However, manual tagging of all texts is impossible, and special-purpose storage and retrieval mechanisms are required to reduce information overload for users. In this paper we propose an ontology-based Biological Information Extraction and Query Answering (BIEQA) system that has four components: an ontology-based tag analyzer, which analyzes tagged texts to extract biological and lexical patterns; an ontology-based tagger, which tags new texts; a knowledge base enhancer, which enhances the ontology and incorporates new knowledge, in the form of biological entities and relationships, into the knowledge base; and a query processor, which handles user queries.

Keywords  Ontology-based text mining - Biological information extraction - Automatic tagging
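
The four components named in the abstract can be illustrated with a toy pipeline. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a simplified GENIA-style markup of the form <cons sem="G#class">entity</cons>, treats a run of plain lowercase words between two tagged entities as a relation, and uses hypothetical names throughout (analyze_tags, extract_relations, KnowledgeBase, and the sample sentence).

```python
import re
from collections import defaultdict

# Assumption: simplified GENIA-style markup, <cons sem="G#class">entity</cons>.
CONS = re.compile(r'<cons\s+sem="G#([^"]+)">(.*?)</cons>')


def analyze_tags(tagged_text):
    """Tag analyzer (toy): return (entity, ontology class) pairs."""
    return [(m.group(2), m.group(1)) for m in CONS.finditer(tagged_text)]


def extract_relations(tagged_text):
    """Return (entity, relation, entity) triples for adjacent tagged entities
    separated only by lowercase words (a deliberately naive lexical pattern)."""
    matches = list(CONS.finditer(tagged_text))
    triples = []
    for left, right in zip(matches, matches[1:]):
        between = tagged_text[left.end():right.start()].strip()
        if re.fullmatch(r"[a-z]+( [a-z]+)*", between):
            triples.append((left.group(2), between, right.group(2)))
    return triples


class KnowledgeBase:
    """Hypothetical store for entities (by ontology class) and relations."""

    def __init__(self):
        self.entities = defaultdict(set)  # ontology class -> entity names
        self.relations = set()            # (subject, relation, object)

    def enhance(self, tagged_text):
        """Knowledge base enhancer (toy): absorb entities and relations."""
        for name, cls in analyze_tags(tagged_text):
            self.entities[cls].add(name)
        self.relations.update(extract_relations(tagged_text))

    def tag(self, plain_text):
        """Tagger (toy): wrap known entity names in cons tags."""
        lookup = {n: c for c, names in self.entities.items() for n in names}
        if not lookup:
            return plain_text
        # Longest names first so a short name never splits a longer one.
        names = sorted(lookup, key=len, reverse=True)
        pattern = re.compile("|".join(re.escape(n) for n in names))
        return pattern.sub(
            lambda m: f'<cons sem="G#{lookup[m.group(0)]}">{m.group(0)}</cons>',
            plain_text,
        )

    def query(self, relation):
        """Query processor (toy): all triples with the given relation."""
        return sorted(t for t in self.relations if t[1] == relation)


# Hypothetical sample sentence, not taken from the GENIA corpus.
sample = ('<cons sem="G#protein_molecule">IL-2</cons> activates '
          '<cons sem="G#cell_type">T cells</cons>.')
kb = KnowledgeBase()
kb.enhance(sample)
retagged = kb.tag("IL-2 binds T cells")
answers = kb.query("activates")
```

Here kb.enhance learns two entities and one relation from the tagged sentence, kb.tag reuses those entities to annotate an untagged sentence, and kb.query retrieves the stored triples for a given relation. A real system would need ontology-aware pattern generalization, which this sketch does not attempt.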
