Clinical data that can be reused to support research are typically stored in three markedly
different formats: (1) structured, codified data elements; (2) semi-structured or unstructured narrative text; and (3) multi-modal
images. In this manuscript, we describe the design of a computational system intended to support the ontology-anchored
query and integration of such data types from multiple source systems. Additional features of the described system include
(1) Grid services-based electronic data interchange models that enable use of the system in multi-site settings
and (2) a software framework intended to address the security and patient confidentiality concerns that
arise when transmitting or otherwise manipulating potentially privileged personal health information. We frame our discussion
within the specific experimental context of the concept-oriented query and integration of correlated structured data, narrative
text, and images for cancer research.
Key words: Image retrieval, information retrieval, ontologies, text mining, Grid computing