Provenance, the description of the history of a set of data, has become important in the neurosciences with the proliferation
of consortium-based neuroimaging research efforts. Knowledge about the origin, preprocessing, analysis, and post hoc processing
of neuroimaging volumes is essential for establishing the quality of data and results, the reproducibility of findings, and their
scientific interpretation. Neuroimaging provenance also includes the specifics of the software routines, algorithmic parameters,
and operating system settings that were employed in the analysis protocol. The LONI Pipeline (http://pipeline.loni.ucla.edu)
is a Java-based workflow environment for the construction and execution of data processing streams. We have developed a provenance
framework, integrated with the LONI Pipeline workflow environment, for describing the current and retrospective state of data.
Collecting provenance information under this framework relieves the user of much of the burden of documentation while
still providing a rich description of an image’s characteristics, as well as a description of the programs that interacted
with that data. This combination of ease of use and highly descriptive meta-data will greatly facilitate the collection of
provenance information from brain imaging workflows, encourage subsequent data and meta-data sharing, enhance peer-reviewed
publication, and support multi-center collaboration.
Keywords: Provenance - Workflow - Neuroimaging - Grid - Pipeline