Shared-memory multicore computing platforms are becoming commonplace, and loop parallelization with OpenMP offers an easy
way for the user to harness their power. As a result, tools for automatic differentiation (AD) should be able to handle
such codes in a way that preserves their parallelism in the derivative evaluation as well. In this paper, we explore
this issue using a plasma simulation code. Its structure, which in essence is a time-stepping loop with several parallelizable
inner loops, is representative of many other computations. Using this code as an example, we develop a strategy for the efficient
implementation of the reverse mode of AD with trace-based AD tools and implement it with the ADOL-C tool. The strategy combines
checkpointing at the outer level with parallel trace generation and evaluation at the inner level. We discuss the extensions
that ADOL-C needs in order to work in a multithreaded environment, describe the setup required in the user code, and present
performance results on a shared-memory multiprocessor.
Keywords: Parallelism, OpenMP, reverse mode, checkpointing, ADOL-C
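
As an illustration only, the following C++ sketch mimics the shape of the strategy on a toy problem: an outer time-stepping loop that is checkpointed, and an inner loop that is parallelized with OpenMP in both the forward and the reverse sweep. It is not the plasma code and it does not use ADOL-C. A hand-coded adjoint stands in for the trace-based reverse mode, and the update rule x_i + dt*sin(x_i), the objective (sum of the final state), and the names step, step_adjoint, objective, gradient, and stride are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

using State = std::vector<double>;

// One time step of the toy model (assumed update rule). Each component is
// independent, so the loop can be shared among OpenMP threads; the pragma
// is simply ignored when the code is compiled without OpenMP support.
void step(const State& x, State& y, double dt) {
    const long n = static_cast<long>(x.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i)
        y[i] = x[i] + dt * std::sin(x[i]);
}

// Hand-coded adjoint of one step: xbar <- (dy/dx)^T * xbar, evaluated at the
// step's input state x. The Jacobian is diagonal here, so this loop is
// parallel as well.
void step_adjoint(const State& x, State& xbar, double dt) {
    const long n = static_cast<long>(x.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i)
        xbar[i] *= 1.0 + dt * std::cos(x[i]);
}

// Objective J(x0): sum of the state components after `steps` time steps.
double objective(State x, int steps, double dt) {
    State y(x.size());
    for (int k = 0; k < steps; ++k) { step(x, y, dt); x.swap(y); }
    double J = 0.0;
    for (double v : x) J += v;
    return J;
}

// Gradient dJ/dx0 by reverse mode with uniform checkpointing: the forward
// sweep stores the state only every `stride` steps; the reverse sweep
// recomputes the states of one segment at a time and then adjoins them.
State gradient(const State& x0, int steps, double dt, int stride) {
    assert(stride >= 1);
    std::vector<State> checkpoints;
    State x = x0, y(x0.size());
    for (int k = 0; k < steps; ++k) {
        if (k % stride == 0) checkpoints.push_back(x);  // input state of step k
        step(x, y, dt);
        x.swap(y);
    }

    State xbar(x0.size(), 1.0);  // dJ/dx_T for J = sum of the final state
    for (int c = static_cast<int>(checkpoints.size()) - 1; c >= 0; --c) {
        const int first = c * stride;
        const int last = std::min(first + stride, steps);  // exclusive bound
        // Recompute the input states of steps first .. last-1.
        std::vector<State> segment;
        segment.push_back(checkpoints[c]);
        for (int k = first; k + 1 < last; ++k) {
            State next(x0.size());
            step(segment.back(), next, dt);
            segment.push_back(std::move(next));
        }
        // Reverse sweep through the segment, newest step first.
        for (int k = last - 1; k >= first; --k)
            step_adjoint(segment[k - first], xbar, dt);
    }
    return xbar;
}
```

Calling gradient(x0, steps, dt, stride) with stride = 1 stores every state, while stride = steps stores only the initial state and recomputes everything else once; values in between trade memory for recomputation. In a trace-based tool, the hand-coded step_adjoint would be replaced by recording and reverse-evaluating a trace of step for each segment.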