OpenMP has become the de facto standard for shared-memory parallel programming. The directive-based nature of OpenMP allows
incremental and portable development of parallel applications for a wide range of platforms. OpenMP's ease of use
means that many details are hidden from the end user. Consequently, basic factors such as the runtime system, compiler
optimizations, and other implementation-specific issues can have a significant impact on the performance of an OpenMP application.
OpenMP constructs frequently show widely varying performance across operating platforms, and even across
compilers on the same machine. A comparative study of the low-level performance of individual
OpenMP constructs is therefore important. In this paper, we present an enhanced set of microbenchmarks for OpenMP derived from the EPCC benchmarks
and based on the SKaMPI benchmarking framework. We describe the evaluation methodology, followed by details of selected
constructs and their performance measurement. Results from experiments conducted on the IBM SP3 and the SUN SunFire systems
are presented for each construct.
This work was partially supported by the U.S. Department of Energy through Los Alamos National Laboratory contract W-7405-ENG-36
and by the Los Alamos Computer Science Institute under grant LANL 03891-99-23.