This paper describes an implementation and a preliminary evaluation of the Omni OpenMP compiler on the Cenju-4 parallel computer.
The Cenju-4 is a parallel computer that supports a hardware distributed shared memory (DSM) system. The shared address space
is explicitly allocated during the initialization phase of the program. Omni converts all global variable declarations into
indirect references through pointers and generates code to allocate those variables in the shared address space at runtime.
Using Omni, OpenMP programs can execute on a distributed memory machine with hardware DSM. Preliminary results
using benchmark programs show that the performance of OpenMP programs does not scale well. Although the performance of the OpenMP benchmark
programs scales poorly, Omni enables users to execute the same program on both a shared memory machine and a distributed memory
machine.
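
The conversion of global variables into pointer-based indirect references can be illustrated with a minimal C sketch. This is not the code Omni actually generates: the names `shared_alloc`, `shared_globals_init`, `counter_p`, and `data_p` are hypothetical, and `calloc` stands in for the allocator of the Cenju-4 shared address space, which is assumed here rather than taken from the source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define N 8

/* Original program (before translation) would declare ordinary globals:
 *     int    counter;
 *     double data[N];
 */

/* After translation (illustrative): each global becomes a pointer to
 * storage that is allocated in the shared address space at startup. */
int *counter_p;
double (*data_p)[N];

/* Hypothetical stand-in for the shared-address-space allocator.
 * calloc is used only so that the sketch runs on any machine. */
static void *shared_alloc(size_t size) {
    return calloc(1, size);
}

/* Generated initialization code: runs once, before any use of the
 * globals. Returns 0 on success and -1 if allocation fails. */
int shared_globals_init(void) {
    counter_p = shared_alloc(sizeof *counter_p);
    data_p = shared_alloc(sizeof *data_p);
    if (counter_p == NULL || data_p == NULL) {
        free(counter_p);
        free(data_p);
        counter_p = NULL;
        data_p = NULL;
        return -1;
    }
    return 0;
}

/* Original body: data[i] = i; counter = N;
 * Every access now goes through the pointers. */
void fill(void) {
    assert(counter_p != NULL && data_p != NULL);
    for (int i = 0; i < N; i++) {
        (*data_p)[i] = (double)i;
    }
    *counter_p = N;
}

/* Original body: s += data[i] for i < counter. */
double sum(void) {
    assert(counter_p != NULL && data_p != NULL);
    double s = 0.0;
    for (int i = 0; i < *counter_p; i++) {
        s += (*data_p)[i];
    }
    return s;
}
```

A program would call `shared_globals_init()` once at startup, then use `fill()` and `sum()` as before. Because every reference goes through a pointer, the storage can live anywhere the allocator places it, which is what allows the same source to run on a machine whose shared space must be allocated explicitly.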