The lack of performance portability has discouraged scientific application users from developing portable programs
in HPF. Because users want to run the same source code on different parallel machines as fast as possible, we have investigated
the performance portability of Japanese HPF compilers (NEC and Fujitsu) with a special benchmark suite. We obtained good performance
in most cases with DISTRIBUTE and INDEPENDENT directives on the NEC SX-5, but the Fujitsu VPP800 required additional LOCAL directives
to explicitly rule out communication inside parallel loops. We also found that manual communication optimizations using the
HPF/JA extensions were very useful for tuning parallel performance.