We have implemented the GEN_BLOCK (generalized block) data distribution in PGHPF, our High Performance Fortran implementation.
Compared to a BLOCK or CYCLIC distribution, the more flexible GEN_BLOCK distribution allows users to balance the load between
processors. Simple benchmark programs demonstrate the benefits of the new distribution format for unbalanced workloads, achieving
speedups of up to 2x over the simple distributions. We also present performance results for a whole application.