Abstract

The distributed shared memory (DSM) architecture simplifies the development of parallel programs by relieving the user of the tedious task of distributing data across processors. Furthermore, it allows incremental parallelization using, for example, OpenMP or Java threads. While it is easy to demonstrate good performance on a few processors, achieving good scalability still requires a solid understanding of data flow in the application. In this paper we discuss ADAPT, an Automatic Data Alignment and Placement Tool, which detects data congestion in array-oriented FORTRAN codes and suggests code transformations to resolve it. We then show how the transformations suggested by ADAPT, including data blocking, data placement, data transposition, and page size control, improve the performance of the NAS Parallel Benchmarks.
