This paper demonstrates that significant improvements to automatic parallelization technology require that existing systems
be extended in two ways: (1) they must combine high-quality compile-time analysis with low-cost run-time testing; and, (2)
they must take control flow into account during analysis. We support this claim with the results of an experiment that measures
the safety of parallelization at run time for loops left unparallelized by the Stanford SUIF compiler’s automatic parallelization
system. We present results of measurements on programs from two benchmark suites — Specfp95 and Nas sample benchmarks — which
identify inherently parallel loops in these programs that are missed by the compiler. We characterize remaining parallelization
opportunities, and find that most of the loops require run-time testing, analysis of control flow, or some combination of
the two. We present a new compile-time analysis technique that can be used to parallelize most of these remaining parallel
loops. This technique is designed to not only improve the results of compile-time parallelization, but also to produce low-cost,
directed run-time tests that allow the system to defer binding of parallelization until run-time when safety cannot be proven
statically. We call this approach predicated array data-flow analysis. We augment array data-flow analysis, which the compiler uses to identify independent and privatizable arrays, by associating
with each array data-flow value a predicate. Predicated array data-flow analysis allows the compiler to derive “optimistic”
data-flow values guarded by predicates; these predicates can be used to derive a run-time test guaranteeing the safety of
parallelization.
This work has been supported by DARPA Contract DABT63-95-C-0118, a fellowship from AT&T Bell Laboratories, the Air Force Materiel
Command and DARPA contract F30602-95-C-0098.