This paper gives an overview of the locality enhancement techniques used by the Jasmine compiler, currently under development at the University of Toronto. These techniques enhance memory locality,
cache locality across loop nests (inter-loop-nest cache locality), and cache locality within a loop nest (intra-loop-nest cache locality)
in dense-matrix scientific applications. The compiler also exploits machine-specific features to further enhance locality.
Experimental evaluation of these techniques on different multiprocessor platforms indicates that they are effective in improving
the overall performance of benchmarks; some of the techniques reduce parallel execution time by up to a factor of 6.