Abstract

We investigate the performance benefits of a novel recursive formulation of Strassen’s algorithm over highly tuned matrix-multiply (MM) routines, such as the widely used ATLAS for high-performance systems.
We combine Strassen’s recursion with a highly tuned version of ATLAS MM, and we present a family of recursive algorithms that achieve up to 15% speed-up over ATLAS alone. We show experimental results for 7 different systems.
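
To illustrate the general idea of pairing Strassen’s recursion with a tuned base-case multiply, here is a minimal sketch of the classic Strassen scheme for square matrices. It is not the recursive formulation proposed in the paper. The names `base_mm`, `strassen`, and `cutoff` are hypothetical, and `base_mm` is a plain triple loop standing in for a call to a tuned routine such as ATLAS MM.

```python
def base_mm(a, b):
    """Plain triple-loop product; a stand-in (assumption) for a tuned MM call."""
    k, m = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(m)]
            for row in a]


def add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    """Element-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def split(a):
    """Split an even-sized square matrix into four quadrants."""
    h = len(a) // 2
    return ([row[:h] for row in a[:h]], [row[h:] for row in a[:h]],
            [row[:h] for row in a[h:]], [row[h:] for row in a[h:]])


def strassen(a, b, cutoff=2):
    """Classic Strassen recursion on square matrices.

    Falls back to base_mm when the size is at most `cutoff` or is odd,
    which is where a tuned library routine would take over.
    """
    n = len(a)
    if n <= cutoff or n % 2:
        return base_mm(a, b)

    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)

    # Seven recursive products instead of the eight of the naive blocking.
    m1 = strassen(add(a11, a22), add(b11, b22), cutoff)
    m2 = strassen(add(a21, a22), b11, cutoff)
    m3 = strassen(a11, sub(b12, b22), cutoff)
    m4 = strassen(a22, sub(b21, b11), cutoff)
    m5 = strassen(add(a11, a12), b22, cutoff)
    m6 = strassen(sub(a21, a11), add(b11, b12), cutoff)
    m7 = strassen(sub(a12, a22), add(b21, b22), cutoff)

    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)

    # Reassemble the four quadrants into one matrix.
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom
```

In a sketch like this, `cutoff` decides where the recursion stops and the base-case routine takes over. A suitable value depends on the machine and would have to be found by measurement.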

Keywords  dense kernels - matrix-matrix product - performance optimizations
