Abstract

Media processing has motivated significant changes in the focus and design of processors. The inclusion of μSIMD multimedia extensions such as MMX is a cost-effective way to improve the performance of program regions with large amounts of data-level parallelism (DLP). This paper provides an initial evaluation of μSIMD and vector-SIMD enhanced VLIW architectures. We show that these two architectures execute, on average, 40% and 57% fewer operations, respectively, than the reference VLIW architecture. However, once most of the available DLP has been exploited via multimedia extensions or wide-issue static scheduling, the remainder of the program exhibits only modest amounts of instruction-level parallelism (ILP): 1.40 operations per cycle for an 8-issue architecture. We claim that, in general, vector-SIMD extensions achieve the highest speed-ups while still reducing fetch pressure, although wide-issue μSIMD architectures reach similar performance at a lower cost.
