Abstract

This paper analyzes the impact of hardware multithreading support on the performance of distributed shared-memory (DSM) multiprocessors built out of heterogeneous, single-chip computing nodes. Area-efficiency arguments motivate a heterogeneous, hierarchical organization (HDSM) consisting of a few processors with extensive support for instruction-level parallelism and large caches, and a larger number of simpler processors with smaller caches for efficient execution of thread-parallel code. Such a heterogeneous machine relies on the execution of multiple threads per processor to deliver high performance for unmodified applications. This paper quantitatively studies the performance of HDSMs for software-based and hardware-multithreaded scenarios. The simulation-based experiments in this paper consider a 16-node multiprocessor, six homogeneous shared-memory benchmarks from the SPLASH-2 suite, and a decision-support application (C4.5). Simulation results show that a hardware-based, block-multithreaded HDSM configuration outperforms a software-multithreaded counterpart, on average, by 13%.
This work was partially funded by the National Science Foundation under grants CCR-9970728 and EIA-9975275. Renato Figueiredo is also supported by a CAPES scholarship.
