To meet the challenges of increasing computational requirements and stringent power constraints, there is a
gradual trend towards heterogeneous multi-processor system-on-chip (MPSoC) designs that integrate application-specific acceleration
engines. One major problem faced by design tools that map algorithms onto MPSoC architectures is the dimensioning
of system components through performance analysis. In this paper, we propose a fast and accurate methodology for rate matching
of statically scheduled acceleration engines using modular performance analysis. Given a set of Pareto-optimal hardware accelerator
designs and an input workload behavior, the proposed methodology determines cost-efficient hardware accelerators that can
handle the workload. A Motion JPEG case study illustrates the benefit of coupling high-level synthesis tools with performance
analysis.