Multicore processors have been used in embedded systems and general computing applications for some time. However, these
multicore chips typically execute multiple applications concurrently, with each core carrying out a particular task in the system. Such
systems can be found in gaming, automotive real-time systems and video / image encoding devices. These systems are commonly
deployed to overcome deadline misses, which are primarily due to overloading of a single multitasking core. In this paper,
we explore the use of multiple cores for a single application, as opposed to multiple applications executing in a parallel
fashion. A single application is parallelized using two different methods: one, a master-slave model; and two, a sequential
pipeline model. The systems were implemented using Tensilica’s Xtensa LX processors, with queues as the means of communication
between cores. In the master-slave model, we used a coarse-grained approach whereby a main core distributes the workload
to the remaining cores and reads the processed data before writing the results back to file. In the pipeline model, a finer
granularity is used. The application is partitioned into multiple sequential blocks, each block representing a stage in a
sequential pipeline. For both models, we evaluated a range of configurations, from a single core to a nine-core
system. We found that, without any optimization, the seven-core sequential pipeline system uses area more efficiently,
with an area-increase-to-speedup ratio of 1.83, compared with 4.34 for the master-slave approach. With selective optimization
in the pipeline approach, we obtained speedups of up to 4.6× with an area increase of only 3.1× (an area-increase-to-speedup
ratio of just 0.68).
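
The structural difference between the two models can be sketched in software. The following is a minimal illustration only, assuming Python threads and `queue.Queue` objects as stand-ins for cores and hardware queues; it is not the Xtensa LX implementation, and the names `run_master_slave`, `run_pipeline`, `work` and `stages` are hypothetical. Python threads also do not run CPU-bound work in parallel, so the sketch shows how data flows, not the speedups reported above.

```python
import queue
import threading

# Unique end-of-stream marker; assumed never to be a legitimate data item.
SENTINEL = object()


def run_master_slave(items, work, n_workers):
    """Coarse-grained model: a main thread hands whole items to workers,
    then reads back the processed results in their original order."""
    tasks, results = queue.Queue(), queue.Queue()

    def worker():
        while True:
            job = tasks.get()
            if job is SENTINEL:
                break
            index, item = job
            results.put((index, work(item)))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for job in enumerate(items):      # distribute the workload
        tasks.put(job)
    for _ in threads:                 # one stop marker per worker
        tasks.put(SENTINEL)
    for t in threads:
        t.join()

    out = [None] * len(items)
    while not results.empty():        # safe: all workers have finished
        index, value = results.get()
        out[index] = value
    return out


def run_pipeline(items, stages):
    """Finer-grained model: each stage runs in its own thread and passes
    its output to the next stage through a queue."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]

    def stage_runner(stage, inbox, outbox):
        while True:
            item = inbox.get()
            if item is SENTINEL:
                outbox.put(SENTINEL)  # propagate end-of-stream downstream
                break
            outbox.put(stage(item))

    threads = [
        threading.Thread(target=stage_runner, args=(s, queues[i], queues[i + 1]))
        for i, s in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(SENTINEL)

    out = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        out.append(item)
    for t in threads:
        t.join()
    return out


if __name__ == "__main__":
    data = list(range(5))
    stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
    print(run_pipeline(data, stages))
    print(run_master_slave(data, lambda x: (x + 1) * 2 - 3, n_workers=3))
```

In the master-slave sketch every worker runs the complete computation on its share of the data, whereas in the pipeline sketch each thread runs only one block of the computation. That second property is what allows each stage to be tuned individually.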
Keywords: architecture, ASIPs, design, heterogeneous system, multiprocessor, pipelines, SoC
National ICT Australia is funded through the Australian Government’s Backing Australia’s Ability initiative, in part through the Australian Research Council.