Włodzimierz Bielecki, Krzysztof Kraska
Szczecin University of Technology,
Faculty of Computer Science and Information Technology
Abstract:
Increasing data locality in a program is necessary to improve the performance of the software parts of embedded systems, to decrease power consumption, and to reduce on-chip memory size. The possibility of applying a method of quantifying data locality to a novel method of extracting synchronization-free threads is introduced. It can be used to agglomerate extracted synchronization-free threads in order to adapt a parallel program to the target architecture of an embedded system under various loop schedule options (space-time mapping) and under the influence of well-known techniques for improving data locality. Choosing the best combination of loop transformation techniques with regard to data locality makes it possible to improve program performance. An approach to analyzing data locality is presented. Experimental results are presented and discussed. Conclusions and future research are outlined.
Keywords:
data locality, compilers, parallel processing, embedded systems
Embedded systems involved in data processing consist of programmable processors, program components executed by those processors, and hardware components, often realized in FPGAs, that cooperate with the software parts of the system. Software components make it possible to apply corrections quickly, to reuse code, and to change a program flexibly, which reduces the time needed to deliver a product to the market. However, programmable processors consume considerably more energy and are significantly slower than their hardware counterparts. Hardware solutions ensure greater performance and lower power consumption, but design time may be long and the design process is expensive [9].
Multiprocessor architectures for embedded systems are widespread on the contemporary electronics market. For example, the Xilinx Virtex-4 FX FPGA chip includes up to two PowerPC 405 processors; National Semiconductor’s Geode chips allow several processors to be joined into a multiprocessor system based on the x86 architecture; and the HPOC project (Hundred Processors, One Chip) undertaken at Hewlett Packard attempts to consolidate hundreds of processors on one chip using co-resident on-chip memory [4].