Data-localization for Fortran macro-dataflow computation using partial static task assignment

Akimasa Yoshida, Kenichi Koshizuka, Hironori Kasahara

Research output: Contribution to conference (Paper)

Abstract

This paper proposes a data-localization compilation scheme for macro-dataflow computation, in which coarse-grain tasks such as loops, subroutines, and basic blocks in a Fortran program are automatically processed in parallel on a multiprocessor system. The scheme uses local memory effectively to reduce the data transfer overhead of passing shared data among coarse-grain tasks composed of Doall loops and sequential loops. In this scheme, the compiler performs three steps. First, it partitions coarse-grain tasks (loops) that have data dependences among them into multiple groups by a loop aligned decomposition, so that data transfer among groups is minimized. Second, it generates a dynamic scheduling routine with partial static task assignment, which assigns the decomposed tasks in a group to the same processor at run time. Third, it generates parallel machine code that passes shared data inside the group through local memory. A compiler has been implemented for OSCAR, an actual multiprocessor system that has centralized shared memory and distributed shared memory in addition to local memory on each processor. Performance evaluation on OSCAR shows that macro-dataflow computation with the proposed data-localization scheme can reduce execution time by 10% to 20% on average compared with ordinary macro-dataflow computation using centralized shared memory.
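The grouping and assignment idea in the abstract can be illustrated with a small Python sketch. It is not the authors' compiler algorithm: the function names `aligned_chunks` and `schedule`, the least-loaded-processor rule, and the assumption that iteration i of a consumer loop reads only what iteration i of the producer loop wrote are all hypothetical simplifications chosen for illustration. The sketch splits two dependent loops at the same iteration bounds, treats matching chunks as one group, and binds a whole group to whichever processor its first task is assigned to.

```python
def aligned_chunks(n_iters, n_groups):
    """Split iterations 0..n_iters-1 into contiguous chunks.

    Using the same bounds for every dependent loop means chunk g of a
    consumer loop reads the data written by chunk g of the producer loop
    (under the simplifying assumption of no cross-iteration dependences).
    """
    return [(g * n_iters // n_groups, (g + 1) * n_iters // n_groups)
            for g in range(n_groups)]


def schedule(tasks, deps, group_of, n_procs):
    """Simulate dynamic scheduling with partial static assignment.

    tasks:    task ids in priority order
    deps:     task id -> set of predecessor task ids
    group_of: task id -> group id
    Returns a dict mapping each task id to a processor index.
    """
    done = set()
    proc_of_group = {}          # group -> processor, fixed once chosen
    assignment = {}
    load = [0] * n_procs        # number of tasks given to each processor
    pending = list(tasks)
    while pending:
        ready = [t for t in pending if deps.get(t, set()) <= done]
        if not ready:
            raise ValueError("cyclic dependences among tasks")
        for t in ready:
            g = group_of[t]
            if g not in proc_of_group:
                # Dynamic part: the first task of a group goes to the
                # currently least-loaded processor (hypothetical policy).
                proc_of_group[g] = min(range(n_procs), key=load.__getitem__)
            # Static part: later tasks of the group follow the first one.
            p = proc_of_group[g]
            assignment[t] = p
            load[p] += 1
            done.add(t)
            pending.remove(t)
    return assignment


# Two dependent loops of 10 iterations, decomposed into 3 aligned chunks.
chunks = aligned_chunks(10, 3)
tasks = [(loop, g) for loop in ("loop1", "loop2") for g in range(len(chunks))]
deps = {("loop2", g): {("loop1", g)} for g in range(len(chunks))}
group_of = {t: t[1] for t in tasks}
assignment = schedule(tasks, deps, group_of, n_procs=2)
```

In this sketch, each chunk of `loop2` lands on the same processor as the matching chunk of `loop1`, which is the condition that would let the shared data stay in that processor's local memory. Loops with dependences across iterations would need extra handling at chunk boundaries, which the sketch leaves out.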

Original language: English
Pages: 61-68
Number of pages: 8
Publication status: Published - 1996 Jan 1
Event: Proceedings of the 1996 International Conference on Supercomputing - Philadelphia, PA, USA
Duration: 1996 May 25 to 1996 May 28

Other

Other: Proceedings of the 1996 International Conference on Supercomputing
City: Philadelphia, PA, USA
Period: 96/5/25 to 96/5/28

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Yoshida, A., Koshizuka, K., & Kasahara, H. (1996). Data-localization for Fortran macro-dataflow computation using partial static task assignment. 61-68. Paper presented at Proceedings of the 1996 International Conference on Supercomputing, Philadelphia, PA, USA.