Data-localization scheme using task-fusion for macro-dataflow computation

Akimasa Yoshida, Seiji Maeda, Kensaku Fujimoto, Hironori Kasahara

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    This paper proposes a data-localization scheme for macro-dataflow computation, in which coarse-grain tasks such as loops, subroutines, and basic blocks in a Fortran program are dynamically scheduled onto processors and executed in parallel. The scheme reduces the overhead of transferring data through centralized shared memory by using each processor's local memory to pass shared data among coarse-grain tasks, especially loops. At compile time, it decomposes multiple loops that have data dependences among them, using the loop-aligned-decomposition method, so that data can be localized. It then fuses decomposed loops that would otherwise require a large amount of data transfer among them into a single macrotask, which is assigned to a processor at run time. The scheme has been implemented on OSCAR, an actual multiprocessor system that has centralized shared memory and distributed shared memory in addition to local memory on each processor. Performance evaluation on OSCAR shows that the proposed scheme can reduce execution time by 21%.
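The decompose-then-fuse idea in the abstract can be illustrated with a toy example. The sketch below is not the authors' compiler or the OSCAR runtime; it is a minimal Python analogy under stated assumptions: two hypothetical loops (`b[i] = 2 * a[i]` followed by `c[i] = b[i] + 1`), an equal-chunk split standing in for aligned decomposition, a local list standing in for processor-local memory, and a thread pool standing in for run-time assignment of macrotasks. All function names are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n, parts):
    """Split range(n) into `parts` contiguous, nearly equal (start, stop) chunks."""
    base, extra = divmod(n, parts)
    chunks, start = [], 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def fused_macrotask(a, start, stop):
    """Run the matching pieces of both loops back to back on one worker."""
    # Piece of loop 1: b[i] = 2 * a[i]. b_local stands in for local memory,
    # so the intermediate array b is never written to shared storage.
    b_local = [2 * a[i] for i in range(start, stop)]
    # Piece of loop 2: c[i] = b[i] + 1, reading only the local copy of b.
    c_local = [b + 1 for b in b_local]
    return start, c_local


def run_localized(a, parts=4, workers=2):
    """Decompose both loops over the same chunks, fuse them, and run the tasks."""
    c = [0] * len(a)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(fused_macrotask, a, start, stop)
            for start, stop in split_range(len(a), parts)
        ]
        for fut in futures:
            start, c_local = fut.result()
            # Only the final result c is written back to the shared list.
            c[start:start + len(c_local)] = c_local
    return c


if __name__ == "__main__":
    print(run_localized(list(range(10)), parts=3, workers=2))
```

Because both loops are split over the same index ranges, each piece of the second loop depends only on the matching piece of the first, which is what makes fusing the two pieces into one task safe in this toy case. Loops with shifted or overlapping accesses would need more careful alignment than this sketch attempts.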

    Original language: English
    Title of host publication: IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Proceedings
    Place of Publication: Piscataway, NJ, United States
    Publisher: IEEE
    Pages: 135-140
    Number of pages: 6
    Publication status: Published - 1995
    Event: Proceedings of the 1995 IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Victoria, BC, Canada
    Duration: 1995 May 17 to 1995 May 19

    Other

    Other: Proceedings of the 1995 IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing
    City: Victoria, BC, Canada
    Period: 95/5/17 to 95/5/19

    ASJC Scopus subject areas

    • Signal Processing

    Cite this

    Yoshida, A., Maeda, S., Fujimoto, K., & Kasahara, H. (1995). Data-localization scheme using task-fusion for macro-dataflow computation. In IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Proceedings (pp. 135-140). Piscataway, NJ, United States: IEEE.

    @inproceedings{a9fb97f06b9d4763997bd313859a58c3,
    title = "Data-localization scheme using task-fusion for macro-dataflow computation",
    abstract = "This paper proposes a data-localization scheme for macro-dataflow computation, in which coarse-grain tasks such as loops, subroutines, and basic blocks in a Fortran program are dynamically scheduled onto processors and executed in parallel. The scheme reduces the overhead of transferring data through centralized shared memory by using each processor's local memory to pass shared data among coarse-grain tasks, especially loops. At compile time, it decomposes multiple loops that have data dependences among them, using the loop-aligned-decomposition method, so that data can be localized. It then fuses decomposed loops that would otherwise require a large amount of data transfer among them into a single macrotask, which is assigned to a processor at run time. The scheme has been implemented on OSCAR, an actual multiprocessor system that has centralized shared memory and distributed shared memory in addition to local memory on each processor. Performance evaluation on OSCAR shows that the proposed scheme can reduce execution time by 21{\%}.",
    author = "Akimasa Yoshida and Seiji Maeda and Kensaku Fujimoto and Hironori Kasahara",
    year = "1995",
    language = "English",
    pages = "135--140",
    booktitle = "IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Proceedings",
    publisher = "IEEE",

    }

    TY - GEN

    T1 - Data-localization scheme using task-fusion for macro-dataflow computation

    AU - Yoshida, Akimasa

    AU - Maeda, Seiji

    AU - Fujimoto, Kensaku

    AU - Kasahara, Hironori

    PY - 1995

    Y1 - 1995

    N2 - This paper proposes a data-localization scheme for macro-dataflow computation, in which coarse-grain tasks such as loops, subroutines, and basic blocks in a Fortran program are dynamically scheduled onto processors and executed in parallel. The scheme reduces the overhead of transferring data through centralized shared memory by using each processor's local memory to pass shared data among coarse-grain tasks, especially loops. At compile time, it decomposes multiple loops that have data dependences among them, using the loop-aligned-decomposition method, so that data can be localized. It then fuses decomposed loops that would otherwise require a large amount of data transfer among them into a single macrotask, which is assigned to a processor at run time. The scheme has been implemented on OSCAR, an actual multiprocessor system that has centralized shared memory and distributed shared memory in addition to local memory on each processor. Performance evaluation on OSCAR shows that the proposed scheme can reduce execution time by 21%.

    UR - http://www.scopus.com/inward/record.url?scp=0029230364&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=0029230364&partnerID=8YFLogxK

    M3 - Conference contribution

    AN - SCOPUS:0029230364

    SP - 135

    EP - 140

    BT - IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Proceedings

    PB - IEEE

    CY - Piscataway, NJ, United States

    ER -