Data localization using loop aligned decomposition for macro-dataflow processing

Akimasa Yoshida, Hironori Kasahara

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    This paper proposes a data-localization compilation scheme for Fortran macro-dataflow processing on a multiprocessor system with local memory and centralized shared memory. The scheme minimizes the overhead of passing shared data among coarse-grain tasks, which are composed of Doall loops and sequential loops, by making effective use of the local memory on each processor. First, the compiler partitions coarse-grain tasks, such as loops with data dependences among them, together with their data into multiple groups by loop aligned decomposition so that data transfer among groups is minimized. Second, it generates a dynamic scheduling routine that assigns the decomposed tasks in a group to the same processor at run time. Third, it generates parallel machine code that passes shared data within a group through local memory. The compiler has been implemented for the OSCAR multiprocessor system, which has centralized shared memory and distributed shared memory in addition to local memory on each processor. Performance evaluation on OSCAR shows that macro-dataflow processing with the proposed data-localization scheme reduces execution time by 10% to 20% on average compared with macro-dataflow processing without data localization.
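The three steps in the abstract can be illustrated with a small simulation. This is only a minimal sketch under strong assumptions, not the compiler described in the paper: it assumes two loops over the same index range where the second loop reads only the element `a(i)` written by the first, and all names (`aligned_decompose`, `build_tasks`, `schedule`, `run`) are hypothetical. The real scheme targets Fortran and handles more general dependences.

```python
def aligned_decompose(n_iters, n_groups):
    """Split the iteration range [0, n_iters) into n_groups contiguous chunks."""
    base, extra = divmod(n_iters, n_groups)
    chunks, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def build_tasks(loop_names, chunks):
    """Create one task per (loop, chunk); tasks sharing a chunk index form a group."""
    return [(name, g, chunk) for name in loop_names for g, chunk in enumerate(chunks)]


def schedule(tasks, n_procs):
    """Pin each group to the least-loaded processor the first time it is seen."""
    group_to_proc = {}
    load = [0] * n_procs
    assignment = {}
    for name, g, chunk in tasks:
        if g not in group_to_proc:
            group_to_proc[g] = min(range(n_procs), key=lambda p: load[p])
        p = group_to_proc[g]
        load[p] += len(chunk)
        assignment[(name, g)] = p
    return assignment


def run(n_iters=10, n_groups=4, n_procs=2):
    """Simulate loop1: a(i) = 2*i and loop2: b(i) = a(i) + 1 with per-processor local memory."""
    chunks = aligned_decompose(n_iters, n_groups)
    tasks = build_tasks(["loop1", "loop2"], chunks)
    assignment = schedule(tasks, n_procs)
    local_mem = [dict() for _ in range(n_procs)]
    b = [None] * n_iters
    remote_reads = 0
    for name, g, chunk in tasks:
        p = assignment[(name, g)]
        for i in chunk:
            if name == "loop1":
                local_mem[p][i] = 2 * i  # producer keeps a(i) in local memory
            else:
                a_i = local_mem[p].get(i)
                if a_i is None:  # value lives on another processor
                    remote_reads += 1
                    a_i = next(m[i] for m in local_mem if i in m)
                b[i] = a_i + 1
    return b, remote_reads, assignment


if __name__ == "__main__":
    b, remote_reads, assignment = run()
    print(b)
    print("remote reads:", remote_reads)
```

Because the chunks of both loops are aligned and each group stays on one processor, every read in the second loop is served from the local memory of the processor that produced the value, so the simulated remote-read count stays at zero in this simplified setting.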

    Original language: English
    Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Publisher: Springer Verlag
    Pages: 57-74
    Number of pages: 18
    Volume: 1239
    ISBN (Print): 3540630910, 9783540630913
    DOIs: https://doi.org/10.1007/BFb0017245
    Publication status: Published - 1997
    Event: 9th International Workshop on Languages and Compilers for Parallel Computing, LCPC 1996 - San Jose, United States
    Duration: 1996 Aug 8 to 1996 Aug 10

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 1239
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Other

    Other: 9th International Workshop on Languages and Compilers for Parallel Computing, LCPC 1996
    Country: United States
    City: San Jose
    Period: 96/8/8 to 96/8/10

    Fingerprint

    Data Flow
    Macros
    Decomposition
    Data storage equipment
    Decompose
    Processing
    Multiprocessor Systems
    Data Transfer
    Shared Memory
    Compiler
    Distributed Shared Memory
    Data Dependence
    Dynamic Scheduling
    Parallel Machines
    Compilation
    Execution Time
    Assign
    Performance Evaluation
    Computer systems

    ASJC Scopus subject areas

    • Computer Science (all)
    • Theoretical Computer Science

    Cite this

    Yoshida, A., & Kasahara, H. (1997). Data localization using loop aligned decomposition for macro-dataflow processing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1239, pp. 57-74). Springer Verlag. https://doi.org/10.1007/BFb0017245
