Automatic coarse grain task parallel processing on SMP using openMP

Hironori Kasahara, Motoki Obata, Kazuhisa Ishizaka

    Research output: Chapter in Book/Report/Conference proceedingConference contribution

    10 Citations (Scopus)

    Abstract

    This paper proposes a simple and efficient implementation method for a hierarchical coarse grain task parallel processing scheme on a SMP machine. OSCAR multigrain parallelizing compiler automatically generates parallelized code including OpenMP directives and its performance is evaluated on a commercial SMP machine. The coarse grain task parallel processing is important to improve the effective performance of wide range of multiprocessor systems from a single chip multiprocessor to a high performance computer beyond the limit of the loop parallelism. The proposed scheme decomposes a Fortran program into coarse grain tasks, analyzes parallelism among tasks by “Earliest Executable Condition Analysis” considering control and data dependencies, statically schedules the coarse grain tasks to threads or generates dynamic task scheduling codes to assign the tasks to threads and generates OpenMP Fortran source code for a SMP machine. The thread parallel code using OpenMP generated by OSCAR compiler forks threads only once at the beginning of the program and joins only once at the end even though the program is processed in parallel based on hierarchical coarse grain task parallel processing concept. The performance of the scheme is evaluated on 8-processor SMP machine, IBM RS6000 SP 604e High Node, using a newly developed OpenMP backend of OSCAR multigrain compiler. The evaluation shows that OSCAR compiler with IBM XL Fortran compiler version 5.1 gives us 1.5 to 3 times larger speedup than the native XL Fortran compiler for SPEC 95fp SWIM, TOMCATV, HYDRO2D, MGRID and Perfect Benchmarks ARC2D.

    Original languageEnglish
    Title of host publicationLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    PublisherSpringer Verlag
    Pages189-207
    Number of pages19
    Volume2017
    ISBN (Print)3540428623, 9783540455745
    DOIs
    Publication statusPublished - 2001
    Event13th International Workshop on Languages and Compilers for Parallel Computing, LCPC 2000 - Yorktown Heights, United States
    Duration: 2000 Aug 102000 Aug 12

    Publication series

    NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume2017
    ISSN (Print)03029743
    ISSN (Electronic)16113349

    Other

    Other13th International Workshop on Languages and Compilers for Parallel Computing, LCPC 2000
    CountryUnited States
    CityYorktown Heights
    Period00/8/1000/8/12

    Fingerprint

    OpenMP
    Parallel Processing
    Compiler
    Thread
    Processing
    Parallelism
    Parallelizing Compilers
    Data Dependency
    Chip multiprocessors
    Dynamic Scheduling
    Task Scheduling
    Multiprocessor Systems
    Efficient Implementation
    Join
    Assign
    Schedule
    Speedup
    High Performance
    Scheduling
    Benchmark

    ASJC Scopus subject areas

    • Computer Science(all)
    • Theoretical Computer Science

    Cite this

    Kasahara, H., Obata, M., & Ishizaka, K. (2001). Automatic coarse grain task parallel processing on SMP using openMP. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2017, pp. 189-207). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 2017). Springer Verlag. https://doi.org/10.1007/3-540-45574-4_13

    Automatic coarse grain task parallel processing on SMP using openMP. / Kasahara, Hironori; Obata, Motoki; Ishizaka, Kazuhisa.

    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 2017 Springer Verlag, 2001. p. 189-207 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 2017).

    Research output: Chapter in Book/Report/Conference proceedingConference contribution

    Kasahara, H, Obata, M & Ishizaka, K 2001, Automatic coarse grain task parallel processing on SMP using openMP. in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). vol. 2017, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 2017, Springer Verlag, pp. 189-207, 13th International Workshop on Languages and Compilers for Parallel Computing, LCPC 2000, Yorktown Heights, United States, 00/8/10. https://doi.org/10.1007/3-540-45574-4_13
    Kasahara H, Obata M, Ishizaka K. Automatic coarse grain task parallel processing on SMP using openMP. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 2017. Springer Verlag. 2001. p. 189-207. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/3-540-45574-4_13
    Kasahara, Hironori ; Obata, Motoki ; Ishizaka, Kazuhisa. / Automatic coarse grain task parallel processing on SMP using openMP. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Vol. 2017 Springer Verlag, 2001. pp. 189-207 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
    @inproceedings{f2577b7aa3f54f39b14e44bbb9e67f59,
    title = "Automatic coarse grain task parallel processing on SMP using openMP",
    abstract = "This paper proposes a simple and efficient implementation method for a hierarchical coarse grain task parallel processing scheme on a SMP machine. OSCAR multigrain parallelizing compiler automatically generates parallelized code including OpenMP directives and its performance is evaluated on a commercial SMP machine. The coarse grain task parallel processing is important to improve the effective performance of wide range of multiprocessor systems from a single chip multiprocessor to a high performance computer beyond the limit of the loop parallelism. The proposed scheme decomposes a Fortran program into coarse grain tasks, analyzes parallelism among tasks by “Earliest Executable Condition Analysis” considering control and data dependencies, statically schedules the coarse grain tasks to threads or generates dynamic task scheduling codes to assign the tasks to threads and generates OpenMP Fortran source code for a SMP machine. The thread parallel code using OpenMP generated by OSCAR compiler forks threads only once at the beginning of the program and joins only once at the end even though the program is processed in parallel based on hierarchical coarse grain task parallel processing concept. The performance of the scheme is evaluated on 8-processor SMP machine, IBM RS6000 SP 604e High Node, using a newly developed OpenMP backend of OSCAR multigrain compiler. The evaluation shows that OSCAR compiler with IBM XL Fortran compiler version 5.1 gives us 1.5 to 3 times larger speedup than the native XL Fortran compiler for SPEC 95fp SWIM, TOMCATV, HYDRO2D, MGRID and Perfect Benchmarks ARC2D.",
    author = "Hironori Kasahara and Motoki Obata and Kazuhisa Ishizaka",
    year = "2001",
    doi = "10.1007/3-540-45574-4_13",
    language = "English",
    isbn = "3540428623",
    volume = "2017",
    series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
    publisher = "Springer Verlag",
    pages = "189--207",
    booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

    }

    TY - GEN

    T1 - Automatic coarse grain task parallel processing on SMP using openMP

    AU - Kasahara, Hironori

    AU - Obata, Motoki

    AU - Ishizaka, Kazuhisa

    PY - 2001

    Y1 - 2001

    N2 - This paper proposes a simple and efficient implementation method for a hierarchical coarse grain task parallel processing scheme on a SMP machine. OSCAR multigrain parallelizing compiler automatically generates parallelized code including OpenMP directives and its performance is evaluated on a commercial SMP machine. The coarse grain task parallel processing is important to improve the effective performance of wide range of multiprocessor systems from a single chip multiprocessor to a high performance computer beyond the limit of the loop parallelism. The proposed scheme decomposes a Fortran program into coarse grain tasks, analyzes parallelism among tasks by “Earliest Executable Condition Analysis” considering control and data dependencies, statically schedules the coarse grain tasks to threads or generates dynamic task scheduling codes to assign the tasks to threads and generates OpenMP Fortran source code for a SMP machine. The thread parallel code using OpenMP generated by OSCAR compiler forks threads only once at the beginning of the program and joins only once at the end even though the program is processed in parallel based on hierarchical coarse grain task parallel processing concept. The performance of the scheme is evaluated on 8-processor SMP machine, IBM RS6000 SP 604e High Node, using a newly developed OpenMP backend of OSCAR multigrain compiler. The evaluation shows that OSCAR compiler with IBM XL Fortran compiler version 5.1 gives us 1.5 to 3 times larger speedup than the native XL Fortran compiler for SPEC 95fp SWIM, TOMCATV, HYDRO2D, MGRID and Perfect Benchmarks ARC2D.

    AB - This paper proposes a simple and efficient implementation method for a hierarchical coarse grain task parallel processing scheme on a SMP machine. OSCAR multigrain parallelizing compiler automatically generates parallelized code including OpenMP directives and its performance is evaluated on a commercial SMP machine. The coarse grain task parallel processing is important to improve the effective performance of wide range of multiprocessor systems from a single chip multiprocessor to a high performance computer beyond the limit of the loop parallelism. The proposed scheme decomposes a Fortran program into coarse grain tasks, analyzes parallelism among tasks by “Earliest Executable Condition Analysis” considering control and data dependencies, statically schedules the coarse grain tasks to threads or generates dynamic task scheduling codes to assign the tasks to threads and generates OpenMP Fortran source code for a SMP machine. The thread parallel code using OpenMP generated by OSCAR compiler forks threads only once at the beginning of the program and joins only once at the end even though the program is processed in parallel based on hierarchical coarse grain task parallel processing concept. The performance of the scheme is evaluated on 8-processor SMP machine, IBM RS6000 SP 604e High Node, using a newly developed OpenMP backend of OSCAR multigrain compiler. The evaluation shows that OSCAR compiler with IBM XL Fortran compiler version 5.1 gives us 1.5 to 3 times larger speedup than the native XL Fortran compiler for SPEC 95fp SWIM, TOMCATV, HYDRO2D, MGRID and Perfect Benchmarks ARC2D.

    UR - http://www.scopus.com/inward/record.url?scp=77954405219&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=77954405219&partnerID=8YFLogxK

    U2 - 10.1007/3-540-45574-4_13

    DO - 10.1007/3-540-45574-4_13

    M3 - Conference contribution

    AN - SCOPUS:77954405219

    SN - 3540428623

    SN - 9783540455745

    VL - 2017

    T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

    SP - 189

    EP - 207

    BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

    PB - Springer Verlag

    ER -