Static Coarse Grain Task Scheduling with Cache Optimization Using OpenMP

Hirofumi Nakano, Kazuhisa Ishizaka, Motoki Obata, Keiji Kimura, Hironori Kasahara

    Research output: Contribution to journal (Article)

    2 Citations (Scopus)

    Abstract

    Effective use of cache memory is becoming more important as the gap between processor speed and memory access speed widens. Use of multigrain parallelism is also becoming more important for improving effective performance beyond the limits of loop-iteration-level parallelism. Considering these factors, this paper proposes a static scheduling scheme for coarse grain tasks that takes cache optimization into account. After task and data decomposition that considers the cache size at compile time, the proposed scheme schedules coarse grain tasks to threads so that data shared among coarse grain tasks can be passed via the cache. It is implemented in the OSCAR Fortran multigrain parallelizing compiler and evaluated on a Sun Ultra80 four-processor SMP workstation using Swim and Tomcatv from SPECfp95. On 4 processors, the proposed scheme gives speedups of 4.56 times for Swim and 2.37 times for Tomcatv against the Sun Forte HPC Ver. 6 update 1 loop parallelizing compiler.
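
The idea in the abstract can be illustrated with a minimal sketch. It is not the OSCAR compiler's algorithm. It assumes a sequence of loops that all sweep the same data and whose dependences stay within aligned partitions, so that partition p of one loop only feeds partition p of the next. The function names, the round-robin assignment, and the sizes in the usage note are hypothetical.

```python
from math import ceil


def decompose(num_loops, data_bytes, cache_bytes, num_threads):
    """Split every loop into equal partitions whose data fits in one cache.

    Assumption: all loops touch the same data_bytes of shared data.
    The partition count is rounded up to a multiple of num_threads so the
    static assignment below stays balanced.
    """
    parts = max(1, ceil(data_bytes / cache_bytes))
    parts = ceil(parts / num_threads) * num_threads
    # A coarse grain task is identified here by (loop index, partition index).
    return [[(loop, part) for part in range(parts)] for loop in range(num_loops)]


def static_schedule(tasks_per_loop, num_threads):
    """Assign partition p of every loop to the same thread.

    Tasks that share partition p run back to back on one thread, so the
    partition written by one loop can still be in that processor's cache
    when the next loop reads it.
    """
    parts = len(tasks_per_loop[0])
    schedule = {thread: [] for thread in range(num_threads)}
    for part in range(parts):
        thread = part % num_threads  # hypothetical round-robin placement
        for loop_tasks in tasks_per_loop:
            schedule[thread].append(loop_tasks[part])
    return schedule
```

For example, `static_schedule(decompose(3, 16 * 2**20, 4 * 2**20, 2), 2)` splits three loops over 16 MiB of shared data into four cache-sized partitions and gives each of two threads two groups of three consecutive tasks. Because the assignment is fixed before execution, each thread's list could be emitted as straight-line code inside a single OpenMP parallel region.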

    Original language: English
    Pages (from-to): 211-223
    Number of pages: 13
    Journal: International Journal of Parallel Programming
    Volume: 31
    Issue number: 3
    DOI: 10.1023/A:1023038702472
    Publication status: Published - Jun 2003

    Keywords

    • Cache optimization
    • Coarse grain task parallelization
    • OpenMP
    • Scheduling algorithm

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • Computational Theory and Mathematics

    Cite this

    Static Coarse Grain Task Scheduling with Cache Optimization Using OpenMP. / Nakano, Hirofumi; Ishizaka, Kazuhisa; Obata, Motoki; Kimura, Keiji; Kasahara, Hironori.

    In: International Journal of Parallel Programming, Vol. 31, No. 3, 06.2003, p. 211-223.

    @article{b9d81f41a30b4521a7f6a253256f2e96,
    title = "Static Coarse Grain Task Scheduling with Cache Optimization Using OpenMP",
    abstract = "Effective use of cache memory is becoming more important as the gap between processor speed and memory access speed widens. Use of multigrain parallelism is also becoming more important for improving effective performance beyond the limits of loop-iteration-level parallelism. Considering these factors, this paper proposes a static scheduling scheme for coarse grain tasks that takes cache optimization into account. After task and data decomposition that considers the cache size at compile time, the proposed scheme schedules coarse grain tasks to threads so that data shared among coarse grain tasks can be passed via the cache. It is implemented in the OSCAR Fortran multigrain parallelizing compiler and evaluated on a Sun Ultra80 four-processor SMP workstation using Swim and Tomcatv from SPECfp95. On 4 processors, the proposed scheme gives speedups of 4.56 times for Swim and 2.37 times for Tomcatv against the Sun Forte HPC Ver. 6 update 1 loop parallelizing compiler.",
    keywords = "Cache optimization, Coarse grain task parallelization, OpenMP, Scheduling algorithm",
    author = "Hirofumi Nakano and Kazuhisa Ishizaka and Motoki Obata and Keiji Kimura and Hironori Kasahara",
    year = "2003",
    month = "6",
    doi = "10.1023/A:1023038702472",
    language = "English",
    volume = "31",
    pages = "211--223",
    journal = "International Journal of Parallel Programming",
    issn = "0885-7458",
    publisher = "Springer New York",
    number = "3",

    }

    TY - JOUR

    T1 - Static Coarse Grain Task Scheduling with Cache Optimization Using OpenMP

    AU - Nakano, Hirofumi

    AU - Ishizaka, Kazuhisa

    AU - Obata, Motoki

    AU - Kimura, Keiji

    AU - Kasahara, Hironori

    PY - 2003/6

    Y1 - 2003/6

    AB - Effective use of cache memory is becoming more important as the gap between processor speed and memory access speed widens. Use of multigrain parallelism is also becoming more important for improving effective performance beyond the limits of loop-iteration-level parallelism. Considering these factors, this paper proposes a static scheduling scheme for coarse grain tasks that takes cache optimization into account. After task and data decomposition that considers the cache size at compile time, the proposed scheme schedules coarse grain tasks to threads so that data shared among coarse grain tasks can be passed via the cache. It is implemented in the OSCAR Fortran multigrain parallelizing compiler and evaluated on a Sun Ultra80 four-processor SMP workstation using Swim and Tomcatv from SPECfp95. On 4 processors, the proposed scheme gives speedups of 4.56 times for Swim and 2.37 times for Tomcatv against the Sun Forte HPC Ver. 6 update 1 loop parallelizing compiler.

    KW - Cache optimization

    KW - Coarse grain task parallelization

    KW - OpenMP

    KW - Scheduling algorithm

    UR - http://www.scopus.com/inward/record.url?scp=0346502797&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=0346502797&partnerID=8YFLogxK

    U2 - 10.1023/A:1023038702472

    DO - 10.1023/A:1023038702472

    M3 - Article

    VL - 31

    SP - 211

    EP - 223

    JO - International Journal of Parallel Programming

    JF - International Journal of Parallel Programming

    SN - 0885-7458

    IS - 3

    ER -