Performance of OSCAR multigrain parallelizing compiler on SMP servers

Kazuhisa Ishizaka, Takamichi Miyamoto, Jun Shirako, Motoki Obata, Keiji Kimura, Hironori Kasahara

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

    5 Citations (Scopus)

    Abstract

    This paper describes the performance of the OSCAR multigrain parallelizing compiler on various SMP servers: IBM pSeries 690, Sun Fire V880, Sun Ultra 80, NEC TX7/i6010 and SGI Altix 3700. In addition to loop parallelism, the OSCAR compiler hierarchically exploits coarse grain task parallelism among loops, subroutines and basic blocks, and near fine grain parallelism among statements inside a basic block. It also performs global cache optimization over different loops, or coarse grain tasks, based on a data localization technique with inter-array padding to reduce memory access overhead. The current performance of the OSCAR compiler is evaluated on the above SMP servers. For example, the OSCAR compiler, generating OpenMP parallelized programs from ordinary sequential Fortran programs, gives a 5.7 times speedup on average across seven SPEC CFP95 programs (tomcatv, swim, su2cor, hydro2d, mgrid, applu and turb3d) compared with IBM XL Fortran compiler 8.1 on an IBM pSeries 690 24-processor SMP server. It also gives a 2.6 times speedup compared with Intel Fortran Itanium Compiler 7.1 on an SGI Altix 3700 Itanium 2 16-processor server, a 1.7 times speedup compared with NEC Fortran Itanium Compiler 3.4 on an NEC TX7/i6010 Itanium 2 8-processor server, a 2.5 times speedup compared with Sun Forte 7.0 on a Sun Ultra 80 UltraSPARC II 4-processor desktop workstation, and a 2.1 times speedup compared with Sun Forte compiler 7.1 on a Sun Fire V880 UltraSPARC III Cu 8-processor server.
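The idea behind inter-array padding mentioned in the abstract can be illustrated with a toy model. The sketch below is not the OSCAR compiler's algorithm: it assumes a hypothetical direct-mapped cache, and the cache size, line size, array size and helper names are all invented for illustration. It only shows why two arrays laid out back to back can collide in the cache, and how a small pad between them removes the collisions.

```python
# Toy model of inter-array padding. All sizes and names are illustrative
# assumptions; this is not taken from the OSCAR compiler.
CACHE_BYTES = 4 * 1024 * 1024   # assumed direct-mapped cache capacity
LINE_BYTES = 64                 # assumed cache line size
NUM_LINES = CACHE_BYTES // LINE_BYTES


def cache_line(addr):
    """Index of the cache line that byte address `addr` maps to."""
    return (addr // LINE_BYTES) % NUM_LINES


def layout(array_bytes, pad_bytes):
    """Base addresses of two arrays placed back to back, separated by a pad."""
    base_a = 0
    base_b = base_a + array_bytes + pad_bytes
    return base_a, base_b


def conflicting_accesses(base_a, base_b, n_elems, elem_bytes=8):
    """Count indices i for which a[i] and b[i] map to the same cache line."""
    return sum(
        cache_line(base_a + i * elem_bytes) == cache_line(base_b + i * elem_bytes)
        for i in range(n_elems)
    )


N = 4096                   # number of leading elements sampled
ARRAY_BYTES = CACHE_BYTES  # worst case: each array is exactly one cache size

# Without padding, a[i] and b[i] always land on the same cache line.
unpadded = conflicting_accesses(*layout(ARRAY_BYTES, 0), N)
# A pad of one cache line shifts b so that a[i] and b[i] no longer collide.
padded = conflicting_accesses(*layout(ARRAY_BYTES, LINE_BYTES), N)

print(f"conflicts without padding: {unpadded}")
print(f"conflicts with padding:    {padded}")
```

In this model a loop that reads `a[i]` and `b[i]` together would evict one array's line on every access to the other unless the pad is present. Real caches are usually set-associative, so the effect on actual hardware is less extreme than this sketch suggests.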

    Original language: English
    Title of host publication: Lecture Notes in Computer Science
    Editors: R. Eigenmann, Z. Li, S.P. Midkiff
    Pages: 319-331
    Number of pages: 13
    Volume: 3602
    Publication status: Published - 2005
    Event: 17th International Workshop on Languages and Compilers for High Performance Computing, LCPC 2004 - West Lafayette, IN, United States
    Duration: 2004 Sep 22 to 2004 Sep 24

    Fingerprint

    Sun
    Servers
    Fires
    Subroutines
    Data storage equipment

    ASJC Scopus subject areas

    • Computer Science (miscellaneous)

    Cite this

    Ishizaka, K., Miyamoto, T., Shirako, J., Obata, M., Kimura, K., & Kasahara, H. (2005). Performance of OSCAR multigrain parallelizing compiler on SMP servers. In R. Eigenmann, Z. Li, & S. P. Midkiff (Eds.), Lecture Notes in Computer Science (Vol. 3602, pp. 319-331)

    @inproceedings{8b92709482f04761a8f66d69a1cc9951,
    title = "Performance of OSCAR multigrain parallelizing compiler on SMP servers",
    author = "Kazuhisa Ishizaka and Takamichi Miyamoto and Jun Shirako and Motoki Obata and Keiji Kimura and Hironori Kasahara",
    year = "2005",
    language = "English",
    volume = "3602",
    pages = "319--331",
    editor = "R. Eigenmann and Z. Li and S.P. Midkiff",
    booktitle = "Lecture Notes in Computer Science",

    }

    TY - GEN

    T1 - Performance of OSCAR multigrain parallelizing compiler on SMP servers

    AU - Ishizaka, Kazuhisa

    AU - Miyamoto, Takamichi

    AU - Shirako, Jun

    AU - Obata, Motoki

    AU - Kimura, Keiji

    AU - Kasahara, Hironori

    PY - 2005

    Y1 - 2005

    UR - http://www.scopus.com/inward/record.url?scp=26444616618&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=26444616618&partnerID=8YFLogxK

    M3 - Conference contribution

    AN - SCOPUS:26444616618

    VL - 3602

    SP - 319

    EP - 331

    BT - Lecture Notes in Computer Science

    A2 - Eigenmann, R.

    A2 - Li, Z.

    A2 - Midkiff, S.P.

    ER -