Multigrain parallel processing on OSCAR CMP

Keiji Kimura, Takeshi Kodaka, Motoki Obata, Hironori Kasahara

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    6 Citations (Scopus)

    Abstract

    The instruction-level parallelism (ILP) approach, long used by superscalar and VLIW processors, appears to be reaching the limit of the performance improvement it can deliver. To obtain scalable performance improvement, cost effectiveness, and high productivity even in the era of one billion transistors, cooperation between software and hardware is becoming increasingly important. For this reason, the authors have developed the OSCAR (Optimally Scheduled Advanced multiprocessoR) Chip Multiprocessor (OSCAR CMP) and the OSCAR multigrain compiler together. To preserve scalability in the future, OSCAR CMP has mechanisms for efficient use of parallelism and data locality and for hiding data transfer overhead. These mechanisms can be fully controlled by the OSCAR multigrain compiler. This paper focuses on multigrain parallel processing on OSCAR CMP, which exploits loop-iteration-level parallelism and coarse-grain task parallelism, in addition to ILP, across an entire program. The performance of multigrain parallel processing on the OSCAR CMP architecture is evaluated using the SPEC fp 2000/95 benchmark suite. With a microSPARC-like single-issue core, OSCAR CMP achieves speedups of 1.77 to 3.96 times on four processors relative to a single processor. OSCAR CMP is also compared with a Sun UltraSPARC II-like processor to evaluate cost effectiveness: OSCAR CMP delivers 1.66 times better performance on average when OSCAR CMP and UltraSPARC II are built from almost the same number of transistors.
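The four-processor speedups quoted in the abstract can be read as parallel efficiency, that is, speedup divided by processor count. The sketch below is only an illustration of that arithmetic: the speedup values come from the abstract, while the helper name `parallel_efficiency` and the dictionary labels are assumptions introduced here, not anything taken from the paper.

```python
# Parallel efficiency = speedup / number of processors.
# The speedup figures are the ones quoted in the abstract;
# the function name and labels are hypothetical, chosen for illustration.

def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return the fraction of ideal linear speedup that was achieved."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors

PROCESSORS = 4
REPORTED_SPEEDUPS = {"lowest": 1.77, "highest": 3.96}

# Map each reported speedup to its efficiency on four processors.
efficiencies = {
    label: parallel_efficiency(speedup, PROCESSORS)
    for label, speedup in REPORTED_SPEEDUPS.items()
}

for label, eff in efficiencies.items():
    print(f"{label} reported speedup: efficiency {eff:.2f}")
```

Under these figures the efficiency on four processors ranges from roughly 0.44 to 0.99, so the best case is close to linear scaling.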

    Original language: English
    Title of host publication: Proceedings of the Innovative Architecture for Future Generation High-Performance Processors and Systems
    Publisher: IEEE Computer Society
    Pages: 56-65
    Number of pages: 10
    Volume: 2003-January
    ISBN (Print): 0769520197
    DOIs: https://doi.org/10.1109/IWIA.2003.1262783
    Publication status: Published - 2003
    Event: Innovative Architecture for Future Generation High-Performance Processors and Systems, IWIA 2003 - Kauai, United States
    Duration: 2003 Jul 27 → …

    Other

    Other: Innovative Architecture for Future Generation High-Performance Processors and Systems, IWIA 2003
    Country: United States
    City: Kauai
    Period: 03/7/27 → …

    Fingerprint

    Cost effectiveness
    Transistors
    Data transfer
    Processing
    Sun
    Scalability
    Productivity
    Hardware

    ASJC Scopus subject areas

    • Hardware and Architecture

    Cite this

    Kimura, K., Kodaka, T., Obata, M., & Kasahara, H. (2003). Multigrain parallel processing on OSCAR CMP. In Proceedings of the Innovative Architecture for Future Generation High-Performance Processors and Systems (Vol. 2003-January, pp. 56-65). [1262783] IEEE Computer Society. https://doi.org/10.1109/IWIA.2003.1262783

    @inproceedings{b7442a0d0aec47d08fb2d78387fdcdf5,
    title = "Multigrain parallel processing on OSCAR CMP",
    author = "Keiji Kimura and Takeshi Kodaka and Motoki Obata and Hironori Kasahara",
    year = "2003",
    doi = "10.1109/IWIA.2003.1262783",
    language = "English",
    isbn = "0769520197",
    volume = "2003-January",
    pages = "56--65",
    booktitle = "Proceedings of the Innovative Architecture for Future Generation High-Performance Processors and Systems",
    publisher = "IEEE Computer Society",

    }

    TY - GEN

    T1 - Multigrain parallel processing on OSCAR CMP

    AU - Kimura, Keiji

    AU - Kodaka, Takeshi

    AU - Obata, Motoki

    AU - Kasahara, Hironori

    PY - 2003

    Y1 - 2003

    UR - http://www.scopus.com/inward/record.url?scp=13444275328&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=13444275328&partnerID=8YFLogxK

    U2 - 10.1109/IWIA.2003.1262783

    DO - 10.1109/IWIA.2003.1262783

    M3 - Conference contribution

    AN - SCOPUS:13444275328

    SN - 0769520197

    VL - 2003-January

    SP - 56

    EP - 65

    BT - Proceedings of the Innovative Architecture for Future Generation High-Performance Processors and Systems

    PB - IEEE Computer Society

    ER -