Green multicore-SoC software-execution framework with timely-power-gating scheme

Masafumi Onouchi, Keisuke Toyama, Toru Nojiri, Makoto Sato, Masayoshi Mase, Jun Shirako, Mikiko Sato, Masashi Takada, Masayuki Ito, Hiroyuki Mizuno, Mitaro Namiki, Keiji Kimura, Hironori Kasahara

    Research output: Conference contribution

    1 Citation (Scopus)

    Abstract

    We are developing a software-execution framework based on an octo-core chip multiprocessor named RP2 and an automatic multigrain-parallelizing compiler named OSCAR. The main purpose of this framework is to maintain good speed scalability and power efficiency as the number of processor cores increases, under severe hardware restrictions for embedded use. The key to speed scalability is reducing the communication overhead among parallelized tasks. A data-categorization scheme enables low-overhead cache-coherency maintenance by using directives and instructions from the compiler. In this scheme, the number of cache flushes is minimized, and parallelized tasks are quickly synchronized by using flags in local memory. As regards power efficiency, the power supply to processor cores waiting for other cores is cut off promptly and frequently, even in the middle of an application, by using a timely-power-gating scheme. In this scheme, to achieve quick mode transition between "NORMAL" mode and "RESUME POWER-OFF" mode, register values of the processor core are stored in core-local memory, which remains active even in "RESUME POWER-OFF" mode and can be accessed in one or two clock cycles. Measured speed and power of an application show good speed scalability in execution time and high power efficiency simultaneously. In the case of a secure AAC-LC encoding program, execution speed with eight processor cores is 4.85 times that of sequential execution. Moreover, power consumption under the same condition can be reduced by 51.0% by parallelizing and timely power gating. The time for mode transition is less than 20 μs, which is only 2.5% of the "RESUME POWER-OFF" period.
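The headline figures in the abstract can be put in perspective with a little arithmetic. The sketch below is not taken from the paper: the function names are hypothetical, and it assumes that the 2.5% share corresponds to the 20 μs upper bound on the transition time, so the resulting period is only a rough estimate.

```python
# Figures quoted in the abstract (secure AAC-LC encoder on the octo-core RP2).
CORES = 8
SPEEDUP = 4.85               # eight cores vs. sequential execution
TRANSITION_US = 20.0         # upper bound on mode-transition time, microseconds
TRANSITION_FRACTION = 0.025  # transition time as a share of the power-off period


def parallel_efficiency(speedup: float, cores: int) -> float:
    """Speedup per core; 1.0 would mean perfectly linear scaling."""
    return speedup / cores


def implied_period_us(transition_us: float, fraction: float) -> float:
    """Power-off period implied if transition_us is `fraction` of it.

    Assumption: the quoted share applies to the quoted upper bound.
    """
    return transition_us / fraction


efficiency = parallel_efficiency(SPEEDUP, CORES)
period_us = implied_period_us(TRANSITION_US, TRANSITION_FRACTION)

print(f"parallel efficiency: {efficiency:.1%}")
print(f"implied RESUME POWER-OFF period: about {period_us:.0f} us")
```

Under these assumptions, each core delivers roughly 61% of ideal linear speedup, and a single power-off interval lasts on the order of 800 μs, which is why a transition of under 20 μs is small enough to make frequent gating worthwhile.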

    Original language: English
    Title of host publication: Proceedings of the International Conference on Parallel Processing
    Pages: 510-517
    Number of pages: 8
    DOI: https://doi.org/10.1109/ICPP.2009.6
    Publication status: Published - 2009
    Event: 38th International Conference on Parallel Processing, ICPP-2009 - Vienna
    Duration: 22 Sep 2009 to 25 Sep 2009

    Other

    Other: 38th International Conference on Parallel Processing, ICPP-2009
    Vienna
    Period: 22 Sep 2009 to 25 Sep 2009

    Fingerprint

    Software
    Scalability
    Cache
    Power Consumption
    Electric power utilization
    Parallelizing Compilers
    Data storage equipment
    Chip multiprocessors
    Categorization
    Compiler
    High Power
    Execution Time
    Clocks
    Maintenance
    Encoding
    System-on-chip
    Framework
    Hardware
    Restriction
    Cycle

    ASJC Scopus subject areas

    • Software
    • Mathematics(all)
    • Hardware and Architecture

    Cite this

    Onouchi, M., Toyama, K., Nojiri, T., Sato, M., Mase, M., Shirako, J., ... Kasahara, H. (2009). Green multicore-SoC software-execution framework with timely-power-gating scheme. In Proceedings of the International Conference on Parallel Processing (pp. 510-517). [5362472] https://doi.org/10.1109/ICPP.2009.6

    @inproceedings{fb8410d62f444a14a50e129f95da44d3,
    title = "Green multicore-SoC software-execution framework with timely-power-gating scheme",
    abstract = "We are developing a software-execution framework based on an octo-core chip multiprocessor named RP2 and an automatic multigrain-parallelizing compiler named OSCAR. The main purpose of this framework is to maintain good speed scalability and power efficiency as the number of processor cores increases, under severe hardware restrictions for embedded use. The key to speed scalability is reducing the communication overhead among parallelized tasks. A data-categorization scheme enables low-overhead cache-coherency maintenance by using directives and instructions from the compiler. In this scheme, the number of cache flushes is minimized, and parallelized tasks are quickly synchronized by using flags in local memory. As regards power efficiency, the power supply to processor cores waiting for other cores is cut off promptly and frequently, even in the middle of an application, by using a timely-power-gating scheme. In this scheme, to achieve quick mode transition between {"}NORMAL{"} mode and {"}RESUME POWER-OFF{"} mode, register values of the processor core are stored in core-local memory, which remains active even in {"}RESUME POWER-OFF{"} mode and can be accessed in one or two clock cycles. Measured speed and power of an application show good speed scalability in execution time and high power efficiency simultaneously. In the case of a secure AAC-LC encoding program, execution speed with eight processor cores is 4.85 times that of sequential execution. Moreover, power consumption under the same condition can be reduced by 51.0{\%} by parallelizing and timely power gating. The time for mode transition is less than 20 μs, which is only 2.5{\%} of the {"}RESUME POWER-OFF{"} period.",
    author = "Masafumi Onouchi and Keisuke Toyama and Toru Nojiri and Makoto Sato and Masayoshi Mase and Jun Shirako and Mikiko Sato and Masashi Takada and Masayuki Ito and Hiroyuki Mizuno and Mitaro Namiki and Keiji Kimura and Hironori Kasahara",
    year = "2009",
    doi = "10.1109/ICPP.2009.6",
    language = "English",
    isbn = "9780769538020",
    pages = "510--517",
    booktitle = "Proceedings of the International Conference on Parallel Processing",

    }

    TY - GEN

    T1 - Green multicore-SoC software-execution framework with timely-power-gating scheme

    AU - Onouchi, Masafumi

    AU - Toyama, Keisuke

    AU - Nojiri, Toru

    AU - Sato, Makoto

    AU - Mase, Masayoshi

    AU - Shirako, Jun

    AU - Sato, Mikiko

    AU - Takada, Masashi

    AU - Ito, Masayuki

    AU - Mizuno, Hiroyuki

    AU - Namiki, Mitaro

    AU - Kimura, Keiji

    AU - Kasahara, Hironori

    PY - 2009

    Y1 - 2009

    AB - We are developing a software-execution framework based on an octo-core chip multiprocessor named RP2 and an automatic multigrain-parallelizing compiler named OSCAR. The main purpose of this framework is to maintain good speed scalability and power efficiency as the number of processor cores increases, under severe hardware restrictions for embedded use. The key to speed scalability is reducing the communication overhead among parallelized tasks. A data-categorization scheme enables low-overhead cache-coherency maintenance by using directives and instructions from the compiler. In this scheme, the number of cache flushes is minimized, and parallelized tasks are quickly synchronized by using flags in local memory. As regards power efficiency, the power supply to processor cores waiting for other cores is cut off promptly and frequently, even in the middle of an application, by using a timely-power-gating scheme. In this scheme, to achieve quick mode transition between "NORMAL" mode and "RESUME POWER-OFF" mode, register values of the processor core are stored in core-local memory, which remains active even in "RESUME POWER-OFF" mode and can be accessed in one or two clock cycles. Measured speed and power of an application show good speed scalability in execution time and high power efficiency simultaneously. In the case of a secure AAC-LC encoding program, execution speed with eight processor cores is 4.85 times that of sequential execution. Moreover, power consumption under the same condition can be reduced by 51.0% by parallelizing and timely power gating. The time for mode transition is less than 20 μs, which is only 2.5% of the "RESUME POWER-OFF" period.

    UR - http://www.scopus.com/inward/record.url?scp=77951486753&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=77951486753&partnerID=8YFLogxK

    U2 - 10.1109/ICPP.2009.6

    DO - 10.1109/ICPP.2009.6

    M3 - Conference contribution

    AN - SCOPUS:77951486753

    SN - 9780769538020

    SP - 510

    EP - 517

    BT - Proceedings of the International Conference on Parallel Processing

    ER -