Sub-operation parallelism optimization in SIMD processor core synthesis

Hideki Kawazu, Jumpei Uchida, Yuichiro Miyaoka, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki

    Research output: Contribution to journal › Article

    Abstract

    A b-bit SIMD functional unit contains n k-bit sub-functional units, where b = k × n, and can execute n-parallel k-bit operations. However, not all b-bit functional units in a processor core necessarily execute n-parallel operations: depending on the application program, some execute only n/2-parallel or even n/4-parallel operations. Such a b-bit SIMD functional unit can therefore be modified to have only n/2 or n/4 k-bit sub-functional units. The number of k-bit sub-functional units in a SIMD functional unit is called its sub-operation parallelism. We incorporate a sub-operation parallelism optimization algorithm into SIMD functional unit optimization. The proposed algorithm gradually reduces the sub-operation parallelism of a SIMD functional unit as long as the timing constraint on execution time remains satisfied, and thereby finds a processor core with small area under the given timing constraint. We expect to obtain processor core configurations with smaller area than a conventional system under the same timing constraint. Promising experimental results are also shown.
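
The reduction loop described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract gives neither the cost models nor the order in which units are reduced, so the `SimdUnit` fields, the `area` and `exec_time` models, and the greedy halving order below are all hypothetical choices made only to show the idea of lowering sub-operation parallelism while a timing constraint still holds.

```python
from dataclasses import dataclass, replace


@dataclass
class SimdUnit:
    name: str
    k: int            # width of one sub-functional unit in bits
    parallelism: int  # sub-operation parallelism (number of k-bit sub-units)
    ops: int          # hypothetical: k-bit sub-operations the program issues to this unit


def area(unit: SimdUnit) -> int:
    # Hypothetical area model: proportional to the total sub-unit bit width.
    return unit.k * unit.parallelism


def exec_time(units: list[SimdUnit]) -> int:
    # Hypothetical timing model: each unit needs ceil(ops / parallelism) cycles,
    # and the units are assumed to run one after another.
    return sum(-(-u.ops // u.parallelism) for u in units)


def reduce_parallelism(units: list[SimdUnit], time_limit: int) -> list[SimdUnit]:
    """Greedily halve sub-operation parallelism while exec_time <= time_limit."""
    units = [replace(u) for u in units]  # work on copies, leave the input untouched
    if exec_time(units) > time_limit:
        raise ValueError("initial configuration violates the timing constraint")
    improved = True
    while improved:
        improved = False
        # Assumed heuristic: try the largest unit first, since halving it saves most area.
        for u in sorted(units, key=area, reverse=True):
            if u.parallelism == 1:
                continue
            u.parallelism //= 2          # tentative n -> n/2
            if exec_time(units) <= time_limit:
                improved = True
                break
            u.parallelism *= 2           # revert: the constraint would be violated
    return units
```

For example, calling `reduce_parallelism([SimdUnit("alu", 8, 4, 8), SimdUnit("mul", 8, 4, 2)], time_limit=5)` under these assumed models halves both units once and then stops, because any further halving would exceed the limit.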

    Original language: English
    Pages (from-to): 876-883
    Number of pages: 8
    Journal: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
    Volume: E88-A
    Issue number: 4
    DOI: 10.1093/ietfec/e88-a.4.876
    Publication status: Published - 2005

    Keywords

    • Hardware/software cosynthesis
    • Hardware/software partitioning
    • Packed SIMD type operation
    • Processor synthesis
    • Sub-operation parallelism

    ASJC Scopus subject areas

    • Electrical and Electronic Engineering
    • Hardware and Architecture
    • Information Systems

    Cite this

    Sub-operation parallelism optimization in SIMD processor core synthesis. / Kawazu, Hideki; Uchida, Jumpei; Miyaoka, Yuichiro; Togawa, Nozomu; Yanagisawa, Masao; Ohtsuki, Tatsuo.

    In: IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol. E88-A, No. 4, 2005, p. 876-883.

    @article{c076c6906a1a4ff392dd13c834b41511,
    title = "Sub-operation parallelism optimization in SIMD processor core synthesis",
    keywords = "Hardware/software cosynthesis, Hardware/software partitioning, Packed SIMD type operation, Processor synthesis, Sub-operation parallelism",
    author = "Hideki Kawazu and Jumpei Uchida and Yuichiro Miyaoka and Nozomu Togawa and Masao Yanagisawa and Tatsuo Ohtsuki",
    year = "2005",
    doi = "10.1093/ietfec/e88-a.4.876",
    language = "English",
    volume = "E88-A",
    pages = "876--883",
    journal = "IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences",
    issn = "0916-8508",
    publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
    number = "4",

    }

    TY  - JOUR
    T1  - Sub-operation parallelism optimization in SIMD processor core synthesis
    AU  - Kawazu, Hideki
    AU  - Uchida, Jumpei
    AU  - Miyaoka, Yuichiro
    AU  - Togawa, Nozomu
    AU  - Yanagisawa, Masao
    AU  - Ohtsuki, Tatsuo
    PY  - 2005
    Y1  - 2005
    KW  - Hardware/software cosynthesis
    KW  - Hardware/software partitioning
    KW  - Packed SIMD type operation
    KW  - Processor synthesis
    KW  - Sub-operation parallelism
    UR  - http://www.scopus.com/inward/record.url?scp=24144453118&partnerID=8YFLogxK
    UR  - http://www.scopus.com/inward/citedby.url?scp=24144453118&partnerID=8YFLogxK
    U2  - 10.1093/ietfec/e88-a.4.876
    DO  - 10.1093/ietfec/e88-a.4.876
    M3  - Article
    VL  - E88-A
    SP  - 876
    EP  - 883
    JO  - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
    JF  - IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences
    SN  - 0916-8508
    IS  - 4
    ER  -