Energy-efficient scheduling method with cross-loop model for resource-limited CNN accelerator designs

Kaiyi Yang, Shihao Wang, Jianbin Zhou, Takeshi Yoshimura

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

State-of-the-art customized accelerators for convolutional neural networks (CNNs) have achieved high throughput, but the large volume of data movement remains the dominant part of the total energy cost. In this paper, we propose an energy-efficient scheduling approach that finds a dataflow minimizing data movement under limited hardware resource budgets. Specifically, two-level nested loop transformations are proposed to separate memory and computing resource constraints, which allows us to fully exploit the available memory resources to reduce off-chip memory traffic. Further, the proposed cross-loop model captures data locality across nested loops in CNN algorithms. Finally, the energy-delay product is employed as the evaluation criterion to balance energy and throughput. Experimental results show that our cross-loop model reduces off-chip data movement by 11-21% and achieves the theoretical optimum. As a result, the proposed scheduling method increases energy efficiency by at least 8.7 times.
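
To make the ideas in the abstract concrete, the following is a minimal sketch of how loop tiling affects off-chip traffic and how an energy-delay product can rank tilings. It is not the authors' cross-loop model: the traffic formula is a common simplified tiling estimate (stride 1, output tile kept on chip while input channels are accumulated), and every name and constant (`ConvLayer`, `offchip_words`, `edp`, `best_tiling`, `pes`, `words_per_cycle`, `e_dram`, `e_mac`) is a hypothetical choice made for illustration only.

```python
from dataclasses import dataclass
from itertools import product
from math import ceil


@dataclass(frozen=True)
class ConvLayer:
    M: int  # output channels
    N: int  # input channels
    R: int  # output rows
    C: int  # output columns
    K: int  # square kernel size (stride 1 assumed)


def tile_words(layer, Tm, Tn, Tr, Tc):
    """On-chip words needed for one input, weight, and output tile."""
    inp = Tn * (Tr + layer.K - 1) * (Tc + layer.K - 1)  # includes halo
    wgt = Tm * Tn * layer.K * layer.K
    out = Tm * Tr * Tc
    return inp, wgt, out


def offchip_words(layer, Tm, Tn, Tr, Tc):
    """Estimated off-chip transfers for the assumed loop order.

    Input and weight tiles are reloaded for every input-channel tile;
    each output tile is written back once after accumulation.
    """
    inp, wgt, out = tile_words(layer, Tm, Tn, Tr, Tc)
    outer = ceil(layer.R / Tr) * ceil(layer.C / Tc) * ceil(layer.M / Tm)
    inner = ceil(layer.N / Tn)
    return outer * (inner * (inp + wgt) + out)


def edp(layer, tiles, pes=64, words_per_cycle=4, e_dram=200.0, e_mac=1.0):
    """Energy-delay product under hypothetical cost constants."""
    macs = layer.M * layer.N * layer.R * layer.C * layer.K * layer.K
    traffic = offchip_words(layer, *tiles)
    energy = traffic * e_dram + macs * e_mac
    # Delay is bounded by either compute or memory bandwidth.
    delay = max(macs / pes, traffic / words_per_cycle)
    return energy * delay


def best_tiling(layer, buffer_words, choices=(1, 2, 4, 8, 16, 32)):
    """Exhaustive search for the lowest-EDP tiling that fits on chip."""
    best = None
    for tiles in product(choices, repeat=4):
        Tm, Tn, Tr, Tc = tiles
        if Tm > layer.M or Tn > layer.N or Tr > layer.R or Tc > layer.C:
            continue
        if sum(tile_words(layer, *tiles)) > buffer_words:
            continue  # violates the memory budget
        score = edp(layer, tiles)
        if best is None or score < best[0]:
            best = (score, tiles)
    return best  # None if no tiling fits


if __name__ == "__main__":
    layer = ConvLayer(M=4, N=4, R=4, C=4, K=3)
    print(offchip_words(layer, 4, 4, 4, 4))  # 352: every word moved once
    print(offchip_words(layer, 2, 2, 2, 2))  # 1152: halos and reloads add traffic
    print(best_tiling(layer, buffer_words=100, choices=(1, 2, 4)))
```

In this toy model the compute term is fixed, so minimizing the energy-delay product reduces to minimizing traffic within the buffer budget; a model with a variable processing-element allocation would trade the two off.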

Original language: English
Title of host publication: IEEE International Symposium on Circuits and Systems
Subtitle of host publication: From Dreams to Innovation, ISCAS 2017 - Conference Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781467368520
DOIs: https://doi.org/10.1109/ISCAS.2017.8050800
Publication status: Published - 2017 Sep 25
Event: 50th IEEE International Symposium on Circuits and Systems, ISCAS 2017 - Baltimore, United States
Duration: 2017 May 28 to 2017 May 31

Other

Other: 50th IEEE International Symposium on Circuits and Systems, ISCAS 2017
Country: United States
City: Baltimore
Period: 17/5/28 to 17/5/31

Fingerprint

Convolution
Particle accelerators
Scheduling
Neural networks
Data storage equipment
Throughput
Energy efficiency
Hardware
Costs

Keywords

  • accelerator
  • convolutional neural network
  • energy efficiency
  • loop tiling
  • loop transformation
  • memory bandwidth

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Cite this

Yang, K., Wang, S., Zhou, J., & Yoshimura, T. (2017). Energy-efficient scheduling method with cross-loop model for resource-limited CNN accelerator designs. In IEEE International Symposium on Circuits and Systems: From Dreams to Innovation, ISCAS 2017 - Conference Proceedings [8050800] Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ISCAS.2017.8050800

@inproceedings{b04323163599453ea75022c49920db03,
title = "Energy-efficient scheduling method with cross-loop model for resource-limited CNN accelerator designs",
abstract = "State-of-the-art customized accelerators for convolutional neural networks (CNNs) have achieved high throughput, but the large volume of data movement remains the dominant part of the total energy cost. In this paper, we propose an energy-efficient scheduling approach that finds a dataflow minimizing data movement under limited hardware resource budgets. Specifically, two-level nested loop transformations are proposed to separate memory and computing resource constraints, which allows us to fully exploit the available memory resources to reduce off-chip memory traffic. Further, the proposed cross-loop model captures data locality across nested loops in CNN algorithms. Finally, the energy-delay product is employed as the evaluation criterion to balance energy and throughput. Experimental results show that our cross-loop model reduces off-chip data movement by 11-21{\%} and achieves the theoretical optimum. As a result, the proposed scheduling method increases energy efficiency by at least 8.7 times.",
keywords = "accelerator, convolutional neural network, energy efficiency, loop tiling, loop transformation, memory bandwidth",
author = "Kaiyi Yang and Shihao Wang and Jianbin Zhou and Takeshi Yoshimura",
year = "2017",
month = "9",
day = "25",
doi = "10.1109/ISCAS.2017.8050800",
language = "English",
booktitle = "IEEE International Symposium on Circuits and Systems",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

TY - GEN

T1 - Energy-efficient scheduling method with cross-loop model for resource-limited CNN accelerator designs

AU - Yang, Kaiyi

AU - Wang, Shihao

AU - Zhou, Jianbin

AU - Yoshimura, Takeshi

PY - 2017/9/25

Y1 - 2017/9/25

AB - State-of-the-art customized accelerators for convolutional neural networks (CNNs) have achieved high throughput, but the large volume of data movement remains the dominant part of the total energy cost. In this paper, we propose an energy-efficient scheduling approach that finds a dataflow minimizing data movement under limited hardware resource budgets. Specifically, two-level nested loop transformations are proposed to separate memory and computing resource constraints, which allows us to fully exploit the available memory resources to reduce off-chip memory traffic. Further, the proposed cross-loop model captures data locality across nested loops in CNN algorithms. Finally, the energy-delay product is employed as the evaluation criterion to balance energy and throughput. Experimental results show that our cross-loop model reduces off-chip data movement by 11-21% and achieves the theoretical optimum. As a result, the proposed scheduling method increases energy efficiency by at least 8.7 times.

KW - accelerator

KW - convolutional neural network

KW - energy efficiency

KW - loop tiling

KW - loop transformation

KW - memory bandwidth

UR - http://www.scopus.com/inward/record.url?scp=85032703621&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85032703621&partnerID=8YFLogxK

U2 - 10.1109/ISCAS.2017.8050800

DO - 10.1109/ISCAS.2017.8050800

M3 - Conference contribution

AN - SCOPUS:85032703621

BT - IEEE International Symposium on Circuits and Systems

PB - Institute of Electrical and Electronics Engineers Inc.

ER -