1.2GFLOPS neural network chip exhibiting fast convergence

Yoshikazu Kondo, Yuichi Koshiba, Yutaka Arima, Mitsuhiro Murasaki, Tuyoshi Yamada, Hiroyuki Amishiro, Hirofumi Shinohara, Hakuro Mori

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

14 Citations (Scopus)

Abstract

This paper describes a digital neural network chip for use as the core of neural network accelerators. The chip employs a single-instruction, multiple-data-stream (SIMD) architecture and includes twelve 24b floating-point processing units (PUs), a nonlinear function unit (NFU), and a control unit (CU). Each PU includes a 24b×1.28kW local memory and communicates with its neighbors through a shift-register ring. This configuration permits both feed-forward and error back-propagation (BP) processes to be executed efficiently. The CU, which includes a three-stage pipelined sequencer, a 24b×1kW instruction code memory (ICM), and a 144b×256W microcode memory (MCM), broadcasts network parameters (e.g., learning coefficients or temperature parameters) or addresses for the local memories over a data bus and an address bus. Two external memory ports and a ring expansion port permit large networks to be constructed; the external memory can be expanded to up to 768kW using the two ports.
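To make the dataflow concrete, here is a minimal Python sketch of the ring-based feed-forward computation the abstract describes. This is a generic systolic-ring pattern consistent with that description, not the chip's actual microcode; all names (ring_feed_forward, NUM_PUS, and so on) are illustrative assumptions. Each PU keeps one neuron's weight row in its local memory, partial sums stay in a per-PU accumulator, and activations circulate one hop per step around the shift-register ring, so every PU sees every input without contending for a shared bus.

NUM_PUS = 12  # the chip integrates twelve floating-point processing units

def ring_feed_forward(weights, x):
    """Systolic y = W @ x on a ring of NUM_PUS processing units.

    weights[i] is the weight row held in PU i's local memory; x is the
    input activation vector (length NUM_PUS in this simplified sketch).
    Activations hop one position per step around the shift-register
    ring, so after NUM_PUS steps every PU has accumulated its full dot
    product without touching any shared memory.
    """
    assert len(weights) == NUM_PUS and len(x) == NUM_PUS
    ring = list(x)            # activation currently sitting in each PU
    acc = [0.0] * NUM_PUS     # per-PU local accumulator (partial sum)
    for step in range(NUM_PUS):
        for pu in range(NUM_PUS):
            j = (pu + step) % NUM_PUS       # origin index of ring[pu]
            acc[pu] += weights[pu][j] * ring[pu]
        ring = ring[1:] + ring[:1]          # one hop around the ring
    return acc   # pre-activation sums; the NFU would then apply f(.)

# Sanity check: identity weights must return the input unchanged.
W = [[1.0 if i == j else 0.0 for j in range(NUM_PUS)] for i in range(NUM_PUS)]
x = [float(i) for i in range(NUM_PUS)]
assert ring_feed_forward(W, x) == x

With transposed weight indexing, the same circulation pattern serves the error back-propagation pass, which matches the abstract's claim that one configuration handles both directions efficiently. On the headline number: 1.2 GFLOPS across twelve PUs is 100 MFLOPS per PU, i.e., one multiply-accumulate (two floating-point operations) per cycle at a 50 MHz clock; that clock rate is an inference, not something this record states.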

Original language: English
Title of host publication: Digest of Technical Papers - IEEE International Solid-State Circuits Conference
Editors: Anon
Publisher: Publ by IEEE
Pages: 218-219
Number of pages: 2
ISBN (Print): 0780318455
Publication status: Published - 1994
Externally published: Yes
Event: Proceedings of the 1994 IEEE International Solid-State Circuits Conference - San Francisco, CA, USA
Duration: 1994 Feb 16 - 1994 Feb 18

ASJC Scopus subject areas

  • Hardware and Architecture
  • Electrical and Electronic Engineering
  • Engineering (all)

Cite this

Kondo, Y., Koshiba, Y., Arima, Y., Murasaki, M., Yamada, T., Amishiro, H., ... Mori, H. (1994). 1.2GFLOPS neural network chip exhibiting fast convergence. In Anon (Ed.), Digest of Technical Papers - IEEE International Solid-State Circuits Conference (pp. 218-219). Publ by IEEE.
