This paper describes a digital neural network chip for use as a core in neural network accelerators. The chip employs a single-instruction multiple-data-stream (SIMD) architecture and includes twelve 24b floating-point processing units (PUs), a nonlinear function unit (NFU), and a control unit (CU). Each PU includes a 24b×1.28kw local memory and communicates with its neighbor through a shift register ring. This configuration permits both feed-forward and error back propagation (BP) processes to be executed efficiently. The CU, which includes a three-stage pipelined sequencer, a 24b×1kw instruction code memory (ICM), and a 144b×256w microcode memory (MCM), broadcasts network parameters (e.g., learning coefficients or temperature parameters) or local-memory addresses over a data bus and an address bus. Two external memory ports and a ring expansion port permit large networks to be constructed. The external memory can be expanded up to 768kw using the two ports.
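As an illustration of why a shift register ring can serve both feed-forward and BP passes, the following is a minimal software sketch, not the chip's actual microcode. It assumes a common ring mapping in which PU i keeps row i of the weight matrix in its local memory; the function names `ring_forward` and `ring_backward` and this data layout are hypothetical. In the forward pass the input values circulate around the ring, while in the backward pass the partial sums circulate, so the transposed product needs no second copy of the weights.

```python
def ring_forward(weights, x):
    """Compute y = W x. PU i stores row i of W (assumed layout)."""
    n = len(x)
    ring = list(x)              # ring[i]: value currently held by PU i
    acc = [0.0] * n             # acc[i]: accumulator inside PU i
    for step in range(n):
        for i in range(n):      # all PUs act in lockstep (SIMD)
            j = (i + step) % n  # index of the input now sitting at PU i
            acc[i] += weights[i][j] * ring[i]
        ring = ring[1:] + ring[:1]  # shift: PU i receives from PU i+1
    return acc


def ring_backward(weights, delta):
    """Compute e = W^T delta with the same row-wise weight layout."""
    n = len(delta)
    ring = [0.0] * n            # partial sums circulate instead of inputs
    for step in range(n):
        for i in range(n):
            j = (i + step) % n  # output index of the partial sum at PU i
            ring[i] += weights[i][j] * delta[i]
        ring = ring[1:] + ring[:1]
    return ring                 # after n shifts, ring[j] holds e[j]


if __name__ == "__main__":
    W = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(ring_forward(W, [1, 0, 2]))
    print(ring_backward(W, [1, 1, 0]))
```

Under this assumed mapping, each pass over an n-unit ring takes n multiply-accumulate steps per PU, and a fully connected layer of twelve neurons would occupy the twelve PUs exactly; larger layers would need several rows per PU or the ring expansion port.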