Learning dynamics of neural networks with singularity - Standard gradient vs. natural gradient

Hyeyoung Park, Masato Inoue, Masato Okada

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Hierarchical models such as neural networks have complex singular structures. These singularities are known to affect both the estimation performance and the learning dynamics of the models. A number of recent studies have examined the properties of the estimators obtained for such models, but few have examined the dynamical properties of the learning used to obtain them. Using two-layer neural networks, we investigate how singularities influence the dynamics of standard gradient learning and natural gradient learning under various learning conditions. In standard gradient learning, we found a quasi-plateau phenomenon, which in some cases is more severe than the well-known plateau. The slow convergence caused by the quasi-plateau and the plateau becomes extremely serious when an optimal point lies in the neighborhood of a singularity. In natural gradient learning, by contrast, neither the quasi-plateau nor the plateau is observed, and the convergence speed is hardly affected by the singularity.
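
For orientation, the two update rules compared in the abstract are usually written as below. This is the textbook form rather than notation taken from the paper: the parameter vector \theta, learning rate \eta, loss L, and Fisher information matrix G are all assumed symbols.

```latex
\begin{align}
\theta_{t+1} &= \theta_t - \eta \, \nabla L(\theta_t)
  && \text{(standard gradient)} \\
\theta_{t+1} &= \theta_t - \eta \, G(\theta_t)^{-1} \nabla L(\theta_t)
  && \text{(natural gradient)}
\end{align}
```

In singular models the Fisher information matrix G degenerates at the singular points, so the two rules can behave very differently near them. That difference is what the abstract's comparison concerns.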

Original language: English
Title of host publication: Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)
Editors: C. Zhang, H.W. Guesgen, W.K. Yeap
Pages: 282-291
Number of pages: 10
Volume: 3157
Publication status: Published - 2004
Externally published: Yes
Event: 8th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2004: Trends in Artificial Intelligence - Auckland, New Zealand
Duration: 2004 Aug 9 to 2004 Aug 13

Other

Other: 8th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2004: Trends in Artificial Intelligence
Country: New Zealand
City: Auckland
Period: 04/8/9 to 04/8/13

Fingerprint

Neural networks

ASJC Scopus subject areas

  • Hardware and Architecture

Cite this

Park, H., Inoue, M., & Okada, M. (2004). Learning dynamics of neural networks with singularity - Standard gradient vs. natural gradient. In C. Zhang, H. W. Guesgen, & W. K. Yeap (Eds.), Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3157, pp. 282-291)

@inproceedings{b0d3921db2e140099afd892aca9866db,
title = "Learning dynamics of neural networks with singularity - Standard gradient vs. natural gradient",
abstract = "In hierarchical models, such as neural networks, there exist complex singular structures. The singularity is known to affect estimation performances and learning dynamics of the models. Recently, there have been a number of studies on properties of obtained estimators for the models, but there are few studies on the dynamical properties of learning used for obtaining the estimators. Using two-layer neural networks, we investigate influences of singularities on dynamics of standard gradient learning and natural gradient learning under various learning conditions. In the standard gradient learning, we found a quasi-plateau phenomenon, which is severer than the well known plateau in some cases. The slow convergence due to the quasi-plateau and plateau becomes extremely serious when an optimal point is in a neighborhood of a singularity. In the natural gradient learning, however, the quasi-plateau and plateau are not observed and convergence speed is hardly affected by singularity.",
author = "Hyeyoung Park and Masato Inoue and Masato Okada",
year = "2004",
language = "English",
volume = "3157",
pages = "282--291",
editor = "C. Zhang and H.W. Guesgen and W.K. Yeap",
booktitle = "Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)",

}

TY - GEN

T1 - Learning dynamics of neural networks with singularity - Standard gradient vs. natural gradient

AU - Park, Hyeyoung

AU - Inoue, Masato

AU - Okada, Masato

PY - 2004

Y1 - 2004

AB - In hierarchical models, such as neural networks, there exist complex singular structures. The singularity is known to affect estimation performances and learning dynamics of the models. Recently, there have been a number of studies on properties of obtained estimators for the models, but there are few studies on the dynamical properties of learning used for obtaining the estimators. Using two-layer neural networks, we investigate influences of singularities on dynamics of standard gradient learning and natural gradient learning under various learning conditions. In the standard gradient learning, we found a quasi-plateau phenomenon, which is severer than the well known plateau in some cases. The slow convergence due to the quasi-plateau and plateau becomes extremely serious when an optimal point is in a neighborhood of a singularity. In the natural gradient learning, however, the quasi-plateau and plateau are not observed and convergence speed is hardly affected by singularity.

UR - http://www.scopus.com/inward/record.url?scp=22944448668&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=22944448668&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:22944448668

VL - 3157

SP - 282

EP - 291

BT - Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)

A2 - Zhang, C.

A2 - Guesgen, H.W.

A2 - Yeap, W.K.

ER -