TY - GEN
T1 - Reinforcement learning algorithm with CTRNN in continuous action space
AU - Arie, Hiroaki
AU - Namikawa, Jun
AU - Ogata, Tetsuya
AU - Tani, Jun
AU - Sugano, Shigeki
PY - 2006/1/1
Y1 - 2006/1/1
N2 - Applying traditional reinforcement learning algorithms to robot motion control tasks is difficult, because most algorithms deal with discrete actions and assume complete observability of the state. This paper addresses both problems by combining a reinforcement learning algorithm with a CTRNN learning algorithm. We carried out an experiment on the pendulum swing-up task without rotational speed information. The results show that the rotational speed, which is treated as a hidden state, is estimated and encoded in the activation of a context neuron. As a result, the task is accomplished within several hundred trials using the proposed algorithm.
AB - Applying traditional reinforcement learning algorithms to robot motion control tasks is difficult, because most algorithms deal with discrete actions and assume complete observability of the state. This paper addresses both problems by combining a reinforcement learning algorithm with a CTRNN learning algorithm. We carried out an experiment on the pendulum swing-up task without rotational speed information. The results show that the rotational speed, which is treated as a hidden state, is estimated and encoded in the activation of a context neuron. As a result, the task is accomplished within several hundred trials using the proposed algorithm.
UR - http://www.scopus.com/inward/record.url?scp=33750590179&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33750590179&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:33750590179
SN - 3540464794
SN - 9783540464792
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 387
EP - 396
BT - Neural Information Processing - 13th International Conference, ICONIP 2006, Proceedings
PB - Springer Verlag
T2 - 13th International Conference on Neural Information Processing, ICONIP 2006
Y2 - 3 October 2006 through 6 October 2006
ER -