A continuous estimation of distribution algorithm by evolving graph structures using reinforcement learning

Xianneng Li, Bing Li, Shingo Mabu, Kotaro Hirasawa

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

A novel graph-based Estimation of Distribution Algorithm (EDA) named Probabilistic Model Building Genetic Network Programming (PMBGNP) has been proposed. Like classical EDAs, PMBGNP memorizes the current best individuals and uses them to estimate a distribution for generating the new population. Unlike classical EDAs, however, PMBGNP evolves compact programs by representing its solutions as graph structures, which allows it to solve a range of problems outside the conventional EDA literature, such as data mining and Reinforcement Learning (RL) problems. This paper extends PMBGNP from discrete to continuous search spaces; the extended algorithm is named PMBGNP-AC. Besides evolving the node connections to determine the optimal graph structures as in conventional PMBGNP, a Gaussian distribution is used to model the continuous variables of the nodes. The mean μ and standard deviation σ are constructed as in classical continuous Population-Based Incremental Learning (PBILc), but an RL technique, Actor-Critic (AC), is designed to update these parameters. AC calculates a Temporal-Difference (TD) error that evaluates whether the selected continuous value performed better or worse than expected. This scalar reinforcement signal decides whether the tendency to select that value should be strengthened or weakened, thereby shaping the probability density function of the Gaussian distribution. The proposed algorithm is applied to an RL problem, autonomous robot control, in which the robot's wheel speeds and sensor values are continuous. The experimental results show the superiority of PMBGNP-AC over the conventional algorithms.
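To make the update mechanism described in the abstract concrete, the sketch below pairs a Gaussian model of one node's continuous variable with an Actor-Critic update. It is a minimal illustration, assuming standard Gaussian policy-gradient update directions and a single scalar critic per node; the class name, learning rates, and exact update equations are illustrative assumptions rather than the paper's own formulation.

    import random

    class GaussianNodeAC:
        """Sketch of one node's continuous variable in a PMBGNP-AC-style setup.

        Assumptions (not from the paper's text): standard Gaussian
        policy-gradient update directions and one scalar critic per node.
        The paper's exact update equations may differ.
        """

        def __init__(self, mu=0.0, sigma=1.0, alpha=0.05, beta=0.1, gamma=0.95):
            self.mu = mu        # mean of the Gaussian for this node's value
            self.sigma = sigma  # standard deviation of the Gaussian
            self.alpha = alpha  # actor learning rate (updates mu and sigma)
            self.beta = beta    # critic learning rate
            self.gamma = gamma  # discount factor used in the TD error
            self.value = 0.0    # critic's value estimate for this node

        def sample(self):
            """Draw a continuous value (e.g., a wheel speed) to execute."""
            return random.gauss(self.mu, self.sigma)

        def update(self, a, reward, next_value):
            """One Actor-Critic step driven by the TD error.

            A positive TD error means the sampled value `a` did better
            than expected, so the density shifts toward `a`; a negative
            TD error shifts it away, as the abstract describes.
            """
            td_error = reward + self.gamma * next_value - self.value
            self.value += self.beta * td_error  # critic update
            # Gaussian policy-gradient directions, scaled by the TD error
            # (both gradients use the pre-update mu).
            d_mu = (a - self.mu) / self.sigma ** 2
            d_sigma = ((a - self.mu) ** 2 - self.sigma ** 2) / self.sigma ** 3
            self.mu += self.alpha * td_error * d_mu
            self.sigma += self.alpha * td_error * d_sigma
            self.sigma = max(self.sigma, 1e-3)  # keep the density well-defined

    # Hypothetical usage: one learning step for a wheel-speed node.
    node = GaussianNodeAC()
    speed = node.sample()                            # value sent to the robot
    node.update(speed, reward=1.0, next_value=0.0)   # terminal step, no bootstrap

Note that only the sign and magnitude of the TD error steer the μ/σ updates, which mirrors the abstract's point that a scalar reinforcement signal alone decides whether the tendency to select a value is strengthened or weakened.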

Original language: English
Title of host publication: 2012 IEEE Congress on Evolutionary Computation, CEC 2012
DOI: 10.1109/CEC.2012.6256481
ISBN (Print): 9781467315098
Publication status: Published - 2012
Event: 2012 IEEE Congress on Evolutionary Computation, CEC 2012 - Brisbane, QLD
Duration: 2012 Jun 10 - 2012 Jun 15

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Theoretical Computer Science

Cite this

Li, X., Li, B., Mabu, S., & Hirasawa, K. (2012). A continuous estimation of distribution algorithm by evolving graph structures using reinforcement learning. In 2012 IEEE Congress on Evolutionary Computation, CEC 2012 [6256481] https://doi.org/10.1109/CEC.2012.6256481

@inproceedings{4eb0e35e8089442485548b24942e49eb,
  title     = "A continuous estimation of distribution algorithm by evolving graph structures using reinforcement learning",
  author    = "Xianneng Li and Bing Li and Shingo Mabu and Kotaro Hirasawa",
  booktitle = "2012 IEEE Congress on Evolutionary Computation, CEC 2012",
  year      = "2012",
  doi       = "10.1109/CEC.2012.6256481",
  isbn      = "9781467315098",
  language  = "English",
}
