Meta-Reward Model Based on Trajectory Data with k-Nearest Neighbors Method

Xiaohui Zhu, Toshiharu Sugawara

Research output: Conference contribution

Abstract

Reward shaping is a crucial method for speeding up reinforcement learning (RL). However, designing reward shaping functions usually requires many expert demonstrations and much hand-engineering. When a potential function is used to shape the training rewards, an RL agent using Q-learning can make its Q-table converge faster without expert data. In deep reinforcement learning (DRL), which is RL using neural networks, Q-learning is nevertheless sometimes slow to learn the network parameters, especially in long-horizon, sparse-reward environments. In this paper, we propose a reward model that shapes the training rewards for DRL in real time while the agent learns its motions in a discrete action space. The model combines agent self-demonstrations with potential-based reward shaping to make the neural networks converge faster in every task, and it can be used in both deep Q-learning and actor-critic methods. We show experimentally that the proposed method speeds up DRL on classic control problems in various environments.
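The abstract names the ingredients (potential-based reward shaping, agent self-demonstrations, and, per the title, k-nearest neighbors over trajectory data) but does not give the construction. The following is only a minimal sketch of how such pieces could fit together, not the authors' model. It assumes the standard potential-based shaping form r + γΦ(s') − Φ(s); the class `KnnPotential`, the function `shaped_reward`, and the choice of "mean discounted return of the k nearest stored states" as the potential Φ are all hypothetical names and assumptions introduced for illustration.

```python
import numpy as np


class KnnPotential:
    """Hypothetical potential: mean return of the k nearest stored states."""

    def __init__(self, k=3):
        self.k = k
        self.states = []   # states seen in stored (self-demonstration) trajectories
        self.returns = []  # discounted return-to-go observed from each state

    def add_trajectory(self, states, rewards, gamma=0.99):
        # Walk the trajectory backwards to accumulate the discounted return-to-go.
        g = 0.0
        for s, r in zip(reversed(states), reversed(rewards)):
            g = r + gamma * g
            self.states.append(np.asarray(s, dtype=float))
            self.returns.append(g)

    def __call__(self, state):
        if not self.states:
            return 0.0  # no trajectory data yet: neutral potential
        data = np.stack(self.states)
        dists = np.linalg.norm(data - np.asarray(state, dtype=float), axis=1)
        nearest = np.argsort(dists)[: self.k]
        return float(np.mean(np.asarray(self.returns)[nearest]))


def shaped_reward(reward, state, next_state, potential, gamma=0.99, done=False):
    # Potential-based shaping: r + gamma * phi(s') - phi(s).
    # Assumption: the potential of a terminal state is taken to be zero.
    next_phi = 0.0 if done else potential(next_state)
    return reward + gamma * next_phi - potential(state)
```

In a training loop, the shaped value would replace the environment reward when a transition is stored or used for an update, and `add_trajectory` would be called on episodes the agent itself judged successful; how the paper selects and weights such trajectories is not stated in the abstract.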

Original language: English
Title of host publication: 2020 International Joint Conference on Neural Networks, IJCNN 2020 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728169262
DOI
Publication status: Published - Jul 2020
Event: 2020 International Joint Conference on Neural Networks, IJCNN 2020 - Virtual, Glasgow, United Kingdom
Duration: 19 Jul 2020 to 24 Jul 2020

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks

Conference

Conference: 2020 International Joint Conference on Neural Networks, IJCNN 2020
Country/Territory: United Kingdom
City: Virtual, Glasgow
Period: 19 Jul 2020 to 24 Jul 2020

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
