Shifting Reward Assignment for Learning Coordinated Behavior in Time-Limited Ordered Tasks

Yoshihiro Oguni*, Yuki Miyashita, Toshiharu Sugawara

*Corresponding author for this work

Research output: Conference contribution

Abstract

We propose a variable reward scheme for decentralized multi-agent deep reinforcement learning on sequential tasks. Such a task consists of a number of subtasks and is completed only when agents with different capabilities execute all subtasks in a certain order before a deadline. Developments in computer science and robotics are drawing attention to multi-agent systems for complex tasks. However, coordinated behavior among agents requires sophistication and depends heavily on the structures of tasks and environments; thus, it is preferable for agents to learn appropriate coordination individually, depending on the specific task. This study focuses on the learning of a sequential task by cooperative agents from a practical perspective. In such tasks, agents must learn both to execute their own subtasks efficiently and to coordinate with other agents, because the former provides more chances for the subsequent agents to learn, while the latter facilitates the execution of subsequent subtasks. Our proposed reward scheme enables agents to learn these behaviors in a balanced manner. We then show experimentally that agents trained with the proposed reward scheme execute tasks more efficiently than baseline methods based on static reward schemes. We also analyze the learned coordinated behavior to identify the reasons for this efficiency.

Original language: English
Title of host publication: Advances in Practical Applications of Agents, Multi-Agent Systems, and Complex Systems Simulation. The PAAMS Collection - 20th International Conference, PAAMS 2022, Proceedings
Editors: Frank Dignum, Philippe Mathieu, Juan Manuel Corchado, Fernando De La Prieta
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 294-306
Number of pages: 13
ISBN (Print): 9783031181917
DOI
Publication status: Published - 2022
Event: 20th International Conference on Practical Applications of Agents and Multi-Agent Systems, PAAMS 2022 - L'Aquila, Italy
Duration: 13 Jul 2022 to 15 Jul 2022

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13616 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 20th International Conference on Practical Applications of Agents and Multi-Agent Systems, PAAMS 2022
Country/Territory: Italy
City: L'Aquila
Period: 13 Jul 2022 to 15 Jul 2022

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
