Coordination in adversarial multi-agent with deep reinforcement learning under partial observability

Elhadji Amadou Oury Diallo, Toshiharu Sugawara

Research output: Conference contribution

4 Citations (Scopus)

Abstract

We propose a method that uses several variants of the deep Q-network (DQN) to learn strategic formations in large-scale adversarial multi-agent systems. The goal is to learn how to maximize the probability that agents act jointly in as coordinated a manner as possible. Our method, the centralized training and decentralized testing (CTDT) framework, is based on a POMDP during training and a Dec-POMDP during testing. During the training phase, the inputs to the centralized neural network are the collections of local observations of agents on the same team. Although each agent knows only its own action, the centralized network decides the joint action and then distributes the individual actions to the agents. During testing, however, each agent uses a copy of the centralized network and independently decides its action based on its policy and local view. We show that deep reinforcement learning techniques using the CTDT framework can converge and generate several strategic group formations in large-scale multi-agent systems. We also compare the results obtained with CTDT against those from a centralized shared DQN and investigate the characteristics of the learned behaviors.
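The training/testing split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the network architecture, so the sketch assumes a tiny linear Q-function whose weights are applied to each local observation in turn, and the names `SharedQNetwork`, `centralized_step`, `DecentralizedAgent`, `OBS_DIM`, and `N_ACTIONS` are all hypothetical. Learning updates (replay buffer, target network, TD loss) are omitted.

```python
import copy
import random

N_ACTIONS = 5  # hypothetical number of actions per agent
OBS_DIM = 4    # hypothetical size of one agent's local observation


class SharedQNetwork:
    """Linear stand-in for a DQN: Q(o, a) = w[a] . o + b[a] (an assumption)."""

    def __init__(self, obs_dim, n_actions, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-1.0, 1.0) for _ in range(obs_dim)]
                  for _ in range(n_actions)]
        self.b = [0.0] * n_actions

    def q_values(self, obs):
        # One Q-value per action for a single local observation.
        return [sum(wi * oi for wi, oi in zip(row, obs)) + bias
                for row, bias in zip(self.w, self.b)]

    def greedy_action(self, obs):
        q = self.q_values(obs)
        return max(range(len(q)), key=q.__getitem__)


def centralized_step(net, team_observations):
    """Training phase: a single network receives the collection of local
    observations of one team and returns the joint action, which is then
    distributed one entry per agent."""
    return [net.greedy_action(obs) for obs in team_observations]


class DecentralizedAgent:
    """Testing phase: each agent holds its own copy of the trained network
    and decides from its local view only."""

    def __init__(self, trained_net):
        self.net = copy.deepcopy(trained_net)

    def act(self, local_obs):
        return self.net.greedy_action(local_obs)


# Usage: three agents on one team with random local observations.
rng = random.Random(1)
team_obs = [[rng.random() for _ in range(OBS_DIM)] for _ in range(3)]

central = SharedQNetwork(OBS_DIM, N_ACTIONS)
joint_action = centralized_step(central, team_obs)      # centralized decision

agents = [DecentralizedAgent(central) for _ in team_obs]
independent = [agent.act(obs) for agent, obs in zip(agents, team_obs)]
```

Because every agent's copy has the same weights as the central network, the independently chosen actions in this sketch match the joint action for the same observations. In a real system the two phases would differ in what each decision-maker can observe, and exploration would be added during training.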

Original language: English
Title of host publication: Proceedings - IEEE 31st International Conference on Tools with Artificial Intelligence, ICTAI 2019
Publisher: IEEE Computer Society
Pages: 198-205
Number of pages: 8
ISBN (Electronic): 9781728137988
DOI
Publication status: Published - Nov 2019
Event: 31st IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2019 - Portland, United States
Duration: Nov 4, 2019 to Nov 6, 2019

Publication series

Name: Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
Volume: 2019-November
ISSN (Print): 1082-3409

Conference

Conference: 31st IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2019
Country/Territory: United States
City: Portland
Period: Nov 4, 2019 to Nov 6, 2019

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
  • Computer Science Applications
