Coordination in adversarial multi-agent with deep reinforcement learning under partial observability

Elhadji Amadou Oury Diallo, Toshiharu Sugawara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

We propose a method that uses several variants of the deep Q-network (DQN) to learn strategic formations in large-scale adversarial multi-agent systems. The goal is to learn how to maximize the probability of acting jointly in as coordinated a manner as possible. Our method, the centralized training and decentralized testing (CTDT) framework, is based on a POMDP during training and a dec-POMDP during testing. During the training phase, the centralized neural network takes as input the collection of local observations of agents on the same team. Although each agent knows only its own action, the centralized network decides the joint action and then distributes the individual actions to the agents. During testing, however, each agent uses a copy of the centralized network and independently decides its action based on its policy and local view. We show that deep reinforcement learning techniques using the CTDT framework can converge and generate several strategic group formations in large-scale multi-agent systems. We also compare the results obtained with CTDT against those from a centralized shared DQN and investigate the characteristics of the learned behaviors.
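The split between centralized action selection in training and independent action selection in testing can be illustrated with a minimal sketch. The abstract does not describe the network architecture, so everything here is an assumption made for illustration: a random linear Q-function stands in for a trained DQN, the centralized network is assumed to apply one shared set of weights to each teammate's observation, and the names `N_ACTIONS`, `OBS_DIM`, `centralized_joint_action`, and `decentralized_action` are hypothetical. Training itself (replay memory, target network, exploration) is omitted.

```python
import random

N_ACTIONS = 5  # hypothetical number of actions per agent
OBS_DIM = 4    # hypothetical size of one local observation


def make_q_weights(seed=0):
    """Random linear Q-function weights standing in for a trained DQN."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(OBS_DIM)]
            for _ in range(N_ACTIONS)]


def q_values(weights, obs):
    """Q(obs, a) for every action a, using one set of weights."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


def greedy_action(weights, obs):
    """Index of the highest-valued action for a single local observation."""
    q = q_values(weights, obs)
    return max(range(len(q)), key=q.__getitem__)


def centralized_joint_action(weights, team_obs):
    """Training-time step (assumed form): one network receives all
    teammates' observations, picks the joint action, and returns the
    list of per-agent actions to distribute."""
    return [greedy_action(weights, obs) for obs in team_obs]


def decentralized_action(weights_copy, local_obs):
    """Test-time step: an agent holding its own copy of the network
    acts on its local view only."""
    return greedy_action(weights_copy, local_obs)


rng = random.Random(1)
team_obs = [[rng.uniform(0.0, 1.0) for _ in range(OBS_DIM)] for _ in range(3)]
shared = make_q_weights()

# Centralized: one call decides every teammate's action.
joint = centralized_joint_action(shared, team_obs)

# Decentralized: each agent gets its own copy and decides independently.
copies = [[row[:] for row in shared] for _ in team_obs]
independent = [decentralized_action(c, o) for c, o in zip(copies, team_obs)]
```

Under the shared-weights assumption, `joint` and `independent` contain the same actions, which is what lets a network trained centrally be copied to each agent at test time. A network that mixes information across teammates' observations would not have this property and would need a different decentralization step.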

Original language: English
Title of host publication: Proceedings - IEEE 31st International Conference on Tools with Artificial Intelligence, ICTAI 2019
Publisher: IEEE Computer Society
Pages: 198-205
Number of pages: 8
ISBN (Electronic): 9781728137988
Publication status: Published - 2019 Nov
Event: 31st IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2019 - Portland, United States
Duration: 2019 Nov 4 to 2019 Nov 6

Publication series

Name: Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
Volume: 2019-November
ISSN (Print): 1082-3409

Conference

Conference: 31st IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2019
Country: United States
City: Portland
Period: 19/11/4 to 19/11/6

Keywords

  • Coordination and cooperation
  • Dec POMDP
  • Deep reinforcement learning
  • Multi agent learning
  • Multi agent systems

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
  • Computer Science Applications
