An Improved QMIX Network Based on Gradient Entropy Regularization

LU Rui; PENG Pengfei

doi:10.3969/j.issn.1671-637x.2023.04.015

Electronics Optics & Control, Vol. 30, Issue 4, 78 (2023)

When a cooperative multi-agent system lacks individual reward signals, the contributions of different agents cannot be distinguished, which leads to low cooperation efficiency. To address this problem, a discriminability evaluation index for credit allocation is introduced under the value decomposition paradigm, and a method based on gradient entropy regularization is proposed to achieve highly discriminable credit allocation. On this basis, an improved QMIX network is proposed as a multi-agent deep reinforcement learning algorithm. A simulation environment is built with the SMAC multi-agent learning environment and the built-in map editor of StarCraft II. The results show that the improved QMIX network achieves higher learning efficiency and better overall performance than the original QMIX network, and that it is better suited to cooperative multi-agent reinforcement learning in partially observable environments.
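The abstract does not give the formula for the gradient entropy, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes that each agent's credit is measured by the magnitude of the gradient of the joint value with respect to that agent's individual value (written here as |dQ_tot/dQ_i|), that these magnitudes are normalized into shares, and that the Shannon entropy of the shares serves as the discriminability index. The function names `credit_shares` and `gradient_entropy` are hypothetical.

```python
import math


def credit_shares(grads):
    """Normalize per-agent gradient magnitudes |dQ_tot/dQ_i| into shares summing to 1.

    Assumption: the gradient magnitude is used as a proxy for an agent's credit.
    """
    mags = [abs(g) for g in grads]
    total = sum(mags)
    if total == 0.0:
        # No agent influences the joint value: fall back to uniform shares.
        return [1.0 / len(mags)] * len(mags)
    return [m / total for m in mags]


def gradient_entropy(grads):
    """Shannon entropy (in nats) of the normalized gradient shares.

    Assumption: lower entropy is read as more discriminable credit allocation.
    """
    return -sum(p * math.log(p) for p in credit_shares(grads) if p > 0.0)


# Four agents with identical gradients: credit is indistinguishable.
uniform = gradient_entropy([1.0, 1.0, 1.0, 1.0])

# One agent dominates the joint value: credit is easier to tell apart.
peaked = gradient_entropy([3.7, 0.1, 0.1, 0.1])

print(f"uniform shares: {uniform:.4f} nats")
print(f"peaked shares:  {peaked:.4f} nats")
```

Under these assumptions, a regularizer could add a weighted entropy term to the usual QMIX temporal-difference loss; the weight and the sign of that term are not stated in the abstract and would have to be taken from the full paper.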

Keywords

credit allocation; gradient entropy; multi-agent reinforcement learning

LU Rui, PENG Pengfei. An Improved QMIX Network Based on Gradient Entropy Regularization[J]. Electronics Optics & Control, 2023, 30(4): 78
