Electronics Optics & Control, Vol. 30, Issue 4, 78 (2023)
An Improved QMIX Network Based on Gradient Entropy Regularization
[1] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.
[2] WATTER M, SPRINGENBERG J T, BOEDECKER J, et al. Embed to control: a locally linear latent dynamics model for control from raw images[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2015: 2746-2754.
[3] ZHANG F Y, LEITNER J, MILFORD M, et al. Towards vision-based deep reinforcement learning for robotic motion control[C]//Australasian Conference on Robotics and Automation. Canberra: Australian Robotics and Automation Association, 2015: 1-8.
[4] SILVER D, HUANG A, MADDISON C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529(7587): 484-489.
[5] TAN M. Multi-agent reinforcement learning: independent vs. cooperative agents[C]//Proceedings of the Tenth International Conference on Machine Learning. Amherst: Elsevier Inc., 1993: 330-337.
[6] FOERSTER J N, ASSAEL Y M, DE FREITAS N, et al. Learning to communicate to solve riddles with deep distributed recurrent Q-networks[EB/OL]. (2016-02-08)[2022-03-10]. https://arxiv.org/abs/1602.02672.
[7] SUNEHAG P, LEVER G, GRUSLYS A, et al. Value-decomposition networks for cooperative multi-agent learning based on team reward[C]//Proceedings of the 17th International Conference on Autonomous Agents and Multiagent Systems. Richland: International Foundation for Autonomous Agents and Multiagent Systems, 2018: 2085-2087.
LU Rui, PENG Pengfei. An Improved QMIX Network Based on Gradient Entropy Regularization[J]. Electronics Optics & Control, 2023, 30(4): 78
Received: Mar. 10, 2022
Published Online: Jun. 12, 2023