Chinese Journal of Ship Research, Vol. 16, Issue 6, 99 (2021)
Intelligent decision technology in combat deduction based on soft actor-critic algorithm
[8] HAARNOJA T, ZHOU A, ABBEEL P, et al. Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor[C]//Proceedings of the 35th International Conference on Machine Learning. Stockholm, Sweden: ACM Press, 2018.
[9] SUTTON R S, BARTO A G. Reinforcement Learning: An Introduction[M]. Cambridge: MIT Press, 1998.
[10] SPIELBERG S, GOPALUNI R, LOEWEN P. Deep reinforcement learning approaches for process control[C]//2017 6th International Symposium on Advanced Control of Industrial Processes. [S. l.]: IEEE, 2017: 201–203.
[11] HAARNOJA T, ZHOU A, HARTIKAINEN K, et al. Soft actor-critic algorithms and applications[EB/OL]. ArXiv: 1812.05905, 2018. (2018-12-13)[2020-08-30]. https://arxiv.org/abs/1812.05905.
[13] SCHULMAN J, CHEN X, ABBEEL P. Equivalence between policy gradients and soft Q-learning[EB/OL]. ArXiv: 1704.06440, 2017. (2017-04-21)[2020-08-30]. https://arxiv.org/pdf/1704.06440.pdf.
[14] HAARNOJA T, TANG H, ABBEEL P, et al. Reinforcement learning with deep energy-based policies[C]//Proceedings of the 34th International Conference on Machine Learning. Sydney, Australia: ACM Press: MLR.org, 2017: 1352–1361.
[15] LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[C]//Proceedings of the 4th International Conference on Learning Representations. San Juan, Puerto Rico: Elsevier, 2016.
Xingzhong WANG, Min WANG, Wei LUO. Intelligent decision technology in combat deduction based on soft actor-critic algorithm[J]. Chinese Journal of Ship Research, 2021, 16(6): 99
Category: Weapon, Electronic and Information System
Received: Aug. 31, 2020
Accepted: --
Published Online: Mar. 28, 2025