Chinese Journal of Ship Research, Volume 20, Issue 1, 350 (2025)

Design of AUV controller based on improved PPO algorithm

Desheng XU (1, 2, 3) and Chunhui XU (1, 2)
Author Affiliations
  • (1) State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
  • (2) Key Laboratory of Marine Robotics, Liaoning Province, Shenyang 110169, China
  • (3) University of Chinese Academy of Sciences, Beijing 100049, China


    Desheng XU, Chunhui XU. Design of AUV controller based on improved PPO algorithm[J]. Chinese Journal of Ship Research, 2025, 20(1): 350

    Paper Information

    Category: Motion Control

    Received: Jul. 1, 2024

    Accepted: --

    Published Online: Mar. 13, 2025


    DOI:10.19693/j.issn.1673-3185.04031
