Chinese Journal of Ship Research, Vol. 20, Issue 1, 350 (2025)

Design of AUV controller based on improved PPO algorithm

Desheng XU (1,2,3) and Chunhui XU (1,2)
Author Affiliations
  • (1) State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
  • (2) Key Laboratory of Marine Robotics, Liaoning Province, Shenyang 110169, China
  • (3) University of Chinese Academy of Sciences, Beijing 100049, China
    Figures & Tables(16)
    • Table 1. Robot parameters

      Parameter                 Value
      Length/mm                 457
      Width/mm                  338
      Height/mm                 254
      Weight/kg                 12.5
      Buoyancy                  Neutral (slightly positive)
      Lithium battery           18 Ah, 14.8 V (4S)
      Camera                    1080p, real-time transmission
      Maximum speed/kn          3
      Communication protocol    Ethernet + fiber-optic cable
      Sensors                   IMU / GPS / depth gauge
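    • Table 2. Algorithm parameter settings

      Parameter                                                Setting
      Number of hidden layers (Actor and Critic networks)      3
      Neurons per hidden layer                                 64
      Actor activation function                                Tanh
      Critic activation function                               ReLU
      Network learning rate $\alpha$                           0.0002
      Discount factor $\gamma$                                 0.98
      Clipping constant $\varepsilon$                          0.2
      Batch size                                               32
      Maximum steps per episode M                              100
      Maximum number of episodes Q                             500
      Window size N                                            5
      Sampling interval expansion value $\Delta \tau$          2
      Initial endpoint value of sampling interval $\tau'$      2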
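
As a rough illustration of how the Table 2 settings enter a PPO update, the sketch below stores the listed values in a small configuration object and evaluates the standard PPO clipped surrogate for a single sample, together with a discounted return. The class and function names (`PPOSettings`, `clipped_surrogate`, `discounted_return`) are hypothetical. The window size N, the expansion value $\Delta \tau$, and the initial endpoint $\tau'$ belong to the paper's modification of PPO, which this page does not describe, so they are left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PPOSettings:
    """Values copied from Table 2; the field names are illustrative only."""
    hidden_layers: int = 3
    hidden_units: int = 64
    actor_activation: str = "tanh"
    critic_activation: str = "relu"
    learning_rate: float = 2e-4
    gamma: float = 0.98
    clip_eps: float = 0.2
    batch_size: int = 32
    max_steps_per_episode: int = 100
    max_episodes: int = 500


def clipped_surrogate(ratio: float, advantage: float, eps: float) -> float:
    """Standard PPO clipped objective for one sample (to be maximized).

    `ratio` is the new-to-old policy probability ratio for the action taken.
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped_ratio * advantage)


def discounted_return(rewards, gamma: float) -> float:
    """Sum of gamma**t * r_t, accumulated from the last step backwards."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


cfg = PPOSettings()
# A ratio of 1.5 lies outside [1 - eps, 1 + eps], so the clipped term is used.
print(clipped_surrogate(1.5, 2.0, cfg.clip_eps))
print(discounted_return([1.0, 1.0, 1.0], cfg.gamma))
```

With $\varepsilon = 0.2$, a sample whose probability ratio moves beyond 1.2 (positive advantage) or below 0.8 (negative advantage) contributes no further gain to the objective, which is what limits the size of each policy update.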
    • Table 3. Comparison of training process indicators

      Algorithm               Episodes to convergence   Steady-state reward   Reward standard deviation
      Proposed algorithm      90                        −153.34               11.46
      Domain randomization    120                       −171.92               17.94
      RARL                    Trapped in a local minimum
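    • Table 4. Control indicators in simulation environment

      Algorithm               Disturbance type   Mean depth tracking error   Standard deviation of depth tracking error
      Proposed algorithm      Constant           0.22                        0.78
      Proposed algorithm      Sinusoidal         0.24                        0.56
      Domain randomization    Constant           0.78                        0.91
      Domain randomization    Sinusoidal         0.92                        0.99
      Baseline algorithm      Constant           1.03                        1.00
      Baseline algorithm      Sinusoidal         0.83                        0.97
      RARL                    Constant           4.58                        2.24
      RARL                    Sinusoidal         5.66                        0.78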
    • Table 5. Mean and standard deviation of the depth tracking error for each algorithm in tank experiments

      Algorithm               Mean depth tracking error   Standard deviation of depth tracking error
      Proposed algorithm      0.084                       0.060
      Domain randomization    0.155                       0.094
      Baseline algorithm      0.227                       0.127
      RARL                    0.500                       0
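
The error indicators in Tables 4 and 5 can be read as summary statistics of a depth error sequence. The sketch below shows one plausible way to compute them and to compare the mean errors reported in Table 5. It assumes the absolute error and the population standard deviation, neither of which is confirmed by this page. The function names and the sample depth values are hypothetical and not taken from the paper.

```python
from statistics import fmean, pstdev


def tracking_error_stats(reference, measured):
    """Return (mean, population std) of the absolute depth error.

    Assumption: the error is |reference - measured| at each sample.
    """
    errors = [abs(r - m) for r, m in zip(reference, measured)]
    return fmean(errors), pstdev(errors)


def relative_reduction(baseline: float, proposed: float) -> float:
    """Fractional reduction of `proposed` with respect to `baseline`."""
    return (baseline - proposed) / baseline


# Hypothetical depth samples, made up for illustration only.
reference = [1.0, 1.0, 1.0, 1.0]
measured = [0.9, 1.1, 1.0, 0.8]
mean_err, std_err = tracking_error_stats(reference, measured)
print(f"mean error {mean_err:.3f}, std {std_err:.3f}")

# Mean depth tracking errors as reported in Table 5.
table5_mean = {
    "proposed": 0.084,
    "domain_randomization": 0.155,
    "baseline": 0.227,
}
for name in ("domain_randomization", "baseline"):
    cut = relative_reduction(table5_mean[name], table5_mean["proposed"])
    print(f"proposed vs {name}: {cut:.1%} lower mean error")
```

A relative comparison of this kind is only meaningful if all algorithms were evaluated on the same reference trajectory and with the same error definition.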

    Desheng XU, Chunhui XU. Design of AUV controller based on improved PPO algorithm[J]. Chinese Journal of Ship Research, 2025, 20(1): 350

    Paper Information

    Category: Motion Control

    Received: Jul. 1, 2024

    Published Online: Mar. 13, 2025

    DOI:10.19693/j.issn.1673-3185.04031