Chinese Journal of Ship Research, Vol. 20, Issue 1, 350 (2025)

Design of AUV controller based on improved PPO algorithm

Desheng XU^1,2,3 and Chunhui XU^1,2

Author Affiliations

¹State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China

²Key Laboratory of Marine Robotics, Liaoning Province, Shenyang 110169, China

³University of Chinese Academy of Sciences, Beijing 100049, China

Figures & Tables (16)

Figs. 1 to 11. [in Chinese]

Table 1. Robot parameters

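Parameter	Value
Length/mm	457
Width/mm	338
Height/mm	254
Weight/kg	12.5
Buoyancy	Neutral (slightly positive)
Lithium battery	18 Ah, 14.8 V (4S)
Camera	1080p live streaming
Maximum speed/kn	3
Communication protocol	Ethernet + fiber-optic cable
Sensors	IMU/GPS/depth gauge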

Table 2. Algorithm parameter setting

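Parameter	Setting
Number of hidden layers in Actor and Critic networks	3
Neurons per hidden layer	64
Actor activation function	Tanh
Critic activation function	ReLU
Network learning rate $\alpha$	0.0002
Discount factor $\gamma$	0.98
Clipping constant $\varepsilon$	0.2
Batch size	32
Maximum steps per episode M	100
Maximum number of episodes Q	500
Window size N	5
Sampling interval expansion value $\Delta \tau$	2
Initial endpoint value of the sampling interval $\tau'$	2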
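
As a rough illustration of where the discount factor and the clipping constant in Table 2 enter a standard PPO update, here is a minimal sketch. It shows plain PPO only, not the improved algorithm of the paper; the names `PPOConfig`, `discounted_returns`, and `clipped_surrogate` are hypothetical, and the window size and sampling interval parameters are left out because this page does not say how they are used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PPOConfig:
    """Values copied from Table 2; the field names are illustrative assumptions."""
    hidden_layers: int = 3
    hidden_units: int = 64
    learning_rate: float = 2e-4
    gamma: float = 0.98       # discount factor
    clip_eps: float = 0.2     # clipping constant
    batch_size: int = 32
    max_steps: int = 100      # maximum steps per episode (M)
    max_episodes: int = 500   # maximum number of episodes (Q)


def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    return returns[::-1]


def clipped_surrogate(ratio, advantage, eps):
    """Standard PPO clipped objective for one sample (to be maximized).

    ratio is pi_new(a|s) / pi_old(a|s); it is clipped to [1 - eps, 1 + eps]
    and the more pessimistic of the two products is kept.
    """
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


cfg = PPOConfig()
```

With `cfg.clip_eps` at 0.2, a probability ratio of 1.5 on a positive advantage is treated as 1.2, so a single batch cannot move the policy arbitrarily far from the one that collected the data.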

Table 3. Comparison of training process indicators
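Algorithm	Episodes to convergence	Steady-state reward	Reward standard deviation
Proposed algorithm	90	−153.34	11.46
Domain randomization algorithm	120	−171.92	17.94
RARL	Trapped in a local minimum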

Table 4. Control indicators in simulation environment

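Algorithm	Disturbance type	Mean depth tracking error	Standard deviation of depth tracking error
Proposed algorithm	Constant	0.22	0.78
Proposed algorithm	Sinusoidal	0.24	0.56
Domain randomization algorithm	Constant	0.78	0.91
Domain randomization algorithm	Sinusoidal	0.92	0.99
Baseline algorithm	Constant	1.03	1.00
Baseline algorithm	Sinusoidal	0.83	0.97
RARL	Constant	4.58	2.24
RARL	Sinusoidal	5.66	0.78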

Table 5. Mean and standard deviation of tracking depth position error for each algorithm for tank experiments
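Algorithm	Mean depth tracking error	Standard deviation of depth tracking error
Proposed algorithm	0.084	0.060
Domain randomization algorithm	0.155	0.094
Baseline algorithm	0.227	0.127
RARL	0.500	0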

Desheng XU, Chunhui XU. Design of AUV controller based on improved PPO algorithm[J]. Chinese Journal of Ship Research, 2025, 20(1): 350

Paper Information

Category: Motion Control

Received: Jul. 1, 2024

Accepted: --

Published Online: Mar. 13, 2025

DOI: 10.19693/j.issn.1673-3185.04031
