Electronics Optics & Control, Vol. 30, Issue 8, 1 (2023)
UAV Path Planning Based on Reverse Reinforcement Learning
In planning safe, collision-free UAV paths, the Deep Deterministic Policy Gradient (DDPG) algorithm suffers from slow convergence and from the difficulty of designing a reward function. To address these problems, a UAV path planning algorithm based on inverse (reverse) reinforcement learning that integrates expert demonstration trajectories is proposed. First, a dataset of demonstration trajectories, in which an expert manually flies the UAV around obstacles, is collected in simulator software. Second, a hybrid sampling mechanism mixes high-quality expert demonstration data into the self-exploration data when updating the network parameters, which reduces the cost of exploration. Finally, the maximum entropy inverse reinforcement learning algorithm is used to compute the optimal reward function implicit in the expert's experience, which addresses the difficulty of hand-designing a reward function for complex tasks. Comparative experiments show that the improved algorithm trains more efficiently and achieves better obstacle avoidance performance.
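The hybrid sampling step can be illustrated with a minimal Python sketch. The abstract does not specify how expert and self-exploration data are mixed, so the function name `hybrid_sample`, the fixed `expert_ratio` parameter, and the use of two separate buffers are assumptions made for illustration only, not the authors' implementation.

```python
import random


def hybrid_sample(expert_buffer, agent_buffer, batch_size, expert_ratio, rng=random):
    """Draw one training batch mixing expert and self-exploration transitions.

    Assumption: a fixed fraction `expert_ratio` of each batch comes from the
    expert demonstrations; the paper's actual mixing rule is not given in the
    abstract.
    """
    # Number of expert transitions requested, capped by what is available.
    n_expert = min(int(round(batch_size * expert_ratio)), len(expert_buffer))
    # Fill the remainder from the agent's own experience, again capped.
    n_agent = min(batch_size - n_expert, len(agent_buffer))
    # Sample without replacement from each buffer, then shuffle so the
    # update does not see the two sources in a fixed order.
    batch = rng.sample(expert_buffer, n_expert) + rng.sample(agent_buffer, n_agent)
    rng.shuffle(batch)
    return batch


# Hypothetical usage: each transition is tagged with its source for clarity.
expert_buffer = [("expert", i) for i in range(50)]
agent_buffer = [("agent", i) for i in range(200)]
batch = hybrid_sample(expert_buffer, agent_buffer, batch_size=32, expert_ratio=0.25)
```

In a DDPG-style training loop, such a batch would replace the usual uniform draw from a single replay buffer. If the agent buffer is still small early in training, this sketch returns a batch shorter than `batch_size` rather than sampling with replacement.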
YANG Xiuxia, WANG Chenlei, ZHANG Yi, YU Hao, JIANG Zijie. UAV Path Planning Based on Reverse Reinforcement Learning[J]. Electronics Optics & Control, 2023, 30(8): 1
Received: Jul. 18, 2022
Published Online: Jan. 17, 2024