Electronics Optics & Control, Volume 30, Issue 8, 1 (2023)

UAV Path Planning Based on Reverse Reinforcement Learning

YANG Xiuxia, WANG Chenlei, ZHANG Yi, YU Hao and JIANG Zijie

    In planning safe, collision-free paths for UAVs, the Deep Deterministic Policy Gradient (DDPG) algorithm converges slowly and its reward function is difficult to design. To address these problems, a UAV path planning algorithm based on reverse (inverse) reinforcement learning that integrates expert demonstration trajectories is proposed. First, a dataset of demonstration trajectories, in which an expert pilots the UAV around obstacles, is collected in simulator software. Second, a hybrid sampling mechanism mixes high-quality expert demonstration data into the agent's self-exploration data when updating the network parameters, which reduces the cost of exploration. Finally, the maximum entropy reverse reinforcement learning algorithm is used to compute the optimal reward function implicit in the expert's experience, which removes the need to hand-design a reward function for complex tasks. Comparative experiments show that the improved algorithm trains more efficiently and achieves better obstacle avoidance performance.
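
The hybrid sampling step described in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `HybridReplayBuffer`, the fixed `expert_ratio` of 0.25, sampling with replacement, and the tuple layout of a transition are all hypothetical, since the abstract does not specify them.

```python
import random
from collections import deque


class HybridReplayBuffer:
    """Replay buffer that mixes fixed expert demonstrations with the
    agent's own exploration data (hypothetical sketch)."""

    def __init__(self, expert_transitions, capacity=100_000,
                 expert_ratio=0.25, seed=None):
        if not 0.0 <= expert_ratio <= 1.0:
            raise ValueError("expert_ratio must lie in [0, 1]")
        # Expert demonstrations are kept permanently and never overwritten.
        self.expert = list(expert_transitions)
        # Self-exploration data is a FIFO buffer; oldest entries are evicted.
        self.agent = deque(maxlen=capacity)
        self.expert_ratio = expert_ratio  # assumed fixed mixing fraction
        self.rng = random.Random(seed)

    def add(self, transition):
        """Store one transition collected by the agent itself."""
        self.agent.append(transition)

    def sample(self, batch_size):
        """Draw a mini-batch containing both expert and agent transitions."""
        if not self.expert and not self.agent:
            raise ValueError("cannot sample from two empty buffers")
        n_expert = round(batch_size * self.expert_ratio)
        # Fall back to a single source while the other one is still empty.
        if not self.expert:
            n_expert = 0
        elif not self.agent:
            n_expert = batch_size
        n_agent = batch_size - n_expert
        # Sampling with replacement keeps this valid for small buffers.
        batch = self.rng.choices(self.expert, k=n_expert) if n_expert else []
        if n_agent:
            batch += self.rng.choices(self.agent, k=n_agent)
        self.rng.shuffle(batch)
        return batch
```

In a DDPG-style training loop, each gradient update would call `sample(batch_size)` in place of sampling from a purely self-generated replay buffer, so that every mini-batch carries some expert transitions. Whether the mixing fraction stays fixed or decays during training is a design choice the abstract leaves open.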

    YANG Xiuxia, WANG Chenlei, ZHANG Yi, YU Hao, JIANG Zijie. UAV Path Planning Based on Reverse Reinforcement Learning[J]. Electronics Optics & Control, 2023, 30(8): 1

    Paper Information

    Received: Jul. 18, 2022

    Published Online: Jan. 17, 2024

    DOI:10.3969/j.issn.1671-637x.2023.08.001
