Optics and Precision Engineering, Vol. 33, Issue 6, 979 (2025)

Autonomous decision-making for spacecraft close approaches in the Earth-Moon environment

Cheng HUANG*, Zhicong QIU and Jiazhong XU
Author Affiliations
  • College of Automation, Harbin University of Science and Technology, Harbin 150080, China
    Figures & Tables(21)
    Spacecraft close approaches
    Relative synodic reference coordinate system
    LSTM network structure diagram
    Classification of K-NN algorithms
    Strategy network structure diagram of LSTM module
    Exploration of state-based internal rewards
    Schematic diagram of the improved PPO algorithm decision
    Comparison of average reward curve
    Average reward curve in LunarLanderContinuous-v2 environment
    Spacecraft close approach mission test trajectory
    Relative position change
    Relative speed change
    Coplanar motion of the sun around the center of mass of the earth and the moon
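    • Table 1. Training pseudocode

      Environment setup: set the simulation step size, target state, thrust upper limit, range of motion, training data length, …

      for k = 1 : Nstep
        (1) Initialize the parameters of the policy network and the value network;
        (2) Generate an action: feed the spacecraft mass, relative position, and relative velocity into the new network, which outputs the spacecraft thrust;
        (3) Update the state: apply the thrust to the environment to obtain the reward r and the next-step state s′;
        (4) Collect states: gather the interaction experience s1, a1, r1, …, sT, aT, rT and store it in the data buffer;
        if data length n = buffer_size
          (5) Compute new reward values with the internal-reward exploration mechanism proposed in Section 3.2.2 of the paper; these values are the data for the subsequent discounted-reward calculation;
          (6) Feed the stored set of states s into the Critic network to obtain the state-value function V(s) for every stored state, then combine it with the discounted reward R_t to compute the advantage estimate A_t;
          (7) Compute the c_loss function of the Critic network, then backpropagate to update the Critic network parameters;
          (8) Feed the stored set of actions a into the two policy networks, obtain a normal distribution from each, and from these compute the importance sampling ratio r_θk;
          (9) Compute the loss function of the Actor_old network according to Eq. (24), then backpropagate to update the Actor_new network parameters;
          (10) Copy the Actor_new network parameters to the Actor_old network;
        else
          (11) Return to step (2) and continue collecting data;
      end.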
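Steps (6) and (8) of Table 1 can be illustrated with a minimal sketch. It assumes scalar actions, a plain backward recursion for the discounted reward, and the simple advantage estimate A_t = R_t - V(s_t); the function names are hypothetical and are not taken from the article.

```python
import math

def discounted_returns(rewards, gamma):
    """Backward recursion R_t = r_t + gamma * R_{t+1}, starting from the last step."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    return returns[::-1]

def advantages(returns, values):
    """Step (6), assumed form: A_t = R_t - V(s_t)."""
    return [R - v for R, v in zip(returns, values)]

def gaussian_log_prob(action, mean, std):
    """Log-density of a one-dimensional normal distribution."""
    z = (action - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)

def importance_ratio(action, new_mean, new_std, old_mean, old_std):
    """Step (8): density of the new policy divided by that of the old policy."""
    log_new = gaussian_log_prob(action, new_mean, new_std)
    log_old = gaussian_log_prob(action, old_mean, old_std)
    return math.exp(log_new - log_old)
```

Working in log-probabilities and exponentiating the difference avoids dividing two very small densities, which is the usual reason PPO implementations compute the ratio this way.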
    • Table 2. Earth-Moon system parameters

      | Parameter | Value |
      |---|---|
      | Mass parameter | 0.012 150 585 6 |
      | System mass | 6.045 8×10^24 kg |
      | Earth-Moon distance | 3.844×10^8 m |
      | System period | 375 200 s |
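The Table 2 values can be cross-checked with a short calculation. The sketch below assumes the standard characteristic time of the normalized three-body problem, sqrt(L^3 / (G M)); the gravitational constant is the CODATA value and does not come from the table. Under that assumption the 375 200 s entry appears to correspond to this time unit (the orbital period divided by 2π) rather than to the full orbital period, although the table itself does not say so.

```python
import math

G = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2 (CODATA, not from Table 2)
SYSTEM_MASS = 6.0458e24  # kg, from Table 2
DISTANCE = 3.844e8       # m, from Table 2
TABLE_TIME = 375_200.0   # s, from Table 2

def characteristic_time(total_mass, distance, g=G):
    """Time unit of the normalized problem: sqrt(L^3 / (G * M))."""
    return math.sqrt(distance ** 3 / (g * total_mass))

t_unit = characteristic_time(SYSTEM_MASS, DISTANCE)

# Relative difference between the computed time unit and the tabulated value
relative_gap = abs(t_unit - TABLE_TIME) / TABLE_TIME

# Full orbital period implied by the time unit, in days
orbital_period_days = 2.0 * math.pi * t_unit / 86_400.0
```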
    • Table 3. Improved neural network structure for PPO algorithm

      | Layer | Number of neurons (A/C) | Activation function | Type |
      |---|---|---|---|
      | Input layer | 16/16 | Linear | |
      | Hidden layer 1 | 256/256 | Tanh | LSTM |
      | Hidden layer 2 | 256/256 | Tanh | LSTM |
      | Hidden layer 3 | 64/64 | Tanh | MLP |
      | Hidden layer 4 | 64/64 | Tanh | MLP |
      | Output layer | 3/1 | Linear | |
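To give a sense of the size of the networks in Table 3, the sketch below counts trainable parameters. It rests on several assumptions that the table does not confirm: the input layer is read as 16 input features, each LSTM layer uses the textbook four-gate cell with a single bias vector per gate, and the MLP and output layers are fully connected with biases. All function names are hypothetical.

```python
def lstm_params(input_size, hidden_size):
    """Four gates, each with input weights, recurrent weights, and one bias vector."""
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)

def dense_params(input_size, output_size):
    """Fully connected layer: weight matrix plus bias vector."""
    return input_size * output_size + output_size

def network_params(n_inputs, n_outputs):
    """Layer sizes follow Table 3: two 256-unit LSTM layers, two 64-unit MLP layers."""
    return (
        lstm_params(n_inputs, 256)
        + lstm_params(256, 256)
        + dense_params(256, 64)
        + dense_params(64, 64)
        + dense_params(64, n_outputs)
    )

actor_total = network_params(16, 3)   # Actor: 3 outputs
critic_total = network_params(16, 1)  # Critic: 1 output
```

Under these assumptions the two LSTM layers account for almost all of the parameters, so the Actor and Critic differ only by the size of the output layer.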
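    • Table 4. Parameters of the improved PPO algorithm

      | Parameter | Value (Actor) | Value (Critic) |
      |---|---|---|
      | Discount factor γ | 0.99 | - |
      | GAE hyperparameter λ | 1 | - |
      | Clip function parameter ε | 0.1 | - |
      | Learning rate α | 0.000 05 | 0.000 05 |
      | Cross-entropy coefficient | 0.000 03 | - |
      | Gradient clipping parameter | 0.1 | 0.1 |
      | Batch size | 64 | 64 |
      | Training epochs | 10 | 10 |
      | Importance sampling ratio threshold | 1.5 | - |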
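The roles of γ, λ, and ε in Table 4 can be illustrated with the standard generalized advantage estimation (GAE) recursion and the standard PPO clipped surrogate. This is a minimal sketch of the textbook forms, not the article's Eq. (24); the function names are hypothetical, and the importance sampling ratio threshold of 1.5 is not modeled because the table does not state how it is applied.

```python
def gae(rewards, values, gamma=0.99, lam=1.0, last_value=0.0):
    """Generalized advantage estimation, computed backward over one trajectory."""
    adv, running, next_value = [], 0.0, last_value
    for r, v in zip(reversed(rewards), reversed(values)):
        delta = r + gamma * next_value - v       # one-step TD error
        running = delta + gamma * lam * running  # discounted sum of TD errors
        adv.append(running)
        next_value = v
    return adv[::-1]

def clipped_surrogate(ratio, advantage, eps=0.1):
    """Standard PPO objective term: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped * advantage)
```

With λ = 1, as listed in the table, the GAE recursion reduces to the discounted return minus the value estimate, so the advantage is a Monte Carlo estimate with no additional bias from bootstrapping.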
    • Table 5. Comparison of 50 m approach task test

      | Algorithm | Success rate/% | Final position ρ_f/m | Final velocity ρ̇_f/(m·s^-1) | Fuel consumption ΔV/(m·s^-1) | Flight time T/s |
      |---|---|---|---|---|---|
      | PPO | 100 | 0.415±0.011 | 0.093±0.004 | 11.218±1.071 | 29.537±0.215 |
      | Improved PPO | 100 | 0.543±0.061 | 0.086±0.003 | 6.735±0.122 | 40.158±1.092 |
    • Table 6. Comparison of 200 m approach task test

      | Algorithm | Success rate/% | Final position ρ_f/m | Final velocity ρ̇_f/(m·s^-1) | Fuel consumption ΔV/(m·s^-1) | Flight time T/s |
      |---|---|---|---|---|---|
      | PPO | 100 | 0.694±0.014 | 0.095±0.001 | 19.133±1.213 | 60.331±1.472 |
      | Improved PPO | 100 | 0.612±0.090 | 0.090±0.005 | 11.234±0.521 | 88.733±3.557 |
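The trade-off between fuel consumption and flight time in Tables 5 and 6 is easier to read as relative changes. The sketch below uses only the mean values from the two tables and ignores the reported spreads; the helper name is hypothetical.

```python
def relative_change(baseline, improved):
    """Signed change of the improved PPO result relative to the PPO baseline."""
    return (improved - baseline) / baseline

# Mean values copied from Tables 5 and 6 as (PPO, improved PPO)
results = {
    "50 m": {"delta_v": (11.218, 6.735), "time": (29.537, 40.158)},
    "200 m": {"delta_v": (19.133, 11.234), "time": (60.331, 88.733)},
}

summary = {
    task: {metric: relative_change(*pair) for metric, pair in metrics.items()}
    for task, metrics in results.items()
}

for task, changes in summary.items():
    print(f"{task}: delta-V {changes['delta_v']:+.1%}, flight time {changes['time']:+.1%}")
```

On these mean values, the improved PPO uses roughly 40% less ΔV in the 50 m task and roughly 41% less in the 200 m task, while the flight time grows by about 36% and 47% respectively.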
    • Table 7. Results of approach task test with interference

      | Algorithm | Task | Success rate/% | Final position ρ_f/m | Final velocity ρ̇_f/(m·s^-1) | Fuel consumption ΔV/(m·s^-1) | Flight time T/s |
      |---|---|---|---|---|---|---|
      | PPO | 50 m | 100 | 0.997±0.002 | 0.008±0.006 | 15.464±0.544 | 47.167±0.955 |
      | PPO | 200 m | 99 | 0.683±0.026 | 0.097±0.001 | 19.952±1.032 | 63.173±1.071 |
      | Improved PPO | 50 m | 100 | 0.896±0.050 | 0.097±0.001 | 7.064±0.230 | 76.738±1.942 |
      | Improved PPO | 200 m | 100 | 0.723±0.097 | 0.082±0.007 | 10.994±0.301 | 93.992±1.422 |

    Cheng HUANG, Zhicong QIU, Jiazhong XU. Autonomous decision-making for spacecraft close approaches in the Earth-Moon environment[J]. Optics and Precision Engineering, 2025, 33(6): 979

    Paper Information

    Category:

    Received: Aug. 14, 2024

    Accepted: --

    Published Online: Jun. 16, 2025

    The Author Email: Cheng HUANG (huangchengsunxi@163.com)

    DOI:10.37188/OPE.20253306.0979
