Photonics Research, Volume. 10, Issue 6, 1491(2022)

Deep reinforcement with spectrum series learning control for a mode-locked fiber laser

Zhan Li1,2, Shuaishuai Yang1,3, Qi Xiao1,2, Tianyu Zhang1,2, Yong Li1,2, Lu Han1,2, Dean Liu1,4,*, Xiaoping Ouyang1,5,*, and Jianqiang Zhu1
Author Affiliations
  • 1Key Laboratory of High Power Laser and Physics, Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Shanghai 201800, China
  • 2Center of Materials Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
  • 3Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
  • 4e-mail: liudean@siom.ac.cn
  • 5e-mail: oyxp@siom.ac.cn
    Figures & Tables(9)
    GNLSE simulation result from the NPE-based mode-locking laser system. (a) Spectral evolution when EPC is in TL. (b) Spectral evolution when EPC is in TH. (c) Spectral evolution when EPC is in TL initially and then converted to TH after 400 round trips. (d) Light transmittance caused by NPE when EPC is in TL (orange line) and TH (purple line). (e) Spectrum output after 800 round trips when EPC is in TL (orange line), TH (purple line), and Tm (green line). (f) Temporal output after 800 round trips when EPC is in TL (orange line), TH (purple line), and Tm (green line).
    Feedback time-series spectrum control model.
    MDRL agent layout.
    MDRL environment layout. LD, laser diode; WDM, 980/1060 nm wavelength division multiplexer; YDF, ytterbium-doped fiber; C, coupler; SMF, single-mode fiber; P, polarizer; I, isolator; EPC, electrical polarization controller; SF, optical spectrum filter; D, diagnostic optical spectrum analyzer.
    Spectrum and time-wave evolution during MDRL search. (a) Spectrum evolution data from the spectrum analyzer. (b) Time-wave evolution data from the high-speed photodetector and oscilloscope. (c) Obtained reward at each search step. (d) Direct autocorrelation output (blue line) and autocorrelation output after dispersion compensation (orange square, purple line).
    Mode-locked state switch by MSP. (a) Mode-locked state switch by minimizing the difference between PMSP(Wt) (purple line) and PMSP(Wc). (b) Pump power control error LMSP(Wc) (blue line) and MSP predicted error (green dashed line). (c), (g) Typical spectrum and temporal output in FML state. (d), (h) Typical spectrum and temporal output in HML state. (e), (i) Typical spectrum and temporal output in QML state. (f), (j) Typical spectrum and temporal output in QS output.
    Algorithm performance. (a) Total search step from 100 random initial states to the mode-locked state using MDRL (purple solid circle), DDPG (orange solid square), and genetic algorithm (green solid triangle). (b) Search stability test at different temperatures with MDRL (purple), DDPG (orange), and genetic algorithm (green).
    Search stability test at different temperatures with MDRL (purple), DDPG (orange), and genetic algorithm (green).
    • Table 1. Time Consumption Comparison with Recent Works


      Algorithm                   Average Time   Average Search Step
      Genetic algorithm [7]       30 min         6000
      HLA [6]                     3.1 s          3100
      DDPG [18]                   1.948 s
      DDPG in this environment    5.8 s          116.1
      MDRL in this environment    0.69 s         13.8

    Zhan Li, Shuaishuai Yang, Qi Xiao, Tianyu Zhang, Yong Li, Lu Han, Dean Liu, Xiaoping Ouyang, Jianqiang Zhu. Deep reinforcement with spectrum series learning control for a mode-locked fiber laser[J]. Photonics Research, 2022, 10(6): 1491

    Paper Information

    Category: Lasers and Laser Optics

    Received: Feb. 4, 2022

    Accepted: Apr. 29, 2022

    Published Online: May 25, 2022

    Corresponding Author Emails: Dean Liu (liudean@siom.ac.cn), Xiaoping Ouyang (oyxp@siom.ac.cn)

    DOI:10.1364/PRJ.455493
