Dual feed-forward neural network for predicting complex nonlinear dynamics of mode-locked fiber laser under variable cavity parameters

Haoyang Yu; Siyu Lai; Qiuying Ma; Zhaohui Jiang; Dong Pan; Weihua Gui

doi:10.3788/COL202523.031401

1. Introduction

Passively mode-locked fiber lasers, renowned for their stable output of ultra-short pulses with high peak power, are widely utilized in fields such as precision measurement with optical frequency combs^[1], optical communication^[2], and biomedical diagnostics^[3]. Furthermore, the modeling of mode-locked fiber lasers has a significant impact on fundamental research in soliton physics. The mode-locked fiber laser is a typical system governed by the nonlinear Schrödinger equation (NLSE), and its dynamics are shaped by gain, loss, dispersion, and nonlinear effects such as self-phase modulation. The highly nonlinear propagation of ultra-short pulses within the cavity provides a valuable platform for studying the nonlinear dynamics of soliton evolution processes^[4,5]. The conventional research paradigm uses the numerical split-step Fourier method (SSFM) to solve the NLSE iteratively, handling dispersion and nonlinear effects separately, to study pulse dynamics in mode-locked lasers. To ensure accuracy, a small iteration step is typically required, which makes this method computationally demanding and time-consuming. This poses a significant obstacle to the real-time control of nonlinear dynamic processes, as well as to the experimental design and optimization of mode-locked lasers^[6].

In recent years, artificial intelligence has made significant breakthroughs in fields such as ultrafast photonics^[7], nonlinear dynamics^[8], and nonlinear system identification and control^[9–13]. Evolutionary algorithms based on natural selection have been widely employed for experimental control and parameter optimization of mode-locked fiber laser systems^[14–18], the intelligent generation of breathing solitons in ultrafast fiber lasers^[19], and the optimization of spectral-flatness in optical frequency combs^[20]. The relationship between the cavity parameter settings of a mode-locked laser and the characteristics of single-pulse output (pulse duration and peak power) has been demonstrated by the feed-forward neural network (FNN)^[21,22]. Additionally, numerous networks have been employed to study the complete nonlinear dynamical evolution process of pulse propagation controlled by the NLSE or the generalized nonlinear Schrödinger equation (GNLSE). For modeling the dynamics of optical fiber propagation, a long short-term memory recurrent neural network (LSTM) and an FNN have been proposed for predicting the evolution of higher-order soliton compression associated with the generation of Peregrine solitons and supercontinuum generation^[23,24]. Physics-informed neural networks (PINNs) have demonstrated excellent performance in simulating soliton propagation, multi-pulse propagation, and vector soliton evolution in optical fibers^[25,26].

Compared to optical fiber propagation, employing machine learning to model mode-locked lasers poses a greater challenge. Pulses circulate back and forth within the resonant cavity, continuously influenced by various physical effects. This significantly increases the complexity of modeling the dynamic processes. Moreover, the dynamics of pulses are highly sensitive to the input pulse and cavity parameters. Even slight variations in cavity parameters can lead to entirely different dynamic processes and output characteristics. Pu et al. proposed a dimension-extension-based recurrent neural network with prior information feeding to accurately model the femtosecond mode-locked laser^[27]. Fang et al. enhanced the model with a bidirectional LSTM and attention mechanism to better capture the dynamics of mode-locked laser soliton generation^[28]. The dynamics of vector-soliton pulsations (VSP) in various complex states have also been successfully predicted by two parallel bidirectional long short-term memory recurrent neural networks (TP-Bi-LSTMs)^[29]. The detailed pulse characteristic evolution of different solitons along the cavity has also been predicted by the sparrow search algorithm long short-term memory recurrent neural network (SSA-LSTM)^[30]. Although the prediction speed of recurrent neural networks has improved by several orders of magnitude compared to the SSFM, the large number of parameters in the LSTM results in high computational complexity, as well as long training and prediction time. These factors pose limitations for real-time control and optimization of the dynamics of mode-locked lasers.

Here, we develop a faster and simpler dual feed-forward neural network (DFNN) to predict the pulse full-field evolution dynamics of a mode-locked laser, encompassing three scenarios: no soliton formation, single soliton formation, and soliton molecule formation with different temporal separations. A cavity parameter feature expander (CPFE) performs feature expansion on the cavity parameters (small signal gain, erbium-doped fiber length, and single-mode fiber length) and feeds the result into the subsequent FNN. This enhances the model’s expressive capability, thereby improving prediction accuracy and generalization ability with respect to the cavity parameters. The dynamic process predictor (DPP) is used to predict the dynamic processes. The DFNN, while maintaining similar accuracy, outperforms LSTM models in terms of speed and complexity. We anticipate that our findings will have implications for real-time control and optimization of the dynamics in mode-locked lasers and other nonlinear optical systems.

2. Theory and Model

The dynamics of pulses in mode-locked lasers can be characterized by a series of complex variations in the electric field, governed by the generalized nonlinear Schrödinger equation given in Eq. (1),

$\frac{\partial A}{\partial Z} + \frac{\alpha - g}{2} A + \frac{i \beta_{2}}{2} \frac{\partial^{2} A}{\partial T^{2}} = i \gamma |A|^{2} A + \frac{g}{2 \Omega_{g}^{2}} \frac{\partial^{2} A}{\partial T^{2}}.$ (1)

Here, $A$ represents the complex amplitude (envelope) of the pulse electric field, while $Z$ and $T$ denote the distance and time, respectively, and $\Omega_{g}$ stands for the gain bandwidth. The parameters $\alpha$, $g$, $\beta_{2}$, and $\gamma$ represent the fiber loss, gain, second-order dispersion, and nonlinearity coefficient, respectively. The gain coefficient $g$ of the erbium-doped fiber can be further expressed as in Eq. (2),

$g = \frac{g_{0}}{1 + \frac{E_{p}}{E_{s}}},$ (2)

where $g_{0}$, $E_{p}$, and $E_{s}$ represent the small signal gain, pulse energy, and saturation energy, respectively. Traditional methods model the dynamic processes by iteratively solving the NLSE using the SSFM. To ensure accuracy, a step size of 0.5 cm was chosen for the simulation. However, the SSFM is highly time-consuming. We therefore propose using neural networks in place of the numerical solver to accelerate the dynamic modeling process, aiming for better control and optimization of the dynamic process in mode-locked lasers. The workflow is illustrated in Fig. 1. The mode-locked laser is modeled precisely with the SSFM, generating a large amount of data on the variation of the pulse complex electric field amplitude under different cavity parameter settings, which serves as the training and testing sets for the neural network model.
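To make the numerical baseline concrete, the following is a minimal sketch of one symmetric split step for Eqs. (1) and (2), written with NumPy. It is an illustration under stated assumptions, not the authors' solver: the function names, the symmetric (half-linear, nonlinear, half-linear) ordering, and the example values of `alpha`, `E_sat`, and `omega_g` are hypothetical, and units are assumed to be ps, m, and W throughout.

```python
import numpy as np

def saturated_gain(A, dt, g0, E_sat):
    """Eq. (2): g = g0 / (1 + E_p / E_s), with E_p the pulse energy."""
    E_p = np.sum(np.abs(A) ** 2) * dt
    return g0 / (1.0 + E_p / E_sat)

def ssfm_step(A, dz, dt, beta2, gamma, alpha, g0, E_sat, omega_g):
    """Advance the envelope A(T) by dz using a symmetric split step of Eq. (1)."""
    omega = 2.0 * np.pi * np.fft.fftfreq(A.size, d=dt)  # angular frequency grid
    g = saturated_gain(A, dt, g0, E_sat)
    # Linear operator in the frequency domain (d^2/dT^2 -> -omega^2):
    # net gain, second-order dispersion, and parabolic gain filtering.
    lin = (g - alpha) / 2.0 + 0.5j * beta2 * omega**2 - g * omega**2 / (2.0 * omega_g**2)
    half = np.exp(lin * dz / 2.0)
    A = np.fft.ifft(half * np.fft.fft(A))              # half linear step
    A = A * np.exp(1j * gamma * np.abs(A) ** 2 * dz)   # full nonlinear step
    A = np.fft.ifft(half * np.fft.fft(A))              # half linear step
    return A

# Illustrative call: one 0.5 cm step through a gain fiber.
t = np.linspace(-10.0, 10.0, 256, endpoint=False)      # time grid in ps
dt = t[1] - t[0]
A0 = np.sqrt(0.05) * np.exp(-2.0 * np.log(2.0) * t**2 / 5.0**2).astype(complex)
A1 = ssfm_step(A0, dz=0.005, dt=dt, beta2=0.025, gamma=0.0036,
               alpha=0.0, g0=2.658, E_sat=10.0, omega_g=20.0)  # alpha, E_sat, omega_g are hypothetical
```

With a 0.5 cm step, a cavity of roughly 1.4–2.1 m requires a few hundred such steps per roundtrip, which is the cost the neural network is meant to avoid.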

Figure 1. Neural network dataset generation process. WDM, wavelength division multiplexer; EDF, erbium-doped fiber; SMF, single-mode fiber; SA, saturable absorber; OC, 10%/90% coupler.


The simulated mode-locked laser comprises a wavelength division multiplexer, an erbium-doped fiber, a single-mode fiber, a saturable absorber, and a 10%/90% coupler. The second-order dispersion and nonlinear coefficients of the erbium-doped fiber and single-mode fiber are $\beta_{2,EDF} = 25\ \mathrm{ps}^{2}\,\mathrm{km}^{-1}$, $\beta_{2,SMF} = -23\ \mathrm{ps}^{2}\,\mathrm{km}^{-1}$, $\gamma_{EDF} = 3.6\ \mathrm{W}^{-1}\,\mathrm{km}^{-1}$, and $\gamma_{SMF} = 1.3\ \mathrm{W}^{-1}\,\mathrm{km}^{-1}$. The saturable absorber is modeled by Eq. (3),

$T(u) = q_{0} - \frac{\Delta T}{1 + \frac{|u|^{2}}{P_{sat}}},$ (3)

where $T(u)$ is the transmission coefficient, $q_{0}$ is the nonsaturable transmission coefficient, $\Delta T$ represents the modulation depth, $|u|^{2}$ represents the light intensity, and $P_{sat}$ is the saturation intensity. In this Letter, $q_{0}$, $\Delta T$, and $P_{sat}$ are set to 0.8, 0.1, and 8, respectively. The evolution of the optical field through the fiber is solved iteratively using the SSFM, whereas on passing through a device such as the saturable absorber, the field is multiplied by the Jones matrix of the device. To expedite the convergence of the pulse, a pulse seeding simulation method is employed. The initial pulse is a Gaussian pulse with a full width at half-maximum (FWHM) duration of 5 ps and a peak power of 0.05 W. The simulation time window is set at 20 ps, with the complex electric field amplitude of the pulse sampled at 256 points. For each point, both the real and imaginary parts are recorded to predict the evolution of the temporal and spectral intensities simultaneously.

For each roundtrip of the pulse within the cavity, the complex amplitude frame is recorded at the coupler output, which amounts to a down-sampling operation with a sampling interval of one roundtrip. The maximum number of roundtrips is fixed at 500. During the preparation of the neural network dataset, the small signal gain $g_{0}$ is randomly selected within the range of 1–5 $\mathrm{m}^{-1}$. The lengths of the erbium-doped fiber and single-mode fiber are randomly chosen within the ranges of 0.2–0.4 m and 1.17–1.67 m, respectively, corresponding to repetition frequencies of the mode-locked laser ranging from 100 to 150 MHz, which is common for applications such as the optical frequency comb^[31]. This cavity parameter range encompasses three scenarios: no soliton formation, single soliton formation, and soliton molecule formation with different temporal separations. Each generated dataset therefore consists of the dynamic data of the pulse complex amplitude over 500 roundtrips ($500 \times 512$, since the 256 sampling points each contribute a real and an imaginary part), together with three static features: small signal gain, erbium-doped fiber length, and single-mode fiber length. To avoid overfitting, sufficient training data should be included. In our experiment, a total of 1000 datasets were generated by the SSFM, with 960 sets used for training, 20 sets for validation, and 20 sets for testing.
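The simulation settings described above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' code: all function names are hypothetical, the saturable absorber transmission of Eq. (3) is assumed to act on power (so the field is scaled by its square root), the real and imaginary parts are assumed to be concatenated in that order, and the cavity parameters are assumed to be drawn uniformly.

```python
import numpy as np

N_POINTS, WINDOW_PS = 256, 20.0   # sampling grid stated in the text
t = np.linspace(-WINDOW_PS / 2, WINDOW_PS / 2, N_POINTS, endpoint=False)

def gaussian_seed(t, fwhm_ps=5.0, peak_w=0.05):
    """Gaussian seed whose power |A|^2 has the stated FWHM and peak power."""
    power = peak_w * np.exp(-4.0 * np.log(2.0) * t**2 / fwhm_ps**2)
    return np.sqrt(power).astype(complex)

def sa_transmission(power, q0=0.8, delta_t=0.1, p_sat=8.0):
    """Eq. (3): T = q0 - dT / (1 + |u|^2 / P_sat)."""
    return q0 - delta_t / (1.0 + power / p_sat)

def apply_sa(A):
    # Assumption: T is a power transmission, so the field is scaled by sqrt(T).
    return A * np.sqrt(sa_transmission(np.abs(A) ** 2))

def pack_frame(A):
    """Stack real and imaginary parts: 256 complex samples -> 512 real values."""
    return np.concatenate([A.real, A.imag])

def unpack_frame(x):
    """Inverse of pack_frame."""
    return x[:N_POINTS] + 1j * x[N_POINTS:]

def sample_cavity(rng):
    """Draw (g0 [1/m], L_EDF [m], L_SMF [m]); uniform sampling is an assumption."""
    return np.array([rng.uniform(1.0, 5.0),
                     rng.uniform(0.2, 0.4),
                     rng.uniform(1.17, 1.67)])

A0 = gaussian_seed(t)
frame0 = pack_frame(A0)                       # shape (512,)
cavity = sample_cavity(np.random.default_rng(0))
```

Stacking 500 such frames gives the $500 \times 512$ dynamic array per dataset, alongside the three static cavity features.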

Figure 2 illustrates the structure of the DFNN used to predict the dynamic processes of mode-locked lasers under different cavity parameters. Because a pulse seeding simulation method is used, the initial pulses are identical for all datasets. Furthermore, the pulse dynamics of the mode-locked lasers are highly sensitive to the cavity parameters. Therefore, the degree to which the cavity parameter information is utilized determines the predictive accuracy of the dynamic processes. Instead of directly using the conventional FNN, we propose a modified DFNN structure. In this framework, a fully connected network known as the CPFE, which consists of two hidden layers of 512 nodes each with the ReLU activation function, performs feature expansion on the cavity parameter information. Subsequently, layer normalization is applied, and the resulting output is added to the dynamic data input for the DPP. The task of the DPP is to predict the complex amplitude data one roundtrip ahead, given the pulse complex electric field amplitude data that incorporates the static cavity parameter features. The DPP consists of four hidden layers with 1000 nodes each, using ReLU as the activation function, and a Sigmoid output layer with 512 nodes. The Kaiming normal initialization method is used to initialize the neural network parameters, and 0.0003 was selected as the initial learning rate. During the training phase, the predictions of the DFNN are compared with the numerical simulation results of the SSFM using the mean square error (MSE) loss function. The Adam optimizer is employed for 500 epochs of training to adjust the weights and biases so as to minimize the error. The parameter update rule is given in Eq. (4),

$\theta_{t+1} = \theta_{t} - \frac{\eta}{\sqrt{\hat{v}_{t}} + \varepsilon} \hat{m}_{t},$ (4)

where $\theta$ is the parameter to be updated, $\eta$ is the learning rate, $\hat{m}_{t}$ and $\hat{v}_{t}$ are the bias-corrected first- and second-moment estimates of the gradient, respectively, and $\varepsilon$ is a small constant. To prevent overfitting, the early stopping technique is implemented. Once training is complete, only the three cavity parameters and the initial pulse are needed as inputs to predict the dynamic evolution of the mode-locked laser by rolling the one-roundtrip prediction forward iteratively. The SSFM simulation results are used as the ground truth for neural network training and testing.
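For reference, the update of Eq. (4) can be written out as a short NumPy function. This is a sketch of the standard Adam step, not the training code used in this work: the learning rate 0.0003 is taken from the text, whereas `beta1`, `beta2`, and `eps` are the commonly used default values and are assumptions here, as is the function name `adam_step`.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=3e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step index.

    beta1, beta2, and eps are assumed defaults, not values stated in the paper.
    """
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment running average
    v = beta2 * v + (1.0 - beta2) * grad**2       # second-moment running average
    m_hat = m / (1.0 - beta1**t)                  # bias correction
    v_hat = v / (1.0 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)   # Eq. (4)
    return theta, m, v

# First step from zero-initialized moments: each parameter moves by about lr
# in the direction opposite to the sign of its gradient.
theta = np.zeros(2)
grad = np.array([2.0, -3.0])
theta, m, v = adam_step(theta, grad, np.zeros(2), np.zeros(2), t=1)
```

On the first step the bias correction makes the update magnitude nearly independent of the gradient scale, which is one reason a small fixed learning rate such as 0.0003 is workable across layers of different sizes.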

Figure 2. Schematic diagram of the DFNN for mode-locked fiber laser dynamic prediction.
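The data flow of the DFNN can be sketched as a plain NumPy forward pass with a rolling prediction loop. Several details are assumptions made only for illustration: the CPFE is given a 512-node output layer so that its features can be added to the 512-value frame, layer normalization is applied without learnable scale and shift, the frames are assumed to be scaled to the range [0, 1] to match the Sigmoid output, and the weights are random (untrained). All names are hypothetical.

```python
import numpy as np

def kaiming_linear(rng, n_in, n_out):
    """Weights ~ N(0, 2 / n_in) (Kaiming normal for ReLU), zero bias."""
    return rng.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_out)), np.zeros(n_out)

def mlp(x, layers, out_act=None):
    """ReLU after every layer except the last, which uses out_act (or nothing)."""
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return out_act(x) if out_act else x

def layer_norm(x, eps=1e-5):
    """Normalize over the feature axis (no learnable affine parameters)."""
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_dfnn(rng, n_frame=512):
    cpfe_sizes = [3, 512, 512, n_frame]                      # two hidden layers of 512
    dpp_sizes = [n_frame, 1000, 1000, 1000, 1000, n_frame]   # four hidden layers of 1000
    make = lambda s: [kaiming_linear(rng, a, b) for a, b in zip(s[:-1], s[1:])]
    return make(cpfe_sizes), make(dpp_sizes)

def dfnn_step(params, cavity, frame):
    """Predict the frame one roundtrip ahead."""
    cpfe, dpp = params
    features = layer_norm(mlp(cavity, cpfe))     # expand 3 parameters to 512 features
    return mlp(frame + features, dpp, out_act=sigmoid)

def rollout(params, cavity, frame0, n_roundtrips):
    """Feed each prediction back as the next input."""
    frames = [frame0]
    for _ in range(n_roundtrips):
        frames.append(dfnn_step(params, cavity, frames[-1]))
    return np.stack(frames)

rng = np.random.default_rng(0)
params = init_dfnn(rng)
cavity = np.array([2.658, 0.293, 1.552])         # g0, L_EDF, L_SMF
frames = rollout(params, cavity, rng.uniform(size=512), n_roundtrips=3)
```

Because each roundtrip costs only a handful of matrix-vector products, a 500-roundtrip evolution is obtained from 500 sequential calls to `dfnn_step`, with no sub-roundtrip stepping.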


The normalized root mean square error (NRMSE), given in Eq. (5), is used to quantitatively assess the accuracy of the model predictions; a smaller NRMSE indicates higher prediction accuracy,

$NRMSE = \sqrt{\frac{\sum_{i=1}^{n} (x_{i} - \hat{x}_{i})^{2}}{\sum_{i=1}^{n} x_{i}^{2}}},$ (5)

where $x$ and $\hat{x}$ represent the values from the SSFM simulation and the DFNN prediction, respectively, while $n$ denotes the total number of test sets.
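Eq. (5) translates directly into a few lines of NumPy. The function name `nrmse` is hypothetical, and the sums here simply run over all elements of the arrays passed in.

```python
import numpy as np

def nrmse(x_true, x_pred):
    """Eq. (5): sqrt( sum (x - x_hat)^2 / sum x^2 )."""
    x_true = np.asarray(x_true, dtype=float)
    x_pred = np.asarray(x_pred, dtype=float)
    return np.sqrt(np.sum((x_true - x_pred) ** 2) / np.sum(x_true ** 2))

perfect = nrmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])   # identical arrays
all_zero = nrmse([3.0, 4.0], [0.0, 0.0])            # predicting zero everywhere
```

A perfect prediction gives an NRMSE of 0, and predicting zero everywhere gives exactly 1, so the metric expresses the residual energy relative to the energy of the reference signal.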

3. Results

Figure 3 shows the loss curves of the DFNN training process. As the number of epochs increases, the losses on the training and validation sets decrease and gradually converge, with the training loss finally reaching around $10^{-7}$, which indicates that the trained DFNN does not suffer from overfitting or underfitting.

Figure 3. Training, validation, and test losses across epochs.


3.1. No soliton formation

The formation of conventional solitons results from the interaction among gain, loss, dispersion, and nonlinear effects. When the small signal gain is low or the erbium-doped fiber length is short, the initial pulse gains less energy than it loses after completing one roundtrip in the cavity. Consequently, the pulse energy diminishes over successive roundtrips, and no soliton is ultimately formed. Figure 4(a) illustrates the temporal dynamic evolution simulated by the SSFM and predicted by the DFNN and the LSTM under the conditions of $g_{0} = 1.145\ \mathrm{m}^{-1}$, $L_{EDF} = 0.263\ \mathrm{m}$, and $L_{SMF} = 1.563\ \mathrm{m}$. It can be observed that the DFNN accurately predicts the failure to form a soliton. As shown in Fig. 4(b), the absolute intensity prediction error is on the order of $10^{-6}$ W, which is sufficiently small for common scenarios. Reducing the imbalance of the dataset could further decrease the prediction error of the pulse intensity.

Figure 4.Temporal evolution modeling of no soliton formation propagation dynamics under g₀ = 1.145 m⁻¹, L_EDF = 0.263 m, and L_SMF = 1.563 m. (a) The temporal evolution dynamics of the SSFM (top), the DFNN (middle), and the LSTM (bottom). (b) Temporal intensity at selected roundtrips predicted by the DFNN (dashed black lines), simulated with the SSFM (solid red lines).


3.2. Single soliton formation

Sufficient gain compensation for losses within the resonant cavity is a necessary condition for soliton formation. When dispersion and nonlinear effects reach equilibrium, stable solitons can be generated. The temporal dynamics of single soliton formation under the conditions of $g_{0} = 2.658\ \mathrm{m}^{-1}$, $L_{EDF} = 0.293\ \mathrm{m}$, and $L_{SMF} = 1.552\ \mathrm{m}$, simulated by the SSFM and predicted by the DFNN and the LSTM, are illustrated in Fig. 5(a). There is good consistency between the SSFM simulation and the DFNN prediction of the temporal dynamics of single soliton formation. Initially, the pulse undergoes a stage of energy increase and pulse broadening (1–30 roundtrips). This is followed by a stage of cyclic oscillations characterized by energy increase with pulse compression and energy decrease with pulse broadening (31–110 roundtrips). During this stage, the bottom width of the pulse gradually narrows, and the oscillation periods shorten with smaller oscillation amplitudes. Ultimately, when gain, loss, dispersion, and nonlinear effects are balanced, a stable single soliton is formed. Figure 5(b) demonstrates that the proposed model can effectively predict the pulse evolution during the initial and cyclic oscillation stages, providing a rapid and accurate method for analyzing the process of single soliton formation in experiments. The final pulse durations simulated by the SSFM and predicted by the DFNN are 291.86 and 292 fs, respectively, showing close agreement.
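A pulse duration such as the one quoted above can be extracted from a sampled intensity profile by locating the half-maximum crossings. The sketch below uses linear interpolation between samples; it is one possible approach, and how the durations in this work were actually measured is not stated. The function name `fwhm` is hypothetical, and the pulse is assumed to be single-peaked and well inside the time window.

```python
import numpy as np

def fwhm(t, intensity):
    """FWHM of a single-peaked profile, interpolating linearly at half maximum."""
    half = intensity.max() / 2.0
    above = np.flatnonzero(intensity >= half)
    lo, hi = above[0], above[-1]
    # Rising edge lies between samples lo-1 and lo,
    # falling edge between samples hi and hi+1.
    t_left = np.interp(half, [intensity[lo - 1], intensity[lo]], [t[lo - 1], t[lo]])
    t_right = np.interp(half, [intensity[hi + 1], intensity[hi]], [t[hi + 1], t[hi]])
    return t_right - t_left

# Check on the Gaussian seed pulse: 5 ps FWHM, 0.05 W peak, 20 ps window, 256 points.
t = np.linspace(-10.0, 10.0, 256, endpoint=False)
seed_intensity = 0.05 * np.exp(-4.0 * np.log(2.0) * t**2 / 5.0**2)
seed_width = fwhm(t, seed_intensity)
```

Note that the 20 ps window sampled at 256 points has a spacing of about 78 fs, so resolving a duration of roughly 292 fs relies on some form of interpolation between samples.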

Figure 5.Temporal evolution modeling of single soliton formation propagation dynamics under g₀ = 2.658 m⁻¹, L_EDF = 0.293 m, and L_SMF = 1.552 m. (a) The temporal evolution dynamics of the SSFM (top), the DFNN (middle), and the LSTM (bottom). (b) Temporal intensity at selected roundtrips for detuned steady state predicted by the DFNN (dashed black lines), simulated with the SSFM (solid red lines).


3.3. Soliton molecule formation with different temporal separations

When the gain is further increased, pulse splitting and the coexistence of multiple pulses occur. Compared to single soliton formation, the formation of soliton molecules is a more complex dynamic process. Soliton molecules, as a bound state of solitons, result from the balance between attractive and repulsive forces between solitons. Figures 6(a) and 7(a) illustrate the dynamic processes of soliton molecule formation with different temporal separations, as simulated by the SSFM and predicted by the DFNN and the LSTM. The formation of soliton molecules mainly consists of three stages: transient single soliton, soliton molecules moving apart, and stable soliton molecules. Owing to the high gain, pulse energy amplification and compression are completed within very few roundtrips (around 20), forming transient single solitons. Subsequently, because the mode-locked laser can only tolerate a certain degree of nonlinear phase shift, the single pulse begins to split. During this stage, the repulsive force provided by the gain is greater than the attractive force, causing the soliton molecules to move apart (around 80 roundtrips). The gain and dispersion significantly influence the temporal separation between soliton molecules. Finally, stable soliton molecules are formed, with constant temporal separation and phase difference between molecules. The temporal separations of the soliton molecules in Figs. 6 and 7 are 2.509 and 5.176 ps, respectively. Overall, the proposed model can successfully predict the formation mechanism of soliton molecules and their stable temporal separations, providing theoretical guidance for experimentally tailoring soliton molecules.
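The temporal separation of a soliton molecule can be read from a predicted intensity frame by locating its two strongest peaks. The following sketch shows one simple way to do this on the sampling grid; it is an assumption-laden illustration (hypothetical function name, a relative height threshold of 0.5, and synthetic sech²-shaped test pulses), not the analysis procedure of this work.

```python
import numpy as np

def peak_separation(t, intensity, min_rel_height=0.5):
    """Temporal separation between the two strongest local maxima, or None."""
    inner = intensity[1:-1]
    is_peak = (inner > intensity[:-2]) & (inner >= intensity[2:])
    is_peak &= inner >= min_rel_height * intensity.max()   # ignore small ripples
    idx = np.flatnonzero(is_peak) + 1
    if idx.size < 2:
        return None                                         # fewer than two pulses found
    top2 = idx[np.argsort(intensity[idx])[-2:]]
    return abs(t[top2[1]] - t[top2[0]])

# Synthetic check: two sech^2 pulses placed 2.5 ps apart on the 256-point grid.
t = np.linspace(-10.0, 10.0, 256, endpoint=False)
pair = 1.0 / np.cosh((t - 1.25) / 0.2) ** 2 + 1.0 / np.cosh((t + 1.25) / 0.2) ** 2
single = 1.0 / np.cosh(t / 0.2) ** 2
sep_pair = peak_separation(t, pair)
sep_single = peak_separation(t, single)
```

On the raw grid the result is quantized to the sample spacing, so a value quoted to the femtosecond level would require sub-sample peak fitting on top of this.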

Figure 6.Temporal evolution modeling of soliton molecule formation with narrow temporal separation propagation dynamics under g₀ = 4.030 m⁻¹, L_EDF = 0.364 m, and L_SMF = 1.178 m. (a) The temporal evolution dynamics of the SSFM (top), the DFNN (middle), and the LSTM (bottom). (b) Temporal intensity at selected roundtrips from detuned steady state to steady state predicted by the DFNN (dashed black lines), simulated with the SSFM (solid red lines).


Figure 7.Temporal evolution modeling of soliton molecule formation with wide temporal separation propagation dynamics under g₀ = 4.322 m⁻¹, L_EDF = 0.357 m, and L_SMF = 1.406 m. (a) The temporal evolution dynamics of the SSFM (top), the DFNN (middle), and the LSTM (bottom). (b) Temporal intensity at selected roundtrips from detuned steady state to steady state predicted by the DFNN (dashed black lines), simulated with the SSFM (solid red lines).


4. Discussion

Using the same dataset, the DFNN was compared with an LSTM whose structure is similar to that of Ref. [27], in terms of prediction accuracy, training time, simulation time, model memory, and FLOPs, and with the SSFM in terms of simulation time. The results are presented in Table 1. In terms of accuracy, the LSTM slightly outperformed the DFNN because it uses more sequential information for training and prediction. However, there is no significant visual difference between the mode-locked laser dynamics predicted by the DFNN and the SSFM simulation. The DFNN, characterized by a simpler structure, demonstrated advantages over the LSTM in both complexity and speed. FLOPs and model memory are selected to represent the complexity of the model. FLOPs focus mainly on the amount of computation, while model memory focuses on the memory resource requirements. In terms of both time complexity (FLOPs) and space complexity (model memory), the DFNN outperforms the LSTM. This advantage explains why the DFNN has faster training and simulation speeds than the LSTM, and it makes the DFNN a promising new method for predicting the dynamics of mode-locked lasers, especially in resource-constrained situations. After training, the simulation speed is approximately 152 times faster than the SSFM and 4 times faster than the LSTM. With an increase in the size of the simulated dataset, the network’s parallel prediction capability will further widen the speed gap with the SSFM. Hence, although a certain number of SSFM calculations are still required due to the limitation of supervised learning, once the deep learning model is trained, the simulation speed is improved by about two orders of magnitude.
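The speed ratios quoted above follow directly from the times reported in Table 1, as the short calculation below shows. The dictionary layout and variable names are illustrative only.

```python
# Times in seconds, taken from Table 1.
sim_time = {"DFNN": 0.44, "LSTM": 1.92, "SSFM": 67.0}
train_time = {"DFNN": 9108.0, "LSTM": 59544.0}

speedup_vs_ssfm = sim_time["SSFM"] / sim_time["DFNN"]      # about 152
speedup_vs_lstm = sim_time["LSTM"] / sim_time["DFNN"]      # about 4.4
training_ratio = train_time["LSTM"] / train_time["DFNN"]   # about 6.5
```

The same table implies that the LSTM takes roughly 6.5 times longer to train than the DFNN.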

Table 1. Performance Comparison Between the DFNN, the LSTM^[27], and the SSFM


| Metric | DFNN | LSTM | SSFM |
| --- | --- | --- | --- |
| NRMSE | 0.36 | 0.27 | N/A |
| Training time (s) | 9108 | 59544 | N/A |
| Simulation time (s)^a | 0.44 | 1.92 | 67 |
| Model memory (MB) | 17.38 | 58.08 | N/A |
| FLOPs (MACs) | 4.57 × 10⁶ | 2.07 × 10⁸ | N/A |
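As a rough consistency check on the DFNN column, the parameter count of the described architecture can be tallied by hand. The layer sizes below rest on assumptions not confirmed by the text: the CPFE is taken to have a 512-node output layer after its two hidden layers, the layer normalization is taken to carry a learnable scale and shift over 512 features, and parameters are assumed to be stored as 32-bit floats.

```python
# (n_in, n_out) of each fully connected layer; the CPFE output layer is an assumption.
cpfe = [(3, 512), (512, 512), (512, 512)]
dpp = [(512, 1000), (1000, 1000), (1000, 1000), (1000, 1000), (1000, 512)]
layer_norm_params = 2 * 512                     # assumed learnable scale and shift

weights = sum(n_in * n_out for n_in, n_out in cpfe + dpp)   # one MAC per weight
biases = sum(n_out for _, n_out in cpfe + dpp)
total_params = weights + biases + layer_norm_params
memory_mib = total_params * 4 / 2**20           # float32 storage

print(f"MACs per forward pass: {weights:.3e}")
print(f"parameters: {total_params}, memory: {memory_mib:.2f} MiB")
```

Under these assumptions the tally gives about 4.55 × 10⁶ multiply-accumulate operations per roundtrip prediction and about 17.38 MiB of parameter storage, in line with the DFNN values in Table 1; the agreement is suggestive rather than a confirmation of the exact architecture.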

5. Conclusion

In summary, we utilized a DFNN to model the dynamics of a mode-locked fiber laser. The initial pulse, small signal gain, erbium-doped fiber length, and single-mode fiber length were used as network inputs. The DFNN can predict the dynamic characteristics of three scenarios as the laser transitions from a detuned steady state to a steady state: no soliton formation, single soliton formation, and soliton molecule formation with different temporal separations. Compared to the LSTM, the DFNN is simpler and has advantages in terms of speed and complexity. This shows the potential of our method for real-time control and optimization of ultrafast laser dynamics. Moreover, our method is suitable for ultrafast laser applications requiring extensive numerical simulations, such as the inverse design of mode-locked lasers^[32] and studies of rare dynamic phenomena^[19,29]. With its strong nonlinear fitting ability and high prediction speed, the proposed DFNN method also has the potential to be applied to other physical systems governed by partial differential equations, such as hydrodynamic waves.

Category: Lasers, Optical Amplifiers, and Laser Optics

Received: Jun. 22, 2024

Accepted: Sep. 3, 2024

Posted: Sep. 4, 2024

Published Online: Mar. 14, 2025

The Author Email: Qiuying Ma (mqy23@mails.tsinghua.edu.cn)

DOI:10.3788/COL202523.031401

CSTR:32184.14.COL202523.031401
