Shanghai Jiao Tong University, School of Information Science and Electronic Engineering, State Key Laboratory of Photonics and Communications, Shanghai, China
To capture the nonlinear dynamics and gain evolution in chirped pulse amplification (CPA) systems, the split-step Fourier method and the fourth-order Runge–Kutta method are integrated to iteratively address the generalized nonlinear Schrödinger equation and the rate equations. However, this approach is burdened by substantial computational demands, resulting in significant time expenditures. In the context of intelligent laser optimization and inverse design, the necessity for numerous simulations further exacerbates this issue, highlighting the need for fast and accurate simulation methodologies. Here, we introduce an end-to-end model augmented with active learning (E2E-AL) with decent generalization through different dedicated embedding methods over various parameters. On an identical computational platform, the artificial intelligence–driven model is 2000 times faster than the conventional simulation method. Benefiting from the active learning strategy, the E2E-AL model achieves decent precision with only two-thirds of the training samples compared with the case without such a strategy. Furthermore, we demonstrate a multi-objective inverse design of the CPA systems enabled by the E2E-AL model. The E2E-AL framework manifests the potential of becoming a standard approach for the rapid and accurate modeling of ultrafast lasers and is readily extended to simulate other complex systems.
Femtosecond lasers, characterized by high peak power, substantial energy output, and high repetition rates, are utilized across various fields, including microfabrication for precision manufacturing,1 medical treatment,2 and particle physics.3 Chirped pulse amplification (CPA),4 awarded the Nobel Prize in Physics in 2018, is a primary method for achieving high-power ultrafast lasers.5 Modeling the CPA system is essential for providing theoretical guidance in the design of experimental ultrafast laser systems and enabling accurate inverse design.6 Conventional CPA modeling employs the generalized nonlinear Schrödinger equation (GNLSE) to describe the linear and nonlinear propagation effects and rate equations (REs) to characterize the gain dynamics within the gain fiber.7–9 The two equations, together with their coupling, are solved through a numerical iterative method that combines the split-step Fourier method (SSFM) with the fourth-order Runge–Kutta (RK4) algorithm.10 However, this approach is computationally expensive and time-consuming, which significantly limits the efficiency of simulation-based applications of the CPA system.
Recently, artificial intelligence (AI) has emerged as a powerful tool for multi-parameter system simulation and inverse design. AI provides an efficient method to design and optimize photonic crystal structures through the multilevel abstraction of data using hierarchically structured layers.11 In the field of fiber optic communications,12 AI models are employed for tasks such as transmission quality estimation,13 building digital twins for transmission links,14 signal equalization in short-reach applications,15 and mitigating nonlinear noise in long-haul transmission systems.16 In laser optics, fast and accurate modeling of mode-locked fiber lasers has been achieved with recurrent neural networks,17 and AI has been utilized to reconstruct ultrashort optical pulses with decent robustness against noise.18
However, applying AI to model CPA systems remains quite challenging. First, femtosecond pulse propagation in gain fibers involves complex dynamics due to the interplay among gain, loss, dispersion, and nonlinearities.19 Second, many external parameters (e.g., stretcher dispersions and pump powers) significantly influence the final compressed pulse, necessitating robust multi-parameter generalization in CPA system modeling. Third, due to the large dispersion introduced in the stretcher, the temporal simulation window can extend to the nanosecond level to prevent overlap, whereas the temporal resolution must remain at the femtosecond level to guarantee a broad spectral span during the whole process. In our previous work,20 we demonstrated a long short-term memory (LSTM) structure to simulate the CPA process with pulse durations ranging from 800 fs to 1.6 ps and a temporal resolution as coarse as 200 fs. To simulate a shorter pulse, the temporal resolution needs to be finer, and a larger temporal window is also required because shorter pulses are stretched even more severely in the chirping process, resulting in a surge in the number of sample points. With an LSTM, maintaining high temporal precision for shorter pulses therefore leads to an unaffordable increase in space complexity.
In this study, an end-to-end model with an active-learning strategy (E2E-AL) is proposed to simulate the complex CPA system rapidly and accurately. The end-to-end framework focuses only on the pulses before the stretcher and after the compressor. Therefore, the temporal dimension problem induced by the stretching and by step size variations in SSFM is circumvented. The generalization over stretcher dispersions and pump powers is realized through different dedicated embedding methods, which allow the model to converge quickly and accurately during training. The model is over 2000 times faster than the SSFM-and-RK4–based conventional method (SSFM-RK4) while maintaining decent modeling accuracy. The active learning strategy reduces the amount of data required to achieve comparable precision by one-third. Furthermore, we demonstrate a fast inverse design of the CPA system by combining the genetic algorithm (GA) and the proposed E2E-AL model, where the desired CPA system parameters are automatically searched according to the given target pulse. We anticipate that the proposed method can be adapted for the rapid and accurate modeling of complex physical systems, as well as for the inverse optimization of multi-parameter physical systems.
2 Principles
2.1 Setup of the CPA Simulation
We first establish a CPA simulation system, as concisely illustrated in Fig. 1. The system consists of a seed laser, a chirped fiber Bragg grating (CFBG), two segments of ytterbium-doped fiber (YDF) for pre-amplification, an NKT rod fiber for main amplification, and a pair of gratings for pulse compression. The key parameters of the system are detailed in Table 1.
Figure 1. Concise illustration of the CPA simulation system.
The gain process in the YDF, which involves both linear and nonlinear effects as well as the dynamics of the energy-level populations, can be described by combining two physical models10: the GNLSE and the REs. The GNLSE can be iteratively solved by the SSFM, and the RK4 method is used to solve the REs. The GNLSE describes the evolution of the complex electric field envelope of light in a passive fiber.21 A form of the GNLSE that incorporates dispersion, self-phase modulation, self-steepening, and stimulated Raman scattering is given in Eq. (1):

$$\frac{\partial A}{\partial z} + \frac{\alpha}{2}A + \frac{i\beta_2}{2}\frac{\partial^2 A}{\partial T^2} - \frac{\beta_3}{6}\frac{\partial^3 A}{\partial T^3} = i\gamma\left[|A|^2 A + \frac{i}{\omega_0}\frac{\partial}{\partial T}\left(|A|^2 A\right) - T_R A\frac{\partial |A|^2}{\partial T}\right], \tag{1}$$

where $A$ is the optical field envelope and $\omega_0$ represents the central angular frequency. The variables $z$ and $T$ correspond to the distance and time, respectively. The parameters $\beta_2$, $\beta_3$, $\alpha$, and $\gamma$ denote the second-order dispersion, third-order dispersion, loss, and nonlinear coefficient, respectively. $T_R$ is associated with the slope of the Raman gain.
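As an illustration of the split-step idea only, the following minimal sketch propagates a pulse through a simplified equation that keeps just loss, second-order dispersion, and self-phase modulation. It is not the solver used in this work: the function name `ssfm_nlse`, the normalized units, and the soliton test case are assumptions chosen for the example, and third-order dispersion, self-steepening, Raman scattering, and gain are omitted.

```python
import numpy as np

def ssfm_nlse(a0, dt, length, n_steps, beta2, gamma, alpha=0.0):
    """Symmetric split-step Fourier propagation of the simplified equation
    dA/dz = -(alpha/2) A - i (beta2/2) d^2A/dT^2 + i gamma |A|^2 A."""
    dz = length / n_steps
    # Angular-frequency grid matching numpy's FFT ordering
    omega = 2.0 * np.pi * np.fft.fftfreq(a0.size, d=dt)
    # Linear operator (dispersion + loss) applied over half a step
    half_linear = np.exp((0.5j * beta2 * omega**2 - 0.5 * alpha) * dz / 2.0)
    a = a0.astype(complex)
    for _ in range(n_steps):
        a = np.fft.ifft(half_linear * np.fft.fft(a))    # half step, linear part
        a = a * np.exp(1j * gamma * np.abs(a)**2 * dz)   # full step, self-phase modulation
        a = np.fft.ifft(half_linear * np.fft.fft(a))    # half step, linear part
    return a

# Hypothetical test case in normalized units: a fundamental soliton,
# whose intensity profile should be preserved during propagation.
t = (np.arange(1024) - 512) * 0.05
a_in = 1.0 / np.cosh(t)
a_out = ssfm_nlse(a_in, dt=0.05, length=np.pi / 2, n_steps=200,
                  beta2=-1.0, gamma=1.0)
print("max change in |A|:", np.max(np.abs(np.abs(a_out) - np.abs(a_in))))
```

The cost of such a loop scales with the number of steps times the cost of two FFT pairs over the full temporal grid, which is why a nanosecond window sampled at femtosecond resolution makes each simulation expensive.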
In the gain fiber, the wavelength-dependent gain is determined by the temporal dynamics of the energy level population. REs account for absorption as well as stimulated and spontaneous emission of photons in the gain process.7 The REs describing the YDF with quasi-three energy levels22 are presented in Eqs. (2)–(4).
The variables $z$, $t$, and $\lambda$ represent the spatial coordinate, time, and wavelength, respectively. The variable $\lambda_k$ simultaneously corresponds to the pump, signal, and amplified spontaneous emission (ASE). The symbol $h$ denotes Planck's constant, whereas $c$ is the speed of light in a vacuum. The parameter $\tau$ refers to the upper-state lifetime of the excited energy level. The total dopant number density is represented by $N$, with $N_1$ and $N_2$ indicating the space-time–dependent populations of the lower and upper energy states, respectively. The notation $P^{\pm}$ is used to denote the optical power, where the superscripts + and − correspond to the forward and backward propagating beams, respectively. The absorption and emission cross-sections are represented by $\sigma_a$ and $\sigma_e$, respectively. The geometric overlap factor is $\Gamma$, whereas the background losses are characterized by $\alpha$, the group velocity is represented by $v_g$, and the wavelength resolution is defined as $\Delta\lambda$.
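To make the role of the RK4 integrator concrete, the sketch below advances a strongly simplified, single-wavelength population equation in time. The equation, the rate constants `W_ABS` and `W_EM`, and the lifetime `TAU` are hypothetical placeholders for illustration; they are not the full REs or the parameters of this work, which also couple to the optical powers along the fiber.

```python
import math

def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# Hypothetical constants (assumptions for this example only)
TAU = 1.0e-3     # upper-state lifetime, s
W_ABS = 5.0e3    # absorption rate, 1/s
W_EM = 1.0e3     # stimulated-emission rate, 1/s

def dn2_dt(t, n2):
    """Normalized upper-state fraction n2 = N2 / N with constant rates."""
    return W_ABS * (1.0 - n2) - W_EM * n2 - n2 / TAU

n2_steady = W_ABS / (W_ABS + W_EM + 1.0 / TAU)

def n2_exact(t):
    """Analytic solution for n2(0) = 0, used only as a cross-check."""
    return n2_steady * (1.0 - math.exp(-(W_ABS + W_EM + 1.0 / TAU) * t))

h = 1.0e-5       # time step, s
n2 = 0.0
trajectory = [n2]
for step in range(500):
    n2 = rk4_step(dn2_dt, step * h, n2, h)
    trajectory.append(n2)

print("final n2:", trajectory[-1], "steady state:", n2_steady)
```

In a coupled solver, a step of this kind would be interleaved with the field propagation so that the population-dependent gain is updated along the fiber.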
The simulation system parameters of the conventional simulation method built on SSFM-RK4 are listed in Table 1. The temporal window is 4 ns with a temporal resolution of 20 fs, which captures the dynamic variation of the pulses before and after stretching. The distance of the grating pair is fixed at 0.358 m, providing identical dispersion compensation for all pulses; this simplifies the CPA system and thereby reduces the difficulty of AI training. The compensated second-order dispersion is proportional to the distance of the grating pair as shown in Eq. (5), where $\varphi$ is the phase shift, $\omega$ is the angular frequency, $\lambda$ is the center wavelength, $L$ is the grating distance, $c$ is the speed of light, and $d$ and $\theta$ denote the grating period and the angle of incidence, respectively.
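The proportionality between compensated dispersion and grating distance can be checked numerically. The sketch below assumes the standard single-pass Treacy expression for a grating pair with perpendicular separation; whether this matches the exact form and pass count of Eq. (5) is an assumption. The wavelength, groove period, and incidence angle are hypothetical values, since the actual ones are listed in Table 1.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def grating_pair_gdd(distance, wavelength, period, incidence_deg):
    """Single-pass group-delay dispersion (s^2) of a grating pair, first order.
    Assumed form: -(lambda^3 L) / (2 pi c^2 d^2) * [1 - (lambda/d - sin(theta))^2]^(-3/2)."""
    s = wavelength / period - math.sin(math.radians(incidence_deg))
    prefactor = -(wavelength**3 * distance) / (2 * math.pi * C**2 * period**2)
    return prefactor * (1 - s**2) ** -1.5

# Hypothetical optical parameters (assumptions for illustration)
WAVELENGTH = 1030e-9      # m
PERIOD = 1e-3 / 1200      # m, i.e. 1200 lines/mm
INCIDENCE = 40.0          # degrees

gdd_coarse = grating_pair_gdd(0.358, WAVELENGTH, PERIOD, INCIDENCE)
gdd_fine = grating_pair_gdd(0.3593, WAVELENGTH, PERIOD, INCIDENCE)
print("GDD ratio:", gdd_fine / gdd_coarse, "distance ratio:", 0.3593 / 0.358)
```

Because the expression is linear in the distance, a small change in grating separation, such as from 0.358 m to 0.3593 m, shifts the compensated second-order dispersion by the same relative amount.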
The output pulse obtained from the fixed-length grating pair still retains residual dispersion. Consequently, we further fine-tune the distance of the grating pair for each individual pulse, as depicted in Fig. 2(a), to obtain high-quality ultrafast pulses. The dataset for training the AI model is obtained from the conventional simulation method built on SSFM-RK4. The dataset generation covers a range of essential parameters for generalization: the duration, energy, and two chirp coefficients of the seed pulse; the two dispersion coefficients induced by the CFBG stretcher; and the three gains of the three-stage amplification system, as summarized in Table 1.
Figure 2. (a) Proposed E2E model for simulating the CPA system. (b) Workflow of the active learning strategy.
The structure of the E2E-AL model for modeling the CPA system is shown in Fig. 2(a). The E2E-AL model only focuses on the pulses prior to stretching (the input) and subsequent to compression (the label), where the significant information is concentrated near the center. This allows employing a temporal window of 2 ps for center extraction, thereby greatly reducing the dimensionality of the pulse vector.
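The reduction in dimensionality can be estimated with a short sketch. It assumes that the cropped pulse keeps the 20 fs sampling of the SSFM-RK4 grid, which is not stated explicitly here; the helper name `extract_center` and the Gaussian test pulse are illustrative.

```python
import numpy as np

def extract_center(field, dt, window):
    """Keep only the samples inside a window of the given width around the grid centre."""
    n_keep = int(round(window / dt))
    start = (field.size - n_keep) // 2
    return field[start:start + n_keep]

DT_FS = 20.0              # temporal resolution, fs
FULL_WINDOW_FS = 4.0e6    # 4 ns simulation window, fs
CROP_WINDOW_FS = 2.0e3    # 2 ps extraction window, fs

n_full = int(round(FULL_WINDOW_FS / DT_FS))
t = (np.arange(n_full) - n_full // 2) * DT_FS
# Illustrative Gaussian envelope centred on the grid (not a pulse from the dataset)
field = np.exp(-t**2 / (2 * 100.0**2)).astype(complex)

cropped = extract_center(field, DT_FS, CROP_WINDOW_FS)
print(n_full, "samples on the full grid,", cropped.size, "after centre extraction")
```

Under this assumption, the full grid holds 200,000 complex samples, whereas the extracted window holds 100, so the network input is several orders of magnitude smaller than the simulation grid.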
First, the complex electric field of the pulse is converted into a vector by interleaving its real and imaginary components. Second, to achieve fast and effective convergence and better accuracy, different parameters are fed to the model through different embedding layers using different embedding methods (e.g., concatenation and addition). In detail, the dispersion parameters of the stretcher are concatenated to the end of the pulse vector via the dispersion embedding, which is a learnable dense layer. The stretcher dispersion acts on the input of the subsequent amplification processes; the concatenation operation is therefore chosen to highlight its influence as an input feature. Then, the three-stage gains are added to the concatenated vector through the gain embedding, which is another learnable dense layer dedicated to gain generalization. The gains are system parameters in the amplification processes, and the addition operation emphasizes the overall impact of the gains on the input waveform. Third, the cumulative vector is fed into the following dense layers. Afterward, the predicted pulse of the E2E-AL model is padded with zeros at both ends to keep the spectral resolution consistent with the SSFM-RK4 model. Finally, to decrease the complexity of the model and accelerate the convergence with better performance, the distance of the grating pair is fixed in the dataset, and the label is the pulse after the initial compression. Hence, the predicted pulse undergoes a fine-tuning compression to approach its Fourier transform-limited version.
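The data flow described above can be sketched as a plain NumPy forward pass. This is a structural illustration only: the layer widths, the `tanh` activation, the random (untrained) weights, and all function names are assumptions, and the actual network depth and training procedure are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    """Randomly initialised (weights, bias) pair standing in for a trained dense layer."""
    return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

def interleave(field):
    """Complex vector -> real vector [Re0, Im0, Re1, Im1, ...]."""
    out = np.empty(2 * field.size)
    out[0::2] = field.real
    out[1::2] = field.imag
    return out

def deinterleave(vec):
    """Inverse of interleave."""
    return vec[0::2] + 1j * vec[1::2]

N_PULSE, N_DISP_EMB = 100, 16           # hypothetical sizes
WIDTH = 2 * N_PULSE + N_DISP_EMB        # width after concatenation

disp_embed = dense(2, N_DISP_EMB)       # two stretcher dispersion parameters
gain_embed = dense(3, WIDTH)            # three stage gains
hidden = dense(WIDTH, WIDTH)
head = dense(WIDTH, 2 * N_PULSE)

def forward(field, dispersion, gains):
    x = interleave(field)
    d = dispersion @ disp_embed[0] + disp_embed[1]
    x = np.concatenate([x, d])                       # dispersion embedding: concatenation
    x = x + (gains @ gain_embed[0] + gain_embed[1])  # gain embedding: addition
    x = np.tanh(x @ hidden[0] + hidden[1])           # assumed activation
    return deinterleave(x @ head[0] + head[1])

def zero_pad(field, n_total):
    """Pad symmetrically with zeros up to the full simulation grid length."""
    left = (n_total - field.size) // 2
    return np.pad(field, (left, n_total - field.size - left))

pulse = np.exp(-np.linspace(-3, 3, N_PULSE) ** 2).astype(complex)
pred = forward(pulse, np.array([0.3, 0.7]), np.array([0.2, 0.5, 0.9]))
padded = zero_pad(pred, 200_000)
print(pred.shape, padded.shape)
```

Concatenation enlarges the input vector with extra features, whereas addition shifts every element of the combined vector, which is one way to read the distinction drawn between the two embedding methods.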
Active learning is a strategic approach that significantly alleviates the challenges associated with acquiring extensive datasets in data-driven models.23–25 It is particularly advantageous in scenarios where labeled data are scarce or costly to obtain, as it facilitates the efficient acquisition of the most informative data for the learning process. Hence, an active learning strategy is introduced to reduce the amount of data required. Figure 2(b) illustrates the workflow of the active learning strategy, which is based on the pool-based sampling methodology.26 The active training begins with a warm-up stage: a single batch is randomly selected from the full training dataset, which is divided into multiple batches, and serves as the warm-up dataset for initial training. The trained warm-up model is used as the initial evaluation model. Subsequently, a loop is repeated for a set number of iterations, as indicated by the solid line in Fig. 2(b). In each iteration, the remaining batches of the dataset are fed to the evaluation model obtained from the previous training cycle, and the batch with the highest error is selected and appended to the augmented training dataset for the next round of training.
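The pool-based loop described above can be sketched as follows. A least-squares linear model replaces the neural network so that the example stays self-contained; the synthetic data, batch sizes, and function names are assumptions made for illustration.

```python
import numpy as np

def fit(x, y):
    """Least-squares linear fit; a stand-in for (re)training the network."""
    w, *_ = np.linalg.lstsq(x, y, rcond=None)
    return w

def batch_error(w, x, y):
    """RMSE of the current evaluation model on one batch."""
    return float(np.sqrt(np.mean((x @ w - y) ** 2)))

def active_learning(batches, n_rounds, rng):
    pool = list(range(len(batches)))
    first = pool.pop(int(rng.integers(len(pool))))   # random warm-up batch
    selected = [first]
    w = fit(*batches[first])                         # warm-up training
    for _ in range(n_rounds):
        # Evaluate every remaining batch with the model from the previous cycle
        errors = [batch_error(w, *batches[i]) for i in pool]
        worst = pool.pop(int(np.argmax(errors)))     # batch with the highest error
        selected.append(worst)
        x = np.vstack([batches[i][0] for i in selected])
        y = np.concatenate([batches[i][1] for i in selected])
        w = fit(x, y)                                # retrain on the augmented set
    return w, selected

# Synthetic pool of 8 batches (illustrative data only)
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
batches = []
for _ in range(8):
    x = rng.normal(size=(20, 3))
    batches.append((x, x @ w_true + rng.normal(scale=0.01, size=20)))

w_fit, picked = active_learning(batches, n_rounds=4, rng=rng)
print("selected batches:", picked)
```

Selecting the batch on which the current model performs worst concentrates the labeling effort on regions of the parameter space that the model has not yet learned.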
To assess the intensity distribution and phase information of the output pulse predicted by the AI model, the normalized root-mean-square error (NRMSE) shown in Eq. (6) is used as the loss for training the AI model:

$$\mathrm{NRMSE} = \frac{1}{y_{\max} - y_{\min}} \sqrt{\frac{1}{NM}\sum_{i=1}^{N}\sum_{j=1}^{M}\left(\hat{y}_{i,j} - y_{i,j}\right)^2}. \tag{6}$$

Here, $\hat{y}$ represents the pulses predicted by the AI model in the interleaved format of the real and imaginary parts, and $y$ is the corresponding label derived from the conventional model. $N$ and $M$ denote the number of test datasets and the vector dimension, respectively. $y_{\max} - y_{\min}$ is the difference between the maximum and minimum values of all datasets.
As the waveform NRMSE cannot directly reveal the prediction accuracy of key characteristics (e.g., duration and energy), the performance of the model in predicting these characteristics is evaluated using an intuitive metric, the mean absolute percentage error (MAPE). The MAPE is shown in Eq. (7):

$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{\hat{x}_i - x_i}{x_i}\right|, \tag{7}$$

where $N$ denotes the number of test datasets, and $\hat{x}_i$ and $x_i$ denote the key characteristics of the pulses (e.g., the pulse energy and pulse duration) generated from the AI model and the conventional simulation model, respectively.
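Both metrics follow directly from their verbal definitions and can be sketched in a few lines. The normalization by the range of the label array is an assumption about how "all datasets" is interpreted, and the sample numbers below are made up solely to exercise the functions.

```python
import numpy as np

def nrmse(pred, label):
    """RMSE over all samples and vector entries, divided by the range of the labels.
    pred and label have shape (n_samples, vector_dimension)."""
    rmse = np.sqrt(np.mean((pred - label) ** 2))
    return float(rmse / (label.max() - label.min()))

def mape(pred, ref):
    """Mean absolute percentage error, in percent, of a scalar pulse characteristic."""
    return float(100.0 * np.mean(np.abs((pred - ref) / ref)))

# Made-up numbers for illustration
label = np.array([[0.0, 1.0], [2.0, 4.0]])
pred = label + 0.4
print("NRMSE:", nrmse(pred, label))

duration_ref = np.array([100.0, 200.0])    # e.g. durations in fs
duration_pred = np.array([110.0, 190.0])
print("MAPE (%):", mape(duration_pred, duration_ref))
```

The NRMSE measures agreement of the whole waveform, whereas the MAPE is applied to one scalar characteristic at a time, so a low NRMSE does not by itself guarantee a low MAPE for, say, the pulse energy.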
3 Results
3.1 Performance of the E2E-AL Model
Figure 3 compares the intensity and phase of the output pulses of SSFM-RK4 and the E2E-AL model in both the temporal and spectral domains. Three samples are selected to validate the robustness of the proposed model, with the duration of the output pulses arranged in ascending order from the shortest to the longest. Figures 3(a) and 3(d) show the results of CPA for an input pulse with a duration of 109 fs, an energy of 662 pJ, and chirped coefficients of and . The pulse is stretched via a CFBG with dispersion coefficients of and and amplified through a three-stage gain fiber with pump powers of 0.74, 5.6, and 57.9 W. The E2E-AL model accurately predicts the intensity and phase after the coarse compression in both the spectral and temporal domains, as shown in Figs. 3(a) and 3(d). Then, the distance of the grating pair is optimized to 0.3593 m in the fine-tuning compression to obtain a shorter pulse. As depicted in Figs. 3(g) and 3(j), the results after fine-tuning compression agree closely with the predictions from SSFM-RK4 in terms of intensity and phase in both the temporal and spectral domains. The temporal intensity curve exhibits residual pedestals, which result from residual third-order dispersion. The input pulse durations of two additional samples are 308 and 483 fs. Likewise, the predictions and the corresponding results after fine-tuning compression for the two samples are respectively illustrated in Figs. 3(b)–3(l), all of which demonstrate excellent performance of the E2E-AL model.
Figure 3. Output pulses of the E2E-AL model and SSFM-RK4. (a)–(c) Comparison of the intensity and phase in the spectral domain. (d)–(f) Comparison of the intensity and phase in the temporal domain. (g)–(i) Comparison of the intensity and phase in the temporal domain after fine-tuning compression. (j)–(l) Comparison of the intensity and phase in the spectral domain after fine-tuning compression.
The step size adopted in the SSFM-RK4 iteration significantly affects the precision of the conventional simulation method. Figure 4 compares SSFM-RK4 at increasing step sizes with the E2E-AL model in terms of intensity error and simulation time. The baseline step size of SSFM-RK4 is set to 1 mm and then increased in 1-mm increments up to a maximum of 16 mm. As depicted in Fig. 4, the pulse intensity predicted by the E2E-AL model matches the accuracy of SSFM-RK4 at a step size of 6 mm, with an NRMSE of 0.0047 relative to SSFM-RK4 at the baseline step size. Moreover, in terms of simulation time, the E2E-AL model is 13,200 times faster than SSFM-RK4 at the baseline step size and 2133 times faster than SSFM-RK4 at the step size with equivalent precision. The E2E-AL model thus reconciles the trade-off between accuracy and time in CPA modeling. Furthermore, Table 2 compares SSFM-RK4 at step sizes of 1, 6, and 16 mm with the E2E-AL model across multiple performance metrics, including the computational complexity, as reflected by simulation time and memory, and the accuracy of the model in terms of the intensity NRMSE, phase NRMSE, energy MAPE, and duration MAPE of the pulse. As listed in Table 2, the E2E-AL model requires far less simulation time and memory than SSFM-RK4 while achieving accuracy comparable to SSFM-RK4 at the 6-mm step size, at the cost of extensive training.
Figure 4. Intensity NRMSE and time consumption comparison between the SSFM-RK4 and E2E-AL model.
Table 2. Comparison between the SSFM-RK4 and E2E-AL model.

| | SSFM-RK4 | SSFM-RK4 | SSFM-RK4 | E2E-AL |
| --- | --- | --- | --- | --- |
| Step size | 1 mm | 6 mm | 16 mm | N/A |
| Simulation time^a | 198 s | 32 s | 13 s | 0.015 s |
| Memory^b | 9509 MB | 4885 MB | 1054 MB | 62 MB |
| Intensity NRMSE | N/A | 0.00145 | 0.02232 | 0.00108 |
| Phase NRMSE | N/A | 0.00104 | 0.12088 | 0.00199 |
| Energy MAPE | N/A | 0.47% | 7.24% | 0.51% |
| Duration MAPE | N/A | 0 | 0 | 0 |
| Training time | N/A | N/A | N/A | 11,254 s |
| Training sample | N/A | N/A | N/A | 45,000 |
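The speed and memory ratios quoted in the text follow directly from the entries of Table 2, as the short calculation below shows. The break-even estimate at the end is an additional back-of-the-envelope figure, not a result reported here, and it ignores the cost of generating the training data with SSFM-RK4.

```python
# Values copied from Table 2
ssfm_time_s = {1: 198.0, 6: 32.0, 16: 13.0}   # step size in mm -> simulation time
e2e_time_s = 0.015
ssfm_memory_mb_1mm = 9509.0
e2e_memory_mb = 62.0
training_time_s = 11254.0

# Speed-up of the E2E-AL model relative to each SSFM-RK4 step size
speedup = {step: t / e2e_time_s for step, t in ssfm_time_s.items()}
memory_ratio = ssfm_memory_mb_1mm / e2e_memory_mb

# Number of simulations after which the training time alone is amortised,
# compared with SSFM-RK4 at the equivalent-precision 6 mm step size
break_even_runs = training_time_s / (ssfm_time_s[6] - e2e_time_s)

for step, factor in speedup.items():
    print(f"{step} mm step: {factor:.0f}x faster")
print(f"memory ratio vs 1 mm step: {memory_ratio:.0f}x")
print(f"training time amortised after about {break_even_runs:.0f} runs")
```

The ratios for the 1 mm and 6 mm step sizes reproduce the factors of 13,200 and 2133 stated in the text.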
3.2 Effect of Active Learning
We train the E2E model both with an active learning strategy (E2E-AL) and without one (E2E) and compare the predictive accuracy of the two resulting models on the test dataset. Figure 5 illustrates the error of these two models relative to SSFM-RK4 for different total numbers of training samples. As the training data are randomly re-divided and the hyperparameters of the model remain fixed for the different sample amounts, the error curves in Fig. 5 do not decrease strictly monotonically as the number of samples increases. As the size of the training dataset increases, the waveform NRMSE in testing gradually decreases, and eventually, the performance improvement arising from additional data becomes negligible. The results indicate that, to achieve an equivalent NRMSE, the E2E model always requires a larger training dataset than the E2E-AL model. For the E2E model, the highest accuracy is achieved by training with 45,000 samples, whereas for the E2E-AL model, the number of samples required to achieve the same accuracy is reduced to 30,000, a reduction of one-third. Furthermore, for the same training sample size, the E2E-AL model consistently outperforms its counterpart without active learning.
Figure 5. Results of the active learning. The NRMSE of the E2E (red solid line) and E2E-AL (blue solid line) models for different numbers of training samples. The active learning strategy reduces the number of training samples as indicated by the yellow dashed line.
The workflow of the inverse design of the CPA system is shown in Fig. 6. We first generate a Gaussian pulse as the target. Subsequently, the GA is employed to optimize nine parameters simultaneously. Notably, the search ranges of the parameters are physically realizable in the system. Each individual in the population represents a combination of the parameters to be optimized. In each generation of GA, these parameter combinations are fed into the CPA model and the subsequent fine-tuning compression to yield the corresponding compressed pulses. Next, the compressed pulses and target pulse are sent to the evaluation module for fitness calculation, and then, the fitness values are fed back to the GA to proceed with genetic operations, including selection, crossover, and mutation, for generating the new population. Then, the next generation is initiated. The loop is terminated when the number of generations reaches the maximum or when the optimization process converges as the error between the target and actual pulses is reduced to within the acceptable threshold range.
Figure 6. Inverse design workflow of the CPA system.
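The loop in this workflow can be sketched with a small genetic algorithm. In the sketch below, `toy_cpa_model` is a hypothetical placeholder for the surrogate model plus fine-tuning compression: it simply maps nine normalized parameters to a Gaussian intensity profile. The population size, mutation settings, tournament selection, and elitism are likewise assumptions, not the settings used in this work.

```python
import numpy as np

T = np.linspace(-1000.0, 1000.0, 201)   # time grid, fs

def gaussian(fwhm):
    return np.exp(-4 * np.log(2) * (T / fwhm) ** 2)

def toy_cpa_model(params):
    """Hypothetical stand-in: nine parameters in [0, 1] -> temporal intensity."""
    return gaussian(100.0 + 400.0 * np.mean(params))

def fitness(params, target):
    """RMSE between the generated and target intensities (lower is better)."""
    return float(np.sqrt(np.mean((toy_cpa_model(params) - target) ** 2)))

def genetic_search(target, n_params=9, pop_size=40, max_gen=60, tol=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.random((pop_size, n_params))          # parameters within their search range
    for _ in range(max_gen):
        errs = np.array([fitness(p, target) for p in pop])
        best = pop[np.argmin(errs)].copy()
        if errs.min() < tol:                        # converged within the threshold
            break
        # Selection: binary tournament between random pairs
        a, b = rng.integers(pop_size, size=(2, pop_size))
        parents = np.where((errs[a] < errs[b])[:, None], pop[a], pop[b])
        # Crossover: uniform mixing with a shuffled partner
        mates = parents[rng.permutation(pop_size)]
        mask = rng.random((pop_size, n_params)) < 0.5
        children = np.where(mask, parents, mates)
        # Mutation: small Gaussian perturbation on a fraction of the genes
        children += rng.normal(0.0, 0.05, children.shape) * (rng.random(children.shape) < 0.2)
        pop = np.clip(children, 0.0, 1.0)
        pop[0] = best                               # elitism keeps the best individual
    return best, float(errs.min())

target = gaussian(180.0)
best_params, best_err = genetic_search(target)
print("best error:", best_err)
```

Because every fitness evaluation calls the forward model once per individual per generation, the cost of the forward model dominates the search, which is where a fast surrogate pays off.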
We compare the E2E-AL model and the SSFM-RK4 with a step size of 6 mm in the inverse design of the CPA system. As the GA is inherently stochastic, the quality of the pulse obtained from the inverse design optimization may vary between trials. First, we conduct a single-objective optimization design, focusing solely on the duration of the output pulse. The target pulse has a duration of 180 fs. Figure 7(a) compares the target pulse, the pulse found by the inverse design built on the E2E-AL model, and the pulse found by the inverse design built on SSFM-RK4. The durations of the pulses obtained from the E2E-AL and GA and from the SSFM-RK4 and GA are both 180 fs, as the target requires. Figure 7(b) shows the error between the best-matching pulse of the current generation and the target pulse as the number of generations increases.
Figure 7. Inverse design comparison between the SSFM-RK4 and E2E-AL under different target pulses. (a) and (b) Single-objective inverse design of the pulse duration. (c) and (d) Multi-objective inverse design of the pulse duration and energy. (a) and (c) Temporal intensity comparison among the target pulse, the result of the SSFM-RK4 and GA, and the result of the E2E-AL and GA. (b) and (d) Error comparison between the best-matching pulse of the SSFM-RK4 and GA and the E2E-AL and GA with the target pulse in each generation.
Further, a multi-objective inverse design of both the duration and the energy is performed. As the weight allocation between the two objectives strongly influences the GA results, we determine the weights via a simplified sweep. Concretely, the weight of the pulse duration increases from 0.1 to 0.9, whereas the weight of the energy decreases from 0.9 to 0.1. Then, we use the weights with the smallest error for the multi-objective optimization. The duration and the energy of the target pulse are set to 300 fs and , respectively. The fitness includes not only the RMSE of the temporal intensity between the target and the model-generated pulses but also the error in the sum of the temporal intensities and the error in the maximum temporal intensity. Different weights are assigned to these metrics to reflect their relative importance. The multi-objective inverse design results are shown in Figs. 7(c) and 7(d). The duration and energy of the pulse obtained from the E2E-AL and GA are 240 fs and , respectively. Meanwhile, for the pulse obtained from the SSFM-RK4 and GA, its duration and energy are 200 fs and , respectively. It should be noted that in the two-objective inverse design, we place greater emphasis on the energy. As a result, the energy errors are smaller than the duration errors after inverse design.
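A weighted fitness of this kind can be sketched as below. How the three error terms are grouped under the duration and energy weights is not specified here, so the grouping used in the sketch (shape RMSE under the duration weight, sum and peak errors sharing the energy weight) and the relative normalization of the sum and peak errors are assumptions made for illustration.

```python
import numpy as np

def fitness_terms(pulse, target):
    """The three error terms named in the text, for temporal intensity arrays."""
    rmse = float(np.sqrt(np.mean((pulse - target) ** 2)))
    sum_err = float(abs(pulse.sum() - target.sum()) / target.sum())     # energy proxy
    peak_err = float(abs(pulse.max() - target.max()) / target.max())
    return rmse, sum_err, peak_err

def weighted_fitness(pulse, target, w_duration):
    """Assumed combination: the two weights sum to one."""
    w_energy = 1.0 - w_duration
    rmse, sum_err, peak_err = fitness_terms(pulse, target)
    return w_duration * rmse + w_energy * 0.5 * (sum_err + peak_err)

# Weight sweep: duration weight from 0.1 to 0.9, energy weight from 0.9 to 0.1
duration_weights = np.linspace(0.1, 0.9, 9)

# Illustrative pulses on a femtosecond grid (not data from this work)
t = np.linspace(-1000.0, 1000.0, 201)
target = np.exp(-4 * np.log(2) * (t / 300.0) ** 2)
candidate = 0.9 * np.exp(-4 * np.log(2) * (t / 240.0) ** 2)

scores = [weighted_fitness(candidate, target, w) for w in duration_weights]
for w, s in zip(duration_weights, scores):
    print(f"w_duration={w:.1f}  w_energy={1 - w:.1f}  fitness={s:.4f}")
```

In a full sweep, each weight pair would drive a separate GA run, and the pair giving the smallest final error would be retained.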
Benefiting from the acceleration of the E2E-AL model in every CPA simulation, the E2E-AL-enabled inverse design is on average 27 times faster than its SSFM-RK4 alternative. However, compared with the acceleration of the E2E-AL model in a single simulation, the speedup of the E2E-AL-enabled inverse design is comparatively modest. This is because the fine-tuning compression accounts for most of the running time in the E2E-AL-enabled inverse design, as shown in Table 3.
Table 3. Time consumption comparison of the CPA system inverse design.^a

| | E2E-AL and GA | SSFM-RK4 and GA |
| --- | --- | --- |
| Model time (s) | 0.0025 | 34.8068 |
| Fine-tuning compression time (s) | 1.3189 | 1.4543 |
| Total time (s) | 1.3215 | 36.2611 |
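The entries of Table 3 make the bottleneck explicit. The short calculation below, using only the tabulated values, compares the model-only speedup with the overall speedup and computes the share of the E2E-AL-enabled design time spent in fine-tuning compression.

```python
# Values copied from Table 3 (seconds)
e2e = {"model": 0.0025, "compression": 1.3189, "total": 1.3215}
ssfm = {"model": 34.8068, "compression": 1.4543, "total": 36.2611}

model_speedup = ssfm["model"] / e2e["model"]          # forward model alone
total_speedup = ssfm["total"] / e2e["total"]          # whole inverse-design evaluation
compression_share = e2e["compression"] / e2e["total"] # fraction spent in fine-tuning

# Upper bound on the overall speedup if the surrogate took no time at all
speedup_limit = ssfm["total"] / e2e["compression"]

print(f"model-only speedup: {model_speedup:.0f}x")
print(f"overall speedup:    {total_speedup:.1f}x")
print(f"compression share:  {100 * compression_share:.1f}%")
print(f"speedup limit:      {speedup_limit:.1f}x")
```

Since the compression step is shared by both approaches and is not accelerated by the surrogate, it bounds the achievable overall speedup regardless of how fast the model itself becomes.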
4 Conclusion
To summarize, we have proposed an E2E model augmented with active learning (E2E-AL) to simulate the CPA system quickly and accurately. The E2E-AL model demonstrates excellent generalization ability across various parameters through different dedicated embedding methods. At equivalent modeling precision, the average simulation time of the E2E-AL model is merely 0.015 s (Table 2), which is about 2000 times faster than the conventional SSFM-RK4 method. Moreover, the memory occupied by the E2E-AL model is only 62 MB, which is 153 times smaller than that of the conventional method. Benefiting from the active learning strategy, the number of training samples required is reduced by one-third. Finally, based on the strong generalization ability of the E2E-AL model, the multi-objective inverse design of the CPA system, combining the E2E-AL model with GA, is demonstrated, which is 27 times faster than its SSFM-RK4 alternative. The proposed E2E-AL model with dedicated embedding design holds great potential for the rapid and accurate modeling of complex systems and for their multi-objective inverse design.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant Nos. 62227821, 62025503, and 62205199).