Generation of multi-channel chaotic signals with time delay signature concealment and ultrafast photonic decision making based on a globally-coupled semiconductor laser network

Yanan Han; Shuiying Xiang; Yang Wang; Yuanting Ma; Bo Wang; Aijun Wen; Yue Hao

doi:10.1364/PRJ.403319

1. INTRODUCTION

Since its advent, the laser has been applied in many fields owing to its rapid response and rich dynamics [1]. For example, it is used in high-speed random bit generators [2, 3], optical secure communication, and secret key distribution that requires synchronized chaotic signals [4–7]. Recently, photonic technologies have also been developed as efficient ways of solving conventional problems in artificial intelligence (AI) computation, such as reservoir computing [8, 9], reinforcement learning [10–12], and brain-inspired photonic neuromorphic computing [13–16].

The security of information transmission has always been a focus of attention. In optical communication systems, chaotic signals can be generated by means of delayed optical feedback, optical injection, and other external disturbances [17–22]. However, a time delay signature (TDS) can be introduced (typically by external cavity feedback) and cause internal periodicity in the chaotic oscillations [23, 24]. This feature can be analyzed by methods such as permutation entropy (PE), delayed mutual information, and autocorrelation functions (ACFs), and utilized for the reconstruction of chaotic systems [25–29], which seriously threatens the security of communication. Many methods have been reported to complicate and suppress the TDS. For example, Lee et al. first proposed complicating the TDS in a semiconductor laser (SL) subject to double optical feedback [30], and the result was later demonstrated experimentally by Wu et al. [31]. We also numerically achieved the suppression of the TDS in a mutually coupled ring network with heterogeneous time delays [32]. Very recently, Jiang et al. proposed a new scheme for the generation of wideband laser chaos with excellent TDS suppression by using parallel-coupling ring resonators as the reflector [33].

As one of the fundamental problems in reinforcement learning, adequate decision making in a dynamically changing environment is also required in frequency and channel assignments in communication networks [12, 34, 35]. The multiarmed bandit (MAB) problem is one of the most important issues in decision making. One remarkable method to solve the MAB problem, called the tug-of-war (TOW) method, was proposed by Kim et al. and was inspired by the unicellular amoeba of the true slime mold [36, 37]. In recent years, several works on ultrafast decision making based on the TOW method have been reported [38–41]. In our previous work, we proposed solving a four-armed bandit problem in parallel by sampling dual-channel TDS-concealed chaotic signals simultaneously and found that it works more efficiently [42]. However, the threshold value (TV) of each channel is not set and adjusted independently; therefore, the scheme is not completely parallel.

In this paper, we propose a scheme for the generation of laser chaos with TDS concealment and demonstrate its application in reinforcement learning. Our contribution includes three aspects. First, the newly proposed scheme for the generation of complex laser chaos is simple in structure and easy to implement. Second, we propose a scheme to solve the MAB problem in parallel using the generated laser chaos and verify its scalability and adaptability. Third, in order to solve the MAB problem in parallel, we propose a modified strategy and demonstrate its effectiveness.

2. SYSTEM MODEL AND RESULTS

A. Experimental Setup

The experimental setup of three globally coupled SLs is presented in Fig. 1. Here, three distributed feedback (DFB) lasers are driven by laser diode controllers (LDCs), which control the current and temperature of the SLs. The wavelengths of the free-running DFB lasers are precisely matched by adjusting the current and temperature. In this setup, the optical output from each DFB laser is divided into two parts by a 10:90 fiber coupler (FC). The smaller part is sent to the measurement module, where the optical signal is detected by a high-speed photodiode (PD, HP11982A, 15 GHz) and analyzed by a real-time oscilloscope (OSC) with an 8-bit analog-to-digital converter (Keysight DSOV334A, 33 GHz, 80 GS/s), or sent directly to an optical spectrum analyzer (OSA, Ando AQ6317). The larger parts from the three lasers are combined by an FC through fiber jumpers of different lengths, pass through a variable optical attenuator (VOA), and are fed back to all three DFB lasers via an optical circulator (OC). Thus, the coupling strength and feedback strength are adjusted simultaneously by the VOA. For simplicity, both are referred to as the coupling strength in the following.

Figure 1. Experimental setup of three globally coupled SLs. DFB1, DFB2, DFB3, three distributed feedback lasers; LDC, laser diode controller; FC, fiber coupler; OC, optical circulator; VOA, variable optical attenuator; $τ_{11}, τ_{22}, τ_{33}$, feedback delay times; PD, photodiode; OSC, oscilloscope; OSA, optical spectrum analyzer.


B. Experimental Results

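$$C_{m}(\Delta t) = \frac{\langle [I_{m}(t + \Delta t) - \langle I_{m}(t + \Delta t) \rangle] [I_{m}(t) - \langle I_{m}(t) \rangle] \rangle}{\sqrt{\langle [I_{m}(t + \Delta t) - \langle I_{m}(t + \Delta t) \rangle]^{2} \rangle \langle [I_{m}(t) - \langle I_{m}(t) \rangle]^{2} \rangle}},$$

where $I_{m}(t)$ is the intensity time series of the $m$th laser, $\Delta t$ is the lag time, and $\langle \cdot \rangle$ denotes the time average. The quantity $ρ_{m}$ plotted in the figures below characterizes the TDS of laser $m$.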
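As a minimal sketch of how the ACF above could be evaluated on a sampled intensity series, the following Python function computes the normalized autocorrelation at an integer lag. The function name `autocorrelation`, the lag expressed in samples rather than in time, and the sine-wave test signal are illustrative assumptions and are not part of the measurement chain described here.

```python
import math


def autocorrelation(intensity, lag):
    """Normalized autocorrelation C_m(Δt) of one series at an integer lag in samples.

    Raises ZeroDivisionError if either segment is constant (zero variance).
    """
    n = len(intensity)
    if lag < 0 or lag >= n - 1:
        raise ValueError("lag must satisfy 0 <= lag < len(intensity) - 1")
    shifted = intensity[lag:]        # I_m(t + Δt)
    base = intensity[:n - lag]       # I_m(t)
    count = len(base)
    mean_s = sum(shifted) / count
    mean_b = sum(base) / count
    # The 1/count factors of the averages cancel between numerator and denominator.
    cov = sum((s - mean_s) * (b - mean_b) for s, b in zip(shifted, base))
    var_s = sum((s - mean_s) ** 2 for s in shifted)
    var_b = sum((b - mean_b) ** 2 for b in base)
    return cov / math.sqrt(var_s * var_b)


# Illustrative signal: a sine wave with a period of 20 samples.
series = [math.sin(2 * math.pi * i / 20) for i in range(400)]
acf_full_period = autocorrelation(series, 20)   # close to +1
acf_half_period = autocorrelation(series, 10)   # close to -1
```

For a periodic signal the ACF returns to 1 after one full period and to -1 after half a period. A residual peak near the feedback delay plays the same role for a chaotic laser output, which is why the ACF exposes the TDS.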

Figure 2. (a1)–(a3) The chaotic time series from the three DFB lasers; (b1)–(b3) the ACFs; (c1)–(c3) the power spectra. The attenuation is 9 dB, $I_{1}, I_{2}, I_{3} = 28.34, 24.5, 26.6\ \mathrm{mA}$, $T_{1}, T_{2}, T_{3} = 27.75, 15.5, 18\ ^{\circ}\mathrm{C}$.



Figure 3. (a) $ρ_{m}$ as a function of attenuation; (b) $ρ_{m}$ as a function of $I_{2}$.


Figure 4. (a1)–(a3) Time series of signals at states I, II, and III, respectively; (b1)–(b3) the corresponding power spectra.


C. Numerical Results


In Fig. 5, we present the time series, the ACFs, and the power spectra of the numerical results, as in Fig. 2. The results show that the TDS can be concealed in such a scheme if the parameters are properly selected. Note that parameter mismatch is important for improving the concealment of the TDS. When the currents are the same for the three SLs, the region in which the TDS is concealed is quite narrow. To find a proper bias current, we fix the currents of two SLs and vary that of the third. In this way, we find that a current mismatch of 0.5–3.5 mA allows better TDS concealment in all three SLs. We choose a mismatch of 2.5 mA.

Figure 5. (a1)–(a3) The chaotic time series from the three SLs; (b1)–(b3) the ACFs; (c1)–(c3) the power spectra. The parameters are $I_{m} = 20, 22.5, 20\ \mathrm{mA}$; $k_{r m} = 11.7, 16.7, 11.7\ \mathrm{ns}^{-1}$; $τ_{m m} = 2, 2.02, 2.04\ \mathrm{ns}$; $m = 1, 2, 3$.



Figure 6. (a1)–(a3) Two-dimensional maps of $ρ_{m}$ as functions of the coupling strength $k_{r 2}$ and bias current $I_{2}$ for DFB1, DFB2, and DFB3, respectively; (b1)–(b3) the PE of DFB1, DFB2, and DFB3, respectively. $I_{1} = I_{3}$, $I_{2} = I_{1} + 2.5\ \mathrm{mA}$; $k_{r 1} = k_{r 3}$, $k_{r 2} = k_{r 1} + 5\ \mathrm{ns}^{-1}$; $τ_{m m} = 2, 2.02, 2.04\ \mathrm{ns}$.
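Panels (b1)–(b3) of Fig. 6 report the PE. As a hedged illustration of that measure, the sketch below implements the standard ordinal-pattern (Bandt–Pompe) permutation entropy, normalized to the range 0 to 1. The embedding `order`, the embedding `delay` in samples, and the test signals are assumptions chosen for illustration; the settings used for the figure are not stated here.

```python
import math
import random
from collections import Counter


def permutation_entropy(series, order=3, delay=1):
    """Normalized permutation entropy: 0 for one ordinal pattern, 1 for uniform patterns."""
    span = (order - 1) * delay
    if order < 2 or delay < 1 or len(series) <= span:
        raise ValueError("need order >= 2, delay >= 1 and a series longer than (order-1)*delay")
    patterns = Counter()
    for start in range(len(series) - span):
        window = [series[start + j * delay] for j in range(order)]
        # Ordinal pattern: positions of the window entries sorted by value.
        patterns[tuple(sorted(range(order), key=window.__getitem__))] += 1
    total = sum(patterns.values())
    entropy = -sum((c / total) * math.log(c / total) for c in patterns.values())
    return entropy / math.log(math.factorial(order))


ramp = list(range(100))                           # fully regular signal
rng = random.Random(0)
noise = [rng.random() for _ in range(5000)]       # irregular signal
pe_ramp = permutation_entropy(ramp)               # essentially 0
pe_noise = permutation_entropy(noise)             # close to 1
```

When PE is scanned over the embedding delay, a pronounced dip near the feedback delay indicates a residual TDS, so a flat, high PE is the desirable outcome.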



Figure 7. TDS concealment with different time delays. $I_{1} = I_{3}$. (a) $I_{2} = I_{1} - 1\ \mathrm{mA}$, $k_{r m} = 12, 10, 12\ \mathrm{ns}^{-1}$, $τ_{m m} = 3, 3.02, 3.06\ \mathrm{ns}$; (b) $I_{2} = I_{1} - 1\ \mathrm{mA}$, $k_{r m} = 12, 11, 12\ \mathrm{ns}^{-1}$, $τ_{m m} = 3, 3.1, 3.2\ \mathrm{ns}$; (c) $I_{2} = I_{1} - 2\ \mathrm{mA}$, $k_{r m} = 13.3, 12.3, 13.3\ \mathrm{ns}^{-1}$, $τ_{m m} = 4, 4.07, 4.13\ \mathrm{ns}$; (d) $ρ_{m}$ as a function of $τ_{11}$, with $τ_{22} = τ_{11} + 0.3\ \mathrm{ns}$, $τ_{33} = τ_{11} + 0.7\ \mathrm{ns}$, $I_{m} = 21, 19, 21\ \mathrm{mA}$, $k_{r m} = 12, 10, 12\ \mathrm{ns}^{-1}$. In all panels, $m = 1, 2, 3$.


3. APPLICATION IN DECISION MAKING

In this section, we utilize the triple-channel chaotic signals generated by the above scheme to solve an eight-armed bandit problem in parallel. A user who chooses one of eight slot machines has a chance of getting a reward. The reward probabilities differ between machines and are unknown to users [40]. Users need to explore the slot machines to find the one with the highest reward probability, which we call the target machine. Because of the trade-off known as the exploration-exploitation dilemma [40, 41], the exploration needs to be effective so that the target machine is found as quickly as possible without the risk of missing it.
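To make the problem statement concrete, here is a small Python sketch of the environment side only: eight slot machines, each paying a reward with a fixed probability that the player cannot see. The class name `BernoulliBandit`, the 0-based machine indices, and the seeded generator are illustrative assumptions; the reward distribution is the one listed in the Fig. 11 caption.

```python
import random


class BernoulliBandit:
    """Slot machines that each pay a reward of 1 with a fixed, hidden probability."""

    def __init__(self, reward_probabilities, seed=None):
        self.reward_probabilities = list(reward_probabilities)
        self._rng = random.Random(seed)

    def play(self, machine):
        """Play one machine (0-based index); return 1 for a reward, 0 otherwise."""
        return 1 if self._rng.random() < self.reward_probabilities[machine] else 0

    def target_machine(self):
        """Index of the machine with the highest reward probability."""
        return max(range(len(self.reward_probabilities)),
                   key=self.reward_probabilities.__getitem__)


P = [0.3, 0.2, 0.8, 0.1, 0.2, 0.3, 0.5, 0.4]   # distribution from the Fig. 11 caption
bandit = BernoulliBandit(P, seed=1)
plays = 10_000
hit_rate = sum(bandit.play(2) for _ in range(plays)) / plays   # empirical rate of machine 2
```

A player only ever observes the values returned by `play`, so estimating which machine is the target requires repeated trials. That need for repeated trials is the source of the exploration-exploitation trade-off.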

A. Scheme of Solving MAB Problem in Parallel


Figure 8. Architecture for the eight-armed bandit problem processed in parallel based on triple-channel chaos.
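One way to read the parallel architecture is that each channel contributes one binary decision, so three channels address $2^{3} = 8$ machines and four channels address 16. The Python sketch below illustrates that reading. The comparison polarity (a sample above the threshold gives 1), the bit ordering (channel 1 as the most significant bit), and the function names are assumptions made for illustration rather than details confirmed by the text.

```python
def channel_decision(sample, threshold):
    """One binary decision: 1 if the sampled chaotic signal exceeds the threshold, else 0."""
    return 1 if sample > threshold else 0


def select_machine(samples, thresholds):
    """Combine one decision per channel into a machine index (first channel = highest bit)."""
    if len(samples) != len(thresholds):
        raise ValueError("need exactly one threshold per channel")
    index = 0
    for sample, threshold in zip(samples, thresholds):
        index = (index << 1) | channel_decision(sample, threshold)
    return index


# Three simultaneously sampled channel values compared with three thresholds.
machine = select_machine([0.4, -0.1, 0.7], [0.0, 0.0, 0.0])   # decisions 1, 0, 1 -> index 5
```

Because every channel is compared with its own threshold, the three decisions can be taken in the same sampling instant, and adding a channel doubles the number of addressable machines.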


B. Threshold Value Adjustment

$$\mathrm{TH}_{i} = k \lfloor \mathrm{TV}_{i} \rfloor, \quad i = 1, 2, 3$$
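The quantization in this expression can be checked with a few lines of Python. This is only a sketch of the formula itself: the numerical values of $\mathrm{TV}_{i}$ and $k$ below are made up for illustration, $k$ is treated as a constant scale factor, and the rule that updates $\mathrm{TV}_{i}$ after each play is not reproduced.

```python
import math


def threshold_level(tv, k):
    """Quantized threshold TH_i = k * floor(TV_i) for one channel."""
    return k * math.floor(tv)


tv_values = [2.7, -0.4, 0.0]     # illustrative TV_1..TV_3, not taken from the paper
k = 0.1                          # illustrative scale factor
thresholds = [threshold_level(tv, k) for tv in tv_values]
```

Note that the floor operation rounds toward negative infinity, so a slightly negative $\mathrm{TV}_{i}$ such as -0.4 maps to the level $-k$ rather than to zero.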


N_{D_{i}} = k, total

C. Results and Discussion

$$\mathrm{CDR} = N_{\mathrm{hit}} / N_{\mathrm{total}}$$
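As a small worked example of this ratio, the Python sketch below counts how often the target machine was selected among a set of selections. It assumes that $N_{\mathrm{hit}}$ counts selections of the target machine and that $N_{\mathrm{total}}$ is the number of selections considered; the function name `cdr` and the sample data are illustrative.

```python
def cdr(selections, target):
    """CDR = N_hit / N_total over a list of selected machine indices."""
    if not selections:
        raise ValueError("selections must not be empty")
    n_hit = sum(1 for machine in selections if machine == target)
    return n_hit / len(selections)


# Selections made at one learning cycle in four independent runs; the target is machine 2.
rate = cdr([2, 2, 1, 2], target=2)   # three hits out of four selections
```

Evaluating this ratio at every learning cycle over repeated runs gives the kind of CDR evolution curve shown in the figures.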

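$$C_{m n}(\Delta t) = \frac{\langle [I_{m}(t + \Delta t) - \langle I_{m}(t + \Delta t) \rangle] [I_{n}(t) - \langle I_{n}(t) \rangle] \rangle}{\sqrt{\langle [I_{m}(t + \Delta t) - \langle I_{m}(t + \Delta t) \rangle]^{2} \rangle \langle [I_{n}(t) - \langle I_{n}(t) \rangle]^{2} \rangle}},$$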
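The cross-correlation between two channels can be evaluated in the same way as the ACF. The Python sketch below is a minimal illustration under stated assumptions: the lag is an integer number of samples (negative lags are allowed), both series have equal length, and the test data are seeded Gaussian noise rather than laser signals.

```python
import math
import random


def cross_correlation(series_m, series_n, lag):
    """Normalized cross-correlation C_mn(Δt) at an integer lag in samples.

    Raises ZeroDivisionError if either overlapping segment is constant.
    """
    n = len(series_m)
    if len(series_n) != n:
        raise ValueError("both series must have the same length")
    if abs(lag) >= n - 1:
        raise ValueError("lag is too large for the series length")
    if lag >= 0:
        shifted, base = series_m[lag:], series_n[:n - lag]     # I_m(t + Δt), I_n(t)
    else:
        shifted, base = series_m[:n + lag], series_n[-lag:]
    count = len(base)
    mean_s = sum(shifted) / count
    mean_b = sum(base) / count
    # The 1/count factors of the averages cancel between numerator and denominator.
    cov = sum((s - mean_s) * (b - mean_b) for s, b in zip(shifted, base))
    var_s = sum((s - mean_s) ** 2 for s in shifted)
    var_b = sum((b - mean_b) ** 2 for b in base)
    return cov / math.sqrt(var_s * var_b)


rng = random.Random(0)
channel_n = [rng.gauss(0.0, 1.0) for _ in range(500)]
channel_m = channel_n[3:] + channel_n[:3]      # channel_m[t] = channel_n[t + 3], wrapped at the end
peak = cross_correlation(channel_m, channel_n, -3)   # close to 1: the shift is undone
```

Scanning the lag and taking the largest magnitude gives a single number that summarizes how strongly two channels are correlated.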


Figure 9. Evolution of CDR for the triple-channel signals with different correlations and for the one-channel scheme. The vertical bars indicate the standard deviation around the mean value for three sets of simulated signals. $P = [0.2, 0.2, 0.8, 0.2, 0.2, 0.2, 0.2, 0.2]$.


Next, we compare the decision-making performance of the one-channel scheme and the triple-channel scheme by calculating the CC for different sampling intervals. The results are illustrated in Fig. 10. It can be seen that both schemes converge quickly when the sampling interval is as small as 10 ps, which requires the highest sampling rate currently available, but slow down as the sampling interval increases. Hence, we choose a sampling interval of 10 ps in the following. Also note that the CC value of the triple-channel scheme is statistically lower and grows more slowly than that of the one-channel scheme, which means that the proposed scheme converges more quickly to the desired accuracy and that its performance is relatively stable against variation of the sampling interval. Note that in Fig. 10 and the following, the CC value of the one-channel scheme is the average of the results for the three channel signals.
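Reading CC as the number of learning cycles needed to reach a desired CDR (an interpretation inferred from the discussion here), it can be extracted from a CDR curve as sketched below in Python. The function name `cycles_to_reach`, the 1-based cycle count, and the target value of 0.95 used in the example are hypothetical choices.

```python
def cycles_to_reach(cdr_curve, target_cdr):
    """Return the 1-based cycle at which the CDR curve first reaches target_cdr, or None."""
    for cycle, value in enumerate(cdr_curve, start=1):
        if value >= target_cdr:
            return cycle
    return None


# Illustrative CDR curve over five learning cycles.
curve = [0.20, 0.50, 0.90, 0.96, 0.99]
cc = cycles_to_reach(curve, target_cdr=0.95)   # the fourth cycle is the first at or above 0.95
```

A lower value means faster convergence, which is the sense in which the two schemes are compared in Fig. 10.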

Figure 10. CC with different sampling intervals for the one-channel and triple-channel schemes, respectively. The vertical bars indicate the standard deviation around the mean value for eight sets of simulated signals. $P = [0.8, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]$.



Figure 11. (a), (b) CC and $ρ_{m}$ as functions of attenuation; (c) CDR as a function of learning cycles. The vertical bars indicate the standard deviation around the mean value for 11 sets of signals with $ρ_{m} < 0.2$ and $ρ_{m} > 0.3$, respectively. The sampling interval is 10 ps. $P = [0.3, 0.2, 0.8, 0.1, 0.2, 0.3, 0.5, 0.4]$.


Next, we compare the decision-making performance of the one-channel scheme, the previously proposed parallel scheme [42], and the triple-channel scheme by calculating the CC, using experimentally generated signals with different bias currents. The results are illustrated in Fig. 12. Triple-channel1 and Triple-channel2 denote the new scheme and the previously proposed scheme, respectively. Both use three channels of signals to solve the eight-armed bandit problem; however, in the Triple-channel2 scheme, the algorithm adopted for threshold adjustment is the same as in the one-channel scheme. It can be seen that for both triple-channel schemes, the CC is quite stable against variation of the bias current, and their performance is quite similar. For the one-channel scheme, by contrast, it takes more cycles to reach the desired CDR, and the CC value fluctuates more noticeably as the bias current changes, indicating that the one-channel scheme may be more sensitive to the dynamics of the signals.

Figure 12. CC as a function of bias current, for a comparison of the triple-channel scheme (red solid line), the previously investigated parallel scheme (blue dotted line), and the one-channel scheme (black solid line). The vertical bars indicate the standard deviation around the mean value for three runs. $P = [0.3, 0.2, 0.8, 0.1, 0.2, 0.3, 0.5, 0.4]$.



Figure 13. (a) Evolution of the CDR for different distributions of reward probability in a changing environment. $P_{1} = [0.8, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]$, $P_{2} = [0.7, 0.2, 0.3, 0.2, 0.2, 0.2, 0.2, 0.2]$. (b) Threshold value adaptation for $P_{2}$.



Figure 14. Evolution of the averaged CDR with randomly selected signals for eight-armed and 16-armed bandit problems. $P = [0.3, 0.2, 0.8, 0.1, 0.2, 0.3, 0.5, 0.4]$ and $P = [0.3, 0.2, 0.8, 0.1, 0.2, 0.5, 0.2, 0.2, 0.2, 0.3, 0.3, 0.4, 0.5, 0.1, 0.1, 0.2]$, respectively.


4. CONCLUSION

In conclusion, we propose a simple scheme for generating triple-channel chaotic signals with TDS concealment and demonstrate it via experiment and numerical analysis. The parameter range that contributes to better TDS concealment is explored by systematically changing the bias current and the coupling strength. Moreover, we utilize the generated triple-channel chaotic signals and a modified strategy to solve an eight-armed bandit problem in parallel, and we investigate how the correlation between the channel signals, the TDS concealment, and the sampling interval influence the decision-making performance. In the proposed decision-making scheme, the algorithm is simplified compared with the one-channel scheme and the previously studied parallel scheme, which makes it easier to implement. Despite this simplification, it can perform even better provided that the mutual correlation is relatively low. Moreover, its performance is more stable across different sampling rates than that of the one-channel scheme. The proposed system is scalable to MAB problems of varying size and is adaptable to changing environments. This work may be helpful for potential applications in the ultrafast processing of AI.

Category: Lasers and Laser Optics

Received: Jul. 21, 2020

Accepted: Sep. 3, 2020

Published Online: Oct. 29, 2020

The Author Email: Shuiying Xiang (jxxsy@126.com)
