Streamlined photonic reservoir computer with augmented memory capabilities

Changdi Zhou; Yu Huang; Yigong Yang; Deyu Cai; Pei Zhou; Kuenyao Lau; Nianqiang Li; Xiaofeng Li

doi:10.29026/oea.2025.240135

Introduction

The rapid advancement of artificial neural networks (ANNs) has significantly contributed to the progress of artificial intelligence (AI), demonstrating unparalleled competitiveness in fields such as image processing, game playing, and protein structure prediction¹⁻⁴. Although conventional software-based ANNs rooted in the von Neumann computing architecture have successfully mimicked human cognitive abilities in complex tasks, they confront several formidable challenges, especially in energy consumption and processing speed⁵, as dictated by Moore's Law. Therefore, ANNs based on physical platforms, including electronics⁶⁻⁸, spintronics⁹⁻¹¹, photonic hardware¹²⁻¹⁷, and others¹⁸⁻²¹, offer alternative solutions to meet the extensive computing demands of AI. Photonics-based technology, in particular, is a capable candidate due to its extremely fast processing speed and ultra-low power consumption¹⁶,²²⁻²⁴. Significantly, reservoir computing (RC), a simplified machine learning scheme derived from recurrent neural networks²⁵⁻²⁷, is known as a paradigm amenable to analog implementation because only the RC output layer requires simple algorithm-based training. This desirable feature significantly reduces the complexity and cost of the training procedure, making RC a hardware-friendly and high-speed computing alternative. In particular, many photonic RC schemes can be further simplified by the so-called time-delay RC (TDRC), which contains only a single physical node with a time-delay feedback loop (FL)²⁸ and employs time-multiplexing to establish virtual nodes, in contrast with the large number of physical nodes needed in the spatial RC. The fading memory capability provided by the FL gives such TDRC a unique advantage in handling time-dependent tasks.

However, the FL significantly influences both the network flexibility and the capabilities of TDRC, posing challenges for hardware implementation. The introduction of the FL provokes a multifaceted trade-off: it typically induces complex nonlinear dynamics within the system, necessitating precise parameter control, e.g., of external optical injection or frequency detuning, to stabilize the output of semiconductor laser-based RCs. Furthermore, the length of the FL has a significant impact on the performance of TDRC. Extending the delay line to accommodate more neurons also results in a larger footprint and reduced information-processing rates²⁹, whereas a shorter delay line compromises the computing accuracy due to the limited network size. These issues have prompted the exploration of novel structures, such as next-generation RC (NG-RC)³⁰,³¹. The NG-RC obviates the need for a reservoir, directly constructing output feature vectors through various nonlinear combinations of both present and historical data. This approach has demonstrated impressive performance across multiple tasks while reducing training and warm-up times. However, it compromises physical openness, thus presenting challenges for neuromorphic implementations. Moreover, feedback-free RC (FFRC) or extreme learning machines have also been proposed and demonstrated³²,³³. On account of the absence of the FL, which leads to the loss of fading memory, conventional FFRC may show equivalent performance in discrete data processing, e.g., data recovery or image classification³⁴⁻³⁶, but poor performance in time-related tasks or tasks with high memory requirements. To compensate for the loss of memory ability, Takano et al. preliminarily introduced a weighted sum of past data to the original one and achieved satisfactory prediction performance on the Santa-Fe time series³⁷, but without accounting for the connection between independent sampling-period data. Zeng et al. theoretically proposed three different pre-processing and post-processing methods based on Takano’s work³⁸. Besides, the required FL or memory capability can also be provided through the inherent characteristics of some specific physical devices. Phang reported a novel RC based on an optical-fiber and loop-free kernel configuration utilizing the intrinsic memory property of stimulated Brillouin scattering³⁹. In Zhang et al.’s work, the pulse broadening effect caused by dispersion in optical fiber provides a short-term fading memory⁴⁰. Such approaches undoubtedly impose specific requirements on hardware components. In light of the pressing demands for hardware compatibility and high integration, it is crucial to explore novel neural networks with flexible yet simple structures for high-speed and high-performance operations.

In this work, we propose a streamlined photonic RC with augmented memory capabilities based on a specially designed quasi-convolution coding (QC). By analyzing and optimally selecting the key parameters of QC, such a QC-based RC (QRC) can acquire fading memory from the encoding data, analogous to the function of the FL in TDRC, thereby significantly simplifying the processing system by obviating the FL. For such a photonic neural network, we employ a vertical-cavity surface-emitting laser (VCSEL) featuring a low threshold current and dual polarization modes to facilitate low-power parallel processing. Its compact size also offers significant advantages in terms of integration. Notably, the dual modes of the VCSEL enable parallel data insertion and processing, where different data can be loaded simultaneously, halving the processing latency⁴¹. Herein, the crosstalk between the X polarization mode (XM) and the Y polarization mode (YM) establishes a connection between the original and encoding data, imitating the interaction between the input and the memory in human brains⁴², which greatly benefits time-related tasks. The performance of the QRC is confirmed theoretically and experimentally through detailed comparisons with the conventional TDRC across several benchmark tasks. Additionally, we compare this work with existing experimental results for TDRC based on semiconductor lasers, further demonstrating the feasibility of our proposed scheme. Therefore, our approach might provide a viable alternative for the hardware implementation of easily integrated high-performance RC systems.

Methodology

Algorithms

Quasi-convolution coding

Traditional convolutional coding (CC) involves multiplying the source pixel with the convolution kernel and summing up to obtain the target pixel. The convolution kernel then slides in a predetermined direction and repeats the above operation to generate all outputs. This encoding method has proven highly effective in extracting data features, especially in machine vision⁴³.

Drawing on the operational criterion of CC and the remarkable memory framework of the human brain, where the neuronal responses can be maintained by a pure feedforward mechanism⁴⁴, we propose a QC algorithm, detailed in Fig. 1(a−c). Figure 1(a) illustrates the weighted sum process in QC. Here, the input is first transformed into a one-dimensional (1-D) vector via matrix transformation to facilitate the encoding process, while the sliding step is designed based on the step coefficient β and the sampling period T. Figure 1(b) illustrates the input or sliding direction of the original data (blue line) and the convolution kernel (red line). The convolution kernel coefficient of the j-th slide, c_j (j=1, 2, …, Q), gradually decreases after each slide to mimic the fading memory of the human brain. It is determined by the kernel size Q and designed as $c_{j} = (Q + 1 - j) / \sum_{k = 1}^{Q} k$, where the denominator is the sum of the integers from 1 to Q. Figure 1(c) shows the stretched-out view of QC. We simplify the encoding process by sliding through the entire sampling period, which is equivalent to encoding individual points separately, and thus can be expressed as:

Figure 1. Concept of QC and RC structures. (a−c) Design of the proposed QC, which has a similar operational criterion to convolutional coding in data processing, but meanwhile can extract features in the temporal dimension and provide memory capability. (d−f) Schematic diagram of different RC structures. The comparison of the nodes’ states between (d) Spatial RC, (e) TDRC and (f) proposed QRC verifies that the encoding data will provide the memory capability through QC. NL, nonlinear nodes.


$s_{j} (t) = s_{0} [t - j (β T)],$ (1)

$s_{M} (t) = \sum_{j = 1}^{Q} c_{j} s_{j} (t),$ (2)

where t represents the continuous time and s₀(t) is the original data, which can be a time-continuous input stream or a time-discrete input; we take the former as an example here. The stride of the kernel is jointly determined by the step coefficient β and the single sampling period T, and is given by their product βT. The term s₀[t−j(βT)] in Eq. (1) indicates that the original data s₀(t) has slid j times in the time dimension with βT as the step size. Each slid data s_j(t) is first multiplied by the corresponding convolution kernel coefficient c_j, and the products are then summed from j=1 to the kernel size Q, where j is a positive integer, to yield the final encoding data s_M(t).

As we know, the size of convolution kernels and stride, which define the receptive field of the filter and step size for moving the filter across the input, have a crucial influence on CC performance, and in the subsequent research analysis, we shall also focus on their effects on QC. Through QC, we can establish the connection of information across each independent period and extract features in the time dimension, where s_M(t) endows the system with a memory capability akin to human brain memory.
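To make Eqs. (1) and (2) concrete, the following is a minimal discrete-time sketch of the QC encoding. It is an illustration under stated assumptions rather than the authors' implementation: the names `qc_kernel`, `qc_encode` and `samples_per_period` are hypothetical, the signal is assumed to be uniformly sampled, each shift jβT is rounded to a whole number of samples, and samples before the start of the record are treated as zero.

```python
import numpy as np

def qc_kernel(Q):
    """Kernel coefficients c_j = (Q + 1 - j) / (1 + 2 + ... + Q) for j = 1..Q."""
    j = np.arange(1, Q + 1)
    return (Q + 1 - j) / j.sum()

def qc_encode(s0, Q, beta, samples_per_period):
    """Sampled version of Eqs. (1)-(2): s_M(t) = sum_j c_j * s0(t - j*beta*T).

    Assumptions (not from the paper): uniform sampling, shifts rounded to
    whole samples, and zero padding before the first sample.
    """
    s0 = np.asarray(s0, dtype=float)
    c = qc_kernel(Q)
    s_m = np.zeros_like(s0)
    for j in range(1, Q + 1):
        shift = int(round(j * beta * samples_per_period))
        if shift >= s0.size:
            break  # this and all later slides fall outside the record
        # Delay the original data by `shift` samples and weight it by c_j.
        s_m[shift:] += c[j - 1] * s0[:s0.size - shift]
    return s_m

# Example: a unit impulse shows the fading kernel directly.
impulse = np.zeros(8)
impulse[0] = 1.0
encoded = qc_encode(impulse, Q=3, beta=0.5, samples_per_period=2)
```

For the impulse input, the encoded sequence reproduces the decreasing coefficients c₁, c₂, c₃ at delays of one, two and three samples, which is the fading-memory profile that QC is designed to provide.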

Reservoir computing

In general, the RC implementation has two main approaches: spatial RC and TDRC as depicted in Fig. 1(d) and 1(e), respectively. The temporal evolution of the reservoir state in the spatial RC can be described as follows [Fig. 1(d)]⁴⁵:

$X (n) = f [W_{in} u (n) + W_{res} X (n - 1)],$ (3)

where n and the N_in dimensional vector u(n) stand for the discrete time and input vector, respectively. The N_res dimensional vectors X(n) and X(n−1) denote the state vectors of the neuron nodes in the reservoir layer at the current time and previous moment, respectively. While W_in is a complex N_res×N_in matrix indicating the input-to-reservoir connections, W_res is a complex N_res×N_res matrix accounting for the weight matrix of the internal connections of the reservoir. The function f represents the activation function (AF) in the reservoir layer, e.g., sigmoid.
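As an illustration of Eq. (3), the sketch below iterates the spatial reservoir update for a random network. It rests on several assumptions that are not in the text: real-valued weight matrices are used instead of complex ones, the logistic sigmoid is chosen as f, the initial state is zero, and the internal weights are rescaled to a spectral radius of 0.9 (a common heuristic). The function name `run_spatial_reservoir` is hypothetical.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation, one example of the AF mentioned in the text."""
    return 1.0 / (1.0 + np.exp(-z))

def run_spatial_reservoir(u, w_in, w_res, f=sigmoid):
    """Iterate X(n) = f[W_in u(n) + W_res X(n-1)], starting from X = 0."""
    n_steps = u.shape[0]
    n_res = w_res.shape[0]
    states = np.zeros((n_steps, n_res))
    x = np.zeros(n_res)
    for n in range(n_steps):
        x = f(w_in @ u[n] + w_res @ x)
        states[n] = x
    return states

rng = np.random.default_rng(0)
n_in, n_res = 1, 50
w_in = rng.uniform(-1, 1, (n_res, n_in))
w_res = rng.normal(size=(n_res, n_res))
# Assumed heuristic: rescale to spectral radius 0.9 so that memory fades.
w_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(w_res)))
u = rng.uniform(-1, 1, (200, n_in))
states = run_spatial_reservoir(u, w_in, w_res)  # shape (200, 50)
```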

This high-dimensional state space can also be generated in a time-delay dynamic system as follows [Fig. 1(e)]⁴⁶:

$\dot{X} (t) = f [t, X (t), X (t - τ)],$ (4)

$X (n) = f [P (n) + R (n) X (n - \frac{τ}{θ})],$ (5)

where τ is the delay time, and $\dot{X}$ (t) represents the derivative of X(t) relative to dimensionless t. The trained output weights W_out of the TDRC rely on the transient response matrix obtained from the discrete sampling of the reservoir. So we can transform Eq. (4) into Eq. (5) by discrete sampling to describe the state of the reservoir layer. θ is the interval between virtual nodes, and R(n) reflects the internal connections of the reservoir. Herein, the virtual nodes are obtained by sampling the transient response of the reservoir layer within the single sampling period T, i.e., the duration of one signal. The number of virtual nodes m refers to the number of the hidden layer’s transient responses, obtained through sampling with equal interval θ. The number of virtual nodes m, virtual node interval θ, and single sampling period T satisfy m=T/θ. The original data P(n) is obtained from the product of the mask M(n) and the input S(n). Notably, the existence of X(n−τ/θ) in Eq. (5) provides memory capability for the TDRC.
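The masking and time-multiplexing steps can be sketched as follows for the discrete form in Eq. (5). This is a simplified illustration, not the laser model: f is assumed to be tanh, R(n) is replaced by a constant scalar `feedback`, the delay is taken as τ=T so that τ/θ=m, and the function name `tdrc_states` is hypothetical. The binary mask {−1, 1} and the values θ=1 ns and m=100 follow the experimental settings reported later in the paper.

```python
import numpy as np

def tdrc_states(s, m, feedback=0.5, f=np.tanh, seed=0):
    """Discrete TDRC following Eq. (5) with tau = T, i.e. tau/theta = m.

    Assumptions: scalar feedback gain instead of R(n), tanh nonlinearity,
    and zero state before the first sampling period.
    """
    rng = np.random.default_rng(seed)
    mask = rng.choice([-1.0, 1.0], size=m)   # binary random mask M
    p = np.outer(s, mask).ravel()            # P = M * S, one period per input
    x = np.zeros(p.size)
    for n in range(p.size):
        delayed = x[n - m] if n >= m else 0.0  # X(n - tau/theta)
        x[n] = f(p[n] + feedback * delayed)
    # One row per sampling period, one column per virtual node.
    return x.reshape(len(s), m), mask

theta_ns = 1.0            # virtual node interval
m = 100                   # virtual nodes per mode
period_ns = m * theta_ns  # T = m * theta = 100 ns

s = np.sin(0.3 * np.arange(20))
states, mask = tdrc_states(s, m)
```

With τ=T each virtual node only sees its own state from the previous period; in a physical laser the finite response time additionally couples neighboring virtual nodes, which this sketch omits.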

In Fig. 1(f), we demonstrate that the proposed QRC can also provide prominent memory ability leveraging the encoding data acquired through QC, without the need for FL. The current state of the QRC in the hidden layer X_tot(n)=[X(n), X_enc(n)] can be described by:

$X_{tot} (n) = f [P (n) + R_{2} (n) X_{enc} (n) + P_{enc} (n) + R_{1} (n) X (n)],$ (6)

where P_enc(n) and X_enc(n) are the encoding data and corresponding state of the reservoir, respectively. The square bracket [ ] means the merging of arrays. In the QRC, X(n)=f[P(n)+R₂(n)X_enc(n)], X_enc(n)=f[P_enc(n)+R₁(n)X(n)], while R₁ and R₂ represent the connections between nonlinear nodes just like the synaptic connection weights between neurons. As depicted in Eq. (6), P_enc(n) and X_enc(n) provide the required memory capability for this feedback-free configuration. The subsequent simulations and experiments will verify the superiority of the proposed QRC, that is, this extremely simplified neural network, with a flexible parameter configuration, possesses augmented memory capability.

Simulation model

Figure 2(a) illustrates the VCSEL-based schematic architecture of both the TDRC and the QRC. The final output Y_out is derived using output weights W_out, trained from the transient response matrices of XM and YM [V_x, V_y], where the crosstalk effect between these dual polarization modes simulates the subtle memory interactions in the human brain effectively.
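A readout of this kind can be sketched as a linear regression on the merged response matrices. The paper only states that W_out is obtained by simple training, so the ridge regression below, the bias column, the regularization value and the function names `train_readout` and `predict` are all assumptions made for illustration.

```python
import numpy as np

def merge_responses(v_x, v_y):
    """Merge [V_x, V_y] column-wise and append a bias column (assumed)."""
    ones = np.ones((v_x.shape[0], 1))
    return np.hstack([v_x, v_y, ones])

def train_readout(v_x, v_y, target, ridge=1e-6):
    """Fit W_out by ridge regression: (V^T V + ridge*I) W = V^T y."""
    v = merge_responses(v_x, v_y)
    a = v.T @ v + ridge * np.eye(v.shape[1])
    return np.linalg.solve(a, v.T @ target)

def predict(v_x, v_y, w_out):
    """Y_out = [V_x, V_y, 1] W_out."""
    return merge_responses(v_x, v_y) @ w_out

# Synthetic check: the target is an exact linear function of the responses.
rng = np.random.default_rng(0)
v_x = rng.normal(size=(200, 5))
v_y = rng.normal(size=(200, 5))
true_w = rng.normal(size=11)
target = merge_responses(v_x, v_y) @ true_w
w_out = train_readout(v_x, v_y, target)
y_out = predict(v_x, v_y, w_out)
```

Only W_out is trained; the reservoir itself stays fixed, which is what keeps the training cost low.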

Figure 2. Schematic diagram of the TDRC and the QRC, as well as their physical implementation based on the VCSEL. (a) Schematic architectures of the TDRC and QRC based on the VCSEL. Experimental setup of (b) the TDRC and (c) the QRC. PC, polarization controller; Att, attenuator; MZM, Mach-Zehnder Modulator; DL, delay line; EDFA, erbium-doped fiber amplifier; OBPF, optical bandpass filter; PBS, polarization beam splitter; PD, photodetector; AWG, arbitrary waveform generator; OSA, optical spectrum analyzer; OSC, oscilloscope; Ch, channel. The blue (red) line represents the optical (electrical) connection. (d) Optical spectra of the VCSEL. The black line represents the optical spectra of the free-running VCSEL. The red line represents XM and YM separated by PBS and PC.


Herein, we use the renowned spin-flip model to analyze the nonlinear dynamics of the VCSEL with optical injection, whose rate equations can be modified as⁴⁷:

$\begin{array}{rl} \frac{d E_{x}}{d t} = & κ (1 + i α) (N E_{x} - E_{x} + i n E_{y}) \\ & - (γ_{α} + i γ_{p}) E_{x} + k_{inj} ϵ_{x} (t) + F_{x}, \end{array}$ (7)

$\begin{array}{rl} \frac{d E_{y}}{d t} = & κ (1 + i α) (N E_{y} - E_{y} - i n E_{x}) \\ & + (γ_{α} + i γ_{p}) E_{y} + k_{inj} ϵ_{y} (t) + F_{y}, \end{array}$ (8)

$\frac{d N}{d t} = γ_{N} [μ - N (1 + {| E_{x} |}^{2} + {| E_{y} |}^{2}) + i n (E_{x} E_{y}^{*} - E_{y} E_{x}^{*})],$ (9)

$\frac{d n}{d t} = - γ_{s} n - γ_{N} [n ({| E_{x} |}^{2} + {| E_{y} |}^{2}) + i N (E_{y} E_{x}^{*} - E_{x} E_{y}^{*})],$ (10)

where E_x and E_y represent the slowly varying complex electric field amplitudes of XM and YM, respectively, N stands for the total population inversion between the conduction and valence bands, and n accounts for the difference between the carrier inversions with opposite spins. The last terms in Eqs. (7) and (8) are the spontaneous emission noises described by the Langevin sources, which can be written as⁴⁸:

$F_{x} = \sqrt{β_{sp} / 2} [\sqrt{(N + n)} ξ_{1} + \sqrt{(N - n)} ξ_{2}],$ (11)

$F_{y} = - i \sqrt{β_{sp} / 2} [\sqrt{(N + n)} ξ_{1} - \sqrt{(N - n)} ξ_{2}],$ (12)

where the spontaneous emission rate β_sp is set to 10⁻⁶ ns⁻¹, and ξ_1,2 represent independent Gaussian white noise sources with zero mean and unit variance. The optical injection is described by the third terms in Eqs. (7) and (8), where k_inj stands for the injection strength and ε_x,y(t) is the output of the Mach-Zehnder Modulator (MZM), described as⁴⁹,⁵⁰:

$ϵ_{x, y} (t) = \frac{| ϵ_{0} |}{2} \{ 1 + e^{i [P_{,enc} (t) + Φ_{0}]} \} e^{i 2 π Δ f_{x, y} t},$ (13)

where |ε₀| represents the amplitude of the injection field, Φ₀ is the bias voltage of MZM, and Δf_x (Δf_y) is the frequency detuning between the injection field ε_x(t) [ε_y(t)] and the XM (YM) of the VCSEL. For simplicity, we set Δf_x=Δf_y=Δf.
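A rough numerical sketch of Eqs. (7) to (10) with the injection field of Eq. (13) is given below, using the Table 1 parameter values and a classical fourth-order Runge-Kutta step of 2 ps. Several simplifications are assumed and are not part of the paper: the Langevin noise terms F_x and F_y are omitted, the drive signals are held constant within each step, the initial state is chosen arbitrarily, and k_inj=20 ns⁻¹ is taken as one representative operating point. The function names are hypothetical.

```python
import numpy as np

# Table 1 values (rates in ns^-1, time in ns). K_INJ is one representative value.
KAPPA, ALPHA = 300.0, 3.0
GAMMA_A, GAMMA_P = 0.1, 10.0
GAMMA_N, GAMMA_S = 1.0, 50.0
MU, K_INJ = 1.01, 20.0
EPS0, PHI0, DF = 1.0, 0.0, 0.0  # DF in GHz, i.e. ns^-1

def mzm_field(p, t):
    """Eq. (13): MZM output for drive value p at time t (ns)."""
    return 0.5 * EPS0 * (1 + np.exp(1j * (p + PHI0))) * np.exp(2j * np.pi * DF * t)

def sfm_rhs(y, t, p_x, p_y):
    """Right-hand sides of Eqs. (7)-(10), noise terms omitted (assumption)."""
    ex, ey, N, n = y
    inten = abs(ex) ** 2 + abs(ey) ** 2
    dex = (KAPPA * (1 + 1j * ALPHA) * (N * ex - ex + 1j * n * ey)
           - (GAMMA_A + 1j * GAMMA_P) * ex + K_INJ * mzm_field(p_x, t))
    dey = (KAPPA * (1 + 1j * ALPHA) * (N * ey - ey - 1j * n * ex)
           + (GAMMA_A + 1j * GAMMA_P) * ey + K_INJ * mzm_field(p_y, t))
    dN = GAMMA_N * (MU - N * (1 + inten)
                    + 1j * n * (ex * np.conj(ey) - ey * np.conj(ex)))
    dn = -GAMMA_S * n - GAMMA_N * (n * inten
                                   + 1j * N * (ey * np.conj(ex) - ex * np.conj(ey)))
    return np.array([dex, dey, dN, dn])

def rk4_step(y, t, dt, p_x, p_y):
    """One classical fourth-order Runge-Kutta step."""
    k1 = sfm_rhs(y, t, p_x, p_y)
    k2 = sfm_rhs(y + 0.5 * dt * k1, t + 0.5 * dt, p_x, p_y)
    k3 = sfm_rhs(y + 0.5 * dt * k2, t + 0.5 * dt, p_x, p_y)
    k4 = sfm_rhs(y + dt * k3, t + dt, p_x, p_y)
    return y + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def simulate(p_x, p_y, dt=0.002):
    """Integrate with one drive sample per 2 ps step; return |E_x|^2, |E_y|^2."""
    y = np.array([1e-3, 1e-3, MU, 0.0], dtype=complex)  # assumed initial state
    ix = np.empty(len(p_x))
    iy = np.empty(len(p_x))
    for k in range(len(p_x)):
        y = rk4_step(y, k * dt, dt, p_x[k], p_y[k])
        ix[k], iy[k] = abs(y[0]) ** 2, abs(y[1]) ** 2
    return ix, iy

# Example: 1 ns of constant (zero) drive on both polarization modes.
ix, iy = simulate(np.zeros(500), np.zeros(500))
```

In a QRC-style run, `p_x` would carry the masked original data and `p_y` the QC-encoded data, and the intensities would be sampled every θ to build the response matrices. Including the Langevin terms would require a stochastic integration scheme, which is outside the scope of this sketch.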

Note here that the rate equations of Eqs. (7) to (10) are solved using a fourth-order Runge-Kutta algorithm with a time step of 2 ps, and the main simulation parameters are tabulated in Table 1⁴¹. The nonlinear processes described in Eqs. (7) to (10) demonstrate the implementation of the nonlinear function f in Eq. (6). In this work, we select three commonly used benchmark tasks to evaluate system performance, including chaotic time-series prediction, nonlinear channel equalization, and memory capacity. The normalized mean square error (NMSE) between the target and predicted values is utilized to assess the accuracy of model predictions; the symbol error rate (SER), defined as the ratio of the error recognition number to the total testing number, represents the classification ability of the system for discrete signal in the channel equalization task; the memory capacity (MC) reflects the retention for past input signals, which is beneficial in processing time-dependent tasks (detailed in Supplementary information, Section 1).
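The two error metrics can be sketched as follows. The exact definitions used by the authors are given in their Supplementary information, so this is only an assumed, commonly used form: the NMSE is taken as the mean squared error divided by the variance of the target, and the SER is the fraction of misrecognized symbols as described above. The four-level alphabet {−3, −1, 1, 3} in the example is the one typically used for the nonlinear channel equalization benchmark and is likewise an assumption here.

```python
import numpy as np

def nmse(target, prediction):
    """Mean squared error normalized by the variance of the target (assumed form)."""
    target = np.asarray(target, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    return np.mean((target - prediction) ** 2) / np.var(target)

def nearest_symbol(values, alphabet):
    """Map each continuous readout value to the closest symbol of the alphabet."""
    values = np.asarray(values, dtype=float)
    alphabet = np.asarray(alphabet, dtype=float)
    idx = np.argmin(np.abs(values[:, None] - alphabet[None, :]), axis=1)
    return alphabet[idx]

def ser(target_symbols, predicted_symbols):
    """Ratio of misrecognized symbols to the total number of test symbols."""
    target_symbols = np.asarray(target_symbols)
    predicted_symbols = np.asarray(predicted_symbols)
    return np.mean(target_symbols != predicted_symbols)

# Example: quantize raw readout values, then count symbol errors.
alphabet = [-3, -1, 1, 3]
sent = np.array([1, -3, 3, -1])
raw_output = np.array([0.9, -2.6, 2.1, 0.2])
decided = nearest_symbol(raw_output, alphabet)
error_rate = ser(sent, decided)
```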

Table 1. Some key parameters of the VCSEL used.


Symbol	Parameter	Value
κ	Field decay rate	300 ns⁻¹
α	Linewidth enhancement factor	3
γ_α	Linear dichroism	0.1 ns⁻¹
γ_p	Linear birefringence	10 ns⁻¹
γ_N	Decay rate of N	1 ns⁻¹
γ_s	Spin-flip rate	50 ns⁻¹
μ	Normalized bias current of the VCSEL	1.01
|ε₀|	Injection field amplitude	1
Φ₀	Bias voltage of the MZM	0 V
Δf	Frequency detuning	0 GHz
θ	Virtual nodes interval	2×10⁻¹¹ s

Experimental setup

Figure 2(b) and 2(c) show the experimental setups for the TDRC and the proposed QRC, respectively. Two tunable lasers (TLs, i.e., TLD-C20 and NLC13) serve as the drive lasers providing two optical carriers, whose amplitudes are adjusted by attenuators and whose polarizations are aligned with the MZMs through polarization controllers (PCs). Here, the original signal P(n), obtained from S(n) masked by a binary random mask {−1, 1}, and the encoding signal P_enc(n) are generated by two arbitrary waveform generators (AWGs, i.e., AWG70001B and 81150A) and then loaded onto the two optical carriers via MZMs. It should be noted that the original signal P(n) and the QC-encoded signal P_enc(n) in this work are first created on a personal computer during the preprocessing phase and then sent to the AWGs to generate the input in the electrical domain. However, the entire encoding process could in principle be implemented in hardware. For instance, a Field-Programmable Gate Array (FPGA) can be utilized to function as an adder and multiplier, thereby executing the complete encoding process. By configuring specific kernel coefficients within the FPGA, which are multiplied with the slid original signals and subsequently summed up, the encoded signals are generated. The dual paths of the modulated light, which are realigned with the polarizations of the XM and YM of the off-the-shelf VCSEL separately, are combined with a 50∶50 optical coupler (OC). The main difference between the two experimental setups shown in Fig. 2(b) and 2(c) lies in the existence of the FL in Fig. 2(b), which additionally contains a delay line (DL) used to control the delay time τ, as well as an attenuator used to adjust the FL strength. The output of the VCSEL, taken as the 70% output of the OC [TDRC in Fig. 2(b)] or from Port 3 of the circulator [QRC in Fig. 2(c)], is first attenuated to within the working range of the erbium-doped fiber amplifier (EDFA) before being amplified, and then filtered by the optical bandpass filter (OBPF, WLTF-NM-S-1550-60/0.8-SM-0.9/1.0-FC/APC) to suppress the noise caused by optical injection. Herein, we use the polarization beam splitter (PBS) combined with the last PC, while observing the center wavelengths of the two optical paths through the optical spectrum analyzer (OSA, AQ6370D), to separate the XM and YM of the VCSEL into two independent optical paths [shown in Fig. 2(d)]. The transient responses of the reservoir state are sampled through a real-time oscilloscope (OSC, WaveMaster 820Zi-B) after being detected by the photodetectors (PDs). In the experiment, considering the bandwidth limitation of the AWG used, we set θ=1 ns and the number of neurons in each mode m=100, so that the sampling period is T=100 ns (T=θ×m), and the feedback delay time is τ=T. Due to the two polarization modes of the VCSEL, the total neuron number is 2m. Additionally, the sampling rates of the AWG and OSC are 1 GSa/s and 40 GSa/s, respectively.

Results

Simulation results

During the simulation phase, we initially compare the performance of the TDRC and QRC, to identify their optimal parameter space. We establish the FFRC as an additional control group to highlight the importance of encoding data. We then systematically analyze the impact of two key parameters of QC on the QRC, including the size of kernel Q and the step coefficient β (similar to the kernel size and stride in CC), and also determine the optimal parameter space. Additionally, we explore the impact of AF in post-processing along with the injection methods of encoding data, where the sigmoid function is selected as the AF in the output layer to nonlinearly transform the transient response matrix, i.e., an array composed of equidistant samples of the laser output, into an extended new matrix, which is used to enrich the neural representation by increasing the number of virtual nodes.
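The post-processing step with the AF can be sketched as below: the sigmoid is applied to the transient response matrices and the results are appended as extra columns. This is a minimal illustration under assumptions, since the paper does not specify scaling or normalization before the sigmoid; the function name `extend_with_af` is hypothetical, and the extended matrices are labeled V_fx and V_fy as in the later figures.

```python
import numpy as np

def sigmoid(v):
    """Logistic sigmoid applied elementwise."""
    return 1.0 / (1.0 + np.exp(-v))

def extend_with_af(v_x, v_y):
    """Build [V_x, V_y, V_fx, V_fy] with V_f = sigmoid(V).

    Assumption: the AF acts on the raw sampled responses without rescaling.
    """
    v_fx, v_fy = sigmoid(v_x), sigmoid(v_y)
    return np.hstack([v_x, v_y, v_fx, v_fy])

# Example with m = 100 virtual nodes per polarization mode.
rng = np.random.default_rng(0)
v_x = rng.normal(size=(50, 100))
v_y = rng.normal(size=(50, 100))
v_ext = extend_with_af(v_x, v_y)  # 50 samples, 400 features
```

The number of readout features doubles from 2m to 4m without any additional physical nodes, since the extra columns are computed in post-processing.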

Performance comparison of different RCs

On account of the existence of the FL, which enriches the dynamics of the VCSEL⁵¹, the bifurcation diagram with k_inj as the control parameter is given in Fig. 3(a). When the feedback strength is k_d=20 ns⁻¹, it can be clearly seen that the VCSEL operates in the chaotic state within the range of k_inj ∈ [2 ns⁻¹, 16 ns⁻¹], and only when k_inj > 16 ns⁻¹ is the VCSEL stabilized again, i.e., staying in a stable region, in which good performance of RCs can be guaranteed²⁸. After eliminating the FL, that is, by setting k_d=0 ns⁻¹, the stable region is remarkably broadened, which may enable the proposed QRC to exhibit the desired performance in a much wider parameter space. Figure 3(b−d) reveals the significant differences between these three kinds of RCs on the benchmark tasks, while the detailed results of the chaotic time-series prediction are depicted more intuitively in Fig. 3(e). Compared with the TDRC, the FFRC and QRC are much less sensitive to changes of k_inj due to the absence of the FL. As expected, the TDRC performs satisfactorily under moderate feedback strength due to the rich dynamics brought by the FL, while excessive feedback strength can drive the VCSEL from the stable state into the chaotic region and severely degrade its performance. Note here that a lower NMSE or SER means better performance of the RC, while a higher MC means that the RC possesses stronger memory ability. Remarkably, our proposed QRC with a streamlined structure can still exhibit competitive performance, that is, comparable to or better than that of the conventional TDRC, especially in terms of memory ability. This can be attributed to the combination of the proposed QC method and the intrinsic crosstalk effect between the two orthogonal polarization modes of the VCSEL.

Figure 3. (a) Bifurcation diagram with k_inj as the control parameter of the VCSEL. In all panels, the extrema (maxima and minima) of the intensity time series are shown as dots. (b−d) showcase the performance difference of the FFRC, TDRC and QRC, based on the benchmark tasks mentioned before. With Q=6 and β=1.6 in (b); Q=9 and β=0.9 in (c); Q=39 and β=0.7 in (d). The injection strength of the TDRC is set at 20 ns⁻¹. The detailed results of the chaotic time-series prediction are further demonstrated in (e), with k_inj=20 ns⁻¹ for FFRC; k_inj=20 ns⁻¹ and k_d=18 ns⁻¹ for TDRC; Q=6, β=1.6 and k_inj=20 ns⁻¹ for QRC. The target signal (red), prediction result (black), and error between them (blue) are shown.


Furthermore, we analyze the optimal parameter space of the systems. The two-dimensional maps of the NMSE, SER and MC of these RCs in the parameter space of k_inj and Δf are provided in Fig. 4, where the regions of NMSE<0.01, SER<0.01 and MC>10 are marked with white lines. We can hardly find such an optimal parameter space in the FFRC [Fig. 4(a−c)], and find only a limited one for the TDRC [Fig. 4(d−f)], whose optimal parameter space is closely related to the injection locking region, where the data loaded onto the drive laser is well received and processed by the response laser due to the injection locking effect. As expected, the QRC [Fig. 4(g−i)] enables a more flexible setting of the parameters, which significantly expands the optimal parameter space, thus revealing its exceptional advantages for hardware implementation.

Figure 4. Two-dimensional maps of (a, d, g) NMSE, (b, e, h) SER, and (c, f, i) linear MC in the parameter space of k_inj and Δf. (a–c), (d–f) and (g–i) showcase the results of the FFRC, the TDRC and the QRC, respectively. A darker color indicates a smaller value, while the opposite means a larger value. With Q=6 and β=1.6 in (g); Q=9 and β=0.9 in (h); Q=39 and β=0.7 in (i). The feedback strength of the TDRC is set at 18 ns⁻¹, 12 ns⁻¹, 21 ns⁻¹ in (d, e, f), respectively. These results stem from the joint training of XM and YM.


These comparisons further confirm the reliability of the proposed scheme. Incorporating QC-encoded data endows the QRC with augmented memory, making it suitable for a wider range of complex tasks while maintaining a relatively simple network model.

Analysis of key parameters in QC

The impact of the kernel size Q and the step coefficient β is subsequently analyzed. Interestingly, the important role played by the crosstalk between the dual polarization modes can be seen in Fig. 5. Here, the blue dashed lines depict the performance when the original data is injected into XM alone, i.e., without YM and encoding data, serving as a baseline. When the original and encoding data are injected into XM and YM in parallel, the performance of the XM changes with the variation originating from the YM injection. Surprisingly, with the crosstalk effect, YM can provide the desired memory ability for XM [Fig. 5(c) and 5(f)]. In addition, the results of these benchmark tasks trained only on the optical injection terms with encoding signals are represented by the red dashed lines, intuitively reflecting the importance of the reservoir layer, i.e., the VCSEL in this work. In Fig. 5(a−c), the performance of the QRC on the three selected tasks exhibits similar trends, i.e., as Q is increased, it first improves and then gradually deteriorates after reaching its best. We deduce that a larger Q means a wider receptive field, which may also lead to the loss of details. Figure 5(d−f) reflects the impact of the step coefficient β. When β=0, the QRC is equivalent to a conventional FFRC, which has almost no memory ability and is much worse than the TDRC. Notably, when β is a multiple of 0.5, and especially when it is an integer, there is a noticeable drop in the QRC performance. Sliding by integer multiples of a single sampling period dilutes sample correlation, rendering the encoding data a linear superposition of multiple independent periods, i.e., just severely distorted original data. The black lines in Fig. 5 reveal that the performance can be further improved when XM and YM are combined with AF. Importantly, the encoding data loaded into YM performs poorly in the nonlinear channel equalization task, yet excels when combined with XM.

Figure 5. Analysis of the size of kernel Q and the step coefficient β. The (a) NMSE, (b) SER and (c) MC as a function of the kernel size Q, with k_inj=20 ns⁻¹ and β=0.8. The (d) NMSE, (e) SER and (f) MC as a function of step coefficient β, with k_inj=20 ns⁻¹ and Q=10. The original data and encoding data are injected in the XM and YM, respectively. Here, we use AF, which provides an additional nonlinear transformation for the output, to obtain the extended matrices [V_fx, V_fy]. The blue dashed lines illustrate the performance achieved by injecting only the original data into XM, and the red dashed lines represent the trained results of the optical injection terms with encoding signals.


Similarly, the two-dimensional maps of the NMSE, SER and MC of the QRC in the parameter space of Q and β are depicted in Fig. 6. The marked regions in Fig. 6 show the optimal parameter space, where the claw-like structures agree well with Fig. 5(d−f), i.e., a non-negligible decrease in performance occurs when β is a multiple of 0.5. Figure 6(a−c) and Fig. 6(d−f) showcase the results without and with AF, respectively. It can be found that when the transient response matrices [V_x, V_y] are combined with the extended matrices [V_fx, V_fy] obtained from AF, the optimal parameter space of the QRC expands. However, AF brings only marginal improvement in MC, as MC mainly relies on the input or the structure of the system itself, like the kernel size Q and the step coefficient β of QC, or the delay time τ of the FL, rather than the nonlinear mapping in the output.

Figure 6. Two-dimensional maps of (a, d) NMSE, (b, e) SER, and (c, f) linear MC in the parameter space of Q and β. (a–c) and (d–f) showcase the results without and with AF, respectively. These results stem from the joint training of XM and YM. With k_inj=20 ns⁻¹ and Δf=0 GHz.


Additionally, the distinct data injection methods of QRC are considered, where our findings suggest that parallel injection of the original and encoding data holds significant potential for time-related tasks (see Supplementary information, Section 2).

Experiment results

In the experiment, we fix the temperature of the VCSEL at 28.82 °C and the bias current at 2.15 mA, which is slightly below the threshold current of 2.16 mA⁵². Under this condition, the central wavelengths of XM and YM are around 1558.372 nm and 1558.240 nm, respectively, and the output of the free-running VCSEL is approximately 0.4856 μW. The central wavelengths of the two TLs are set at 1558.372 nm and 1558.240 nm, implementing injection without frequency detuning. Here, the XM and YM possess similar output strengths, so both polarization modes can simultaneously handle various loaded data for parallel processing. The injection powers of the VCSEL and the feedback power of the FL are adjusted by the attenuators. In the experiment, the injection powers of the dual modes are set to be almost the same. The experimental structure of the QRC is greatly simplified due to the absence of the FL, making it easier to integrate and more flexible for parameter selection. For the FFRC and QRC, the impact of the injection power on the performance is explored. For the TDRC, the injection power is fixed at 1292.4 μW (almost equal to 1297.2 μW, the maximum of the considered injection power of the FFRC and QRC), and the effect of the feedback strength is investigated. It is worth noting that the noise introduced by the experimental equipment and optical components poses challenges in achieving the high performance obtained in the simulation phase.

Figure 7 illustrates the experimental results comparing the FFRC, TDRC and QRC on the previously introduced tasks, as was done in the simulations. Figure 7(a−c) display the expected trends, similar to those shown in Fig. 3(b−d). The FL improves the performance at moderate feedback strength, but excessive feedback strength can disrupt the stable state of the system^28,51, thereby reducing the performance. Specifically, the optimal NMSE, SER and MC of the TDRC are 0.0169, 0.0267 and 1.9263, respectively. For the FFRC and QRC, the performance improves almost monotonically as the injection power increases. This is mainly attributed to the better quality of the transient response matrices resulting from the improved signal-to-noise ratio. Within the considered injection power range, the optimal NMSE, SER and MC are 0.0411, 0.0572 and 0.4895 for the FFRC, and 0.0157, 0.0027 and 3.4605 for the QRC. When the system lacks memory ability, as in the FFRC, the performance decreases significantly, but it can be recovered by the proposed QC. Interestingly, the QRC, whose memory ability is provided by the encoding data, exhibits excellent performance in several kinds of benchmark tasks, especially in discrete data processing and memory ability. Meanwhile, compared with the TDRC, the QRC also demonstrates superiority in energy consumption due to lower injection power requirements and reduced power loss. The flexible parameter configurations in the QRC reduce the demand for injection power and hence the energy consumption, whereas the FL in the TDRC often requires higher injection power to keep the system working in a stable region²⁸. For instance, in the channel equalization task, the QRC and TDRC exhibit comparable performance in experimental (simulated) conditions when the injection power (strength) is approximately 600 μW and 1300 μW (9 ns⁻¹ and 20 ns⁻¹), respectively, as illustrated in Fig. 7(b) [Fig. 3(c)]. Additionally, the QRC avoids the extra energy costs from beam splitting or coupling in the FL, thus improving the overall energy efficiency.
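The three figures of merit compared above can be sketched in a few lines. This is a minimal illustration that assumes the commonly used definitions (NMSE as the mean squared error normalized by the target variance, SER as the fraction of wrongly decided symbols, and MC as the sum of the memory function mc(k), i.e. the squared correlation between the input delayed by k steps and the readout trained to recall it). The function names are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def nmse(target, prediction):
    """Normalized mean squared error (assumed definition: MSE / target variance)."""
    target = np.asarray(target, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    return np.mean((target - prediction) ** 2) / np.var(target)

def ser(symbols, decided):
    """Symbol error rate: fraction of decided symbols that differ from the sent ones."""
    return np.mean(np.asarray(symbols) != np.asarray(decided))

def memory_function(u, y_k, k):
    """mc(k): squared correlation between u(n - k) and the readout y_k(n)."""
    u = np.asarray(u, dtype=float)
    y_k = np.asarray(y_k, dtype=float)
    delayed = u[: len(u) - k]   # u(n - k) for n = k, ..., N - 1
    recalled = y_k[k:]          # y_k(n)   for n = k, ..., N - 1
    return np.corrcoef(delayed, recalled)[0, 1] ** 2

def memory_capacity(u, readouts):
    """MC: sum of mc(k) over k = 1..K, where readouts[k - 1] is trained to recall u(n - k)."""
    return sum(memory_function(u, y_k, k) for k, y_k in enumerate(readouts, start=1))
```

With these definitions, mc(k) lies between 0 and 1, so an MC of about 3.46 corresponds to reliably recalling roughly three to four past inputs.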

Figure 7. The experimental performance comparison between the FFRC, the TDRC and the QRC on the (a) time-series prediction, (b) nonlinear channel equalization and (c) memory ability. With Q=6 and β=1.6 in (a); Q=9 and β=0.9 in (b); Q=39 and β=0.7 in (c). The performance of the QRC on nonlinear channel equalization is detailed in (d), while (e) depicts the three RCs’ memory details when k_inj is 1297.2 μW in the FFRC and the QRC, and 1292.4 μW in the TDRC.

Figure 7(d) further shows the details of the QRC on nonlinear channel equalization. The performance obtained from YM, which loads the encoding data, is far inferior to that from XM, which loads the original data. However, merging the transient response matrices obtained from XM and YM yields a significant performance improvement, which is highly consistent with the simulation results in Fig. 5(b) and 5(e). Figure 7(e) shows the memory details of the three networks mentioned above. The QRC exhibits superior memory capacity and maintains an mc(4) value of 0.7787, indicating that it retains the 4th past input signal. In contrast, the mc(4) of the FFRC (TDRC) drops sharply from 0.4194 (0.9649) to 0.0027 (0.0024). Thus, the enhanced memory capability of the QRC also renders it more adept at handling complex tasks.
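The effect of merging the two transient response matrices can be illustrated with a toy readout experiment. The sketch below is not a model of the VCSEL or of the QC encoding: the hypothetical `toy_states` function replaces each polarization mode with random static tanh nodes, the "encoding data" is simply the previous input, and the target is an arbitrary task that needs both the current and the previous input. Only the column-wise merging of the two state matrices and the ridge-regression readout mirror the procedure described in the text.

```python
import numpy as np

def toy_states(signal, n_nodes, rng):
    """Hypothetical stand-in for one polarization mode: random static tanh nodes."""
    w = rng.uniform(-2.0, 2.0, n_nodes)
    b = rng.uniform(-1.0, 1.0, n_nodes)
    return np.tanh(np.outer(signal, w) + b)

def with_bias(states):
    """Append a constant column so the linear readout can learn an offset."""
    return np.hstack([states, np.ones((states.shape[0], 1))])

def ridge_fit(states, target, reg=1e-6):
    """Train the linear readout weights by ridge regression."""
    s = with_bias(states)
    return np.linalg.solve(s.T @ s + reg * np.eye(s.shape[1]), s.T @ target)

def nmse(target, prediction):
    return np.mean((target - prediction) ** 2) / np.var(target)

def evaluate(states, target, split):
    """Fit on the first `split` samples and report the NMSE on the remainder."""
    weights = ridge_fit(states[:split], target[:split])
    return nmse(target[split:], with_bias(states[split:]) @ weights)

rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, 2000)           # original data (assumed loaded on XM)
past = np.concatenate([[0.0], u[:-1]])     # toy "encoding data" u(n-1) (assumed on YM)
target = u + 0.5 * past                    # toy task that needs both signals

S_x = toy_states(u, 20, rng)               # response matrix of the XM stand-in
S_y = toy_states(past, 20, rng)            # response matrix of the YM stand-in
S_merged = np.hstack([S_x, S_y])           # merged response matrix

results = {name: evaluate(s, target, split=1500)
           for name, s in [("XM", S_x), ("YM", S_y), ("merged", S_merged)]}
print(results)
```

In this toy setting the merged readout reaches a lower error than either mode alone, because each mode only carries part of the information the target depends on. This mirrors the qualitative trend reported in Fig. 7(d) without reproducing its values.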

Finally, the influence of the number of virtual nodes is also examined through oversampling [see Supplementary information, Section 3], which leads to higher accuracy and faster computational speed⁵³. The optimal average NMSE (SER, MC) is 0.0054 (0.001, 3.8818), achieved with a total neuron number of 1600 (800, 800). Additionally, a comparison between this work and existing competitive experimental results based on semiconductor lasers is summarized in Supplementary information, Section 4, Table S1, where our scheme offers high-speed and high-performance parallel computing through a simple yet flexible structure with an off-the-shelf hardware configuration.

Conclusions

In this study, we have proposed and validated, both theoretically and experimentally, a novel QRC with enhanced memory capabilities, which removes the dependence on the FL and demonstrates advanced performance in a streamlined structure. As a proof-of-concept prototype, we have utilized an easily integrated low-power VCSEL for parallel processing. The dual polarization modes of the VCSEL enable the original and encoding data to be inserted and processed in parallel, reducing processing latency by half. The encoding data from the QC endows the QRC with the desired memory capability, and the crosstalk between XM and YM links the original and encoding data, mimicking the input-memory interaction in a human brain. Moreover, both simulation and experimental results consistently demonstrate the feasibility and superiority of this QRC scheme compared to the well-studied TDRC. Eliminating the reliance on the FL via pre-processing encoding offers a viable solution for simplifying experimental setups, easing hardware implementation challenges, and allowing for more flexible parameter configurations, which paves the way toward highly integrated implementations.

Future work will focus on extending the proposed QC to the recently widely studied deep RCs^54,55, noted for their excellent ability to handle complicated tasks. Despite the challenge of hardware implementation due to their extremely complex structure, the QC may offer a promising approach to enhance the extensibility of deep physical ANNs.

Category: Research Articles

Received: Jun. 5, 2024

Accepted: Aug. 19, 2024

Published Online: Mar. 24, 2025

The Author Email: Nianqiang Li (NQLi), Xiaofeng Li (XFLi)

DOI:10.29026/oea.2025.240135