Opto-Electronic Advances, Volume. 8, Issue 1, 240135-1(2025)

Streamlined photonic reservoir computer with augmented memory capabilities

Changdi Zhou1,2, Yu Huang1,2, Yigong Yang1,2, Deyu Cai1,2, Pei Zhou1,2, Kuenyao Lau1,2, Nianqiang Li1,2,*, and Xiaofeng Li1,2,**
Author Affiliations
  • 1School of Optoelectronic Science and Engineering & Collaborative Innovation Center of Suzhou Nano Science and Technology, Soochow University, Suzhou 215006, China
  • 2Key Lab of Advanced Optical Manufacturing Technologies of Jiangsu Province & Key Lab of Modern Optical Technologies of Education Ministry of China, Soochow University, Suzhou 215006, China

    Photonic platforms are gradually emerging as a promising option to meet the ever-growing demand for artificial intelligence, among which photonic time-delay reservoir computing (TDRC) is widely anticipated. While such a computing paradigm needs only a single photonic device as the nonlinear node for data processing, its performance relies heavily on the fading memory provided by the delay feedback loop (FL), which restricts the extensibility of physical implementations, especially for highly integrated chips. Here, we present a simplified photonic scheme for more flexible parameter configurations that leverages the designed quasi-convolution coding (QC) and completely removes the dependence on the FL. Unlike delay-based TDRC, encoded data in QC-based RC (QRC) enables temporal feature extraction, facilitating augmented memory capabilities. Thus, our proposed QRC can handle time-related tasks or sequential data without an FL. Furthermore, this hardware can be implemented with a low-power, easily integrable vertical-cavity surface-emitting laser for high-performance parallel processing. We validate the concept through simulation and experimental comparison of QRC and TDRC, wherein the simpler-structured QRC outperforms TDRC across various benchmark tasks. Our results may point to a promising solution for the hardware implementation of deep neural networks.

    Introduction

    The rapid advancement of artificial neural networks (ANNs) has significantly contributed to the progress of artificial intelligence (AI), demonstrating unparalleled competitiveness in fields such as image processing, game playing, and protein structure prediction1−4. Although conventional software-based ANNs rooted in the von Neumann computing architecture have successfully mimicked human cognitive abilities in complex tasks, they confront several formidable challenges, especially in energy consumption and processing speed5, as dictated by Moore's Law. Therefore, ANNs based on physical platforms, including electronics6−8, spintronics9−11, photonic hardware12−17, and others18−21, offer alternative solutions to meet the extensive computing demands of AI. Photonics-based technology, in particular, is a capable candidate due to its extremely fast processing speed and ultra-low power consumption16,22−24. Significantly, reservoir computing (RC), a simplified machine learning scheme derived from recurrent neural networks25−27, is known to be amenable to analog implementation because only the RC output layer requires simple algorithm-based training. This desirable feature significantly reduces the complexity and cost of the training procedure, making RC a hardware-friendly and high-speed computing alternative. In particular, many photonic RC schemes can be further simplified by the so-called time-delay RC (TDRC), which contains only a single physical node with a time-delay feedback loop (FL)28 and employs time-multiplexing to establish virtual nodes, in contrast with the large number of physical nodes needed in spatial RC. The fading memory provided by the FL gives TDRC a unique advantage in handling time-dependent tasks.

    However, the FL significantly influences both the network flexibility and the capabilities of TDRC, posing challenges for hardware implementation. Introducing the FL provokes a multifaceted trade-off: it typically induces complex nonlinear dynamics within the system, necessitating precise parameter control, e.g., of external optical injection or frequency detuning, to stabilize the output of semiconductor laser-based RCs. Furthermore, the length of the FL has a significant impact on the performance of TDRC. Extending the delay line to accommodate more neurons results in a larger footprint and reduced information-processing rates29, whereas a shorter delay line compromises the computing accuracy due to the limited network size. These issues have prompted the exploration of novel structures, such as next-generation RC (NG-RC)30,31. The NG-RC obviates the need for a reservoir, directly constructing output feature vectors through various nonlinear combinations of both present and historical data. This approach has demonstrated impressive performance across multiple tasks while reducing training and warm-up times. However, it compromises physical openness, thus presenting challenges for neuromorphic implementations. Moreover, feedback-free RC (FFRC) or extreme learning machines have also been proposed and demonstrated32,33. On account of the absence of the FL, which leads to the loss of fading memory, conventional FFRC may show equivalent performance in discrete data processing, e.g., data recovery or image classification34−36, but performs poorly in time-related tasks or tasks with high memory requirements. To compensate for the loss of memory ability, Takano et al. preliminarily introduced a weighted sum of past data to the original one and achieved satisfactory prediction performance on the Santa-Fe time series37, but without accounting for the connection between independent sampling-period data. Zeng et al. theoretically proposed three different pre-processing and post-processing methods based on Takano’s work38. Besides, the required FL or memory capability can also be provided through the inherent characteristics of some specific physical devices. Phang reported a novel RC based on an optical-fiber and loop-free kernel configuration utilizing the intrinsic memory property of stimulated Brillouin scattering39. In Zhang et al.’s work, the pulse broadening effect caused by dispersion in optical fiber provides a short-term fading memory40. Such approaches, however, impose specific requirements on hardware components. In light of the pressing demands for hardware compatibility and high integration, it is crucial to explore novel neural networks with flexible yet simple structures for high-speed and high-performance operations.

    In this work, we propose a streamlined photonic RC with augmented memory capabilities based on the designed quasi-convolution coding (QC). By analyzing and optimally selecting the key parameters of QC, such a QC-based RC (QRC) can acquire fading memory from the encoding data, analogous to the function of the FL in TDRC, thereby significantly simplifying the processing system by obviating the FL. For such a photonic neural network, we employ a vertical-cavity surface-emitting laser (VCSEL) featuring a low threshold current and dual polarization modes to facilitate low-power parallel processing. Its compact size also offers significant advantages in terms of integration. Notably, the dual modes of the VCSEL enable parallel data insertion and processing, where different data can be loaded simultaneously, halving the processing latency41. Herein, the crosstalk between the X polarization mode (XM) and the Y polarization mode (YM) establishes a connection between the original and encoding data, imitating the interaction between the input and the memory in human brains42, which greatly benefits time-related tasks. The performance of QRC is confirmed theoretically and experimentally through detailed comparisons with the conventional TDRC across several benchmark tasks. Additionally, we compare this work with existing experimental results for TDRC based on semiconductor lasers, further demonstrating the feasibility of our proposed scheme. Therefore, our approach might provide a viable alternative for the hardware implementation of easily integrated high-performance RC systems.

    Methodology

    Algorithms

    Quasi-convolution coding

    Traditional convolutional coding (CC) involves multiplying the source pixel with the convolution kernel and summing up to obtain the target pixel. The convolution kernel then slides in a predetermined direction and repeats the above operation to generate all outputs. This encoding method has proven highly effective in extracting data features, especially in machine vision43.

    Drawing on the operational criterion of CC and the remarkable memory framework of the human brain, where the neuronal responses can be maintained by a pure feedforward mechanism44, we propose a QC algorithm, detailed in Fig. 1(a−c). Figure 1(a) illustrates the weighted sum process in QC. Here, the input is first transformed into a one-dimensional (1-D) vector via matrix transformation to facilitate the encoding process, while the sliding step is designed based on the step coefficient β and the sampling period T. Figure 1(b) illustrates the input or sliding direction of the original data (blue line) and the convolution kernel (red line). The convolution kernel coefficient of the j-th sliding, cj (j=1, 2, …, Q), gradually decreases after each sliding to mimic the fading memory of the human brain. It is determined by the kernel size Q and designed as $c_j=(Q+1-j)/\sum_{i=1}^{Q} i$, where the denominator is the sum of the integers from 1 to Q. Figure 1(c) shows the stretched-out view of QC. We simplify the encoding process by sliding through the entire sampling period, which is equivalent to encoding individual points separately, and thus can be expressed as:

    Figure 1. Concept of QC and RC structures. (a−c) Design of the proposed QC, which has a similar operational criterion to convolutional coding in data processing, but meanwhile can extract features in the temporal dimension and provide memory capability. (d−f) Schematic diagram of different RC structures. The comparison of the nodes’ states between (d) Spatial RC, (e) TDRC and (f) proposed QRC verifies that the encoding data will provide the memory capability through QC. NL, nonlinear nodes.

    $$s_j(t) = s_0[t - j(\beta T)], \tag{1}$$

    $$s_M(t) = \sum_{j=1}^{Q} c_j s_j(t), \tag{2}$$

    where t represents the continuous time and s0(t) is the original data, which can be a time-continuous input stream or a time-discrete input; we take the former as an example here. The stride of the kernel is jointly determined by the step coefficient β and the single sampling period T, i.e., it is their product βT. The term s0[t−j(βT)] in Eq. (1) indicates that the original data s0(t) has slid j times along the time dimension with βT as the step size. Each slid copy sj(t) is first multiplied by the corresponding convolution kernel coefficient cj, and the products are then summed over j from 1 to the kernel size Q, where j is a positive integer, to yield the final encoding data sM(t).
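
    As a minimal sketch of Eqs. (1) and (2) for sampled data, the following Python snippet builds the kernel coefficients cj and forms the weighted sum of delayed copies. The function names, the rounding of the stride βT to a whole number of samples, and the zero padding before the start of the record are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def qc_kernel(Q):
    """Kernel coefficients c_j = (Q + 1 - j) / (1 + 2 + ... + Q), j = 1..Q."""
    j = np.arange(1, Q + 1)
    return (Q + 1 - j) / (Q * (Q + 1) / 2)

def qc_encode(s0, Q, beta, samples_per_period):
    """Weighted sum of Q delayed copies of s0, following Eqs. (1)-(2).

    Assumptions: the stride beta*T is rounded to a whole number of samples,
    and samples before the start of the record are treated as zero.
    """
    s0 = np.asarray(s0, dtype=float)
    c = qc_kernel(Q)
    stride = int(round(beta * samples_per_period))
    s_m = np.zeros_like(s0)
    for j in range(1, Q + 1):
        shift = j * stride
        if shift >= len(s0):
            break
        # s_j(t) = s0(t - j*beta*T), weighted by c_j
        s_m[shift:] += c[j - 1] * s0[:len(s0) - shift]
    return s_m

c = qc_kernel(4)                      # weights 4/10, 3/10, 2/10, 1/10
s0 = np.ones(200)                     # constant test input
s_m = qc_encode(s0, Q=4, beta=0.5, samples_per_period=10)
```

    Because the coefficients sum to one, a constant input is reproduced once all Q delayed copies overlap, while the decreasing weights give older samples a progressively smaller contribution.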

    As we know, the size of convolution kernels and stride, which define the receptive field of the filter and step size for moving the filter across the input, have a crucial influence on CC performance, and in the subsequent research analysis, we shall also focus on their effects on QC. Through QC, we can establish the connection of information across each independent period and extract features in the time dimension, where sM(t) endows the system with a memory capability akin to human brain memory.

    Reservoir computing

    In general, the RC implementation has two main approaches: spatial RC and TDRC as depicted in Fig. 1(d) and 1(e), respectively. The temporal evolution of the reservoir state in the spatial RC can be described as follows [Fig. 1(d)]45:

    $$X(n) = f[W_{in}u(n) + W_{res}X(n-1)], \tag{3}$$

    where n and the Nin dimensional vector u(n) stand for the discrete time and input vector, respectively. The Nres dimensional vectors X(n) and X(n−1) denote the state vectors of the neuron nodes in the reservoir layer at the current time and previous moment, respectively. While Win is a complex Nres×Nin matrix indicating the input-to-reservoir connections, Wres is a complex Nres×Nres matrix accounting for the weight matrix of the internal connections of the reservoir. The function f represents the activation function (AF) in the reservoir layer, e.g., sigmoid.
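
    The update of Eq. (3) can be sketched in a few lines of Python. The weights here are real-valued and random, the spectral-radius rescaling is a common heuristic rather than something stated in the text, and all sizes are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
N_in, N_res, steps = 1, 50, 200

# Hypothetical real-valued random weights (the text allows complex matrices).
W_in = rng.uniform(-1.0, 1.0, size=(N_res, N_in))
W_res = rng.normal(size=(N_res, N_res))
# Rescale so the largest eigenvalue magnitude is 0.9 (a common heuristic).
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

u = rng.uniform(0.0, 0.5, size=(steps, N_in))   # input sequence u(n)
X = np.zeros((steps, N_res))                    # reservoir states X(n)
x_prev = np.zeros(N_res)
for n in range(steps):
    # X(n) = f[W_in u(n) + W_res X(n-1)]
    x_prev = sigmoid(W_in @ u[n] + W_res @ x_prev)
    X[n] = x_prev
```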

    This high-dimensional state space can also be generated in a time-delay dynamic system as follows [Fig. 1(e)]46:

    $$\dot{X}(t) = f[t, X(t), X(t-\tau)], \tag{4}$$

    $$X(n) = f[P(n) + R(n)X(n-\tau/\theta)], \tag{5}$$

    where τ is the delay time, and Ẋ(t) represents the derivative of X(t) with respect to the dimensionless time t. The trained output weights Wout of the TDRC rely on the transient response matrix obtained from the discrete sampling of the reservoir, so we can transform Eq. (4) into Eq. (5) by discrete sampling to describe the state of the reservoir layer. θ is the interval between virtual nodes, and R(n) reflects the internal connections of the reservoir. Herein, the virtual nodes are obtained by sampling the transient response of the reservoir layer within the single sampling period T, i.e., the duration of one signal. The number of virtual nodes m refers to the number of the hidden layer’s transient responses obtained through sampling with the equal interval θ, so that m, θ and T satisfy m=T/θ. The original data P(n) is obtained from the product of the mask M(n) and the input S(n). Notably, the existence of X(n−τ/θ) in Eq. (5) provides the memory capability of the TDRC.
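
    To make the time-multiplexing concrete, the sketch below masks an input sequence, iterates a discrete map of the form of Eq. (5), and reshapes the result into one row of m virtual-node states per input. The tanh nonlinearity, the scalar feedback weight R and the input range are hypothetical stand-ins for the laser dynamics, chosen only to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(1)
theta = 1e-9                       # virtual-node interval (s)
T = 100e-9                         # single sampling period (s)
m = int(round(T / theta))          # number of virtual nodes, m = T / theta
delay_nodes = m                    # tau / theta, taking tau = T

S = rng.uniform(0.0, 0.5, size=300)          # one input sample per period
mask = rng.choice([-1.0, 1.0], size=m)       # binary random mask M(n)
P = (S[:, None] * mask[None, :]).ravel()     # P(n) = M(n) * S(n), held over T

R = 0.5                            # hypothetical scalar feedback weight
X = np.zeros(P.size)
for n in range(P.size):
    # X(n) = f[P(n) + R X(n - tau/theta)], with f = tanh as a placeholder
    delayed = X[n - delay_nodes] if n >= delay_nodes else 0.0
    X[n] = np.tanh(P[n] + R * delayed)

states = X.reshape(S.size, m)      # rows: input periods, columns: virtual nodes
```

    In this simplified map with τ=T, each virtual node only sees its own value from the previous period; in a physical node the finite response time additionally couples neighbouring virtual nodes.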

    In Fig. 1(f), we demonstrate that the proposed QRC can also provide prominent memory ability leveraging the encoding data acquired through QC, without the need for FL. The current state of the QRC in the hidden layer Xtot(n)=[X(n), Xenc(n)] can be described by:

    $$X_{tot}(n) = f[P(n) + R_2(n)X_{enc}(n) + P_{enc}(n) + R_1(n)X(n)], \tag{6}$$

    where Penc(n) and Xenc(n) are the encoding data and corresponding state of the reservoir, respectively. The square bracket [ ] means the merging of arrays. In the QRC, X(n)=f[P(n)+R2(n)Xenc(n)], Xenc(n)=f[Penc(n)+R1(n)X(n)], while R1 and R2 represent the connections between nonlinear nodes just like the synaptic connection weights between neurons. As depicted in Eq. (6), Penc(n) and Xenc(n) provide the required memory capability for this feedback-free configuration. The subsequent simulations and experiments will verify the superiority of the proposed QRC, that is, this extremely simplified neural network, with a flexible parameter configuration, possesses augmented memory capability.
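
    The cross-coupled structure of X(n) and Xenc(n) can be illustrated with a toy discrete model. This is only a sketch under several assumptions: the encoding is applied to the masked stream P, tanh replaces the laser nonlinearity, R1 and R2 are hypothetical scalars, and each branch is driven by the other branch's previous sample so that the two coupled relations can be evaluated explicitly.

```python
import numpy as np

rng = np.random.default_rng(2)
m, periods = 20, 100
Q, beta = 6, 1.6                   # example QC parameters quoted for Fig. 3(b)

S = rng.uniform(0.0, 0.5, size=periods)
mask = rng.choice([-1.0, 1.0], size=m)
P = (S[:, None] * mask[None, :]).ravel()

# Encoded stream P_enc: weighted sum of Q copies of P delayed by j*beta*T.
c = (Q + 1 - np.arange(1, Q + 1)) / (Q * (Q + 1) / 2)
stride = int(round(beta * m))
P_enc = np.zeros_like(P)
for j in range(1, Q + 1):
    P_enc[j * stride:] += c[j - 1] * P[:P.size - j * stride]

R1, R2 = 0.3, 0.3                  # hypothetical cross-coupling weights
X = np.zeros(P.size)
X_enc = np.zeros(P.size)
for n in range(1, P.size):
    # Assumption: each branch sees the other branch's previous sample.
    X[n] = np.tanh(P[n] + R2 * X_enc[n - 1])
    X_enc[n] = np.tanh(P_enc[n] + R1 * X[n - 1])

# X_tot = [X, X_enc]: merge both branches into one state matrix.
X_tot = np.hstack([X.reshape(periods, m), X_enc.reshape(periods, m)])
```

    No delayed copy of the state is fed back here; any dependence on earlier inputs enters only through P_enc.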

    Simulation model

    Figure 2(a) illustrates the VCSEL-based schematic architecture of both the TDRC and the QRC. The final output Yout is derived using output weights Wout, trained from the transient response matrices of XM and YM [Vx, Vy], where the crosstalk effect between these dual polarization modes simulates the subtle memory interactions in the human brain effectively.

    Figure 2. Schematic diagram of the TDRC and the QRC, as well as their physical implementation based on the VCSEL. (a) Schematic architectures of the TDRC and QRC based on the VCSEL. Experimental setup of (b) the TDRC and (c) the QRC. PC, polarization controller; Att, attenuator; MZM, Mach-Zehnder Modulator; DL, delay line; EDFA, erbium-doped fiber amplifier; OBPF, optical bandpass filter; PBS, polarization beam splitter; PD, photodetector; AWG, arbitrary waveform generator; OSA, optical spectrum analyzer; OSC, oscilloscope; Ch, channel. The blue (red) line represents the optical (electrical) connection. (d) Optical spectra of the VCSEL. The black line represents the optical spectra of the free-running VCSEL. The red line represents XM and YM separated by PBS and PC.

    Herein, we use the renowned spin-flip model to analyze the nonlinear dynamics of the VCSEL with optical injection, whose rate equations can be modified as47:

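    $$\frac{dE_x}{dt} = \kappa(1+i\alpha)(NE_x - E_x + inE_y) - (\gamma_\alpha + i\gamma_p)E_x + k_{inj}\epsilon_x(t) + F_x, \tag{7}$$

    $$\frac{dE_y}{dt} = \kappa(1+i\alpha)(NE_y - E_y - inE_x) + (\gamma_\alpha + i\gamma_p)E_y + k_{inj}\epsilon_y(t) + F_y, \tag{8}$$

    $$\frac{dN}{dt} = \gamma_N[\mu - N(1+|E_x|^2+|E_y|^2) + in(E_xE_y^* - E_yE_x^*)], \tag{9}$$

    $$\frac{dn}{dt} = -\gamma_s n - \gamma_N[n(|E_x|^2+|E_y|^2) + iN(E_yE_x^* - E_xE_y^*)], \tag{10}$$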

    where Ex and Ey represent the slowly varying complex electric field amplitudes of XM and YM, respectively, N stands for the total population inversion between the conduction and valence bands, and n accounts for the difference between the carrier inversions with opposite spins. The last terms in Eqs. (7) and (8) are the spontaneous emission noises described by the Langevin sources, which can be written as48:

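    $$F_x = \sqrt{\beta_{sp}/2}\,[\sqrt{N+n}\,\xi_1 + \sqrt{N-n}\,\xi_2], \tag{11}$$

    $$F_y = -i\sqrt{\beta_{sp}/2}\,[\sqrt{N+n}\,\xi_1 - \sqrt{N-n}\,\xi_2], \tag{12}$$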

    where the spontaneous emission rate βsp is set to 10^−6 ns−1, and ξ1,2 represent independent Gaussian white noise sources of unit variance and zero mean. The optical injection is described by the third terms in Eqs. (7) and (8), where kinj stands for the injection strength and εx,y(t) is the output of the Mach-Zehnder modulator (MZM), described as49,50:

    $$\epsilon_{x,y}(t) = \frac{|\epsilon_0|}{2}\{1 + e^{i[P_{,enc}(t)+\Phi_0]}\}e^{i2\pi\Delta f_{x,y}t}, \tag{13}$$

    where |ε0| represents the amplitude of the injection field, Φ0 is the bias voltage of the MZM, and Δfx (Δfy) is the frequency detuning between the injection field εx(t) [εy(t)] and the XM (YM) of the VCSEL. For simplicity, we set Δfx=Δfy=Δf.
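
    For readers who want to experiment with the model, the following Python sketch integrates the deterministic part of Eqs. (7) to (10) with a classical fourth-order Runge-Kutta step of 2 ps, using the parameter values of Table 1 and kinj=20 ns−1. It is a simplified illustration: the Langevin noise terms are omitted, the drive phase is held constant at zero, and the initial conditions and function names are arbitrary choices rather than the authors' settings.

```python
import numpy as np

# Parameters from Table 1 (rates in ns^-1, time in ns).
kappa, alpha = 300.0, 3.0
gamma_a, gamma_p = 0.1, 10.0       # linear dichroism and birefringence
gamma_N, gamma_s = 1.0, 50.0
mu, k_inj = 1.01, 20.0
eps0, phi0, delta_f = 1.0, 0.0, 0.0

def injection(t, phase):
    """MZM output field for a phase-encoded drive signal, as in Eq. (13)."""
    return (0.5 * eps0 * (1.0 + np.exp(1j * (phase + phi0)))
            * np.exp(2j * np.pi * delta_f * t))

def rhs(t, y, p_x, p_y):
    """Deterministic right-hand side of Eqs. (7)-(10); noise is omitted."""
    Ex, Ey, N, n = y
    dEx = (kappa * (1 + 1j * alpha) * (N * Ex - Ex + 1j * n * Ey)
           - (gamma_a + 1j * gamma_p) * Ex + k_inj * injection(t, p_x))
    dEy = (kappa * (1 + 1j * alpha) * (N * Ey - Ey - 1j * n * Ex)
           + (gamma_a + 1j * gamma_p) * Ey + k_inj * injection(t, p_y))
    I = abs(Ex) ** 2 + abs(Ey) ** 2
    dN = gamma_N * (mu - N * (1 + I)
                    + 1j * n * (Ex * np.conj(Ey) - Ey * np.conj(Ex)))
    dn = -gamma_s * n - gamma_N * (
        n * I + 1j * N * (Ey * np.conj(Ex) - Ex * np.conj(Ey)))
    return np.array([dEx, dEy, dN, dn])

def rk4_step(t, y, dt, p_x, p_y):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(t, y, p_x, p_y)
    k2 = rhs(t + dt / 2, y + dt / 2 * k1, p_x, p_y)
    k3 = rhs(t + dt / 2, y + dt / 2 * k2, p_x, p_y)
    k4 = rhs(t + dt, y + dt * k3, p_x, p_y)
    return y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

dt = 2e-3                                    # 2 ps expressed in ns
y = np.array([0.1, 0.1, 1.0, 0.0], dtype=complex)   # arbitrary start [Ex, Ey, N, n]
for step in range(5000):                     # 10 ns with a constant zero drive phase
    y = rk4_step(step * dt, y, dt, 0.0, 0.0)
intensity_x, intensity_y = abs(y[0]) ** 2, abs(y[1]) ** 2
```

    In a reservoir simulation, the phase arguments p_x and p_y would carry the masked signal P(t) and the encoded signal Penc(t), and the intensities would be sampled every θ to form the transient response matrices.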

    Note here that the rate equations of Eqs. (7) to (10) are solved using a fourth-order Runge-Kutta algorithm with a time step of 2 ps, and the main simulation parameters are tabulated in Table 1 (ref. 41). The nonlinear processes described in Eqs. (7) to (10) implement the nonlinear function f in Eq. (6). In this work, we select three commonly used benchmark tasks to evaluate system performance, including chaotic time-series prediction, nonlinear channel equalization, and memory capacity. The normalized mean square error (NMSE) between the target and predicted values is utilized to assess the accuracy of model predictions; the symbol error rate (SER), defined as the ratio of the number of misrecognized symbols to the total number of test symbols, represents the classification ability of the system for discrete signals in the channel equalization task; the memory capacity (MC) reflects the retention of past input signals, which is beneficial in processing time-dependent tasks (detailed in Supplementary information, Section 1).
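
    The NMSE and SER described above, together with a linear readout, can be sketched as follows. The NMSE is written in its common form (mean squared error divided by the target variance), and ridge regression is assumed for training Wout; the exact definitions and training algorithm used by the authors are given in their Supplementary information and may differ.

```python
import numpy as np

def nmse(target, pred):
    """Normalized mean square error: mean squared error over target variance."""
    target = np.asarray(target, dtype=float)
    pred = np.asarray(pred, dtype=float)
    return np.mean((target - pred) ** 2) / np.var(target)

def ser(target_symbols, pred_symbols):
    """Symbol error rate: misrecognized symbols over total test symbols."""
    return np.mean(np.asarray(target_symbols) != np.asarray(pred_symbols))

def train_readout(V, y, ridge=1e-6):
    """Ridge-regression output weights for a state matrix V (samples x nodes)."""
    V1 = np.hstack([V, np.ones((V.shape[0], 1))])    # append a bias column
    A = V1.T @ V1 + ridge * np.eye(V1.shape[1])
    return np.linalg.solve(A, V1.T @ y)

def predict(V, W_out):
    return np.hstack([V, np.ones((V.shape[0], 1))]) @ W_out

# Toy check on synthetic states with a known linear target.
rng = np.random.default_rng(0)
V = rng.normal(size=(200, 5))
y = V @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.5
W_out = train_readout(V, y)
fit_error = nmse(y, predict(V, W_out))
symbol_error = ser([1, 3, -1, -3], [1, 3, 1, -3])    # one of four symbols wrong
```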

      Table 1. Some key parameters of the VCSEL used.

      Symbol   Parameter                                Value
      κ        Field decay rate                         300 ns−1
      α        Linewidth enhancement factor             3
      γα       Linear dichroism                         0.1 ns−1
      γp       Linear birefringence                     10 ns−1
      γN       Decay rate of N                          1 ns−1
      γs       Spin-flip rate                           50 ns−1
      μ        Normalized bias current of the VCSEL     1.01
      |ε0|     Injection field amplitude                1
      Φ0       Bias voltage of the MZM                  0 V
      Δf       Frequency detuning                       0 GHz
      θ        Virtual nodes interval                   2×10^−11 s

    Experimental setup

    Figure 2(b) and 2(c) show the experimental setups for the TDRC and the proposed QRC, respectively. Two tunable lasers (TLs, i.e., TLD-C20 and NLC13) serve as the drive lasers providing two optical carriers, whose amplitudes are adjusted by attenuators and whose polarizations are aligned with the MZMs through polarization controllers (PCs). Here, the original signal P(n), obtained from S(n) masked by a binary random mask {−1, 1}, and the encoding signal Penc(n) are generated by two arbitrary waveform generators (AWGs, i.e., AWG70001B and 81150A), and then loaded on the two optical carriers via MZMs. It should be noted that the original signal P(n) and the QC-encoded signal Penc(n) in this work are first created on a personal computer during the preprocessing phase and then sent to the AWGs to generate the input in the electrical domain. However, the entire encoding process can be expected to be implemented in hardware. For instance, a Field-Programmable Gate Array (FPGA) can be utilized to function as an adder and multiplier, thereby executing the complete encoding process. By configuring specific kernel coefficients within the FPGA, which are multiplied with the slid original signals and subsequently summed up, the encoded signals are generated. The dual paths of the modulated light, which are separately realigned with the polarizations of the XM and YM of the off-the-shelf VCSEL, are combined with a 50∶50 optical coupler (OC). The main difference between the two experimental setups shown in Fig. 2(b) and 2(c) lies in the existence of the FL in Fig. 2(b), which additionally contains a delay line (DL) used to control the delay time τ, as well as an attenuator used to adjust the FL strength. The output of the VCSEL, taken from the 70% output of the OC [TDRC in Fig. 2(b)] or from Port 3 of the circulator [QRC in Fig. 2(c)], is first attenuated to within the working range of the erbium-doped fiber amplifier (EDFA) before being amplified, and then filtered by the optical bandpass filter (OBPF, WLTF-NM-S-1550-60/0.8-SM-0.9/1.0-FC/APC) to suppress the noise caused by optical injection. Herein, we use the polarization beam splitter (PBS) combined with the last PC, while observing the center wavelength of the two optical paths on the optical spectrum analyzer (OSA, AQ6370D), to separate the XM and YM of the VCSEL into two independent optical paths [shown in Fig. 2(d)]. The transient responses of the reservoir state are sampled by a real-time oscilloscope (OSC, WaveMaster 820Zi-B) after being detected by the photodetectors (PDs). In the experiment, considering the bandwidth limitation of the AWG used, we set θ=1 ns and the number of neurons in each mode m=100, so that the sampling period is T=100 ns (T=θ×m), and the feedback delay time is τ=T. Owing to the two polarization modes of the VCSEL, the total neuron number is 2m. Additionally, the sampling rates of the AWG and OSC are 1 GSa/s and 40 GSa/s, respectively.
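
    The timing figures quoted above fix how an oscilloscope record maps onto virtual-node states: at 40 GSa/s and θ=1 ns there are 40 points per node, and 2m=200 nodes in total across both modes. The sketch below shows one plausible way to build the transient response matrix from two recorded traces. Averaging the points inside each θ window is an assumption made here for illustration (the text does not state how a node value is extracted), and the random arrays are placeholders for real photodetector data.

```python
import numpy as np

theta = 1e-9                 # virtual-node interval (s)
m = 100                      # virtual nodes per polarization mode
T = theta * m                # single sampling period: 100 ns
osc_rate = 40e9              # oscilloscope sampling rate (Sa/s)
pts_per_node = int(round(osc_rate * theta))   # 40 points per virtual node

def trace_to_states(trace, periods):
    """Average the oscilloscope points inside each theta window (assumption)."""
    needed = periods * m * pts_per_node
    trace = np.asarray(trace[:needed], dtype=float)
    return trace.reshape(periods, m, pts_per_node).mean(axis=2)

rng = np.random.default_rng(3)
periods = 50
trace_x = rng.normal(size=periods * m * pts_per_node)   # placeholder XM record
trace_y = rng.normal(size=periods * m * pts_per_node)   # placeholder YM record

# Joint state matrix [Vx, Vy] with 2m columns per input period.
V = np.hstack([trace_to_states(trace_x, periods),
               trace_to_states(trace_y, periods)])
```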

    Results

    Simulation results

    During the simulation phase, we initially compare the performance of the TDRC and QRC, to identify their optimal parameter space. We establish the FFRC as an additional control group to highlight the importance of encoding data. We then systematically analyze the impact of two key parameters of QC on the QRC, including the size of kernel Q and the step coefficient β (similar to the kernel size and stride in CC), and also determine the optimal parameter space. Additionally, we explore the impact of AF in post-processing along with the injection methods of encoding data, where the sigmoid function is selected as the AF in the output layer to nonlinearly transform the transient response matrix, i.e., an array composed of equidistant samples of the laser output, into an extended new matrix, which is used to enrich the neural representation by increasing the number of virtual nodes.
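
    The post-processing step with the AF can be sketched as a column-wise extension of the state matrix: the sigmoid of [Vx, Vy] yields [Vfx, Vfy], and the readout is then trained on the concatenation. The per-column standardization before the sigmoid and the function names are assumptions added for this illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def standardize(V):
    """Zero-mean, unit-variance columns (an assumed preprocessing step)."""
    return (V - V.mean(axis=0)) / (V.std(axis=0) + 1e-12)

def extend_states(Vx, Vy):
    """Concatenate [Vx, Vy] with their sigmoid-transformed copies [Vfx, Vfy]."""
    Vfx = sigmoid(standardize(Vx))
    Vfy = sigmoid(standardize(Vy))
    return np.hstack([Vx, Vy, Vfx, Vfy])

rng = np.random.default_rng(4)
Vx = rng.normal(size=(300, 100))   # placeholder XM transient responses
Vy = rng.normal(size=(300, 100))   # placeholder YM transient responses
V_ext = extend_states(Vx, Vy)      # 300 x 400: columns grow from 2m to 4m
```

    The extension adds no new measurements; it only gives the linear readout access to a nonlinear function of the existing ones.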

    Performance comparison of different RCs

    Because the FL enriches the dynamics of the VCSEL51, the bifurcation diagram with kinj as the control parameter is given in Fig. 3(a). When the feedback strength is kd=20 ns−1, the VCSEL clearly operates in the chaotic state over the range kinj ∈ [2 ns−1, 16 ns−1], and only when kinj > 16 ns−1 is the VCSEL stabilized again, i.e., it stays in a stable region in which good performance of RCs can be guaranteed28. After eliminating the FL, that is, by setting kd=0 ns−1, the stable region is remarkably broadened, which may enable the proposed QRC to exhibit the desired performance in a much wider parameter space. Figure 3(b−d) reveals the significant differences between these three kinds of RCs on the benchmark tasks, while the detailed results of the chaotic time-series prediction are depicted more intuitively in Fig. 3(e). Compared with the TDRC, the FFRC and QRC are much less sensitive to changes of kinj due to the absence of the FL. As expected, the TDRC performs satisfactorily under moderate feedback strength owing to the rich dynamics brought by the FL, while excessive feedback strength can drive the VCSEL from the stable state into the chaotic region and severely degrade its performance. Note that a lower NMSE or SER means better RC performance, while a higher MC means stronger memory ability. Remarkably, our proposed QRC with a streamlined structure can still exhibit competitive performance, that is, comparable to or better than the conventional TDRC, especially in terms of memory ability. This can be attributed to the combination of the proposed QC method and the intrinsic crosstalk effect between the two orthogonal polarization modes of the VCSEL.

    Figure 3. (a) Bifurcation diagram with kinj as the control parameter of the VCSEL. In all panels, the extrema (maxima and minima) of the intensity time series are shown as dots. (b−d) showcase the performance difference of the FFRC, TDRC and QRC, based on the benchmark tasks mentioned before. With Q=6 and β=1.6 in (b); Q=9 and β=0.9 in (c); Q=39 and β=0.7 in (d). The injection power of the TDRC is set at 20 ns−1. The detailed results of the chaotic time-series prediction are further demonstrated in (e), with kinj=20 ns−1 for FFRC; kinj=20 ns−1 and kd=18 ns−1 for TDRC; Q=6, β=1.6 and kinj=20 ns−1 for QRC. The target signal (red), prediction result (black), and error between them (blue) are shown.

    Furthermore, we analyze the optimal parameter space of the systems. The two-dimensional maps of the NMSE, SER and MC of these RCs in the parameter space of kinj and Δf are provided in Fig. 4, where the regions of NMSE<0.01, SER<0.01 and MC>10 are marked with white lines. We can hardly find such an optimal parameter space in the FFRC [Fig. 4(a−c)], and only a limited one for the TDRC [Fig. 4(d−f)], whose optimal parameter space is closely related to the injection locking region, where the data loaded onto the drive laser is well received and processed by the response laser due to the injection locking effect. As expected, the QRC [Fig. 4(g−i)] enables a more flexible setting of the parameters and significantly expands the optimal parameter space, which is a clear advantage for hardware implementation.

    Figure 4. Two-dimensional maps of (a, d, g) NMSE, (b, e, h) SER, and (c, f, i) linear MC in the parameter space of kinj and Δf. (a–c), (d–f) and (g–i) showcase the results of the FFRC, the TDRC and the QRC, respectively. A darker color indicates a smaller value, while the opposite means a larger value. With Q=6 and β=1.6 in (g); Q=9 and β=0.9 in (h); Q=39 and β=0.7 in (i). The feedback strength of the TDRC is set at 18 ns−1, 12 ns−1, 21 ns−1 in (d, e, f), respectively. These results stem from the joint training of XM and YM.

    These comparisons further confirm the reliability of the proposed scheme. Incorporating QC-encoded data gives the QRC augmented memory, making it suitable for a wider range of complex tasks while maintaining a relatively simple network model.

    Analysis of key parameters in QC

    The impact of the kernel size Q and the step coefficient β is subsequently analyzed. Interestingly, the important role played by the crosstalk between the dual polarization modes can be seen in Fig. 5. Here, the blue dashed lines depict the performance when the original data is injected into XM alone, i.e., without YM and encoding data, serving as a baseline. When the original and encoding data are injected into XM and YM in parallel, the performance of the XM changes with the variation induced by the YM injection. Surprisingly, with the crosstalk effect, YM can provide the desired memory ability for XM [Fig. 5(c) and 5(f)]. In addition, the results of these benchmark tasks trained only on the optical injection terms with encoding signals are represented by the red dashed lines, intuitively reflecting the importance of the reservoir layer, i.e., the VCSEL in this work. In Fig. 5(a−c), the performance of QRC on the three selected tasks exhibits similar trends, i.e., as Q is increased, it first improves and then gradually deteriorates after reaching its best. We deduce that a larger Q means a wider receptive field, which may also lead to the loss of details. Figure 5(d−f) reflects the impact of the step coefficient β. When β=0, QRC is equal to a conventional FFRC, which has almost no memory ability and is much worse than the TDRC. Notably, when β is a multiple of one half, and especially when it is an integer, there is a noticeable drop in the QRC performance. Sliding by integer multiples of a single sampling period dilutes sample correlation, rendering the encoding data a linear superposition of multiple independent periods, i.e., just severely distorted original data. The black lines in Fig. 5 reveal that the performance can be further improved when XM and YM are combined with AF. Importantly, the encoding data loaded into YM performs poorly in nonlinear channel equalization tasks, yet excels when combined with XM.
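
    The remark about integer β can be illustrated numerically. In the toy example below, the encoding is applied to the masked stream P (an assumption), and each encoded period is correlated with the mask pattern. For an integer β every delayed copy lands on a whole-period boundary, so each encoded period is just a rescaled copy of the mask; a fractional β mixes differently shifted mask segments. All sizes and the helper name are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
m, periods, Q = 20, 200, 6
S = rng.uniform(0.1, 0.5, size=periods)
mask = rng.choice([-1.0, 1.0], size=m)
P = (S[:, None] * mask[None, :]).ravel()
c = (Q + 1 - np.arange(1, Q + 1)) / (Q * (Q + 1) / 2)

def mask_alignment(beta):
    """Mean |correlation| between each encoded period and the mask pattern."""
    stride = int(round(beta * m))
    P_enc = np.zeros_like(P)
    for j in range(1, Q + 1):
        P_enc[j * stride:] += c[j - 1] * P[:P.size - j * stride]
    rows = P_enc.reshape(periods, m)[2 * Q:]    # skip the zero-padded start
    return np.mean([abs(np.corrcoef(r, mask)[0, 1]) for r in rows])

aligned = mask_alignment(1.0)      # whole-period shifts
mixed = mask_alignment(0.8)        # fractional shifts
```

    With whole-period shifts the encoded stream carries only a per-period scale factor, whereas fractional shifts give each virtual node a different mixture of past samples.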

    Figure 5. Analysis of the kernel size Q and the step coefficient β. The (a) NMSE, (b) SER and (c) MC as a function of the kernel size Q, with kinj=20 ns−1 and β=0.8. The (d) NMSE, (e) SER and (f) MC as a function of the step coefficient β, with kinj=20 ns−1 and Q=10. The original data and encoding data are injected into XM and YM, respectively. Here, we use AF, which provides an additional nonlinear transformation for the output, to obtain the extended matrices [Vfx, Vfy]. The blue dashed lines illustrate the performance achieved by injecting the original data into XM alone, and the red dashed lines represent the trained results of the optical injection terms with encoding signals.

    Similarly, the two-dimensional maps of the NMSE, SER and MC of QRC in the parameter space of Q and β are depicted in Fig. 6. The marked regions in Fig. 6 show the optimal parameter space, where the claw-like structures agree well with Fig. 5(d−f), i.e., a non-negligible decrease in performance occurs when β is a multiple of one half. Figure 6(a−c) and Fig. 6(d−f) showcase the results without and with AF, respectively. When the transient response matrices [Vx, Vy] are combined with the extended matrices [Vfx, Vfy] obtained from AF, the optimal parameter space of QRC expands. However, AF improves MC only marginally, as MC mainly relies on the input or the structure of the system itself, such as the kernel size Q and the step coefficient β of QC, or the delay time τ of the FL, rather than on the nonlinear mapping at the output.
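
    The joint readout over [Vx, Vy] and the AF-extended matrices [Vfx, Vfy] can be illustrated with a minimal Python sketch. It is not the authors' code and rests on several assumptions: the state matrices are random stand-ins for measured transient responses, `tanh` is only a placeholder for the AF, ridge regression with the regularization `lam` is an assumed training rule, and all function names are hypothetical.

```python
import numpy as np

def with_bias(states):
    """Append a constant column so the readout can learn an offset."""
    return np.hstack([states, np.ones((states.shape[0], 1))])

def ridge_readout(states, target, lam=1e-6):
    """Closed-form ridge regression: W = (X^T X + lam I)^-1 X^T y."""
    X = with_bias(states)
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ target)

def predict(states, W):
    return with_bias(states) @ W

def extend_with_af(Vx, Vy, af=np.tanh):
    """Build [Vx, Vy, Vfx, Vfy]; `af` is an assumed placeholder for the AF."""
    return np.hstack([Vx, Vy, af(Vx), af(Vy)])

# Random stand-ins for the transient response matrices (samples x nodes).
rng = np.random.default_rng(0)
n_samples, n_nodes = 500, 20
Vx = rng.normal(size=(n_samples, n_nodes))
Vy = rng.normal(size=(n_samples, n_nodes))

# Toy target that depends nonlinearly on the states.
target = np.sin(Vx[:, 0]) + 0.5 * Vy[:, 1] ** 2

plain = np.hstack([Vx, Vy])
extended = extend_with_af(Vx, Vy)

# Training error of the readout without and with the AF-extended columns.
err_plain = np.mean((predict(plain, ridge_readout(plain, target)) - target) ** 2)
err_ext = np.mean((predict(extended, ridge_readout(extended, target)) - target) ** 2)
print("training MSE without AF:", err_plain)
print("training MSE with AF:   ", err_ext)
```

    In this toy setting the extended readout fits the nonlinear target more closely because it has extra nonlinear columns to draw on. None of those columns carries information about earlier inputs, which matches the observation that AF leaves MC almost unchanged.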

    Figure 6. Two-dimensional maps of (a, d) NMSE, (b, e) SER, and (c, f) linear MC in the parameter space of Q and β. (a–c) and (d–f) showcase the results without and with AF, respectively. These results stem from the joint training of XM and YM, with kinj=20 ns−1 and Δf=0 GHz.

    Additionally, different data injection methods for QRC are considered. Our findings suggest that parallel injection of the original and encoding data holds significant potential for time-related tasks (see Supplementary information, Section 2).

    Experiment results

    In the experiment, we fix the temperature of the VCSEL at 28.82 °C and the bias current at 2.15 mA, which is slightly below the threshold current of 2.16 mA [52]. Under this condition, the central wavelengths of XM and YM are around 1558.372 nm and 1558.240 nm, respectively, and the output power of the free-running VCSEL is approximately 0.4856 μW. The central wavelengths of the two TLs are set to 1558.372 nm and 1558.240 nm, implementing injection without frequency detuning. Here, XM and YM possess similar output strengths, so both polarization modes can simultaneously handle different loaded data for parallel processing. The injection power of the VCSEL and the feedback power of the FL are adjusted by attenuators, and the injection powers of the two modes are set to be almost the same. The QRC experimental structure is greatly simplified by the absence of the FL, making it easier to integrate and more flexible in parameter selection. For the FFRC and QRC, the impact of the injection power on the performance is explored. For the TDRC, the injection power is fixed at 1292.4 μW (almost equal to 1297.2 μW, the maximum injection power considered for the FFRC and QRC), and the effect of the feedback strength is investigated. It is worth noting that noise introduced by the experimental equipment and optical components makes it challenging to reach the high performance obtained in simulation.

    Figure 7 illustrates the experimental comparison of the FFRC, TDRC and QRC on the previously introduced tasks, mirroring the simulations. Figure 7(a−c) displays the expected trends, similar to those shown in Fig. 3(b−d). The FL improves the performance at moderate feedback strength, but excessive feedback strength can disrupt the stable state of the system [28,51], thereby reducing the performance. Specifically, the optimal NMSE, SER and MC of the TDRC are 0.0169, 0.0267 and 1.9263, respectively. For the FFRC and QRC, the performance improves almost synchronously as the injection power increases. This is mainly attributed to the better quality of the transient response matrices resulting from the improved signal-to-noise ratio. Within the considered injection power range, the optimal NMSE, SER and MC are 0.0411, 0.0572, and 0.4895 for the FFRC, and 0.0157, 0.0027, and 3.4605 for the QRC. Clearly, a system lacking memory ability suffers a significant performance loss, which the proposed QC can recover. Interestingly, QRC, whose memory ability is provided by the encoding data, exhibits excellent performance on several kinds of benchmark tasks, especially in discrete data processing and memory ability. Meanwhile, compared with TDRC, QRC also demonstrates superiority in energy consumption due to lower injection power requirements and reduced power loss. The flexible parameter configurations in QRC reduce the required injection power and, in turn, the energy consumption, whereas the FL in TDRC often demands a higher injection power to keep the system working in a stable region [28]. For instance, in the channel equalization task, QRC and TDRC exhibit comparable performance under experimental (simulated) conditions when the injection power (strength) is approximately 600 μW (9 ns−1) for QRC and 1300 μW (20 ns−1) for TDRC, as illustrated in Fig. 7(b) [Fig. 3(c)]. Additionally, QRC avoids the extra energy costs from beam splitting or coupling in the FL, thus improving the overall energy efficiency.
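
    For reference, the two error metrics quoted above can be computed as in the following Python sketch. It assumes the commonly used definitions (NMSE as the mean squared error normalized by the target variance, SER as the fraction of wrongly decided symbols) and a four-level symbol alphabet {−3, −1, 1, 3}, as is typical for the channel equalization benchmark. The function names and sample values are illustrative only, and the exact normalization behind the reported numbers is the one given in the task description.

```python
import numpy as np

def nmse(target, prediction):
    """Mean squared error divided by the variance of the target (assumed definition)."""
    target = np.asarray(target, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    return np.mean((target - prediction) ** 2) / np.var(target)

def symbol_error_rate(symbols, output, alphabet=(-3, -1, 1, 3)):
    """Fraction of outputs whose nearest alphabet symbol differs from the true one."""
    alphabet = np.asarray(alphabet, dtype=float)
    output = np.asarray(output, dtype=float)
    # Decide each analog output as the closest symbol in the alphabet.
    nearest = np.argmin(np.abs(output[:, None] - alphabet[None, :]), axis=1)
    decided = alphabet[nearest]
    return np.mean(decided != np.asarray(symbols, dtype=float))

# Made-up readout values for eight transmitted symbols.
symbols = np.array([-3, -1, 1, 3, 1, -1, 3, -3])
output = np.array([-2.8, -0.7, 1.2, 1.9, 0.6, -1.1, 3.3, -3.2])

# The fourth output (1.9) is decided as 1 instead of 3: one error in eight.
print("SER  =", symbol_error_rate(symbols, output))
print("NMSE =", nmse(symbols, output))
```

    With these definitions an NMSE of 1 corresponds to always predicting the target mean, and an SER of 0.0027 corresponds to roughly 27 wrong decisions per 10,000 symbols.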

    Figure 7. The experimental performance comparison between the FFRC, the TDRC and the QRC on (a) time-series prediction, (b) nonlinear channel equalization and (c) memory ability, with Q=6 and β=1.6 in (a); Q=9 and β=0.9 in (b); Q=39 and β=0.7 in (c). The performance of the QRC on nonlinear channel equalization is detailed in (d), while (e) depicts the memory details of the three RCs when the injection power is 1297.2 μW for the FFRC and the QRC and 1292.4 μW for the TDRC.

    Figure 7(d) further shows the details of QRC on nonlinear channel equalization. The performance obtained from YM, which carries the encoding data, is far inferior to that obtained from XM, which carries the original data. However, a significant performance improvement occurs after merging the transient response matrices obtained from XM and YM, which is highly consistent with the simulation results in Fig. 5(b) and 5(e). Figure 7(e) showcases the memory details of the three networks mentioned above. The QRC exhibits a superior memory capacity and maintains an mc(4) value of 0.7787, indicating that it still retains the 4th past input signal. In contrast, the memory function of the FFRC (TDRC) drops sharply from 0.4194 (0.9649) to an mc(4) of 0.0027 (0.0024). Thus, the enhanced memory capability of QRC also renders it more adept at handling complex tasks.
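
    The per-delay quantity mc(i) discussed above can be estimated as in the following Python sketch. It assumes the standard linear memory capacity definition, in which mc(i) is the squared correlation between the input delayed by i steps and its linear reconstruction from the reservoir states, and MC is the sum of mc(i) over the delays. The VCSEL is not modeled: a small software echo state network and a memoryless feed-forward map are swapped in as hypothetical stand-ins, and every name and parameter below is an assumption chosen for illustration.

```python
import numpy as np

def memory_curve(states, u, max_delay, washout=50):
    """Return mc(i), i = 1..max_delay, from states (samples x nodes) and input u.

    Weights are fitted and evaluated on the same samples for brevity,
    which slightly overestimates mc(i).
    """
    n_total = len(u)
    X = np.hstack([states, np.ones((n_total, 1))])   # add a bias column
    mc = np.zeros(max_delay)
    for i in range(1, max_delay + 1):
        Xi = X[washout + i:]                # states at times n
        yi = u[washout:n_total - i]         # inputs at times n - i
        W, *_ = np.linalg.lstsq(Xi, yi, rcond=None)
        mc[i - 1] = np.corrcoef(Xi @ W, yi)[0, 1] ** 2
    return mc

def toy_recurrent_states(u, n_nodes=30, rho=0.9, seed=1):
    """Stand-in reservoir with fading memory (a small echo state network)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_nodes, n_nodes))
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))  # set the spectral radius
    w_in = rng.uniform(-0.5, 0.5, size=n_nodes)
    x = np.zeros(n_nodes)
    states = np.empty((len(u), n_nodes))
    for n, un in enumerate(u):
        x = np.tanh(W @ x + w_in * un)
        states[n] = x
    return states

def toy_feedforward_states(u, n_nodes=30, seed=1):
    """Stand-in memoryless map: each state depends on the current input only."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=n_nodes)
    return np.tanh(np.outer(u, w_in))

rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, size=2000)       # independent random input sequence

mc_rec = memory_curve(toy_recurrent_states(u), u, max_delay=10)
mc_ff = memory_curve(toy_feedforward_states(u), u, max_delay=10)
print("recurrent stand-in    MC =", mc_rec.sum())
print("feed-forward stand-in MC =", mc_ff.sum())
```

    In this toy comparison the memoryless map yields an MC close to zero, qualitatively like the FFRC, whereas any mechanism that carries past inputs into the current state raises mc(i) at the corresponding delays.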

    Finally, the influence of the number of virtual nodes is also taken into account through oversampling (see Supplementary information, Section 3), which leads to higher accuracy and faster computational speed [53]. The optimal average NMSE (SER, MC) is 0.0054 (0.001, 3.8818), achieved with a total of 1600 (800, 800) neurons. Additionally, a comparison of this work with existing competitive experimental results based on semiconductor lasers is summarized in Supplementary information, Section 4, Table S1, where our scheme offers high-speed and high-performance parallel computing through a simple yet flexible structure built from off-the-shelf hardware.

    Conclusions

    In this study, we have proposed and validated, both theoretically and experimentally, a novel QRC with enhanced memory capabilities, which removes the dependence on the FL and demonstrates advanced performance in a streamlined structure. As a proof-of-concept prototype, we have utilized an easily integrated low-power VCSEL for parallel processing. The dual polarization modes of the VCSEL enable the original and encoding data to be injected and processed in parallel, reducing processing latency by half. The encoding data from QC endows QRC with the desired memory capability, and the crosstalk between XM and YM links the original and encoding data, mimicking the input-memory interaction in the human brain. Moreover, both simulation and experimental results consistently demonstrate the feasibility and superiority of this QRC scheme compared to the well-studied TDRC. Eliminating the reliance on the FL via pre-processing encoding offers a viable solution for simplifying experimental setups, easing hardware implementation challenges, and allowing for more flexible parameter configurations, which paves the way toward highly integrated implementations.

    Future work will focus on extending the proposed QC to deep RCs [54,55], which have recently been widely studied and are noted for their excellent ability to handle complicated tasks. Although their extremely complex structure makes hardware implementation challenging, QC may offer a promising approach to enhance the extensibility of deep physical ANNs.

    [8] M Chu, B Kim, S Park et al. Neuromorphic hardware system for visual pattern recognition with memristor array and CMOS neuron. IEEE Trans Ind Electron, 62, 2410-2419(2015).

    [16] R Hamerly, L Bernstein, A Sludds et al. Large-scale optical neural networks based on photoelectric multiplication. Phys Rev X, 9, 021032(2019).

    [23] CH Li, W Du, YX Huang et al. Photonic synapses with ultralow energy consumption for artificial visual perception and brain storage. Opto-Electron Adv, 5, 210069(2022).

    [24] CR Huang, VJ Sorger, M Miscuglio et al. Prospects and applications of photonic neural networks. Adv Phys X, 7, 1981155(2022).

    [25] I Goodfellow, Y Bengio, A Courville. Deep Learning(2016).

    [26] H Jaeger. The “echo state” approach to analyzing and training recurrent neural networks(2001).

    [41] XX Guo, SY Xiang, YH Zhang et al. Polarization multiplexing reservoir computing based on a VCSEL with polarized optical feedback. IEEE J Sel Top Quantum Electron, 26, 1700109(2020).

    [42] LR Squire. Memory and Brain(1987).

    [45] M Lukoševičius. A practical guide to applying echo state networks. In Montavon G, Orr GB, Müller KR. Neural Networks: Tricks of the Trade, 659-686(2012).

    [53] L Larger, A Baylón-Fuentes, R Martinenghi et al. High-speed photonic reservoir computing using a time-delay-based architecture: million words per second classification. Phys Rev X, 7, 011015(2017).

    Changdi Zhou, Yu Huang, Yigong Yang, Deyu Cai, Pei Zhou, Kuenyao Lau, Nianqiang Li, Xiaofeng Li. Streamlined photonic reservoir computer with augmented memory capabilities[J]. Opto-Electronic Advances, 2025, 8(1): 240135-1

    Paper Information

    Category: Research Articles

    Received: Jun. 5, 2024

    Accepted: Aug. 19, 2024

    Published Online: Mar. 24, 2025

    The Author Email: Nianqiang Li (NQLi), Xiaofeng Li (XFLi)

    DOI:10.29026/oea.2025.240135
