Optoelectronic reservoir computing based on complex-value encoding

Chunxu Ding; Rongjun Shao; Jingwei Li; Yuan Qu; Linxian Liu; Qiaozhi He; Xunbin Wei; Jiamiao Yang

doi:10.1117/1.APN.3.6.066006

1 Introduction

In the last decade, the recurrent neural network (RNN), a basic architecture of deep learning, has achieved great success in various fields, such as time series analysis1–3 and natural language processing.4,5 Nonetheless, its implementation on electronic hardware faces considerable challenges in speed and energy consumption because of the explosive growth of data and the rapid increase in task complexity.6–8 Optical computing,9–13 with its low power consumption, high computational speed, and inherent parallelism, is a potential solution to this problem. By using passive optical materials for modulation, engineers can reduce the power consumption of optical computing to a negligible level. Optical reservoir computing (ORC)14 is an embodiment of the RNN. It maps the input to a high-dimensional space through its optical reservoir and then correlates the high-dimensional states of the optical reservoir with the output.15 ORC brings the advantages of optical computing into reservoir computing and has attracted the attention of researchers working on temporal information processing.

To tackle complicated tasks, ORC has evolved from the early small-scale silicon-based ORC16,17 to time-delay ORC18–21 and spatial light-based ORC22–27 with larger numbers of nodes. Time-delay ORC constructs the optical reservoir by connecting temporally separated virtual nodes, achieving system miniaturization but sacrificing processing speed for node scalability. Spatial light-based ORC utilizes diffraction or scattering to construct the optical reservoir with a large number of physical nodes, enabling parallel information processing at a speed unaffected by the number of nodes. Thanks to these characteristics, spatial light-based ORC has found wide application in high-speed processing tasks, such as motion recognition24 and chaotic sequence prediction.23,25–27 Although ORC now significantly outperforms traditional RC in processing speed at large node counts, a considerable gap in accuracy remains. Dong et al. found that enhancing the signal encoding resolution effectively improves the accuracy of ORC and proposed an advanced basket encoding,25 which achieved a prediction accuracy surpassing that of other encoding methods in Mackey–Glass (MG) chaotic sequence prediction. Enhancing the encoding resolution means that the optical system can distinguish more signal levels, which enriches the internal states of the optical reservoir and thereby improves accuracy.28,29 However, current spatial light-based ORC encodes information in only a single dimension of the optical field, which limits encoding flexibility and impedes a breakthrough in input resolution. Furthermore, because amplitude and phase interact during coherent light propagation, single-dimensional encoding may introduce task-irrelevant information into the computation, degrading the accuracy of spatial light-based ORC to some extent.

Here, we propose complex-value encoding-based optoelectronic reservoir computing (CE-ORC), which modulates the complex amplitude of the optical field. This development enhances the discrimination of data in the optical field and prevents task-irrelevant information from influencing the computation. In addition, complex-value encoding allows the convenient introduction of scale factors as hyperparameters that tune the optical reservoir toward optimal accuracy. We have built a CE-ORC processing unit with thousands of neurons based on a digital micromirror device (DMD) and a scattering medium. A dedicated field programmable gate array (FPGA) board was developed to optimize the iteration rate of the processing unit using parallel processing and high-speed interfaces. We demonstrated that CE-ORC achieves better prediction accuracy than the conventional ORC in both MG time series prediction and weather forecasting.

2 Methods

2.1 Principle of CE-ORC

The fundamental idea of CE-ORC is to employ the multidimensional nature of the optical field to improve the input resolution of the ORC system and to introduce multiple hyperparameters that tune the reservoir dynamics for better accuracy. As shown in Fig. 1, we construct the optical reservoir from a scattering medium and a detector. CE-ORC first converts the input state defined in real space and the previous optical reservoir state to corresponding complex vectors using the complex-valued encoding strategy (see Sec. 2.2). The combination of these two vectors is loaded onto the incident optical field through modulation. The modulated optical field then propagates through the scattering medium to update the reservoir state. Finally, the current optical reservoir state is mapped to the output of CE-ORC through a linear operation.

Figure 1. Principle of CE-ORC. The complex-valued encoding strategy converts the input state $U_{t}$ defined in real space, together with the previous state $R_{t - 1}$, to a complex optical field. The scattering medium, acting as the reservoir, transforms this field to the current state $R_{t}$, which is captured by the detector. Based on the captured image, a trained program generates the current output $Y_{t}$.

The complex-valued encoding strategy first encodes the real-valued information into an amplitude vector $A$ and a phase vector $φ$, and then synthesizes these two vectors into a complex vector through the Hadamard product, represented by $⊙$. To tune the optical reservoir dynamics, we introduce scale factors into the amplitude encoding. Here, $s_{i}$ and $s_{r}$ represent the input weight scale factor and the internal weight scale factor, respectively. The phase encoding, implemented through a fixed conversion coefficient that maps signals to the $[0, π]$ interval, is not affected by these scale factors. Because the phase encoding remains the same regardless of the scale factors applied, this approach preserves the diversity of the original information and keeps the input resolution largely independent of the scale factors.

The incident optical field $E$ consists of two parts, $E_{in}$ and $E_{res}$, which are loaded with the input information and the previous optical reservoir state, respectively. The scattering medium, mimicking a randomly connected network, scrambles the information in $E$. To obtain the current state $R_{t}$ of the reservoir, we capture the intensity distribution of the transmitted field with a detector. The mathematical expression of this process is

$\begin{cases} R_{t} = (1 - a) \cdot R_{t - 1} + a \cdot f (W_{in} \cdot E_{in} + W_{res} \cdot E_{res}) \\ E_{in} = A (U_{t}, s_{i}, n) ⊙ \exp [i φ (U_{t}, n)] \\ E_{res} = A (R_{t - 1}, s_{r}, m) ⊙ \exp [i φ (R_{t - 1}, m)] \end{cases}$ (1)

where $W_{in}$ and $W_{res}$ are the input weights (green solid arrows in Fig. 1) and the internal weights of the optical reservoir (green dashed arrows in Fig. 1), respectively; $f (\cdot)$ is an activation function associated with the intensity readout by the detector; and $a$ is the leak rate that balances the influences of the input and the previous state on the current state.
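The update rule in Eq. (1) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the experimental pipeline: an i.i.d. complex Gaussian matrix stands in for the scattering medium, `encode` is a simplified placeholder rather than the amplitude profile of Sec. 2.2, the intensity readout is clipped to mimic a detector with finite range, and the sizes `n_in` and `n_res` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 64, 256   # hypothetical sizes, much smaller than the experiment
a = 0.9                 # leak rate

def random_transmission(rows, cols):
    # Assumption: an i.i.d. complex Gaussian matrix stands in for the scattering medium.
    real = rng.normal(size=(rows, cols))
    imag = rng.normal(size=(rows, cols))
    return (real + 1j * imag) / np.sqrt(2 * cols)

W_in = random_transmission(n_res, n_in)
W_res = random_transmission(n_res, n_res)

def encode(x, s):
    # Placeholder encoder (not the paper's amplitude profile):
    # amplitude s * (1 + x) / 2, phase linear in x on [0, pi].
    x = np.clip(x, -1.0, 1.0)
    return s * 0.5 * (1.0 + x) * np.exp(1j * 0.5 * (x + 1.0) * np.pi)

def update(R_prev, U_t, s_i=0.95, s_r=0.9):
    # Complex field after the medium: W_in . E_in + W_res . E_res
    field = W_in @ encode(U_t, s_i) + W_res @ encode(R_prev, s_r)
    # f(.): intensity readout, clipped to mimic a detector with finite range.
    intensity = np.clip(np.abs(field) ** 2, 0.0, 1.0)
    return (1.0 - a) * R_prev + a * intensity

R = np.zeros(n_res)
for t in range(50):
    U_t = np.full(n_in, np.sin(0.2 * t))   # scalar input replicated over the input area
    R = update(R, U_t)
```

Because the new state is a convex combination of the previous state and a clipped intensity, `R` stays within $[0, 1]$ in this sketch, which keeps it inside the range accepted by the placeholder encoder.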

Based on the current state $R_{t}$, we can calculate the output $Y_{t}$ through a straightforward linear operation according to the RC theory (Note 1 and Fig. S1 in the Supplemental Material),

$Y_{t} = W_{out} \cdot R_{t},$ (2)

where $W_{out}$ is the output weight and is the only component in CE-ORC that needs to be trained. The training is equivalent to a linear regression problem that minimizes

$Δ = \sum_{k} ‖ W_{out} \cdot R_{k} - y_{k} ‖^{2} + γ ‖ W_{out} ‖^{2} .$ (3)

Here, $R_{k}$ and $y_{k}$ are the state of the optical reservoir and the targeted output at the step $k$ , respectively. $γ$ is the regularization factor of the ridge regression program to prevent overfitting.
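The minimization in Eq. (3) has the standard closed-form ridge regression solution. The sketch below shows one way to compute it with NumPy; the function name `train_readout`, the array layout (one reservoir state per row), and the synthetic data are illustrative assumptions.

```python
import numpy as np

def train_readout(states, targets, gamma=1e-6):
    """Ridge regression readout.

    states:  (K, N) array, one reservoir state R_k per row.
    targets: (K, d) array, one target output y_k per row.
    Returns W_out with shape (d, N) so that Y_t = W_out @ R_t.
    """
    n_nodes = states.shape[1]
    gram = states.T @ states + gamma * np.eye(n_nodes)
    # Solve (S^T S + gamma I) W^T = S^T Y instead of forming an explicit inverse.
    return np.linalg.solve(gram, states.T @ targets).T

# Synthetic check: recover a known linear readout from noiseless data.
rng = np.random.default_rng(1)
states = rng.random((500, 20))
W_true = rng.normal(size=(2, 20))
targets = states @ W_true.T
W_out = train_readout(states, targets)
predictions = states @ W_out.T
```

A larger `gamma` shrinks the weights and trades training accuracy for robustness to noisy reservoir states.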

2.2 Complex-value Encoding Based on a DMD

A complex-value encoding strategy based on the multiple degrees of freedom of the optical field is proposed to improve the input resolution. In this encoding strategy, we convert a real number $x$ in the range $(- 1, 1)$ to a complex vector. If the dimension of the encoded complex vector is $n_{bin}$, the $k$th element $Φ_{k} (x)$ of the vector can be expressed as

$\begin{cases} Φ_{k} (x) = A_{k} (u_{k}, s) \times \exp [i φ_{k} (x)] \\ A_{k} (u_{k}, s) = s \times \left( 1 + 0.5 [\tanh (5 u_{k} - 10) - \tanh (5 u_{k} + 10)] + \exp \left( - \frac{u_{k}^{2}}{0.18} \right) \right) \\ u_{k} = 3 x - 2 + \frac{4 (k - 1)}{n_{bin}} \\ φ_{k} (x) = \frac{(x + 1)}{2} π \end{cases} .$

Here, $k$ is an integer index ranging from 1 to $n_{bin}$ ($k \in \{1, 2, \dots, n_{bin}\}$). $A_{k}$ and $φ_{k}$ are the amplitude and phase of $Φ_{k}$, respectively. $u_{k}$ is an intermediate quantity. $s$ is the scale factor, which affects the amplitude of the encoded complex vector. The phase is encoded within the monotonic range of $[0, π]$ through a linear transformation of the original data from the real space. This approach not only conserves the similarity relationships of the original data but also effectively reduces encoding complexity and improves encoding speed. The scale factor in the complex-value encoding strategy can serve to adjust the scattering transmission matrix, which fits with the idea of traditional RC theory to optimize the input weights and the spectral radius.
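As a rough illustration of the encoding described above, the following sketch maps a real number to an $n_{bin}$-dimensional complex vector. It assumes that the amplitude profile is $s \times (1 + 0.5 [\tanh (5 u_{k} - 10) - \tanh (5 u_{k} + 10)] + \exp (- u_{k}^{2} / 0.18))$ and that the phase is the linear map of $x$ onto $[0, π]$ described in the text; the function name `complex_encode` is hypothetical.

```python
import numpy as np

def complex_encode(x, n_bin=10, s=1.0):
    """Encode a real number x in (-1, 1) as an n_bin-dimensional complex vector."""
    k = np.arange(1, n_bin + 1)
    u = 3.0 * x - 2.0 + 4.0 * (k - 1) / n_bin
    # Assumed amplitude profile: close to 1 for |u| > 2, a narrow bump around u = 0.
    amplitude = s * (
        1.0
        + 0.5 * (np.tanh(5.0 * u - 10.0) - np.tanh(5.0 * u + 10.0))
        + np.exp(-u ** 2 / 0.18)
    )
    # Phase: linear map of x from (-1, 1) onto (0, pi), independent of s and k.
    phase = 0.5 * (x + 1.0) * np.pi
    return amplitude * np.exp(1j * phase)

vec_full = complex_encode(0.0, n_bin=10, s=1.0)
vec_half = complex_encode(0.0, n_bin=10, s=0.5)
```

In this sketch the scale factor only rescales the amplitude, so `vec_half` equals `0.5 * vec_full` and the phase of every element is unchanged.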

We generated the optical field by modulating the incident laser beam with a DMD. In the conventional method, the DMD provided only a binary amplitude modulation of the incident light. In CE-ORC, we introduced superpixel encoding to modulate both the amplitude and phase of the incident light. This encoding grouped $n \times n$ ($4 \times 4$ in our demonstration) neighboring DMD pixels together as a superpixel. Each DMD pixel had its own phase prefactor in the first-order diffraction. A low-pass filter behind $L_{1}$ blended the images of the pixels and averaged over the neighboring pixels.
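The superpixel idea can be illustrated with a small lookup-table sketch. It rests on an assumption that is not stated in the text: the 16 pixels of a $4 \times 4$ superpixel are taken to contribute unit phasors with phases uniformly spaced over $2π$, and the row-major assignment of phases to pixels is arbitrary. The low-pass filter is modeled simply as the sum of the phasors of the pixels that are switched on. The names `phasors`, `fields`, and `superpixel_pattern` are hypothetical.

```python
import numpy as np

n = 4
n_pix = n * n
# Assumed phase prefactors: 16 unit phasors uniformly spaced over 2*pi.
phasors = np.exp(2j * np.pi * np.arange(n_pix) / n_pix)

# Lookup table of all 2**16 binary on/off patterns and the fields they produce.
patterns = (np.arange(2 ** n_pix)[:, None] >> np.arange(n_pix)) & 1
fields = patterns @ phasors                 # low-pass filter modeled as a sum
fields = fields / np.abs(fields).max()      # normalize the largest amplitude to 1

def superpixel_pattern(target):
    """Return the binary n x n pattern whose field is closest to `target`."""
    idx = np.argmin(np.abs(fields - target))
    return patterns[idx].reshape(n, n), fields[idx]

target = 0.5 * np.exp(1j * np.pi / 3)
pattern, realized = superpixel_pattern(target)
```

Precomputing `fields` once and searching it for each target is in the same spirit as the lookup tables mentioned for the FPGA board, although the actual hardware tables are not described here.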

2.3 Experimental Setup and Characterization

Figure 2(a) shows an implementation of CE-ORC based on a DMD. To manipulate both the amplitude and phase of the optical field with the DMD, we adopted the superpixel technique30,31 (Note 2 and Fig. S2 in the Supplemental Material) to modulate the incident laser. The superpixel technique loaded the complex information onto the first-order diffraction from the DMD. The specific form of the complex information depended on the input and the previous state. After being selected by a pinhole placed off-axis, the first-order diffraction was transmitted through the scattering medium that mimicked the reservoir. Finally, a detector recorded the speckle pattern as the current state. This process corresponds to one update of the optical reservoir and is repeated once for each reservoir state to be computed.

Figure 2. Experimental setup and characterization of the CE-ORC system. (a) Schematic illustration and physical demonstration of the CE-ORC implementation. M, mirror; $L_{1}$ and $L_{2}$, lenses; PF, pinhole filter; SM, scattering medium. (b) CE-ORC processing unit and dedicated FPGA board. (c) Distance matrix of the CE-ORC system with the encoding dimension of 10. (d) Correlation matrix of the responses in different ORC systems.

For demonstration, we constructed a CE-ORC processing unit with a size of 350 mm × 300 mm × 160 mm (see Sec. 2.4), as shown in Fig. 2(b). The optical components were mounted in a sealable frame, effectively reducing air disturbance and stray light and thereby ensuring system stability. To optimize the iteration rate of CE-ORC, we developed a dedicated FPGA board. This board optimized the data interaction between the DMD and the detector, eliminating the need for a computer relay. In addition, we converted some complex operations, such as encoding, into parallel lookup tables, further enhancing the processing speed of the unit. As a result, the CE-ORC processing unit completed 1000 iterations of computation in about 825 ms, corresponding to an iteration rate of $\sim 1.2$ kHz.

We evaluated the input resolution of the CE-ORC processing unit and compared it with that of the conventional basket-encoding ORC using the normalized distance matrix of the encoded information [Fig. 2(c)]. Each element of the distance matrix is the Euclidean distance between the encoded vectors corresponding to the horizontal and vertical coordinates. The distance matrix of the original data in real space is also shown in Fig. 2(c). In comparison, the distance matrix generated by CE-ORC shows a higher degree of similarity to the distance matrix of the original data. This indicates that CE-ORC faithfully expresses the differences between original data points in the optical field and has a high input resolution. Moreover, the input resolution of CE-ORC is almost independent of the scale factor because the normalized distance matrix does not change significantly as the scale factor decreases. In addition, we evaluated the effect of the improved input resolution on the optical reservoir using the correlation matrix of the speckle patterns for different inputs [Fig. 2(d)]. Speckle patterns captured by the detector are the responses of the optical reservoir to the inputs. In comparison, the responses in CE-ORC were richer than those in the conventional basket-encoding ORC, whose correlation matrix was highly fragmented. All this evidence supports that CE-ORC has richer optical reservoir states owing to the improved input resolution, which allows it to achieve better task performance than conventional ORC.
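The two diagnostics used above can be sketched as follows. The helper names are hypothetical, a phase-only encoding stands in for the actual CE-ORC encoding, and random arrays stand in for measured speckle images, so the numbers produced here do not reproduce Fig. 2.

```python
import numpy as np

def normalized_distance_matrix(vectors):
    """vectors: (M, D) real or complex array, one encoded sample per row."""
    diff = vectors[:, None, :] - vectors[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)   # Euclidean distance for each pair
    return dist / dist.max()

def correlation_matrix(patterns):
    """patterns: (M, P) array, one flattened speckle image per row."""
    return np.corrcoef(patterns)

x = np.linspace(-0.9, 0.9, 7)
D_real = normalized_distance_matrix(x[:, None])

# Stand-in encoding (assumption): phase-only map of x onto [0, pi].
encoded = np.exp(1j * 0.5 * (x + 1.0) * np.pi)[:, None]
D_enc = normalized_distance_matrix(encoded)

# One possible similarity score between the two distance matrices.
similarity = np.corrcoef(D_real.ravel(), D_enc.ravel())[0, 1]

rng = np.random.default_rng(0)
speckles = rng.random((7, 1000))   # random stand-ins for detector images
C = correlation_matrix(speckles)
```

A similarity close to 1 means that the encoding preserves the pairwise distances of the original data, which is the sense in which the text speaks of high input resolution.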

2.4 Details of Experimental Setup

We introduced a laser with a wavelength of 532 nm into the CE-ORC processing unit through a single-mode fiber. The lens $L_{0}$ ($f = 100$ mm) was used to collimate and expand the laser beam to cover the entire active area of the DMD chip (DLP9500, Texas Instruments). After being modulated by the DMD, the optical field passed through the lens $L_{1}$ ($f = 200$ mm) and the pinhole filter, which allowed only the first-order diffraction to pass. The first-order diffraction was then transmitted through the scattering medium. Behind the scattering medium, an objective (NA = 0.4, 20×) collected the scattered light and projected it onto a CMOS chip (PYTHON2000, Onsemi). The images captured by the CMOS chip were transmitted to the FPGA board for subsequent processing.

To optimize the computation speed of the CE-ORC processing unit, we eliminated unnecessary data transfer between the hardware and a computer during task execution and exploited the high-speed interfaces and parallel processing capability of FPGAs. The FPGA board communicated with the DMD control card and the CMOS control card through four SerDes interfaces that supported a transmission rate of up to 20 Gbps. To prevent computer processing from reducing the interaction efficiency between the DMD and the CMOS, we integrated the processing programs for CE-ORC into the FPGA board so that operations such as extracting the optical reservoir state, complex-value encoding, and superpixel encoding could be executed on the board. In addition, the FPGA board was equipped with 42 MB of BRAM and 16G of DDR4. The ample storage resources on the board enabled us to employ high-efficiency lookup tables, which significantly reduced the time spent on encoding operations. Moreover, the FPGA board supported 150 lookup tables working synchronously. As a result, the CE-ORC processing unit could complete 1000 computations in $\sim 825$ ms with 5000 nodes.

3 Results

3.1 MG Time Series Prediction

To confirm the utility of CE-ORC for time series analysis, we tested the CE-ORC processing unit on the MG time series data set (Note 3 in the Supplemental Material), which includes 8250 steps. The number of reservoir nodes was set to 512. The state of each node was encoded as a 10-dimensional complex vector so that the state vector of the reservoir had a dimension of $512 \times 10$. The input was encoded as a 128-dimensional complex vector, and this vector was then augmented by replicating it 40 times to match the dimension of the state vector. Matching the dimensions ensured that the input area was equal to the state area in the incident optical field, which protected the memory capacity of the reservoir.25 We tested both the one-step and free-running predictions (Note 4 and Fig. S3 in the Supplemental Material) with the usual settings of the delay $τ = 17$ and the Lyapunov exponent (Note 5 in the Supplemental Material) $Λ_{\max} = 0.006$.25,32 We fed the CE-ORC with the first 7600 steps for training and used the other 650 steps for testing.
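The dimension matching described above is simple bookkeeping: $128 \times 40 = 5120 = 512 \times 10$. The sketch below checks it with `np.tile`; the placeholder content of `encoded_input` is an assumption made only so that the array has the right shape.

```python
import numpy as np

n_nodes, n_state_bins = 512, 10      # reservoir nodes and encoding dimension per node
n_input_bins, n_copies = 128, 40     # input encoding dimension and replication count

state_dim = n_nodes * n_state_bins   # 5120 elements in the state area

# Placeholder encoded input (assumption): any 128-dimensional complex vector.
encoded_input = np.exp(1j * np.linspace(0.0, np.pi, n_input_bins))

# Replicate the encoded input so that the input area equals the state area.
augmented_input = np.tile(encoded_input, n_copies)

train_steps, test_steps = 7600, 650
total_steps = train_steps + test_steps
```

The same check applies to any other configuration: the input encoding dimension times the replication count should equal the number of nodes times the state encoding dimension.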

Figure 3. The horizontal coordinate representing time has been normalized with respect to the Lyapunov exponent $Λ_{\max}$. (a) Example of free-running prediction made by CE-ORC at optimal configuration ($s_{i} = 0.95$ and $s_{r} = 0.9$). (b) Comparison of NMSE in the free-running predictions given by the conventional basket-encoding ORC, and CE-ORC with different configurations of the scale factors. (c) Example of one-step prediction made by CE-ORC at $s_{i} = 0.95$ and $s_{r} = 0.9$. (d) Comparison of errors in the one-step predictions given by the conventional basket-encoding ORC and CE-ORC. Note: Free-running prediction [(a), (b)] and one-step prediction [(c), (d)] are different prediction modes and should not be directly compared.

We observed that the CE-ORC had the optimal prediction performance when $s_{i} = 0.95$, $s_{r} = 0.9$, and $a = 0.9$. Figure 3(a) shows one of the successful results in the free-running prediction, where the temporal axis is normalized by the maximal Lyapunov exponent. To evaluate the predictive performance, we compared the results obtained from the optimized CE-ORC, the unoptimized CE-ORC ($s_{i}$ and $s_{r}$ both set to 1), and the conventional basket-encoding ORC in terms of normalized mean squared errors (NMSEs) [Fig. 3(b)]. In all the comparison experiments in this paper, the conventional basket-encoding ORC was configured with the same reservoir size, input encoding dimension, and state encoding dimension as the CE-ORC to ensure a fair comparison. For illustration, we averaged the NMSEs over 50 repeated experiments. We found that the NMSE increased over time because of the cumulative effect of errors in free-running prediction. The NMSE obtained from the optimized CE-ORC was the smallest, and the unoptimized CE-ORC delivered a result between those of the optimized CE-ORC and the conventional basket-encoding ORC. Around two Lyapunov times, the NMSE obtained from the optimized CE-ORC was $\sim 75\%$ lower than that from the conventional basket-encoding ORC. These results imply that improved input resolution contributes to higher accuracy and that searching for suitable scale factors can improve the accuracy further.

In the one-step prediction, the outputs of the optimized CE-ORC were almost identical to the values given in the target data set [Fig. 3(c)]. The prediction errors from CE-ORC were much smaller than those from the conventional basket-encoding ORC [Fig. 3(d)]. Their corresponding NMSEs were $\sim 4.6 \times 10^{- 4}$ and $\sim 2.4 \times 10^{- 3}$, respectively; that is, the NMSE decreased by $\sim 80\%$. All these results indicate that the CE-ORC achieves higher prediction accuracy than the conventional ORC.
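The reported reduction follows directly from the two NMSE values: $1 - 4.6 \times 10^{- 4} / 2.4 \times 10^{- 3} \approx 0.81$. The sketch below reproduces this arithmetic and shows one common NMSE definition, the mean squared error divided by the variance of the target. That definition is an assumption here, since the normalization used in the paper is not spelled out in this section.

```python
import numpy as np

def nmse(target, prediction):
    """Mean squared error normalized by the target variance (assumed definition)."""
    target = np.asarray(target, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    return np.mean((target - prediction) ** 2) / np.var(target)

# One-step NMSE values reported in the text.
nmse_ce_orc = 4.6e-4
nmse_basket = 2.4e-3

# Relative decrease of the NMSE, about 0.81.
relative_decrease = 1.0 - nmse_ce_orc / nmse_basket
```

With this normalization, a predictor that always outputs the mean of the target has an NMSE of 1, which gives a convenient reference point for the values above.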

3.2 Weather Forecast

Furthermore, we tested the CE-ORC on weather data to demonstrate its capability in real-world applications. The weather data set was a 14,400 h record of temperature and humidity at Shanghai Hongqiao International Airport, starting from January 1, 2018. The reservoir contained 5000 nodes. The state of each node was encoded as a 10-dimensional complex vector so that the state vector of the reservoir had a dimension of $5000 \times 10$. To match the dimension of the state vector, we encoded the input as a 100-dimensional complex vector and then augmented it by replicating it 500 times.

In the free-running prediction, we used the first 14,280 steps for training and the other 120 steps for testing. We found that the CE-ORC processing unit reached the optimal prediction performance for temperature with $s_{i} = 0.7$, $s_{r} = 0.55$, and $a = 0.5$, and for humidity with $s_{i} = 0.8$, $s_{r} = 0.6$, and $a = 0.45$. Within the first 24 h, the predictions of CE-ORC were consistent with the real data [Figs. 4(a) and 4(b)]. The errors of the CE-ORC predictions were within 2°F for temperature and 5% for humidity. The predictions of CE-ORC gradually diverged from the true values after 24 h, but their trend remained consistent with that of the real data. In contrast, the prediction error of the conventional basket-encoding ORC was much larger than that of CE-ORC.

Figure 4. Test of CE-ORC for weather forecast. (a) Predictions of temperature made by the CE-ORC and conventional basket-encoding ORC in the free-running prediction mode. (b) Predictions of humidity made by the CE-ORC and conventional basket-encoding ORC in the free-running prediction mode. (c) Predictions of temperature made by the CE-ORC in the one-step prediction mode. (d) Errors in the predictions of humidity made by the CE-ORC in the one-step prediction mode.

In the one-step prediction, we fed the CE-ORC model with the first 13,400 steps for training and used the other 1000 steps for testing. The predictions of CE-ORC were almost the same as the real data [Figs. 4(c) and 4(d)] because the prediction errors did not accumulate. The prediction errors were within 2.5°F for temperature and 6% for humidity. These results show that, by tuning the reservoir dynamics, the CE-ORC achieves good accuracy in weather forecasting and adapts well to task changes.
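The difference between the two prediction modes can be illustrated with a toy example that is unrelated to the weather data. The "model" below is a deliberately imperfect stand-in (it multiplies by 0.91 where the true series multiplies by 0.9), and all names are hypothetical. In one-step mode each prediction starts from the true previous value, whereas in free-running mode each prediction is fed back as the next input, so errors accumulate.

```python
import numpy as np

def one_step(model, series):
    # Each prediction uses the true previous value as input.
    return np.array([model(x) for x in series[:-1]])

def free_running(model, start, n_steps):
    # Each prediction is fed back as the next input.
    preds, x = [], start
    for _ in range(n_steps):
        x = model(x)
        preds.append(x)
    return np.array(preds)

truth = 0.9 ** np.arange(21)        # toy series: x_{t+1} = 0.9 * x_t
model = lambda x: 0.91 * x          # slightly wrong stand-in predictor

err_one = np.abs(one_step(model, truth) - truth[1:])
err_free = np.abs(free_running(model, truth[0], 20) - truth[1:])
```

Both modes make the same error on the first step, but by the last step the free-running error is more than an order of magnitude larger than the one-step error in this toy case.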

4 Discussion and Conclusion

We proposed CE-ORC, which introduces complex-value encoding to spatial light-based ORC. The encoding enhances the input resolution and expands the system configurability through multiple hyperparameters (two scale factors and the leak rate), thereby enabling the adjustment of the optical reservoir dynamics and significantly improving prediction accuracy. Furthermore, the CE-ORC processing unit we constructed modulates the complex amplitude of the optical field using a DMD, facilitating rapid parallel encoding of large-scale information in the spatial domain. Controlled by a dedicated FPGA board, this unit reached an iteration rate of $\sim 1.2$ kHz by using parallel lookup tables for encoding operations and high-speed interfaces to reduce the information transfer time between the DMD and the detector. In our demonstration, we tested the CE-ORC on predicting the evolution of the chaotic MG system and on weather forecasting. For the MG data set, the CE-ORC significantly outperformed the conventional ORC in prediction accuracy. The NMSE decreased by $\sim 80\%$ in the one-step prediction and by $\sim 75\%$ in the free-running prediction around two Lyapunov times. For the weather forecast, the CE-ORC gave a reasonable prediction for the next 24 h. The prediction errors for temperature and humidity were less than 2°F and 5%, respectively.

The major factors constraining the scale of CE-ORC are the dimensions of the DMD and the superpixel encoding, which groups multiple neighboring DMD pixels into one superpixel to modulate both the amplitude and phase of the optical field. This implementation reduced the dimensions of the reservoir state and of the network input. The nonlinear activation of CE-ORC was simply the measurement of light intensity, which could be replaced by a more efficient optical nonlinear operation to further improve performance. When implementing CE-ORC, several important considerations should be taken into account. The size of the reservoir should be configured according to the complexity of the task. In general, more complex tasks require larger reservoirs. However, caution must be exercised to avoid overfitting, which can occur if the reservoir is excessively large relative to the task complexity. In addition, because pinholes are used in the CE-ORC optical setup, a detector with good sensitivity and a high signal-to-noise ratio is essential to maintain system performance and processing speed.

Jiamiao Yang received his PhD from Beijing Institute of Technology in 2015. He worked as a postdoctoral research associate in the Caltech Optical Imaging Laboratory at the California Institute of Technology until 2020. Currently, he is an associate professor at Shanghai Jiao Tong University. He is the author of more than 30 journal papers. His research focuses on wavefront shaping, optical measurement, and optical computing.

Biographies of the other authors are not available.

Category: Research Articles

Received: Apr. 15, 2024

Accepted: Sep. 24, 2024

Published Online: Oct. 24, 2024

The Author Email: Xunbin Wei (xwei@bjmu.edu.cn), Jiamiao Yang (jiamiaoyang@sjtu.edu.cn)

CSTR:32397.14.1.APN.3.6.066006