Efficient stochastic parallel gradient descent training for on-chip optical processor

Yuanjian Wan; Xudong Liu; Guangze Wu; Min Yang; Guofeng Yan; Yu Zhang; Jian Wang

doi:10.29026/oea.2024.230182

Opto-Electronic Advances, Volume 7, Issue 4, 230182 (2024)

Yuanjian Wan¹,²,†, Xudong Liu¹,²,†, Guangze Wu¹,², Min Yang¹,², Guofeng Yan¹,², Yu Zhang¹,², and Jian Wang¹,²,*

¹Wuhan National Laboratory for Optoelectronics and School of Optical and Electronic Information, Huazhong University of Science and Technology, Wuhan 430074, China

²Optics Valley Laboratory, Wuhan 430074, China

Figures & Tables(9)

Fig. 1. (a) Conceptual diagram of the on-chip optical processor for optical switching and channel descrambling in MDM communication systems. (b) Schematic configuration of the integrated reconfigurable optical processor. θ and ϕ denote the phase shifts applied by the phase shifters. MDM: mode-division multiplexing; MUX: multiplexer; DEMUX: demultiplexer.

Fig. 2. Flow chart of the stochastic parallel gradient descent (SPGD) algorithm.
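
The flow chart itself is not reproduced in this text, so the following is only a minimal sketch of the generic two-sided SPGD update as it is commonly described, not necessarily the authors' exact procedure. All names (`spgd_minimize`, `toy_cost`, `gain`, `delta`, `target`) are hypothetical, the cost is minimized by assumption, and the quadratic toy cost merely stands in for an evaluation function that would be measured from the chip's output powers. Each round evaluates three parameter settings (the two perturbed ones and the updated one), which is one plausible reading of the 3×T entry for SPGD in Table 1.

```python
import numpy as np

def spgd_minimize(cost, u0, gain, delta, rounds, seed=0):
    """Generic two-sided SPGD: perturb every parameter at once, compare two cost readings."""
    rng = np.random.default_rng(seed)
    u = np.asarray(u0, dtype=float).copy()
    history = []
    for _ in range(rounds):
        # Random +/-delta perturbation applied to all parameters in parallel.
        du = delta * rng.choice([-1.0, 1.0], size=u.shape)
        j_plus = cost(u + du)    # cost with the positive perturbation
        j_minus = cost(u - du)   # cost with the negative perturbation
        # Step against the estimated gradient direction (minimization assumed).
        u = u - gain * (j_plus - j_minus) * du
        history.append(cost(u))
    return u, history

# Toy stand-in for the evaluation function (assumption): distance to a target vector.
target = np.array([0.3, -1.2, 2.0, 0.7, -0.4, 1.5])

def toy_cost(u):
    return float(np.sum((u - target) ** 2))

u_final, history = spgd_minimize(toy_cost, np.zeros(6), gain=2.0, delta=0.1, rounds=500)
print(history[0], history[-1])
```

For this toy cost the chosen `gain` and `delta` keep every step non-increasing; on real hardware both would need tuning against measurement noise.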

Fig. 3. Training results on an electronic computer for optical switching, optical channel descrambling, and combined optical channel descrambling and switching. (a) Emulated light power distributions and (b) normalized light intensity distributions after training for the switching state I₁−O₂, I₂−O₁, I₃−O₅, I₄−O₆, I₅−O₃, I₆−O₄. (d, e) Normalized light intensity distributions (d) before and (e) after training, with a randomly generated set of phases in part (1) of the chip emulating crosstalk. (g, h) Normalized light intensity distributions (g) before and (h) after training with crosstalk for the switching state I₁−O₅, I₂−O₃, I₃−O₂, I₄−O₄, I₅−O₁, I₆−O₆. (c, f, i) Evaluation function versus iteration round.

Fig. 4. (a) Schematic of the experimental configuration. (b) Microscope image of the optical processor. VSA: voltage source array; PD: photodetector array.

Fig. 5. Online training results for optical switching at a wavelength of 1550 nm. (a) Evaluation function versus iteration round for the switching state I₁−O₃, I₂−O₁, I₃−O₄, I₄−O₆, I₅−O₂, I₆−O₅. The insets show the light power distributions at iteration rounds 50, 300, and 600, respectively. (b) Measured light power distributions after training. (c) Normalized light intensity distributions of the measured results. (d, e) Measured light power distributions and normalized light intensity distributions for the switching state I₁−O₃, I₂−O₆, I₃−O₄, I₄−O₂, I₅−O₁, I₆−O₅.

Fig. 6. Online training results for optical channel descrambling at a wavelength of 1550 nm. (a) Evaluation function versus iteration round. The insets show the light power distributions at iteration rounds 1, 300, and 600, respectively. (b) Light power distributions before training. (c) Light power distributions after training. (d, e) Training results for another generated matrix $\tilde{U}$.

Fig. 7. Online training results for optical channel descrambling and switching at a wavelength of 1550 nm. (a) Evaluation function versus iteration round for the switching state I₁−O₄, I₂−O₁, I₃−O₅, I₄−O₆, I₅−O₃, I₆−O₂. The insets show the light power distributions at iteration rounds 1, 100, and 400, respectively. (b) Light power distributions before training. (c) Light power distributions after training. (d, e) Training results for another generated matrix $\tilde{U}$ with the switching state I₁−O₅, I₂−O₃, I₃−O₁, I₄−O₆, I₅−O₂, I₆−O₄.

Fig. 8. Experimental setup and measured results for optical channel descrambling. (a) Experimental setup for the 6×6 optical descrambling system. (b) Measured BER performance for four cases: back-to-back, optimization without crosstalk, before optimization with crosstalk, and after optimization with crosstalk. (c) Measured constellation diagram for the back-to-back case. (d) Measured constellation diagram without crosstalk. (e) Measured constellation diagram before optimization with crosstalk. (f) Measured constellation diagram after optimization with crosstalk. PC: polarization controller; AWG: arbitrary waveform generator; EDFA: erbium-doped fiber amplifier; VOA: variable optical attenuator; OSC: oscilloscope; DSP: digital signal processing.

Table 1. Performance of different algorithms.

| Algorithm | Number of updates | Matrix size N=6 | Matrix size N=10 | Matrix size N=16 | Matrix size N=32 |
|---|---|---|---|---|---|
| GD | N(N−1)×T | 690 | 3870 | 13200 | 93248 |
| GA | M×T | 1048 | 9046.67 | 39732 | 171200 |
| PSO | M×T | 1024 | 5912 | 31056 | 116145 |
| SPGD | 3×T | 297.9 | 1092.6 | 4752.6 | 18053.1 |
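
As a worked reading of Table 1, the sketch below divides each tabulated total by the per-round update count from the "Number of updates" formula. It assumes, since this excerpt does not define it, that T is the (average) number of iteration rounds and that the tabulated values are total update counts; the variable names are hypothetical. GA and PSO are left out because M is not given here.

```python
# Values copied from Table 1; matrix size N maps to the tabulated update count.
gd_totals = {6: 690, 10: 3870, 16: 13200, 32: 93248}
spgd_totals = {6: 297.9, 10: 1092.6, 16: 4752.6, 32: 18053.1}

# Assumption: total = (updates per round) x T, so T = total / (updates per round).
gd_rounds = {n: total / (n * (n - 1)) for n, total in gd_totals.items()}
spgd_rounds = {n: total / 3 for n, total in spgd_totals.items()}

# Ratio of total updates, GD relative to SPGD, for each matrix size.
gd_over_spgd = {n: gd_totals[n] / spgd_totals[n] for n in gd_totals}

for n in gd_totals:
    print(n, gd_rounds[n], round(spgd_rounds[n], 1), round(gd_over_spgd[n], 2))
```

Under that assumption the GD entries correspond to whole numbers of rounds (23, 43, 55, and 94), while the SPGD entries imply more rounds but fewer total updates, because each SPGD round costs 3 updates instead of N(N−1).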

Yuanjian Wan, Xudong Liu, Guangze Wu, Min Yang, Guofeng Yan, Yu Zhang, Jian Wang. Efficient stochastic parallel gradient descent training for on-chip optical processor[J]. Opto-Electronic Advances, 2024, 7(4): 230182
