Chinese Optics Letters, Volume 19, Issue 8, 082501 (2021)

Optical tensor core architecture for neural network training based on dual-layer waveguide topology and homodyne detection

Shaofu Xu and Weiwen Zou*
Author Affiliations
  • State Key Laboratory of Advanced Optical Communication Systems and Networks, Intelligent Microwave Lightwave Integration Innovation Center (iMLic), Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
    Figures & Tables(4)
    (a) Schematic of the optical tensor core (OTC), shown at an example scale of 3 × 3. ILC, inter-layer coupler; Mod array, modulator array. (b) Detailed schematic of a DPU. A portion of the light is dropped from the bus waveguides. PS, phase shifter; DC, directional coupler. (c) Impulse response of the HDEA. The time constant τ of the circuit is defined as the time at which the voltage decays to 1/e of its peak. (d) An example of electron accumulation. Optical pulses arrive at the HDEA at intervals of 1/fm. The accumulation time is T.
    (a) The FC network. The matrix multiplications are implemented on the OTC. The ReLU after each layer is computed in auxiliary electronics. The output is the one-hot classification vector given by the softmax function. (b) The convolutional network. Convolutions are conducted on the OTC. Max pooling layers shrink the image size by half. All layers are ReLU-activated except for the pooling layers and the last layer.
    (a) Loss functions of the FC network during training. Results of the standard MBGD algorithm (Std. train) and the on-OTC training are illustrated. (b) The prediction accuracy of the FC network during training. The training accuracy and the testing accuracy of the standard MBGD algorithm are depicted without marks. The on-OTC training is depicted with marks. (c) Loss functions of the convolutional network during training. (d) The prediction accuracy of the convolutional network during training.
    Parameter visualization of the trained neural networks. (a) Trained parameters of the fourth layer in the FC network model. The standard-trained parameters are provided for reference, and the normalized deviation is depicted. (b) and (c) Distributions of trained parameters and deviations of the second and third layers of the FC network. The counts are normalized by the maximal counts. (d) Trained kernels of the first convolutional layer in the convolutional network. (e) and (f) Distributions of trained parameters and deviations of the first and second FC layers of the convolutional network. (b), (c), (e), and (f) share the same figure legends.
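
The electron accumulation described in the first caption (pulses arriving at intervals of 1/fm, an impulse response with time constant τ, and an accumulation time T) can be illustrated with a small numerical sketch. This is only an illustration under stated assumptions, not the authors' circuit model: it assumes a purely exponential impulse response exp(-t/τ), assumes that each detected pulse contributes a charge proportional to the product of one input value and one weight, and uses hypothetical values for x, w, f_m, and tau. The function name accumulated_voltage is likewise made up for this example.

```python
import math


def accumulated_voltage(amplitudes, f_m, tau):
    """Voltage at the arrival of the last pulse, assuming an exponential
    impulse response exp(-t / tau) (an assumption, not taken from the paper).

    Pulse k arrives at time k / f_m and has decayed by
    exp(-(t_read - k / f_m) / tau) when the voltage is read out.
    """
    n = len(amplitudes)
    t_read = (n - 1) / f_m  # read out when the last pulse arrives
    return sum(
        a * math.exp(-(t_read - k / f_m) / tau)
        for k, a in enumerate(amplitudes)
    )


# Hypothetical input vector and weights, chosen only for illustration.
x = [0.2, -0.5, 0.9, 0.4]
w = [1.0, 0.3, -0.7, 0.5]

# Assumed per-pulse detector output: the product of the two modulated values.
products = [xi * wi for xi, wi in zip(x, w)]
ideal = sum(products)  # the dot product the accumulation should approximate

f_m = 10e9  # hypothetical pulse rate in Hz
slow = accumulated_voltage(products, f_m, tau=1e-6)   # tau much longer than T
fast = accumulated_voltage(products, f_m, tau=1e-10)  # tau equal to the pulse interval

print(f"ideal dot product : {ideal:.6f}")
print(f"tau >> T          : {slow:.6f}")
print(f"tau ~ 1/f_m       : {fast:.6f}")
```

Under these assumptions, the accumulated voltage tracks the dot product closely only when τ is much longer than the accumulation time T; when τ is comparable to the pulse interval, early pulses decay before readout and the result is dominated by the last few terms.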

    Shaofu Xu, Weiwen Zou, "Optical tensor core architecture for neural network training based on dual-layer waveguide topology and homodyne detection," Chin. Opt. Lett. 19, 082501 (2021)

    Paper Information

    Category: Optoelectronics

    Received: Sep. 29, 2020

    Accepted: Dec. 21, 2020

    Published Online: Apr. 20, 2021

    The Author Email: Weiwen Zou (wzou@sjtu.edu.cn)

    DOI:10.3788/COL202119.082501

    Topics