Advanced Photonics, Volume 7, Issue 1, 016004 (2025)

Training neural networks with end-to-end optical backpropagation

James Spall1,2,†, Xianxin Guo1,2,*, and Alexander I. Lvovsky1,2,*
Author Affiliations
  • 1University of Oxford, Clarendon Laboratory, Oxford, United Kingdom
  • 2Lumai Ltd., Wood Centre for Innovation, Oxford, United Kingdom
    Figures & Tables (5)
    Figure 1. Illustration of optical training. (a) Network architecture of the ONN used in this work, which consists of two fully connected linear layers and a hidden layer. (b) Simplified experimental schematic of the ONN. Each linear layer performs optical MVM with a cylindrical lens and an SLM that encodes the weight matrix. Hidden-layer activations are computed using SA in an atomic vapor cell. Light propagates in both directions during optical training. (c) Working principle of the SA activation. The forward beam (pump) is shown by solid red arrows and the backward beam (probe) by purple wavy arrows. The probe transmission depends on the strength of the pump and approximates the gradient of the SA function. For high forward intensity (top panel), a large fraction of the atoms is excited to the upper level; stimulated emission from these atoms largely compensates for the absorption due to the atoms in the ground level. For a weak pump (bottom panel), the excited-level population is small and the absorption is significant. (d) NN training procedure. (e) Optical training procedure. Signal and error propagation in the two directions are both implemented fully optically; the loss-function calculation and parameter update are left to electronics without interrupting the optical information flow.
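    The training procedure of panels (d) and (e) can be written out as a short digital sketch. The NumPy code below is a minimal stand-in, not the calibrated experimental system: the `sa` activation uses a generic two-level saturable-absorber transmission with assumed parameters `alpha` and `i_sat`, and the 2-5-2 layer sizes follow Table 1. The backward pass computes exactly the quantities the optics supply in hardware: the transposed matrix-vector product (MVM-2b) and the activation gradient delivered by the probe transmission.

```python
import numpy as np

def sa(z, alpha=2.0, i_sat=1.0):
    """Stand-in SA activation: a two-level absorber whose transmission
    saturates as the pump intensity |z|^2 grows (assumed model)."""
    return z * np.exp(-alpha / (1.0 + z**2 / i_sat))

def sa_grad(z, eps=1e-6):
    """Slope of the SA activation; in the experiment this is the quantity
    the weak backward probe measures through the pumped vapor cell."""
    return (sa(z + eps) - sa(z - eps)) / (2 * eps)

rng = np.random.default_rng(0)
w1 = rng.normal(scale=0.5, size=(5, 2))   # first layer: 2 inputs -> 5 hidden
w2 = rng.normal(scale=0.5, size=(2, 5))   # second layer: 5 hidden -> 2 outputs
lr = 0.01                                 # learning rate (Table 1, "Rings")

def train_step(x, target):
    """One forward/backward pass with an electronic parameter update."""
    global w1, w2
    z1 = w1 @ x                # MVM-1: first layer, forward
    a1 = sa(z1)                # SA activation in the vapor cell
    y = w2 @ a1                # MVM-2a: second layer, forward
    err2 = y - target          # output error (computed electronically)
    # MVM-2b: the error travels backward through the second layer; the
    # probe transmission multiplies it by the activation gradient.
    err1 = (w2.T @ err2) * sa_grad(z1)
    w2 -= lr * np.outer(err2, a1)   # electronic parameter updates
    w1 -= lr * np.outer(err1, x)
    return 0.5 * np.sum(err2**2)    # quadratic loss (assumed form)
```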
    Figure 2. Multilayer ONN characterization. (a) Scatterplots of measured-against-theory results for MVM-1 (first layer, forward), MVM-2a (second layer, forward), and MVM-2b (second layer, backward). All three MVM results are taken simultaneously. Histograms of the signal and noise error for each MVM are displayed underneath. (b) First-layer activations a_meas^(1) measured after the vapor cell, plotted against the theoretically expected linear MVM-1 output z_theory^(1) before the cell. The green line is a best-fit curve of the theoretical SA nonlinear function. (c) Amplitude of a weak constant probe passed backward through the vapor cell, as a function of the pump z_theory^(1). Measurements for the forward and backward beams are taken simultaneously.
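    Panel (c) illustrates that the weak-probe transmission approximates the gradient of the SA function; this can be checked numerically with the same assumed two-level model used in the sketch above (`alpha` and `i_sat` are illustrative values, not the calibrated cell parameters):

```python
import numpy as np

alpha, i_sat = 2.0, 1.0
z = np.linspace(0.0, 3.0, 200)                    # pump amplitude (a.u.)
t_pump = np.exp(-alpha / (1.0 + z**2 / i_sat))    # saturated transmission
a1 = z * t_pump                                   # SA activation output

exact_grad = np.gradient(a1, z)    # true slope of the activation
probe = t_pump                     # what a weak constant probe transmits

# The two curves track each other closely for this model.
print(np.corrcoef(exact_grad, probe)[0, 1])
```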
    Figure 3. Optical training performance. (a) Decision-boundary charts of the ONN inference output for three different classification tasks, after the ONN has been trained optically (top) or in silico (bottom). (b) Learning curves of the ONN for classification of the “Rings” dataset, showing the mean and standard deviation of the validation loss and accuracy averaged over five repeated training runs. Shown above are decision-boundary charts of the ONN output for the test set after different epochs. (c) Evolution of the output neuron values and output errors for the training-set inputs of the two classes. (d) Comparison between optically measured and digitally calculated gradients. Each panel shows the gradients for one of the 10 weight matrix elements.
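    The digitally calculated gradients in panel (d) follow from ordinary backpropagation on the 2-5-2 model. Reusing `sa`, `sa_grad`, `w1`, and `w2` from the sketch under Figure 1, a finite-difference check (purely illustrative) confirms that the outer-product expression used there is the true loss gradient:

```python
x, target = np.array([0.3, -0.7]), np.array([1.0, 0.0])

def loss(w1_, w2_):
    y = w2_ @ sa(w1_ @ x)
    return 0.5 * np.sum((y - target) ** 2)

# Backprop (chain-rule) gradient of the loss with respect to w1.
z1 = w1 @ x
err2 = w2 @ sa(z1) - target
analytic = np.outer((w2.T @ err2) * sa_grad(z1), x)

# Brute-force finite differences over all 10 elements of w1.
eps = 1e-6
numeric = np.zeros_like(w1)
for i in range(w1.shape[0]):
    for j in range(w1.shape[1]):
        dw = np.zeros_like(w1)
        dw[i, j] = eps
        numeric[i, j] = (loss(w1 + dw, w2) - loss(w1 - dw, w2)) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))   # tiny, ~1e-8 or below
```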
    • Table 1. Summary of network architecture and hyperparameters used in both optical and digital training. (A training-loop sketch using the “Rings” settings appears after Table 2.)

      Dataset | Input neurons | Hidden neurons | Output neurons | Learning rate | Epochs | Batches per epoch | Batch size
      Rings   | 2             | 5              | 2              | 0.01          | 16     | 20                | 20
      XOR     | 2             | 5              | 2              | 0.005         | 30     | 20                | 20
      Arches  | 2             | 5              | 2              | 0.01          | 25     | 20                | 20
    • Table 2. Generalization of the optical training scheme.

      Network layer   | Function       | Implementation example
      Linear layer    | MVM            | Free-space optical multiplier; photonic crossbar array
      Linear layer    | Diffraction    | Programmable optical mask
      Linear layer    | Convolution    | Lens Fourier transform
      Nonlinear layer | SA             | Atomic vapor cell; semiconductor absorber; graphene
      Nonlinear layer | Saturable gain | EDFA; SOA; Raman amplifier
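    As a usage note, the Table 1 hyperparameters map directly onto the training-step sketch under Figure 1: the “Rings” row calls for 16 epochs of 20 batches of 20 samples at a learning rate of 0.01. The dataset generator below is a hypothetical stand-in (one class inside a ring, one outside); the paper's actual samples are not reproduced here, `train_step` and the imports come from the earlier sketch, and updates are applied per sample for brevity rather than averaged over each batch.

```python
rng = np.random.default_rng(2)

def make_rings(n):
    """Hypothetical 'Rings'-style data: label by distance from the origin."""
    x = rng.uniform(-1.0, 1.0, size=(n, 2))
    labels = (np.linalg.norm(x, axis=1) > 0.5).astype(int)
    return x, np.eye(2)[labels]          # one-hot targets, 2 output neurons

lr, epochs, batches, batch_size = 0.01, 16, 20, 20   # Table 1, "Rings" row
for epoch in range(epochs):
    for _ in range(batches):
        xb, tb = make_rings(batch_size)
        for x_i, t_i in zip(xb, tb):     # per-sample updates (simplification)
            train_step(x_i, t_i)
```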
    Citation

    James Spall, Xianxin Guo, and Alexander I. Lvovsky, “Training neural networks with end-to-end optical backpropagation,” Adv. Photon. 7, 016004 (2025).
    Paper Information

    Category: Research Articles

    Received: Aug. 7, 2024

    Accepted: Dec. 11, 2024

    Posted: Dec. 11, 2024

    Published Online: Feb. 10, 2025

    Author emails: Xianxin Guo (xianxin.guo@lumai.co.uk), Alexander I. Lvovsky (alex.lvovsky@physics.ox.ac.uk)

    DOI: 10.1117/1.AP.7.1.016004

    CSTR: 32187.14.1.AP.7.1.016004
