Laser & Optoelectronics Progress, Vol. 60, Issue 22, 2220001 (2023)

Influence of Hyperparameters on Performance of Optical Neural Network Training Algorithms

Wen Cao, Meiyu Liu, Minghao Lu, Xiaofeng Shao, Qifa Liu, and Jin Wang*
Author Affiliations
  • School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, Jiangsu, China
    Figures & Tables (18)
    Two connection topologies of the Mach-Zehnder interferometer (MZI) mesh when the number of input ports is 8. (a) Connection topology of the FFT-type ONN; (b) connection topology of the grid-type ONN
    Flowchart of the ONN
    Flowchart of forward propagation and backward propagation in the ONN
    Accuracy of the ONN trained with the SGD algorithm under different momentum coefficients, for different nonlinear functions and numbers of hidden layers, at a learning rate of 0.05
    Accuracy of the ONN versus training epoch for different training algorithms at a learning rate of 0.05
    Accuracy of the ONN versus training epoch for different training algorithms at a learning rate of 0.005
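    The MZI meshes in the first figure have a compact linear-algebra description: each MZI is a 2×2 unitary acting on one pair of adjacent waveguides, and a mesh is a product of such unitaries. The sketch below is a minimal illustration, not the authors' code; it assumes one common MZI parameterization (internal phase θ, external phase φ, as in Clements-style rectangular meshes) and checks that a random 8-port grid mesh composes to a unitary matrix. The FFT-type topology differs only in which port pairs each layer couples (butterfly pairings instead of nearest neighbors).

```python
import numpy as np

def mzi(theta: float, phi: float) -> np.ndarray:
    """2x2 transfer matrix of a Mach-Zehnder interferometer with
    internal phase theta and external phase phi (one common convention)."""
    return 1j * np.exp(1j * theta / 2) * np.array(
        [[np.exp(1j * phi) * np.sin(theta / 2), np.cos(theta / 2)],
         [np.exp(1j * phi) * np.cos(theta / 2), -np.sin(theta / 2)]]
    )

def embed(T: np.ndarray, i: int, n: int) -> np.ndarray:
    """Embed a 2x2 MZI acting on waveguides (i, i+1) into an n x n identity."""
    U = np.eye(n, dtype=complex)
    U[i:i + 2, i:i + 2] = T
    return U

# Grid (rectangular) mesh on n = 8 ports: MZIs on even pairs, then odd pairs,
# repeated for n layers -- enough degrees of freedom for any 8x8 unitary.
n = 8
rng = np.random.default_rng(0)
U = np.eye(n, dtype=complex)
for layer in range(n):
    start = layer % 2                      # alternate even/odd pairings
    for i in range(start, n - 1, 2):
        U = embed(mzi(*rng.uniform(0, 2 * np.pi, 2)), i, n) @ U

print(np.allclose(U.conj().T @ U, np.eye(n)))  # True: the mesh is unitary
```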
    Table 1. Experimental platform parameters

      Operating system  | Windows 7 64 bit
      CPU               | Intel(R) Core(TM) i5-5200U CPU @ 2.20 GHz
      GPU               | GeForce 920M
      Software platform | PyTorch 1.1.0
    Table 2. Accuracy of ONN with four training algorithms under different learning rates (Softplus as the nonlinear function)

      Algorithm | R=0.5 | R=0.05 | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 0.091 | 0.960  | 0.941   | 0.879    | 0.459
      RMSprop   | 0.101 | 0.887  | 0.974   | 0.959    | 0.937
      Adam      | 0.101 | 0.919  | 0.973   | 0.960    | 0.942
      Adagrad   | 0.965 | 0.960  | 0.942   | 0.834    | 0.246
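    Tables 2-5 each report one accuracy per (algorithm, learning rate) cell. As a hedged sketch of how such a grid can be generated on the PyTorch platform of Table 1: the model below is an ordinary two-hidden-layer feedforward stand-in with Softplus and random placeholder data, not the authors' MZI-based ONN or their dataset, and the layer sizes and epoch count are illustrative only.

```python
import torch
import torch.nn as nn

def make_model() -> nn.Sequential:
    # Stand-in for the ONN: 64 inputs, two hidden layers, Softplus nonlinearity.
    return nn.Sequential(
        nn.Linear(64, 64), nn.Softplus(),
        nn.Linear(64, 64), nn.Softplus(),
        nn.Linear(64, 10),
    )

optimizers = {
    "SGD":     lambda p, r: torch.optim.SGD(p, lr=r),
    "RMSprop": lambda p, r: torch.optim.RMSprop(p, lr=r),
    "Adam":    lambda p, r: torch.optim.Adam(p, lr=r),
    "Adagrad": lambda p, r: torch.optim.Adagrad(p, lr=r),
}
rates = [0.5, 0.05, 0.005, 5e-4, 5e-5]

x = torch.randn(256, 64)                 # placeholder data
y = torch.randint(0, 10, (256,))         # placeholder labels
loss_fn = nn.CrossEntropyLoss()

for name, make_opt in optimizers.items():
    for r in rates:
        torch.manual_seed(0)             # identical init for every cell of the grid
        model = make_model()
        opt = make_opt(model.parameters(), r)
        for epoch in range(20):
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
        acc = (model(x).argmax(1) == y).float().mean().item()
        print(f"{name:8s} R={r:<7g} train acc={acc:.3f}")
```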
    Table 3. Accuracy of ONN with four training algorithms under different learning rates (ReLU as the nonlinear function)

      Algorithm | R=0.5 | R=0.05 | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 0.021 | 0.911  | 0.853   | 0.651    | 0.435
      RMSprop   | 0.109 | 0.919  | 0.935   | 0.901    | 0.671
      Adam      | 0.069 | 0.894  | 0.941   | 0.712    | 0.591
      Adagrad   | 0.934 | 0.910  | 0.644   | 0.264    | 0.132
    Table 4. Accuracy of ONN with four training algorithms without hidden layer 2 under different learning rates (Softplus as the nonlinear function)

      Algorithm | R=0.5 | R=0.05 | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 0.948 | 0.904  | 0.689   | 0.304    | 0.135
      RMSprop   | 0.202 | 0.927  | 0.956   | 0.907    | 0.648
      Adam      | 0.852 | 0.944  | 0.950   | 0.882    | 0.558
      Adagrad   | 0.961 | 0.939  | 0.824   | 0.288    | 0.079
    Table 5. Accuracy of ONN with four training algorithms without hidden layer 2 under different learning rates (ReLU as the nonlinear function)

      Algorithm | R=0.5 | R=0.05 | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 0.098 | 0.905  | 0.814   | 0.539    | 0.290
      RMSprop   | 0.168 | 0.924  | 0.933   | 0.904    | 0.565
      Adam      | 0.837 | 0.930  | 0.926   | 0.863    | 0.463
      Adagrad   | 0.923 | 0.916  | 0.745   | 0.218    | 0.162
    Table 6. Running memory of ONN with four training algorithms when the learning rate is 0.05 (Softplus as the nonlinear function; unit: kB)

      Algorithm | Two hidden layers | One hidden layer
      SGD       | 668.124           | 353.919
      RMSprop   | 684.356           | 333.144
      Adam      | 709.480           | 330.339
      Adagrad   | 717.371           | 336.585
    Table 7. Training time of ONN with four training algorithms when the learning rate is 0.05 (Softplus as the nonlinear function; unit: ms)

      Algorithm | Two hidden layers | One hidden layer
      SGD       | 449.995           | 183.853
      RMSprop   | 450.083           | 186.815
      Adam      | 447.009           | 156.496
      Adagrad   | 436.669           | 198.566
    Table 8. Running memory of ONN with four training algorithms under different learning rates (unit: kB)

      Algorithm | R=0.5   | R=0.05  | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 671.940 | 688.124 | 696.836 | 731.680  | 736.724
      RMSprop   | 720.472 | 684.356 | 656.360 | 726.204  | 732.252
      Adam      | 728.012 | 709.480 | 712.940 | 717.196  | 740.156
      Adagrad   | 709.597 | 717.371 | 709.236 | 723.026  | 735.144
    Table 9. Running memory of ONN with four training algorithms without hidden layer 2 under different learning rates (unit: kB)

      Algorithm | R=0.5   | R=0.05  | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 373.418 | 353.919 | 344.307 | 376.671  | 350.531
      RMSprop   | 339.749 | 333.144 | 367.170 | 363.365  | 333.895
      Adam      | 348.659 | 330.339 | 369.867 | 378.350  | 339.490
      Adagrad   | 359.918 | 336.585 | 373.809 | 366.658  | 369.934
    Table 10. Training time of ONN with four training algorithms under different learning rates (unit: ms)

      Algorithm | R=0.5   | R=0.05  | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 453.210 | 449.995 | 385.783 | 366.278  | 372.436
      RMSprop   | 398.588 | 450.083 | 377.086 | 445.179  | 388.789
      Adam      | 439.016 | 447.009 | 430.769 | 397.805  | 423.890
      Adagrad   | 351.312 | 436.669 | 424.947 | 400.590  | 426.575
    Table 11. Training time of ONN with four training algorithms without hidden layer 2 under different learning rates (unit: ms)

      Algorithm | R=0.5   | R=0.05  | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 161.840 | 183.853 | 199.215 | 192.506  | 158.501
      RMSprop   | 172.940 | 186.815 | 193.341 | 177.395  | 153.627
      Adam      | 173.191 | 156.496 | 155.703 | 146.066  | 170.774
      Adagrad   | 165.250 | 198.566 | 170.376 | 185.795  | 190.148
    Table 12. Accuracy, running memory, and training time of ONN with the SGD algorithm under different momentum coefficients when the learning rate is 0.05

      Momentum coefficient | Accuracy | Running memory /kB | Training time /ms
      0                    | 0.960    | 668.124            | 449.995
      0.1                  | 0.962    | 602.880            | 380.814
      0.5                  | 0.964    | 614.468            | 402.444
      0.9                  | 0.969    | 625.308            | 403.404
      0.97                 | 0.943    | 627.606            | 394.597
      0.98                 | 0.099    | 645.633            | 409.608
      1.0                  | 0.098    | 633.696            | 396.202
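    Table 12 varies only SGD's momentum coefficient at R = 0.05: accuracy improves up to a coefficient of 0.9 and collapses near 1, consistent with heavy-ball updates diverging as the coefficient approaches 1. Below is a minimal hedged sketch of such a sweep, again with a small feedforward stand-in and random placeholder data rather than the authors' ONN.

```python
import torch
import torch.nn as nn

x = torch.randn(256, 64)                     # placeholder data
y = torch.randint(0, 10, (256,))             # placeholder labels
loss_fn = nn.CrossEntropyLoss()

for m in [0.0, 0.1, 0.5, 0.9, 0.97, 0.98, 1.0]:
    torch.manual_seed(0)                     # identical init for each momentum value
    model = nn.Sequential(nn.Linear(64, 64), nn.Softplus(), nn.Linear(64, 10))
    opt = torch.optim.SGD(model.parameters(), lr=0.05, momentum=m)
    for epoch in range(20):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    acc = (model(x).argmax(1) == y).float().mean().item()
    print(f"momentum={m:<5g} train acc={acc:.3f}")
```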
    Citation: Wen Cao, Meiyu Liu, Minghao Lu, Xiaofeng Shao, Qifa Liu, Jin Wang. Influence of Hyperparameters on Performance of Optical Neural Network Training Algorithms[J]. Laser & Optoelectronics Progress, 2023, 60(22): 2220001
    Paper Information

    Category: Optics in Computing

    Received: Jan. 30, 2023

    Accepted: Feb. 27, 2023

    Published Online: Nov. 6, 2023

    Corresponding author: Jin Wang (jinwang@njupt.edu.cn)

    DOI: 10.3788/LOP230535
