Laser & Optoelectronics Progress, Vol. 60, Issue 22, 2220001 (2023)

Influence of Hyperparameters on Performance of Optical Neural Network Training Algorithms

Wen Cao, Meiyu Liu, Minghao Lu, Xiaofeng Shao, Qifa Liu, and Jin Wang*
Author Affiliations
  • School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, Jiangsu, China
    Figures & Tables (18)
    Two connection topologies of the Mach-Zehnder interferometer (MZI) mesh when the number of input ports is 8. (a) Connection topology of the FFT-type ONN; (b) connection topology of the grid-type ONN
    Flowchart of the ONN
    Flowchart of forward propagation and backward propagation in the ONN
    Accuracy of the ONN trained with the SGD algorithm under different momentum coefficients, for different nonlinear functions and numbers of hidden layers, at a learning rate of 0.05
    Accuracy of the ONN versus training epoch for different training algorithms at a learning rate of 0.05
    Accuracy of the ONN versus training epoch for different training algorithms at a learning rate of 0.005
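    The MZI meshes in the first figure have a compact linear-algebra description: each MZI is a 2×2 unitary acting on one pair of adjacent waveguides, and a mesh is a product of such unitaries. The sketch below is a minimal illustration, not the authors' code; it assumes one common MZI parameterization (internal phase θ, external phase φ, as in Clements-style rectangular meshes) and checks that a random 8-port grid mesh composes to a unitary matrix. The FFT-type topology differs only in which port pairs each layer couples (butterfly pairings instead of nearest neighbors).

```python
import numpy as np

def mzi(theta: float, phi: float) -> np.ndarray:
    """2x2 transfer matrix of a Mach-Zehnder interferometer with
    internal phase theta and external phase phi (one common convention)."""
    return 1j * np.exp(1j * theta / 2) * np.array(
        [[np.exp(1j * phi) * np.sin(theta / 2), np.cos(theta / 2)],
         [np.exp(1j * phi) * np.cos(theta / 2), -np.sin(theta / 2)]]
    )

def embed(T: np.ndarray, i: int, n: int) -> np.ndarray:
    """Embed a 2x2 MZI acting on waveguides (i, i+1) into an n x n identity."""
    U = np.eye(n, dtype=complex)
    U[i:i + 2, i:i + 2] = T
    return U

# Grid (rectangular) mesh on n = 8 ports: MZIs on even pairs, then odd pairs,
# repeated for n layers -- enough degrees of freedom for any 8x8 unitary.
n = 8
rng = np.random.default_rng(0)
U = np.eye(n, dtype=complex)
for layer in range(n):
    start = layer % 2                      # alternate even/odd pairings
    for i in range(start, n - 1, 2):
        U = embed(mzi(*rng.uniform(0, 2 * np.pi, 2)), i, n) @ U

print(np.allclose(U.conj().T @ U, np.eye(n)))  # True: the mesh is unitary
```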
    Table 1. Experimental platform parameters

      Operating system  | Windows 7 64 bit
      CPU               | Intel(R) Core(TM) i5-5200U CPU @ 2.20 GHz
      GPU               | GeForce 920M
      Software platform | PyTorch 1.1.0
    Table 2. Accuracy of ONN with four training algorithms under different learning rates (Softplus as the nonlinear function)

      Algorithm | R=0.5 | R=0.05 | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 0.091 | 0.960  | 0.941   | 0.879    | 0.459
      RMSprop   | 0.101 | 0.887  | 0.974   | 0.959    | 0.937
      Adam      | 0.101 | 0.919  | 0.973   | 0.960    | 0.942
      Adagrad   | 0.965 | 0.960  | 0.942   | 0.834    | 0.246
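    Tables 2-5 each report one accuracy per (algorithm, learning rate) cell. As a hedged sketch of how such a grid can be generated on the PyTorch platform of Table 1: the model below is an ordinary two-hidden-layer feedforward stand-in with Softplus and random placeholder data, not the authors' MZI-based ONN or their dataset, and the layer sizes and epoch count are illustrative only.

```python
import torch
import torch.nn as nn

def make_model() -> nn.Sequential:
    # Stand-in for the ONN: 64 inputs, two hidden layers, Softplus nonlinearity.
    return nn.Sequential(
        nn.Linear(64, 64), nn.Softplus(),
        nn.Linear(64, 64), nn.Softplus(),
        nn.Linear(64, 10),
    )

optimizers = {
    "SGD":     lambda p, r: torch.optim.SGD(p, lr=r),
    "RMSprop": lambda p, r: torch.optim.RMSprop(p, lr=r),
    "Adam":    lambda p, r: torch.optim.Adam(p, lr=r),
    "Adagrad": lambda p, r: torch.optim.Adagrad(p, lr=r),
}
rates = [0.5, 0.05, 0.005, 5e-4, 5e-5]

x = torch.randn(256, 64)                 # placeholder data
y = torch.randint(0, 10, (256,))         # placeholder labels
loss_fn = nn.CrossEntropyLoss()

for name, make_opt in optimizers.items():
    for r in rates:
        torch.manual_seed(0)             # identical init for every cell of the grid
        model = make_model()
        opt = make_opt(model.parameters(), r)
        for epoch in range(20):
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
        acc = (model(x).argmax(1) == y).float().mean().item()
        print(f"{name:8s} R={r:<7g} train acc={acc:.3f}")
```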
    Table 3. Accuracy of ONN with four training algorithms under different learning rates (ReLU as the nonlinear function)

      Algorithm | R=0.5 | R=0.05 | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 0.021 | 0.911  | 0.853   | 0.651    | 0.435
      RMSprop   | 0.109 | 0.919  | 0.935   | 0.901    | 0.671
      Adam      | 0.069 | 0.894  | 0.941   | 0.712    | 0.591
      Adagrad   | 0.934 | 0.910  | 0.644   | 0.264    | 0.132
    Table 4. Accuracy of ONN with four training algorithms without hidden layer 2 under different learning rates (Softplus as the nonlinear function)

      Algorithm | R=0.5 | R=0.05 | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 0.948 | 0.904  | 0.689   | 0.304    | 0.135
      RMSprop   | 0.202 | 0.927  | 0.956   | 0.907    | 0.648
      Adam      | 0.852 | 0.944  | 0.950   | 0.882    | 0.558
      Adagrad   | 0.961 | 0.939  | 0.824   | 0.288    | 0.079
    Table 5. Accuracy of ONN with four training algorithms without hidden layer 2 under different learning rates (ReLU as the nonlinear function)

      Algorithm | R=0.5 | R=0.05 | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 0.098 | 0.905  | 0.814   | 0.539    | 0.290
      RMSprop   | 0.168 | 0.924  | 0.933   | 0.904    | 0.565
      Adam      | 0.837 | 0.930  | 0.926   | 0.863    | 0.463
      Adagrad   | 0.923 | 0.916  | 0.745   | 0.218    | 0.162
    Table 6. Running memory of ONN with four training algorithms when the learning rate is 0.05 (Softplus as the nonlinear function; unit: kB)

      Algorithm | Two hidden layers | One hidden layer
      SGD       | 668.124           | 353.919
      RMSprop   | 684.356           | 333.144
      Adam      | 709.480           | 330.339
      Adagrad   | 717.371           | 336.585
    Table 7. Training time of ONN with four training algorithms when the learning rate is 0.05 (Softplus as the nonlinear function; unit: ms)

      Algorithm | Two hidden layers | One hidden layer
      SGD       | 449.995           | 183.853
      RMSprop   | 450.083           | 186.815
      Adam      | 447.009           | 156.496
      Adagrad   | 436.669           | 198.566
    Table 8. Running memory of ONN with four training algorithms under different learning rates (unit: kB)

      Algorithm | R=0.5   | R=0.05  | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 671.940 | 688.124 | 696.836 | 731.680  | 736.724
      RMSprop   | 720.472 | 684.356 | 656.360 | 726.204  | 732.252
      Adam      | 728.012 | 709.480 | 712.940 | 717.196  | 740.156
      Adagrad   | 709.597 | 717.371 | 709.236 | 723.026  | 735.144
    Table 9. Running memory of ONN with four training algorithms without hidden layer 2 under different learning rates (unit: kB)

      Algorithm | R=0.5   | R=0.05  | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 373.418 | 353.919 | 344.307 | 376.671  | 350.531
      RMSprop   | 339.749 | 333.144 | 367.170 | 363.365  | 333.895
      Adam      | 348.659 | 330.339 | 369.867 | 378.350  | 339.490
      Adagrad   | 359.918 | 336.585 | 373.809 | 366.658  | 369.934
    Table 10. Training time of ONN with four training algorithms under different learning rates (unit: ms)

      Algorithm | R=0.5   | R=0.05  | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 453.210 | 449.995 | 385.783 | 366.278  | 372.436
      RMSprop   | 398.588 | 450.083 | 377.086 | 445.179  | 388.789
      Adam      | 439.016 | 447.009 | 430.769 | 397.805  | 423.890
      Adagrad   | 351.312 | 436.669 | 424.947 | 400.590  | 426.575
    Table 11. Training time of ONN with four training algorithms without hidden layer 2 under different learning rates (unit: ms)

      Algorithm | R=0.5   | R=0.05  | R=0.005 | R=5×10⁻⁴ | R=5×10⁻⁵
      SGD       | 161.840 | 183.853 | 199.215 | 192.506  | 158.501
      RMSprop   | 172.940 | 186.815 | 193.341 | 177.395  | 153.627
      Adam      | 173.191 | 156.496 | 155.703 | 146.066  | 170.774
      Adagrad   | 165.250 | 198.566 | 170.376 | 185.795  | 190.148
    Table 12. Accuracy, running memory, and training time of ONN with the SGD algorithm under different momentum coefficients when the learning rate is 0.05

      Momentum coefficient | Accuracy | Running memory /kB | Training time /ms
      0                    | 0.960    | 668.124            | 449.995
      0.1                  | 0.962    | 602.880            | 380.814
      0.5                  | 0.964    | 614.468            | 402.444
      0.9                  | 0.969    | 625.308            | 403.404
      0.97                 | 0.943    | 627.606            | 394.597
      0.98                 | 0.099    | 645.633            | 409.608
      1.0                  | 0.098    | 633.696            | 396.202
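    Table 12 varies only SGD's momentum coefficient at R = 0.05: accuracy improves up to a coefficient of 0.9 and collapses near 1, consistent with heavy-ball updates diverging as the coefficient approaches 1. Below is a minimal hedged sketch of such a sweep, again with a small feedforward stand-in and random placeholder data rather than the authors' ONN.

```python
import torch
import torch.nn as nn

x = torch.randn(256, 64)                     # placeholder data
y = torch.randint(0, 10, (256,))             # placeholder labels
loss_fn = nn.CrossEntropyLoss()

for m in [0.0, 0.1, 0.5, 0.9, 0.97, 0.98, 1.0]:
    torch.manual_seed(0)                     # identical init for each momentum value
    model = nn.Sequential(nn.Linear(64, 64), nn.Softplus(), nn.Linear(64, 10))
    opt = torch.optim.SGD(model.parameters(), lr=0.05, momentum=m)
    for epoch in range(20):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    acc = (model(x).argmax(1) == y).float().mean().item()
    print(f"momentum={m:<5g} train acc={acc:.3f}")
```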
    Citation: Wen Cao, Meiyu Liu, Minghao Lu, Xiaofeng Shao, Qifa Liu, Jin Wang. Influence of Hyperparameters on Performance of Optical Neural Network Training Algorithms[J]. Laser & Optoelectronics Progress, 2023, 60(22): 2220001
    Paper Information

    Category: Optics in Computing

    Received: Jan. 30, 2023

    Accepted: Feb. 27, 2023

    Published Online: Nov. 6, 2023

    Corresponding author: Jin Wang (jinwang@njupt.edu.cn)

    DOI: 10.3788/LOP230535
