Photonics Research, Vol. 12, Issue 6, 1159 (2024)
Diffractive neural networks with improved expressive power for gray-scale image classification
Fig. 1. 2D image entropy distribution of all gray-scale samples in the MNIST and Fashion-MNIST data sets [51] and of binarized samples in the MNIST data set. The first two rows of the benchmark table list the performance of digital computer algorithms, namely the linear neural network (NN) and the convolutional neural network (CNN) [52]. The remaining rows list the performance of deep [23,53,54] and single-layer DNNs (SL-DNNs) in numerical simulations and experiments.
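The caption does not state how the 2D image entropy is computed. As a rough illustration only, the sketch below uses one common definition: the Shannon entropy of the joint histogram of each pixel's gray value and the mean gray value of its 3x3 neighborhood. The function name `image_entropy_2d`, the neighborhood size, the inclusion of the center pixel in the mean, and the edge padding are all assumptions and may differ from the authors' definition.

```python
import numpy as np

def image_entropy_2d(image, levels=256):
    """Hypothetical helper: 2D entropy in bits of an integer gray-scale image.

    Assumed definition: Shannon entropy of the joint distribution of
    (pixel value, floor of the 3x3 neighborhood mean), with edge padding.
    """
    img = np.asarray(image, dtype=np.int64)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    # Sum the 3x3 neighborhood (center included) using nine shifted views.
    neigh_sum = sum(padded[dy:dy + h, dx:dx + w]
                    for dy in range(3) for dx in range(3))
    neigh_mean = neigh_sum // 9
    # Joint histogram over (pixel value, neighborhood mean) pairs.
    joint = np.zeros((levels, levels), dtype=np.int64)
    np.add.at(joint, (img.ravel(), neigh_mean.ravel()), 1)
    p = joint[joint > 0] / img.size
    return float(-(p * np.log2(p)).sum())
```

Under this definition a uniform image has zero entropy. Binarizing an image reduces the number of distinct (value, neighborhood mean) pairs, which tends to lower the entropy compared with the gray-scale original.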
Fig. 2. Properties of the diffraction matrices
Fig. 3. Schematic and photo of the architecture of the multilayer DNN. It combines a DMD, multiple DNN cells, each of which contains a phase-only SLM and an NPBS, and a camera. An experimental setup for a double-layer DNN is shown. A linear polarizer, LP1, aligns the polarization direction of the light with the horizontal axis of the SLM panel. Another linear polarizer, LP2, serves as the analyzer. A half-wave plate, HW, is placed between LP1 and the DMD to increase the component of the light polarized along the desired direction.
Fig. 4. Simulation and experimental results for the gray-scale MNIST data set. (a) Images of MNIST handwritten digits are intensity-based, eight-level gray scale. Ten light intensity regions are manually selected. The target region with the maximum intensity determines the classification result. (b) The confusion matrix and energy distribution percentage for the numerical results of blind testing on 10,000 samples, with an accuracy of 97.90%. (c) The confusion matrix and energy distribution percentage for the experimental results. All 10,000 samples in the test set are tested, and the double-layer DNN achieves an accuracy of 95.10%.
Fig. 5. Simulation and experimental results for the Fashion-MNIST data set. (a) Images of Fashion-MNIST samples are intensity-based, eight-level gray scale. Ten light intensity regions are manually selected. The target region with the maximum intensity determines the classification result. (b) The confusion matrix and energy distribution percentage for the numerical results of blind testing on 10,000 samples, with an accuracy of 86.02%. (c) The confusion matrix and energy distribution percentage for the experimental results. All 10,000 samples in the test set are tested, and the double-layer DNN achieves an accuracy of 80.61%.
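The readout described in the captions of Figs. 4 and 5 (eight-level gray-scale input, ten detector regions, class given by the region with the maximum intensity) can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names `quantize_gray` and `classify_by_region`, the evenly spaced quantization levels, and the rectangular region layout are assumptions.

```python
import numpy as np

def quantize_gray(image, levels=8):
    """Quantize values in [0, 1] to `levels` evenly spaced gray levels (assumed scheme)."""
    image = np.clip(np.asarray(image, dtype=float), 0.0, 1.0)
    return np.round(image * (levels - 1)) / (levels - 1)

def classify_by_region(intensity, regions):
    """Return (predicted class, fraction of detected energy in each region).

    intensity: 2D array of intensity on the camera plane.
    regions: list of (row_slice, col_slice) pairs, one per class (hypothetical layout).
    """
    energy = np.array([intensity[r, c].sum() for r, c in regions])
    total = energy.sum()
    fractions = energy / total if total > 0 else np.zeros_like(energy)
    # The region collecting the most light gives the predicted class.
    return int(np.argmax(energy)), fractions

# Usage on a synthetic camera frame: ten 10x10 regions in a row, region 3 brightest.
regions = [(slice(5, 15), slice(10 * k, 10 * k + 10)) for k in range(10)]
frame = np.full((20, 100), 0.01)
frame[5:15, 30:40] = 1.0
label, fractions = classify_by_region(frame, regions)
```

The per-region energy fractions correspond to the kind of quantity an energy distribution percentage plot would summarize, averaged over test samples for each true class.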
Fig. 6. Performance of a double-layer DNN with different Fresnel numbers. (a) A double-layer DNN involves three segments of free-space diffraction. We set the first and the last diffraction processes to be the same, where
Minjia Zheng, Wenzhe Liu, Lei Shi, Jian Zi, "Diffractive neural networks with improved expressive power for gray-scale image classification," Photonics Res. 12, 1159 (2024)
Category: Image Processing and Image Analysis
Received: Nov. 29, 2023
Accepted: Feb. 29, 2024
Published Online: May 24, 2024
The Author Email: Wenzhe Liu (wliubh@connect.ust.hk), Lei Shi (lshi@fudan.edu.cn), Jian Zi (jzi@fudan.edu.cn)
CSTR:32188.14.PRJ.513845