Photonics Research, Vol. 12, Issue 6, 1159 (2024)

Diffractive neural networks with improved expressive power for gray-scale image classification

Minjia Zheng1... Wenzhe Liu2,5,*, Lei Shi1,2,3,4,6,*, and Jian Zi1,2,3,4,7,*
Author Affiliations
  • 1State Key Laboratory of Surface Physics, Key Laboratory of Micro- and Nano-Photonic Structures (Ministry of Education) and Department of Physics, Fudan University, Shanghai 200433, China
  • 2Institute for Nanoelectronic Devices and Quantum Computing, Fudan University, Shanghai 200433, China
  • 3Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
  • 4Shanghai Research Center for Quantum Sciences, Shanghai 201315, China
  • 5e-mail: wliubh@connect.ust.hk
  • 6e-mail: lshi@fudan.edu.cn
  • 7e-mail: jzi@fudan.edu.cn
    Figures & Tables(6)
    2D image entropy distribution of all gray-scale samples in the MNIST and Fashion-MNIST data sets [51] and of binarized samples in the MNIST data set. The first two rows of the benchmark table list the performance of digital computer algorithms, namely the linear neural network (NN) and the convolutional neural network (CNN) [52]. The remaining rows list the performance of deep [23,53,54] and single-layer DNNs (SL-DNNs) in numerical simulations and experiments.
    Properties of the diffraction matrices M for optical diffraction and DNNs. (a) Schematic view of free-space optical diffraction. Each element of M for optical diffraction is identical to the three other elements positioned symmetrically about the two diagonals (red dashed lines). (b) Schematic view of a single-layer DNN. One symmetry axis of the matrix elements, the diagonal from the bottom-left to the top-right of M, is disrupted. (c) Schematic view of a double-layer DNN. The last symmetry axis of the matrix elements, the diagonal from the top-left to the bottom-right of M, is disrupted. M has different properties for different Fresnel numbers F. When F is chosen optimally, the elements of M can take values independently and the DNN shows promising performance.
    Schematic and photo of the architecture of the multilayer DNN. It combines a DMD, multiple DNN cells, each of which contains a phase-only SLM and an NPBS, and a camera. An experimental setup for a double-layer DNN is shown. A linear polarizer, LP1, aligns the polarization direction of the light with the horizontal axis of the SLM panel. Another linear polarizer, LP2, serves as the analyzer. A half-wave plate, HW, is placed between LP1 and the DMD to increase the component of the light polarized along the desired direction.
    Simulation and experimental results for the gray-scale MNIST data set. (a) Images of MNIST handwritten digits are in intensity-based eight-level gray scale. Ten light intensity regions are manually selected. The target region with the maximum intensity determines the classification result. (b) The confusion matrix and energy distribution percentage for the numerical results of blind testing on 10,000 samples, with an accuracy of 97.90%. (c) The confusion matrix and energy distribution percentage for the experimental results. All 10,000 samples in the test set are tested, and the double-layer DNN achieves an accuracy of 95.10%.
    Simulation and experimental results for the Fashion-MNIST data set. (a) Images of Fashion-MNIST samples are in intensity-based eight-level gray scale. Ten light intensity regions are manually selected. The target region with the maximum intensity determines the classification result. (b) The confusion matrix and energy distribution percentage for the numerical results of blind testing on 10,000 samples, with an accuracy of 86.02%. (c) The confusion matrix and energy distribution percentage for the experimental results. All 10,000 samples in the test set are tested, and the double-layer DNN achieves an accuracy of 80.61%.
    Performance of a double-layer DNN with different Fresnel numbers. (a) A double-layer DNN contains three segments of free-space diffraction. We set the first and last diffraction processes to be the same, so that F1 = F3. The second diffraction process is described by F2. (b) Performance of the double-layer DNN for different combinations of F1 and F2.
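The symmetry statements in the caption on diffraction matrices can be checked numerically. The following is a minimal sketch in Python, not the authors' implementation. It assumes a 1D toy model (the layers in the paper are 2D), a discretized Fresnel-type kernel exp(iπF((i−j)/n)²) standing in for M, the same Fresnel number for every diffraction segment, and arbitrary phase values. All function names (`fresnel_matrix`, `apply_layer`, `is_symmetric`, `is_persymmetric`) are hypothetical.

```python
import cmath

def fresnel_matrix(n, fresnel_number):
    """Toy 1D free-space diffraction matrix (assumed kernel, not from the paper).

    Each entry depends only on (i - j)**2, which is what makes the matrix
    symmetric about both diagonals.
    """
    return [[cmath.exp(1j * cmath.pi * fresnel_number * ((i - j) / n) ** 2)
             for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply_layer(system, phases, diffraction):
    """Append a phase-only mask and one diffraction segment to `system`.

    Returns diffraction @ diag(exp(1j * phases)) @ system.
    """
    masked = [[cmath.exp(1j * p) * value for value in row]
              for p, row in zip(phases, system)]
    return matmul(diffraction, masked)

def is_symmetric(m, tol=1e-9):
    """Mirror symmetry about the top-left to bottom-right diagonal."""
    n = len(m)
    return all(abs(m[i][j] - m[j][i]) < tol
               for i in range(n) for j in range(n))

def is_persymmetric(m, tol=1e-9):
    """Mirror symmetry about the bottom-left to top-right diagonal."""
    n = len(m)
    return all(abs(m[i][j] - m[n - 1 - j][n - 1 - i]) < tol
               for i in range(n) for j in range(n))

n = 6
d = fresnel_matrix(n, fresnel_number=2.0)  # arbitrary illustrative F

# Arbitrary phase profiles (radians), chosen without mirror symmetry.
phases_1 = [0.3, 1.1, 2.0, 0.7, 2.9, 1.6]
phases_2 = [2.2, 0.4, 1.3, 2.7, 0.9, 1.8]

free_space = d                                         # D
single_layer = apply_layer(d, phases_1, d)             # D T1 D
double_layer = apply_layer(single_layer, phases_2, d)  # D T2 D T1 D

for name, m in [("free space", free_space),
                ("single layer", single_layer),
                ("double layer", double_layer)]:
    print(name, is_symmetric(m), is_persymmetric(m))
```

Under these assumptions, the free-space matrix passes both checks, the single-layer matrix keeps only the top-left to bottom-right symmetry, and the double-layer matrix passes neither, which mirrors panels (a) to (c). A palindromic phase profile would leave the single-layer matrix symmetric about both diagonals, so the phases here are deliberately chosen without mirror symmetry.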
    Citation

    Minjia Zheng, Wenzhe Liu, Lei Shi, Jian Zi, "Diffractive neural networks with improved expressive power for gray-scale image classification," Photonics Res. 12, 1159 (2024)

    Paper Information

    Category: Image Processing and Image Analysis

    Received: Nov. 29, 2023

    Accepted: Feb. 29, 2024

    Published Online: May 24, 2024

    Author Emails: Wenzhe Liu (wliubh@connect.ust.hk), Lei Shi (lshi@fudan.edu.cn), Jian Zi (jzi@fudan.edu.cn)

    DOI:10.1364/PRJ.513845

    CSTR:32188.14.PRJ.513845

    Topics