Opto-Electronic Advances, Vol. 7, Issue 2, 230005-1 (2024)
Pluggable multitask diffractive neural networks based on cascaded metasurfaces
Optical neural networks offer significant advantages in power consumption, parallelism, and computing speed, which have attracted extensive attention in both academic and engineering communities. They are considered a powerful tool for advancing image processing and object recognition. However, existing optical system architectures cannot be reconfigured to realize multi-functional artificial intelligence systems. To address this issue, we propose pluggable diffractive neural networks (P-DNN), a general paradigm based on cascaded metasurfaces, which can be applied to various recognition tasks by switching internal plug-ins. As a proof of principle, the recognition of six types of handwritten digits and six types of fashion products is numerically simulated and experimentally demonstrated in the near-infrared regime. The proposed paradigm not only improves the flexibility of optical neural networks but also paves a new route toward high-speed, low-power, and versatile artificial intelligence systems.
Introduction
Deep learning is a form of machine learning that attempts to imitate the principles of the human brain for data interpretation. It has been applied to many specific tasks, including image classification
In the last several decades, optical neural networks (ONNs)
Currently, D2NN has been widely studied for various tasks, including broadband pulse shaping
Here, we propose pluggable diffractive neural networks (P-DNN), which switch between recognition tasks, such as handwritten digits and fashion products, by swapping the pluggable components in the network. This improves the flexibility of network design while effectively reducing the consumption of computing resources and training time. We designed two-layer cascaded metasurfaces
Results and discussion
The framework of the proposed P-DNN can be used to recognize various types of datasets, such as handwritten digits and fashion products, as shown in Figure 1.
The P-DNN system is trained based on optical diffraction theory, following the Huygens-Fresnel principle and the Rayleigh-Sommerfeld diffraction theory.
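For orientation, the Rayleigh-Sommerfeld formulation commonly used in the diffractive deep neural network literature can be written as below. This is a sketch of the standard form, not necessarily the authors' exact notation: the symbols w, n, t, r and the indices i, k, l are assumptions introduced here. Each neuron i of layer l is treated as a secondary source, and phase-only modulation corresponds to a unit-amplitude transmission coefficient.

```latex
% Secondary wave emitted by neuron i of layer l, located at (x_i, y_i, z_i)
w_i^{l}(x,y,z) = \frac{z - z_i}{r^{2}}
  \left(\frac{1}{2\pi r} + \frac{1}{j\lambda}\right)
  \exp\!\left(\frac{j 2\pi r}{\lambda}\right),
\qquad
r = \sqrt{(x-x_i)^2 + (y-y_i)^2 + (z-z_i)^2}

% Output of the same neuron: secondary wave, times the neuron's transmission,
% times the summed field arriving from the previous layer
n_i^{l}(x,y,z) = w_i^{l}(x,y,z)\, t_i^{l}(x_i,y_i,z_i) \sum_k n_k^{l-1}(x_i,y_i,z_i),
\qquad
t_i^{l} = \exp\!\left(j\phi_i^{l}\right) \quad \text{(phase-only modulation)}
```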
We developed an optimization workflow based on transfer learning. The training criterion is to maximize the normalized signal in the detection region corresponding to each class while minimizing the total signal outside the detection regions. Considering experimental feasibility, a phase-only modulation method was adopted, and the sigmoid function was used to restrict the phase parameters to the range of 0–2π. During optimization, we used a mean squared error (MSE) loss function to evaluate the difference between the output light intensity in the detector regions and the ground truth. According to the loss function, the phase parameters are optimized iteratively using stochastic gradient descent and the error backpropagation algorithm. More detailed model training and derivation are described in the Methods section and Supplementary information.
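The forward model and loss described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: free-space propagation uses the angular spectrum method as a numerical stand-in for the Rayleigh-Sommerfeld integral, the detector layout in `detector_masks` is hypothetical, and the 500 μm layer spacing given in the text is reused for the input and detector planes. The wavelength (800 nm), pixel size (5 μm) and grid (100×100) are taken from the text.

```python
import numpy as np

WAVELENGTH = 800e-9   # m, from the text
PIXEL = 5e-6          # m, super-pixel side length from the text
N = 100               # pixels per side, from the text
DISTANCE = 500e-6     # m, layer spacing from the text; reused for the
                      # input and detector planes (assumption)

def constrained_phase(raw):
    """Map unconstrained trainable values to phases in (0, 2*pi) via a sigmoid."""
    return 2.0 * np.pi / (1.0 + np.exp(-raw))

def propagate(field, distance):
    """Free-space propagation by the angular spectrum method."""
    fx = np.fft.fftfreq(field.shape[0], d=PIXEL)
    fxx, fyy = np.meshgrid(fx, fx, indexing="ij")
    arg = 1.0 / WAVELENGTH**2 - fxx**2 - fyy**2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    transfer = np.where(arg > 0, np.exp(1j * kz * distance), 0.0)  # drop evanescent waves
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

def detector_masks(n_classes=6, size=10):
    """Hypothetical layout: six disjoint square regions on a 2 x 3 grid."""
    masks = np.zeros((n_classes, N, N), dtype=bool)
    rows, cols = (30, 70), (20, 50, 80)
    for k in range(n_classes):
        r, c = rows[k // 3], cols[k % 3]
        masks[k, r - size // 2:r + size // 2, c - size // 2:c + size // 2] = True
    return masks

def forward(image, raw_shared, raw_plugin, masks):
    """Input amplitude -> shared layer -> plug-in layer -> detector signals.

    Signals are normalized by the total output intensity, so energy that
    lands outside every detector region lowers all of them.
    """
    field = image.astype(complex)
    for raw in (raw_shared, raw_plugin):
        field = propagate(field, DISTANCE) * np.exp(1j * constrained_phase(raw))
    intensity = np.abs(propagate(field, DISTANCE)) ** 2
    signals = np.array([intensity[m].sum() for m in masks])
    return signals / intensity.sum()

def mse_loss(signals, label):
    """Mean squared error between detector signals and a one-hot target."""
    target = np.zeros_like(signals)
    target[label] = 1.0
    return np.mean((signals - target) ** 2)

rng = np.random.default_rng(0)
image = rng.random((N, N))               # stand-in for an input image
raw_shared = rng.normal(size=(N, N))     # untrained shared-layer parameters
raw_plugin = rng.normal(size=(N, N))     # untrained plug-in parameters
masks = detector_masks()
signals = forward(image, raw_shared, raw_plugin, masks)
loss = mse_loss(signals, label=3)
```

In an actual training run, the gradient of `loss` with respect to `raw_shared` and `raw_plugin` would come from an automatic differentiation framework; the sketch only evaluates the forward pass.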
More specifically, the P-DNN training process can be divided into two steps (Figure 2).
According to the multi-task P-DNN design process, three phase profiles are optimized using the handwritten digit and fashion datasets, corresponding to one sharing layer and two classification layers, respectively. As a proof of concept, cascaded metasurfaces are designed to realize the corresponding phase modulation. The metasurfaces are composed of deliberately designed rectangular amorphous silicon nanofins on a glass substrate (Figure 3).
The rigorous coupled wave analysis (RCWA) simulation method was used to design and optimize the amorphous silicon nanofins by calculating the electromagnetic response of each periodic array of nanostructures. The height of the nanofins is fixed at 600 nm and the period is 500 nm in the x and y directions. The incident wavelength is 800 nm. We then obtained the simulated magnitudes of the cross- and co-polarized transmission coefficients for circularly polarized light by sweeping the length and width of the nanofins from 70 nm to 300 nm in 5 nm steps.
We built an experimental system to demonstrate the performance of the metasurface-based multi-task P-DNN (Figure 4).
We used two 3D displacement platforms to help align the cascaded metasurfaces. The distance between the two metasurfaces is set to 500 μm. Furthermore, each super-pixel was intentionally formed from a 10×10 array of nanofins by repeatedly placing nanofins with the same azimuth angle. Each metasurface measures 500 × 500 μm² and contains 100×100 super-pixels, each with a side length of 5 μm.
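The layout arithmetic above can be checked with a short NumPy sketch that expands a 100×100 phase map into per-nanofin azimuth angles. It assumes geometric-phase encoding, in which a nanofin rotated by θ imparts a phase of 2θ on the cross-polarized circular component (the sign depends on the handedness of the incident light); the text implies this encoding but does not state it, and the function name `azimuth_map` is hypothetical.

```python
import numpy as np

PERIOD_NM = 500        # nanofin lattice period, from the text
FINS_PER_PIXEL = 10    # 10 x 10 nanofins per super-pixel, from the text
PIXELS = 100           # 100 x 100 super-pixels, from the text

def azimuth_map(phase):
    """Expand a (PIXELS, PIXELS) phase map into per-nanofin azimuth angles (rad).

    Assumption: geometric-phase encoding, phase = 2 * theta.
    """
    theta = np.mod(phase, 2.0 * np.pi) / 2.0          # angles in [0, pi)
    # Repeat each super-pixel value over its 10 x 10 block of nanofins.
    return np.kron(theta, np.ones((FINS_PER_PIXEL, FINS_PER_PIXEL)))

phase = np.random.default_rng(1).uniform(0.0, 2.0 * np.pi, size=(PIXELS, PIXELS))
angles = azimuth_map(phase)

pixel_um = PERIOD_NM * FINS_PER_PIXEL / 1000   # super-pixel side in micrometres
aperture_um = pixel_um * PIXELS                # metasurface side in micrometres
```

With the values from the text, the super-pixel side comes out as 5 μm and the aperture as 500 μm, which matches the stated metasurface size; the full layout holds 1000×1000 nanofins.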
In the digit classification simulation, the handwritten-digit classification plug-in of the P-DNN was trained for 10 epochs, and the simulated test accuracy reached 91.8% (Figure 5).
Furthermore, we tested the classification performance of the P-DNN on a more complex image dataset by replacing the classification-layer plug-in with one for fashion recognition. The dataset consists of six different fashion products (T-shirts, pants, jumpers, sneakers, bags, and ankle boots). Notably, the fashion plug-in achieved high classification accuracy after only five training epochs. Compared with training without transfer learning, the training time and the number of parameters are reduced by half. We used 6000 fashion images as the test dataset, and the numerical test accuracy is 90.2% (Figure 6).
Conclusions
Generally, a D2NN can perform only a single task once its design is completed, and the physically manufactured network cannot be modified. To perform other tasks, the D2NN must be completely retrained and remanufactured. Here we demonstrate that the P-DNN can be applied to various recognition tasks in the near-infrared band by switching internal plug-ins. To verify the feasibility, we designed cascaded metasurfaces for handwritten digit and fashion recognition. The experimental test accuracies were 91.3% and 90.0%, respectively, in good agreement with the numerical simulation results. The plug-ins work similarly to pluggable components in optical communication: according to user requirements, classification-layer (pluggable-layer) plug-ins can be customized and combined with the common-layer plug-in to achieve a variety of specific tasks (not limited to classification). Such plug-ins offer good reconfigurability, and further tasks can be added simply by plugging and unplugging. The P-DNN features ultra-low energy consumption and light-speed computation, and may provide extra flexibility for neural networks. It could find a wide range of applications, such as intelligent optical filtering in microscopy imaging and real-time object detection in autonomous driving systems.
Materials and methods
Training of the P-DNN
Our P-DNN was trained using Python 3.7.0 and the TensorFlow framework, version 2.4.1 (Google Inc.), on a desktop computer (Intel(R) Core(TM) i5-10500 CPU @ 3.10 GHz with 32 GB RAM) running the Windows 10 operating system (Microsoft). In the training process, the mean squared error was selected as the loss function, which is commonly used for target classification in machine learning, and the Adam optimizer was used to update the phase values of each layer in the network. We used 36000 handwritten digit images and fashion images as training datasets, with a training batch size of 8 and a learning rate of 0.01.
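The optimizer settings above can be illustrated with a minimal NumPy sketch of the Adam update applied to a plug-in layer while the shared layer stays frozen. Several parts are assumptions rather than the authors' code: the quadratic `loss_and_grad` is only a stand-in for the optical MSE loss (whose gradients TensorFlow would supply by automatic differentiation), the values of `beta1`, `beta2` and `eps` are common defaults not given in the text, and freezing the shared layer is one reading of the transfer-learning step. Only the learning rate of 0.01, the batch size of 8 and the 36000 training images come from the text.

```python
import numpy as np

class Adam:
    """Minimal Adam optimizer for a single parameter array."""

    def __init__(self, shape, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-7):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = np.zeros(shape)   # first-moment estimate
        self.v = np.zeros(shape)   # second-moment estimate
        self.t = 0                 # step counter

    def step(self, param, grad):
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1 - self.beta2) * grad**2
        m_hat = self.m / (1 - self.beta1**self.t)   # bias correction
        v_hat = self.v / (1 - self.beta2**self.t)
        return param - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)

N = 100
rng = np.random.default_rng(0)
shared = rng.uniform(0, 2 * np.pi, (N, N))   # stands in for the trained shared layer (frozen)
plugin = np.zeros((N, N))                    # new plug-in layer to be trained
target = rng.uniform(0, 1, (N, N))           # hypothetical optimum of the stand-in loss

def loss_and_grad(p):
    """Stand-in quadratic loss and its analytic gradient (not the optical model)."""
    diff = p - target
    return np.mean(diff**2), 2 * diff / diff.size

shared_before = shared.copy()
opt = Adam(plugin.shape, lr=0.01)
initial_loss, _ = loss_and_grad(plugin)
for _ in range(300):
    _, grad = loss_and_grad(plugin)
    plugin = opt.step(plugin, grad)          # only the plug-in is updated
final_loss, _ = loss_and_grad(plugin)

trainable = plugin.size                      # parameters trained for a new plug-in
from_scratch = shared.size + plugin.size     # parameters trained without transfer learning
steps_per_epoch = 36000 // 8                 # if 36000 images are used per task
```

Under these assumptions a new plug-in trains 10000 phase values instead of 20000, which is consistent with the halving of parameters reported for the fashion plug-in.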
Fabrication
The dielectric metasurface of amorphous silicon (α-Si) nanofins was fabricated on a SiO2 substrate. First, amorphous silicon films with a thickness of 600 nm were prepared by plasma-enhanced chemical vapor deposition. Subsequently, a polymethyl methacrylate resist layer was spin-coated onto the film. The pattern was then defined using standard electron beam lithography. After development, a 30 nm thick chromium layer was deposited on the sample surface. Finally, we performed the lift-off process in hot acetone and used inductively coupled plasma reactive ion etching to transfer the desired structure from the chromium to the silicon.
Cong He, Dan Zhao, Fei Fan, Hongqiang Zhou, Xin Li, Yao Li, Junjie Li, Fei Dong, Yin-Xiao Miao, Yongtian Wang, Lingling Huang. Pluggable multitask diffractive neural networks based on cascaded metasurfaces[J]. Opto-Electronic Advances, 2024, 7(2): 230005-1
Category: Research Articles
Received: Jan. 14, 2023
Accepted: Apr. 24, 2023
Published Online: May 24, 2024
The Author Email: Wang Yongtian (YTWang), Huang Lingling (HuangLL)