Advanced Photonics, Volume 6, Issue 5, 056001 (2024)
Ultra-wide FOV meta-camera with transformer-neural-network color imaging methodology
Fig. 1. Neural meta-camera model. The meta-camera consists of the ultra-wide FOV metalens and a transformer-based neural network for full-color imaging. Green arrows show the image recovery process: the image captured by the meta-camera is reconstructed by an image recovery neural network built with the KD paradigm (yellow arrows: prior knowledge and data-driven). The network is initialized with a prior data set generated from the simulated PSFs of the metalens, and the measured data set from the meta-camera then drives refinement of the initialized network. A U-shaped hierarchical architecture is used to capture information at multiple scales, and, given the spatial distribution characteristics of the simulated PSFs of the metalens, an attention mechanism is incorporated into the U-shaped network to cope with their nonuniformity.
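For readers who want a concrete picture of the recovery backbone, the sketch below is a minimal PyTorch rendering of a U-shaped encoder-decoder with a channel-attention block. The layer widths and the ChannelAttention and TinyUNet modules are illustrative assumptions; the paper's actual network uses transformer blocks, which are not reproduced here.

```python
# Minimal sketch of a U-shaped recovery network with attention.
# Layer widths are assumed; the paper's transformer blocks are not reproduced.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style attention that re-weights feature channels,
    loosely mimicking how attention can handle spatially nonuniform PSFs."""
    def __init__(self, ch, r=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(ch, ch // r, 1), nn.ReLU(),
            nn.Conv2d(ch // r, ch, 1), nn.Sigmoid())
    def forward(self, x):
        return x * self.fc(x)

def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    """U-shaped hierarchy: two encoder scales, attention bottleneck, two decoder
    scales with skip connections, producing a full-color (RGB) output."""
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = conv_block(3, 32), conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = nn.Sequential(conv_block(64, 128), ChannelAttention(128))
        self.up2 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec2 = conv_block(128, 64)
        self.up1 = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)
        self.out = nn.Conv2d(32, 3, 1)
    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        b = self.bottleneck(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.out(d1)

recovered = TinyUNet()(torch.rand(1, 3, 256, 256))  # -> (1, 3, 256, 256)
```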
Fig. 2. Ray optics design and characterization of the ultra-wide FOV metalens. (a) Ray-tracing simulation of the ultra-wide FOV metalens (left) with a 140-deg FOV. The red/green/blue/yellow rays cross at four points on the same image plane after passing through the aperture, substrate, metasurface, and the cover glass of the sensor. Spot diagrams (right) show that the diffuse spots at incident angles of 0 deg, 20 deg, 40 deg, and 70 deg fall inside the Airy circle (black solid). (b) Simulated MTF curves at different FOVs; the black solid line indicates the diffraction limit. Schematic of a meta-atom of the metasurface, consisting of a silicon nanopost with the height (
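The diffraction limit referenced in panel (b) follows from standard formulas for an ideal circular aperture. The sketch below, assuming an illustrative wavelength of 532 nm and an f-number of 2 (not the metalens design values), computes the Airy-disk radius and the diffraction-limited MTF curve.

```python
# Diffraction-limited reference curves for an ideal circular aperture.
# Wavelength and f-number are illustrative assumptions, not the metalens specs.
import numpy as np

wavelength = 532e-9   # assumed green design wavelength, in meters
f_number = 2.0        # illustrative f-number

# Airy-disk radius: first zero of the Airy pattern.
airy_radius = 1.22 * wavelength * f_number
print(f"Airy radius: {airy_radius * 1e6:.2f} um")

# Diffraction-limited MTF of a circular pupil under incoherent illumination:
# MTF(v) = (2/pi) * (arccos(v/vc) - (v/vc) * sqrt(1 - (v/vc)^2)) for v <= vc.
cutoff = 1.0 / (wavelength * f_number)   # cutoff frequency, cycles/m
v = np.linspace(0.0, cutoff, 200)
s = v / cutoff
mtf = (2.0 / np.pi) * (np.arccos(s) - s * np.sqrt(1.0 - s**2))
print(f"Cutoff: {cutoff * 1e-3:.0f} cycles/mm; MTF at half-cutoff: {mtf[100]:.2f}")
```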
Fig. 3. Proposed KD paradigm for training the image recovery neural network. (a) Prior knowledge, i.e., PSFs obtained from the design parameters of the metalens, is applied to the original images to generate the prior data set, which is used to train an initialized neural network. (b) Using the established data collection and processing flow, data from the corresponding scenarios are collected to fine-tune the model, enabling it to cope with the more intricate image degradation of actual scenes. The measured data set in the data-driven stage consists of images (e.g., LCD screen projection images) captured by the metalens and by a conventional commercial lens (Sigma Art zoom lens). As shown by the black dotted line, the neural network is updated through backpropagation with the same loss function in both stage (a) and stage (b). After the model parameters have been updated in both stages, the neural network is employed to recover images in the corresponding scenario.
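A minimal sketch of this two-stage flow is given below, assuming a plain L1 loss and hypothetical data loaders (prior_loader for the PSF-synthesized pairs, measured_loader for the metalens/commercial-lens pairs); the paper's exact loss function and training schedule are not reproduced.

```python
# Sketch of the two-stage KD (prior knowledge + data-driven) training flow.
# `prior_loader` and `measured_loader` are hypothetical placeholders; the loss
# here is plain L1 rather than the paper's exact loss function.
import torch
import torch.nn.functional as F

def psf_blur(img, psf):
    """Stage (a) degradation model: convolve a clean image with a simulated
    PSF to synthesize the prior data set (depthwise 2D convolution)."""
    c = img.shape[1]
    k = psf.expand(c, 1, *psf.shape[-2:])
    return F.conv2d(img, k, padding=psf.shape[-1] // 2, groups=c)

def train_stage(model, loader, epochs, lr):
    """Shared update rule: both stages backpropagate the same loss."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for degraded, clean in loader:
            loss = F.l1_loss(model(degraded), clean)
            opt.zero_grad(); loss.backward(); opt.step()
    return model

# Example degradation with a toy 5x5 box-blur PSF standing in for a real one.
blurred = psf_blur(torch.rand(1, 3, 64, 64), torch.ones(5, 5) / 25)

# Stage (a): initialize on the simulated prior data set (PSF-blurred images).
# model = train_stage(model, prior_loader, epochs=50, lr=1e-4)
# Stage (b): fine-tune on measured metalens vs. commercial-lens pairs.
# model = train_stage(model, measured_loader, epochs=20, lr=1e-5)
```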
Fig. 4. Image recovery results of our neural network on images from the naked ultra-wide FOV metalens, compared with the UNet & KD paradigm and other traditional methods. (a) Schematic illustration of the data acquisition system for the naked ultra-wide FOV metalens. The object projected by a 5.5-in. LCD screen is collected by the naked ultra-wide FOV metalens at a working distance of 2 cm and redirected to a micro-magnification system comprising an objective lens (Olympus, MPLFLN10xBD), an adapter tube lens (1-62922, Navitar), and a CMOS sensor (Sony, A7M3). (b) Compared with the UNet & KD paradigm and other traditional image recovery algorithms (e.g., MSRCR, Laplacian), our image recovery neural network produces ultra-wide FOV, full-color, high-quality images corrected for the central bright speckle, chromatic aberrations, and distortion. Examples of recovered images include complex scenes, such as cartoons with orange letters, yellow buses in the shade, and concerts under blue lights. Detail insets are shown below each row. Compared with the ground-truth capture (rightmost column) using a conventional commercial lens (Sigma Art 24-70 mm DG DN), our neural network accurately reproduces fine details and colors. More comparison images (e.g., grids, letters, and oranges) are shown in Figs. S12–S14 in the Supplementary Material.
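Recovery quality in comparisons like panel (b) is commonly quantified with metrics such as PSNR; a minimal sketch follows, using random arrays as placeholders for a real recovered image and its ground-truth capture.

```python
# Minimal PSNR check between a recovered image and the ground-truth capture.
# Random arrays stand in for real image data (H x W x 3, values in [0, 1]).
import numpy as np

def psnr(recovered, reference, peak=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    mse = np.mean((recovered.astype(np.float64) - reference.astype(np.float64)) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak**2 / mse)

rng = np.random.default_rng(0)
reference = rng.random((256, 256, 3))
recovered = np.clip(reference + 0.05 * rng.standard_normal(reference.shape), 0, 1)
print(f"PSNR: {psnr(recovered, reference):.1f} dB")
```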
Fig. 5. Neural meta-camera for imaging. (a) Photograph of the meta-camera system (left), integrating the miniature meta-camera (top right) with a CMOS image sensor, and a schematic illustration of its structure (bottom right), including an aperture, sleeve, and base for shading and waterproofing. (b) Schematic diagram of the meta-camera test. The ground-truth images are projected on the LCD screen and captured directly by the meta-camera. (c) Comparison of recovery results for images captured by the ultra-wide FOV metalens alone and by the meta-camera at a working distance of 2 cm. Cartoon images of an alarm clock and a blue bed show that chromatic aberrations and the central bright speckle are greatly reduced after recovery by the neural networks. More comparison images (e.g., doll, coral, and concert) are shown in Figs. S15 and S16 in the Supplementary Material.
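As a rough end-to-end illustration of this test, the sketch below loads a raw meta-camera capture, runs a trained recovery network, and saves the full-color result. The file names and checkpoint are hypothetical placeholders, and TinyUNet refers to the earlier illustrative sketch rather than the authors' model.

```python
# Hedged sketch of the recovery step: raw capture -> network -> recovered image.
# File names and the checkpoint are hypothetical; TinyUNet is the earlier sketch.
import numpy as np
import torch
from PIL import Image

model = TinyUNet()                                    # illustrative backbone
model.load_state_dict(torch.load("recovery_net.pt"))  # hypothetical checkpoint
model.eval()

raw = np.asarray(Image.open("meta_camera_capture.png"), dtype=np.float32) / 255.0
x = torch.from_numpy(raw).permute(2, 0, 1).unsqueeze(0)  # HWC -> NCHW
with torch.no_grad():
    y = model(x).clamp(0, 1)
out = (y[0].permute(1, 2, 0).numpy() * 255).astype(np.uint8)
Image.fromarray(out).save("recovered.png")
```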
Yan Liu, Wen-Dong Li, Kun-Yuan Xin, Ze-Ming Chen, Zun-Yi Chen, Rui Chen, Xiao-Dong Chen, Fu-Li Zhao, Wei-Shi Zheng, Jian-Wen Dong, "Ultra-wide FOV meta-camera with transformer-neural-network color imaging methodology," Adv. Photon. 6, 056001 (2024)
Category: Research Articles
Received: Apr. 9, 2024
Accepted: Apr. 18, 2024
Posted: Apr. 19, 2024
Published Online: May 22, 2024
Author Emails: Wei-Shi Zheng (zhwshi@mail.sysu.edu.cn), Jian-Wen Dong (dongjwen@mail.sysu.edu.cn)