Photonics Insights, Volume 4, Issue 2, R03 (2025)
Revolutionizing optical imaging: computational imaging via deep learning
Fig. 3. Deep-learning-based wavefront reconstruction method for adaptive systems. (a) A simplified network diagram for CARMEN[36]. (b) System controller design based on a BP artificial neural network[37]. (c) An architecture with 900 hidden-layer neurons. (d) The original image, the network output, and the spot centers found by the spot-center algorithm[38].
Fig. 5. Image-based wavefront aberration estimation method. (a) Experimental optical system setup for the learning-based Shack–Hartmann wavefront sensor[44]. (b) Sketch map of the object-independent wavefront sensing approach using deep LSTM networks[45]. (c) Schematic and experimental diagrams of the deep learning wavefront sensor[46].
Fig. 7. Design framework for automatic generation of free-form initial structures[51].
Fig. 8. Deep-learning-based method for automatic initial structure generation. (a) Overview of the deep learning framework used by Geoffroi
Fig. 10. Large-field-of-view computational thin-plate lens imaging method. (a) The RRG-GAN network architecture. (b) RRG-GAN restoration results based on the manual dataset[61].
Fig. 11. U-Net (Res-U-net)++ network structure and imaging effects at different defocus distances. (a) Schematic of the network structure. The raw image is used as the input for the model, and the reconstructed image is the output of the model. (b) Image quality of the Cooke triplet lens and the doublet lens systems with the defocus amount varying within the range of
Fig. 12. Deep-learning-based aberration compensation method. (a) End-to-end deep learning-based image reconstruction pipeline for single and dual MDLs. (b) Examples of images with reconstruction artifacts[64].
Fig. 14. Architecture of the customized AlexNet deep learning model. (a) Input cell images of size 30 pixel × 30 pixel were resized to 50 pixel × 50 pixel. (b) The resized images were passed through eight convolutional layers[71].
Fig. 15. Schematic of the signal enhancement network and enhancement effect used by Rajkumar
Fig. 16. Framework and results of the hologram information reconstruction network. (a) Network flowchart by Yair
Fig. 17. Architecture of the RedCap model and results. (a) Architecture of proposed RedCap model for holographic reconstruction. (b) Reconstructed image comparison[78].
Fig. 20. PhaseStain workflow and virtual staining results. (a) PhaseStain workflow. (b) Virtual H&E staining of label-free skin tissue using the PhaseStain framework[83].
Fig. 23. Deep neural network based on mask properties. (a) Overview of imaging pipeline by Zhou
Fig. 24. Schematic of the network framework of DNN-FZA. (a) Image acquisition pipeline and reconstruction for the DNN-FZA camera. (b) Architecture of the U-Net. (c) Up- and down-projection units in DBPN[92].
Fig. 26. Schematic of the network of the method proposed by Pan
Fig. 27. GI incorporating CNN. (a) Experimental setup of GI for detecting scattered light. (b) Diagram of CNN architecture for GI. (c) Experimental results of GI image without CNN. (d) Experimental results of GI image reconstructed using CNN[102].
Fig. 28. Network architecture of the PGI system and result diagrams[102]. (a) CNN architecture in the PGI system. (b) Comparison of the DL reconstruction and CS reconstruction results for different sampling rates and turbidity scenarios.
Fig. 29. (a) Diagram of DNN network architecture. (b) Training process of CGIDL. (c) Plot of experimental results of different methods[106].
Fig. 30. (a) Structure of recurrent neural network combining convolutional layers. (b) Comparison of results[110].
Fig. 31. (a) Structure of DNN network. (b) Comparison of the original image and the effects of different methods[112].
Fig. 32. Network architecture of the proposed DL-FSPI system and experimental results[113]. (a) DCAN network architecture used. (b) Experimental results of the DL-FSPI system.
Fig. 33. (a) Architecture of SPCI-Net layer[115]. (b) Diagram of the experimental setup. (c) Visualization of different methods under high-SNR conditions. (d) Visualization of different methods under low-SNR conditions.
Fig. 34. Comparison of OGTM network architecture and experimental results[118]. (a) OGTM network architecture and migration learning network. (b) Comparison of OGTM experimental results.
Fig. 35. Applications of light-field high-dimensional information acquisition.
Fig. 36. (a) Experimental setup. (b) CNN architecture. (c) Testing results of “seen objects through unseen diffusers”[20].
Fig. 37. (a) Experimental setup uses a DMD as the object. (b) Test results of the complex object dataset with two characters, three characters, and four characters[140].
Fig. 38. Facial speckle image reconstruction by SpT UNet network[137]. (a) SpT UNet network architecture and (b) reconstructed image results.
Fig. 39. The architecture of the proposed denoiser network. (a)–(e) Single image super-resolution performance comparison for Butterfly image[144].
Fig. 40. (a) CNN’s architecture. (b) Visual comparison of deblurring results on images “Boat” and “Couple” in the presence of AWGN with unknown strength[143].
Fig. 41. (a), (b) Schematic illustration for imaging through a scattering medium. (c), (d) Schematic of the PSE-deep method. (e) The comparison of the reconstruction results from different methods[150].
Fig. 42. (a) Experimental setup. (b) The structure of GAN, (
Fig. 43. Active and passive NLOS imaging[161].
Fig. 44. (a) Non-line-of-sight (NLOS) physics-based 3D human pose estimation. (b) Isogawa et al.'s deep-RL-based photons-to-3D human pose estimation framework under the laws of physics[203].
Fig. 45. Overview of the proposed dynamic-excitation-based steady-state NLOS imaging framework[207].
Fig. 46. (a) The structure of the two-step DNN strategy. (b) The corresponding cropped speckle patterns, their autocorrelation, and the reconstructed images with the proposed two-step method[214].
Fig. 47. Flowchart of PCIN algorithm for NLOS imaging reconstruction[219]. The speckle image captured by the camera is put into CNN, and PCIN iteratively updates the parameters in CNN using the loss function constructed by the speckle image and forward physical model. The optimized parameters are utilized to obtain a high-quality reconstructed image.
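A minimal sketch of the iterative update described above, assuming a generic convolutional network and a stand-in known-PSF convolution as the forward physical model (the actual PCIN network and speckle formation model in Ref. [219] differ):

# Minimal sketch of the physics-constrained iterative update in Fig. 47.
# The network, the PSF-convolution forward model, and all hyperparameters
# are illustrative assumptions, not the PCIN authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallCNN(nn.Module):
    """Placeholder reconstruction network: speckle image -> object estimate."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)

def forward_physics(obj, psf):
    # Stand-in differentiable forward model: re-render a speckle-like
    # measurement by convolving the object estimate with a known PSF.
    return F.conv2d(obj, psf, padding=psf.shape[-1] // 2)

def pcin_reconstruct(speckle, psf, n_iters=2000, lr=1e-3):
    """Fit the CNN so that forward_physics(CNN(speckle)) matches the measurement."""
    model = SmallCNN()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(n_iters):
        optimizer.zero_grad()
        obj_estimate = model(speckle)                    # current reconstruction
        simulated = forward_physics(obj_estimate, psf)   # simulated measurement
        loss = F.mse_loss(simulated, speckle)            # data-consistency loss
        loss.backward()
        optimizer.step()                                 # update CNN parameters only
    with torch.no_grad():
        return model(speckle)                            # final reconstruction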
Fig. 51. Network architecture and experimental results of Lin
Fig. 53. Underlying principle of TR-PAM and theoretical prediction of PA response[269]. (a) Laser-induced thermoelastic displacement (u) and subsequent PA response based on the optical-absorption-induced thermoelastic expansion for vascular tissues. (b) The principle of TR-PAM: elasticity and calcification estimations from vascular PA time response characteristics. (c) Experimental setup of the TR-PAM.
Fig. 54. The process of using the 4D spectral–spatial computational PAD combined with experiments for dataset acquisition and system optimization for deep learning[271]. (a) Relevant parameters can be set before data acquisition, and the distribution of the model optical field and detector acoustic field under a collimated Gaussian beam in the model is shown. (b) Feedback on relevant performance optimization parameters is provided to the experimental system after simulating calculations. (c) Experimental system. (d) The dataset is used for training the spread-spectrum network model. (e) The dataset is used for training the depth-enhanced network model. (f) The low-center-frequency detector skin imaging results obtained in the experiment are input into the trained spread-spectrum model to obtain the output image. (g) The skin imaging results under conventional scattering obtained in the experiment are input into the trained depth-enhanced model to obtain the output image.
Fig. 55. PtyNet-S network architecture and experimental prediction results[273]. (a) Structure of PtyNet-S. (b) Phase reconstruction results of PtyNet-S on a tungsten test pattern.
Fig. 56. Using networks to reduce color artifacts and improve image quality for different slices[282]. (a1), (a2) FPM raw data. (b1), (b2) Network input. (c1), (c2) Network output. (d1), (d2) FPM color image. (e1), (e2) Ground truth. (f) Two generator–discriminator pairs are trained using two mismatched image sets. (g) Generator A2B accepts the FPM input and outputs virtually stained images.
Fig. 57. Network architecture and results comparison[285]. (a) Network architecture combining the two models. (b) Amplitude contrast. From left to right: the original image restored using the GS algorithm, the MFP algorithm, and the algorithm proposed in Ref. [285]. (c) Phase contrast. From left to right: the original image restored using the GS algorithm, the MFP algorithm, and the algorithm proposed in Ref. [285].
Fig. 58. Reconstruction methods and results under different overlap rates[287]. (a) Image reconstruction process with a low overlap rate. (b) Image reconstruction process with a high overlap rate. (c) Phase recovery results of different methods with a low overlap rate; from top to bottom: the alternating projection (AP) phase recovery algorithm, PtychNet, cGAN-FP, and the ground truth. (d) Phase recovery results of different methods with a high overlap rate; from top to bottom: the AP phase recovery algorithm, PtychNet, cGAN-FP, and the ground truth.
Fig. 59. Network architecture and result analysis[31]. (a) Workflow of deep-learning-based Fourier ptychographic dynamic imaging reconstruction. (b) Temporal dynamic information reconstructed by the proposed CNN, compared with the ground truth. (c) Network architecture of the conditional generative adversarial network (cGAN) for FPM dynamic image reconstruction.
Fig. 60. The network architecture and simulation results[292]. (a) The network architecture. (b1), (b2) High-resolution amplitude and phase images for simulation. (c1)–(c3) The output of the CNN based on (a) and different wave vectors.
Fig. 61. Network architecture and result analysis. (a) Deep-SLAM procedure[298]. (b)–(d) Wide-FOV and isotropic Deep-SLAM imaging.
Fig. 62. (a) DR-Storm network architecture. (b) Comparison of experimental STORM data of Deep-STORM and DRL-STORM[300]. (b1) The sum of 500 frames of original images. (b2) Intensity distribution along the dotted white line. (b3), (b4) Images reconstructed using Deep-STORM and DR-Storm, respectively.
Fig. 63. UNet-RCAN architecture and result analysis[304]. (a) UNet-RCAN network. (b) Restoration results of UNet-RCAN, 2D-RCAN, CARE, pix2pix, and deconvolution on noisy 2D-STED images for β-tubulin in U2OS cells in comparison to the ground-truth STED data.
Fig. 64. Network training process and phase unwrapping results[311]. (a) Training and testing of the network. (b) CNN results for samples from the test image set: wrapped, true, and unwrapped phase images of the CNN output. (c) Comparison of the phase heights at both ends of the centerline for the true and unwrapped phases of the CNN output.
Fig. 65. Experimental setup, PhaseNet architecture, and results[314]. (a) The true values. (b) Reconstruction results of PhaseNet. (c) Three-dimensional images of the reconstructed results. (d) Error mapping between the true value and PhaseNet reconstruction results. (e) Deep-learning-based holographic microscope. (f) Detailed schematic of the PhaseNet architecture.
Fig. 66. (a) Schematic diagram of NFTPM[318]. (b) The physics prior (forward image formation model) of NFTPM. (c1), (c2) Phases of HeLa cells retrieved by AI-TIE. (d1), (d2) Phases of HeLa cells retrieved by NFTPM.
Fig. 67. Flowchart of the deep-learning-based phase retrieval method and the 3D reconstruction results of different approaches[331]. (a) The principle of the deep-learning-based phase retrieval method: first, the background map A is predicted from the single-frame fringe image I by CNN1; then, CNN2 maps the fringe pattern I and the predicted background map A to the numerator term M and denominator term D of the inverse tangent function; finally, a high-precision wrapped phase map is obtained via the arctangent function. (b) Comparison of the 3D reconstructions of different fringe analysis approaches (FT, WFT, the deep-learning-based method, and 12-step phase-shifting profilometry).
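For reference, the final step in (a) combines the two network outputs through the arctangent; in schematic form (the precise construction of M and D follows the fringe analysis convention of Ref. [331]),
\[
\phi(x,y) = \arctan\frac{M(x,y)}{D(x,y)},
\]
where \phi is the wrapped phase and M and D act as the sine-like and cosine-like terms derived from I and A.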
Fig. 69. Flowchart of DLMFPP: the projector sequentially projects fringe patterns onto the dynamic scene so that the corresponding modulated fringe images encode the scene at different times[345]. The camera then captures the multiplexed images with a longer exposure time and obtains the spatial spectrum by Fourier transform. A synthesized scene consisting of the letters “MULTIPLEX” is used to illustrate the principle.
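As a rough illustration of the multiplexing principle (a simplified form, not the exact image formation model of Ref. [345]): if each projected pattern carries a distinct spatial carrier frequency f_n, the long-exposure capture is approximately
\[
I_c(x,y) \approx \sum_n \Big[ A_n(x,y) + B_n(x,y)\cos\!\big(2\pi f_n x + \phi_n(x,y)\big) \Big],
\]
so each scene state occupies a separate band of the spatial spectrum and can be isolated after the Fourier transform.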
Fig. 71. (a) The network design of MVSNet[351]. (b) The pipeline of reconstructing a 3D model using sparse RGB-D images.
Fig. 76. Flowchart of the algorithm NSAE-WFCM[399]. NSAE represents the pixel neighborhood-based SAE with several hidden layers, and WFCM denotes the feature-WFCM under various land cover types.
Fig. 78. (a) Network architecture and result of PFNet. (b) Network architecture and result of Sun’s network[410].
Fig. 79. (a) Mueller matrix microscope and the schematic of Li’s system[434,435]. (b) Results of classification experiments on algal samples of Ref. [434]. (c) Results of classification experiments on algal samples of Ref. [435]. (d) Network architecture and results of classification experiments on algal samples of Ref. [436].
Fig. 80. Polarization-imaging-based dual-mode machine learning framework for quantitative diagnosis of cervical precancerous lesions[443].
Fig. 82. Overall structure of the autoencoder incorporating graph-attention convolution, and the estimated abundance maps for the Samson dataset[464].
Fig. 85. (a) Schematic diagram of the optical system[478]. (b) Simplified block diagram of the experiment. (c) Reconstructed images obtained by DD1-Net and DD2-Net.
Fig. 89. MMF-Net[495]. (a) Model framework. (b) Source image A. (c) Source image B. (d) Fused image.
Fig. 90. Multi-focus image fusion network[498]. (a) Model framework. (b) The “Model Girl” source image pair and their fused images obtained with different fusion methods.
Fig. 91. Effects of different multi-focus image fusion networks[500].
Fig. 92. EFCNN[509]. (a) Network structure. (b) Source images. (c) Fusion image.
Fig. 93. Unsupervised multi-exposure image fusion network DeepFuse[512]. (a) Network architecture. (b) Underexposed image. (c) Overexposed image. (d) Fusion result.
Fig. 94. GANFuse image fusion network[515]. (a) Network architecture. (b) Overexposed image. (c) Underexposed image. (d) Fusion result.
Fig. 95. Image fusion network based on multi-scale feature cascades and non-local attention[519]. (a) Network architecture. (b) Infrared image. (c) Visible image. (d) Fusion result.
Fig. 96. Unsupervised infrared and visible image fusion network based on DenseNet[521]. (a) Network structure. (b) Infrared image. (c) Visible image. (d) Fusion image.
Fig. 98. Medical fusion method[531]. (a) Network architecture. (b) Fusion results for CT and MRI images. (c) Fusion results for MRI and PET images. (d) Fusion results for MRI and SPET images.
Fig. 101. RMFF-UPGAN[540]. (a) Network architecture. (b) EXP. (c) D_P. (d) Fusion result. (e) Ground truth.
Fig. 102. FFDNet network[547]. (a) Network structure. (b) Noisy image. (c) Denoised image.
Fig. 103. ERDF network[549]. (a) Architecture of the proposed lightweight zero-shot network. (b) Qualitative comparison of denoising for different methods along with the corresponding PSNR.
Fig. 104. DRANet network[550]. (a) Network structure. (b) Ground truth A. (c) Noisy image A. (d) Denoised image A. (e) Ground truth B. (f) Noisy image B. (g) Denoised image B.
Fig. 105. RCA-GAN[552]. (a) Network structure. (b) The ground truth. (c) Noisy images. (d) Denoised images.
Fig. 106. ALDIP-SSTV network[555]. (a) Network structure. (b) Noisy image A. (c) Denoised image A. (d) Noisy image B. (e) Denoised image B.
Fig. 107. Four-branch image noise reduction network[557]. (a) Network structure. (b) Noisy image A. (c) Denoised image A. (d) Noisy image B. (e) Denoised image B.
Fig. 108. Low-light image enhancement network[561]. (a) The network structure. (b) Input image and enhancement results.
Fig. 109. Improved UM-GAN network[564]. (a) Network structure. (b) Low-light inputs and enhancement images.
Fig. 110. MBLLEN network[566]. (a) Network structure. (b) Input image. (c) Enhancement result.
Fig. 111. UWGAN[568]. (a) Network structure. (b) Input image A. (c) Enhancement image A. (d) Input image B. (e) Enhancement image B.
Fig. 112. UWGAN. (a) Network structure[570]. (b) Input image A. (c) Enhancement image A. (d) Input image B. (e) Enhancement image B.
Fig. 113. Semi-supervised image defogging network[574]. (a) Network architecture. (b) Foggy input image. (c) Defogged image.
Fig. 114. SWCGAN. (a) Network structure[579]. (b) Low-resolution image. (c) Super-resolution recovered image.
Fig. 115. Image compression network[584]. (a) Network framework. (b) Original image 1. (c) Compressed image 1. (d) Original image 2. (e) Compressed image 2.
Xiyuan Luo, Sen Wang, Jinpeng Liu, Xue Dong, Piao He, Qingyu Yang, Xi Chen, Feiyan Zhou, Tong Zhang, Shijie Feng, Pingli Han, Zhiming Zhou, Meng Xiang, Jiaming Qian, Haigang Ma, Shun Zhou, Linpeng Lu, Chao Zuo, Zihan Geng, Yi Wei, Fei Liu, "Revolutionizing optical imaging: computational imaging via deep learning," Photon. Insights 4, R03 (2025)
Category: Review Articles
Received: Dec. 30, 2024
Accepted: Mar. 13, 2025
Published Online: Apr. 9, 2025
Author emails: Chao Zuo (zuochao@njust.edu.cn), Zihan Geng (geng.zihan@sz.tsinghua.edu.cn), Yi Wei (yiwei124@mit.edu), Fei Liu (feiliu@xidian.edu.cn)
CSTR:32396.14.PI.2025.R03