Advanced Photonics, Volume 7, Issue 5, 054002 (2025)

Deep learning for computational imaging: from data-driven to physics-enhanced approaches

Fei Wang, Juergen W. Czarske, and Guohai Situ*
Figures & Tables (20)
Schematic diagram of a computational imaging system. The object f is modulated by properly designed encoding components, forming the measurement g=H(f). The object image f* can be reconstructed from g, provided prior information about the underlying imaging system and about f is available.
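As a worked form of the reconstruction step described in this caption, the inverse problem is commonly posed as a regularized (MAP-style) estimate. The squared-error data-fidelity term and the generic regularizer phi(f) below are illustrative placeholders rather than choices taken from any particular figure; alpha is the regularization parameter listed in Table 1 and n denotes measurement noise:

    g = H(f) + n, \qquad
    f^{*} = \arg\min_{f}\; \|H(f) - g\|_{2}^{2} \;+\; \alpha\,\phi(f)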
Differences in measurements and priors required by different reconstruction methods for the same imaging quality. Measurements: the amount of data recorded directly by the detector containing object information. Priors: explicitly defined image priors, image formation operators, and implicitly expressed priors based on a parameter model. Versatility: linked to universality. Interpretability: connected to XAI. Image quality: related to contrast, SSIM, and correlation coefficients. Each method introduces different types of prior information in unique ways, resulting in differences in versatility, interpretability, and the required measurements to achieve similar imaging quality. The circle diameter represents the versatility, whereas the gray level is related to interpretability. A quantitative evaluation can be found in Sec. 5.5.
Representation illustrating the relationship between objects and images in traditional imaging and computational imaging. The decrease in area in measurement space Y compared with object space X signifies information loss during the imaging process. Traditional imaging loses a significant amount of information, whereas computational imaging, through the introduction of encoding, can record more information than traditional imaging.
Illustration of forward propagation and backpropagation in a multilayer neural network. Forward propagation processes the input data sequentially through each layer, yielding the prediction for the given input at the output layer. Backpropagation, in turn, propagates the error between the prediction and the label from the output layer back to the input layer. The gradient used to optimize the network parameters at each layer can be computed from quantities obtained during both the forward and backward passes. For conciseness, the bias is not shown here, as it can be absorbed into the weight parameters.
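The following NumPy sketch mirrors the computation illustrated here: one forward pass through a small two-layer network, followed by backpropagation of the prediction error to obtain the per-layer weight gradients. The network size, sigmoid activation, squared-error loss, and random data are illustrative assumptions, not details taken from the paper; biases are omitted, as in the figure.

    import numpy as np

    # Minimal two-layer network: forward pass, then backpropagation of the
    # prediction error to obtain gradients for every weight matrix.
    # Sizes, activation, loss, and data are placeholders for illustration.
    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, 1))          # input vector
    label = rng.standard_normal((2, 1))      # target ("label") for this input

    W1 = rng.standard_normal((8, 4)) * 0.1   # input -> hidden weights
    W2 = rng.standard_normal((2, 8)) * 0.1   # hidden -> output weights
    sigma = lambda z: 1.0 / (1.0 + np.exp(-z))   # activation sigma(.)

    # ---- forward propagation: process the input layer by layer ----
    z1 = W1 @ x
    a1 = sigma(z1)
    prediction = W2 @ a1                     # linear output layer

    # ---- loss: squared error between prediction and label ----
    loss = 0.5 * np.sum((prediction - label) ** 2)

    # ---- backpropagation: send the error back toward the input ----
    delta2 = prediction - label              # dLoss/dPrediction
    grad_W2 = delta2 @ a1.T                  # gradient for the output layer
    delta1 = (W2.T @ delta2) * a1 * (1 - a1) # error propagated through sigma
    grad_W1 = delta1 @ x.T                   # gradient for the hidden layer

    # ---- gradient-descent update with learning rate eta ----
    eta = 0.1
    W1 -= eta * grad_W1
    W2 -= eta * grad_W2
    print(f"loss = {loss:.4f}")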
Key components of modern deep learning. (a) Neural network architectures; (b) individual neurons; (c) different ways of connecting neurons; (d) regularization strategies and useful tricks; (e) loss function types.
Data-driven deep learning for the inverse problem in computational imaging. (a) Training of neural networks using a large amount of paired data in the measurement space and the object space through supervised learning. (b) Using the trained network model to establish the mapping from the measurement space to the object space and provide predictions of the object image for unseen test data.
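A minimal PyTorch sketch of this supervised, data-driven scheme follows: a network R_theta is fitted to paired (measurement, object) examples and then applied to an unseen measurement. The fully connected architecture, random stand-in tensors, and training settings are placeholders for a real model (such as a U-Net) and a real dataset.

    import torch
    from torch import nn

    # Supervised, data-driven training: the network R_theta learns the mapping
    # from measurements g to objects f from paired examples alone.
    # Model, data, and hyperparameters below are illustrative stand-ins.
    n_meas, n_pix = 819, 64 * 64
    R_theta = nn.Sequential(nn.Linear(n_meas, 1024), nn.ReLU(), nn.Linear(1024, n_pix))
    opt = torch.optim.Adam(R_theta.parameters(), lr=1e-3)
    mse = nn.MSELoss()

    g_train = torch.randn(256, n_meas)       # measurements (stand-in data)
    f_train = torch.randn(256, n_pix)        # ground-truth objects (stand-in data)

    for epoch in range(5):                   # training: fit f ~ R_theta(g)
        opt.zero_grad()
        loss = mse(R_theta(g_train), f_train)
        loss.backward()
        opt.step()

    # Testing: apply the trained model to an unseen measurement.
    with torch.no_grad():
        f_star = R_theta(torch.randn(1, n_meas))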
Advantages of deep-learning-based computational imaging. Deep learning introduces rich implicit image priors from data through offline training, which are then applied during online testing, allowing computational imaging systems to reduce sampling rates, improve image reconstruction speed, enhance imaging quality, and avoid forward physical modeling. CT, computed tomography; FPP, fringe projection profilometry; SPI, single-pixel imaging; GS, Gerchberg–Saxton; CS, compressive sensing; ART, algebraic reconstruction technique.
Challenges of deep-learning-based computational imaging. Deep learning requires large amounts of data to train a multilayer neural network, leading to (a) difficulty in acquiring training data, (b) high computational complexity, (c) poor generalization, and (d) low interpretability.
Categorizations of physics-enhanced deep-learning-based computational imaging. The integration of physics priors, consisting of the image formation model and the inverse restoration criterion, with deep learning is reflected in the three fundamental elements of deep learning: data, network, and loss function.
Integration of physics priors with input–output data. (a) Generation of training data using a fixed physical model; (b) generation of training data using a parameterized physical model, where the parameters of the physical model are optimized along with the weights of the neural network during training; (c) utilization of the physical model to process the output of a pretrained denoising network.
Construction of neural networks using physics priors. (a) Construction of a diffraction neural network using a diffraction propagation model; (b) addition of physically meaningful feature extraction layers to traditional neural network architectures; (c) unfolding physics-driven iterative optimization algorithms into neural networks, where each layer of the network involves computation using the physical model.
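As an illustration of the unrolling idea in panel (c), the sketch below interleaves gradient steps on the data-fidelity term ||Hf - g||^2 with small learned refinement blocks, one per layer. The linear forward operator H, the number of stages, and the tiny per-stage networks are assumptions made for demonstration, not the design of any cited work.

    import torch
    from torch import nn

    # Unrolling sketch: a fixed number of gradient-descent steps on the data
    # fidelity ||H f - g||^2, each followed by a small learned refinement.
    # H, stage count, and per-stage networks are illustrative placeholders.
    n_pix, n_meas, n_stages = 32 * 32, 256, 5
    H = torch.randn(n_meas, n_pix) / n_meas**0.5      # known forward model

    class UnrolledNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.step = nn.Parameter(torch.full((n_stages,), 0.5))  # learned step sizes
            self.refine = nn.ModuleList(
                nn.Sequential(nn.Linear(n_pix, n_pix), nn.ReLU(), nn.Linear(n_pix, n_pix))
                for _ in range(n_stages)
            )

        def forward(self, g):
            f = torch.zeros(g.shape[0], n_pix)        # start from a zero estimate
            for k in range(n_stages):
                grad = (f @ H.T - g) @ H              # gradient of ||H f - g||^2 / 2
                f = f - self.step[k] * grad           # physics-driven update
                f = f + self.refine[k](f)             # learned residual correction
            return f

    net = UnrolledNet()
    f_hat = net(torch.randn(2, n_meas))               # batch of two measurements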
Using physics models for computing physics-consistency loss functions (PCLFs). (a) Optimizing parameters of a randomly initialized neural network using PCLF with current measurements; (b) fine-tuning a pretrained image reconstruction model using PCLF with current measurements; (c) optimizing the sampling vector of a pretrained generative model using PCLF with current measurements; (d) training a neural network using PCLF defined by measurements corresponding to multiple objects; (e) using PCLF defined by input measurements as a regularization term in traditional supervised deep learning.
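A minimal sketch of variant (a) is given below: a randomly initialized network is optimized with a physics-consistency loss that compares the re-simulated measurement H(R_theta(g)) against the current measurement g, without any ground-truth object. The linear operator H, the network, and the iteration count are illustrative assumptions.

    import torch
    from torch import nn

    # PCLF, variant (a): optimize a randomly initialized network so that
    # pushing its output back through the known forward model H reproduces
    # the current measurement g. No ground-truth object is used.
    n_pix, n_meas = 32 * 32, 256
    H = torch.randn(n_meas, n_pix) / n_meas**0.5      # known forward model (stand-in)
    g = torch.randn(1, n_meas)                        # current measurement (stand-in)

    R_theta = nn.Sequential(nn.Linear(n_meas, 512), nn.ReLU(), nn.Linear(512, n_pix))
    opt = torch.optim.Adam(R_theta.parameters(), lr=1e-3)

    for it in range(200):
        opt.zero_grad()
        f_est = R_theta(g)                            # current object estimate
        pclf = torch.mean((f_est @ H.T - g) ** 2)     # || H(f_est) - g ||^2
        pclf.backward()
        opt.step()

    f_star = R_theta(g).detach()                      # reconstruction from g alone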
Comparison of typical physics-enhanced inverse reconstruction algorithms. (a) Deep learning methods incorporating physics priors. (b) Optimization algorithms enhanced by deep learning techniques.
Examples of deep-learning-based methods incorporating physics priors. (a) Resolution enhancement using a neural network model trained on simulation data, reproduced with permission from Ref. 97 © 2018 Springer Nature. (b) High-quality, full-color, wide FOV imaging using end-to-end designed nano-optics, reproduced with permission from Ref. 142 (CC-BY). (c) Holographic reconstruction using a diffractive neural network, reproduced with permission from Ref. 187 © 2021 American Chemical Society. (d) Deep learning for speckle imaging with interpretable speckle-correlation for preprocessing, reproduced with permission from Ref. 109 © 2021 Chinese Laser Press. (e) Unrolling a model-based optimization algorithm for lensless imaging, reproduced with permission from Ref. 113 © 2020 Optical Society of America. (f) Holographic reconstruction with a neural network model trained by self-supervised learning, reproduced with permission from Ref. 220 (CC-BY). (g) Spectrum analyzer using physical-model and data-driven model combined neural network, reproduced with permission from Ref. 227 © 2023 Wiley-VCH GmbH.
Comparison of required training data and required physical knowledge for computational imaging techniques based on different physics-enhanced deep learning approaches. Blending physical knowledge and deep learning usually results in a compromise between the required physics and training data. This suggests that physical knowledge can be used to reduce the required training data, whereas the data from the imaging system can also be used to reduce the required physics.
Comparison of typical inverse reconstruction algorithms for image reconstruction results under different measurement counts: single-pixel imaging example. (a) Results for the English letter “Q”: (a1) ground truth image; (a2) linear correlation algorithm; (a3) compressed sensing algorithm; (a4) data-driven deep learning; (a5) physics-enhanced deep learning. (b) Results for our logo: (b1) ground truth image; (b2) linear correlation algorithm; (b3) compressed sensing algorithm; (b4) data-driven deep learning; (b5) physics-enhanced deep learning. The data-driven deep learning algorithm uses a U-Net-like model trained on the EMNIST dataset, and the physics-enhanced deep learning method refines the trained U-Net using a physics-driven fine-tuning approach reported in Ref. 95. The results in (a2)–(a5) and (b2)–(b5) were all obtained with a measurement number of 819, and the image resolution was 64×64 pixels.
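For reference, the linear correlation baseline in panels (a2)/(b2) can be sketched as follows: each illumination pattern is weighted by its mean-subtracted single-pixel measurement and the weighted patterns are summed. The random patterns and the stand-in object are assumptions; the 819 measurements and 64×64-pixel resolution follow the caption.

    import numpy as np

    # Linear-correlation baseline for single-pixel imaging: weight each
    # pattern P_m by its mean-subtracted measurement and sum.
    # Patterns and object below are stand-ins for illustration.
    rng = np.random.default_rng(0)
    n_meas, side = 819, 64
    patterns = rng.random((n_meas, side * side))      # illumination patterns P_m
    f_true = np.zeros(side * side)
    f_true[1000:1100] = 1.0                           # stand-in object

    y = patterns @ f_true                             # single-pixel measurements
    f_rec = (y - y.mean()) @ patterns / n_meas        # correlation reconstruction
    f_rec = f_rec.reshape(side, side)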
Typical computational imaging applications based on different physics-enhanced deep learning schemes (Refs. 82, 90, 91, 93–95, 97, 109, 113, 115, 124, 142, 143, 146, 149, 151, 175–289).
  • Table 1. Abbreviations and symbols.


    Table 1. Abbreviations and symbols.

    Abbreviation | Definition | Symbol | Definition
    AI4S | AI for science | X | Object space
    AIGC | AI generative contents | Y | Measurement space
    ART | Algebraic reconstruction technique | f | Element in X
    AWGN | Additive white Gaussian noise | g | Element in Y
    BP | Back propagation | H | Forward model
    CGH | Computer-generated holography | N | Noise distribution
    CI | Computational imaging | P | Probability density
    CNN | Convolutional neural network | α | Regularization parameter
    CS | Compressed sensing | θ | Parameters in NN
    CT | Computed tomography | W | Weights
    D2NN | Diffractive deep neural network | b | Biases
    DGI | Deep gradient descent | σ(·) | Activation function
    DIP | Deep image prior | η | Learning rate
    DL | Deep learning | Rθ | NN mapping operator
    DNN | Deep neural network | x | NN input
    FFDNET | Fast and flexible denoising CNN | y | NN label
    FPM | Fourier ptychographic microscopy | x_i | i'th data of x
    FPP | Fringe projection profilometry | y_i | i'th data of y
    GAN | Generative adversarial network | D | Training set
    GD | Gradient descent | L | Loss function
    GS | Gerchberg–Saxton | ∇θL | Gradient
    INR | Implicit neural representation | N | # training data
    MAE | Mean absolute error | B | Batch size
    MAP | Maximum a posteriori | P | # object–measurement pairs
    MLE | Maximum likelihood estimation | Q | # measurement pairs
    MRI | Magnetic resonance imaging
    NN | Neural network
    PCLF | Physics-consistency loss function
    PnP | Plug-and-play
    RNN | Recurrent neural network
    SBP | Space-bandwidth product
    SGD | Stochastic gradient descent
    SLM | Spatial light modulator
    SNR | Signal-to-noise ratio
    SPI | Single-pixel imaging
  • Table 2. Advantages and disadvantages of various DL-based methods in CI.


    Table 2. Advantages and disadvantages of various DL-based methods in CI.

    Each method is listed with its advantages (A) and disadvantages (D).

    Data-driven DL
    A: Fast; no need for physical modeling
    D: Requires sufficient training data; poor generalization; poor interpretability; time-consuming training process

    Learning from simulation
    A: Fast; easy to generate training data
    D: Reliance on accurate physical modeling; poor generalization; poor interpretability; time-consuming training process

    End-to-end optical design
    A: Adaptive system design with differentiable optics; fast; easy to generate training data
    D: Reliance on accurate physical modeling; poor generalization; poor interpretability; time-consuming training process

    Plug-and-play
    A: Strong scalability; can use existing denoising networks without retraining
    D: Slow reconstruction; reliance on accurate physical modeling

    Diffractive neural network
    A: High speed; low power consumption
    D: Poor image quality; poor generalization; low flexibility in network architecture design

    Interpretable layers
    A: Fast; improves image quality, generalization, and interpretability
    D: Reliance on accurate physical modeling; requires sufficient training data; time-consuming training process

    Unrolling/unfolding
    A: Fast; reduces training data requirements; enhances generalization and interpretability
    D: Reliance on accurate physical modeling; still a data-driven approach

    Untrained
    A: No need for training data; no generalization issues; good interpretability; suitable for large-scale imaging (with INR)
    D: Slow reconstruction; reliance on accurate physical modeling

    Physics-driven fine-tuning
    A: Fast; extendable to out-of-domain data; no downstream data required; good interpretability
    D: Time-consuming fine-tuning process; reliance on accurate physical modeling

    Generative prior
    A: Strong scalability; can use various generative models
    D: Time-consuming fine-tuning process; reliance on accurate physical modeling

    Self-supervised learning
    A: No need for ground-truth data; fast
    D: Reliance on accurate physical modeling; poor generalization; poor interpretability; time-consuming training process

    Consistency regularization
    A: Fast; reduces training data requirements
    D: Reliance on accurate physical modeling; limited generalization; limited interpretability; time-consuming training process
Citation

Fei Wang, Juergen W. Czarske, Guohai Situ, "Deep learning for computational imaging: from data-driven to physics-enhanced approaches," Adv. Photon. 7, 054002 (2025)

Paper Information

Category: Reviews

Received: Feb. 7, 2025

Accepted: Jul. 21, 2025

Posted: Jul. 21, 2025

Published Online: Sep. 4, 2025

Corresponding author email: Guohai Situ (ghsitu@siom.ac.cn)

DOI: 10.1117/1.AP.7.5.054002

CSTR: 32187.14.1.AP.7.5.054002
