Advanced Photonics Nexus, Vol. 4, Issue 2, 026005 (2025)

Adaptable deep learning for holographic microscopy: a case study on tissue type and system variability in label-free histopathology

Jiseong Barg1, Chanseok Lee1, Chunghyeong Lee1, and Mooseok Jang1,2,*
Author Affiliations
  • 1Korea Advanced Institute of Science and Technology, Department of Bio and Brain Engineering, Daejeon, Republic of Korea
  • 2Korea Advanced Institute of Science and Technology, KAIST Institute for Health Science and Technology, Daejeon, Republic of Korea

    Holographic microscopy has emerged as a vital tool in biomedicine, enabling visualization of microscopic morphological features of tissues and cells in a label-free manner. Recently, deep learning (DL)-based image reconstruction models have demonstrated state-of-the-art performance in holographic image reconstruction. However, their practical utility remains severely limited, as conventional training schemes cannot properly handle out-of-distribution data. Here, we leverage a backpropagation operation and a reparameterization of the forward propagator to enable an adaptable image reconstruction model for histopathologic inspection. Given only a training dataset of rectum tissue images captured with a single imaging configuration, our scheme consistently shows high reconstruction performance even for input holograms of diverse tissue types at different pathological states captured under various imaging configurations. Using the proposed adaptation technique, we show that the diagnostic features of cancerous colorectal tissues, such as dirty necrosis, captured with 5× magnification and a numerical aperture (NA) of 0.1, can be reconstructed with high accuracy, even though the training dataset is strictly confined to normal rectum tissues acquired under an imaging configuration of 20× magnification and an NA of 0.4. Our results suggest that DL-based image reconstruction approaches, combined with sophisticated adaptation techniques, could offer an extensively generalizable solution for inverse mapping problems in imaging.

    1 Introduction

    Holographic microscopy has recently garnered significant attention across various fields, including biomedicine, material science, and metrology and inspection for advanced manufacturing techniques, due to its exceptional capability to extract phase information from a single intensity measurement.1–5 Optical phase information provides a detailed view of morphology at the wavelength scale, enabling label-free detection of three-dimensional morphological changes smaller than 1 μm. This capability is crucial for accurately observing and measuring intricate structural variations in numerous scientific and industrial applications. Unlike traditional interferometric systems, which are bulky, sensitive, and require precise alignment, in-line holographic microscopy employs an intensity measurement setup similar to conventional microscopy, simplifying the imaging process and enhancing practicality.6–12 However, the inverse mapping of the complex-valued field from the intensity measurement (hologram) in in-line holographic microscopy is inherently complex due to the absence of a direct one-to-one mapping relationship. This necessitates the use of sophisticated phase retrieval algorithms to accurately recover the phase information, a task that becomes particularly challenging in histopathology, where imaging intricate tissue structures demands high-fidelity phase retrieval.

    Although foundational algorithms such as the Gerchberg–Saxton method and Wirtinger flow have historically addressed this challenge,13–19 they often demand multiple measurements or iterations, exhibit sensitivity to noise, and fail to leverage domain-specific constraints efficiently. This leads to extended acquisition times, reduced throughput, and computational inefficiencies, underscoring the urgent need for more advanced techniques that maintain accuracy while minimizing complexity.

    The advent of deep learning (DL) has revolutionized the field of phase retrieval,20–28 offering new ways to incorporate data constraints in solving this complex problem. By leveraging the statistical relationships between the distributions of holograms and two-dimensional (2D) complex-valued fields, DL methods have demonstrated state-of-the-art phase reconstruction capabilities, providing faster and more accurate results. However, DL-based phase retrieval techniques face a significant challenge: the lack of adaptability to out-of-distribution (OOD) data.20,23 OOD data refer to instances that differ substantially from the training dataset due to variations in object types or imaging system parameters.

    Such challenges are not unique to phase retrieval but are prevalent across the biomedical imaging field, where various domain adaptation and OOD detection strategies—such as feature alignment, image translation, and pseudo-labeling—have been investigated to enhance robustness. These approaches have been successfully applied in diverse imaging contexts, including anomaly detection in retinal images, domain adaptation for chest X-rays, and robust segmentation in histopathological studies,29–37 reflecting the broader need to handle data shifts across different imaging conditions.

    In conventional holographic imaging for histopathology, different users may employ various imaging systems with distinct lens magnifications, wavelengths, and sensor sizes. In addition, microscopic morphology varies significantly across different organs. As a result, the OOD problem becomes particularly pronounced because DL models trained on specific datasets or imaging configurations often fail to perform reliably under varying imaging parameters or sample types, posing a key obstacle to the widespread application of DL methods. This challenge primarily stems from the supervised learning process, which relies on one-to-one mapping between the network’s inputs and outputs. During training, the neural network learns the statistical relationships and patterns present in the training dataset. Consequently, the model excels at predicting data similar to its training set. However, when exposed to OOD data, which deviate significantly from the training distribution, the learned representations and mappings become unreliable. Furthermore, obtaining comprehensive training data for all possible scenarios is impractical, particularly in fields such as histopathology.

    To tackle the challenge of developing DL models capable of generalizing beyond their training conditions, recent research has explored various strategies to address the OOD data problem. Huang et al.23 demonstrated generalized phase retrieval using GedankenNet, a zero-shot learning approach trained on artificially generated complex field images with a physics-consistency loss, thereby avoiding extensive training-data requirements. However, its need for multiple holograms as input limits its applicability to real-time imaging. Conversely, Lee et al.20 addressed OOD data resulting from perturbations in object-to-sensor distance by employing a distance-parameterized physical forward model and an OT-CycleGAN38 network based on Kantorovich dual optimal transport theory.39 Nonetheless, their method did not account for other imaging system variables, such as numerical aperture (NA), magnification, and wavelength, due to the limitations of the conventional forward model. These studies highlight the ongoing efforts and significant need to develop DL models that can effectively generalize phase retrieval across various imaging conditions.

    Here, we present an adaptive DL-based method for holographic reconstruction in histopathology, designed to handle variations in tissue types and imaging system parameters. Our approach leverages a backpropagation operation to mitigate discrepancies between OOD data and training data, coupled with a refined forward model that eliminates parameter degeneracy, thereby enabling the network to effectively manage various system configurations. To validate the robustness of our proposed method for phase reconstruction on OOD data, we trained the network with a limited dataset composed of complex-valued field images of rectum tissue captured using a single imaging system with specific parameters: a 532-nm wavelength, 20× magnification, 0.4 NA objective lens, and a 6.5-μm pixel size sensor. The trained network demonstrated successful holographic reconstruction across diverse tissue types, including colon, small bowel, breast, and appendix, captured under varying system configurations, with objective-lens magnifications ranging from 5× to 20×, NAs from 0.1 to 0.4, and sensor pixel sizes from 2.4 to 6.5 μm. In addition, we reconstructed the complex-valued field of cancerous rectum tissues (adenocarcinoma), demonstrating the practical applicability of our method in histopathological scenarios. This advancement represents a major step toward developing a universal inverse solver that comprehensively addresses the OOD data problem. Such solvers would ensure that DL models maintain effectiveness when applied to new, unseen data, thereby enhancing the reliability of phase imaging across various clinical and research environments and significantly expanding their practical applicability.

    2 Principles

    2.1 DL Approaches to Inverse Problems in Holographic Microscopy

    Considering the image formation process of in-line digital holographic microscopy, as illustrated in Fig. 1, numerous parameters (lens magnifications, wavelengths, sensor sizes, NAs, and image plane-to-sensor distances, i.e., propagation distances), together with the tissue type, jointly influence the characteristics of the hologram. Taking the influence of stochastic noise into account, the hologram formation process can be mathematically expressed as

$$I_{k,\gamma} = F(U_k; \gamma) + n, \qquad \gamma = (\lambda, M, \mathrm{NA}, d, p).$$

    Figure 1.(Top row) Simplified schematic of in-line digital holographic microscopy. The complex-valued transmittance of a tissue sample of type k is converted into a hologram on a sensor array, with system parameters including wavelength (λ), numerical aperture (NA), magnification (M), propagation distance (d), and pixel size (p). (Bottom row) Framework of the proposed method. The backpropagated hologram displays obscured structural features with interference-related artifacts, enabling generalization across tissue types. For the training of our DL network, we employed a self-supervised training scheme using the refined forward model with effective reparameterization. This reparameterization enabled the representation of the hologram space using only two variables (peff and deff), instead of the original five variables (λ, NA, M, d, and p), thereby enhancing adaptability across various imaging configurations. Finally, our DL network reconstructs the complex-valued field from a single backpropagated hologram, demonstrating robust holographic reconstruction performance across various imaging scenarios.

    Here, k represents the type of tissue imaged by the microscope, γ = (λ, M, NA, d, p) are the variables defining the characteristics of the in-line digital holography system, and n represents stochastic noise in the imaging process. This expression accurately captures the image formation process, indicating that holograms generated by different imaging systems and specimens form subsets of the overall hologram distribution. Therefore, the training and inverse mapping process of conventional DL approaches can be described as

$$\arg\min_{\theta} \; \mathbb{E}_{(U_k,\, I_{k,\gamma}) \sim D_T} \left\{ \mathcal{L}\left[U_k,\, G_{\theta}(I_{k,\gamma})\right] \right\}.$$

    The network optimizes its weights by minimizing the discrepancy between the actual complex field and the reconstructed result, using the paired training dataset DT. Upon completing the training process, the optimized network, Gθ*, acts as an inverse solver capable of retrieving phase information,

$$\tilde{U}_k = G_{\theta^*}(I_{k,\gamma}), \qquad (U_k, I_{k,\gamma}) \in D_T.$$

    However, this capability is confined to parameter combinations within the trained data distribution. Consequently, without a paired dataset that spans the full range of variable combinations, the phase retrieval problem cannot be generalized in a DL network trained in a supervised manner. In holographic microscopy, the imaging configurations can vary widely: magnifications can range from 4× to 50×, pixel sizes from 2 to 10  μm, and wavelengths can span the visible spectrum, from 400 to 700 nm. These extensive ranges in imaging configurations illustrate the complexity and variability that DL models contend with. As a result, the practical application of DL models in real-world scenarios is significantly hindered, as they are unable to generalize beyond the specific conditions of their training environment.

    To address these challenges, we implemented two main strategies: (1) numerical backpropagation through a physical forward model and (2) a self-supervised network based on a physical forward model with refined parameterization, to enhance the adaptability of DL-based phase retrieval for histopathology, as illustrated in Fig. 1.

    2.2 Backpropagation for Enhanced Adaptability to Tissue Morphology

    To improve the generalization capability of an inverse solver against tissue types k (i.e., to circumvent the overfitting of a DL model to a specific tissue type given in a training dataset), we intentionally introduced a backpropagation operator to disturb the statistical distribution of input data. Specifically, a hologram measured at z=d1 is numerically propagated back to the image plane using the angular spectrum method (i.e., digitally propagated by the distance of d1). This backpropagated hologram can be considered as the correct object field superimposed with twin-image and self-interference terms (see Note S1 in the Supplementary Material). We found that the additional artifact terms constitute a data domain less affected by tissue morphology (Fig. 1). The proposed approach is similar to domain adaptation techniques36,40,41 in the sense that it disturbs the data space and makes the data sampled from different tissue types (i.e., OOD tissue types) indistinguishable in the input space, thus extending the adaptability of the neural network.
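As a concrete illustration of this operation, the following minimal numpy sketch (function and variable names are ours, not the authors' code) backpropagates a hologram measured at z = d₁ to the image plane with the angular spectrum method; here we assume the measured amplitude, the square root of the intensity, is the quantity propagated back:

```python
import numpy as np

def angular_spectrum(field, wavelength, pixel_size, distance):
    """Propagate a complex field by `distance` (negative values backpropagate)
    using the angular spectrum method."""
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=pixel_size)
    fy = np.fft.fftfreq(ny, d=pixel_size)
    fxx, fyy = np.meshgrid(fx, fy)
    # Keep only propagating components; evanescent waves are zeroed out.
    arg = 1.0 / wavelength**2 - fxx**2 - fyy**2
    transfer = np.where(
        arg > 0,
        np.exp(1j * 2 * np.pi * distance * np.sqrt(np.clip(arg, 0.0, None))),
        0.0,
    )
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

# Hologram measured at z = d1, then numerically propagated back by -d1.
d1 = 15e-3                                           # 15 mm
obj = np.exp(1j * 0.5 * np.random.rand(256, 256))    # toy phase object
hologram = np.abs(angular_spectrum(obj, 532e-9, 6.5e-6, d1))**2
backpropagated = angular_spectrum(np.sqrt(hologram), 532e-9, 6.5e-6, -d1)
```

The resulting field contains the object field superimposed with additional interference terms, consistent with the decomposition described above.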

    Although perturbative operations on the input data domain can help reduce distinctions among different data distributions (e.g., diffraction patterns from different tissue types), they may degrade the network’s reconstruction performance if not properly controlled. This introduces an inherent trade-off between preserving sufficient information to solve the inverse problem with high accuracy and the potential introduction of artifacts that obscure domain distinctions (i.e., improving adaptability for unseen data). For instance, continuously adding Gaussian noise could eventually homogenize all data into Gaussian noise, removing interdomain distinctions at the cost of reconstruction accuracy. However, our findings indicate that backpropagation inherently achieves a balanced trade-off between adaptability and accuracy, enabling enhanced shape adaptability without compromising the network’s reconstruction performance. Although a few studies have examined the use of backpropagation in implementing DL phase retrieval networks,25,42–44 these approaches have not addressed the challenge of generalizing to diverse histopathological tissue morphologies. In contrast, to the best of our knowledge, this is the first study to leverage histopathological tissue samples to demonstrate a neural network’s shape generalization capability and to systematically incorporate contrastive learning as a framework for quantitatively analyzing latent representations, thereby verifying and understanding the achieved shape adaptability.
Contrastive learning has been widely utilized in medical imaging tasks—such as segmentation and classification—to establish well-structured latent feature spaces that enhance representational quality and support robust domain generalization.45–51 Building upon these advances, we apply contrastive learning to quantitatively measure the similarity of latent representations across diverse tissue morphologies, thereby providing a principled foundation for validating and interpreting our shape adaptability approach.

    2.3 Effective Reparameterization for Enhanced Adaptability to Imaging Configuration

    In our proposed method, we incorporate a refined physical forward model to facilitate the generalization of the network to system variables for in-line digital holography. The conventional forward model for in-line holographic microscopy entails five parameters, namely, wavelength (λ), numerical aperture (NA), magnification (M), pixel size (p), and propagation distance (d), to describe light propagation, as follows:52,53

$$H(m, n; d, \lambda, p, M, \mathrm{NA}) = \exp\left[i 2\pi d \sqrt{\frac{1}{\lambda^{2}} - \left(\frac{m}{Np}\right)^{2} - \left(\frac{n}{Np}\right)^{2}}\,\right], \quad \text{s.t.} \;\; \left(\frac{m}{Np}\right)^{2} + \left(\frac{n}{Np}\right)^{2} \le \left(\frac{\mathrm{NA}}{\lambda M}\right)^{2}.$$

    It is assumed that all holographic microscopy configurations satisfy the Nyquist sampling condition for complex-valued fields. The inequality represents the constraint imposed by the objective lens, which confines the spatial frequency spectrum of the light propagating through the microscopy system. In the Fresnel diffraction regime, we formalize a refined physical forward model with two nondegenerate parameters, peff and deff,

$$H(m, n; p_{\mathrm{eff}}, d_{\mathrm{eff}}) = \exp\left\{-i\pi d_{\mathrm{eff}}\left[\left(\frac{m}{N p_{\mathrm{eff}}}\right)^{2} + \left(\frac{n}{N p_{\mathrm{eff}}}\right)^{2}\right]\right\}, \quad \text{s.t.} \;\; \left(\frac{m}{N p_{\mathrm{eff}}}\right)^{2} + \left(\frac{n}{N p_{\mathrm{eff}}}\right)^{2} \le 1.$$

    Here, deff and peff are defined as deff = NA²d/(λM²) and peff = NAp/(λM), respectively. The effective distance, deff, is a unitless parameter characterizing the extent to which the object’s complex field undergoes diffraction. As a consistent measure of diffraction effects, it eliminates the ambiguity in the mapping from object field to diffraction pattern mediated by the physical system parameters, offering a more generalized and adaptable description of the imaging process. For example, two optical imaging systems, one with λ = 400 nm, M = 20, and NA = 0.4 and another with λ = 625 nm, M = 10, and NA = 0.25, yield the same effective distance at any shared physical propagation distance and therefore produce the same diffraction effect.
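These definitions can be checked numerically; the following sketch (in our own naming, not the authors' code) confirms that the two example systems collapse onto the same effective distance at any shared physical propagation distance:

```python
def effective_params(wavelength, magnification, na, distance, pixel_size):
    """Effective parameters of the refined forward model:
    d_eff = NA^2 d / (lambda M^2),  p_eff = NA p / (lambda M)."""
    d_eff = na**2 * distance / (wavelength * magnification**2)
    p_eff = na * pixel_size / (wavelength * magnification)
    return d_eff, p_eff

# The two systems from the text, evaluated at an arbitrary shared distance.
d = 10e-3  # 10 mm, arbitrary
d_eff_a, _ = effective_params(400e-9, 20, 0.40, d, 6.5e-6)
d_eff_b, _ = effective_params(625e-9, 10, 0.25, d, 6.5e-6)
print(d_eff_a, d_eff_b)  # identical: same diffraction effect
```

As a sanity check against the paper's own numbers, plugging in the training configuration (λ = 532 nm, M = 300/9, NA = 0.4, p = 6.5 μm, d = 5 mm) gives deff ≈ 1.353 and peff ≈ 0.147, matching the ranges quoted in Sec. 2.4.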

    Effective pixel size, peff, is a unitless parameter for characterizing the resolution capabilities of an optical imaging system. Unlike the physical pixel size of the detector, effective pixel size provides a standardized measure of the system’s resolution relative to the spatial frequency content of the hologram, representing the proportion of meaningful information recorded by the pixel array of the sensor. In a typical setup, the values of effective pixel size range from 0.05 to 0.25. It should be noted that, when the effective pixel size surpasses 0.25, the hologram image does not fully satisfy the Nyquist sampling condition due to the presence of the self-interference terms. Further explanation of the derivation of the refined forward model is available in Note S2 in the Supplementary Material.

    By condensing the original five variables into two nondegenerate parameters, the refined forward model eliminates the inherent degeneracy found in traditional models. This simplification allows for a more straightforward and efficient representation of holograms across a wide range of wavelengths, magnifications, and NAs, making it easier for neural networks to learn and generalize from the data. The removal of variable degeneracy also enables uniform sampling across the hologram data space.

    Figure 2 shows the sampling results of hologram space. For the conventional forward model, parameters—except for the wavelength—are randomly sampled from a typical range of parameters in in-line holographic microscopy. As shown in Fig. 2(a), the degeneracy causes the hologram space, represented by peff and deff, to be unevenly sampled and concentrated in only a subset of the entire space. Conversely, the sampling results of the refined forward model, where peff and deff are directly sampled, demonstrate the uniform coverage of the hologram space, as shown in Fig. 2(b). The comparison of the sampled regions in the conventional forward model results and the refined forward model emphasizes that the refined forward model guarantees a balanced dataset, whereas traditional models based on physical parameters often lead to biased learning, confined to specific subsets of the hologram distribution. Detailed discussions regarding the degeneracy problem are available in Note S3 in the Supplementary Material.
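The sampling comparison of Fig. 2 can be reproduced in outline as follows (a sketch with our own variable names; the in-box fraction at the end is our own diagnostic, not a quantity reported in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, wavelength = 10_000, 532e-9  # wavelength fixed, as in Fig. 2(a)

# Conventional model: sample the physical parameters, then map them into
# the (p_eff, d_eff) plane; the mapping is degenerate and non-uniform.
p  = rng.uniform(2e-6, 10e-6, n)   # pixel size
d  = rng.uniform(5e-3, 20e-3, n)   # propagation distance
na = rng.uniform(0.1, 0.4, n)      # numerical aperture
M  = rng.uniform(5, 20, n)         # magnification
p_eff = na * p / (wavelength * M)
d_eff = na**2 * d / (wavelength * M**2)

# Refined model: sample the two nondegenerate parameters directly,
# giving uniform coverage of the target region of hologram space.
p_eff_u = rng.uniform(0.05, 0.25, n)
d_eff_u = rng.uniform(1.3, 5.4, n)

# Fraction of conventional samples that even land in the target region.
in_box = ((p_eff >= 0.05) & (p_eff <= 0.25) &
          (d_eff >= 1.3) & (d_eff <= 5.4)).mean()
print(f"conventional samples inside target region: {in_box:.1%}")
```

Scatter-plotting (p_eff, d_eff) against (p_eff_u, d_eff_u) reproduces the uneven versus uniform coverage contrast shown in Fig. 2.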

    Figure 2.Comparison of sampling results of hologram space using conventional and refined forward models. (a) Sampling results using the conventional forward model, where pixel size (2 to 10  μm), propagation distance (5 to 20 mm), NA (0.1 to 0.4), and magnification (5× to 20×) were randomly sampled 10,000 times from a uniform distribution for each parameter, with a fixed wavelength of 532 nm. The sampled values were converted to effective pixel size (peff) and effective distance (deff) and plotted. The orange dotted box highlights the region corresponding to the sampling range used in the refined forward model for our experiments. (b) Sampling results using the refined forward model, where peff and deff were randomly sampled 10,000 times from uniform distributions with ranges of 0.05 to 0.25 and 1.3 to 5.4, respectively.

    2.4 Self-Supervised Complex Field Reconstruction from Single Imaging System Data

    In this study, we employed a self-supervised approach to train a deep neural network for reconstructing the complex-valued field of a tissue section from a single hologram. By utilizing numerically generated holograms derived from complex field data obtained through off-axis interferometry, our method eliminates the necessity for measured hologram data during training, thereby streamlining the data acquisition process.

    2.4.1 Proposed framework: refined forward model and backpropagation

    Our proposed DL training scheme integrates backpropagation and a refined physical forward model to enhance adaptability and generalization across diverse imaging configurations. The training process involves the following steps:

    1. Scaled complex field generation: Starting with a dataset of complex-valued fields, we generate modified complex fields by applying cropping or zero-padding operations in Fourier space. This results in complex-valued fields corresponding to randomly sampled effective pixel sizes.
    2. Hologram synthesis: The scaled complex-valued fields are numerically propagated using randomly sampled effective distances and corresponding effective pixel sizes. This synthesis is performed within the refined forward model, which incorporates random sampling of the effective pixel size and effective propagation distance to facilitate generalization to arbitrarily configured holographic imaging systems.
    3. Backpropagation enhancement: The generated holograms undergo backpropagation to the image plane before being input into the neural network, enhancing the network’s adaptability to varying shapes and structures.
    4. End-to-end training: The network is trained end to end, optimizing the loss function based on the discrepancy between the network’s output and the scaled ground-truth complex fields through iterative minimization.
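Steps 2 and 3 of this procedure can be sketched with the refined transfer function (a minimal illustration under our own naming and sign convention; we also assume the hologram amplitude, rather than the intensity, is backpropagated):

```python
import numpy as np

def refined_transfer(n_pix, p_eff, d_eff, backward=False):
    """Refined Fresnel transfer function H(m, n; p_eff, d_eff), with the
    band limit (m / (N p_eff))^2 + (n / (N p_eff))^2 <= 1."""
    m = np.fft.fftfreq(n_pix) * n_pix          # integer frequency indices
    mx, my = np.meshgrid(m, m)
    r2 = (mx / (n_pix * p_eff))**2 + (my / (n_pix * p_eff))**2
    sign = 1.0 if backward else -1.0           # convention-dependent
    transfer = np.exp(sign * 1j * np.pi * d_eff * r2)
    transfer[r2 > 1] = 0.0                     # objective-lens band limit
    return transfer

def make_training_pair(field, p_eff, d_eff):
    """Steps 2-3: synthesize a hologram with the refined forward model,
    then backpropagate its amplitude to the image plane."""
    n_pix = field.shape[0]
    spectrum = np.fft.fft2(field)
    hologram = np.abs(
        np.fft.ifft2(spectrum * refined_transfer(n_pix, p_eff, d_eff)))**2
    back = np.fft.ifft2(
        np.fft.fft2(np.sqrt(hologram)) *
        refined_transfer(n_pix, p_eff, d_eff, backward=True))
    return back, field  # network input, ground truth
```

In the full scheme, make_training_pair would be invoked with peff and deff drawn uniformly from the sampling ranges stated in the text, so that each epoch covers the hologram space.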

    For the proposed method, we sampled the effective pixel size from 0.05 to 0.25 and the effective distance from 1.353 to 5.413. These ranges correspond to an imaging system with a wavelength of 532 nm, a magnification of 300/9, and an NA of 0.4, covering pixel sizes from 2.21 to 11.08  μm and physical propagation distances from 5 to 20 mm. We refer to this framework as the “proposed” model.

    2.4.2 Baseline model for ablation study

    To evaluate the effectiveness of our proposed enhancements, we defined a “baseline” model. This model utilizes a conventional physical forward model to generate holograms from given complex-valued fields with the following characteristics:

    1. Only the physical propagation distance is randomly sampled within the range of 5 to 20 mm.
    2. Other variables, including wavelength (532 nm), magnification (300/9), NA (0.4), and pixel size (6.5  μm), are fixed.

    This approach aligns with common practices in previous studies, which typically employ fixed imaging parameters and directly use the synthesized holograms as inputs. By adopting this baseline configuration, we closely approximate a standard scenario characterized by limited variability. Consequently, comparing our proposed model against this baseline isolates the contributions of backpropagation and the refined forward model, enabling a clear demonstration of our approach’s advantages over conventional methods. Visual depictions of the training procedures for each approach, along with further details on the training process, are provided in Sec. 5.2 and Fig. S1 in the Supplementary Material.

    3 Results

    3.1 Latent Space Comparison of Histopathologic Tissue Types

    To investigate the effect of the backpropagation operator on the generalization of tissue morphology, we performed contrastive learning on the measured complex field data from five distinct tissue types: rectum, colon, small bowel, breast, and appendix, with the aim of quantitatively evaluating the morphological variations across these diverse tissue types. To further categorize the data, we defined three tissue data categories: in-focused field, propagated field, and backpropagated hologram. The in-focused field presents inherent morphological differences among tissue types, whereas the other two categories represent the input domains of baseline and proposed methods, respectively. Specifically, we employed the SimCLR algorithm54 to learn the latent space representation of 25 types of data, encompassing the five tissue types in each of the three categories with two propagation distances, 5 and 15 mm. For network training, we used a total of 14,400 complex images, including 576 complex fields measured for each tissue type, each sized 256  pixel×256  pixel. We then examined the trained latent space using 572 new patches for each tissue type that were not encountered during training. For the testing phase, we utilized 512-dimensional intermediate vectors for latent space visualization and Fréchet distance (FD) calculation. Lower FD scores indicate higher morphological similarity. As shown in Fig. 3, the latent space representation of in-focused fields and propagated fields shows distinct clusters for each tissue type. Specifically, tissue types such as rectum and colon, and appendix and breast, clustered together, indicating that the distribution of tissue data forms several subsets based on their morphological characteristics. The clusters in the latent space of propagated fields elucidate the OOD data problems that arise in conventional phase retrieval DL models when trained with limited types of tissue data. 
Conversely, the backpropagated holograms used in the proposed method form a unified distribution, suggesting that backpropagation can mitigate distinctions among subsets in the input domain, effectively serving as domain adaptation. This outcome strongly correlates with the representative images in Fig. 3, where the structural details visible in in-focused and propagated field images are obscured in backpropagated holograms. For the contrastive learning analysis results at both 5 and 15 mm, see Fig. S2 in the Supplementary Material.

    Figure 3.Effect of the backpropagation operator. Contrastive learning was applied to complex-valued fields from five tissue types: rectum, colon, small bowel, breast, and appendix, across three data categories: “in-focused field,” “propagated field,” and “backpropagated hologram,” as shown in the left panels. The upper row shows the real values, and the lower row shows the imaginary values for each data category. “In-focused field” refers to complex-valued fields captured via off-axis interferometry. Propagated field was generated through numerical propagation of the in-focused field. Backpropagated hologram was simulated by numerically backpropagating the intensity map of propagated field. Both propagated field and backpropagated hologram were simulated with a propagation distance of 15 mm. Middle panels show 2D visualizations obtained from the SimCLR network, where dimensionality reduction of the 512D intermediate vectors was performed using t-SNE. The right panels display FDs between rectum tissue and other tissue types, representing statistical distances within the 512D latent space of the SimCLR network. Scale bar: 20  μm.

    The FD scores in Fig. 3 quantify the statistical distances between the distribution of rectum tissue and other tissue types. Both the in-focused field and the propagated field exhibit high FD scores for all tissue types except for the colon, which is consistent with the t-distributed stochastic neighbor embedding (t-SNE) analysis. The FD scores for in-focused and propagated fields follow a consistently increasing order: colon, small bowel, breast, and appendix, highlighting the varying degrees of morphological differences between the rectum and other tissue types. However, the backpropagated holograms present significantly lower FD scores for all tissue types, suggesting that incorporating the backpropagation operation enhances the network’s ability to generalize across different tissue morphologies.
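The FD between two sets of latent vectors is the Fréchet distance between Gaussians fitted to them. A compact sketch of this computation (our own implementation, using the product-eigenvalue route to the trace of the matrix square root; not the authors' code) is:

```python
import numpy as np

def frechet_distance(x, y):
    """Frechet distance between Gaussians fitted to two sample sets
    (rows are samples, columns are latent dimensions)."""
    mu1, mu2 = x.mean(axis=0), y.mean(axis=0)
    s1 = np.cov(x, rowvar=False)
    s2 = np.cov(y, rowvar=False)
    # tr((s1 s2)^{1/2}) from the eigenvalues of the product; for covariance
    # matrices these are real and non-negative up to numerical noise.
    eigvals = np.linalg.eigvals(s1 @ s2)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(np.sum((mu1 - mu2)**2)
                 + np.trace(s1) + np.trace(s2) - 2.0 * tr_sqrt)
```

Applied to the 512-dimensional SimCLR vectors of two tissue types, a lower score indicates that their latent distributions, and hence their morphologies as seen by the network, are closer.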

    3.2 Improved Adaptability to Tissue-induced OOD Data

    Figure 4 presents the reconstruction results from both networks (baseline and proposed) trained exclusively with rectum tissue data. To comparatively evaluate the networks’ shape adaptability, we performed the test with holograms of other tissue types (i.e., OOD data), including colon, small bowel, breast, and appendix tissues, along with those of rectum tissues (i.e., in-distribution data). For the measured holograms of the rectum, both networks achieved highly accurate reconstruction results, as indicated by high Pearson correlation coefficient (PCC) values. However, when tested with OOD tissue types, the reconstruction performance of both networks deteriorated. Notably, the degree of degradation corresponded to the preceding SimCLR results: tissue types with lower FD scores, such as colon and small bowel, exhibited less degradation, whereas those with higher FD scores, such as breast and appendix, showed more significant degradation. These findings suggest that our latent space analysis successfully quantifies the extent of morphological differences among tissue types. Compared with the baseline method, our proposed method demonstrated superior reconstruction performance for all tissue types tested. Specifically, appendix tissues, which had the highest FD scores for both the in-focused and propagated fields, showed the most significant degradation with the baseline method, whereas the proposed method demonstrated substantial improvement.


    Figure 4. Demonstration of enhanced generalization capability for OOD tissue types. Rectum tissue, used for training both the baseline and proposed models, represents the in-distribution data. In contrast, colon, small bowel, breast, and appendix tissues represent OOD tissue types that were not encountered during training. (a) Holographic reconstruction results from both the baseline and proposed methods. The input holograms were captured with an image-to-sensor distance (propagation distance) of 20 mm. (b) Mean PCC scores of reconstructed images across a distance range of 5 to 20 mm. Each bar in panel (b) is accompanied by an error bar indicating the standard deviation, derived from 375 distinct 512 pixel × 512 pixel patches for each tissue type (25 patches per distance). Scale bar: 20 μm.

    We further confirmed the effectiveness of the backpropagation operation in improving adaptability to variations in tissue morphologies using complex-valued fields of liver tissues as the training dataset (see Fig. S3 in the Supplementary Material). Notably, in contrast to rectum tissues, liver tissues exhibit distinctive morphological features of densely packed hepatocytes with polygonal shape and relatively large size. In addition, the proposed scheme demonstrates enhanced reconstruction performance compared with a zero-shot approach,23 which is specifically trained for shape adaptability using a physical consistency loss applied to an artificial complex-valued field dataset with morphological diversity, and which requires two holograms for inference (see Fig. S4 in the Supplementary Material).

    3.3 Improved Adaptability to System-induced OOD Data

    To demonstrate the enhanced adaptability of the proposed method to perturbations in system variables, we evaluated the network’s performance using rectum data captured under different imaging configurations. As in previous experiments, we compared the reconstruction results of the baseline and proposed methods, both trained on the rectum tissue dataset from a single imaging system configured with a wavelength of 532 nm, a magnification of 300/9, an NA of 0.4, and a pixel size of 6.5 μm. This reference configuration yields an effective pixel size (peff) of 0.1466. Figure 5 presents the reconstruction results from various in-line holographic microscopy systems, including the one with the reference configuration (i.e., system 1). Systems 2 and 3, representing common OOD scenarios in in-line digital holography setups based on conventional benchtop microscopy, employed different objective lenses while using the same image sensor as the training system (i.e., maintaining the same physical pixel size). Specifically, the magnification and NA were set to 150/9 and 0.25 for system 2 and to 75/9 and 0.1 for system 3, resulting in effective pixel sizes of 0.1833 and 0.1466, respectively. We note that, even for the same tissue type and the same effective pixel size, the appearance of the hologram and the complex-valued field can vary significantly depending on the specific imaging configuration, as exemplified by systems 1 and 3.
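For reference, the dimensionless effective parameters quoted in this section can be reproduced by normalizing the object-side pixel size p/M by the lateral diffraction scale λ/NA and the object-side propagation distance z/M² by the axial scale λ/NA². This normalization is our inference from the reported values, not a definition taken from the authors' code:

```python
def effective_params(pixel_um, mag, na, z_mm, wavelength_um=0.532):
    """Dimensionless effective pixel size and propagation distance.
    Inferred normalization: object-side pixel (p/M) over the lateral
    diffraction scale lambda/NA; object-side distance (z/M^2) over the
    axial scale lambda/NA^2. Reproduces the values quoted in the text."""
    p_eff = (pixel_um / mag) / (wavelength_um / na)
    d_eff = (z_mm * 1e3 / mag**2) / (wavelength_um / na**2)
    return p_eff, d_eff

# Reference configuration (system 1): 300/9 magnification, NA 0.4,
# 6.5 um pixel, propagation distance 20 mm
p_eff, d_eff = effective_params(6.5, 300 / 9, 0.4, 20)
print(f"{p_eff:.4f}, {d_eff:.2f}")  # ~0.1466 and ~5.41, cf. Fig. 5 (system 1)
```

The same expressions reproduce the values reported for systems 2 to 4 (e.g., peff = 0.0677 for the 150/9, 0.25 NA, 2.4 μm configuration).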


    Figure 5. Demonstration of enhanced generalization capability for OOD imaging configurations. The schematic on the left displays the imaging systems used for capturing the holograms. System 1, which has identical imaging configurations to the rectum training data, represents the in-distribution imaging configuration with a peff value of 0.1466. Systems 2 to 4 represent OOD imaging configurations with peff values of 0.1833, 0.1466, and 0.0677, respectively. The ground truth phase and the holographic reconstruction results for the baseline and proposed methods are shown on the right, using holograms with propagation distances of 20 mm (5.413), 12 mm (5.075), 15 mm (4.06), and 12 mm (5.075) for systems 1 to 4, respectively. The values in parentheses are the corresponding effective distances for each configuration. Scale bar: 30 μm.

    In contrast to the baseline model, which generated blurred and morphologically less accurate phase images, the proposed model produced reliable reconstruction results for systems 2 and 3. Likewise, for system 4, which utilizes a different objective lens and image sensor (i.e., with magnification, NA, and pixel size set to 150/9, 0.25, and 2.4 μm, respectively, yielding an effective pixel size of 0.0677), the reconstruction result from the baseline model was severely degraded. However, the proposed model achieved highly accurate reconstruction, demonstrating its robustness and superior adaptability. Figure S5 in the Supplementary Material quantitatively analyzes the reconstruction performance of both the baseline and proposed methods, with physical propagation distances set to 5 to 20 mm for system 1, 4 to 12 mm for systems 2 and 4, and 5 to 15 mm for system 3, corresponding to effective distances of 1.353 to 5.413 for system 1, 1.691 to 5.075 for systems 2 and 4, and 1.353 to 4.06 for system 3. We note that the proposed method consistently outperforms the baseline method across this wide range of effective pixel sizes and distances.

    To confirm that the effective parameterization within the refined forward model contributes to enhanced adaptability across diverse imaging configurations, we conducted additional experiments using holograms with deff and peff values that extend beyond the training range. Notably, the proposed model exhibits reliable reconstruction performance even outside the random sampling range for both effective pixel size and distance. For example, the effective distance was sampled within a range of 1.353 to 5.413, corresponding to physical distances of 5 to 20 mm in the reference configuration on which the baseline was trained. The distinction between the proposed and baseline methods becomes clearer when the physical distance extends beyond the training range (see Fig. S6 in the Supplementary Material for evaluation results over an extended range of propagation distances, 1 to 50 mm, corresponding to deff values of 0.271 to 13.534). This indicates that effective parameterization for even sampling of the hologram distribution allows the proposed method to learn robustly from diverse holograms, thereby enhancing adaptability to previously unseen hologram distributions. However, we note that when imaging parameters deviate extremely from the trained range, the performance of the proposed method may degrade, as reflected in the PCC scores shown in Fig. S6 in the Supplementary Material. We observed similar results for effective pixel sizes beyond the sampling range (see Fig. S7 in the Supplementary Material for a detailed analysis over a wide range of pixel sizes, from 2.21 to 22.1 μm, corresponding to peff values of 0.05 to 0.5).

    3.4 Comprehensive Generalization to Tissue and System Variability

    Building on the results of the previous experiments, we aimed to demonstrate the robustness and versatility of the proposed method in handling both diverse tissue types and imaging system configurations. To this end, we examined the reconstruction performance for holograms of the appendix and colon, captured using an imaging system equipped with a 10× magnification objective lens, a 0.25 NA, and a sensor pixel size of either 6.5 or 2.4 μm. The baseline and proposed network models were the same as in the previous experiments, both trained on rectum tissue data acquired using a 20× objective lens, a 0.4 NA, and a 6.5 μm sensor pixel size. As depicted in Figs. 6(a) and 6(b), the proposed method consistently produced accurate reconstruction results across the full range of distances, irrespective of tissue type and imaging system configuration. In contrast, the baseline method showed significant degradation in performance, failing to reliably retrieve phase information due to its limited adaptability to varying tissue morphologies and system configurations.


    Figure 6. Generalization capability for OOD holograms with distinct tissue types and imaging configurations. The test holograms were captured from OOD tissues (appendix and colon) and OOD imaging systems (objective lens with 10× magnification and 0.25 NA) and evaluated using networks trained on the baseline and proposed methods with rectum data captured from imaging systems equipped with a 20×, 0.4 NA objective lens, and 6.5 μm pixel size. (a) Holographic reconstruction results of the appendix with a pixel size of 6.5 μm at different propagation distances. The upper row displays the results from the baseline method, whereas the lower row shows the results from the proposed method, at increasing propagation distances. (b) Holographic reconstruction results of the colon with a pixel size of 2.4 μm at different propagation distances. Scale bar: 20 μm.

    Furthermore, we confirmed the generalization capability of the network trained exclusively on liver tissue data, which has substantially different morphological characteristics from appendix and colon tissues (see Fig. S8 in the Supplementary Material). All quantitative evaluation results corresponding to Figs. S3–S6 and S8 are shown in Tables S1–S5 in the Supplementary Material.

    3.5 Robust Holographic Reconstruction for Cancerous Tissues across Diverse Imaging Systems

    Holographic imaging techniques hold significant promise in disease diagnosis, particularly for cancer detection. Recently, reconstructed phase images of cancerous tissues have enabled advanced DL-based diagnostic techniques such as virtual staining and automated diagnosis.55–60 However, the diverse morphologies exhibited by cancerous tissues, which often differ from those of normal tissues and vary significantly across cancer types, pose a challenge for DL-based image reconstruction, necessitating an approach that can accommodate the diverse tissue morphologies caused by varying pathological states. In addition to addressing this morphological diversity, such DL approaches must also be adaptable to different microscopic imaging configurations if they are to be widely adopted across hospitals and research facilities.

    We verified the effectiveness of our proposed method in reconstructing holographic images of cancerous tissues (moderately differentiated rectum adenocarcinoma) across various imaging systems. Figure 7(a) illustrates the experimental setup: the network, trained using normal rectum tissues captured by a single imaging system (20×, 0.4 NA, and 6.5 μm pixel size), was tested on holograms of cancerous tissues obtained under different magnifications (5× to 20×) and NAs (0.1 to 0.4). As shown in Fig. 7(b), the proposed method consistently reconstructs detailed phase information across all tested imaging conditions. Notably, in the reconstruction results from the 20×, 0.4 NA hologram, critical diagnostic features such as dirty necrosis are accurately reconstructed, as indicated by white arrows in Fig. 7(b). Dirty necrosis, characterized by cellular debris including dead epithelial cells, neutrophils, and nuclear fragments, serves as a hallmark of gland-forming cancers and is absent in normal tissues.61–64 This result underscores the method’s capability to provide critical diagnostic information. In contrast, the baseline model generates degraded reconstruction results, failing to preserve the detailed morphological features of the cancerous tissues, as confirmed by lower evaluation metrics (see Fig. S9 and Table S6 in the Supplementary Material).


    Figure 7. Robust phase reconstruction of colorectal cancer tissues across diverse imaging configurations using the proposed method. (a) Schematic of the training and testing process. Normal rectum tissue images, captured with a 20× objective lens, 0.4 NA, and a sensor with a 6.5 μm pixel size, were used for training. The trained model was tested on cancerous rectum tissue holograms acquired using imaging systems with varying configurations (5× to 20× objective lenses, 0.1 to 0.4 NA, and a sensor with a 6.5 μm pixel size). (b) Holographic reconstruction results for holograms of cancerous tissues. Ground-truth phase images and the corresponding reconstruction results by the proposed method are shown for systems with 5×/0.1 NA, 10×/0.25 NA, and 20×/0.4 NA. Blue dotted boxes in the low-magnification images (5× and 10×) indicate the corresponding fields of view seen at higher magnification (10× and 20×, respectively). Yellow dotted boxes in the 20× images indicate the zoomed-in regions of interest, highlighting areas of dirty necrosis in the cancerous tissue. White arrows in the zoomed-in regions point to cell nuclei within these areas.

    To further assess robustness under more challenging conditions, we tested the proposed method on colorectal cancer tissues presenting various pathological states (well differentiated, moderately differentiated, and poorly differentiated), using imaging systems operating at significantly different wavelengths (e.g., 610 and 480 nm) and magnification (10×, 0.25 NA) from those employed during training (532 nm, 20×, and 0.4 NA). As detailed in Fig. S10 and Table S7 in the Supplementary Material, the proposed method outperforms the baseline approach across all tested configurations.

    Finally, we compared our method against several widely adopted reference methods—PhaseGAN,28 OT-CycleGAN,20 and GedankenNet-Phase23—under these extreme clinical application scenarios. As shown in Fig. S11 and Table S8 in the Supplementary Material, our method surpasses these established approaches, demonstrating robust performance despite substantial perturbations in imaging configurations and pathological states.

    4 Discussion

    In this study, we proposed a DL-based holographic reconstruction method designed to enhance the adaptability of neural networks to OOD data. Based on a contrastive learning scheme, we first confirmed that the backpropagation operation acts as a domain adaptation technique, effectively reducing discrepancies among tissues with varying morphologies. By integrating the backpropagation operation and a forward model with effective parameterization into the network’s learning scheme, the proposed method exhibited exceptional stability in handling the inherent variability in in-line digital holographic microscopy for histopathology. Furthermore, the method proved its clinical applicability, particularly in cancer diagnosis, by accurately reconstructing microscopic morphological features of cancerous tissues, despite being trained exclusively on normal tissue data. We anticipate that, given the diversity in microscopic biological structures, pathological states, and imaging configurations, such reliable and accurate reconstruction under varying conditions would be crucial for the practical use of digital holography in biomedical applications, such as cell mechanics and disease diagnosis. To provide an overview of the range of conditions evaluated in this work, we present a summary of the examined tissue types, imaging configurations, and pathological states in Table S9 in the Supplementary Material.

    Interestingly, the backpropagation operation and the effective parameterization of the forward model synergistically improve adaptability in holographic reconstruction. For instance, we achieved robust phase reconstruction from holograms with varying diffraction resolution limits. Even under ideal in-focus interferometric imaging conditions, phase images of the same tissue slides can have appearances that vary significantly depending on the NA, magnification, and pixel size of the imaging system. Along with the parameterization technique, the backpropagation operation, which confers shape adaptability, allows robust holographic reconstruction at a resolution physically consistent with the law of diffraction. Therefore, the spatial-bandwidth product (SBP) of an output complex-valued field map matches the SBP of the field captured by the input imaging system, rather than being restricted to the SBP of the images presented in the training dataset. In addition, this combined approach provides the network with superior robustness to stochastic variables, such as shot noise in digital holography, as evidenced by the results presented in Fig. S12 and detailed in Note S5 in the Supplementary Material. However, when the imaging configuration deviates significantly from the training range or the imaging system is exposed to high levels of stochastic noise, the performance of the proposed method may degrade, as shown in Figs. S6, S7, and S10 in the Supplementary Material.

    The backpropagation operation used in our training scheme can be considered an input domain adaptation technique. Based on physical insights or domain-specific expertise, an alternative input domain can be configured to offer improved adaptation capability. For instance, through a random mixing operation, the domain of tissue holograms can be converted into the domain of speckle patterns, which present the same statistical features regardless of tissue type. Moreover, considering that existing domain adaptation techniques are often applied not only in the input domain but also in the feature and output domains,36,41 integrating these complementary domain adaptation strategies could lead to more accurate and reliable holographic imaging in diverse clinical and research settings.

    In conclusion, our study presents an adaptable reconstruction model for in-line digital holography and demonstrates its robustness in handling diverse tissue types across different pathological states and imaging configurations in histopathologic inspection. This adaptability makes our framework applicable beyond digital holography. The combined approach of domain adaptation techniques and effective representation of physical forward models can be widely applied to extend the adaptability of image reconstruction models for other modalities. We anticipate that our proposed framework will be particularly useful in addressing the challenges of diversity in tissue types, patient-specific anatomical differences, and instrument-wise variability in medical imaging modalities, such as ultrasound imaging, computed tomography, and magnetic resonance imaging.

    5 Methods and Materials

    5.1 Experimental Data Acquisition and Preparation

    As the source of tissue data, tissue slides were acquired from SuperBioChips Laboratories. Images of rectum, small bowel, colon, appendix, breast, and liver tissues were obtained from AA9 (normal organs, unstained, 4 μm), and images of the rectum adenocarcinoma were obtained from CD4 (colon and rectum cancer, unstained, 4 μm).

    Complex-valued field data used in this paper were acquired using interferometry and multi-height phase retrieval (MHPR). Interferometry was employed to image normal tissue with 20× magnification and a 0.4 NA, which provided the training data for the networks and the experimental data depicted in Figs. 3–5 and Figs. S2, S3, S4, S6, S7, S12, S14, and S15 in the Supplementary Material. For this acquisition, we used a custom-built Mach–Zehnder interferometer [see Fig. S13(a) in the Supplementary Material]. The laser source (Cobolt Samba, 532 nm) was spatially filtered and collimated before being split into sample and reference beams via a beam splitter. In the sample beam path, the incident plane wave was weakly scattered by the object placed at the object plane and relayed through the objective lens and a 300-mm tube lens. The sample beam interfered with the reference beam at the image plane, and the resulting interference intensity map was recorded using a scientific complementary metal–oxide–semiconductor (sCMOS) camera (pco.edge 5.5, pixel size 6.5 μm). The complex-valued field of the object was retrieved using a Fourier transform–based off-axis holography reconstruction algorithm. This was followed by phase unwrapping and ramp correction to address phase wrapping and linear background phase gradients.
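The Fourier-transform off-axis step admits a compact sketch: isolate the +1-order sideband around the carrier frequency, re-center it at DC, and inverse transform. This is the textbook procedure rather than the authors' exact implementation; `carrier` (the sideband peak location in the shifted spectrum) and `bandwidth` are hypothetical tuning parameters:

```python
import numpy as np

def offaxis_reconstruct(interferogram, carrier, bandwidth):
    """Fourier-transform off-axis demodulation sketch.
    carrier: (row, col) of the +1-order peak in the fftshifted spectrum;
    bandwidth: half-width in pixels of the cropped sideband window."""
    spec = np.fft.fftshift(np.fft.fft2(interferogram))
    cy, cx = spec.shape[0] // 2, spec.shape[1] // 2
    fy, fx = carrier
    # cut out the sideband and re-center it on the DC position,
    # which removes the tilted-reference carrier fringes
    recentred = np.zeros_like(spec)
    recentred[cy - bandwidth:cy + bandwidth, cx - bandwidth:cx + bandwidth] = \
        spec[fy - bandwidth:fy + bandwidth, fx - bandwidth:fx + bandwidth]
    return np.fft.ifft2(np.fft.ifftshift(recentred))
```

`np.angle` of the returned field would then feed the phase unwrapping and ramp-correction steps described above.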

    MHPR was employed to reconstruct complex field data of normal tissue at 10× and 5× magnifications and cancerous tissue at 20×, 10×, and 5× magnifications. For hologram acquisition using MHPR, we constructed a custom in-line holography setup, as illustrated in Fig. S13(b) in the Supplementary Material, equipped with 20×, 10×, and 5× objective lenses and a 300-mm tube lens, utilizing a charge-coupled device (CCD) camera (FLIR, BFS-U3-63S4M-C, pixel size 2.4 μm) and an sCMOS camera (pco.edge 5.5, pixel size 6.5 μm). Holograms were captured by translating the sensor with a motion controller (Newport, ESP300, Irvine, California, United States) that offers a sensitivity of 5 nm. Specifically, the camera was moved in 1-mm increments, allowing us to obtain holograms over a wide range of propagation distances up to 50 mm. The acquired holograms were then used to reconstruct the complex field data using custom-built MATLAB code for MHPR. These holograms were utilized not only for the reconstruction of complex field data but also as inputs for the trained network in the experiments.
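MHPR can be sketched as a Gerchberg–Saxton-style loop: propagate the current field estimate between the measured heights with the angular spectrum method, and at each height replace the amplitude with the square root of the measured intensity while keeping the phase. A simplified sketch under those assumptions, not the authors' custom MATLAB code:

```python
import numpy as np

def angular_spectrum(field, z_um, pitch_um, wl_um=0.532):
    """Propagate a complex field by z_um using the angular spectrum method."""
    n = field.shape[0]
    f = np.fft.fftfreq(n, d=pitch_um)
    fx, fy = np.meshgrid(f, f)
    arg = 1.0 / wl_um**2 - fx**2 - fy**2          # squared longitudinal frequency
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    H = np.exp(1j * kz * z_um) * (arg > 0)        # drop evanescent components
    return np.fft.ifft2(np.fft.fft2(field) * H)

def mhpr(intensities, z_list_um, pitch_um, iters=50, wl_um=0.532):
    """Multi-height phase retrieval sketch: cycle through the measured
    heights, keeping the propagated phase and replacing the amplitude
    with the measured one at each plane."""
    field = np.sqrt(intensities[0]).astype(complex)   # start at the first height
    z_prev = z_list_um[0]
    for _ in range(iters):
        for meas, z in zip(intensities, z_list_um):
            field = angular_spectrum(field, z - z_prev, pitch_um, wl_um)
            field = np.sqrt(meas) * np.exp(1j * np.angle(field))
            z_prev = z
    return angular_spectrum(field, -z_prev, pitch_um, wl_um)  # back to object plane
```

In practice the measured heights would be the 1-mm sensor increments described above; the pixel pitch and height list here are illustrative parameters.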

    The acquired complex field data initially had dimensions of 1400 pixel × 1400 pixel. For the training dataset, the complex fields of rectum and liver tissues were cropped into 256 pixel × 256 pixel patches. To generate a large dataset, patches were created by cropping the 1400 × 1400 images with a 32-pixel stride, resulting in a total of 5600 patches for both rectum and liver data. By applying data augmentation techniques such as random vertical and horizontal flips, the total number of patches used for training increased to 22,400. For qualitative evaluation, holograms and complex fields of 1024 pixel × 1024 pixel, cropped from the initial 1400 × 1400 images and never encountered during training, were used. For quantitative evaluation, these images were cropped into 256 pixel × 256 pixel or 512 pixel × 512 pixel, depending on the experiment.
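The cropping and flip-augmentation steps can be sketched as follows (illustrative helpers, not the authors' code; as a point of arithmetic, a 1400 × 1400 field with a 256-pixel patch and a 32-pixel stride yields 36 × 36 = 1296 patch positions):

```python
import numpy as np

def extract_patches(field, patch=256, stride=32):
    """Crop overlapping patches from a 2D (possibly complex) field
    with a fixed stride, as described for the training dataset."""
    h, w = field.shape
    patches = [field[i:i + patch, j:j + patch]
               for i in range(0, h - patch + 1, stride)
               for j in range(0, w - patch + 1, stride)]
    return np.stack(patches)

def augment(patch):
    """Random vertical and horizontal flips, the augmentations named above."""
    if np.random.rand() < 0.5:
        patch = patch[::-1, :]
    if np.random.rand() < 0.5:
        patch = patch[:, ::-1]
    return patch
```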

    5.2 Network Architecture and Training Details

    We employed the same U-Net-based encoder–decoder architecture for both the baseline and proposed methods.65 This architecture reduces the spatial dimensions while increasing the number of feature map channels in the encoding path, with the reverse occurring in the decoding path. The encoder consists of repeated residual blocks66 and 2×2 average pooling layers. Each residual block contains two consecutive basic blocks, each with a leaky rectified linear unit (ReLU) (slope of 0.2), instance normalization,67 and a 3×3 convolution. In addition, a shortcut block with a 1×1 convolution is included, and the output of the residual block is the sum of the outputs from the basic blocks and the shortcut block. The decoder mirrors this structure, consisting of residual blocks and 2×2 transposed convolutions. Skip connections are implemented through squeeze-and-excitation networks.68 To process complex-valued data, the real and imaginary parts of the complex field are assigned to the first and second channels, respectively. For system variable estimation, as described in Note S3 in the Supplementary Material, we adopted an encoder-structured network inspired by previous work.69 This network includes convolutional blocks with kernel sizes of 7×7, 5×5, 3×3, and 1×1, along with 2×2 average pooling and global average pooling. The 1×1 convolution is used for regression on the extracted feature maps, whereas the other convolution kernels are used for feature extraction.
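The residual block described above can be sketched in PyTorch as follows, assuming the stated ordering (leaky ReLU with slope 0.2, then instance normalization, then a 3×3 convolution) within each basic block. This is an illustration of the description, not the released architecture:

```python
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    """Leaky ReLU (0.2) -> instance norm -> 3x3 conv, in the order the text lists."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.LeakyReLU(0.2),
            nn.InstanceNorm2d(in_ch),
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return self.body(x)

class ResidualBlock(nn.Module):
    """Two consecutive basic blocks plus a 1x1-conv shortcut;
    the output is the sum of the two paths, as described."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block1 = BasicBlock(in_ch, out_ch)
        self.block2 = BasicBlock(out_ch, out_ch)
        self.shortcut = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        return self.block2(self.block1(x)) + self.shortcut(x)

# real and imaginary parts of the complex field occupy the two input channels
x = torch.randn(1, 2, 64, 64)
y = ResidualBlock(2, 32)(x)   # -> torch.Size([1, 32, 64, 64])
```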

    The proposed and baseline methods were implemented in Python using PyTorch and NumPy. Network weights were initialized with Xavier normalization,70 using a gain of 1. The training process employed the Adam optimizer,71 with parameters β1=0.5 and β2=0.9 and a learning rate of 0.0001 for the complex field generator, which was reduced by a factor of 0.95 every 2500 iterations. Our training scheme utilized 5600 small-patch ground-truth images (requiring 5.2 GB of storage) and produced a network comprising roughly 28 million parameters, which could be executed on a GPU with at least 1.7 GB of RAM. We used a batch size of 16 for the holographic reconstruction network and 32 for the parameter estimation network, training the networks for up to 50,000 iterations. Training took 8 and 5 h, respectively, on a system with 128 GB of memory and an Nvidia GeForce RTX 3090 GPU. For testing, we typically employed images of around 1024 pixel × 1024 pixel, resulting in an average processing time of 0.4 s per image.
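The reported optimizer settings map directly onto PyTorch's Adam and StepLR (the one-layer `model` below is a stand-in for the actual generator):

```python
import torch

# Reported setup: Adam with beta1 = 0.5, beta2 = 0.9, lr = 1e-4,
# decayed by a factor of 0.95 every 2500 iterations.
model = torch.nn.Conv2d(2, 2, kernel_size=3, padding=1)  # stand-in network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.5, 0.9))
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2500, gamma=0.95)

for _ in range(5000):
    # forward pass, loss.backward(), and optimizer.step() would go here
    scheduler.step()          # called once per training iteration

print(scheduler.get_last_lr())  # lr decayed twice (~9.025e-05)
```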

    Both the baseline and proposed methods were trained using L1 and mean squared error (MSE) loss functions implemented in PyTorch. The losses were computed separately on the real and imaginary components of the reconstructed and ground-truth complex fields. The L1 and MSE losses were weighted equally, with each assigned a weight of 0.5.
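The combined objective reduces to a short PyTorch function (a sketch; `pred` and `target` carry the real and imaginary parts in their two channels):

```python
import torch

def field_loss(pred, target, w_l1=0.5, w_mse=0.5):
    """Equally weighted L1 + MSE over the real and imaginary channels,
    as described for both the baseline and proposed training (sketch)."""
    l1 = torch.nn.functional.l1_loss(pred, target)
    mse = torch.nn.functional.mse_loss(pred, target)
    return w_l1 * l1 + w_mse * mse

# pred/target: (batch, 2, H, W), real part in channel 0, imaginary in channel 1
```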

    5.3 Implementation of Related Work

    We compared our proposed method with several established reference approaches: PhaseGAN,28 OT-CycleGAN,20 GedankenNet, and GedankenNet-Phase.23 PhaseGAN was implemented using the publicly available code from PhaseGAN GitHub. OT-CycleGAN, developed by Lee et al., was implemented based on the shared code from OT-CycleGAN GitHub. Both GedankenNet and GedankenNet-Phase were implemented using the code provided at GedankenNet GitHub. For all reference methods, we utilized the hyperparameters as specified in the reference code and modified only the forward model to align with our experimental conditions. By adopting these reference methods, we ensured a comprehensive comparison against widely recognized techniques in the field. This selection allows us to effectively demonstrate the innovations and improvements introduced by our proposed method, particularly in terms of adaptability under varying tissue types, imaging configurations, and pathological states.

    The implementation of the SimCLR algorithm was based on the shared code available at Sim-CLR GitHub. For training the SimCLR network, we simulated the propagation fields and backpropagated holograms of various tissues using a subset of the minibatch containing measured complex fields. Through random data augmentation, we generated two correlated versions of each data point, which served as the network’s input. The SimCLR network consists of an encoder and a projection head, both of which have two multi-layer perceptron (MLP) layers and ReLU activation. The encoder outputs 512-dimensional intermediate vectors from complex images of 256 pixel × 256 pixel, whereas the projection head reduces these to 128-dimensional latent vectors. Network training was carried out using the NT-Xent loss, which brings vectors from the same augmented pairs closer together and pushes vectors from different pairs farther apart.
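The NT-Xent objective can be sketched as follows, with row i of the two augmented views forming the positive pair and all other rows acting as negatives. This is the standard SimCLR formulation; the temperature `tau` is a hypothetical hyperparameter:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent (normalized temperature-scaled cross-entropy) sketch.
    z1, z2: (N, d) projection-head outputs of the two augmented views."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # 2N x d, unit norm
    sim = z @ z.t() / tau                                # scaled cosine similarities
    n = z1.shape[0]
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool), float('-inf'))
    # the positive for row i is row i+n (and vice versa)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# two views of a batch of 8 latent vectors
z1 = torch.randn(8, 128)
z2 = z1 + 0.1 * torch.randn(8, 128)
loss = nt_xent(z1, z2)
```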

    In contrastive learning schemes, random data augmentation plays a crucial role by making the prediction task more challenging, thereby enhancing the network’s representation capability. Unlike the RGB 3-channel images used in the original SimCLR paper, our input data consisted of two-channel images representing the real and imaginary parts of the complex fields, which are interdependent. Therefore, we adopted resized cropping, horizontal flipping, and vertical flipping as augmentation techniques, excluding color distortion to preserve the channel relationships. For network training, we used a total of 14,400 complex images, including 576 measured complex fields for each tissue type, each sized at 256 pixel × 256 pixel, as well as the propagation fields and backpropagated holograms derived through the forward model. After training, we examined the latent space using 572 new patches for each tissue type that were not encountered during training.
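A geometric-only augmentation along these lines (random resized crop plus horizontal and vertical flips, with no color distortion) might be sketched as follows; the crop-scale range and function name are illustrative assumptions, not the authors' exact pipeline:

```python
import torch
import torch.nn.functional as F

def random_augment(field, out_size=256):
    """Geometric-only augmentation for a two-channel (real/imag) field.
    Color distortion is excluded so the channel relationship is preserved."""
    c, h, w = field.shape
    # random resized crop: pick a random sub-window, resize back to out_size
    ch = torch.randint(h // 2, h + 1, (1,)).item()
    cw = torch.randint(w // 2, w + 1, (1,)).item()
    top = torch.randint(0, h - ch + 1, (1,)).item()
    left = torch.randint(0, w - cw + 1, (1,)).item()
    crop = field[:, top:top + ch, left:left + cw]
    crop = F.interpolate(crop[None], size=(out_size, out_size),
                         mode='bilinear', align_corners=False)[0]
    # random horizontal / vertical flips
    if torch.rand(1) < 0.5:
        crop = torch.flip(crop, dims=[2])
    if torch.rand(1) < 0.5:
        crop = torch.flip(crop, dims=[1])
    return crop
```

Applying `random_augment` twice to the same field yields the two correlated views fed to the network.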

    5.4 Evaluation Metric

    5.4.1 Pearson correlation coefficient

    The complex field reconstruction results were evaluated using the PCC. The PCC is a statistical measure that quantifies the strength and direction of the linear relationship between two continuous variables, with values ranging from −1 to 1. It is calculated by dividing the covariance of the two variables by the product of their standard deviations. A PCC value close to 1 indicates a strong positive correlation, −1 indicates a strong negative correlation, and 0 indicates no correlation.
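A direct NumPy implementation of this definition (illustrative, not the authors' code) is:

```python
import numpy as np

def pcc(x, y):
    """Pearson correlation coefficient: covariance of the two variables
    divided by the product of their standard deviations."""
    x, y = np.ravel(x), np.ravel(y)
    # np.cov defaults to ddof=1, so match it in the standard deviations
    return np.cov(x, y)[0, 1] / (np.std(x, ddof=1) * np.std(y, ddof=1))
```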

    5.4.2 Multiscale structural similarity

    The multi-scale structural similarity (MS-SSIM) metric is an extension of the structural similarity index, designed to evaluate the perceived visual quality of images at multiple scales. Unlike simple pixel-wise comparisons, MS-SSIM assesses structural similarity by considering luminance, contrast, and structural information at different levels of resolution. The resulting value ranges from 0 to 1, where 1 indicates perfect structural similarity and 0 represents no similarity. As MS-SSIM accounts for variations in image structure at multiple scales, it often provides a more comprehensive assessment of image fidelity than single-scale metrics.

    5.4.3 Peak signal-to-noise ratio

    Peak signal-to-noise ratio (PSNR) is a widely used measure of image reconstruction quality that compares a reconstructed image to a ground-truth reference. It is expressed in decibels and is derived from the mean squared error between the two images. Higher PSNR values indicate better image quality, as they signify lower error and noise relative to the original image. As PSNR provides an intuitive indication of signal fidelity, it has been commonly employed in imaging research to gauge the effectiveness of restoration or reconstruction algorithms.
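The definition above corresponds to the following sketch; the `data_range` parameter (the maximum possible pixel value) is an assumption about how the images are normalized:

```python
import numpy as np

def psnr(reference, reconstruction, data_range=1.0):
    """Peak signal-to-noise ratio in decibels, derived from the
    mean squared error between reference and reconstruction."""
    mse = np.mean((np.asarray(reference) - np.asarray(reconstruction)) ** 2)
    if mse == 0:
        return float('inf')   # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)
```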

    5.4.4 Fréchet distance

    The distances among the latent vector distributions shown in Fig. 3 were evaluated using the FD. The FD, also known as the Wasserstein-2 distance, is a metric that measures the similarity between two probability distributions. It is particularly effective for comparing distributions of data points in multidimensional space. In this paper, we calculate the FD using 512-dimensional latent vectors obtained from the SimCLR network. The FD accounts for both the means and covariances of the distributions, capturing both the central tendency and the spread of the data. Higher FD scores indicate greater distances among distributions, whereas lower FD scores represent closer distributions.
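Under the usual Gaussian assumption, the FD between two sets of latent vectors reduces to a closed form in their means and covariances; an illustrative sketch (not the authors' exact implementation) is:

```python
import numpy as np
from scipy import linalg

def frechet_distance(x, y):
    """Fréchet (Wasserstein-2) distance between two sets of latent
    vectors, modeling each set as a multivariate Gaussian.
    x, y: [N, D] arrays of latent vectors."""
    mu1, mu2 = x.mean(axis=0), y.mean(axis=0)
    s1 = np.cov(x, rowvar=False)
    s2 = np.cov(y, rowvar=False)
    covmean = linalg.sqrtm(s1 @ s2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real   # discard numerical imaginary residue
    diff = mu1 - mu2
    # ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^(1/2))
    return float(diff @ diff + np.trace(s1 + s2 - 2.0 * covmean))
```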

    Jiseong Barg is currently a PhD student in the Department of Bio and Brain Engineering at the Korea Advanced Institute of Science and Technology (KAIST). He received his BS degree in biomedical engineering from Sungkyunkwan University in 2022 and his MS degree from the Department of Bio and Brain Engineering at the KAIST in 2024. His research focuses on leveraging deep learning to overcome the limitations of conventional imaging techniques in various optical bioimaging applications.

    Chanseok Lee is currently working toward a PhD in the Department of Bio and Brain Engineering at the KAIST. He received his BS degree in biomedical engineering from Hanyang University and his MS degree from the Department of Bio and Brain Engineering at the KAIST. His current research focuses on physics-integrated machine learning and computational optical imaging.

    Chunghyeong Lee is currently a PhD student in the Department of Bio and Brain Engineering at the KAIST. He received his BS degree in physics from the KAIST in 2020 and his MS degree from the Department of Bio and Brain Engineering at the KAIST in 2023. His research interests lie in complex optics, metasurfaces, and computational optical imaging.

    Mooseok Jang is an associate professor in the Department of Bio and Brain Engineering at the KAIST. He received his BS degree in physics from the KAIST in 2009 and his PhD in electrical engineering from the California Institute of Technology (Caltech) in 2016. His research interests include optical imaging, complex optics, acousto-optics, and the application of machine learning techniques to microscopic imaging.

    [1] P. Ferraro, A. Wax, Z. Zalevsky. Coherent Light Microscopy: Imaging and Quantitative Phase Analysis, 46(2011).

    [4] T. Kreis. Handbook of Holographic Interferometry: Optical and Digital Methods(2006).

    [5] V. Astratov, C. Hu, G. Popescu. Quantitative phase imaging: principles and applications. Label-Free Super-Resolution Microscopy, 1-24(2019).

    [13] R. W. Gerchberg. A practical algorithm for the determination of phase from image and diffraction plane pictures. Optik, 35, 237-246(1972).

    [29] Z. Hong et al. Out-of-distribution detection in medical image analysis: a survey(2024).

    [38] C. Villani. Optimal Transport: Old and New, 338(2009).

    [46] K. Chaitanya et al. Contrastive learning of global and local features for medical image segmentation with limited annotations, 12546-12558(2020).

    [48] S. A. Rizvi et al. Local contrastive learning for medical image recognition, 1236(2024).

    [52] J. W. Goodman. Introduction to Fourier Optics(2005).

    [54] T. Chen et al. A simple framework for contrastive learning of visual representations, 1597-1607(2020).

    [67] V. Dumoulin, J. Shlens, M. Kudlur. A learned representation for artistic style(2016).

    [70] X. Glorot, Y. Bengio. Understanding the difficulty of training deep feedforward neural networks, 249-256(2010).

    [71] D. P. Kingma. Adam: a method for stochastic optimization(2014).


    Jiseong Barg, Chanseok Lee, Chunghyeong Lee, Mooseok Jang, "Adaptable deep learning for holographic microscopy: a case study on tissue type and system variability in label-free histopathology," Adv. Photon. Nexus 4, 026005 (2025)

    Paper Information

    Category: Research Articles

    Received: Oct. 10, 2024

    Accepted: Jan. 20, 2025

    Published Online: Feb. 19, 2025

    The Author Email: Jang Mooseok (mooseok@kaist.ac.kr)

    DOI:10.1117/1.APN.4.2.026005

    CSTR:32397.14.1.APN.4.2.026005
