Single-shot volumetric fluorescence imaging with neural fields

Oumeng Zhang; Haowen Zhou; Brandon Y. Feng; Elin M. Larsson; Reinaldo E. Alcalde; Siyuan Yin; Catherine Deng; Changhuei Yang

doi:10.1117/1.AP.7.2.026001

1 Introduction

Fluorescence imaging is an indispensable tool in biological research, as it enables real-time observation of live organisms due to its high sensitivity, biological specificity, and noninvasive nature. However, conventional two-dimensional (2D) fluorescence microscopy cannot capture the complete three-dimensional (3D) structure of biological samples. To address this, techniques such as confocal and light-sheet fluorescence microscopy have been developed and are widely used. Despite their advantages, these methods require scanning multiple axial planes, significantly increasing the acquisition time and limiting the spatial–temporal throughput. The extensive scanning time is impractical for numerous applications where imaging large fields of view (FOVs) is required, such as when performing lab studies of the complex biological processes associated with the rhizosphere1—the media–root interfaces of plants.

Single-shot volumetric fluorescence (SVF) imaging techniques have been developed to address the challenges in scanning-based 3D imaging methods.2^–6 These methods encode volumetric data into a single 2D image, which allows for computational reconstruction of the object. One prominent technique is the light-field microscope,2^–4 which utilizes a standard microlens array to enable 3D capabilities. Recently, lensless architectures using coded masks or randomized microlens diffusers have also been demonstrated.5^,7^–10

Although these approaches are effective, they also come with associated limitations. A significant challenge with these systems, with the exception of the Fourier light-field microscope,11^,12 is that their point spread functions (PSFs) are not laterally shift-invariant, which necessitates extensive calibration to define the PSF accurately and restricts measurement to the precalibrated volume. Further, the shift-variant nature of the PSF complicates the image analysis—reconstructions typically require optimization algorithms that impose sparsity constraints. Another drawback of using coded masks or microlens arrays is that the large size decreases the peak signal level of the PSF by spreading photons over a larger area. In addition, photons from various axial and lateral positions overlapping on the sensor further degrade the signal-to-noise ratio (SNR). To address some of these issues while improving resolution and depth of field (DOF), Miniscope3D13 and Fourier DiffuserScope14 replace the tube lens in a standard fluorescence microscope with an optimized phase mask, but they still suffer from some of the limitations associated with noncompact PSFs.

PSF engineering is an alternative approach that modulates the fluorescence at the back focal plane (BFP) of the objective lens in the imaging system to encode the axial position of the emitters within compact PSFs. Numerous 3D PSFs have been proposed over the past two decades,15^,16 including the astigmatic PSF,17 which varies in elongation direction and magnitude with defocus; the corkscrew18 and double-helix19 PSFs, which feature revolving spots around the emitter; and the tetrapod20 and pixOL21 PSFs, which are optimized for maximizing Fisher information. Another group of PSFs, including the bisected,22 quadrated,23 and tri-spot PSFs,24 images subapertures off the pupil center to induce lateral displacements in the focal point; the displacement direction depends on the position of the subaperture relative to the center of the BFP, and the displacement amount is proportional to the defocus.

Although these engineered PSFs have proven effective primarily in single-molecule localization microscopy25^–28 where isolated point sources are imaged, reconstructing more complex geometries from these PSFs remains challenging. Unlike methods such as Miniscope3D, compact PSFs usually cannot satisfy the multiplexing requirement of compressed sensing. Therefore, recovering the 3D object using a single image often results in ambiguities in depth measurements.29^,30 Although using multiple images with engineered PSFs has the ability to substantially reduce these ambiguities, such as in the Fourier light-field microscope11^,12 and complex-field and fluorescence microscopy using the aperture-scanning technique (CFAST),30 these methods require either capturing multiple images sequentially at the expense of temporal resolution or utilizing different areas of the detector to perform spatial multiplexing, thus sacrificing the FOV. A recent approach, the polarized spiral PSF,29 integrates polarizers with a double-helix phase mask and employs orthogonally polarized detection channels from a polarization camera to achieve single-shot 3D imaging without sacrificing either the temporal resolution or the FOV. However, it has not completely eliminated the ambiguity problem.

Another critical aspect of SVF imaging is the reconstruction algorithm; a robust reconstruction algorithm is essential for accurately and precisely reconstructing 3D scenes from 2D measurements captured with engineered PSFs. The Richardson–Lucy (RL) algorithm11^,31^,32 is broadly used due to its effectiveness in recovering 3D structures. However, its performance degrades under noisy conditions, particularly in fluorescence imaging where the signal is limited. Given these limitations, it is worth exploring the use of neural fields in SVF. The topic of neural field is a recent prominent and prevalent technique for 3D scene representations and graphics rendering.33^,34 Neural field techniques perform mapping between spatial coordinates and image intensity values using a compact multilayer perceptron (MLP) model and optionally learnable positional embeddings. This approach has recently been applied to improve microscopic systems. For example, it has been implemented to represent a continuous 3D refractive index in intensity diffraction tomography and to generate continuous images across the axial dimension,35 but it requires extensive optimization time on high-performance graphics processing units (GPUs). Further developments reduced the computational resources by creating an additional hash encoding layer at the input of the neural network for 2D microscopic imaging systems.36 More adaptations of the neural fields approach include a 2D version for lensless imaging phase retrieval37 and the modeling of space–time dynamics with four-dimensional neural fields in imaging through scattering,38 computational adaptive optics,39 and structured illumination microscopy.40 Advancements continued with the exploitation of redundancies in Fourier ptychographic microscopes to augment the MLP model with a compact learnable feature space, thereby speeding up the image stack reconstructions and reducing data storage burdens.41 Most recently, the integration of neural fields with diffusion models has been proposed to aid volumetric reconstruction.42

In this paper, we present QuadraPol PSF, an engineered PSF designed for SVF imaging that is easy to implement and overcomes the limitations of existing techniques. In addition, we introduce a reconstruction algorithm based on neural fields that outperform the widely used RL deconvolution. Our work is inspired by the algorithm architectures from Refs. 38, 41, and 42, which we redesigned to enhance the capabilities of our imaging system, as demonstrated through fluorescence imaging over a centimeter-scale lateral FOV, with a depth of 5 mm, a lateral resolution of $\sim 7 μ m$ , and an axial resolution of $\sim 240 μ m$ . QuadraPol PSF enables all-in-focus imaging of bacterial colonies on sand surfaces and 3D visualization of plant root morphology. Although our focus has been on the rhizosphere, the high-quality structural images produced suggest that this approach can be effectively generalized to other contexts as well. Our results highlight the potential of QuadraPol PSF combined with neural fields to markedly propel biological research by enabling rapid, high-resolution 3D imaging of intricate biological structures across a large FOV.

2 Methods

2.1 Imaging System

The experimental setup of our SVF imaging system with the QuadraPol PSF is shown in Fig. 1(a). We modulate a 568-nm laser (Coherent Sapphire 568 LP) transmitted through a single-mode fiber (Thorlabs P3-460B-FC-2, Newton, New Jersey, United States) using a quarter-wave plate (QWP, Thorlabs WPQ10M-561) to produce circularly polarized excitation at the sample. An achromatic doublet lens (Thorlabs AC254-080-A) is used as the objective lens (OL), providing access to the BFP without the need for additional $4 f$ systems, thus simplifying the setup. The collected fluorescence is then filtered by a dichroic mirror (DM, Semrock Di01-R488/561) and a bandpass filter (BF, Semrock FF01-523/610, Rochester, New York, United States) and modulated by a custom four-polarization polarizer (4-Pol). The modulated fluorescence is then focused onto a polarization camera (The Imaging Source DZK 33UX250, Charlotte, North Carolina, United States) using a tube lens (TL, Thorlabs AC254-150-A). This compact system is mounted on a $z$ stage (Newport 436, controlled by Newport CONEX-LTA-HS) and a custom-built $x y$ gantry system (Ballscrew SFU1605 with NEMA23 Stepper Motor) capable of scanning a $200 mm \times 200 mm$ area.

Figure 1.(a) Schematic of single-shot volumetric fluorescence imaging using the QuadraPol PSF. OL, objective lens; TL, tube lens; QWP, quarter-wave plate; DM, dichroic mirror; BF, bandpass filter. A four-polarization custom polarizer (4-Pol) is positioned at the BFP of the imaging system to modulate the emission light, with a polarization camera (PolCam) capturing the modulated fluorescence. The transmission axes of the polarizer and PolCam are 0, 45, 90, and 135 deg. (b) Assembling the custom polarizer by aligning two coverslips and four laser-cut polymer polarizers among 3D-printed holders. (c) Representative image of a point source captured by the polarization camera, visualized using (i) raw pixel readouts, (ii) a polarization image color-coded in the hue-saturation-value (HSV) scheme (AoLP as hue, DoLP as saturation, and intensity as value), and (iii) four separate images for each polarization channel. Scale bar: $50 μ m$ .

Download full size

View all figures

The 4-Pol custom polarizer [Fig. 1(b)] is constructed by cutting four pieces from a thermoplastic polymer film linear polarizer (Thorlabs LPVISE2X2) with a laser cutter (Universal Laser Systems PLS6.75) and positioning them between two circular coverslips (VWR 48380-046). Two 3D-printed (Anycubic Photon Mono X 6Ks) holders apply clamping forces to secure the polarizers and coverslips, preventing displacement. One holder features a 9-mm diameter aperture to ensure the shift invariance of the imaging system’s PSF, corresponding to a numerical aperture (NA) of 0.056. In addition, 0.6-mm lines are printed along the center of the holder to block unmodulated fluorescence from passing through gaps between the polarizers.

The polarization camera integrates a polarizer microarray atop the CMOS sensor. Typically, the raw pixel data [Fig. 1(c-i)] are processed to visualize the angle of linear polarization (AoLP), which indicates the polarization direction and the degree of linear polarization (DoLP), which quantifies the proportion of polarized fluorescence [Fig. 1(c-ii)].43^,44 Alternatively, the data can be decomposed into images corresponding to the four polarization channels, as shown in Fig. 1(c-iii).

2.2 Point Spread Function Design and Experimental Calibration

The design of the QuadraPol PSF is inspired by the quadrated PSF,23 the multiview reflector (MVR) microscope,45 and CFAST.30 These techniques divide the pupil into four sections. Specifically, the quadrated phase mask applies different phase ramps to each segment, generating a four-spot PSF in a single imaging channel. In contrast, the MVR and CFAST systems utilize reflective mirrors and light-blocking apertures, respectively, to create distinct image channels for each pupil segment. A key characteristic of these techniques is the encoding of the axial position of the emitter through the different lateral displacements of the focused spots, each associated with one of the four equal segments of the pupil. However, these approaches come with limitations. The single-image capture with the quadrated PSF can lead to ambiguities in 3D reconstruction, whereas the MVR and CFAST setups sacrifice either FOV or temporal resolution. The QuadraPol PSF distinguishes itself by modulating each pupil section using a polarizer with a unique transmission axis, allowing the polarization camera to simultaneously capture four imaging channels without losing FOV or temporal resolution.

Given the imaging system’s low NA, the theoretical PSF [Fig. 2(a)] for the axial position $z$ in each polarization channel ( $pol \in {0 \deg, 45 \deg, 90 \deg, 135 \deg}$ ) can be described by (see Note S1 in the Supplementary Material for more details)46^,47 ${PSF}_{z, pol} = \sum_{i \in [p, s]} F {A_{pol, i} (u, v) \exp {j [k z \sqrt{1 - u^{2} - v^{2}} + P (u, v)]}}^{2},$ (1)where $k = 2 π / λ$ represents the wavenumber, and $(u, v)$ are the coordinates at the BFP. The pupil phase is consistent across all polarization channels; a uniform phase, $P (u, v) = 0$ , is used for calculating the unaberrated theoretical PSF. The amplitude modulation patterns for $p$ and $s$ polarizations are shown in Fig. 2(a). These patterns assign values of 1, 0.5, and 0 for amplitude mask with $p$ polarization and 0, $\pm 0.5$ , and 0 for $s$ polarization, corresponding to the angles between the transmission axis of the polarizers and detection channel at 0, 45, and 90 deg, respectively. Note that due to cross talk among polarization channels separated by 45 deg, the PSF in each channel exhibits a primary spot and two weaker side spots, rather than a singular focal point (Fig. S1 in the Supplementary Material). These images exhibit apparent lateral shifts in different directions across different polarization channels as the object defocuses; this is analogous to viewing the same object from four unique perspectives, which fundamentally enables 3D reconstruction.

Figure 2.Amplitude and phase of the pupil and PSFs at different heights. (a) Theoretical PSFs without aberration, (b) simulated PSFs using the retrieved phase, and (c) experimental PSFs. Question marks indicate that the phase and amplitude for the experimental PSF are not accessible. Scale bar: 2 mm for the pupil images and 0.2 mm for the PSF images.

Download full size

View all figures

Given the presence of aberrations in our imaging system, primarily introduced by the custom-made polarizer, we use the vectorial implementation of phase retrieval48 to compensate for these imperfections. This approach enables us to refine a series of retrieved PSFs [Fig. 2(b)] that more closely match those obtained experimentally [Fig. 2(c)] compared with theoretically obtained ones [Fig. 2(a)]. These experimental PSFs are generated by capturing images from fluorescent beads (Thermo Fisher F8858, Waltham, Massachusetts, United States) that were axially scanned over an 8-mm range with a 0.1-mm step size. The retrieved phase of the pupil is shown in Fig. 2(b), whereas we assume that the pupil amplitude remains the same as that in Fig. 2(a). Both experimental and retrieved PSFs are implemented in the image reconstruction in Sec. 3.4.

2.3 Reconstruction Algorithm

Previous studies have demonstrated the effectiveness of the modified RL deconvolution algorithm in reconstructing 3D volumetric scenes using multiple input images11^,30; the algorithm iteratively updates the estimated object $o$ as $o^{(k + 1)} = o^{(k)} \frac{\sum_{pol} (I_{pol} \otimes {PSF}_{z, pol}^{*})}{\sum_{pol} {[\sum_{z} (o_{z}^{(k)} \otimes {PSF}_{z, pol})] \otimes {PSF}_{z, pol}^{*}}},$ (2)where ${PSF}^{*}$ represents the original PSF with a 180-deg rotation in the $x y$ plane, $I_{pol}$ represents the intensity from one of the polarization channels, and $\otimes$ denotes the convolution operator.

One limitation of the RL deconvolution is its performance under noisy situations. Inspired by the forward-imaging model, which captures four independent perspectives of the 3D scene, we customized and designed the neural fields to enhance the performance of our system. The algorithm is composed of a two-stage optimization process. The first stage was to initialize our neural fields with the RL-deconvolved image volume. The neural field learns a compact learnable feature space consisting of a 3D feature tensor ( $M$ ) and a feature tensor ( $u$ ),41 and it also learns an MLP network including two nonlinear layers with ReLU activation functions and one linear layer. Each layer of the MLP in our neural field has $Q$ neurons with an additional offset neuron. The number of neurons is designed to match the number of feature channels in the feature space. The feature space is created from the Hadamard product of a feature tensor $M$ with a size of $(N / 2) \times (N / 2) \times Q$ and a feature tensor $u$ with a size of $Z \times Q$ , where $N$ is the number of pixels along the $x$ or $y$ axis of the raw image, $Q$ is the number of feature channels, and $Z$ denotes the number of predefined $z$ coordinates. The detailed parameters and hyperparameters of the neural fields can be found in Note S2 in the Supplementary Material.49^,50 The optimization process seeks to find a mapping function ( $ϕ$ ) between feature tensor space and the image volume, $ϕ (x) : R^{2} \cdot R \mapsto R^{3}, C (f (x), ϕ (x)),$ (3)where $x = (x, y, z)$ represents the 3D coordinate system, and $C$ is a set of constraints bounded by the image volume $f (x)$ . This initialization process can significantly shorten the rendering time. After the first stage, the neural field can render an image volume as an initial guess ( $g$ ).

In the second stage, we can render our estimated measurements, $I_{pol} (x, y) = \sum_{z} | g_{z} (x) \otimes {PSF}_{z, pol} |^{2},$ (4)where $I_{pol} (x, y)$ is the estimated measurements from the neural field rendered image volume, $g_{z}$ denotes the rendered image volume at one $z$ plane, and $\otimes$ is the convolution operator. At our second stage of optimization, we optimized the neural field to minimize the difference between the estimated measurements and the experimental intensity measurements [Fig. 3(a)].

Figure 3.Framework of using neural fields to extend the quality and depth range of the imaging system. (a) The RL-deconvolved image volume guides the initialization of the model with a compact learnable feature space and MLP. After model initialization, the model is further optimized for the image volume. The estimated image volume goes through the forward model of the imaging system to generate the estimated measurements. These measurements are compared with the acquired measurements and then to update the model weights and parameters. (b) Once the model is optimized, the parameters and weights are fixed. It can render an image stack with continuous sampling. The operator $\otimes$ denotes a convolution operation, and $\oplus$ indicates a summation operation along the $z$ axis.

Download full size

View all figures

After the two-stage optimization, the parameters and weights in the neural field are fixed. We can continuously sample the feature tensor ( $u$ ) to render a denser image volume along the axial ( $z$ ) axis, as shown in Fig. 3(b). Throughout the whole reconstruction process, no pretraining or training data sets are needed, and the algorithm is only supervised by a single capture. The average optimization time over 196 FOVs in the plant root experiment (see Sec. 3.4 for more details) on an Nvidia A100-80GB GPU device is $\sim 22 s$ .

2.4 Sample Preparation

Starting from a single colony, Escherichia coli-mScarlet-I was grown in an LB medium overnight at 37°C. On the next day, the overnight culture was washed once in a minimal medium and diluted to an optical density of 1. To prepare for imaging, 1 mL of cell culture was added to 10 g of autoclaved fine sand (Fischer Science Education Sand, Cat. No. S04286-8) in a small Petri dish and briefly mixed by vortexing before imaging.

Soft white wheat seeds were obtained from Handy Pantry (Lot 190698). Seeds were sterilized by incubating in 70% ethanol for 2 min, washing 3 times with autoclaved water, incubating with a bleach (50% volume fraction) and Triton X (0.1% volume fraction) solution for 3 min, and finally washing 5 times with autoclaved water. Seeds were then plated on 0.6% phytagel (Sigma Aldrich, Cat. No. P8169, St. Louis, Missouri, United States) containing 0.5× MS medium (Sigma Aldrich, Cat. No. M5519) and transferred to a growth chamber with a day/night cycle of 16/8 h at 25°C for 7 days. On day 7, seedlings were collected and incubated in $100 μ mol/L$ merocyanine 540 (MC540) in phosphate-buffered saline for 15 min and then rinsed in deionized water before imaging.

3 Results

3.1 Simulation Evaluation of System Performance

We first assessed the resolution of the QuadraPol PSF by simulating images of two closely positioned point sources. To determine the resolution, we applied the Rayleigh criterion, which requires at least a 20% dip between the peaks of the two reconstructed spots. We used a high spatial sampling rate to evaluate the resolution more accurately. In the lateral direction, the QuadraPol PSF achieves an in-focus resolution of $4.3 μ m$ . However, this resolution rapidly degrades to $\sim 7 μ m$ with a $\pm 0.3 - mm$ defocus. The resolution remains under $9 μ m$ , with a defocus extending up to 1.8 mm [Fig. 4(a)]. To optimally adapt to these resolutions in our experimental setup, we used a camera sampling rate of $3.7 μ m$ in the object space (with a magnification of 1.875) that corresponds to a resolution limit of $7.4 μ m$ , achieving a DOF of $\sim 4 mm$ with relatively uniform lateral resolution. This extended DOF is notably longer than the 0.26 mm DOF achievable with a standard PSF, which is due to the smaller effective NA of the QuadraPol PSF.

$Performance evaluation of the QuadraPol PSF using simulated data. (a) Lateral and (b) axial resolutions determined by the Rayleigh criterion as functions of axial position and signal level. Images show representative data with Poisson shot noise and reconstruction cross sections using RL deconvolution and neural fields. Shaded areas represent the diffraction (λ/2NA) and sampling limits. Scale bar: 20 μm in xy view and 200 μm in xz view. (c)–(e) Reconstruction results for various line structures; insets show simulated images. Parameters for simulating the QuadraPol PSF (NA=0.52, magnification M=−5.2, and camera pixel size 2.2 μm) are adjusted to match those reported for Miniscope3D.13" target="_self" style="display: inline;">13 Scale bar: 100 μm; color bar: height in μm. (f) and (g) Object and images using the polarized spiral (PS2F) and QuadraPol PSFs with (f) vertical and (g) horizontal lines at z=±0.35 mm. Scale bar: 50 μm; color bar: AoLP. Insets show the PSFs. (h) Comparison between the QuadraPol PSF and other SVF techniques.$

Figure 4.Performance evaluation of the QuadraPol PSF using simulated data. (a) Lateral and (b) axial resolutions determined by the Rayleigh criterion as functions of axial position and signal level. Images show representative data with Poisson shot noise and reconstruction cross sections using RL deconvolution and neural fields. Shaded areas represent the diffraction ( $λ / 2 NA$ ) and sampling limits. Scale bar: $20 μ m$ in $x y$ view and $200 μ m$ in $x z$ view. (c)–(e) Reconstruction results for various line structures; insets show simulated images. Parameters for simulating the QuadraPol PSF ( $NA = 0.52$ , magnification $M = - 5.2$ , and camera pixel size $2.2 μ m$ ) are adjusted to match those reported for Miniscope3D.¹³ Scale bar: $100 μ m$ ; color bar: height in $μ m$ . (f) and (g) Object and images using the polarized spiral ( ${PS}^{2} F$ ) and QuadraPol PSFs with (f) vertical and (g) horizontal lines at $z = \pm 0.35 mm$ . Scale bar: $50 μ m$ ; color bar: AoLP. Insets show the PSFs. (h) Comparison between the QuadraPol PSF and other SVF techniques.

Download full size

View all figures

In addition, we compared our neural field reconstruction algorithm with the RL deconvolution. Although both algorithms perform similarly under ideal conditions, neural fields exhibit significantly improved robustness with noisy images [Fig. 4(a) and Fig. S2 in the Supplementary Material]. For instance, at a low signal level where a total of $3.16 \times 10^{4} photons$ are captured from both emitters, the RL deconvolution results in a highly noisy reconstruction, whereas the neural fields can still distinctly resolve the two spots and offer a twofold resolution improvement. This improvement is attributed to the fact that the neural fields tend to favor smooth reconstructions,34^,38^,40^,51 effectively leading to a suppression of noise in the reconstruction process.

The axial resolution within the 4-mm DOF ranges from 230 to $250 μ m$ , except for positions close to the focal plane ( $| z | < 0.1 mm$ ), where it degrades to $405 μ m$ [Fig. 4(b)]. This degradation is attributed to the symmetry, which causes a gradual depth gradient of the in-focus QuadraPol PSF; both $+ z$ and $- z$ defocus result in an expansion of the PSF. However, this symmetry is disrupted in the presence of aberrations; when aberrated PSFs were simulated, an improvement in axial resolution near the focal plane was observed (Fig. S3 in the Supplementary Material). We also experimentally demonstrated this improvement in Sec. 3.2. Similar to the case of lateral direction, using neural fields leads to a better axial resolution compared with RL deconvolution under noisy conditions, although the improvement is less pronounced than in the lateral direction, particularly at very low signal levels. This is likely due to the design of the network’s feature space, which provides more degrees of freedom in the lateral direction than in the axial direction, given the lower axial resolution of the imaging system compared with its lateral resolution. Although such a configuration can significantly accelerate the algorithm and reduce memory usage, it can also limit the improvements in axial resolution. Again, we emphasize that this comparison is under ideal conditions and does not account for forward model mismatches that may occur in experimental settings. The space-bandwidth product (SBP) of our system is $\sim 5.2$ million voxels over the $3.8 mm \times 4.5 mm$ FOV and 4-mm depth range. Note that the SBP for the biological experiments in later sections is greater than the value reported here. We are not confined to the 4-mm depth range, as we can still reconstruct the object with slightly reduced resolution at greater defocus distances.

One key advantage of the QuadraPol PSF is its ability to resolve denser objects more effectively. By separating polarization channels, the compact QuadraPol PSF reduces the mixing of information from different 3D positions, thus relaxing the sparsity constraint in the reconstruction algorithm compared with other SVF techniques such as the Miniscope3D.13 To demonstrate this, we simulated three different line structures and generated images using both the Miniscope3D PSF and the QuadraPol PSF, analyzing them with the same RL deconvolution algorithm without sparsity constraints. For a small structure where the raw image using the Miniscope3D shows minimal overlap, both methods accurately resolve the structure [Fig. 4(c)]. In addition, the reconstruction quality of the Miniscope3D for defocused axial positions is superior due to its extended DOF. However, for larger objects where there is substantial information mixture in the raw Miniscope3D images [Figs. 4(d) and 4(e)], it struggles to accurately recover the structure. In contrast, the QuadraPol PSF still resolves the object effectively with significantly fewer artifacts. This enhanced performance is similarly observed with other dense objects (Note S3 in the Supplementary Material).

Another challenge in SVF imaging is depth ambiguity. Compact PSFs, such as the double-helix PSF,19 often produce identical images for specific structures at different heights (Fig. S4 in the Supplementary Material), which makes it impossible to accurately estimate depth without error. The polarized spiral PSF29 was developed to address this issue by integrating orthogonally polarized polarizers on the left and right sides of a double-helix phase mask, creating two spots with orthogonal polarizations captured using a polarization camera. Although this method resolves the ambiguity for vertical line structures [Fig. 4(f)], challenges remain for horizontal lines [Fig. 4(g)]. Clearly, to eliminate ambiguity in all directions, at least three independent images are necessary, highlighting the need to use the full capability of the polarization camera. As shown in Figs. 4(f) and 4(g), the QuadraPol PSF effectively solves this issue, allowing both horizontal and vertical lines at different depths to be resolved without ambiguity.

A comparison of the three aspects mentioned above between the QuadraPol PSF and other SVF techniques is shown in Fig. 4(h). The most significant advantage of the QuadraPol PSF, which we would like to emphasize, is its ability to eliminate depth ambiguity and reconstruct 3D scenes without relying on sparsity constraints, distinguishing it from other SVF techniques such as Miniscope3D, Fourier DiffuserScope, and ${PS}^{2} F$ . In addition, compared with the Fourier light-field microscope, which shares the aforementioned advantages, the QuadraPol PSF achieves a higher SBP, as it does not compromise the FOV. Furthermore, the QuadraPol PSF provides a better SNR, despite the photon loss due to the linear polarizers (Note S3 in the Supplementary Material).

3.2 Experimental Validation Using Fluorescent Beads

We next validated the QuadraPol PSF with fluorescent beads placed on a tilted coverslip [Fig. 5(a)]. The 3D reconstruction is shown in Fig. 5(b), where the reconstruction algorithm accurately resolves the fluorescent beads on the coverslip without noticeable inaccuracies. A quantitative analysis of depth accuracy is provided in Fig. S5 in the Supplementary Material. In addition, we quantified the full width at half-maximum (FWHM) values for the reconstructed images of 72 fluorescent beads from a $z$ scan [Fig. 5(c)]. The FWHM values are $4.7 \pm 1.0 μ m$ and $4.3 \pm 0.9 μ m$ in the $x$ and $y$ directions, respectively, for in-focus beads. The average FWHM remains within $5.0 μ m$ for $| z | \leq 0.6 mm$ . When beads are defocused by 1 mm, the FWHM increases to an average of $6.0 μ m$ . Note that although the FWHM is not directly comparable to the resolution determined using the Rayleigh criteria in Sec. 3.1, the trend observed as a function of $z$ is consistent with our simulations [Fig. 4(a)]. In the $z$ direction, the average FWHM is consistently below $130.4 μ m$ across a DOF of 1.2 mm and increases to an average of $155.7 μ m$ when defocused by 1 mm. We do not observe resolution degradation near the focal plane, unlike what was seen in simulations using theoretical PSFs [Fig. 4(b)]. This reaffirms that the system’s aberrations actually improve the depth resolution of the QuadraPol PSF. Additional resolution analysis (Fig. S6 in the Supplementary Material) based on the Rayleigh criterion using synthesized images from bead data is consistent with the simulation results shown in Figs. 4(a) and 4(b).

Figure 5.Imaging fluorescent beads on a 45-deg tilt coverslip using the QuadraPol PSF. (a) Raw image of the fluorescent beads. Scale bar: 1 mm. (b) 3D rendering of the reconstructed beads using MATLAB function “isosurface.” Grid size: 1 mm. (c) FWHM values for the reconstructed beads. Lines represent the average; shaded areas represent the standard deviation.

Download full size

View all figures

3.3 All-in-Focus Imaging of Bacterial Colony on Sand Surfaces

Bacterial activities play a crucial role in the biochemical processes within the rhizosphere.52 Given the highly complex and scattering nature of soil environments, many lab-based environmental and biological experiments use sand as a proxy.53^–55 Sand offers better physical and chemical simplicity and uniformity, providing a well-controlled setting for scientific analyses. Visualizing bacterial colonies in sand represents a significant advancement toward in situ studies of bacterial behavior. However, a key challenge arises from the typical size of sand particles, which creates a nonflat surface that can cause many areas within the FOV to be out of focus when captured in a single snapshot by a standard microscope without a $z$ scan.

To address this challenge, we use 3D imaging with the QuadraPol PSF to visualize E. coli tagged with mScarlet-I on the sand surface [Fig. 6(a)]. Multiple FOVs are stitched together using the Microscopy Image Stitching Tool plugin in ImageJ.56 To determine the correct focusing height map [Fig. 6(b)] and generate an all-in-focus image [Fig. 6(c)],57^,58 we use a $100 \times 100 pixel$ window (side length of 0.35 mm) sliding across the entire FOV. The sharpest focus $z$ position for each window is selected based on image contrast, defined as the difference between the 99th and 1st percentile values in each slice in the reconstructed $z$ stack (Fig. S7 in the Supplementary Material). The zoomed regions [Fig. 6(d)] show that we successfully resolve sharp reconstructions of the bacterial colonies on the sand particles from blurred raw images across the entire sample area, which has a depth variation of 2 mm.

Figure 6.All-in-focus imaging of E. coli tagged with mScarlet-I on sand surfaces. (a) Raw polarized fluorescence image. The inset shows a photograph of the sample captured using a smartphone camera. (b) Height map recovered using the all-in-focus algorithm. Color bar: height in mm. (c) All-in-focus fluorescence image. (d) Zoomed regions of interest in panels (a)–(c). Annotations represent the $z$ range for each region. Scale bar: 20 mm in the inset of panel (a), 2 mm in panels (b) and (c), and 0.5 mm in panel (d).

Download full size

View all figures

3.4 3D Imaging of Plant Roots with Neural Fields

Plant roots play a critical role in the rhizosphere, serving as the primary interface between the plant and the soil, and significantly influencing the activities of soil microbes. However, visualizing roots poses significant challenges due to their typically large depth variation, which complicates the acquisition of focused images across the entire volume of interest. Here, we conduct 3D imaging of wheat roots stained with MC540 within a 5-mm-thick glass container [Fig. 7(a), Starna Cells 93-G-5]. The raw image [Fig. 7(b)] shows how depth is encoded in the polarization of detected fluorescence. Two zoomed regions show different polarization characteristics: one shows the top of the root polarized at 0 deg and the bottom at 90 deg, whereas the other exhibits the opposite pattern, indicating different defocus directions for these root segments.

Figure 7.Volumetric imaging of wheat roots using the QuadraPol PSF. (a) Reconstruction using neural fields. The inset shows a photograph captured using a smartphone camera. (b) Raw polarized fluorescence image. (c) $x y$ view; (d) and (e) $x y$ and $x z$ views of the zoomed regions in panel (a), reconstructed using (i) deconvolution with experimental PSF, (ii) deconvolution with retrieved PSF, and (iii) neural fields. Scale bar: 5 mm in panels (a) and (b), 0.5 mm in the zoomed region of panel (b), and 1 mm in panels (c)–(e). Color bar: depth in mm. Note that the color in the $x y$ view is composed of the entire reconstructed volume, reflecting the intensity contribution from all $z$ slices; this view predominantly displays a color corresponding to the central height of the root.

Download full size

View all figures

We performed the reconstruction using the RL deconvolution with both the experimental and retrieved PSFs, as well as our proposed method in Sec. 2.3 based on neural fields. For the neural-fields method, we used experimental PSF for $| z | \leq 2 mm$ and retrieved PSF otherwise during the two-step optimization process (Note S4 in the Supplementary Material). This is due to the degraded quality of experimental PSFs from the low SNR at large defocus distances. In contrast, using both PSFs in a single deconvolution results in sharp discontinuities at the $z$ position transitioning between experimental and retrieved PSFs (Note S4 in the Supplementary Material). The image volumes with color-coded depth are presented in Fig. 7(a). Observations from full FOV images (Fig. S8 in the Supplementary Material) highlighted that for thicker segments of the roots, the deconvolution results with the experimental PSF appear noisier compared with the method using neural fields. Meanwhile, deconvolution with the retrieved PSF shows a broader spread in $z$ (Fig. S9 in the Supplementary Material), due to minor mismatches between the experimental and retrieved PSFs that reduce $z$ axis accuracy and precision. The neural-field method addresses these issues by maintaining the accuracy and precision of the deconvolution with the experimental PSF while avoiding its SNR problems.

We further examine the zoomed region of a thicker part of the root [Fig. 7(c)], where the neural-field reconstruction [Fig. 7(c-iii)] shows significantly sharper cell walls compared with the deconvolution results with either the experimental [Fig. 7(c-i)] or the retrieved PSF [Fig. 7(c-ii)]. This improvement is attributed to our neural-field method, jointly taking the contribution from fluorescence signals at extended depth regions and accurate PSFs of the experimental calibrations. The enhancements in both the lateral and axial directions are also observed in out-of-focus thinner regions of the root [Figs. 7(d) and 7(e)], where $x y$ views using the neural-field method show sharper images and the $y z$ views exhibit a narrower spread compared with the deconvolution results using the retrieved PSF; the width mirrors the precision of the experimental PSF while overcoming its limitations of low SNR at larger defocus. Improved reconstruction quality with neural fields is also demonstrated in additional zoomed regions (Figs. S10 and S11 in the Supplementary Material), showing mitigation of artifacts from deconvolution with the experimental PSF, and visualization of fine structures such as root hairs, which cannot be resolved using deconvolution with the retrieved PSF.

4 Discussion and Conclusion

Our proposed method combines the QuadraPol PSF with neural fields to achieve SVF imaging. A major innovation in hardware is the usage of the four-polarization custom polarizer, which simultaneously encodes depth information across four images (three of which are independent) captured by a polarization camera. This setup maximizes the capabilities of the polarization camera, completely eliminating the estimation ambiguity while maintaining a compact footprint, which allows the QuadraPol PSF to enable SVF without a sparsity constraint and thus achieve a better SNR compared with the current state-of-the-art SVF techniques. With a single FOV of $3.8 mm \times 4.5 mm$ , we experimentally demonstrate the ability to capture a $\sim 100 - {mm}^{3}$ cubic volume with lateral and axial resolutions of $\sim 7$ and $240 μ m$ , respectively. The extended DOF offered by this system reduces acquisition time by $\sim 20$ times compared with traditional $z$ -scan methods for capturing the same volume. The effectiveness of the QuadraPol PSF is shown in our all-in-focus imaging of bacterial colonies on sand surfaces, where it captures sharp images of most of the colonies in a single snapshot.

For the reconstruction algorithm, neural fields demonstrate better performance compared with deconvolution, as demonstrated in our plant root imaging experiments, where neural fields produce sharper images. The neural field approach also offers additional advantages such as creating a continuous representation of the object and significantly decreasing data storage requirements by an order of magnitude. For instance, storing plant root data as discrete images with 81 axial slices requires $\sim 62 GB$ , whereas neural fields of the same data require only 6.12 GB with the same data precision. Furthermore, this physics-based method is free from any pre-training or any data set collection. It can also be easily adapted to different system settings.

SVF imaging using the QuadraPol PSF and neural fields provides a powerful tool for large-FOV imaging at high spatial–temporal resolutions. The design principle can be easily adapted to meet specific application requirements. For instance, our current setup prioritizes high temporal resolution using a polarization camera with a large effective pixel size, thus sacrificing spatial resolution. However, systems requiring a higher spatial sampling rate from the detector could use a standard camera combined with polarization optical elements and temporal multiplexing to address this limitation. One current limitation of the QuadraPol PSF is that it relies on the expansion of four simultaneously-in-focus spots to estimate depth, which inherently limits the imaging depth. If the source-to-focal-plane distance increases, the PSFs in all four polarization channels expand, causing photons to spread across a larger area and consequently reducing the SNR at large defocus distances. In other words, due to the limited DOF extension of the QuadraPol PSF, its SBP still has room for improvement compared with the Miniscope3D13 or the Fourier DiffuserScope14 [Fig. 4(h)]. Integrating the QuadraPol PSF with DOF extension methods used in these other systems, such as using an engineered metasurface to create PSFs in four different polarization channels with distinct focal planes, should improve the SBP. Further, these methods can potentially overcome the limitation of low-depth resolution near the focal plane without introducing aberrations. We anticipate that these sophisticated improvements of QuadraPol would be worth implementing as this technology matures. Given the presence of a polarization camera in the QuadraPol PSF, a natural extension of the method would be to simultaneously measure polarization.59 Similar to what has been demonstrated with the MVR microscope,45 the QuadraPol PSF should exhibit high polarization sensitivity. Previous work in polarization imaging, such as 3D Mueller matrix imaging60^–62 and Stokes correlometry,63 has established frameworks for quantitative analysis of light polarization. This direction holds particular promise, as fluorescence polarization has been shown to be critical in plant studies64^,65 and other areas of biological research.66

In summary, our experimental results highlight the potential of SVF imaging using the QuadraPol PSF and neural fields, particularly for applications such as studying microbial interactions within the rhizosphere. In addition, we expect that our SVF system can be applied across various fields, including clinical and biomedical research. For instance, we tested our method on synthetic lymph node vasculature data67 (Fig. S12 in the Supplementary Material), experimental mouse kidney data (Fig. S13 in the Supplementary Material), and demonstration of 3D particle image velocimetry (Fig. S14 in the Supplementary Material). The simplicity of the optical setup also holds the potential for miniaturization, which would be useful for in vivo imaging in freely behaving animals. Moreover, in terms of technological advancement, our system introduces an approach to SVF design: instead of mapping 3D objects to 2D images, our method maps 3D objects to 3D measurements, with polarization serving as the third dimension. We anticipate this innovative approach will inspire future developments in SVF imaging system designs.

Oumeng Zhang is a postdoctoral researcher in the Biophotonics Lab at the California Institute of Technology (Caltech). He received his BS degree in electrical engineering from Shanghai Jiao Tong University in 2014 and his MS and PhD degrees in electrical engineering from Washington University in St. Louis, in 2017 and 2022, respectively. His research interests include computational imaging, single-molecule microscopy, and imaging system design.

Haowen Zhou is a PhD candidate in electrical engineering at Caltech. His research focuses on computational microscopy, physics-based machine learning and AI for health. He is a Schmidt GRA Fellow (2025), a recipient of SPIE optics and photonics scholarship (2024), and Gupta S2I Fellow (2022).

Brandon Y. Feng is a postdoctoral associate at MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). His research includes physics-based AI, computer vision, and machine learning.

Elin Larsson is a PhD candidate in bioengineering at Caltech. Her research includes strain engineering of environmental bacterial strains and advancing understanding of chemical signaling between entomopathogenic nematodes and their endosymbiotic bacteria.

Reinaldo E. Alcalde is a postdoctoral scholar in biology and biological engineering at Caltech. His interdisciplinary research integrates environmental engineering, microbiology, genetics, and advanced imaging technologies to address global sustainability challenges. He focuses on microbial interactions in the rhizosphere, with an emphasis on nutrient cycling and adaptive responses to environmental stress.

Siyuan Yin is a PhD student in the Department of Medical Engineering at Caltech. Her research focuses on optical microscopy and the application of deep learning techniques for advanced pathology analysis.

Catherine Deng is a senior at Caltech majoring in electrical engineering. She is interested in the intersection of optics and machine learning.

Changhuei Yang is the Thomas G. Myers Professor of Electrical Engineering, Bioengineering, and Medical Engineering at Caltech. His research includes Fourier ptychography, wavefront shaping, and deep learning for pathology analysis.

Category: Research Articles

Received: Aug. 23, 2024

Accepted: Jan. 24, 2025

Posted: Jan. 24, 2025

Published Online: Feb. 26, 2025

The Author Email: Zhang Oumeng (ozhang@caltech.edu)

DOI:10.1117/1.AP.7.2.026001

CSTR:32187.14.1.AP.7.2.026001