Snapshot macroscopic Fourier ptychography: far-field synthetic aperture imaging via illumination multiplexing and camera array acquisition

Sheng Li; Bowen Wang; Haitao Guan; Qian Chen; Chao Zuo

doi:10.3788/AI.2024.10005

1. Introduction

The synthetic aperture (SA) technique is crucial for many far-field detection applications, such as Earth observation^[1], remote sensing^[2], and astronomical imaging^[3,4]. By synthesizing a large virtual aperture, it overcomes the diffraction limit of finite-size apertures, enabling higher-resolution imaging. Synthetic aperture radar (SAR) has been particularly successful, directly measuring the complex field with high temporal resolution to effectively synthesize large virtual apertures and enhance resolution in microwave frequencies^[5,6]. Over the past half-century, SAR has become an indispensable tool in various detection fields. Applying the SA technique in visible light promises even greater resolution enhancements compared with SAR due to shorter wavelengths^[7–10]. However, challenges arise because light waves contain both amplitude and phase information, and the extremely high oscillation frequency of visible light (approximately $10^{14}$ ) prevents direct phase detection by human eyes or photodetectors^[11,12]. This makes realizing a visible-light synthetic aperture difficult. Additionally, methods like interferometric phase measurement and co-phasing detection for aperture synthesis often face stability issues. These challenges have hindered the application of high-resolution SA techniques in visible-light imaging.

Fourier ptychographic microscopy (FPM) is a powerful phase retrieval method that iteratively combines real and reciprocal space information^[13–15]. Based on Fourier optics and ptychography^[16], FPM illuminates a sample from different angles using an light-emitting diode (LED) array^[17,18]. This enables the capture of diverse spatial frequencies, facilitating simultaneous phase recovery and aperture synthesis^[19,20]. Its simple hardware and superior imaging capabilities make FPM widely applicable in biomedical and pathological studies^[21–23]. By replacing the LED array with aperture or camera scanning, Fourier ptychography (FP) can be extended to far-field detection as macroscopic FP^[24–27]. This non-interferometric aperture synthesis method enhances resolution and corrects aberrations over long distances^[28,29]. Importantly, macroscopic FP addresses the inherent trade-off between the imaging distance and field of view (FOV), making it highly practical for far-field applications^[30,31].

Despite their physical differences, FPM and macroscopic FP face common challenges in phase retrieval techniques, including sub-aperture positioning errors^[32], low signal-to-noise ratios (SNRs)^[33,34], and slow reconstruction speeds^[35]. Their mathematical similarities enable shared solutions, like simulated annealing algorithms^[36], adaptive step-size strategy^[37], and single-photon avalanche diode (SPAD) array cameras^[38]. Both FPM and macroscopic FP rely on data redundancy for stable iterative convergence. This is achieved through dense illumination/scanning sampling, but at the cost of numerous image acquisitions and reduced efficiency. Multiplexing has proven effective in computational imaging by decoupling individual images for greater information extraction^[39–45]. In FPM, illumination multiplexing reduces image acquisition without sacrificing data redundancy^[46,47]. Color cameras enable direct R/G/B channel separation but face photon efficiency and filter-induced downsampling issues^[48].

Unlike FPM, simultaneous multi-wavelength illumination in macroscopic FP does not improve temporal resolution because it relies on aperture scanning, not LED positioning, for spectral shifts. Camera arrays can potentially enhance temporal resolution, as seen in some FPM implementations^[49,50]. However, the lack of physical overlap between camera array sub-apertures inhibits effective redundancy for macroscopic FP, where the overlap must exceed 40%^[51,52]. Wu et al. demonstrated low-redundancy (25% overlap) aperture synthesis using a camera array and total variation (TV) regularization^[53], but this has not yet been applied to practical macroscopic FP systems. Deep neural networks can extract spectral information from camera arrays for rapid reconstruction^[54,55]. Our group achieved improved resolution and SNR in macroscopic FP using a nine-path convolutional neural network^[56], though reconstruction quality remains somewhat dataset-dependent. To address far-field imaging of moving targets or dynamic scenes, methods for achieving high-temporal-resolution macroscopic FP are crucial.

To address the above challenges, we present illumination-multiplexed snapshot synthetic aperture imaging (IMSS-SAI) for macroscopic Fourier ptychographic imaging. By employing a camera array for simultaneous data acquisition, we expand the Fourier coverage in reciprocal space without camera scanning. By employing a state-multiplexed ptychographic algorithm in IMSS-SAI, we effectively separate distinct coherent states from their incoherent summations, enhancing the Fourier spectrum overlap for ptychographic reconstruction. Simulations (USAF resolution chart and commemorative coin) and experiments with far-field targets demonstrate the effectiveness of IMSS-SAI. Our method delivers a fourfold increase in spatial resolution without sacrificing temporal efficiency. Furthermore, we successfully reconstructed a dynamic music box image with both high temporal and spatial resolutions. IMSS-SAI offers a practical and easily implemented solution for fast, long-range, and high-quality super-resolution far-field detection.

2. Principle and Method

In a standard macroscopic FP setup, a translation stage sequentially scans the aperture to obtain position-varied spectrum information, with images acquired by individual cameras. IMSS-SAI replaces the translation stage and individual cameras with a camera array, enabling simultaneous capture of spectrum information at multiple positions. Figure 1 illustrates the IMSS-SAI system. A fiber laser emits light simultaneously at three distinct wavelengths. After passing through a lens to satisfy the Fraunhofer diffraction condition, the light propagates toward the target. Upon reflection, the target’s optical field information is captured by a $5 \times 5$ camera array. Each sensor in the array records a sub-image with an intensity distribution: $I (x, y) = \sum_{l}^{n} I_{l} (x, y) = \sum_{l}^{n} {| F^{- 1} [Ψ (u, v) \cdot P (u_{l} - u_{c}, v_{l} - v_{c})] |}^{2},$ (1)where $Ψ (u, v)$ is the light field, $I_{l} (x, y)$ is the intensity distribution of the $l^{th}$ wavelength, $F^{- 1}$ is the inverse Fourier transform, and $P (u_{l} - u_{c}, v_{l} - v_{c})$ represents the pupil function corresponding to the respective wavelength. When utilizing coherent light of three wavelengths to illuminate distant targets simultaneously, it forms hybrid coherent illumination. Hence, the image recorded by an individual camera is the summation of image intensities generated by the three respective wavelengths.

Figure 1.Overview of the IMSS-SAI framework. The target is illuminated by a tri-wavelength fiber laser, and the Fourier spectrum is formed at the aperture plane of the camera array.

Download full size

View all figures

The target is positioned at the center of the captured frames by compact arrangement and slight directional adjustments among the cameras. This approach simplifies the reconstruction of high-resolution images without needing image registration. The camera array, by recording images from different sub-apertures, effectively obtains spectrum information from various positions, extending the spectrum range and achieving synthetic aperture imaging (See Visualization 1 for the whole video recording). To accurately quantify the overlapping ratio between each sub-aperture, the spectral shift components corresponding to the different wavelengths must be calculated. The aperture distribution, extending outward from the center, includes the central aperture, and the R/G/B channel apertures located on the inner ring and outer ring, respectively, as illustrated in Fig. S1(b1) in Supplement 1.

The primary distinguishing feature of IMSS-SAI lies in its capacity to “regenerate” information redundancy lost between camera arrays. Within an individual detector of IMSS-SAI, discernible variations in the size and positioning of the pupil functions are observed, a phenomenon attributed to the wavelength diversity: First, the cutoff frequency $f_{c}$ of coherent optical imaging systems is restricted by the wavelength $λ$ and is precisely denoted as $f_{c} = NA / λ$ , where NA is the numerical aperture. In addition, under a standard wavelength, the frequency-domain offset $f_{s}$ between two sub-apertures is constrained by the utilized wavelength and can be mathematically represented as $f_{s} = \frac{1}{λ} \cdot \frac{δ}{\sqrt{δ^{2} + {(L \cdot | β |)}^{2}}} = \frac{f_{c}}{NA} \cdot \frac{δ}{\sqrt{δ^{2} + {(L \cdot | β |)}^{2}}},$ (2)where $δ$ is the spacing between cameras in airspace, $L$ is the imaging distance, and $β$ is the magnification of the imaging system. Hence, within an individual sub-aperture, the spectrum information of images captured at various wavelengths encompasses overlapping and distinct components. It is noteworthy that, following the offset formula, the frequency-domain spacing is non-linear in terms of the actual spacing of the camera. This indicates that the positions of each sub-aperture in the array camera must be determined based on the imaging distance, ensuring the requisite information redundancy for macroscopic FP.

We have refined the state-multiplexed FP algorithm to enhance its suitability for macroscopic Fourier ptychographic imaging with a camera array. The iterative reconstruction process is depicted in Fig. 2. We employ a cumulative averaging method on the sub-aperture images of diffuse targets, utilizing the resultant average as the initial image amplitude to facilitate an accelerated convergence. The initial complex amplitude is estimated as $\sqrt{I_{h}} e^{i φ_{h}}$ , with the phase set to 0 at the outset. This sample estimate generates multiple low-resolution target images corresponding to different coherent states. Subsequently, the low-resolution images generated for different coherent states are combined to simulate the image captured by a single camera, and the process can be expressed as $I_{t} = {| \sqrt{I_{h}} e^{i φ_{h}} \cdot P_{rt} |}^{2} + {| \sqrt{I_{h}} e^{i φ_{h}} \cdot P_{gt} |}^{2} + {| \sqrt{I_{h}} e^{i φ_{h}} \cdot P_{bt} |}^{2},$ (3)where $P_{rt}$ , $P_{gt}$ , and $P_{bt}$ represent the pupil functions corresponding to the red, green, and blue wavelengths at the $t^{th}$ aperture. By utilizing a phase retrieval algorithm, IMSS-SAI is capable of recovering phase information under the constraint of spectrum overlapping of different channel sub-apertures. Despite the large amount of work devoted to the FP phase retrieval problem, the alternating projection (AP) method derived from the Gerchberg–Saxton (GS) algorithm is still a stable and widely used scheme for IMSS-SAI. The actual captured sub-aperture intensity image of the $t^{th}$ aperture $I_{m}$ is used to update the components of each wavelength in $I_{t}$ , while preserving corresponding phase information: $\sqrt{I_{rt}^{'}} e^{i φ_{rt}} = \sqrt{\frac{I_{m}}{I_{t}} \cdot I_{rt}} e^{i φ_{rt}} .$ (4)

Figure 2.Flow chart of the iteration reconstruction of IMSS-SAI.

Download full size

View all figures

In this context, the pre-update intensity distribution, denoted as $I_{rt}$ , represents the red channel’s intensity distribution for $t^{th}$ apertures. The updating rule for the green and blue channels follows a similar procedure. The updated target image is employed to modify the corresponding spectrum region of the sample estimate. With each component update, it is necessary to reaccumulate low-resolution images of different coherent states to update $I_{t}$ (as shown in Fig. 2).

Unlike conventional microscopy, where thin phase objects are assumed to have similar responses across spectral channels, IMMS-SAI relies on a different assumption: for diffuse reflection targets, intensity distributions remain approximately consistent across wavelengths, despite variations in phase caused by illumination differences. IMSS-SAI is best suited for targets with minimal color variation, where intensities at different wavelengths are strongly correlated. Additional noise reduction algorithms, with adjusted parameters to mitigate errors induced by depth variation, are discussed in Supplement 1. We employ the standard phase retrieval iterative engine (PIE) algorithm instead of ePIE (extended PIE) due to inconsistent aberrations across both camera array apertures and different spectral channels^[57,58]. These aberrations would hinder the convergence of a state-multiplexed FP algorithm. Speckle noise is a common issue in far-field detection with rough-surface diffuse reflection targets. Therefore, denoising techniques are essential for mitigating speckle noise’s influence on reconstruction results. Our previous work^[30] examined speckle noise formation and its influence on coherent imaging of smooth versus diffuse objects. We enhance texture details and achieve high SNR by incorporating TV regularization and guided filtering in our reconstruction process. Unlike our previous use of total variation guided filtering (TVGF) during the iterative process, which disordered phase information in IMSS-SAI, we apply TVGF to the amplitude of the reconstructed image after 25 iterations (out of 30 total) across all apertures, ensuring both high resolution and superior SNR.

3. Simulation to Demonstrate the Resolution Enhancement of IMSS-SAI

We separately simulated the reconstruction results for a smooth target (USAF resolution chart, without speckle noise) and a rough target (coin exhibiting pronounced speckle noise) to validate the algorithm’s effectiveness. The results are shown in Fig. 3. Here, the imaging system is configured at a distance of 1.6 m from the target, with incident wavelengths as 653 nm, 515 nm, and 445 nm. The F-number of the sub-apertures is designated as 32 to acquire low-resolution images, with a pixel size set at 1.85 µm. We simulated low-resolution image acquisition using a $5 \times 5$ camera array. The sub-aperture positions within the camera array were optimally arranged following the distribution illustrated in the top-left corner of Fig. 1.

Figure 3.Reconstructions of the USAF resolution chart and the coin. (a1), (a2) The spectrum information and corresponding low-resolution image captured by a single camera. (b1), (b2) The expanded spectrum range and the reconstructed result by IMSS-SAI, with the corresponding line profile of group 0 shown at the bottom. (c1)–(c3) The low-resolution image of the coin and the regions of interest. (d1)–(d3) The averaging result and the regions of interest. (e1)–(e3) The recovery result and the regions of interest.

Download full size

View all figures

The spectrum information and low-resolution image captured by a single camera are illustrated in Figs. 3(a1) and 3(a2), respectively. The maximum coverage range of the spectrum in the sub-aperture reaches the cutoff frequency of blue light. Subsequently, we conducted reconstruction based on the optimal arrangement. The expanded spectrum range and the corresponding reconstructed result are depicted in Figs. 3(b1) and 3(b2). The resolvable line pairs increased from 0.4454 line pairs per millimeter (group -2, Element 6) to 1.7818 line pairs per millimeter (group 0, Element 6). The corresponding line profile of group -3 is shown at the bottom of Fig. 3(b1), demonstrating a fourfold enhancement in the resolution of the proposed IMSS-SAI. We introduced random phase values in the range of $[- π, π]$ to generate low-resolution images with speckle noise, as depicted in Fig. 3(c1). As objective criteria, we utilized the peak-signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) to evaluate the image quality of the reconstruction result, the raw image, and the cumulative average. Figure 3(d1) represents the cumulative average of images captured by the camera array, while Fig. 3(e1) showcases the reconstructed results achieved through IMSS-SAI. It is noticeable that Fig. 3(d1) approaches the incoherent diffraction limit of a single sub-aperture, yet finer details remain blurred. Our method compresses the aperture-scanning process into a single image acquisition, thereby realizing a fourfold increase in resolution by implementing illumination multiplexing. In addition, we compared the reconstruction results of IMSS-SAI with the conventional aperture scanning method and demonstrated that the proposed method can reduce the imaging time by a factor of 121 without compromising the quality, as illustrated in Supplement 1.

4. Experiments to Demonstrate the Dynamic Imaging of IMSS-SAI

To assess the resolution enhancement of actual targets through illumination multiplexing, we constructed an experimental platform following the design shown in Fig. 1. The central wavelengths of the laser light source matched our simulated setting, used for illuminating two real targets located 2 m away. A configuration of 25 lenses with identical parameters (75 mm Fujinon lens) and sensors (DMK33UX226, Imaging Source, 2400 pixel × 2400 pixel, 1.85 µm pixel size) is arranged on the camera array. The detectable photons decay with increasing imaging distance, which can be avoided by utilizing quasi-plane wave illumination or a high-power pulsed laser to an extent^[30]. The F-number of each aperture in the $5 \times 5$ camera array was adjusted to 22, and 25 low-resolution images were captured. The low-resolution image from a single aperture presents the incoherent mixture of target information at three different wavelengths, as illustrated in Figs. 4(a1) and 4(c1). Utilizing the improved state-multiplexed FP algorithm, the incoherent mixture of sub-aperture images was decoupled, and the required high-resolution image was subsequently reconstructed. Figures 4(a3) and 4(c3) showcase the reconstructed high-resolution images. To demonstrate the effectiveness of the IMSS-SAI in resolution enhancement, we present the results of directly averaging 25 low-resolution images, as depicted in Figs. 4(a2) and 4(c2). It is observed that the averaging process mitigates the influence of speckle noise. However, compared to the averaged results, the proposed method further reconstructs detailed texture information.

Figure 4.Experimental results of IMSS-SAI on far-field targets. (a1)–(d1) The low-resolution images and the corresponding regions of interest. (a2)–(d2) The averaging results and corresponding regions of interest. (a3)–(d3) The recovery results and corresponding regions of interest.

Download full size

View all figures

It is worth noting that there are differences between the results obtained by IMSS-SAI and the averaging of independently reconstructed results from three channels. The improvement in SNR resulting from averaging is not based on speckles formed by different angles. Single-channel reconstruction under low redundancy conditions fails to reconstruct high-resolution images accurately, and even averaging the three channels only slightly suppresses speckle noise without improving resolution, as discussed in Supplement 1.

Furthermore, we demonstrated the high temporal resolution of the IMSS-SAI by capturing super-resolved videography of dynamic scenarios containing isolated individual samples (see Visualization 2 for the whole video recording). We utilized the same parameters as for the experiments in the static scenario, with an F-number of sub-apertures of 22. The imaging scenario by sub-aperture and reconstruction is depicted in Fig. 5(a), featuring a Harry Potter music box with mostly stationary elements. The color of the music box tends to be a uniform dark green, so aperture synthesis can be performed using IMSS-SAI. A clockwise-moving train is a moving target captured by the camera array at 30 frame/s, the maximum frame rate set by the camera program. Figures 5(b) and 5(c) present a comparison of results for two stationary regions in the scene. It can be observed that the original low-resolution image is too coarse to discern meaningful details, while the result obtained through cumulative averaging of sub-images only displays partial contours. In contrast, the proposed IMSS-SAI yields optimal super-resolution reconstruction, and the imaging SNR is further enhanced through TVGF denoising processing. The reconstruction results for the moving target at different time points are illustrated in Fig. 5(d). Compared to the original images, the reconstruction method consistently produces high-resolution, high SNR images across 480 continuous frames. The eagle emblem on the train’s front, representing the most discernible target, is reconstructed and identified at each position. This experimental evidence substantiates that the proposed method enhances the capabilities of the synthetic aperture technique in detecting moving targets.

Figure 5.Constructed music box dynamic rotating imaging results. (a) Comparison of the recorded low-resolution image (under F-number 12) with the predicted super-resolved image. (b1)–(b4) Raw image, averaging result, recovery result, and denoising result of the region of interest 1, respectively. (c1)–(c4) Raw image, averaging result, recovery result, and denoising result of the region of the interest 2, respectively. (d1)–(d6) The instantaneous frames of the eagle emblem captured every 0.5 s (15 frames) with IMSS-SAI.

Download full size

View all figures

5. Conclusion and Discussion

In summary, we have proposed a camera-array-based single-shot synthetic aperture imaging, combining illumination multiplexing FP with a camera array to achieve single-shot far-field macroscopic FP imaging. Employing IMSS-SAI on platforms such as satellites and unmanned aerial vehicles enables precise target identification, intelligence gathering, and tracking with exceptional accuracy. Illumination multiplexing provides decoupled low-resolution images to achieve sufficient information redundancy. The parallel acquisition at different positions of the camera array offers scalable spectrum information. By refining the state-multiplexed FP algorithm, the 25 incoherently mixed sub-images of a $5 \times 5$ camera array can be decoupled into 75 images. The resolution of the reconstructed results with the IMSS-SAI method has increased fourfold compared to the raw images. Furthermore, we have presented single-shot reconstruction results of the moving target, indicating that IMSS-SAI is a promising method for real-time, long-term, and high-quality dynamic scene recording under far-field detection conditions.

It should be emphasized that the current work is mainly focused on the construction of redundant information in macroscopic Fourier ptychographic imaging. The utilization of additional wavelengths for illumination increases the complexity of algorithmic decoupling, thereby failing to enhance the resolution of synthetic aperture imaging. Conversely, it impairs the efficiency and quality of the reconstruction process. By further enhancing the scale and performance of the camera array, it is expected to achieve higher resolution and more rapid synthetic aperture imaging. Considering that cameras farther from the array center introduce greater aberrations, we believe that navigating a graceful balance between the scale of the camera array and the enhanced resolution achievable through synthetic aperture should be more significant.

Category: Research Article

Received: Mar. 21, 2024

Accepted: May. 15, 2024

Published Online: Jun. 7, 2024

The Author Email: Qian Chen (chenqian@njust.edu.cn), Chao Zuo (zuochao@njust.edu.cn)

DOI:10.3788/AI.2024.10005

微信扫一扫：分享