Chinese Optics Letters, Volume. 22, Issue 6, 060005(2024)

High-resolution single-photon LiDAR without range ambiguity using hybrid-mode imaging [Invited]

Xin-Wei Kong1,2,3, Wen-Long Ye1,2,4, Wenwen Li1,2,4, Zheng-Ping Li1,2,4,*, and Feihu Xu1,2,4,**
Author Affiliations
  • 1Hefei National Research Center for Physical Sciences at the Microscale and School of Physical Sciences, University of Science and Technology of China, Hefei 230026, China
  • 2Shanghai Research Center for Quantum Science and CAS Center for Excellence in Quantum Information and Quantum Physics, University of Science and Technology of China, Shanghai 201315, China
  • 3School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai 264209, China
  • 4Hefei National Laboratory, University of Science and Technology of China, Hefei 230088, China

    We proposed a hybrid imaging scheme to estimate a high-resolution absolute depth map from low photon counts. It leverages measurements of photon arrival times from a single-photon LiDAR and an intensity image from a conventional high-resolution camera. Using a tailored fusion algorithm, we jointly processed the raw measurements from both sensors to output a high-resolution absolute depth map. We scaled up the resolution by a factor of 10, achieving 1300 × 2611 pixels, and extended the unambiguous range by a factor of ∼4.7. These results demonstrate a superior capability for long-range, high-resolution 3D imaging without range ambiguity.


    1. Introduction

    Single-photon light detection and ranging (LiDAR) presents high sensitivity and high temporal precision, which has been widely applied in fields such as topographic mapping[1-3], remote sensing[4], target identification[5,6], and underwater imaging[7]. To meet the application demands, long-range and high-resolution single-photon three-dimensional (3D) imaging has emerged as a significant trend in the development of single-photon LiDAR techniques[8,9]. However, it remains challenging to directly achieve rapid and accurate 3D imaging over a wide field-of-view (FoV) and a large depth-of-view (DoV).

    Array-based single-photon LiDAR can be used to achieve high-resolution 3D imaging[10]. However, it requires a high-power laser to flood-illuminate the scene. Besides, currently available detector arrays have limited sizes or poor time-tagging performance[11]. Therefore, widely used single-photon LiDAR is typically based on raster scanning[12,13]. However, high-density scanning inevitably leads to longer imaging times. To mitigate this issue, data fusion techniques have been proposed that merge visible or infrared high-resolution images with single-photon LiDAR data to improve imaging resolution[14-16].

    Generally, single-photon LiDAR employs the time-correlated single-photon counting (TCSPC) technique. However, when the target is far away, a photon time of flight (ToF) that exceeds the laser emission period is folded back into one period, resulting in range ambiguity[17], which complicates large-DoV imaging. Several approaches have been proposed to mitigate the range ambiguity. A pseudo-random pattern matching scheme[18-21] can identify the exact flight time through correlation between the transmitted and received patterns. Meanwhile, the multi-repetition-rate scheme has been demonstrated to increase the maximum unambiguous distance beyond 100 km[22] and to achieve large-DoV imaging[23]. Nonetheless, a comprehensive solution that achieves a wide FoV and a large DoV simultaneously is still lacking.

    Here, we proposed and demonstrated a fusion method that simultaneously tackles the range-ambiguity and low-resolution bottlenecks of single-photon LiDAR. In hardware, we integrated a multi-repetition-rate single-photon LiDAR with a high-resolution intensity camera. In software, we developed a tailored fusion algorithm that recovers absolute distance and enhances image resolution in the low-photon-count regime. We experimentally validated the ability to reconstruct high-resolution absolute depth images: the image resolution was scaled up by a factor of 10, achieving 1300×2611 pixels, and the unambiguous range was extended by a factor of 4.7. Consequently, our method holistically achieves long-range, high-resolution 3D imaging of expansive scenes with high depth accuracy over a wide FoV and a large DoV.

    2. Approach

    In single-photon imaging, the system illuminates the target’s pth pixel with a periodic laser pulse s(t) and then measures the backscattered photons. By recording the time interval t between the arrival of the echo signal and the most recent pulse emission, the depth Z_p and reflectivity α_p of the target’s pth pixel can be estimated. However, when the target is far away, the portion of the photon ToF that exceeds the laser emission period T is folded away, resulting in the Poisson-process rate function
    $$\lambda_p(t)=\eta\alpha_p\,s(t+n_pT-2Z_p/c)+B,\quad t\in[0,T),$$
    where η is the detector’s photon-detection efficiency, B represents the average rate of background-light plus dark-count detections, and c is the speed of light. The term n_pT represents the folded part of the photon ToF.
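As a concrete illustration of the folding behind this rate function, the following sketch (our own illustration, not code from the paper) computes the TCSPC-recorded interval and the fold count n_p for a given depth and repetition period:

```python
import numpy as np

C = 3e8  # speed of light, m/s

def folded_arrival_time(depth_m, period_s):
    """Return the TCSPC-recorded interval t in [0, T) and the fold count
    n_p for a target at depth_m under repetition period period_s. The true
    time of flight 2*Z/c is wrapped modulo T, the origin of range ambiguity."""
    tof = 2.0 * depth_m / C
    n_p = int(tof // period_s)      # whole periods folded away
    t = tof - n_p * period_s        # what the TCSPC module actually records
    return t, n_p

# A 1 us period gives a 150 m unambiguous range, so a 400 m target folds twice:
t, n = folded_arrival_time(400.0, 1e-6)
```

Note that depths separated by cT/2 (150 m here) yield identical recorded times, which is exactly the ambiguity the rest of the method addresses.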

    After N pulsed-illumination trials, the likelihood function for the set of time intervals $\{t_p^{(l)}\}_{l=1}^{k_p}$ is
    $$P(\{t_p^{(l)}\}_{l=1}^{k_p};Z_p,\alpha_p)=e^{-\Lambda}\prod_{l=1}^{k_p}N\lambda_p(t_p^{(l)}),$$
    where $\Lambda=\int_{0}^{T}N\lambda_p(\tau)\,\mathrm{d}\tau$, and k_p is the total number of photons detected at the pth pixel. Generally, the target distance can be estimated by applying maximum likelihood estimation (MLE):
    $$Z_p^{\mathrm{MLE}}=\arg\max_{Z_p}\sum_{l=1}^{k_p}\log\{N[\eta\alpha_p\,s(t_p^{(l)}+n_pT-2Z_p/c)+B]\}.$$

    Because the maximum likelihood estimator is a periodic function of Zp, Eq. (3) has multiple optimal solutions, which prevents a straightforward calculation of the actual distance to the target and causes range aliasing.

    To overcome this range ambiguity, we use a data acquisition scheme in which adjacent pixels are detected with different laser pulse repetition periods, together with a data fusion method that exploits images captured by the camera. The data acquisition scheme has been extensively detailed in a previous paper[23]. Here, we focus on the use of high-resolution images for absolute distance reconstruction and for upsampling of the single-photon LiDAR data. The schematic of the algorithm is illustrated in Fig. 1; the algorithm can be divided into two steps.
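Numerically, the multi-repetition-rate idea can be sketched as follows: only the true absolute depth is consistent with the folded arrival times measured under every period. The brute-force grid search below is a simplified stand-in for the weighted cluster search actually used in the paper (the periods match three of the experimental values; the search range and step are illustrative):

```python
import numpy as np

C = 3e8  # speed of light, m/s

def fold(depth, period):
    """Folded time of flight under one repetition period."""
    return (2.0 * depth / C) % period

def resolve_depth(folded_times, periods, z_max=3000.0, dz=0.01):
    """Grid-search the absolute depth whose folded ToFs best match the
    measurements under every period. The error per period is the circular
    distance on [0, T), so wrap-around near the period edge is handled."""
    zs = np.arange(0.0, z_max, dz)
    err = np.zeros_like(zs)
    for t_meas, T in zip(folded_times, periods):
        d = np.abs((2.0 * zs / C) % T - t_meas)
        err += np.minimum(d, T - d)   # circular distance on [0, T)
    return zs[np.argmin(err)]

periods = [1e-6, 1.43e-6, 1.59e-6]    # three of the experimental pulse periods
z_true = 1234.5                        # meters; well beyond any single-period range
measured = [fold(z_true, T) for T in periods]
z_hat = resolve_depth(measured, periods)
```

Because the period ratios are mutually coprime (100:143:159), the combined unambiguous range far exceeds the search window, so the minimum is unique here.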


    Figure 1. Schematic diagram of the algorithm. (a) Single-photon LiDAR data acquired by a laser source with multiple repetition rates. (b) Image captured by the camera. (c) Intensity image of (b). (d) Absolute distance image. (e) Horizontal, vertical, and diagonal gradient images from the camera image. (f) High-resolution depth image without range ambiguity.

    2.1. Resolving range ambiguity guided by the intensity image

    Upon acquiring the measurements via the multi-repetition-rate scheme, integrating data from adjacent pixels within the neighborhood Ω through cluster algorithms[20] enables determination of the absolute distance:
    $$\hat{Z}_p=\arg\max_{Z_p}\sum_{q\in\Omega}\omega_{p,q}\sum_{l=1}^{k_q}\log\{N[\eta\alpha_q\,s(t_q^{(l)}+n_qT-2Z_p/c)+B]\},$$
    where the weighting factor ω_{p,q} is used to avoid errors in the distance calculation at the edges of objects. As in the previous paper[23], we leverage spatial and reflectivity information to evaluate the weighting factor ω_{p,q} for neighboring pixels. However, because the reflectivity map of the single-photon LiDAR is susceptible to Poisson noise at low photon counts, we instead use the conventional high-resolution camera image to evaluate the reflectivity of each single-photon LiDAR pixel. Owing to the pixel-number discrepancy between the conventional camera and the single-photon LiDAR, the reflectivity value of a single-photon LiDAR pixel is a weighted average over several conventional camera pixels, i.e., a many-to-one pixel mapping:
    $$I_p=\frac{4}{\sqrt{2\pi}D}\sum_{l=1}^{D}I_p^{(l)}\,e^{-8(p-x_p^{(l)})^2/D^2},$$
    where $\{x_p^{(l)}\}_{l=1}^{D}$ and $\{I_p^{(l)}\}_{l=1}^{D}$ are the positions and intensities of the conventional camera pixels, respectively. The weighting factor is then defined as $\omega_{p,q}=f(|p-q|)\cdot g(|I_p-I_q|)$, where f and g are the spatial and reflectivity kernels, respectively, both Gaussian-shaped.
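The weighting factor can be written out directly. A minimal sketch, with kernel widths chosen arbitrarily since the paper only states that f and g are Gaussian-shaped:

```python
import numpy as np

def bilateral_weight(p, q, I_p, I_q, sigma_s=1.5, sigma_r=0.1):
    """omega_{p,q} = f(|p - q|) * g(|I_p - I_q|): a spatial Gaussian kernel
    times a reflectivity (intensity) Gaussian kernel, so a neighbor across
    an intensity edge contributes little to the cluster sum.
    sigma_s and sigma_r are illustrative assumptions, not values from the paper."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    f = np.exp(-np.sum((p - q) ** 2) / (2.0 * sigma_s ** 2))
    g = np.exp(-((I_p - I_q) ** 2) / (2.0 * sigma_r ** 2))
    return f * g

# Same position and intensity -> weight 1; an intensity jump suppresses it.
w_same = bilateral_weight((0, 0), (0, 0), 0.5, 0.5)
w_edge = bilateral_weight((0, 0), (0, 1), 0.5, 0.9)
```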

    Since the above process of solving Ẑ_p integrates the echo signals from surrounding pixels, it often results in an overly smoothed image, reducing the imaging resolution and degrading the image quality. Here, a convex optimization algorithm is employed to further enhance the accuracy of image reconstruction. The fold number for the pth pixel is determined from Ẑ_p as $\hat{n}_p=\lfloor 2\hat{Z}_p/(cT)\rfloor$. Then, taking advantage of the spatial correlations in natural scenes, we select total variation (TV) as the penalization term. Thus, the absolute depth map is derived as follows:
    $$Z^{\mathrm{MLE}}=\arg\max_{Z}\sum_{p}\sum_{l=1}^{k_p}\log\{N[\eta\alpha_p\,s(t_p^{(l)}+\hat{n}_pT-2Z_p/c)+B]\}-\beta\cdot\mathrm{penalty}(Z).$$

    The above equation constitutes a convex optimization problem and can be solved using convex optimization algorithms[24] to obtain the final estimated distance value of the target.
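For illustration, one standard discrete form of the TV penalty (the paper does not spell out its discretization; anisotropic finite differences are a common choice) is:

```python
import numpy as np

def total_variation(Z):
    """Anisotropic total variation: the sum of absolute horizontal and
    vertical finite differences. The penalty is zero on a flat region and
    charges only for depth edges, which is why it smooths noise while
    tolerating sharp object boundaries."""
    return np.abs(np.diff(Z, axis=1)).sum() + np.abs(np.diff(Z, axis=0)).sum()

flat = np.full((4, 4), 7.0)                    # constant depth patch
step = np.array([[0.0, 0.0], [1.0, 1.0]])      # one-unit depth edge
```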

    2.2. Intensity-image guided upsampling

    Furthermore, to improve the resolution of single-photon imaging, we can take advantage of the high resolution offered by conventional camera images to guide the upsampling of single-photon images. In our framework, Z_H is designated as the high-resolution single-photon depth map we aim to obtain. Correspondingly, the already-acquired absolute depth map Z^MLE represents a downsampled mapping of Z_H, and this downsampling satisfies the relation
    $$Z^{\mathrm{MLE}}=f_d(Z_H)+Z_N,$$
    where f_d(·) is the downsampling function that performs pixel-weighted summation using Gaussian weights, and Z_N represents the noise. Assuming the noise follows a Gaussian distribution, the negative log-likelihood can be expressed as
    $$L=-\log[P(Z^{\mathrm{MLE}}|Z_H)]\propto\|Z^{\mathrm{MLE}}-f_d(Z_H)\|_2^2.$$

    Thus, by applying MLE, we can obtain the high-resolution single-photon image:
    $$\hat{Z}_H=\arg\min_{Z_H}[L+\beta\cdot\mathrm{penalty}(Z_H)].$$
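A minimal sketch of the downsampling operator f_d (Gaussian blur followed by decimation; the kernel width is our assumption, as the paper does not state it):

```python
import numpy as np

def f_d(z_high, factor, sigma=None):
    """Gaussian-weighted downsampling: blur z_high with a separable,
    normalized Gaussian kernel, then decimate by `factor`. sigma defaults
    to factor/2 (an assumption, not a value from the paper)."""
    sigma = sigma if sigma is not None else factor / 2.0
    r = max(1, int(3 * sigma))                 # kernel radius ~3 sigma
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    k /= k.sum()                               # unit-gain kernel
    zp = np.pad(z_high, r, mode="edge")        # replicate borders
    z1 = np.apply_along_axis(np.convolve, 1, zp, k, "valid")  # blur rows
    z2 = np.apply_along_axis(np.convolve, 0, z1, k, "valid")  # blur columns
    return z2[::factor, ::factor]

z = np.full((8, 8), 3.0)   # flat high-resolution depth patch
low = f_d(z, 2)            # shape (4, 4); values preserved on a flat scene
```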

    Here, we employ second-order total generalized variation (TGV) regularization as the penalty term to constrain the image:
    $$\mathrm{penalty}(Z_H)=\alpha_1\|T^{1/2}(\nabla Z_H-\nu)\|_1+\alpha_0\|\nabla\nu\|_1,$$
    where $T^{1/2}$ is the anisotropic diffusion tensor, ν is an auxiliary variable, and the scalars α_1 and α_0 are non-negative weight coefficients. TGV allows sharper edge preservation while suppressing noise. Since the problem is convex but nonsmooth due to the TGV regularization term, a primal-dual optimization algorithm is used to solve it[14].
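For intuition, taking the tensor T as the identity (a simplification; the paper uses an image-driven anisotropic tensor), the second-order TGV penalty can be evaluated as:

```python
import numpy as np

def grad(u):
    """Forward-difference gradient of a 2-D field (zero at the far edges)."""
    gx = np.zeros_like(u); gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy = np.zeros_like(u); gy[:-1, :] = u[1:, :] - u[:-1, :]
    return np.stack([gx, gy])

def tgv2(z, v, alpha1=1.0, alpha0=2.0):
    """Second-order TGV: alpha1*||grad(z) - v||_1 + alpha0*||grad(v)||_1,
    with the anisotropic tensor taken as the identity. alpha values are
    illustrative assumptions."""
    term1 = np.abs(grad(z) - v).sum()
    term2 = sum(np.abs(grad(vi)).sum() for vi in v)
    return alpha1 * term1 + alpha0 * term2

# On a linear ramp, choosing v = grad(z) zeroes the first-order term, so TGV,
# unlike plain TV, does not push smooth slopes toward piecewise-constant steps.
z = np.tile(np.arange(4.0), (4, 1))   # horizontal depth ramp
v = grad(z)
```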

    3. Simulations

    We conducted simulation experiments using the Middlebury 2007 dataset[25] to validate the effectiveness of our proposed method in reconstructing high-resolution absolute distance images. The resolution of the single-photon imaging is set to 64×64 pixels. Considering the depth span of only 6 m in the simulation scenario, we scaled down the imaging system’s laser periods by a factor of 100, selecting periods of 10 ns, 14.3 ns, 15.9 ns, 16.1 ns, and 17.1 ns for the simulation; the largest single-period unambiguous range (cT/2 for the 17.1 ns period) is 2.565 m. As shown in Fig. 2, we reconstructed the depth map using our method and compared the results with two state-of-the-art methods.
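The quoted 2.565 m follows directly from the single-period unambiguous range cT/2, which is largest for the longest (17.1 ns) period:

```python
C = 3e8  # speed of light, m/s

periods_ns = [10.0, 14.3, 15.9, 16.1, 17.1]
# Single-period unambiguous range c*T/2 for each simulated period:
ranges_m = [C * T * 1e-9 / 2 for T in periods_ns]
# -> [1.5, 2.145, 2.385, 2.415, 2.565] m; the maximum is 2.565 m.
```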


    Figure 2. Simulation results. (a) Ground truth. (b) High-resolution camera image. (c) The simulation results by different methods under various PPP and SBR. From top to bottom, each row corresponds to PPP ∼1 with SBR ∼0.1, PPP ∼10 with SBR ∼0.01, and PPP ∼10 with SBR ∼0.1, respectively. From left to right, each column shows the results reconstructed by Snyder et al., Dai et al., and the proposed method without and with upsampling, respectively.

    Figure 2(c) demonstrates that conventional algorithms[26] struggle to accurately estimate the front-to-back position of a target because of range ambiguity. Dai et al.[23] achieved absolute distance recovery; however, their method leaves noise in the depth maps. Our proposed method reconstructs absolute distance by combining conventional camera images with single-photon LiDAR, reducing the impact of Poisson noise and thereby achieving higher reconstruction accuracy. Compared with Dai et al.’s method, it shows a lower root mean square error (RMSE), demonstrating superior absolute distance reconstruction even with low photon counts and a low signal-to-background ratio (SBR). Moreover, we used the conventional camera images for upsampling, which enriches target details and remarkably improves image resolution; compared with the results before upsampling, the upsampled result has a lower RMSE.

    Comparing our method with Dai et al.’s method in terms of RMSE under the same conditions, we find that reconstructions relying purely on LiDAR data, especially in low-PPP (photons per pixel) and low-SBR scenarios, tend to contain noisy pixels. With the upsampling guidance, our algorithm performs well. As shown in Fig. 3, our method outperforms Dai et al.’s method in terms of RMSE. The RMSE of our results initially decreases and then stabilizes as the SBR/PPP increases, showing that our results achieve the best accuracy.


    Figure 3. The RMSE in simulations with different PPP and SBR levels. (a) For PPP ∼1 with SBR ∼0.01, 0.05, and 0.1, the RMSE results are calculated for the methods of Dai et al. and the proposed method without and with upsampling. (b) For SBR ∼0.1 with PPP ∼0.5, 1, 5, and 10, the RMSE results are calculated for the methods of Dai et al. and the proposed method without and with upsampling.

    4. Experiment

    4.1. Experimental setup

    The schematic of our long-range, high-resolution single-photon imaging system is shown in Fig. 4. We use a digital full-frame camera with its pixel resolution set to 7008×4672; the focal length of the camera’s objective lens is 400 mm. A raster-scanning single-photon LiDAR using a laser source with multiple repetition rates provides the raw depth data. The scanning interval is set to 100 µrad. The single-photon LiDAR uses a coaxial design, allowing highly efficient detection over a wide span of detection distances. To eliminate local noise in this coaxial system, we temporally separate laser emission and detection and employ two acousto-optic modulators (AOMs) for noise suppression. The system employs a 1550 nm fiber pulsed laser whose period is adjustable through an external trigger and typically set between 1 and 2 µs. The maximum laser emission power of the system is 250 mW. The system includes a homemade InGaAs/InP single-photon avalanche diode (SPAD) detector with a detection efficiency of 30% and a dark count rate of 1.2 kcps (cps, counts per second), and a homemade field-programmable gate array (FPGA) board for precise timing control. Moreover, we use the pixel signals output from the micro-electromechanical system (MEMS) mirror to distinguish pixels and implement a scanning method in which each pixel is illuminated at a specific repetition rate, with different rates employed for adjacent pixels.


    Figure 4. The layout of the system. (a) Conventional high-resolution camera. (b) Single-photon LiDAR. (c) Data processing system.

    4.2. Experimental results

    As shown in Fig. 5(a), we imaged residential buildings located 0.4 to 1.6 km away. The experiment was conducted with five laser pulse periods (1 µs, 1.43 µs, 1.59 µs, 1.61 µs, 1.71 µs) and a per-pixel acquisition time of 330 µs. We collected a single-photon image of 128×250 pixels with an average PPP of 4.07. Guided by the intensity information from the camera, we obtained the absolute depth estimate shown in Fig. 5(d). Furthermore, using the contour information extracted from the same image, we generated a depth map with 10-fold higher resolution (1300×2611) while maintaining high depth accuracy, as illustrated in Fig. 5(e). Comparing Figs. 5(f) and 5(g), our method displays better building detail after upsampling. The comparison between Figs. 5(h) and 5(i) shows superior capability in capturing detailed 3D surfaces in complex urban environments. These results demonstrate the robustness and accuracy of our method in practical applications.
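These figures are straightforward to sanity-check: the total scan time follows from the pixel count and per-pixel dwell time, and the output resolution corresponds to just over a 10-fold upscaling per axis:

```python
# Acquisition-time and resolution arithmetic for the outdoor experiment.
pixels = 128 * 250            # scanned single-photon pixels
dwell_s = 330e-6              # per-pixel acquisition time
acq_time_s = pixels * dwell_s # ~10.56 s total scan time

up_rows = 1300 / 128          # row upscaling factor, ~10.16x
up_cols = 2611 / 250          # column upscaling factor, ~10.44x
```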


    Figure 5. The experimental results. (a) The target’s location on the map. (b) Photograph of our system. (c) High-resolution camera image of the target. (d), (e) The results using our proposed method without and with upsampling. (f), (g) Closeup views of the building details in the depth reconstructions [area highlighted by the green rectangle in (c)]. (h), (i) 3D profiles of the eaves details in the depth reconstructions [highlighted by the blue rectangle in (c)].

    5. Conclusion

    We proposed and validated a fusion method for long-range 3D imaging that overcomes the challenges of range ambiguity and low resolution. The outdoor experiments extended the unambiguous range by a factor of 4.7 and produced images of over 3 megapixels (1300×2611), a 10-fold increase in resolution. By providing accurate depth perception and fine spatial awareness, our results may offer an enhanced route to rapid, high-resolution, long-range 3D imaging of large-scale scenes, which is essential for target identification and environmental mapping in complex areas.

    [2] R. M. Marino, W. R. Davis. Jigsaw: a foliage-penetrating 3D imaging laser radar system. Linc. Lab. J., 15, 23(2005).

    [5] A. B. Gschwendtner, W. E. Keicher. Development of coherent laser radar at Lincoln Laboratory. Linc. Lab. J., 12, 383(2000).

    [14] D. Ferstl, C. Reinbacher, R. Ranftl et al. Image guided depth upsampling using anisotropic total generalized variation. Proceedings of the IEEE International Conference on Computer Vision, 993(2013).

    [17] W. H. Long, D. H. Mooney, W. A. Skillman. Pulse doppler radar. Radar Handbook, 2(1990).

    [25] D. Scharstein, C. Pal. Learning conditional random fields for stereo. IEEE Conference on Computer Vision and Pattern Recognition, 1(2007).

    [26] D. L. Snyder, M. I. Miller. Random Point Processes in Time and Space (2012).

    Paper Information

    Special Issue: SPECIAL ISSUE ON QUANTUM IMAGING

    Received: Dec. 28, 2023

    Accepted: Mar. 25, 2024

    Published Online: Jun. 27, 2024

    The Author Email: Zheng-Ping Li (lizhp@ustc.edu.cn), Feihu Xu (feihuxu@ustc.edu.cn)

    DOI:10.3788/COL202422.060005
