Hyperspectral-depth imaging is of great importance in many fields. However, it is difficult for most systems to achieve good spectral resolution and accurate depth localization at the same time. Here, we present a hyperspectral-depth single-pixel imaging system that exploits the two reflection beam paths of a spatial light modulator to provide one-dimensional depth and spectral information of the object; these are then combined with the modulation patterns and a compressed-sensing algorithm to construct the hyperspectral-depth image, even at a sampling ratio of 25%. In our experiments, a spectral resolution of 1.2 nm in the range of 420 to 780 nm is achieved, with a depth measurement error of less than 1 cm. Our work thus provides a new approach to hyperspectral-depth imaging.
1. Introduction
Hyperspectral imaging and depth imaging, as powerful techniques for acquiring the photometric and geometric features of objects, play a crucial role in remote sensing[1], mobile photography[2,3], classification[4,5], high-resolution medical imaging[6], and the food industry[7]. For example, Sun et al. employed dual-wavelength spectral analysis to assess oxygen metabolism[8], and Li et al. used synthetic aperture radar backscattering to map 3D building structures[9]. Although the two technologies developed independently, some methods can obtain both types of images simultaneously. Kim et al. combined a depth scanning system with a modified spectral imaging camera to obtain a hyperspectral-depth image with a spectral resolution of 12 nm[10]. Behmann et al. used a depth imaging sensor and two hyperspectral cameras to reconstruct hyperspectral 3D plant images[11]. The Microsoft Kinect, which combines a red, green, and blue (RGB) camera with a structured-light depth camera[12], is a very cheap option but has only three spectral channels. In most existing systems, the depth and hyperspectral images are acquired independently by two separate cameras; because the detectors are at different positions, the spatial information of the two images must be matched, which reduces data-processing efficiency and entails a trade-off among depth accuracy, spectral resolution, and spatial resolution. Therefore, it is necessary to develop a hyperspectral-depth imaging system with both high detection efficiency and high data-processing efficiency.
Single-pixel imaging (SPI) is a computational-imaging method that recovers the object image from a sequence of modulation patterns and the corresponding measurements of a single-pixel detector. Specifically, a series of two-dimensional patterns loaded onto a spatial light modulator modulates the light reflected off the object, which is then collected by the detector to give a series of light-intensity values, one per modulation pattern[13–16]. The image is reconstructed by correlating the intensity values with the modulation patterns. Since the spatial resolution of the image depends only on the modulation patterns, the detector needs to acquire only depth and spectral information. With this advantage, SPI has already been applied to depth[17–20] and spectral imaging[21–24]. However, most of these studies were limited by the fact that the modulation patterns could be projected only one at a time, and each focused on improving either depth or hyperspectral imaging alone. The efficiency of luminous-flux utilization thus remains to be improved.
In this study, we design a hyperspectral-depth SPI (H-DSPI) system in which a digital micromirror device (DMD) serves as the spatial light modulator and both of its reflected beams are used: one beam provides the spectral data and the other the depth information, so the two are acquired simultaneously. Compared with existing spectral-depth imaging systems, the spatial information of our hyperspectral-depth image is determined only by the intensity values at the single-pixel detectors and the modulation patterns, and there is only one object imaging lens, so no spatial pixel matching between the spectral and depth images is needed. Moreover, because the hyperspectral and depth information are acquired simultaneously, the efficiency of light-energy utilization and information acquisition is improved. The reconstructed image has a spectral resolution of 1.2 nm in the range of 420 to 780 nm, and the depth measurement error is less than 1 cm. Compared with previous systems, our H-DSPI setup exploits the characteristics of the DMD and makes full use of the optical flux. In addition, compressed sensing is employed to lower the sampling ratio to 25% and to reduce the amount of data processing required.
2. Principle
As shown in Fig. 1, a pulsed laser source emits pulses of photons that are reflected off an object and then spatially modulated by the DMD micromirrors, each of which can be tilted to one of two angles about its hinge axis, thus forming complementary modulation patterns in the two reflection directions. In one direction, a single-photon avalanche diode (SPAD) and a time-correlated single-photon counter (TCSPC) are used to measure the time of flight (TOF) of the photons from the laser to the detector, while a spectrometer is placed in the other direction.
The measurement of the H-DSPI system can be expressed as

$$Y_d = A X_d, \tag{1}$$
$$Y_s = A X_s, \tag{2}$$

where $Y_d \in \mathbb{R}^{M \times D}$ represents the intensity distribution over distance for each modulation pattern, and $Y_s \in \mathbb{R}^{M \times S}$ represents the spectral intensity distribution ($M$ is the number of modulation patterns, $D$ is the number of channels of different distances, and $S$ is the number of spectral channels); $A \in \mathbb{R}^{M \times N}$ represents the set of measurement patterns, where $N$ is the total number of pixels in the image; $X_d \in \mathbb{R}^{N \times D}$ represents the images at the different distances; and $X_s \in \mathbb{R}^{N \times S}$ represents the images in the various spectral bands.
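To make the forward model concrete, the following sketch expresses it as matrix products in numpy; the sizes $M$, $N$, $D$, $S$ and the random patterns and images are purely illustrative, not the experimental values.

```python
import numpy as np

# Illustrative forward model of the H-DSPI measurement:
# each row of A is one flattened modulation pattern; the columns of
# X_d are the scene images at the D distance channels, and the
# columns of X_s are the images in the S spectral bands.
rng = np.random.default_rng(0)
M, N, D, S = 64, 64, 2, 300          # patterns, pixels, distance/spectral channels
A = rng.integers(0, 2, size=(M, N))  # binary measurement patterns
X_d = rng.random((N, D))             # depth-channel images (as columns)
X_s = rng.random((N, S))             # spectral-band images (as columns)

Y_d = A @ X_d                        # per-pattern intensity at each distance, (M, D)
Y_s = A @ X_s                        # per-pattern spectrum, (M, S)
```

Each row of $Y_d$ and $Y_s$ corresponds to one projected pattern, matching the row-by-row acquisition described below.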
The modulation patterns usually consist of the rows of a Hadamard matrix, which are mutually orthogonal, so there is no redundant modulation. Moreover, unlike gray-scale matrices, they are composed of $+1$ and $-1$ elements, which avoids a binarization step when constructing the modulation patterns: element $+1$ corresponds to a micromirror flipped toward one reflection direction, and element $-1$ to a flip toward the other.
Different from usual SPI systems, which place a detector in only one reflection direction of the DMD, our scheme requires the modulation patterns to be the same in both directions. Thus, we generate one pattern immediately followed by its inverse (complement). Specifically, the complementary modulation patterns can be expressed as

$$P_{2i-1} = H_i, \tag{3}$$
$$P_{2i} = -H_i, \tag{4}$$

where $P_j$ represents the $j$th modulation pattern, $H_i$ represents the $i$th row of the Hadamard matrix, and $i = 1, 2, \ldots, M/2$, with $M$ the total number of modulation patterns.
Although this method doubles the number of measurements, it ensures that the modulation patterns are the same in both directions, and the complementary modulation patterns allow the background direct current term of the measured value to be eliminated, giving a better image. Moreover, the sampling ratio can be reduced to a much lower level than that in traditional SPI for the same image quality[25].
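The generation of complementary pattern pairs can be sketched as follows, assuming a Sylvester-constructed Hadamard matrix (the pattern size of 16 elements is illustrative). Each ±1 row is mapped to the pair of binary mirror maps seen by the two reflection arms.

```python
import numpy as np

def sylvester_hadamard(n):
    """Hadamard matrix of order n (n a power of two), Sylvester construction."""
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

# Each Hadamard row H_i yields a complementary pair of binary DMD maps:
# p_pos = (1 + H_i)/2 marks the mirrors tilted toward one arm,
# p_neg = (1 - H_i)/2 marks those tilted toward the other.
n = 16
H = sylvester_hadamard(n)
patterns = []
for i in range(n):
    p_pos = (1 + H[i]) // 2   # pattern seen by one arm
    p_neg = (1 - H[i]) // 2   # its complement, projected immediately after
    patterns.append((p_pos, p_neg))
```

Because each pair sums to all-ones, the two output directions together receive all the light, and subtracting the paired measurements removes the background direct-current term.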
Each row vector of $Y_s$ is obtained from the beam reflected by the DMD into the spectrometer, which disperses the beam onto a line-array charge-coupled device (CCD) to give the one-dimensional intensity distribution over the spectral bands.
Similarly, depth information is required for each modulation pattern; however, unlike an analog CCD detector, a SPAD cannot directly output intensity information and can only register single photons, so a TCSPC is employed to record the time interval between consecutive laser pulses and photon events, as shown in Fig. 2(a). A large number of laser pulses are accumulated for each projected modulation pattern to obtain a histogram of the photon arrival times[26], as shown in Fig. 2(b).
Figure 2.TCSPC is employed to record the time interval between consecutive laser pulses and photon events. (a) Time distribution of the photon events. (b) Histogram of the photon arrival times.
Then, each row vector of $Y_d$ is obtained by converting the time distribution of the photon counts into a distance distribution using the speed of light, for each modulation pattern. Thus, the measurement matrices $Y_d$ and $Y_s$ can be written as

$$Y_d = \begin{bmatrix} y_1(d_1) & \cdots & y_1(d_D) \\ \vdots & \ddots & \vdots \\ y_M(d_1) & \cdots & y_M(d_D) \end{bmatrix}, \quad Y_s = \begin{bmatrix} s_1(\lambda_1) & \cdots & s_1(\lambda_S) \\ \vdots & \ddots & \vdots \\ s_M(\lambda_1) & \cdots & s_M(\lambda_S) \end{bmatrix}, \tag{5}$$

where $d_k$ represents the different distances and $\lambda_k$ represents the different spectral bands.
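The time-to-distance conversion can be sketched as below; the arrival times, bin width, and the round-trip relation d = ct/2 are illustrative assumptions, not the system's calibrated parameters.

```python
import numpy as np

C = 3.0e8   # speed of light, m/s

# Illustrative photon arrival times (s) for one modulation pattern:
# two clusters, as returned from two objects at different depths.
rng = np.random.default_rng(1)
arrivals = np.concatenate([
    rng.normal(4.0e-9, 0.05e-9, 500),   # returns from the closer object
    rng.normal(5.0e-9, 0.05e-9, 300),   # returns from the farther object
])
hist, edges = np.histogram(arrivals, bins=256, range=(0.0, 16.384e-9))

# Convert each time bin to distance; assuming the light makes a round
# trip to the object and back, d = c * t / 2.
centers = 0.5 * (edges[:-1] + edges[1:])
distances = C * centers / 2.0
peak = distances[np.argmax(hist)]       # distance of the strongest return
```

The photon counts in each distance bin then form one row of $Y_d$ for that pattern.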
Based on the modulation patterns and the measurement matrices $Y_d$ and $Y_s$, a compressed-sensing algorithm can be used to reconstruct the hyperspectral-depth image. The TVAL3 algorithm[27] is fast and achieves good reconstruction quality at a low sampling ratio. The image measurement and reconstruction processes are shown in Fig. 3.
Figure 3.Process of image measurement and reconstruction.
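TVAL3 itself is a MATLAB total-variation solver; as a minimal stand-in, the sketch below shows only the full-sampling case, where the complementary pattern pairs give a differential measurement equal to $Hx$, and the orthogonality of the Hadamard matrix ($HH^T = NI$) recovers the image directly. The image size is illustrative.

```python
import numpy as np

def sylvester_hadamard(n):
    """Hadamard matrix of order n (n a power of two), Sylvester construction."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

N = 64
H = sylvester_hadamard(N)
x = np.random.default_rng(2).random(N)   # unknown image (flattened), illustrative
y_pos = ((1 + H) / 2) @ x                # measurements for the patterns
y_neg = ((1 - H) / 2) @ x                # measurements for their complements
y = y_pos - y_neg                        # background-free differential data = H @ x

x_hat = H.T @ y / N                      # inverse Hadamard transform recovers x
```

At sub-unity sampling ratios this closed-form inverse no longer applies, which is where a TV-regularized solver such as TVAL3 is needed.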
3. Experimental Setup
The light source is a wide-spectrum (360–2600 nm) pulsed laser (Leukos Rock 400, repetition rate 40 MHz). After collimation and beam expansion, the laser beam illuminates two colored cubes separated by a distance of 15 cm. The light reflected from the objects is focused onto the DMD (ViALUX v-9501), where it is modulated and reflected along two paths. The beam in one reflection direction is coupled into the spectrometer (Ocean Optics FLAME-S-UV-VIS, 420–780 nm), while that in the other direction is coupled into the SPAD (MPD-PDM Series), which is connected to the TCSPC (Siminics FT1010). An attenuator is placed in front of the SPAD.
In our experiment, the modulation patterns were constructed from Eq. (4) and a Hadamard matrix, totaling two sets of 4096 patterns. To obtain better reconstructed images under a low sampling ratio[28], we arranged the modulation patterns in order from low frequency to high frequency. The refresh rate of the DMD was set to 20 Hz, and the total time for full sampling was about 410 s.
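The low-to-high-frequency ordering mentioned above can be sketched by sorting the Hadamard rows by sequency, i.e., the number of sign changes along each row (the order-16 matrix here is illustrative, not the 4096-pattern experimental set).

```python
import numpy as np

def sylvester_hadamard(n):
    """Hadamard matrix of order n (n a power of two), Sylvester construction."""
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

# Sequency ordering: count sign changes per row and sort, so the most
# informative low-frequency patterns are projected first.
H = sylvester_hadamard(16)
sequency = np.count_nonzero(np.diff(H, axis=1) != 0, axis=1)
H_ordered = H[np.argsort(sequency, kind="stable")]

# At a 25% sampling ratio, only the first quarter of the ordered
# patterns would be projected.
subset = H_ordered[: len(H_ordered) // 4]
```

With this ordering, truncating the pattern sequence preserves the coarse image structure, which is what makes low-sampling-ratio reconstruction viable.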
4. Experimental Results
Although our system acquires the depth and hyperspectral data at the same time, we still need to reconstruct the spectral image and the depth image separately, and then fuse the two into a hyperspectral-depth image.
4.1 Spectral image reconstruction
The measured spectral data for the first modulation pattern is shown in Fig. 5, covering a range of 420–780 nm, with a spectral resolution of 1.2 nm. In this experiment, a total of 8192 spectral measurements were obtained for all the modulation patterns.
Figure 5.Measured spectrum for the first modulation pattern.
Based on the spectral data and Eq. (2), 300 images were reconstructed with the TVAL3 algorithm at 1.2 nm intervals in the 420–780 nm spectral range, of which 24 are shown in Fig. 6.
Figure 6.Reconstructed pseudo-color images for different spectral bands.
To show the spectral information more directly, we reconstruct pseudo-color images from the gray image of each spectral band according to the CIE 1931 color-space model. The images restored for sampling ratios of 100%, 50%, and 25% are shown in Figs. 7(a)–7(c), respectively. As the sampling ratio decreases, the image quality degrades and the edges become gradually more blurred; however, even at a 25% sampling ratio, the cubes can still be distinguished.
Figure 7.Spectral images for different sampling ratios. The sampling ratios are (a) 100%, (b) 50%, and (c) 25%, respectively.
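The band-to-color conversion can be sketched as follows; the Gaussian curves below are rough stand-ins for the tabulated CIE 1931 color-matching functions, and the linear XYZ-to-sRGB matrix is used without gamma correction, so this is illustrative rather than colorimetrically exact.

```python
import numpy as np

def cmf_approx(lam):
    """Rough Gaussian stand-ins for the CIE 1931 color-matching functions
    (illustrative only, not the tabulated values)."""
    g = lambda x, mu, s: np.exp(-0.5 * ((x - mu) / s) ** 2)
    xbar = 1.056 * g(lam, 599.8, 37.9) + 0.362 * g(lam, 442.0, 16.0)
    ybar = 1.014 * g(lam, 556.3, 46.0)
    zbar = 1.839 * g(lam, 449.8, 28.0)
    return xbar, ybar, zbar

# Illustrative spectral image stack: (H, W, S) with S bands over 420-780 nm.
rng = np.random.default_rng(3)
Hpx, Wpx, S = 8, 8, 300
lam = np.linspace(420.0, 780.0, S)
stack = rng.random((Hpx, Wpx, S))

xbar, ybar, zbar = cmf_approx(lam)
X, Y, Z = stack @ xbar, stack @ ybar, stack @ zbar   # integrate over bands

# Naive linear XYZ -> sRGB conversion, normalized into [0, 1].
M_srgb = np.array([[ 3.2406, -1.5372, -0.4986],
                   [-0.9689,  1.8758,  0.0415],
                   [ 0.0557, -0.2040,  1.0570]])
rgb = np.stack([X, Y, Z], axis=-1) @ M_srgb.T
rgb = np.clip(rgb / rgb.max(), 0.0, 1.0)
```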
To evaluate the image quality quantitatively, the contrast-to-noise ratio (CNR)[29] is used as an evaluation index, defined as

$$\mathrm{CNR} = \frac{|\bar{g}_O - \bar{g}_B|}{\sqrt{\sigma_O^2 + \sigma_B^2}}, \tag{6}$$

where $\sigma_O^2 = \frac{1}{N_O}\sum_{(i,j)\in O}(g_{ij} - \bar{g}_O)^2$ is the variance of the object, $g_{ij}$ is the gray value, $\bar{g}_O$ is the average gray value, and $N_O$ is the number of object pixels. The variance of the background is $\sigma_B^2 = \frac{1}{N_B}\sum_{(i,j)\in B}(g_{ij} - \bar{g}_B)^2$, where $g_{ij}$ is the gray value, $\bar{g}_B$ is the average gray value, $N_B$ is the number of background pixels, and $O$ and $B$ represent the pixel positions of the object and background, respectively.
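The CNR computation can be sketched as below; note this uses one common CNR convention (object/background contrast over the pooled standard deviation), and the paper's exact normalization may differ. The synthetic image and masks are illustrative.

```python
import numpy as np

def cnr(img, obj_mask, bkg_mask):
    """Contrast-to-noise ratio: contrast between the object and background
    mean gray values over the pooled standard deviation."""
    obj, bkg = img[obj_mask], img[bkg_mask]
    mu_o, mu_b = obj.mean(), bkg.mean()
    var_o = ((obj - mu_o) ** 2).mean()   # object variance
    var_b = ((bkg - mu_b) ** 2).mean()   # background variance
    return abs(mu_o - mu_b) / np.sqrt(var_o + var_b)

# Illustrative use on a synthetic reconstruction:
rng = np.random.default_rng(5)
img = rng.normal(0.1, 0.02, (16, 16))    # noisy background
img[4:12, 4:12] += 0.8                   # bright "object" region
obj_mask = np.zeros((16, 16), bool)
obj_mask[4:12, 4:12] = True
value = cnr(img, obj_mask, ~obj_mask)
```

A higher CNR indicates that the object stands out more clearly against the background noise.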
4.2 Depth imaging reconstruction
The photon count distributions for different TOF values, i.e., at different distances, are calculated for each modulation pattern, and Fig. 8(a) shows the photon count versus TOF/distance for the first modulation pattern.
Figure 8.Intensity distribution for the first modulation pattern and reconstruction of the objects at different depths. (a) Photon count versus TOF/distance for the first modulation pattern; (b) reconstructed image at distance d1; and (c) reconstructed image at distance d2.
We can clearly see two peaks, due to the reflected photons from the two objects. The left peak represents the closer cube, and the right peak represents the farther cube. The distances calculated from the two peaks, $d_1$ and $d_2$, differ by 14.64 cm, an error of less than 1 cm from the actual separation of 15 cm. We take the intensities of the two peaks as the entries of the first row vector of $Y_d$. The complete matrix is obtained by performing the same operation for each of the modulation patterns. Then, from $Y_d$ and Eq. (1), the TVAL3 algorithm was used to reconstruct two images at the positions $d_1$ and $d_2$, as shown in Figs. 8(b) and 8(c). The two images are then binarized and combined to produce the final depth image. The x–y projection image is shown in Fig. 9(a), and the depth image is shown in Fig. 9(d). The projection images restored for sampling ratios of 100%, 50%, and 25% are shown in Figs. 9(a)–9(c), and the corresponding depth images in Figs. 9(d)–9(f), respectively.
Figure 9.Depth and x–y projection images for different sampling ratios. (a)–(c) x–y projection images restored for sampling ratios of 100%, 50%, and 25%; (d)–(f) depth images restored for sampling ratios of 100%, 50%, and 25%.
Although the quality deteriorates as the sampling ratio decreases, the image can still be acquired at a sampling ratio of 25%, and the structures and positions of the two cubes remain distinguishable.
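The binarize-and-combine fusion step can be sketched as follows; the distances, image size, threshold, and object regions are illustrative (the paper reports a 14.64 cm measured separation, but the absolute distances are not given here).

```python
import numpy as np

# Fuse two per-distance reconstructions into one depth map: binarize
# each image with a threshold, then assign its distance.
d1, d2 = 1.00, 1.1464                   # metres, illustrative absolute distances
rng = np.random.default_rng(4)
img_d1 = rng.random((16, 16)) * 0.2     # noisy reconstruction at distance d1
img_d2 = rng.random((16, 16)) * 0.2     # noisy reconstruction at distance d2
img_d1[2:8, 2:8] += 1.0                 # "near cube" region
img_d2[9:14, 9:14] += 1.0               # "far cube" region

depth = np.zeros((16, 16))              # 0 = no object detected
depth[img_d1 > 0.5] = d1
depth[img_d2 > 0.5] = d2
```

The resulting map assigns each pixel the distance of whichever reconstruction exceeds the threshold there, which is the structure visible in the depth images of Fig. 9.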
Figure 10 shows the hyperspectral-depth images obtained by combining Fig. 9 and the spectral images in Fig. 7, reconstructed with different sampling ratios. As expected, the imaging quality also deteriorates as the sampling ratio decreases, but we can still distinguish the structures and colors in Fig. 10(c) at the sampling ratio of 25%, which took about 100 s. The corresponding amount of data processing was also reduced by 75%.
Figure 10.Hyperspectral-depth images for sampling ratios of (a) 100%, (b) 50%, and (c) 25%.
5. Conclusion
In conclusion, a hyperspectral-depth imaging system has been developed based on single-pixel imaging. A spectral resolution of 1.2 nm in the range of 420–780 nm with a depth measurement error of less than 1 cm was obtained. The TVAL3 algorithm was used for the image reconstruction, and even at a sampling ratio of 25%, the structures and colors of two cube objects separated by a distance of 15 cm could still be distinguished. Our results demonstrate the potential of a single-pixel imaging system for hyperspectral-depth imaging, especially in scenarios where high efficiency of luminous-flux utilization is required, such as remote sensing, mobile photography, and classification.
[1] C. B. Pande, K. N. Moharir, Climate Change Impacts on Natural Resources, Ecosystems and Agricultural Systems (2023).
[5] R. Roveri, L. Rahmann, C. Oztireli, et al., "A network architecture for point cloud classification via automatic depth images generation," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4176 (2018).