Event cameras, with their significantly higher dynamic range and sensitivity to intensity variations compared to frame cameras, provide new possibilities for 3D reconstruction in high-dynamic-range (HDR) scenes. However, the binary event data stream produced by event cameras presents significant challenges for achieving high-precision and efficient 3D reconstruction. To address these issues, we observe that the binary projection inherent to Gray-code-based 3D reconstruction naturally aligns with the event camera's imaging mechanism. Yet achieving high-accuracy 3D reconstruction with Gray codes remains hindered by two key factors: inaccurate boundary extraction and the degradation of high-order dense Gray code patterns due to spatial blurring. For the first challenge, we propose an inverted Gray code strategy that improves region segmentation and recognition, yielding more precise and easily identifiable Gray code boundaries. For the second challenge, we introduce a spatial-shifting Gray-code encoding method: by spatially shifting Gray code patterns of lower encoding density, a combined encoding is achieved that enhances depth resolution and measurement accuracy. Experimental validation across general and HDR scenes demonstrates the effectiveness of the proposed methods.
1. Introduction
Optical 3D measurement techniques are widely applied in diverse fields, including intelligent inspection, biomedical applications, and virtual reality, owing to their non-contact nature and high precision. However, achieving accurate measurements in high-dynamic-range (HDR) scenes remains challenging, for example, due to camera overexposure caused by highly reflective metal surfaces or low signal-to-noise ratio (SNR) issues from black objects.
Numerous studies have attempted to overcome these challenges. The multiple exposure method is currently the most commonly used HDR 3D measurement technique[1,2]. By varying the camera’s exposure time, a series of fringe images with different intensities is captured from the measured scene. These images are then fused based on specific rules to obtain a higher SNR stripe pattern, which is subsequently used for HDR 3D reconstruction. In addition to globally adjusting the camera’s exposure time, Feng et al. proposed an adaptive HDR measurement method based on the digital micromirror device (DMD)[3]. By loading adaptive masks onto the DMD and adjusting the light intensity pixel by pixel, this method significantly extends the imaging dynamic range of conventional cameras. Similarly, the projection light intensity can also be adjusted[4]. Researchers have proposed a pixel-level optimized adaptive projection method, where a series of patterns with varying light intensities is projected to establish a projection-imaging model of the scene, enabling HDR 3D measurement. Furthermore, by adding imaging or projection units to form multiple viewpoints in basic structured light measurement systems, the differences or complementarity of reflected light from different angles can also improve HDR 3D measurement performance to some extent[5,6]. Recently, deep learning, as a powerful data analysis method, has also been introduced in HDR scenes to repair degraded structured light patterns or improve the SNR[7,8]. However, these methods generally rely on high-quality datasets, and their measurement effectiveness may be limited when the SNR of images captured by the camera is too low or when severe saturation occurs.
As mentioned above, most HDR 3D measurement techniques are based on conventional frame-based CCD/CMOS cameras. However, the inherently low dynamic range of frame cameras requires complex algorithms or additional hardware to operate effectively in HDR scenes. To overcome this limitation, event cameras have emerged as an alternative vision sensor. As illustrated in Fig. 1, event cameras capture changes in log-intensity over time as a continuous stream of events, which dramatically reduces bandwidth and latency requirements and provides a much higher dynamic range. With these advantages, event cameras have gradually been applied to 3D imaging.
Figure 1.Principle of operation of an event camera. (a) Comparison of the output of a frame-based camera and an event camera when viewing a spinning disk with a black circle. (b) Events triggered by changes in log-intensity.
However, due to the dynamic change detection nature of event cameras and their binarized event data output, traditional 3D measurement techniques cannot be applied directly. Existing studies have attempted to address this challenge by adapting line-structured light or pattern-based structured light technologies. For example, Matsuda et al. leveraged the intensity variations during the line scanning process to generate event information associated with the line region, thereby enabling 3D reconstruction[9]. However, the binary and sparse event camera data make it difficult to accurately localize the line center, and the approach remains susceptible to noise events. Chen et al. employed a liquid crystal display (LCD) screen to display flashing calibration patterns, thereby generating event data corresponding to the calibration target[10]. Fourier analysis was then applied to extract phase-based feature points, enabling accurate calibration. Nevertheless, the binarized event data, noise events, and timestamp jitter continue to pose significant challenges for achieving high-precision 3D reconstruction[11–13].
In addressing these challenges, we found that the binary projection strategy of Gray-code-based 3D reconstruction naturally aligns with the imaging mechanism of event cameras, offering high resistance to the above interference. However, two critical issues remain to be addressed for achieving high-accuracy 3D reconstruction using the Gray code: the first is accurately identifying Gray code boundaries from the event data captured by the event camera; the second is overcoming the blurring of high-order dense Gray code patterns in event-based scenes.
This work focuses on these two issues and proposes, respectively, the inverted Gray code strategy and the spatial-shifting Gray-code encoding method to address them.
2. Principle
2.1. Inverted Gray code strategy for event-based 3D reconstruction
To generate events corresponding to the Gray code regions, an intuitive approach is to project flashing Gray code patterns, which is equivalent to projecting a completely black pattern followed by a Gray code pattern[14,15]. When the projector projects a flashing Gray code pattern onto the measured object, events are recorded and output by the event camera whenever the logarithmic intensity change at a pixel exceeds the threshold. The triggering condition for an event can be written as
$$\Delta L(x, y, t_k) = \log I(x, y, t_k) - \log I(x, y, t_{k-1}), \quad \left|\Delta L(x, y, t_k)\right| \geq C, \tag{1}$$
where $I$ represents the intensity, $(x, y)$ denote the pixel coordinates, $k$ is the timestamp index, and $t_{k-1}$ is the timestamp of the most recent event at that pixel. A positive polarity event is triggered when the logarithmic intensity change exceeds the threshold $+C$, while a negative polarity event is generated when it falls below the threshold $-C$.
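As a concrete illustration of this triggering rule, the following Python sketch applies it per pixel to a pair of intensity images; the function name, array layout, and the threshold value C = 0.2 are illustrative assumptions rather than parameters of an actual camera.

```python
import numpy as np

def simulate_events(I_prev, I_curr, C=0.2, eps=1e-6):
    """Illustrative per-pixel event generation model.

    I_prev, I_curr : intensity images at the previous and current
                     timestamps (2D arrays).
    C              : contrast threshold (assumed value; camera-dependent).
    Returns +1 where a positive event fires, -1 where a negative event
    fires, and 0 where the log-intensity change stays within threshold.
    """
    d = np.log(I_curr + eps) - np.log(I_prev + eps)
    events = np.zeros_like(d, dtype=np.int8)
    events[d >= C] = 1     # positive polarity: brightness increase
    events[d <= -C] = -1   # negative polarity: brightness decrease
    return events
```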
In the ideal optical measurement system, regions corresponding to Gray code words of 1 will exhibit a change in intensity, which will be captured and recorded by the event camera. Conversely, regions corresponding to Gray code words of 0 will not undergo any intensity variation, and thus no events will be recorded by the camera. However, due to the point spread function (PSF) of the practical optical system, the intensity distribution captured by the event camera is the convolution of the system's PSF with the original Gray code pattern:
$$I_n(x, y) = h(x, y; \theta) * G_n(x, y), \tag{2}$$
where $h(x, y; \theta)$ represents the optical PSF, which is jointly modulated by the constructed measurement system and the object, $\theta$ denotes the system parameters, and $G_n(x, y)$ represents the $n$th Gray code pattern.
This convolution effect blurs the boundaries of the Gray code, so that regions where the code word should be 0 exhibit significant intensity variations, producing errors in recognizing the Gray code pattern and degrading reconstruction accuracy, as shown in Fig. 2(a). Increasing the threshold can reduce this error to some extent, but this global operation may filter out low-reflectance regions where the Gray code word is 1, as those regions fail to generate events due to minimal intensity variations.
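To make the boundary-blurring effect tangible, the sketch below convolves a one-dimensional binary Gray code stripe with a Gaussian stand-in for the system PSF and counts the nominal-0 pixels whose log-intensity change would still exceed the event threshold. The Gaussian width, ambient floor, and threshold are assumed values for illustration only.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

# one period of a binary Gray code stripe: 0-regions and a 1-region
pattern = np.zeros(200)
pattern[50:150] = 1.0

# Gaussian stand-in for the system PSF (sigma is an assumed value)
blurred = gaussian_filter1d(pattern, sigma=5.0)

# log-intensity change relative to the dark frame; ambient floor assumed
ambient = 0.05
delta = np.log(blurred + ambient) - np.log(ambient)

C = 1.0  # assumed event threshold
spurious = (pattern == 0) & (delta >= C)
print(f"{spurious.sum()} nominal-0 pixels near the boundary would fire events")
```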
Figure 2.Diagram of the main bottlenecks in applying Gray code to 3D reconstruction under an event camera. (a) Errors in extracting the boundaries of the Gray code pattern. (b) Diagram of the proposed inverted Gray code strategy.
Furthermore, the background light intensity caused by scattering or multiple reflections in the scene depends on the total light output of the projector. Therefore, the flashing patterns will cause significant variations in background light, leading to recognition failure.
To address these issues, the inverted Gray code strategy is proposed. First, the original Gray code pattern is inverted to generate the inverted Gray code pattern, and then the projector continuously projects both the original pattern and the inverted Gray code pattern. As a result, regions with opposite code words in the encoded Gray code pattern undergo opposing intensity variations during the projection process, resulting in events with opposite polarities, as illustrated in Fig. 2(b). This improves the accuracy of the Gray code pattern boundaries.
Moreover, since each encoded region in the inverted Gray code pattern experiences considerable intensity changes due to the inversion of code words, it exhibits enhanced resistance to background light, thus increasing robustness. Additionally, since the number of 0 and 1 code words in the Gray code pattern is approximately balanced, the inversion operation does not induce significant fluctuations in the total light output of the projector, leading to greater stability in background light during projection.
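A minimal decoding sketch under this strategy: at the transition from the original to the inverted pattern, code-1 regions darken and code-0 regions brighten, so each pixel can be classified by the polarity of its events at that instant. The per-pixel polarity map is an assumed preprocessing product (e.g., accumulated from the raw event stream), not part of the camera output itself.

```python
import numpy as np

def decode_inverted_gray(transition_polarity):
    """Classify Gray code words from the inverted-pattern transition.

    transition_polarity : per-pixel net event polarity recorded at the
        original-to-inverted pattern switch (>0 for net positive
        events, <0 for net negative events, 0 for no events).
    Code-1 regions go bright -> dark (negative events); code-0 regions
    go dark -> bright (positive events). Pixels without events are
    marked invalid (-1), e.g., shadowed or low-reflectance areas.
    """
    code = np.full(transition_polarity.shape, -1, dtype=np.int8)
    code[transition_polarity < 0] = 1
    code[transition_polarity > 0] = 0
    return code
```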
2.2. Spatial-shifting Gray-code encoding method
To achieve higher resolution and accuracy in measurements, the use of high-order Gray codes is inevitable. High-order Gray codes contain more encoding information, enabling them to represent more depth information, as illustrated in Fig. 3(a). However, the PSF of the optical measurement system functions as a low-pass filter, which attenuates the high-frequency encoding information. As the density of the Gray code pattern increases, the blurring effect introduced by the optical system’s PSF becomes more pronounced, leading to recognition failure, as illustrated in Fig. 3(b). Furthermore, since the background light intensity of inverted Gray code patterns remains stable, intensity variations in high-frequency regions may not exceed the threshold required for event generation, leading to recognition failure.
Figure 3.Schematic diagram of the proposed spatial-shifting Gray-code encoding method. (a) The original Gray code patterns. (b) The failure of the high-order dense Gray codes due to blurring. (c) The spatial-shifting Gray code patterns.
To encode all projector columns, the total number of Gray code patterns can be calculated by
$$N = \lceil \log_2 M \rceil. \tag{3}$$
In Eq. (3), $M$ denotes the number of projector columns; for example, the 1920 columns used in our system require $N = 11$ patterns. Assuming that the boundaries of the last $k$ Gray code patterns fail to be extracted due to blurring, only the first $N - k$ patterns are effectively used for measurement. Consequently, the resulting depth resolution is reduced to $1/2^k$ times the original resolution. Furthermore, within each encoding region, the $2^k$ projector columns share the same code word, making them indistinguishable from each other.
To overcome this issue, this work proposes the spatial-shifting Gray-code encoding method. First, the original valid Gray code patterns are shifted, causing a misalignment between the encoding regions of the shifted patterns and the original Gray-code encoding regions. This misalignment ensures that each indistinguishable region in the original encoding undergoes a unique variation in this process. The shifted code words are then combined with the original code words, and the combined code words can uniquely encode all projector columns. Figure 3 illustrates the principle of the proposed method.
In the general case, the number of additional shifted Gray code patterns depends on the width of the indistinguishable region in the valid Gray code pattern: an indistinguishable region spanning $2^k$ projector columns requires $2^k - 1$ additional shifts, as illustrated in the sketch below.
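The following sketch illustrates the combined encoding numerically for our system parameters (1920 columns, 11 patterns, last k = 2 patterns blurred out). Cyclically shifting the valid pattern set by 1 to 2^k − 1 columns is our assumed realization of the shifting scheme; the assertions confirm that the original valid code words only resolve 2^k-wide regions, while the combined code words distinguish every column.

```python
import numpy as np

def gray_patterns(n_bits, n_cols):
    """Row n is the n-th Gray code column pattern (coarsest first),
    giving one binary code word per projector column."""
    cols = np.arange(n_cols)
    gray = cols ^ (cols >> 1)
    return np.array([(gray >> (n_bits - 1 - n)) & 1
                     for n in range(n_bits)], dtype=np.uint8)

n_bits, n_cols, k = 11, 1920, 2           # last k = 2 patterns blurred out
valid = gray_patterns(n_bits, n_cols)[:n_bits - k]   # first 9 patterns

# columns inside each 2^k-wide region share the same valid code word
region_code = [tuple(valid[:, c]) for c in range(n_cols)]
assert len(set(region_code)) == n_cols // 2**k       # only 480 regions

# spatial shifting: append the code words of cyclically shifted copies
# (shift amounts 1..2^k - 1 are our assumed realization of the scheme)
combined = [tuple(valid[:, (c - s) % n_cols].tobytes()
                  for s in range(2**k)) for c in range(n_cols)]
assert len(set(combined)) == n_cols       # every column now unique
```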
3. Experiment
We designed and conducted a series of experiments to validate the effectiveness of the proposed methods, including measurements on sculptures and HDR scenes. The experimental system comprises a projector (TI DLP6500, 1920 × 1080 resolution) and an event camera (PROPHESEE EVK3). To fully encode the 1920 columns of the projector, 11 traditional Gray code patterns are required.
Before measurement, system calibration was performed to establish a pixel-wise mapping between Gray code words and object height. The system calibration depth is 100 mm, with six calibration planes placed at 0, 20, 40, 60, 80, and 100 mm. Based on the triangulation geometry of the system, the decoded Gray code word at each pixel is expected to be proportional to the corresponding height. Therefore, a mapping relationship can be obtained by performing linear fitting between the decoded Gray code words and the known heights of the calibration planes, which is then used to recover the height information of the measured object. However, due to unavoidable optical distortions such as lens aberrations, the actual relationship exhibits nonlinearity. To improve fitting accuracy, higher-order polynomial regression is employed in practice.
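The sketch below shows this per-pixel mapping for a single pixel, assuming its decoded code words at the six known plane heights have already been collected; the code word values and the third-order polynomial degree are illustrative assumptions.

```python
import numpy as np

heights = np.array([0., 20., 40., 60., 80., 100.])  # calibration planes (mm)

# decoded Gray code word at one pixel for each plane
# (assumed to have been measured; values are placeholders)
code_words = np.array([512., 620., 731., 838., 949., 1057.])

# polynomial mapping from code word to height; the third-order degree
# is an assumed choice to absorb lens-distortion nonlinearity
coeffs = np.polyfit(code_words, heights, deg=3)
height_of = np.poly1d(coeffs)

print(height_of(731.0))   # recovers ~40 mm for the mid-plane code word
```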
First, to validate the proposed inverted Gray code strategy, we projected the first-order flashing Gray code patterns and the inverted Gray code patterns onto the sculpture and extracted the event information from each, as shown in Fig. 4. The flashing Gray code method involves projecting a full black pattern and a Gray code pattern, which causes significant variations in the total light intensity output by the projector, leading to a drastic increase in background light intensity across the scene. As a result, a large portion of the areas on the sculpture originally with a code word of 0 also generate positive events due to the increased background light. This severely impacts the accurate recognition of the Gray code boundaries. In contrast, the proposed method exhibits opposite and strong intensity variations in the code word regions for 0 and 1, making it more robust and yielding clearer and more accurate boundaries.
Figure 4.Comparison of boundary identification results between the proposed inverted Gray code method and the conventional flashing Gray code method.
When higher-order inverted Gray code patterns were further projected, the code word information with high spatial frequencies became increasingly blurred due to the PSF of the optical system. Additionally, since the background light of the inverted Gray code patterns remained relatively stable, the event camera was less likely to record events in regions corresponding to higher spatial frequencies. As shown in Fig. 5, when projecting the eighth and ninth sets of inverted Gray code patterns, almost all areas of the sculpture generated events corresponding to the code words, with clearly defined Gray code boundaries. However, for the 10th and 11th sets of patterns, regions such as the sculpture’s face and hair exhibited an absence of recorded events due to excessively high spatial frequencies.
Figure 5.Event information extracted under the projection of inverted Gray code patterns with different spatial frequencies.
To mitigate the impact of blurred higher-order Gray code patterns on measurement accuracy, we use only the first nine sets of patterns for surface shape measurements. The reconstruction results are shown in Fig. 6(a). A zoomed-in view of the surface shape reveals a distinct stepped appearance, which is attributed to the loss of three-quarters of the depth resolution when using only these nine sets of patterns.
Figure 6.Measurements on a sculpture. (a) The result of the proposed inverted Gray code method. (b) The result of the proposed spatial-shifting inverted Gray code method.
When only the first nine sets of patterns are used, each indistinguishable region encompasses four projector columns. Accordingly, it is necessary to apply an additional three spatial shifts to the original inverted Gray code patterns and combine the shifted patterns with the original patterns to uniquely encode all the projector columns. Figure 6(b) shows the reconstruction results of the spatial-shifting inverted Gray code method. It can be observed that this method offers a higher depth resolution, with the recovered surface shape exhibiting smoother and more continuous variations.
To validate the effectiveness of the proposed method in HDR scenarios, we further conducted measurements in a scene containing a highly reflective metal workpiece and a low-reflectance panel. In addition, measurements were performed using both a conventional frame-based camera and an event camera to compare the reconstruction results obtained from each camera.
In the measurements conducted with the frame-based camera, two different measurement methods were configured. In the first method, a conventional three-step phase-shifting algorithm was applied under a single exposure to simulate general measurement conditions. The corresponding result is shown in Fig. 7(b). In the second method, 10 exposure levels with varying intensities were employed. For each exposure level, a six-step phase-shifting algorithm was used to compute the phase, and high-quality phase information was obtained by fusing the results based on fringe modulation for subsequent 3D reconstruction. The result is illustrated in Fig. 7(d). For measurements using the event camera, the projector intensity was kept consistent with that used in the frame-based experiments, and the proposed method was adopted. The results are presented in Figs. 7(e) and 7(f).
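For reference, a compact sketch of the frame-based baseline as we understand it: each exposure stack is processed with six-step phase shifting, and the per-pixel phase from the exposure with the highest fringe modulation is retained. The exact fusion rule and array conventions are assumptions for illustration.

```python
import numpy as np

def phase_and_modulation(frames):
    """Six-step phase shifting: frames has shape (6, H, W) with
    phase shifts of 2*pi*n/6 for n = 0..5."""
    n = np.arange(6).reshape(6, 1, 1)
    s = np.sum(frames * np.sin(2 * np.pi * n / 6), axis=0)
    c = np.sum(frames * np.cos(2 * np.pi * n / 6), axis=0)
    phase = -np.arctan2(s, c)                 # wrapped phase
    modulation = (2.0 / 6.0) * np.sqrt(s**2 + c**2)
    return phase, modulation

def fuse_exposures(stacks):
    """stacks: list of (6, H, W) frame stacks, one per exposure level.
    Keep, per pixel, the phase from the exposure with the highest
    fringe modulation (our assumed fusion rule)."""
    phases, mods = zip(*(phase_and_modulation(s) for s in stacks))
    phases, mods = np.stack(phases), np.stack(mods)
    best = np.argmax(mods, axis=0)
    return np.take_along_axis(phases, best[None], axis=0)[0]
```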
Figure 7.Measurement on an HDR scene. (a) Diagram of the HDR scene, which includes a highly reflective metal workpiece and a low-reflectance panel. (b) The result of the three-step phase-shifting method under one exposure. (c) Diagram of multiple exposures of the HDR scene. (d) Multi-exposure fusion reconstruction result. (e), (f) The respective reconstruction results of the two proposed methods using an event camera.
As illustrated, in the single-exposure condition, the reconstruction results obtained by the frame-based camera were significantly degraded in HDR scenes due to overexposure and low-SNR regions. Although multi-exposure fusion combined with a higher-step phase-shifting algorithm partially mitigated these effects, the frame-based camera still struggled to provide high-quality reconstruction in the presence of extreme overexposure and severely low SNR. In contrast, the proposed method using the event camera enables robust and accurate measurements even in regions where the frame-based camera fails due to saturation and noise. Moreover, since there is no need to repeatedly adjust the camera or projector parameters, the measurement efficiency is significantly improved.
4. Conclusion
In conclusion, we propose a robust HDR 3D reconstruction method based on Gray codes using an event camera. First, we identify two key bottlenecks that limit the accuracy and resolution of Gray-code-based 3D reconstruction with event cameras: the precise extraction of Gray code boundaries and the failure of high-order dense Gray codes. To address these challenges, we introduce the inverted Gray code strategy and the spatial-shifting Gray-code encoding method. By combining the two methods, our approach enables accurate and robust measurements of both general and HDR scenes.
[9] N. Matsuda, O. Cossairt, and M. Gupta, "MC3D: motion contrast 3D scanning," in 2015 IEEE International Conference on Computational Photography (ICCP) (2015), p. 1.