Optoelectronics Letters, Vol. 21, Issue 9, 547 (2025)
Point-voxel dual transformer for LiDAR 3D object detection
This paper presents the point-voxel dual transformer (PV-DT3D), a two-stage, transformer-based framework for light detection and ranging (LiDAR) three-dimensional (3D) object detection. PV-DT3D uses point-voxel fusion features for proposal refinement. Specifically, keypoints are sampled from the entire point cloud scene and used to encode representative scene features via a proposal-aware voxel set abstraction module. After the region proposal network (RPN) generates proposals, the encoded keypoints inside the proposals are fed into the dual transformer encoder-decoder architecture. PV-DT3D combines a point-wise transformer with a channel-wise architecture to capture contextual information along both the spatial and channel dimensions. Experiments on the highly competitive KITTI 3D car detection leaderboard show that PV-DT3D achieves superior detection accuracy among state-of-the-art point-voxel-based methods.
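To illustrate the difference between attending along the spatial dimension and along the channel dimension, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the encoded keypoints of one proposal form an (N, C) feature matrix and applies plain scaled dot-product self-attention without learned query/key/value projections, multiple heads, or a decoder. The function names `point_wise_attention` and `channel_wise_attention` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def point_wise_attention(feats):
    """Self-attention across the N keypoints (spatial dimension).

    feats: (N, C) array. Returns the (N, C) output and (N, N) weights.
    Assumption: no learned projections; queries, keys, values are feats.
    """
    scale = np.sqrt(feats.shape[1])
    weights = softmax(feats @ feats.T / scale, axis=-1)  # (N, N)
    return weights @ feats, weights


def channel_wise_attention(feats):
    """Self-attention across the C feature channels.

    feats: (N, C) array. Returns the (N, C) output and (C, C) weights.
    Assumption: each channel is treated as a token of length N.
    """
    scale = np.sqrt(feats.shape[0])
    weights = softmax(feats.T @ feats / scale, axis=-1)  # (C, C)
    # Output channel i is a weighted mix of all input channels j.
    return feats @ weights.T, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    keypoint_feats = rng.normal(size=(16, 8))  # 16 keypoints, 8 channels
    out_p, w_p = point_wise_attention(keypoint_feats)
    out_c, w_c = channel_wise_attention(keypoint_feats)
    print(out_p.shape, w_p.shape)  # spatial attention map is N x N
    print(out_c.shape, w_c.shape)  # channel attention map is C x C
```

Both functions preserve the (N, C) shape of the input, so under these assumptions their outputs could be summed or concatenated; how PV-DT3D actually fuses the two branches is not specified in the abstract.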
TONG Jigang, YANG Fanhang, YANG Sen, DU Shengzhi. Point-voxel dual transformer for LiDAR 3D object detection[J]. Optoelectronics Letters, 2025, 21(9): 547
Category: Image and Information processing
Received: Jul. 17, 2023
Accepted: Sep. 15, 2025
Published Online: Sep. 15, 2025