3D Object Detection Based on Voxel Self-Attention Auxiliary Networks

Jie Cao; Yiqiang Peng; Likang Fan; Longfei Wang

doi:10.3788/LOP240923

Laser & Optoelectronics Progress, Volume 61, Issue 24, 2415004 (2024)

Jie Cao¹, Yiqiang Peng¹,²,³, Likang Fan¹,²,³,*, and Longfei Wang¹

Author Affiliations

¹School of Automobile and Transportation, Xihua University, Chengdu 610039, Sichuan, China

²Vehicle Measurement Control and Safety Key Laboratory of Sichuan Province, Xihua University, Chengdu 610039, Sichuan, China

³Provincial Engineering Research Center for New Energy Vehicle Intelligent Control and Simulation Test Technology of Sichuan, Chengdu 610039, Sichuan, China

Abstract

A voxel self-attention auxiliary (VSAA) network is proposed to address the poor detection performance of LiDAR object detection algorithms in autonomous driving scenes. This weakness stems from the algorithms' reliance on convolutional neural networks (CNNs), which gives them only a shallow understanding of spatial structure. The VSAA network can be applied directly to most voxel-based object detection algorithms to enhance their feature extraction capabilities. First, building on voxel feature encoding, the VSAA network constructs voxel hash tables as a secondary encoding, which makes the search for relevant voxels in the subsequent self-attention calculations more efficient. Second, the VSAA network applies the self-attention mechanism at the voxel level to capture global information and deep contextual semantic information. Finally, this study proposes the VA-SECOND and VA-PVRCNN algorithms by applying the VSAA network to the benchmark algorithms SECOND and PV-RCNN, respectively. The features of the VSAA network and the CNN are fused to compensate for the CNN's small receptive field, improving the algorithm's detection ability and its understanding of the entire spatial scene. Experimental results on the KITTI dataset show that, compared with the benchmark algorithms, VA-SECOND and VA-PVRCNN improve the average detection accuracy over all detected targets by 1.16 and 1.54 percentage points, respectively, which demonstrates the effectiveness of the VSAA network.
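The two steps named in the abstract, a voxel hash table for fast lookup followed by self-attention at the voxel level, can be illustrated with a minimal sketch. It is not the authors' implementation. Several choices are assumptions made for illustration only: a Python dict stands in for the hash table, the mean of the points in a voxel stands in for the learned voxel feature encoding, queries, keys, and values are the raw voxel features with no learned projections, and each voxel attends only to occupied voxels within a fixed cell radius. The function names `voxelize`, `neighbours`, and `self_attention` are hypothetical.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size):
    """Group 3D points into voxels; return {voxel_coord: feature vector}."""
    buckets = defaultdict(list)
    for p in points:
        key = tuple(int(math.floor(c / voxel_size)) for c in p)
        buckets[key].append(p)
    # Assumption: the per-voxel mean replaces a learned voxel feature encoder.
    return {k: [sum(col) / len(v) for col in zip(*v)] for k, v in buckets.items()}


def neighbours(table, key, radius):
    """Find occupied voxels within `radius` cells of `key` via hash lookups."""
    found = []
    offsets = range(-radius, radius + 1)
    for dx in offsets:
        for dy in offsets:
            for dz in offsets:
                cand = (key[0] + dx, key[1] + dy, key[2] + dz)
                if cand in table:  # O(1) average-case membership test
                    found.append(cand)
    return found


def self_attention(table, radius=1):
    """Scaled dot-product attention of each voxel over its occupied neighbours."""
    out = {}
    for key, query in table.items():
        keys = neighbours(table, key, radius)  # always contains `key` itself
        scale = math.sqrt(len(query))
        scores = [
            sum(a * b for a, b in zip(query, table[k])) / scale for k in keys
        ]
        top = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out[key] = [
            sum(w / total * table[k][i] for w, k in zip(weights, keys))
            for i in range(len(query))
        ]
    return out


points = [(0.1, 0.1, 0.1), (0.3, 0.2, 0.1), (1.2, 0.1, 0.1), (5.0, 5.0, 5.0)]
table = voxelize(points, voxel_size=1.0)
attended = self_attention(table, radius=1)
```

In this toy scene the voxel at (5, 5, 5) has no occupied neighbours, so it attends only to itself and its feature is unchanged, while the voxel at (0, 0, 0) mixes in the feature of its neighbour at (1, 0, 0). The hash table avoids scanning all voxels to find that neighbour, which is the efficiency gain the abstract attributes to the secondary encoding.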

Keywords

autonomous driving; LiDAR; object detection; self-attention; voxel

Citation

Jie Cao, Yiqiang Peng, Likang Fan, Longfei Wang. 3D Object Detection Based on Voxel Self-Attention Auxiliary Networks[J]. Laser & Optoelectronics Progress, 2024, 61(24): 2415004