Laser & Optoelectronics Progress, Volume 61, Issue 24, 2415004 (2024)

3D Object Detection Based on Voxel Self-Attention Auxiliary Networks

Jie Cao1, Yiqiang Peng1,2,3, Likang Fan1,2,3,*, and Longfei Wang1
Author Affiliations
  • 1School of Automobile and Transportation, Xihua University, Chengdu 610039, Sichuan, China
  • 2Vehicle Measurement Control and Safety Key Laboratory of Sichuan Province, Xihua University, Chengdu 610039, Sichuan, China
  • 3Provincial Engineering Research Center for New Energy Vehicle Intelligent Control and Simulation Test Technology of Sichuan, Chengdu 610039, Sichuan, China

    A voxel self-attention auxiliary (VSAA) network is proposed to address the poor detection performance of LiDAR object detection algorithms in autonomous driving scenes. This weakness stems from the algorithms' reliance on convolutional neural networks (CNNs), which leaves them without a deep understanding of spatial structure. The VSAA network can be applied directly to most voxel-based object detection algorithms to enhance their feature extraction capabilities. First, building on voxel feature encoding, the VSAA network constructs voxel hash tables as a secondary encoding, which makes the search for relevant voxels in the subsequent self-attention computation more efficient. Second, the VSAA network applies the self-attention mechanism at the voxel level to capture comprehensive global information and deep contextual semantic information. Finally, this study proposes the VA-SECOND and VA-PVRCNN algorithms by applying the VSAA network to the benchmark algorithms SECOND and PV-RCNN, respectively. The features of the VSAA network and the CNN are fused to compensate for the small receptive field of the CNN, which enhances the detection ability of the algorithm and allows it to understand the entire spatial scene. Experimental results on the KITTI dataset show that, compared with the benchmark algorithms, VA-SECOND and VA-PVRCNN improve the average detection accuracy over all detected targets by 1.16 and 1.54 percentage points, respectively, which demonstrates the effectiveness of the VSAA network.
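    The two steps described in the abstract, a voxel hash table for fast lookup followed by self-attention over voxel features, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the fixed neighborhood `radius`, and the use of identity query/key/value projections are all assumptions made for illustration, and the abstract does not specify which voxels each query attends to.

```python
import numpy as np


def build_voxel_hash(coords):
    """Map each integer voxel coordinate (x, y, z) to its row index.

    Hypothetical helper: a Python dict stands in for the voxel hash table.
    """
    return {tuple(int(v) for v in c): i for i, c in enumerate(coords)}


def neighbor_indices(table, center, radius=1):
    """Return row indices of non-empty voxels within `radius` of `center`."""
    cx, cy, cz = center
    found = []
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            for dz in range(-radius, radius + 1):
                idx = table.get((cx + dx, cy + dy, cz + dz))
                if idx is not None:
                    found.append(idx)
    return found


def voxel_self_attention(features, coords, radius=1):
    """Scaled dot-product attention of each voxel over its hashed neighbors.

    Assumption: queries, keys, and values are the raw features
    (no learned projections), to keep the sketch short.
    """
    table = build_voxel_hash(coords)
    dim = features.shape[1]
    out = np.zeros_like(features)
    for i, c in enumerate(coords):
        # The hash table finds occupied neighbors in constant time per cell.
        idx = neighbor_indices(table, tuple(int(v) for v in c), radius)
        keys = features[idx]
        scores = keys @ features[i] / np.sqrt(dim)
        weights = np.exp(scores - scores.max())  # numerically stable softmax
        weights /= weights.sum()
        out[i] = weights @ keys
    return out


if __name__ == "__main__":
    coords = np.array([[0, 0, 0], [0, 0, 1], [5, 5, 5]])
    feats = np.random.default_rng(0).normal(size=(3, 4))
    print(voxel_self_attention(feats, coords).shape)
```

    In this sketch an isolated voxel attends only to itself, so its output equals its input feature. Because sparse LiDAR scenes leave most grid cells empty, a hash lookup avoids scanning a dense grid when gathering the voxels that take part in attention.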


    Jie Cao, Yiqiang Peng, Likang Fan, Longfei Wang. 3D Object Detection Based on Voxel Self-Attention Auxiliary Networks[J]. Laser & Optoelectronics Progress, 2024, 61(24): 2415004

    Paper Information

    Category: Machine Vision

    Received: Mar. 19, 2024

    Accepted: Apr. 24, 2024

    Published Online: Dec. 13, 2024

    The Author Email: Likang Fan (BITfanlikang@163.com)

    DOI:10.3788/LOP240923

    CSTR:32186.14.LOP240923
