Laser & Optoelectronics Progress, Vol. 62, Issue 16, 1615007 (2025)

Multi-Modal 3D Object Detection Algorithm Based on Kolmogorov-Arnold Network

Yanwu Ling1,2, Junmin Rao2, Yan Li2, and Fanming Li2,*
Author Affiliations
  • 1School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  • 2Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200083, China

    Traditional multilayer perceptron (MLP) models suffer from limited interpretability, training costs that grow with the number of neurons, and poor fusion of image and point cloud data. To address these problems, a multi-modal 3D object detection network based on the Kolmogorov-Arnold network (KAN) is proposed. The network uses KAN as its backbone and introduces KANDyVFE, a voxel feature encoder combined with a fusion layer. The fusion layer uses a self-attention mechanism to dynamically fuse image and point cloud features. In addition, RGB images are cropped and rendered to generate colored point clouds, which also enhances point cloud feature expression. Experimental results on the KITTI dataset show that, compared with the baseline method SECOND, the mean average precision of the network improves by 3.78 percentage points in bird's-eye-view detection of the car category and by 3.75 percentage points in 3D object detection. Visualization results show that the method reduces false positives and missed detections, which verifies the effectiveness of KAN in point cloud applications. Ablation experiments further confirm that the proposed network achieves good detection performance.
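The abstract does not give the internals of the fusion layer, so the following is only a minimal sketch of the general idea of fusing a point cloud feature and an image feature with scaled dot-product self-attention. It is not the authors' KANDyVFE implementation. The function name `fuse_features`, the identity query/key/value projections, and the final averaging step are all assumptions made for illustration; a trained layer would normally use learned projection matrices.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_features(point_feat, image_feat):
    """Fuse two equal-length feature vectors with self-attention.

    Hypothetical helper: the two features are treated as a two-token
    sequence, and queries, keys, and values are the raw tokens
    (identity projections, an assumption of this sketch).
    """
    if len(point_feat) != len(image_feat):
        raise ValueError("features must have the same length")
    tokens = [point_feat, image_feat]
    dim = len(point_feat)
    scale = math.sqrt(dim)

    attended = []
    for q in tokens:
        # Attention weights of this token over both tokens.
        weights = softmax([dot(q, k) / scale for k in tokens])
        # Weighted sum of the value vectors.
        attended.append(
            [sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(dim)]
        )

    # Assumption: average the two attended tokens into one fused vector.
    return [(a + b) / 2 for a, b in zip(attended[0], attended[1])]


fused = fuse_features([0.2, 0.9, 0.1], [0.7, 0.3, 0.5])
```

Because the attention weights depend on the input features themselves, the contribution of each modality changes from sample to sample, which is the sense in which such a fusion is "dynamic". Every output component is a convex combination of the corresponding input components, so the fused vector stays within the range spanned by the two inputs.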

    Yanwu Ling, Junmin Rao, Yan Li, Fanming Li. Multi-Modal 3D Object Detection Algorithm Based on Kolmogorov-Arnold Network[J]. Laser & Optoelectronics Progress, 2025, 62(16): 1615007

    Paper Information

    Category: Machine Vision

    Received: Jan. 20, 2025

    Accepted: Mar. 14, 2025

    Published Online: Aug. 11, 2025

    Corresponding Author Email: Fanming Li (lfmjws@163.com)

    DOI:10.3788/LOP250553

    CSTR:32186.14.LOP250553
