Laser & Optoelectronics Progress, Vol. 62, Issue 16, 1615007 (2025)
Multi-Modal 3D Object Detection Algorithm Based on Kolmogorov-Arnold Network
Traditional multilayer perceptron (MLP) models suffer from limited interpretability, training costs that grow with the number of neurons, and poor fusion of image and point cloud data. To address these problems, a multi-modal 3D object detection network based on the Kolmogorov-Arnold network (KAN) is proposed. The network uses KAN as its backbone and introduces a voxel feature encoder, KANDyVFE, that incorporates a fusion layer. The fusion layer uses a self-attention mechanism to dynamically fuse image and point cloud features. In addition, RGB images are cropped and rendered to generate colored point clouds, which further enhances point cloud feature expression. Experimental results on the KITTI dataset show that, compared with the baseline method SECOND, the network improves mean average precision for the car category by 3.78 percentage points in bird's-eye-view detection and by 3.75 percentage points in 3D object detection. Visualization results show that the method reduces false positives and missed detections, which verifies the effectiveness of KAN in point cloud applications. Ablation experiments further confirm that the proposed network achieves good detection performance.
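The abstract does not specify the internals of the fusion layer, so the following is only a minimal sketch of the general idea of self-attention fusion over two modality features, not the authors' KANDyVFE implementation. It assumes each voxel carries one point cloud feature vector and one image feature vector of equal length, and it uses identity projections in place of learned query, key, and value weights. The names `fuse_voxel`, `softmax`, and `dot` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_voxel(point_feat, image_feat):
    """Fuse one voxel's point and image features with scaled dot-product
    self-attention over the two modality tokens.

    Assumption: identity projections stand in for learned Q/K/V weights.
    Returns (fused_feature, attention_weights).
    """
    tokens = [point_feat, image_feat]
    dim = len(point_feat)
    scale = math.sqrt(dim)
    fused = [0.0] * dim
    all_weights = []
    for query in tokens:
        # Attention weights of this token over both modality tokens.
        weights = softmax([dot(query, key) / scale for key in tokens])
        all_weights.append(weights)
        # Weighted sum of the token values for this query.
        attended = [
            sum(w * value[i] for w, value in zip(weights, tokens))
            for i in range(dim)
        ]
        # Average the attended tokens into a single fused feature.
        fused = [f + a / len(tokens) for f, a in zip(fused, attended)]
    return fused, all_weights


# Example call with toy 2-dimensional features for one voxel.
fused, weights = fuse_voxel([1.0, 0.0], [0.0, 1.0])
```

Because the attention weights are computed from the features themselves, the contribution of each modality changes per voxel, which is the sense in which such a fusion is "dynamic". A trained network would learn the projections rather than use the identity.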
Yanwu Ling, Junmin Rao, Yan Li, Fanming Li. Multi-Modal 3D Object Detection Algorithm Based on Kolmogorov-Arnold Network[J]. Laser & Optoelectronics Progress, 2025, 62(16): 1615007
Category: Machine Vision
Received: Jan. 20, 2025
Accepted: Mar. 14, 2025
Published Online: Aug. 11, 2025
Author Email: Fanming Li (lfmjws@163.com)
CSTR:32186.14.LOP250553