Laser & Optoelectronics Progress, Volume 62, Issue 2, 0212001 (2025)
Multiview 3D Object Detection Based on Improved DETR3D
To overcome the limitations of current multicamera 3D object detection methods, which often struggle to balance precision and computational speed, we propose an enhanced version of DETR3D. The algorithm framework is based on the encoder-decoder architecture of DETR3D. We incorporate a 3D position encoder alongside the image feature extraction branch to enhance image features. Object queries are initialized with two components, representing the object's bounding box and instance features. In the decoder stage, we introduce a multiscale adaptive attention mechanism based on Euclidean distance, allowing the algorithm to effectively capture multiscale information in 3D space, which significantly improves detection performance for complex and diverse objects in autonomous driving scenarios. During feature sampling, we integrate temporal information to align features across consecutive frames, improving detection accuracy. Additionally, multipoint sampling is employed to strengthen the robustness of the sampling process. Experiments conducted on the nuScenes dataset indicate that compared to the baseline algorithm, our proposed approach achieves a 17.1% improvement in detection accuracy and a 4.22-fold increase in computational speed. Moreover, it proves effective in detecting objects even in occluded environments.
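The feature sampling step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the general DETR3D-style idea of projecting 3D reference points into a camera image and reading features at the projected pixels, with several points sampled around each reference point. It is not the authors' implementation: the function names, the 3x4 `lidar2img` projection matrix, the fixed offsets, and the nearest-neighbor lookup (in place of bilinear interpolation) are all assumptions chosen for brevity, and only one camera and one feature scale are handled.

```python
import numpy as np


def project_points(points_3d, lidar2img, img_hw):
    """Project (N, 3) points into one camera.

    Returns (N, 2) pixel coordinates (u, v) and an (N,) boolean mask that is
    True for points in front of the camera and inside the image.
    `lidar2img` is an assumed 3x4 projection matrix (intrinsics @ extrinsics).
    """
    n = points_3d.shape[0]
    homo = np.concatenate([points_3d, np.ones((n, 1))], axis=1)  # (N, 4)
    cam = homo @ lidar2img.T                                     # (N, 3)
    depth = cam[:, 2]
    eps = 1e-5
    # Clamp the depth so points behind the camera do not divide by zero;
    # they are rejected by the mask below anyway.
    uv = cam[:, :2] / np.maximum(depth, eps)[:, None]
    h, w = img_hw
    valid = (
        (depth > eps)
        & (uv[:, 0] >= 0) & (uv[:, 0] <= w - 1)
        & (uv[:, 1] >= 0) & (uv[:, 1] <= h - 1)
    )
    return uv, valid


def sample_multipoint(feature_map, center, offsets, lidar2img):
    """Average the features at several 3D points around one reference point.

    feature_map: (H, W, C) array, center: (3,), offsets: (K, 3).
    Points that project outside the image are ignored; if none are visible,
    a zero vector is returned.
    """
    points = center[None, :] + offsets                           # (K, 3)
    h, w, c = feature_map.shape
    uv, valid = project_points(points, lidar2img, (h, w))
    if not valid.any():
        return np.zeros(c)
    cols = np.rint(uv[valid, 0]).astype(int)  # nearest-neighbor lookup
    rows = np.rint(uv[valid, 1]).astype(int)
    return feature_map[rows, cols].mean(axis=0)
```

Averaging over several nearby points means a single occluded or out-of-view projection does not zero out the sampled feature, which is one plausible reading of why multipoint sampling improves robustness.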
Yuhan Zhang, Miaohua Huang, Gengyao Chen, Yanzhou Li, Yiming Wu. Multiview 3D Object Detection Based on Improved DETR3D[J]. Laser & Optoelectronics Progress, 2025, 62(2): 0212001
Category: Instrumentation, Measurement and Metrology
Received: Mar. 18, 2024
Accepted: Apr. 30, 2024
Published Online: Dec. 26, 2024
Author Email: Miaohua Huang (mh_huang@163.com)
CSTR:32186.14.LOP240912