Optoelectronics Letters, Vol. 21, Issue 9, 547 (2025)
Point-voxel dual transformer for LiDAR 3D object detection
[1] YU J H, GAO H W, ZHOU D L, et al. Deep temporal model-based identity-aware hand detection for space human-robot interaction[J]. IEEE transactions on cybernetics, 2021, 52(12): 13738-13751.
[2] YU J H, XU Y K, CHEN H, et al. Versatile graph neural networks toward intuitive human activity understanding[J]. IEEE transactions on neural networks and learning systems, 2022.
[3] ZHOU Y, TUZEL O. Voxelnet: end-to-end learning for point cloud based 3D object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, June 18-23, 2018, Salt Lake City, USA. New York: IEEE, 2018: 4490-4499.
[4] DENG J J, SHI S S, LI P W, et al. Voxel R-CNN: towards high performance voxel-based 3D object detection[C]//Proceedings of the AAAI Conference on Artificial Intelligence, February 2-9, 2021, Vancouver, Canada. Washington: AAAI, 2021, 35(2): 1201-1209.
[5] QI C R, SU H, MO K C, et al. Pointnet: deep learning on point sets for 3D classification and segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 2017: 652-660.
[6] QI C R, YI L, SU H, et al. Pointnet++: deep hierarchical feature learning on point sets in a metric space[J]. Advances in neural information processing systems, 2017.
[7] SHI S, WANG X G, LI H S. PointRCNN: 3D object proposal generation and detection from point cloud[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 16-20, 2019, Long Beach, CA, USA. New York: IEEE, 2019: 770-779.
[8] YAN Y, MAO Y X, LI B. SECOND: sparsely embedded convolutional detection[J]. Sensors, 2018, 18(10): 3337.
[9] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[J]. Advances in neural information processing systems, 2017, 30.
[10] TONG J G, YANG F H, YANG S, et al. Hyperbolic cosine transformer for LiDAR 3D object detection[EB/OL]. (2022-11-05) [2023-09-18]. https://arxiv.org/abs/2211.05580.
[11] SHENG H L, CAI S J, LIU Y, et al. Improving 3D object detection with channel-wise transformer[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, October 10-17, 2021, Montreal, Canada. New York: IEEE, 2021: 2743-2752.
[12] SHI S S, GUO C X, JIANG L, et al. PV-RCNN: point-voxel feature set abstraction for 3D object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 13-19, 2020, Seattle, WA, USA. New York: IEEE, 2020: 10529-10538.
[13] YANG Z T, SUN Y N, LIU S, et al. 3DSSD: point-based 3D single stage object detector[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 13-19, 2020, Seattle, WA, USA. New York: IEEE, 2020: 11040-11048.
[14] CHEN C, CHEN Z, ZHANG J, et al. SASA: semantics-augmented set abstraction for point-based 3D object detection[C]//Proceedings of the AAAI Conference on Artificial Intelligence, February 22-March 1, 2022, Vancouver, Canada. Washington: AAAI, 2022, 36(1): 221-229.
[15] CHEN Y K, LI Y W, ZHANG X Y, et al. Focal sparse convolutional networks for 3D object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 19-24, 2022, New Orleans, Louisiana, USA. New York: IEEE, 2022: 5428-5437.
[16] HU J S K, KUAI T, WASLANDER S L. Point density-aware voxels for lidar 3D object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 19-24, 2022, New Orleans, Louisiana, USA. New York: IEEE, 2022: 8469-8478.
[17] ZHAO H S, JIANG L, JIA J Y, et al. Point transformer[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, October 10-17, 2021, Montreal, Canada. New York: IEEE, 2021: 16259-16268.
[18] GUO M H, CAI J X, LIU Z N, et al. PCT: point cloud transformer[J]. Computational visual media, 2021, 7(2): 187-199.
[19] GUAN T R, WANG J, LAN S Y, et al. M3DETR: multi-representation, multi-scale, mutual-relation 3D object detection with transformers[C]//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, January 3-8, 2022, Waikoloa, HI, USA. New York: IEEE, 2022.
[20] MAO J G, XUE Y J, NIU M Z, et al. Voxel transformer for 3D object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, October 10-17, 2021, Montreal, Canada. New York: IEEE, 2021: 3164-3173.
[21] XIE E, ZHANG Z Y, ZHANG G D, et al. Light bottle transformer based large scale point cloud classification[J]. Optoelectronics letters, 2023, 19(6): 377-384.
[22] YANG H H, WANG W X, CHEN M H, et al. PVT-SSD: single-stage 3D object detector with point-voxel transformer[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 18-22, 2023, Vancouver, Canada. New York: IEEE, 2023: 13476-13487.
[23] GEIGER A, LENZ P, URTASUN R. Are we ready for autonomous driving? The KITTI vision benchmark suite[C]//2012 IEEE Conference on Computer Vision and Pattern Recognition, June 16-21, 2012, Providence, Rhode Island, USA. New York: IEEE, 2012: 3354-3361.
[24] CARION N, MASSA F, SYNNAEVE G, et al. End-to-end object detection with transformers[C]//European Conference on Computer Vision, August 23-28, 2020, Glasgow, UK. Heidelberg: Springer, 2020: 213-229.
[25] JIANG B R, LUO R X, MAO J Y, et al. Acquisition of localization confidence for accurate object detection[C]//Proceedings of the European Conference on Computer Vision (ECCV), September 8-14, 2018, Munich, Germany. Heidelberg: Springer, 2018: 784-799.
[26] CHEN X Z, KUNDU K, ZHU Y K, et al. 3D object proposals for accurate object class detection[J]. Advances in neural information processing systems, 2015, 28.
[27] OpenPCDet Development Team. OpenPCDet: an open-source toolbox for 3D object detection from point clouds[EB/OL]. (2020-01-01) [2023-11-25]. https://github.com/openmmlab/OpenPCDet.
[28] MAO J G, NIU M Z, BAI H Y, et al. Pyramid R-CNN: towards better performance and adaptability for 3D object detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, October 10-17, 2021, Montreal, Canada. New York: IEEE, 2021: 2723-2732.
[29] QIAN R, LAI X, LI X R. BADet: boundary-aware 3D object detection from point clouds[J]. Pattern recognition, 2022, 125: 108524.
[30] LI Z Y, YAO Y C, QUAN Z B, et al. Spatial information enhancement network for 3D object detection from point cloud[J]. Pattern recognition, 2022, 128: 108684.
TONG Jigang, YANG Fanhang, YANG Sen, DU Shengzhi. Point-voxel dual transformer for LiDAR 3D object detection[J]. Optoelectronics Letters, 2025, 21(9): 547
Category: Image and Information processing
Received: Jul. 17, 2023
Accepted: Sep. 15, 2025
Published Online: Sep. 15, 2025