Three-Dimensional Object Detection Technology Based on Point Cloud Data

Jianan Li; Ze Wang; Tingfa Xu

doi:10.3788/AOS230745

Acta Optica Sinica, Vol. 43, Issue 15, 1515001 (2023)

Jianan Li¹,², Ze Wang¹, and Tingfa Xu¹,²,³,*

Author Affiliations

¹School of Optoelectronics, Beijing Institute of Technology, Beijing 100081, China

²Key Laboratory of Photoelectronic Imaging Technology and System, Ministry of Education, Beijing Institute of Technology, Beijing 100081, China

³Chongqing Innovation Center, Beijing Institute of Technology, Chongqing 401135, China

References(82)

[1] Lalonde J F, Unnikrishnan R, Vandapel N et al. Scale selection for classification of point-sampled 3D surfaces[C], 285-292(2005).

[2] Gao Z H, Liu X W. Support vector machine and object-oriented classification for urban impervious surface extraction from satellite imagery[C](2014).

[3] Zheng G, Zhong L, Li Y F et al. A random forest based method for urban object classification using lidar data and aerial imagery[C](2016).

[4] Munoz D, Bagnell J A, Vandapel N et al. Contextual classification with functional Max-Margin Markov Networks[C], 975-982(2009).

[5] Niemeyer J, Rottensteiner F, Soergel U. Contextual classification of lidar data and building object detection in urban areas[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 87, 152-165(2014).

[6] LeCun Y, Bengio Y, Hinton G. Deep learning[J]. Nature, 521, 436-444(2015).

[7] Geiger A, Lenz P, Urtasun R. Are we ready for autonomous driving? The KITTI vision benchmark suite[C], 3354-3361(2012).

[8] Maturana D, Scherer S. VoxNet: a 3D Convolutional Neural Network for real-time object recognition[C], 922-928(2015).

[9] Xu Y, Hoegner L, Tuttas S et al. Voxel- and graph-based point cloud segmentation of 3D scenes using perceptual grouping laws[J]. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, IV-1/W1, 43-50(2017).

[10] Qi C R, Su H, Mo K C et al. PointNet: deep learning on point sets for 3D classification and segmentation[C], 77-85(2017).

[11] Qi C R, Litany O, He K M et al. Deep Hough voting for 3D object detection in point clouds[C], 9276-9285(2020).

[12] Qi X J, Liao R J, Jia J Y et al. 3D graph neural networks for RGBD semantic segmentation[C], 5209-5218(2017).

[13] Landrieu L, Simonovsky M. Large-scale point cloud semantic segmentation with superpoint graphs[C], 4558-4567(2018).

[14] Bi Y, Chadha A, Abbas A et al. Graph-based object classification for neuromorphic vision sensing[C], 491-501(2020).

[15] Wang Y, Sun Y B, Liu Z W et al. Dynamic graph CNN for learning on point clouds[J]. ACM Transactions on Graphics, 38, 1-12(2019).

[16] Zhou Y, Tuzel O. VoxelNet: end-to-end learning for point cloud based 3D object detection[C], 4490-4499(2018).

[17] Yan Y, Mao Y X, Li B. SECOND: sparsely embedded convolutional detection[J]. Sensors, 18, 3337(2018).

[18] Lang A H, Vora S, Caesar H et al. PointPillars: fast encoders for object detection from point clouds[C], 12689-12697(2020).

[19] Deng J J, Shi S S, Li P W et al. Voxel R-CNN: towards high performance voxel-based 3D object detection[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 35, 1201-1209(2021).

[20] Wu H, Wen C L, Li W et al. Transformation-equivariant 3D object detection for autonomous driving[EB/OL]. https://arxiv.org/abs/2211.11962

[21] Zheng W, Tang W L, Chen S J et al. CIA-SSD: confident IoU-aware single-stage object detector from point cloud[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 35, 3555-3562(2021).

[22] Zheng W, Tang W L, Jiang L et al. SE-SSD: self-ensembling single-stage object detector from point cloud[C], 14489-14498(2021).

[23] Yin T W, Zhou X Y, Krähenbühl P. Center-based 3D object detection and tracking[C], 11779-11788(2021).

[24] Vaswani A, Shazeer N, Parmar N et al. Attention is all you need[C], 6000-6010(2017).

[25] Mao J G, Xue Y J, Niu M Z et al. Voxel transformer for 3D object detection[C], 3144-3153(2022).

[26] Sheng H L, Cai S J, Liu Y et al. Improving 3D object detection with channel-wise transformer[C], 2723-2732(2022).

[27] He C H, Li R H, Li S et al. Voxel set transformer: a set-to-set approach to 3D object detection from point clouds[C], 8407-8417(2022).

[28] Dong S, Ding L, Wang H et al. MsSVT: mixed-scale sparse voxel transformer for 3D object detection on point clouds[C](2022).

[29] Ding L H, Dong S C, Xu T F et al. FH-net: a fast hierarchical network for scene flow estimation on real-world point clouds[M]. Avidan S, Brostow G, Cissé M, et al. Computer vision–ECCV 2022. Lecture notes in computer science, 13699, 213-229(2022).

[30] Zhang Y N, Chen J X, Huang D. CAT-det: contrastively augmented transformer for multimodal 3D object detection[C], 898-907(2022).

[31] Xie Q, Lai Y K, Wu J et al. MLCVNet: multi-level context VoteNet for 3D object detection[C], 10444-10453(2020).

[32] Xie Q, Lai Y K, Wu J et al. Vote-based 3D object detection with context modeling and SOB-3DNMS[J]. International Journal of Computer Vision, 129, 1857-1874(2021).

[33] Chen X X, Zhao H, Zhou G Y et al. PQ-transformer: jointly parsing 3D objects and layouts from point clouds[J]. IEEE Robotics and Automation Letters, 7, 2519-2526(2022).

[34] Misra I, Girdhar R, Joulin A. An end-to-end transformer model for 3D object detection[C], 2886-2897(2022).

[35] Xu X, Dong S, Xu T et al. FusionRCNN: LiDAR-camera fusion for two-stage 3D object detection[J]. Remote Sensing, 15, 1839(2023).

[36] Hu J S K, Kuai T S, Waslander S L. Point density-aware voxels for LiDAR 3D object detection[C], 8459-8468(2022).

[37] Qi C R, Yi L, Su H et al. PointNet++: deep hierarchical feature learning on point sets in a metric space[C], 5105-5114(2017).

[38] Shi S S, Wang X G, Li H S. PointRCNN: 3D object proposal generation and detection from point cloud[C], 770-779(2020).

[39] Yang Z T, Sun Y N, Liu S et al. 3DSSD: point-based 3D single stage object detector[C], 11037-11045(2020).

[40] Zhang Y F, Hu Q Y, Xu G Q et al. Not all points are equal: learning highly efficient point-based detectors for 3D LiDAR point clouds[C], 18931-18940(2022).

[41] Chen C, Chen Z, Zhang J et al. SASA: semantics-augmented set abstraction for point-based 3D object detection[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 36, 221-229(2022).

[42] Shi W J, Rajkumar R. Point-GNN: graph neural network for 3D object detection in a point cloud[C], 1708-1716(2020).

[43] Zhang Y N, Huang D, Wang Y H. PC-RGNN: point cloud completion and graph neural network for 3D object detection[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 35, 3430-3437(2021).

[44] Yang H H, Liu Z L, Wu X P et al. Graph R-CNN: towards accurate 3D object detection with semantic-decorated local graph[M]. Avidan S, Brostow G, Cissé M, et al. Computer vision–ECCV 2022. Lecture notes in computer science, 13668, 662-679(2022).

[45] Chen Y L, Liu S, Shen X Y et al. Fast point R-CNN[C], 9774-9783(2020).

[46] Yang Z T, Sun Y N, Liu S et al. STD: sparse-to-dense 3D object detector for point cloud[C], 1951-1960(2020).

[47] Shi S S, Guo C X, Jiang L et al. PV-RCNN: point-voxel feature set abstraction for 3D object detection[C], 10526-10535(2020).

[48] He C H, Zeng H, Huang J Q et al. Structure aware single-stage 3D object detection from point cloud[C], 11870-11879(2020).

[49] Liu Z J, Tang H T, Amini A et al. BEVFusion: multi-task multi-sensor fusion with unified bird's-eye view representation[EB/OL]. https://arxiv.org/abs/2205.13542

[50] Qi C R, Liu W, Wu C X et al. Frustum PointNets for 3D object detection from RGB-D data[C], 918-927(2018).

[51] Wang Z X, Jia K. Frustum ConvNet: sliding Frustums to aggregate local point-wise features for amodal 3D object detection[C], 1742-1749(2020).

[52] Chen X Z, Ma H M, Wan J et al. Multi-view 3D object detection network for autonomous driving[C], 6526-6534(2017).

[53] Ku J, Mozifian M, Lee J et al. Joint 3D proposal generation and object detection from view aggregation[C](2019).

[54] Chen X Y, Zhang T Y, Wang Y et al. FUTR3D: a unified sensor fusion framework for 3D detection[EB/OL]. https://arxiv.org/abs/2203.10642

[55] Bai X Y, Hu Z Y, Zhu X G et al. TransFusion: robust LiDAR-camera fusion for 3D object detection with transformers[C], 1080-1089(2022).

[56] Vora S, Lang A H, Helou B et al. PointPainting: sequential fusion for 3D object detection[C], 4603-4611(2020).

[57] Yin T W, Zhou X Y, Krähenbühl P. Multimodal virtual point 3D detection[EB/OL]. https://arxiv.org/abs/2111.06881

[58] Liang M, Yang B, Wang S L et al. Deep continuous fusion for multi-sensor 3D object detection[M]. Ferrari V, Hebert M, Sminchisescu C, et al. Computer vision–ECCV 2018. Lecture notes in computer science, 11220, 663-678(2018).

[59] Liang M, Yang B, Chen Y et al. Multi-task multi-sensor fusion for 3D object detection[C], 7337-7345(2020).

[60] Wang Y D, Tian Y L, Li G Q et al. 3D object detection based on convolutional neural networks: a survey[J]. Pattern Recognition and Artificial Intelligence, 34, 1103-1119(2021).

[61] Sun P, Kretzschmar H, Dotiwalla X et al. Scalability in perception for autonomous driving: Waymo open dataset[C], 2443-2451(2020).

[62] Caesar H, Bankiti V, Lang A H et al. nuScenes: a multimodal dataset for autonomous driving[C], 11618-11628(2020).

[63] Cong P S, Zhu X G, Qiao F et al. STCrowd: a multimodal dataset for pedestrian perception in crowded scenes[C], 19608-19617(2022).

[64] Silberman N, Hoiem D, Kohli P et al. Indoor segmentation and support inference from RGBD images[M]. Fitzgibbon A, Lazebnik S, Perona P, et al. Computer vision–ECCV 2012. Lecture notes in computer science, 7576, 746-760(2012).

[65] Xiao J X, Owens A, Torralba A. SUN3D: a database of big spaces reconstructed using SfM and object labels[C], 1625-1632(2013).

[66] Song S R, Lichtenberg S P, Xiao J X. SUN RGB-D: a RGB-D scene understanding benchmark suite[C], 567-576(2015).

[67] Dai A, Chang A X, Savva M et al. ScanNet: richly-annotated 3D reconstructions of indoor scenes[C], 2432-2443(2017).

[68] Qian R, Lai X, Li X R. 3D object detection for autonomous driving: a survey[J]. Pattern Recognition, 130, 108796(2022).

[69] Simonelli A, Bulò S R, Porzi L et al. Disentangling monocular 3D object detection[C], 1991-1999(2020).

[70] Shi S S, Wang Z, Shi J P et al. From points to parts: 3D object detection from point cloud with part-aware and part-aggregation network[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43, 2647-2664(2021).

[71] Liu Z, Zhao X, Huang T T et al. TANet: robust 3D object detection from point clouds with triple attention[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 34, 11677-11684(2020).

[72] Yi H W, Shi S S, Ding M Y et al. SegVoxelNet: exploring semantic context and depth-aware features for 3D vehicle detection from point cloud[C], 2274-2280(2020).

[73] Qian X L, Wang L, Zhu Y et al. ImpDet: exploring implicit fields for 3D object detection[C], 4249-4259(2023).

[74] Zhou Y, Sun P, Zhang Y et al. End-to-end multi-view fusion for 3D object detection in LiDAR point clouds[EB/OL]. https://arxiv.org/abs/1910.06528

[75] Wang Y E, Fathi A, Kundu A et al. Pillar-based object detection for autonomous driving[M]. Vedaldi A, Bischof H, Brox T, et al. Computer vision–ECCV 2020. Lecture notes in computer science, 12367, 18-34(2020).

[76] Li Z C, Wang F, Wang N Y. LiDAR R-CNN: an efficient and universal 3D object detector[C], 7542-7551(2021).

[77] Miao Z W, Chen J K, Pan H Y et al. PVGNet: a bottom-up one-stage 3D object detector with integrated multi-level features[C], 3278-3287(2021).

[78] Mao J G, Niu M Z, Bai H Y et al. Pyramid R-CNN: towards better performance and adaptability for 3D object detection[C], 2703-2712(2022).

[79] Wang Y, Chao W L, Garg D et al. Pseudo-LiDAR from visual depth estimation: bridging the gap in 3D object detection for autonomous driving[C], 8437-8445(2020).

[80] Meng Q H, Wang W G, Zhou T F et al. Weakly supervised 3D object detection from lidar point cloud[M]. Vedaldi A, Bischof H, Brox T, et al. Computer vision–ECCV 2020. Lecture notes in computer science, 12358, 515-531(2020).

[81] Zhang Z W, Girdhar R, Joulin A et al. Self-supervised pretraining of 3D features on any point-cloud[C], 10232-10243(2022).

[82] Luo Z P, Cai Z A, Zhou C Q et al. Unsupervised domain adaptive 3D detection with multi-level consistency[C], 8846-8855(2022).

Jianan Li, Ze Wang, Tingfa Xu. Three-Dimensional Object Detection Technology Based on Point Cloud Data[J]. Acta Optica Sinica, 2023, 43(15): 1515001
