Acta Optica Sinica, Vol. 43, Issue 15, 1515001 (2023)

Three-Dimensional Object Detection Technology Based on Point Cloud Data

Jianan Li1,2, Ze Wang1, and Tingfa Xu1,2,3,*
Author Affiliations
  • 1School of Optoelectronics, Beijing Institute of Technology, Beijing 100081, China
  • 2Key Laboratory of Photoelectronic Imaging Technology and System, Ministry of Education, Beijing Institute of Technology, Beijing 100081, China
  • 3Chongqing Innovation Center, Beijing Institute of Technology, Chongqing 401135, China

    Jianan Li, Ze Wang, Tingfa Xu. Three-Dimensional Object Detection Technology Based on Point Cloud Data[J]. Acta Optica Sinica, 2023, 43(15): 1515001

    Paper Information

    Category: Machine Vision

    Received: Mar. 29, 2023

    Accepted: Jun. 5, 2023

    Published Online: Aug. 3, 2023

    Corresponding Author Email: Xu Tingfa (ciom_xtf1@bit.edu.cn)

    DOI: 10.3788/AOS230745
