Laser & Optoelectronics Progress, Vol. 61, Issue 12, 1228007 (2024)

Fast Two-Stage 3D Object Detection with Semantic Guidance

Mang Huang1,2,3,4, Bin Hui1,2,*, Zhaoji Liu1,2, and Tianming Jin1,2
Author Affiliations
  • 1. Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang 110016, Liaoning, China
  • 2. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, Liaoning, China
  • 3. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, Liaoning, China
  • 4. University of Chinese Academy of Sciences, Beijing 100049, China
    Figures & Tables(12)
    Network structure of FTS3D algorithm
    Inference and training process of semantic downsampling
    Aware pooling and channel fusion process
    GPU memory consumption of the semantic guidance layer for different numbers of points
    Comparison of foreground point preservation between farthest point sampling and semantic-guided sampling under different scenes. (a) Scene without tree cover; (b) intersection scene; (c) scene with tree cover
    Comparison of first-stage and second-stage detection performance
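The caption comparing farthest point sampling (FPS) with semantic-guided sampling contrasts two ways of choosing which points survive a downsampling layer. The following is a minimal NumPy sketch of that contrast, not the FTS3D implementation. `farthest_point_sampling` is the standard greedy FPS. `score_topk_sampling` is a hypothetical stand-in that keeps the points with the highest per-point foreground score, which is one common way to realize semantic guidance; this page does not confirm that FTS3D works this way. The toy scene and the scores are invented for illustration.

```python
import numpy as np

def farthest_point_sampling(points, k, start=0):
    """Greedy FPS: repeatedly pick the point farthest from those already picked."""
    n = points.shape[0]
    selected = np.empty(k, dtype=np.int64)
    min_dist = np.full(n, np.inf)  # squared distance to the nearest selected point
    idx = start
    for i in range(k):
        selected[i] = idx
        d = np.sum((points - points[idx]) ** 2, axis=1)
        min_dist = np.minimum(min_dist, d)
        idx = int(np.argmax(min_dist))
    return selected

def score_topk_sampling(scores, k):
    """Hypothetical semantic-guided sampling: keep the k highest-scoring points."""
    return np.argsort(-scores, kind="stable")[:k]

# Toy scene (assumed data): 990 scattered background points, 10 clustered foreground points.
rng = np.random.default_rng(0)
background = rng.uniform(-40.0, 40.0, size=(990, 3))
foreground = rng.normal(loc=[10.0, 5.0, 0.0], scale=0.5, size=(10, 3))
points = np.vstack([background, foreground])
is_fg = np.arange(len(points)) >= 990

# Idealized foreground scores; a real network would predict noisy scores.
scores = np.where(is_fg, 0.9, 0.1)

k = 10
fps_idx = farthest_point_sampling(points, k)
sem_idx = score_topk_sampling(scores, k)
print("foreground points kept by FPS:", int(is_fg[fps_idx].sum()))
print("foreground points kept by score top-k:", int(is_fg[sem_idx].sum()))
```

With idealized scores the top-k selection keeps every foreground point by construction, while FPS spreads its budget over the whole scene. Scores predicted by a learned layer would be noisier, so the gap in practice depends on how well that layer separates foreground from background.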
    • Table 1. Experimental environment

      | Configuration | Model / version |
      |---|---|
      | CPU | Intel Xeon Gold 6230 |
      | GPU | GeForce RTX 3090 |
      | Operating system | Ubuntu 16.04 |
      | CUDA version | CUDA 11.1 |
      | Deep learning framework | PyTorch 1.7.0 |
      | Programming language | Python 3.8 |
      | OpenPCDet | 0.5.2 |
    • Table 2. Quantitative comparison of different methods on the car category on KITTI validation set

      The Easy, Moderate, and Hard columns give the 3D AP for Car at IoU = 0.7, in %.

      | Method | Type | Modality | Easy | Moderate | Hard | mAP /% | Speed /(frame·s⁻¹) |
      |---|---|---|---|---|---|---|---|
      | MV3D [18] | 2-stage | RGB+LiDAR | 71.29 | 62.68 | 56.56 | 63.51 | 2.7 |
      | 3D-CVF [19] | 1-stage | RGB+LiDAR | 89.67 | 79.88 | 78.47 | 82.67 | 13.3 |
      | VoxelNet [8] | 1-stage | LiDAR | 81.97 | 65.46 | 62.85 | 70.09 | 4.5 |
      | SECOND [9] | 1-stage | LiDAR | 87.43 | 76.48 | 69.10 | 77.67 | 20 |
      | PointPillars [10] | 1-stage | LiDAR | - | 77.98 | - | 77.98 | 42.4 |
      | PV-RCNN [22] | 2-stage | LiDAR | - | 83.90 | - | 83.90 | 12.5 |
      | PointRCNN [12] | 2-stage | LiDAR | 88.88 | 78.63 | 77.38 | 81.63 | 10 |
      | 3DSSD [15] | 1-stage | LiDAR | 89.71 | 79.45 | 78.67 | 82.61 | 25 |
      | Pointformer [16] | 1-stage | LiDAR | 90.05 | 79.65 | 78.89 | 82.86 | - |
      | FTS3D | 2-stage | LiDAR | 89.02 | 83.25 | 78.10 | 83.45 | 55.6 |
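    • Table 3. Quantitative comparison of different methods under 3D view on KITTI test set

      The Ped and Cyc columns give the 3D AP for pedestrians and cyclists at IoU = 0.5, in %.

      | Method | Modality | Ped Easy | Ped Moderate | Ped Hard | Cyc Easy | Cyc Moderate | Cyc Hard | mAP (Ped) /% | mAP (Cyc) /% | Speed /(frame·s⁻¹) |
      |---|---|---|---|---|---|---|---|---|---|---|
      | MV3D [18] | RGB+LiDAR | - | - | - | - | - | - | - | - | 2.7 |
      | 3D-CVF [19] | RGB+LiDAR | - | - | - | - | - | - | - | - | 13.3 |
      | VoxelNet [8] | LiDAR | 39.48 | 33.69 | 31.50 | 61.22 | 48.36 | 44.37 | 34.89 | 51.31 | 4.5 |
      | SECOND [9] | LiDAR | 45.31 | 35.32 | 33.14 | 75.83 | 60.82 | 53.67 | 37.92 | 63.44 | 20 |
      | PointPillars [10] | LiDAR | 51.45 | 41.92 | 38.89 | 77.10 | 58.65 | 51.92 | 44.08 | 62.55 | 42.4 |
      | PV-RCNN [22] | LiDAR | 52.17 | 43.29 | 40.29 | 78.60 | 63.71 | 57.65 | 45.25 | 66.65 | 12.5 |
      | PointRCNN [12] | LiDAR | 47.98 | 39.37 | 36.01 | 74.96 | 58.82 | 52.53 | 40.93 | 62.10 | 10 |
      | 3DSSD [15] | LiDAR | - | - | - | - | - | - | - | - | 25 |
      | Pointformer [16] | LiDAR | 50.67 | 42.43 | 39.60 | 75.01 | 59.80 | 53.99 | 44.23 | 62.93 | - |
      | FTS3D | LiDAR | 49.42 | 40.03 | 37.27 | 78.36 | 62.73 | 56.34 | 42.24 | 65.81 | 55.6 |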
    • Table 4. Quantitative comparison of different methods on the cyclist category under BEV on KITTI test set

      The Easy, Moderate, and Hard columns give the BEV AP for cyclists at IoU = 0.5, in %.

      | Method | Type | Modality | Easy | Moderate | Hard | mAP /% | Speed /(frame·s⁻¹) |
      |---|---|---|---|---|---|---|---|
      | MV3D [18] | 2-stage | RGB+LiDAR | - | - | - | - | 2.7 |
      | 3D-CVF [19] | 1-stage | RGB+LiDAR | - | - | - | - | 13.3 |
      | VoxelNet [8] | 1-stage | LiDAR | - | - | - | - | 4.5 |
      | SECOND [9] | 1-stage | LiDAR | 76.50 | 56.05 | 49.45 | 60.67 | 20 |
      | PointPillars [10] | 1-stage | LiDAR | 79.90 | 62.73 | 55.58 | 66.07 | 42.4 |
      | PV-RCNN [22] | 2-stage | LiDAR | 82.49 | 68.89 | 62.41 | 71.26 | 12.5 |
      | PointRCNN [12] | 2-stage | LiDAR | 82.56 | 67.24 | 60.28 | 70.02 | 10 |
      | 3DSSD [15] | 1-stage | LiDAR | - | - | - | - | 25 |
      | Pointformer [16] | 2-stage | LiDAR | - | - | - | - | - |
      | FTS3D | 2-stage | LiDAR | 86.04 | 71.26 | 63.65 | 73.65 | 55.6 |
    • Table 5. Comparison of experimental results for different sampling methods

      | Sampling layer 1 | Sampling layer 2 | Sampling layer 3 | Sampling layer 4 | AP for Car Moderate (IoU = 0.7) /% | Enhancement /percentage points |
      |---|---|---|---|---|---|
      | Random | Random | Random | Random | 66.44 | - |
      | Random | Random | Random | SS | 67.52 | 1.08 |
      | Random | Random | SS | SS | 69.39 | 2.95 |
      | FPS | FPS | FPS | FPS | 79.41 | - |
      | FPS | FPS | FPS | SS | 80.12 | 0.71 |
      | FPS | FPS | SS | SS | 83.25 | 3.84 |
    • Table 6. Ablation study results

      | Baseline | SS | RoI pooling | CA | AP for Car Moderate (IoU = 0.7) /% | Enhancement /percentage points |
      |---|---|---|---|---|---|
      |  |  |  |  | 77.46 | - |
      |  |  |  |  | 77.74 | 0.28 |
      |  |  |  |  | 79.36 | 1.90 |
      |  |  |  |  | 82.95 | 5.49 |
      |  |  |  |  | 83.25 | 5.79 |
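The derived columns in the tables above can be checked with simple arithmetic. The sketch below assumes that mAP is the arithmetic mean of the Easy, Moderate, and Hard AP values, and that each enhancement is the difference from the first row of its group (all-FPS in Table 5, baseline in Table 6). Both assumptions are inferred from the listed numbers, not stated on this page. The helper names `mean_ap` and `enhancement` are illustrative only.

```python
def mean_ap(easy, moderate, hard):
    """Arithmetic mean of the three difficulty-level APs, in percent."""
    return (easy + moderate + hard) / 3.0

def enhancement(ap, baseline_ap):
    """Gain over a baseline AP, in percentage points, rounded to two decimals."""
    return round(ap - baseline_ap, 2)

# (easy, moderate, hard, listed mAP) copied from Table 2.
table2_rows = {
    "SECOND": (87.43, 76.48, 69.10, 77.67),
    "PointRCNN": (88.88, 78.63, 77.38, 81.63),
    "FTS3D": (89.02, 83.25, 78.10, 83.45),
}
for name, (easy, moderate, hard, listed) in table2_rows.items():
    computed = mean_ap(easy, moderate, hard)
    print(f"{name}: computed {computed:.2f}, listed {listed:.2f}")

# Table 5: gains relative to the all-FPS configuration (79.41).
fps_baseline = 79.41
print("FPS, FPS, FPS, SS:", enhancement(80.12, fps_baseline))
print("FPS, FPS, SS, SS:", enhancement(83.25, fps_baseline))

# Table 6: gain of the full model relative to the baseline (77.46).
print("full model:", enhancement(83.25, 77.46))
```

Under these assumptions the listed values are reproduced to within 0.01. The FTS3D mean computes to 83.46 against a listed 83.45, which would be consistent with truncation rather than rounding in the source table.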
    Citation

    Mang Huang, Bin Hui, Zhaoji Liu, Tianming Jin. Fast Two-Stage 3D Object Detection with Semantic Guidance[J]. Laser & Optoelectronics Progress, 2024, 61(12): 1228007

    Paper Information

    Category: Remote Sensing and Sensors

    Received: Jul. 19, 2023

    Accepted: Sep. 6, 2023

    Published Online: Jun. 5, 2024

    The Author Email: Bin Hui (huibin@sia.cn)

    DOI:10.3788/LOP231763

    CSTR:32186.14.LOP231763
