Acta Optica Sinica, Volume 40, Issue 9, 0915005 (2020)

3D Object Detection Based on Iterative Self-Training

Kangru Wang1,2,*, Jingang Tan1,2, Liang Du3, Lili Chen1, Jiamao Li1, and Xiaolin Zhang1
Author Affiliations
  • 1Bionic Vision System Laboratory, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
  • 2University of Chinese Academy of Sciences, Beijing 100049, China
  • 3Key Laboratory of Computational Neuroscience and Brain Inspired Intelligence, Ministry of Education, Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
    Figures & Tables (13)
    Flow chart of 3D object detection system
    Architectural diagram of IST-Net
    Flow chart of iterative self-training
    Architectural diagram of SAFF-3DOD Net
    Diagram of SAFFM
    Qualitative comparison of baseline and our method on estimated disparity map. (a) RGB left image; (b) PSMNET method; (c) our disparity estimation method
    Qualitative comparison of baseline and our method on estimated point cloud. (a) RGB left image; (b) PSMNET method; (c) our disparity estimation method
    Qualitative comparison of 3D object detection results. (a) Pseudo-LiDAR; (b) our method
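The captions above trace the pseudo-LiDAR pipeline evaluated in this paper: a stereo network estimates a disparity map, the disparity map is lifted to a 3D point cloud, and a LiDAR-style detector predicts 3D boxes from that point cloud. As a reading aid only, the sketch below shows the standard disparity-to-point-cloud back-projection used by pseudo-LiDAR-style methods; the camera parameters in the usage example are illustrative placeholders, not calibration values from this paper.

```python
import numpy as np

def disparity_to_point_cloud(disparity, f_u, f_v, c_u, c_v, baseline):
    """Back-project a dense disparity map into a pseudo-LiDAR point cloud.

    Standard stereo geometry: depth Z = f_u * baseline / disparity, then each
    pixel (u, v) is lifted to camera coordinates (X, Y, Z).
    """
    h, w = disparity.shape
    v, u = np.indices((h, w))              # pixel coordinate grid
    valid = disparity > 0                  # skip pixels with no valid disparity
    z = f_u * baseline / disparity[valid]  # depth from disparity
    x = (u[valid] - c_u) * z / f_u         # lateral (X) coordinate
    y = (v[valid] - c_v) * z / f_v         # vertical (Y) coordinate
    return np.stack([x, y, z], axis=1)     # (N, 3) point cloud

# Usage with a random disparity map and KITTI-like placeholder calibration
disp = np.random.uniform(1.0, 96.0, size=(375, 1242)).astype(np.float32)
points = disparity_to_point_cloud(disp, f_u=721.5, f_v=721.5,
                                  c_u=609.6, c_v=172.9, baseline=0.54)
print(points.shape)  # (N, 3)
```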
    • Table 1. Detailed configuration of SAFFM (a hedged code sketch of the detection-network configuration is given after Table 5)

      Parameter | Layer setting (SAFFM in region proposal network) | Output dimension (SAFFM in region proposal network) | Layer setting (SAFFM in detection network) | Output dimension (SAFFM in detection network)
      F_RGB / F_BEV | -- | 3×3×1 | -- | 7×7×32
      L0 | -- | -- | 1, 1×1 | 7×7×1
      I_RGB / I_BEV | -- | 9×1×1 | -- | 49×1×1
      L1 | 36, 1×1 | 36×1×1 | 98, 1×1 | 98×1×1
      L2 | 36, 1×1 | 36×1×1 | 98, 1×1 | 98×1×1
      L3 | 18, 1×1 | 18×1×1 | 49, 1×1 | 49×1×1
      L4 | 9, 1×1 | 9×1×1 | 49, 1×1 | 49×1×1
      Sigmoid | -- | 9×1×1 | -- | 49×1×1
      Spatial-attention map | -- | 3×3×1 | -- | 7×7×1
      Weighted F_RGB and weighted F_BEV | -- | 3×3×1 | -- | 7×7×32
      F_output | -- | 3×3×1 | -- | 7×7×32
    • Table 2. Quantitative comparison of disparity estimation on KITTI 3D object detection validation set

      Method | Object region disparity error rate /% | Background region disparity error rate /% | Global image disparity error rate /%
      PSMNET (base) | 8.96 | 4.35 | 5.49
      Ours (IST) | 8.69 | 4.18 | 5.27
      Ours (SOL) | 8.72 | 4.20 | 5.30
      Ours (IST+SOL) | 8.60 | 4.17 | 5.25
    • Table 3. Quantitative comparison of disparity estimation on KITTI stereo matching validation set

      Method | Object region disparity error rate /% | Background region disparity error rate /% | Global image disparity error rate /%
      Stereonet | 11.14 | 5.23 | 6.99
      PSMNET | 7.23 | 3.33 | 4.44
      Ours (IST+SOL) | 6.83 | 3.20 | 4.27
    • Table 4. Quantitative comparison of 3D object detection on the KITTI 3D object detection validation set (each cell gives A_BEV/A_3D, both in %)

      Method | Easy (IoU = 0.5) | Moderate (IoU = 0.5) | Hard (IoU = 0.5) | Easy (IoU = 0.7) | Moderate (IoU = 0.7) | Hard (IoU = 0.7)
      Pseudo-LiDAR (base) | 92.1/91.6 | 78.3/75.3 | 66.7/63.8 | 75.6/61.5 | 55.6/43.3 | 48.3/36.8
      Ours (IST) | 92.1/91.0 | 80.4/77.4 | 70.8/67.8 | 77.5/61.3 | 59.5/43.3 | 50.6/36.9
      Ours (SOL) | 92.3/91.5 | 80.6/75.9 | 69.1/66.2 | 78.8/63.1 | 58.2/43.7 | 50.1/37.4
      Ours (IST+SOL) | 92.1/91.4 | 81.0/78.0 | 69.2/66.3 | 78.4/63.5 | 59.6/45.0 | 50.8/38.6
      Ours (SAFF) | 92.0/91.5 | 78.3/75.4 | 68.5/65.5 | 77.7/63.0 | 57.1/43.3 | 48.6/37.0
      Ours | 94.5/92.5 | 81.6/78.6 | 73.6/70.7 | 80.9/65.8 | 60.7/46.1 | 52.3/39.4
    • Table 5. 3D object detection results on the KITTI test benchmark (each cell gives A_BEV/A_3D, both in %)

      Method | Input | Easy | Moderate | Hard
      MonoPSR [7] | Monocular | 18.33/10.76 | 12.58/7.25 | 9.91/5.85
      Mono3D_PLiDAR [8] | Monocular | 21.27/10.76 | 13.92/7.50 | 11.25/6.10
      TopNet-HighRes [2] | LiDAR | 67.84/12.67 | 53.05/9.28 | 46.99/7.95
      M3D-RPN [9] | Monocular | 21.02/14.76 | 13.67/9.71 | 10.23/7.42
      AM3D [10] | Monocular | 25.03/16.50 | 17.32/10.47 | 14.91/9.52
      RT3D [3] | LiDAR | 56.44/23.74 | 44.00/19.14 | 42.34/18.86
      RT3DStereo [13] | Stereo | 58.81/29.90 | 46.82/23.28 | 38.38/18.96
      Stereo R-CNN [27] | Stereo | 61.92/47.58 | 41.31/30.23 | 33.42/23.72
      Pseudo-LiDAR [12] | Stereo | 67.30/54.53 | 45.00/34.05 | 38.40/28.25
      Ours | Stereo | 71.47/58.70 | 49.61/37.92 | 42.71/31.99
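Table 1 gives only layer shapes for SAFFM (the spatial-attention feature fusion module in the SAFFM diagram listed above). As a reading aid for the detection-network column of that table, the sketch below shows one plausible spatial-attention fusion that reproduces those dimensions: 7×7×32 RGB and BEV feature crops, a 1-channel 1×1 squeeze (L0), layers L1-L4 with output sizes 98, 98, 49, and 49, a sigmoid, and a 7×7×1 spatial-attention map that weights both crops. Sharing L0 between the RGB and BEV branches, concatenating I_RGB and I_BEV before L1, and fusing the weighted crops by addition are assumptions of this sketch, not details stated in the table.

```python
import torch
import torch.nn as nn

class SAFFMSketch(nn.Module):
    """Hedged reading of the detection-network column of Table 1."""

    def __init__(self, spatial=7, channels=32):
        super().__init__()
        n = spatial * spatial                            # 49 positions in a 7x7 crop
        self.spatial = spatial
        self.l0 = nn.Conv2d(channels, 1, kernel_size=1)  # L0: 7x7x32 -> 7x7x1
        self.l1 = nn.Linear(2 * n, 2 * n)                # L1: 98 -> 98
        self.l2 = nn.Linear(2 * n, 2 * n)                # L2: 98 -> 98
        self.l3 = nn.Linear(2 * n, n)                    # L3: 98 -> 49
        self.l4 = nn.Linear(n, n)                        # L4: 49 -> 49
        self.act = nn.ReLU(inplace=True)

    def forward(self, f_rgb, f_bev):
        b = f_rgb.size(0)
        i_rgb = self.l0(f_rgb).flatten(1)                # I_RGB: (B, 49)
        i_bev = self.l0(f_bev).flatten(1)                # I_BEV: (B, 49)
        x = torch.cat([i_rgb, i_bev], dim=1)             # assumed concatenation: (B, 98)
        x = self.act(self.l1(x))
        x = self.act(self.l2(x))
        x = self.act(self.l3(x))
        attn = torch.sigmoid(self.l4(x))                 # (B, 49) attention vector
        attn = attn.view(b, 1, self.spatial, self.spatial)  # 7x7x1 spatial-attention map
        return f_rgb * attn + f_bev * attn               # assumed fusion of weighted crops

# Usage: fuse one 7x7x32 RGB crop with one 7x7x32 BEV crop
fused = SAFFMSketch()(torch.randn(1, 32, 7, 7), torch.randn(1, 32, 7, 7))
print(fused.shape)  # torch.Size([1, 32, 7, 7]), matching F_output in Table 1
```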
    Citation: Kangru Wang, Jingang Tan, Liang Du, Lili Chen, Jiamao Li, Xiaolin Zhang. 3D Object Detection Based on Iterative Self-Training[J]. Acta Optica Sinica, 2020, 40(9): 0915005

    Paper Information

    Category: Machine Vision

    Received: Nov. 25, 2019

    Accepted: Feb. 10, 2020

    Published Online: May 6, 2020

    Author Email: Kangru Wang (wangkangru@mail.sim.ac.cn)

    DOI: 10.3788/AOS202040.0915005
