Laser & Optoelectronics Progress, Volume 57, Issue 4, 041021 (2020)

Backbone Network for Object Detection Task

Yalin Song* and Yanwei Pang
Author Affiliations
  • School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
    Figures & Tables(12)
    Network architecture
    Initial module
    Feature fusion module
    Mix down-sampling module
    Prediction modules. (a) Plain prediction module; (b) dense prediction module
    Qualitative detection results
    • Table 1. Comparison of different initial modules

      | Initial block | mAP /% | Speed /(frame·s⁻¹) |
      |---|---|---|
      | 7×7-s2 | 79.9 | 85 |
      | 3×3 conv-s1, 3×3 conv-s1, 3×3 conv-s2 | 80.7 | 85 |
      | 3×3 conv-s1, 3×3 conv-s2, 3×3 conv-s1 | 80.6 | 84 |
      | 3×3 conv-s2, 3×3 conv-s1, 3×3 conv-s1 | 81.0 | 85 |
    • Table 2. Comparison of different feature fusion methods

      | Feature fusion method | mAP /% | Speed /(frame·s⁻¹) |
      |---|---|---|
      | Without fusion | 80.4 | 91 |
      | Sum | 79.8 | 96 |
      | Concatenation + 1×1 conv | 81.0 | 85 |
    • Table 3. Comparison of different down-sampling modules

      | Down-sampling module | mAP /% | Speed /(frame·s⁻¹) |
      |---|---|---|
      | 3×3 conv-s2 | 79.8 | 81 |
      | 2×2 max pool-s2 | 79.8 | 84 |
      | Mix down-sampling | 81.0 | 85 |
    • Table 4. Comparison of different prediction modules

      | Prediction module | mAP /% | Speed /(frame·s⁻¹) |
      |---|---|---|
      | Plain prediction module | 80.1 | 89 |
      | Dense prediction module | 81.0 | 85 |
    • Table 5. Detection results of different backbone networks in SSD, DSOD, and RFBNet models

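      | Backbone network | Depth | Pre-train | SSD mAP /% | SSD speed /(frame·s⁻¹) | DSOD mAP /% | DSOD speed /(frame·s⁻¹) | RFBNet mAP /% | RFBNet speed /(frame·s⁻¹) |
      |---|---|---|---|---|---|---|---|---|
      | VGG | 16 |  | 77.5 | 130 | 78.1 | 79 | 78.9 | 81 |
      | VGGBN | 16 | × | 79.5 | 95 | 79.5 | 89 | 79.9 | 71 |
      | ResNet | 101 | × | 76.0 | 42 | 75.5 | 42 | 77.1 | 38 |
      | DenseNet | 121 | × | 74.6 | 37 | 75.1 | 32 | 75.3 | 29 |
      | DS/64-192-48-1 | 67 | × | 78.5 | 51 | 78.8 | 47 | 79.4 | 42 |
      | Root-ResNet-34 | 34 | × | 80.2 | 79 | 80.6 | 75 | 81.3 | 61 |
      | DNet | 25 | × | 80.1 | 89 | 81.0 | 85 | 80.5 | 65 |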
    • Table 6. Detection results of different detectors on the PASCAL VOC dataset

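      | Method | Pre-train | Backbone network | Input size /(pixel×pixel) | mAP /% | Speed /(frame·s⁻¹) |
      |---|---|---|---|---|---|
      | SSD[11] |  | VGG-16 | 300×300 | 77.2 | 46 |
      | SSD* |  | VGG-16 | 300×300 | 77.7 | 130 |
      | YOLOv2[26] |  | DarkNet-19 | 544×544 | 78.6 | 81 |
      | RFBNet[25] |  | VGG-16 | 300×300 | 80.5 | 83 |
      | DSSD[27] |  | ResNet-101 | 300×300 | 78.6 | 8 |
      | Faster R-CNN[8] |  | ResNet-101 | ~1000×600 | 76.4 | 2.4 |
      | RFCN[28] |  | ResNet-101 | ~1000×600 | 80.5 | 9 |
      | DSOD[19] | × | DS/64-192-48-1 | 300×300 | 77.7 | 17.4 |
      | ScratchDet[20] | × | Root-ResNet-34 | 300×300 | 80.4 | 17.8 |
      | Proposed | × | DNet | 300×300 | 81.0 | 85 |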

    Yalin Song, Yanwei Pang. Backbone Network for Object Detection Task[J]. Laser & Optoelectronics Progress, 2020, 57(4): 041021

    Paper Information

    Category: Image Processing

    Received: Jun. 10, 2019

    Accepted: Jul. 22, 2019

    Published Online: Feb. 20, 2020

    Author Email: Yalin Song (songyalin@tju.edu.cn)

    DOI:10.3788/LOP57.041021