Laser & Optoelectronics Progress, Vol. 60, Issue 2, 0215005 (2023)

Target Detection Model Based on Once Bidirectional Feature Pyramid Network

Yunchuan Zhang, Lin Jiang*, and Li Lin
Author Affiliations
  • Faculty of Science, Kunming University of Science and Technology, Kunming 650500, Yunnan, China
    Figures & Tables(15)
    SSD model framework
    Proposed model framework
    Once Bi-FP module
    Top-to-bottom feature fusion module
    Bottom-to-top feature fusion module
    Prediction module
    FSSD model framework
    Comparison of average precision of object detection models on the PASCAL VOC2007 test set
    Comparison of detection results between OBSSD model and SSD* model. (a) cow; (b) car, boat; (c) bird, potted plants
    • Table 1. SSD backbone network structure

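      | Block    | Layer    | Operation  | Specific operational detail | Output feature size |
      |----------|----------|------------|-----------------------------|---------------------|
      | Block 1  | Conv1_1  | Conv, Act  | k=3, p=1; ReLU              | 300×300×64          |
      |          | Conv1_2  | Conv, Act  | k=3, p=1; ReLU              | 300×300×64          |
      | Block 2  | Pooling1 | MaxPooling | k=2, s=2                    | 150×150×64          |
      |          | Conv2_1  | Conv, Act  | k=3, p=1; ReLU              | 150×150×128         |
      |          | Conv2_2  | Conv, Act  | k=3, p=1; ReLU              | 150×150×128         |
      | Block 3  | Pooling2 | MaxPooling | k=2, s=2                    | 75×75×128           |
      |          | Conv3_1  | Conv, Act  | k=3, p=1; ReLU              | 75×75×256           |
      |          | Conv3_2  | Conv, Act  | k=3, p=1; ReLU              | 75×75×256           |
      |          | Conv3_3  | Conv, Act  | k=3, p=1; ReLU              | 75×75×256           |
      | Block 4  | Pooling3 | MaxPooling | k=2, s=2                    | 38×38×256           |
      |          | Conv4_1  | Conv, Act  | k=3, p=1; ReLU              | 38×38×512           |
      |          | Conv4_2  | Conv, Act  | k=3, p=1; ReLU              | 38×38×512           |
      |          | Conv4_3  | Conv, Act  | k=3, p=1; ReLU              | 38×38×512           |
      | Block 5  | Pooling4 | MaxPooling | k=2, s=2                    | 19×19×512           |
      |          | Conv5_1  | Conv, Act  | k=3, p=1; ReLU              | 19×19×512           |
      |          | Conv5_2  | Conv, Act  | k=3, p=1; ReLU              | 19×19×512           |
      |          | Conv5_3  | Conv, Act  | k=3, p=1; ReLU              | 19×19×512           |
      | Block 6  | Pooling5 | MaxPooling | k=2, s=1, p=1               | 19×19×512           |
      |          | Conv6    | Conv, Act  | k=3, p=6, d=6; ReLU         | 19×19×1024          |
      |          | Conv7    | Conv, Act  | k=1; ReLU                   | 19×19×1024          |
      | Block 7  | Conv8_1  | Conv, Act  | k=1; ReLU                   | 19×19×256           |
      |          | Conv8_2  | Conv, Act  | k=3, s=2, p=1; ReLU         | 10×10×512           |
      | Block 8  | Conv9_1  | Conv, Act  | k=1; ReLU                   | 10×10×128           |
      |          | Conv9_2  | Conv, Act  | k=3, s=2, p=1; ReLU         | 5×5×256             |
      | Block 9  | Conv10_1 | Conv, Act  | k=1; ReLU                   | 5×5×128             |
      |          | Conv10_2 | Conv, Act  | k=3, p=1; ReLU              | 3×3×256             |
      | Block 10 | Conv11_1 | Conv, Act  | k=1; ReLU                   | 3×3×128             |
      |          | Conv11_2 | Conv, Act  | k=3, p=1; ReLU              | 1×1×256             |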
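
The spatial sizes in Table 1 follow from the usual convolution and pooling arithmetic. The sketch below is not taken from the paper: `out_size` is a hypothetical helper, k, s, p, and d are read as kernel size, stride, padding, and dilation, and only a selection of rows (the stride-2 layers and the dilated Conv6) is checked. Rounding up for Pooling3 is an assumption, made because 75 → 38 cannot be obtained by rounding down.

```python
import math

def out_size(n, k, s=1, p=0, d=1, ceil_mode=False):
    """Output length along one axis for a conv/pool layer (hypothetical helper)."""
    steps = (n + 2 * p - d * (k - 1) - 1) / s
    return (math.ceil(steps) if ceil_mode else math.floor(steps)) + 1

# (layer, input size, layer parameters, output size listed in Table 1)
rows = [
    ("Pooling1", 300, dict(k=2, s=2), 150),
    ("Pooling2", 150, dict(k=2, s=2), 75),
    ("Pooling3", 75, dict(k=2, s=2, ceil_mode=True), 38),  # assumption: rounds up
    ("Pooling4", 38, dict(k=2, s=2), 19),
    ("Conv6", 19, dict(k=3, p=6, d=6), 19),                # dilated convolution
    ("Conv8_2", 19, dict(k=3, s=2, p=1), 10),
    ("Conv9_2", 10, dict(k=3, s=2, p=1), 5),
]

for name, n, params, listed in rows:
    print(f"{name}: computed {out_size(n, **params)}, listed {listed}")
```

With rounding down instead, Pooling3 would give 37, which is why the rounding mode matters for reproducing the 38×38 Conv4_3 feature map.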
    • Table 2. Number of prior frames of a single grid on effective feature layer

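      | Effective feature layer | Size  | Number of prior frames per grid |
      |-------------------------|-------|---------------------------------|
      | Conv4_3                 | 38×38 | 4                               |
      | Conv7                   | 19×19 | 6                               |
      | Conv8_2                 | 10×10 | 6                               |
      | Conv9_2                 | 5×5   | 6                               |
      | Conv10_2                | 3×3   | 4                               |
      | Conv11_2                | 1×1   | 4                               |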
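
Table 2 fixes the total number of prior frames the detector evaluates per image: each layer contributes (grid cells) × (prior frames per grid). A minimal sketch of that arithmetic, with the variable names chosen here for illustration:

```python
# (effective feature layer, grid side length, prior frames per grid), from Table 2
prior_layers = [
    ("Conv4_3", 38, 4),
    ("Conv7", 19, 6),
    ("Conv8_2", 10, 6),
    ("Conv9_2", 5, 6),
    ("Conv10_2", 3, 4),
    ("Conv11_2", 1, 4),
]

# Each layer has side * side grid cells, each carrying `per_grid` prior frames.
per_layer = {name: side * side * per_grid for name, side, per_grid in prior_layers}
total_priors = sum(per_layer.values())

for name, count in per_layer.items():
    print(f"{name}: {count}")
print("total:", total_priors)  # 8732
```

Conv4_3 alone accounts for 5776 of the 8732 prior frames, so the shallowest layer dominates the candidate set.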
    • Table 3. Training strategies

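      | Stage | Optimizer | Batch_size | Freeze_train | Initial_Lr | Lr_scheduler      | Epoch |
      |-------|-----------|------------|--------------|------------|-------------------|-------|
      | 1     | Adam      | 32         | True         | 0.0005     | ReduceLROnPlateau | 50    |
      |       | Adam      | 16         | False        | 0.0001     | ReduceLROnPlateau | 150   |
      | 2     | SGD-M     | 32         | True         | 0.001      | MultiStepLR       | 50    |
      |       | SGD-M     | 16         | False        | 0.001      | MultiStepLR       | 50    |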
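
The four rows of Table 3 can be written down as a plain configuration list, which makes the phase order and the total training length explicit. This is only a sketch: the `Phase` type and its field names are hypothetical and simply mirror the table headers, and the values are copied from the table without interpreting what each optimizer or scheduler setting does beyond what is listed.

```python
from typing import NamedTuple

class Phase(NamedTuple):
    """One row of Table 3 (hypothetical container, fields mirror the headers)."""
    stage: int
    optimizer: str
    batch_size: int
    freeze_train: bool
    initial_lr: float
    lr_scheduler: str
    epochs: int

SCHEDULE = [
    Phase(1, "Adam", 32, True, 0.0005, "ReduceLROnPlateau", 50),
    Phase(1, "Adam", 16, False, 0.0001, "ReduceLROnPlateau", 150),
    Phase(2, "SGD-M", 32, True, 0.001, "MultiStepLR", 50),
    Phase(2, "SGD-M", 16, False, 0.001, "MultiStepLR", 50),
]

total_epochs = sum(phase.epochs for phase in SCHEDULE)
print("total epochs:", total_epochs)  # 300
for phase in SCHEDULE:
    mode = "frozen" if phase.freeze_train else "unfrozen"
    print(f"stage {phase.stage}: {phase.optimizer}, {mode}, {phase.epochs} epochs")
```

Read this way, each stage starts with a frozen phase at batch size 32 and continues with an unfrozen phase at batch size 16.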
    • Table 4. Comparison results of detection accuracy and detection speed on PASCAL VOC2007 test set

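      | Method              | Dataset  | Backbone        | Input size | FPS  | mAP /% |
      |---------------------|----------|-----------------|------------|------|--------|
      | Faster [4]          | VOC07+12 | VGG16           | 600×1000   | 7    | 73.2   |
      | SSD (baseline) [10] | VOC07+12 | VGG16           | 300×300    | 59   | 74.3   |
      | SSD* [10]           | VOC07+12 | VGG16           | 300×300    | 52.6 | 76.9   |
      | DSSD [11]           | VOC07+12 | ResNet-101      | 321×321    | 13.6 | 78.6   |
      | DSOD [29]           | VOC07+12 | DS/64-192-48-1  | 300×300    | 17.4 | 77.7   |
      | RSSD [12]           | VOC07+12 | VGG16           | 300×300    | 35   | 78.5   |
      | FSSD [30]           | VOC07+12 | VGG16           | 300×300    | 65.8 | 78.8   |
      | ESSD [31]           | VOC07+12 | VGG16           | 300×300    | 25   | 79.4   |
      | FASSD [32]          | VOC07+12 | ResNet-50       | 300×300    | 30   | 78.1   |
      | DFSSD [33]          | VOC07+12 | DenseNet-S-32-1 | 300×300    | 11.6 | 78.9   |
      | FDSSD [17]          | VOC07+12 | VGG16           | 300×300    | 12.6 | 79.1   |
      | OBSSD               | VOC07+12 | VGG16           | 300×300    | 41.7 | 80.8   |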
    • Table 5. Comparison of average precision results of 20 categories in PASCAL VOC2007 test set

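      | Method              | mAP /% | aero | bicycle | bird | boat | bottle | bus  | car  | cat  | chair | cow  |
      |---------------------|--------|------|---------|------|------|--------|------|------|------|-------|------|
      | Faster [4]          | 73.2   | 76.5 | 79.0    | 70.9 | 65.5 | 52.1   | 83.1 | 84.7 | 86.4 | 52.0  | 81.9 |
      | SSD (baseline) [10] | 74.3   | 75.5 | 80.2    | 72.3 | 66.3 | 47.6   | 83.0 | 84.2 | 86.1 | 54.7  | 78.3 |
      | SSD* [10]           | 76.9   | 76.9 | 86.6    | 74.5 | 66.4 | 50.4   | 85.0 | 84.7 | 87.3 | 61.0  | 78.7 |
      | DSSD [11]           | 78.6   | 81.9 | 84.9    | 80.5 | 68.4 | 53.9   | 85.6 | 86.2 | 88.9 | 61.1  | 83.5 |
      | ESSD [31]           | 79.4   | 82.6 | 86.1    | 79.8 | 72.2 | 54.7   | 86.8 | 86.9 | 88.2 | 62.8  | 85.2 |
      | OBSSD               | 80.8   | 82.7 | 89.7    | 81.5 | 71.8 | 53.7   | 90.7 | 90.0 | 90.6 | 64.8  | 86.2 |

      | Method              | mAP /% | table | dog  | horse | mbike | person | plant | sheep | sofa | train | tv   |
      |---------------------|--------|-------|------|-------|-------|--------|-------|-------|------|-------|------|
      | Faster [4]          | 73.2   | 65.7  | 84.8 | 84.6  | 77.5  | 76.7   | 38.8  | 73.6  | 73.9 | 83.0  | 72.6 |
      | SSD (baseline) [10] | 74.3   | 73.9  | 84.5 | 85.3  | 82.6  | 76.2   | 48.6  | 73.9  | 76.0 | 83.4  | 74.0 |
      | SSD* [10]           | 76.9   | 78.2  | 86.1 | 89.4  | 86.0  | 79.8   | 48.5  | 76.1  | 80.3 | 86.9  | 76.1 |
      | DSSD [11]           | 78.6   | 78.7  | 86.7 | 88.7  | 86.7  | 79.7   | 51.7  | 78.0  | 80.9 | 87.2  | 79.4 |
      | ESSD [31]           | 79.4   | 78.2  | 87.5 | 88.0  | 87.0  | 80.0   | 56.1  | 80.2  | 80.4 | 88.7  | 78.1 |
      | OBSSD               | 80.8   | 77.3  | 87.9 | 90.0  | 88.1  | 82.0   | 54.2  | 80.5  | 83.1 | 90.2  | 80.0 |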
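
The mAP column of Table 5 is the unweighted mean of the 20 per-category average precisions. As a quick consistency check, the sketch below recomputes it for the OBSSD row; the dictionary keys are the column headers of Table 5 and the variable names are illustrative.

```python
# Per-category AP (%) for OBSSD, copied from Table 5
obssd_ap = {
    "aero": 82.7, "bicycle": 89.7, "bird": 81.5, "boat": 71.8, "bottle": 53.7,
    "bus": 90.7, "car": 90.0, "cat": 90.6, "chair": 64.8, "cow": 86.2,
    "table": 77.3, "dog": 87.9, "horse": 90.0, "mbike": 88.1, "person": 82.0,
    "plant": 54.2, "sheep": 80.5, "sofa": 83.1, "train": 90.2, "tv": 80.0,
}

# mAP is the plain average over categories, with no weighting by class frequency.
mean_ap = sum(obssd_ap.values()) / len(obssd_ap)
print(f"mAP over {len(obssd_ap)} categories: {mean_ap:.2f}")

# Categories furthest below the mean show where the remaining error concentrates.
weakest = sorted(obssd_ap, key=obssd_ap.get)[:3]
print("lowest AP:", weakest)
```

The recomputed mean agrees with the 80.8 reported in the table to within rounding of the per-category values.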
    • Table 6. Results of ablation experiment

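      | Model    | mAP@0.3 /% | mAP@0.5 /% | Size /MB | FPS  |
      |----------|------------|------------|----------|------|
      | SSD [10] | -          | 74.3       | 25.1     | 59   |
      | SSD* [10]| 80.8       | 76.9       | 25.1     | 52.6 |
      | PMSSD*   | 82.9       | 78.2       | 25.6     | 48.2 |
      | OBMSSD*  | 84.2       | 80.1       | 25.8     | 44.3 |
      | OBSSD*   | 85.2       | 80.8       | 27.4     | 41.7 |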
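
An ablation table is easiest to read as step-to-step differences. The sketch below takes the rows of Table 6 from SSD* onward (the SSD row has no mAP@0.3 entry) and prints, for each variant, the change in mAP@0.5 and in FPS relative to the row above it. It only rearranges the published numbers; the variable names are illustrative.

```python
# (model, mAP@0.5 in %, FPS), copied from Table 6 in row order
ablation = [
    ("SSD*", 76.9, 52.6),
    ("PMSSD*", 78.2, 48.2),
    ("OBMSSD*", 80.1, 44.3),
    ("OBSSD*", 80.8, 41.7),
]

map_gains = []
fps_changes = []
for (_, prev_map, prev_fps), (name, cur_map, cur_fps) in zip(ablation, ablation[1:]):
    # Round to one decimal, the precision the table reports.
    map_gains.append(round(cur_map - prev_map, 1))
    fps_changes.append(round(cur_fps - prev_fps, 1))
    print(f"{name}: mAP@0.5 {map_gains[-1]:+.1f}, FPS {fps_changes[-1]:+.1f}")

total_gain = round(ablation[-1][1] - ablation[0][1], 1)
print(f"SSD* -> OBSSD*: mAP@0.5 {total_gain:+.1f}")
```

Over the three steps, mAP@0.5 rises by 3.9 points in total while the frame rate falls from 52.6 to 41.7 FPS.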
    Yunchuan Zhang, Lin Jiang, Li Lin. Target Detection Model Based on Once Bidirectional Feature Pyramid Network[J]. Laser & Optoelectronics Progress, 2023, 60(2): 0215005

    Paper Information

    Category: Machine Vision

    Received: Jan. 17, 2022

    Accepted: Mar. 14, 2022

    Published Online: Feb. 7, 2023

    The Author Email: Lin Jiang (tojianglin@126.com)

    DOI:10.3788/LOP220555
