Laser & Optoelectronics Progress, Vol. 61, Issue 24, 2428010 (2024)

Remote Sensing Small Target Detection Based on Multimodal Fusion

Fanfan Liu1,2, Chengmei Zhu2, Nana Zhao2, and Jinghua Wu2,*
Author Affiliations
  • 1School of Mechanical and Electrical Engineering, Anhui Jianzhu University, Hefei 230601, Anhui, China
  • 2Changzhou Institute of Advanced Manufacturing Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Changzhou 213164, Jiangsu, China
    Figures & Tables(14)
    Structure of YOLOv5 algorithm
    Structure of improved YOLOv5 algorithm
    Structure of MF module
    Structure of SE module
    Structure of RFSA module
    IoU value of ground truth with the same size and different shapes
    Schematic diagram of Shape-IoU
    Detected results of VEDAI and NWPU datasets. (a) VEDAI dataset; (b) NWPU dataset
    Comparison of detection results between the baseline model and the improved model
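
The figure list above refers to the IoU of ground-truth boxes that have the same size but different shapes. As a rough illustration only, the sketch below computes plain IoU (not the paper's Shape-IoU, whose formula does not appear in this listing) for two hypothetical boxes of equal area that are displaced by the same one-pixel offset. The box coordinates, the `iou` and `shift` helpers, and the reading of "same size" as "same area" are all assumptions made for the example.

```python
def iou(a, b):
    """Plain intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shift(box, dx, dy):
    """Return a copy of the box translated by (dx, dy)."""
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)


# Hypothetical ground-truth boxes: both cover 16 square pixels.
square = (0.0, 0.0, 4.0, 4.0)  # 4 x 4
wide = (0.0, 0.0, 8.0, 2.0)    # 8 x 2

# The same one-pixel vertical offset is applied to both boxes.
iou_square = iou(square, shift(square, 0.0, 1.0))
iou_wide = iou(wide, shift(wide, 0.0, 1.0))

print(f"square 4x4, shifted 1 px in y: IoU = {iou_square:.3f}")
print(f"wide   8x2, shifted 1 px in y: IoU = {iou_wide:.3f}")
```

With these made-up boxes the square keeps an IoU of 0.60 while the wide box drops to about 0.33, which shows why the same localization error can be scored differently depending on box shape.
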
    • Table 1. Ablation experiment results

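      | Number | Method | Parameters /M | FPS /(frame·s⁻¹) | P /% | R /% | mAP@0.5 /% |
      |---|---|---|---|---|---|---|
      | 1 | YOLOv5(Focus) | 4.83 | 41 | 41.63 | 67.41 | 54.93 |
      | 2 | YOLOv5(no Focus) | 4.83 | 37 | 68.91 | 60.22 | 64.43 |
      | 3 | YOLOv5+MF | 4.85 | 49 | 80.45 | 63.41 | 67.34 |
      | 4 | YOLOv5+RFSA | 4.83 | 38 | 70.19 | 66.27 | 68.31 |
      | 5 | YOLOv5+Shape-IoU | 4.83 | 38 | 73.31 | 64.66 | 68.19 |
      | 6 | YOLOv5+MF+RFSA | 4.84 | 48 | 67.71 | 67.48 | 66.24 |
      | 7 | YOLOv5+MF+Shape-IoU | 4.85 | 47 | 68.26 | 65.18 | 68.43 |
      | 8 | YOLOv5+RFSA+Shape-IoU | 4.83 | 37 | 78.58 | 65.45 | 70.58 |
      | 9 | YOLOv5+MF+RFSA+Shape-IoU | 4.85 | 48 | 69.16 | 72.70 | 72.83 |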
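
To make the ablation easier to read, the following sketch computes each variant's mAP@0.5 gain over a reference row. The values are transcribed from Table 1. Taking YOLOv5(no Focus) as the reference is an assumption, motivated by Table 4 listing YOLOv5 at 64.43 %; the names `ABLATION`, `BASELINE`, and `map_gain` are illustrative only.

```python
# Rows transcribed from Table 1: method -> (P %, R %, mAP@0.5 %).
ABLATION = {
    "YOLOv5(Focus)": (41.63, 67.41, 54.93),
    "YOLOv5(no Focus)": (68.91, 60.22, 64.43),
    "YOLOv5+MF": (80.45, 63.41, 67.34),
    "YOLOv5+RFSA": (70.19, 66.27, 68.31),
    "YOLOv5+Shape-IoU": (73.31, 64.66, 68.19),
    "YOLOv5+MF+RFSA": (67.71, 67.48, 66.24),
    "YOLOv5+MF+Shape-IoU": (68.26, 65.18, 68.43),
    "YOLOv5+RFSA+Shape-IoU": (78.58, 65.45, 70.58),
    "YOLOv5+MF+RFSA+Shape-IoU": (69.16, 72.70, 72.83),
}

# Assumed reference row (see the note above).
BASELINE = "YOLOv5(no Focus)"


def map_gain(results, baseline):
    """Return the mAP@0.5 difference (percentage points) of each method vs. the baseline."""
    base_map = results[baseline][2]
    return {
        method: round(values[2] - base_map, 2)
        for method, values in results.items()
        if method != baseline
    }


gains = map_gain(ABLATION, BASELINE)
# Print the variants from largest to smallest gain.
for method, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{method:26s} {gain:+.2f}")
```

Under this reference, the full combination (MF + RFSA + Shape-IoU) shows the largest gain, 8.40 percentage points, while the Focus variant falls below the reference.
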
    • Table 2. Experimental results with different numbers of detection heads

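      | Method | Number of detection heads | Parameters /M | FPS /(frame·s⁻¹) | P /% | R /% | mAP@0.5 /% |
      |---|---|---|---|---|---|---|
      | YOLOv5(Focus) | 1 | 4.84 | 41 | 41.63 | 67.41 | 54.93 |
      | YOLOv5(Focus) | 2 | 5.29 | 45 | 77.42 | 41.83 | 53.38 |
      | YOLOv5(Focus) | 3 | 7.08 | 37 | 69.37 | 49.09 | 51.68 |
      | YOLOv5(no Focus) | 1 | 4.83 | 37 | 68.91 | 60.22 | 64.43 |
      | YOLOv5(no Focus) | 2 | 5.29 | 36 | 64.59 | 63.95 | 63.38 |
      | YOLOv5(no Focus) | 3 | 7.08 | 34 | 61.54 | 59.12 | 60.94 |
      | YOLOv5+MF+RFSA+Shape-IoU | 1 | 4.85 | 48 | 69.16 | 72.70 | 72.83 |
      | YOLOv5+MF+RFSA+Shape-IoU | 2 | 5.31 | 43 | 82.99 | 62.75 | 71.47 |
      | YOLOv5+MF+RFSA+Shape-IoU | 3 | 7.13 | 39 | 78.73 | 67.80 | 71.11 |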
    • Table 3. Experimental results of different parameter settings

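      | Method | Input image type | Image size in training or validation set /(pixel×pixel) | Image size in test set /(pixel×pixel) | mAP50 /% |
      |---|---|---|---|---|
      | YOLOv5 | RGB | 1024×1024 | 1024×1024 | 14.29 |
      | YOLOv5 | RGB | 1024×1024 | 512×512 | 54.93 |
      | YOLOv5 | RGB | 512×512 | 1024×1024 | 7.91 |
      | YOLOv5 | RGB | 512×512 | 512×512 | 50.64 |
      | YOLOv5 | IR | 1024×1024 | 1024×1024 | 10.82 |
      | YOLOv5 | IR | 1024×1024 | 512×512 | 44.81 |
      | YOLOv5 | IR | 512×512 | 1024×1024 | 4.87 |
      | YOLOv5 | IR | 512×512 | 512×512 | 39.99 |
      | Improved YOLOv5 | RGB | 1024×1024 | 1024×1024 | 16.12 |
      | Improved YOLOv5 | RGB | 1024×1024 | 512×512 | 62.41 |
      | Improved YOLOv5 | RGB | 512×512 | 1024×1024 | 4.67 |
      | Improved YOLOv5 | RGB | 512×512 | 512×512 | 51.98 |
      | Improved YOLOv5 | IR | 1024×1024 | 1024×1024 | 13.87 |
      | Improved YOLOv5 | IR | 1024×1024 | 512×512 | 56.05 |
      | Improved YOLOv5 | IR | 512×512 | 1024×1024 | 4.67 |
      | Improved YOLOv5 | IR | 512×512 | 512×512 | 44.59 |
      | Improved YOLOv5 | RGB+IR+Fusion | 1024×1024 | 512×512 | 72.83 |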
    • Table 4. Performance comparison for different algorithms on VEDAI dataset

      | Algorithm | Backbone | mAP50 /% |
      |---|---|---|
      | Faster R-CNN | ResNet-50 | 64.90 |
      | Fast R-CNN | VGG-16 | 39.80 |
      | SSD | VGG-16 | 46.10 |
      | FCOS | ResNet-50 | 49.60 |
      | YOLOv3 | Darknet53 | 61.06 |
      | YOLOv4 | CSPDarknet53 | 62.43 |
      | YOLOv5 | CSPDarknet53 | 64.43 |
      | YOLOrs | ResNet | 59.73 |
      | YOLOv8m | CSPDarknet53 | 68.60 |
      | Improved YOLOv5 | CSPDarknet53 | 72.83 |
    • Table 5. Performance comparison for different algorithms on the NWPU dataset

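      Columns Plane through Vehicle report per-class AP; the last column reports mAP@0.5.

      | Algorithm | Plane | SH | ST | BD | TC | BC | GTF | Harbor | Bridge | Vehicle | mAP@0.5 |
      |---|---|---|---|---|---|---|---|---|---|---|---|
      | Faster R-CNN | 94.6 | 82.3 | 65.3 | 95.5 | 81.9 | 89.7 | 92.4 | 72.4 | 57.5 | 77.8 | 80.9 |
      | RetinaNet | 63.8 | 39.5 | 59.5 | 72.7 | 62.7 | 47.4 | 69.1 | 34.9 | 10.2 | 37.1 | 49.7 |
      | SSD512 | 90.4 | 60.9 | 79.8 | 89.9 | 82.6 | 80.6 | 98.3 | 73.4 | 76.7 | 52.1 | 78.5 |
      | FCOS | 60.4 | 32.6 | 54.2 | 65.1 | 63.0 | 59.7 | 63.2 | 39.9 | 14.6 | 43.3 | 49.6 |
      | YOLOv3 | 88.3 | 81.4 | 87.4 | 82.4 | 81.6 | 76.1 | 66.6 | 62.3 | 64.5 | 65.4 | 75.6 |
      | YOLOv5 | 99.5 | 68.0 | 94.4 | 98.3 | 96.3 | 89.3 | 99.5 | 99.4 | 79.2 | 84.3 | 90.8 |
      | YOLOv8 | 86.4 | 99.0 | 86.5 | 95.3 | 96.9 | 85.2 | 54.5 | 99.5 | 90.8 | 67.9 | 86.2 |
      | Improved YOLOv5 | 99.5 | 68.5 | 98.6 | 98.9 | 95.2 | 96.3 | 97.8 | 98.8 | 92.7 | 89.0 | 93.5 |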
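
As a sanity check on Table 5, the sketch below recomputes mAP@0.5 as the unweighted mean of the ten per-class AP values and compares it with the reported figure. The numbers are transcribed from the table. The helper names (`mean_ap`, `check_table`) and the 0.05 rounding tolerance are assumptions made for this example.

```python
CLASSES = ["Plane", "SH", "ST", "BD", "TC", "BC", "GTF", "Harbor", "Bridge", "Vehicle"]

# algorithm -> (per-class AP in the order of CLASSES, reported mAP@0.5)
TABLE5 = {
    "Faster R-CNN": ([94.6, 82.3, 65.3, 95.5, 81.9, 89.7, 92.4, 72.4, 57.5, 77.8], 80.9),
    "RetinaNet": ([63.8, 39.5, 59.5, 72.7, 62.7, 47.4, 69.1, 34.9, 10.2, 37.1], 49.7),
    "SSD512": ([90.4, 60.9, 79.8, 89.9, 82.6, 80.6, 98.3, 73.4, 76.7, 52.1], 78.5),
    "FCOS": ([60.4, 32.6, 54.2, 65.1, 63.0, 59.7, 63.2, 39.9, 14.6, 43.3], 49.6),
    "YOLOv3": ([88.3, 81.4, 87.4, 82.4, 81.6, 76.1, 66.6, 62.3, 64.5, 65.4], 75.6),
    "YOLOv5": ([99.5, 68.0, 94.4, 98.3, 96.3, 89.3, 99.5, 99.4, 79.2, 84.3], 90.8),
    "YOLOv8": ([86.4, 99.0, 86.5, 95.3, 96.9, 85.2, 54.5, 99.5, 90.8, 67.9], 86.2),
    "Improved YOLOv5": ([99.5, 68.5, 98.6, 98.9, 95.2, 96.3, 97.8, 98.8, 92.7, 89.0], 93.5),
}


def mean_ap(per_class_ap):
    """Unweighted mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


def check_table(table, tol=0.05):
    """Map each algorithm to True if the recomputed mean matches the reported mAP within tol."""
    return {
        name: abs(mean_ap(aps) - reported) <= tol + 1e-9
        for name, (aps, reported) in table.items()
    }


consistent = check_table(TABLE5)
for name, ok in consistent.items():
    aps, reported = TABLE5[name]
    print(f"{name:16s} recomputed={mean_ap(aps):.2f} reported={reported:.1f} ok={ok}")
```

Every row agrees with its reported mAP@0.5 to within one-decimal rounding, which also confirms the column boundaries used when reading the flattened table.
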

    Fanfan Liu, Chengmei Zhu, Nana Zhao, Jinghua Wu. Remote Sensing Small Target Detection Based on Multimodal Fusion[J]. Laser & Optoelectronics Progress, 2024, 61(24): 2428010

    Paper Information

    Category: Remote Sensing and Sensors

    Received: Apr. 30, 2024

    Accepted: May 20, 2024

    Published Online: Dec. 10, 2024

    Corresponding Author Email: Jinghua Wu (wjh@iamt.ac.cn)

    DOI:10.3788/LOP241203

    CSTR:32186.14.LOP241203
