Laser & Optoelectronics Progress, Vol. 61, Issue 12, 1228006 (2024)

Target Detection in Remote Sensing Image Based on Deformable Transformer and Adaptive Detection Head

Haokang Peng1, Yun Ge1,2,*, Xiaoyu Yang1, and Changquan Hu1
Author Affiliations
  • 1School of Software, Nanchang Hangkong University, Nanchang 330063, Jiangxi, China
  • 2Jiangxi Huihang Engineering Consulting Co., Ltd., Nanchang 330038, Jiangxi, China
    Figures & Tables(19)
    Network framework of proposed method
    Feature fusion module
    Deformable Transformer module
    Task learning module
    Feature learning areas for different tasks. (a) Classification task; (b) localization task
    Adaptive detection head
    Comparison of different bounding box structures with the same IoU
    Examples of remote sensing images
    Size distribution of each category of ground truth box in different remote sensing datasets. (a) NWPU VHR-10; (b) RSOD
    Examples of detection results of the proposed method on NWPU VHR-10 dataset
    Comparison of detection results of different methods on the RSOD dataset
    • Table 1. Comparison of the effect of different numbers of TLMs in the ADH

      No. of TLMs   mAP    mAP50   mAP75
      0             58.8   92.8    65.3
      1             58.7   93.2    65.4
      2             60.2   93.9    68.1
      3             59.2   93.3    64.9
      4             59.0   93.0    64.2
    • Table 2. Ablation results of feature extraction network based on feature fusion and Deformable Transformer

      Algorithm                                       mAP    mAP50   mAP75
      ResNet50                                        58.6   91.6    67.5
      ResNet50+Fusion module                          58.7   92.2    67.2
      ResNet50+Deformable Transformer                 59.6   93.3    67.9
      ResNet50+Fusion module+Deformable Transformer   60.2   93.9    68.1
    • Table 3. Ablation results of different modules

      Module                mAP    mAP50   mAP75
      Baseline              58.1   92.3    66.2
      Baseline+L1-IoU       58.8   92.8    65.3
      Baseline+ADH          58.9   92.6    65.0
      Baseline+L1-IoU+ADH   60.2   93.9    68.1
    • Table 4. Comparison of different losses on the NWPU VHR-10 dataset

      Loss          mAP    mAP50   mAP75
      L1 loss       58.9   92.6    65.0
      IoU loss      55.6   92.8    58.6
      GIoU loss     -      91.2    -
      CIoU loss     -      92.4    -
      L1-IoU loss   60.2   93.9    68.1
    • Table 5. Comparison on NWPU VHR-10 dataset and RSOD dataset for different methods

                        NWPU VHR-10              RSOD
      Method            mAP    mAP50   mAP75     mAP    mAP50   mAP75
      Faster R-CNN      60.3   87.6    68.5      57.7   92.0    66.1
      Double Heads      62.5   87.6    71.5      60.9   92.4    71.5
      RetinaNet         58.2   89.2    65.1      58.9   91.8    69.7
      ATSS              58.2   90.2    64.1      59.5   92.8    68.7
      Deformable DETR   58.7   91.5    64.7      59.1   94.1    65.4
      Ours              60.2   93.9    68.1      61.1   95.0    69.1
    • Table 6. Comparison of parameter count and computational cost of different methods

      Method            Number of parameters /M   FLOPs /G
      Faster R-CNN      41.1                      134.4
      Double Heads      46.7                      408.6
      RetinaNet         36.2                      128.7
      ATSS              31.9                      126.0
      Deformable DETR   38.3                      122.2
      Ours              36.8                      154.0
    • Table 7. AP comparison of different categories on the NWPU VHR-10 dataset

      Class                Faster R-CNN   Double Heads   RetinaNet   ATSS    Deformable DETR   Ours
      Airplane             86.5           95.2           89.5        96.2    94.3              100.0
      Ship                 93.1           98.8           96.5        95.9    93.2              88.8
      Storage tank         91.0           90.9           85.7        89.2    87.2              96.9
      Baseball diamond     94.9           96.8           99.0        97.5    98.0              96.9
      Tennis court         83.2           83.7           78.1        76.6    89.9              95.6
      Basketball court     86.3           86.8           92.8        91.7    97.1              90.1
      Ground track field   100.0          96.0           100.0       100.0   99.9              94.5
      Harbor               88.1           78.9           89.0        91.9    83.8              98.9
      Bridge               95.0           87.9           94.8        86.1    89.2              87.1
      Vehicle              57.9           60.7           66.1        76.5    82.6              90.2
      mAP50                87.6           87.6           89.2        90.2    91.5              93.9
    • Table 8. AP comparison of different categories on the RSOD dataset

      Class        Faster R-CNN   Double Heads   RetinaNet   ATSS    Deformable DETR   Ours
      Aircraft     90.5           90.5           86.6        88.6    90.5              94.6
      Overpass     89.5           85.4           87.5        84.1    88.8              90.7
      Playground   96.0           100.0          98.8        100.0   99.9              96.9
      Oiltank      92.1           93.8           94.3        98.4    97.1              97.8
      mAP50        92.0           92.4           91.8        92.8    94.1              95.0
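
    The mAP50 row of Table 8 appears to be the unweighted mean of the per-class AP values in each column. The short Python sketch below checks this reading against the numbers copied from Table 8. The variable and function names are illustrative only, and the tolerance of 0.06 is an assumption chosen to allow for rounding to one decimal place; this is not the authors' evaluation code.

```python
# Per-class AP (IoU threshold 0.5) on RSOD, copied from Table 8.
# Column order: one value per method, in the order listed in METHODS.
METHODS = ["Faster R-CNN", "Double Heads", "RetinaNet",
           "ATSS", "Deformable DETR", "Ours"]

AP50 = {
    "Aircraft":   [90.5, 90.5, 86.6, 88.6, 90.5, 94.6],
    "Overpass":   [89.5, 85.4, 87.5, 84.1, 88.8, 90.7],
    "Playground": [96.0, 100.0, 98.8, 100.0, 99.9, 96.9],
    "Oiltank":    [92.1, 93.8, 94.3, 98.4, 97.1, 97.8],
}

# mAP50 row as reported in Table 8.
REPORTED_MAP50 = [92.0, 92.4, 91.8, 92.8, 94.1, 95.0]


def mean_ap(per_class, n_methods):
    """Unweighted mean of per-class AP for each method (column)."""
    n_classes = len(per_class)
    return [
        sum(row[i] for row in per_class.values()) / n_classes
        for i in range(n_methods)
    ]


computed = mean_ap(AP50, len(METHODS))

# Assumed tolerance: reported values are rounded to one decimal place.
TOLERANCE = 0.06
for name, c, r in zip(METHODS, computed, REPORTED_MAP50):
    status = "ok" if abs(c - r) < TOLERANCE else "mismatch"
    print(f"{name:16s} computed={c:.3f} reported={r:.1f} {status}")
```

    Under this reading, every column of Table 8 agrees with its reported mAP50 to within rounding, and the same check can be applied to Table 7 by swapping in its ten classes.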
    Get Citation

    Haokang Peng, Yun Ge, Xiaoyu Yang, Changquan Hu. Target Detection in Remote Sensing Image Based on Deformable Transformer and Adaptive Detection Head[J]. Laser & Optoelectronics Progress, 2024, 61(12): 1228006

    Paper Information

    Category: Remote Sensing and Sensors

    Received: Jul. 12, 2023

    Accepted: Sep. 7, 2023

    Published Online: Jun. 20, 2024

    The Author Email: Yun Ge (geyun@nchu.edu.cn)

    DOI:10.3788/LOP231702

    CSTR:32186.14.LOP231702
