Opto-Electronic Engineering, Volume 52, Issue 5, 240287 (2025)

Multi-level refined UAV image target detection

Zhenjiu Xiao1, Siyu Lai1,*, and Haicheng Qu1
Author Affiliations
  • 1School of Software, Liaoning Technical University, Huludao, Liaoning 125105, China
    Figures & Tables (16)
    Schematic illustration of typical challenges in UAV image detection. (a) Complex background; (b) Sudden change in illumination; (c) Target occlusion; (d) Inconsistent scales
    Overall architecture diagram
    Structure of CSP-SMSFF
    Structure of the SMSFF module
    Structure of AFGCAttention
    Structure of SGCE-Head
    Visualization comparison of ablation experiments
    Comparison of evaluation metrics between YOLO11n and the improved model
    Comparison of visualization effects on dataset VisDrone2019
    Comparison of visualization effects on dataset VisDrone2021
    • Table 1. Comparison of values for different convolution kernel combinations

      Convolution kernel combination   mAP@0.5/%   mAP@0.5:0.95/%   Parameters/10^6   GFLOPs
      2×2, 4×4, 6×6                    41.5        22.0             2.8               3.9
      3×3, 5×5, 7×7                    45.2        26.5             3.5               5.8
      3×3, 5×5, 8×8                    46.5        27.8             4.7               6.5
    • Table 2. Experimental environment

      Environment            Configuration
      Operating system       Ubuntu 18.04
      CPU                    Xeon(R) Gold 6430
      GPU                    RTX 2080 Ti (11 GB)
      Programming language   Python 3.10.0
      CUDA                   12.4
    • Table 3. Training parameters

      Parameter               Setting
      Training epochs         200
      Batch size              64
      Input image size        640×640
      Initial learning rate   0.01
    • Table 4. Ablation experiment results of the proposed algorithm on the VisDrone2019 dataset

      Model                                               Precision/%   Recall/%   mAP@0.5/%   mAP@0.5:0.95/%   Parameters/10^6   GFLOPs
      YOLO11n                                             51.5          38.3       41.8        24.1             2.5               4.3
      + CSP-SMSFF                                         52.7          40.2       45.2        26.5             3.5               5.8
      + AFGCAttention                                     55.2          38.9       45.8        27.1             2.6               5.4
      + SGCE-Head                                         51.3          37.5       42.3        25.4             2.3               4.5
      + IPIoUv2                                           51.4          38.6       41.9        25.0             2.5               4.3
      + CSP-SMSFF + AFGCAttention                         56.2          41.6       46.8        28.2             3.6               7.0
      + CSP-SMSFF + AFGCAttention + SGCE-Head             56.7          42.8       47.4        29.7             3.3               7.3
      + CSP-SMSFF + AFGCAttention + SGCE-Head + IPIoUv2   56.8          43.5       47.5        30.0             3.3               7.3
    • Table 5. Comparison of different models on the VisDrone2019 dataset

      Model             Precision/%   Recall/%   mAP@0.5/%   mAP@0.5:0.95/%   FPS
      RetinaNet         35.5          21.9       20.3        12.5             26
      Faster R-CNN      48.0          35.1       35.0        21.9             23
      Deformable-DETR   52.4          45.0       44.4        28.3             69
      BDAD-YOLO         45.6          35.7       36.1        20.7             122
      YOLOv10n          47.7          36.0       37.0        21.2             146
      SSG-YOLOv7        48.6          37.1       42.6        24.6             98
      YOLOv8            50.5          38.5       45.4        27.4             63
      Ours              56.8          43.5       47.5        30.0             145
    • Table 6. Comparison of different models on the VisDrone2021 dataset

      Model             Precision/%   Recall/%   mAP@0.5/%   mAP@0.5:0.95/%   FPS
      RetinaNet         31.5          18.9       15.3        10.5             21
      Faster R-CNN      46.1          33.6       32.3        18.8             20
      Deformable-DETR   51.5          42.1       42.0        22.4             62
      BDAD-YOLO         43.2          33.6       34.3        17.8             112
      YOLOv10n          45.4          36.1       36.4        18.7             140
      SSG-YOLOv7        47.9          37.6       40.7        22.2             96
      YOLOv8            49.4          40.1       43.6        24.7             60
      Ours              53.2          41.8       45.3        26.4             140

    Zhenjiu Xiao, Siyu Lai, Haicheng Qu. Multi-level refined UAV image target detection[J]. Opto-Electronic Engineering, 2025, 52(5): 240287

    Paper Information

    Category: Article

    Received: Dec. 5, 2024

    Accepted: Mar. 17, 2025

    Published Online: Jul. 18, 2025

    The Author Email: Siyu Lai (赖思宇)

    DOI: 10.12086/oee.2025.240287
