Acta Optica Sinica, Vol. 45, Issue 12, 1228014 (2025)

Lightweight Model for Object Detection in Optical Remote Sensing Images Based on Deformable Convolution

Yuhe Zhang[1,2], Jing Zhang[1,2,*], Xinfang Yuan[1,2,***], Xiaohui Li[1,**], Jiajia Zhu[1,2], Lin Mi[1], Binbin Chen[1], Guang Yang[1], and Shuai Dou[1]
Author Affiliations
  • [1] Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
  • [2] School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
    Figures & Tables(17)
    DCBLM structure
    Deformable convolution process
    C2f_DCFE module structure. (a) Bottleneck; (b) Bottleneck_DCFE; (c) C2f_DCFE
    CFFM network structure
    MPDIoU loss function
    Test results of YOLOv8n and DCBLM (DOTA-v1.5). (a)(c) YOLOv8n; (b)(d) DCBLM
    Test results of YOLOv8n and DCBLM (UL22). (a)(c) YOLOv8n; (b)(d) DCBLM
    NVIDIA Jetson Orin Nano
    • Table 1. Data features of DOTA-v1.5 dataset and UL22 dataset

      | Variable | DOTA-v1.5 | UL22 |
      | Total number of images | 2806 | 443 |
      | Number of object categories | 16 | 3 |
      | Total number of samples | 400000 | 54949 |
      | Spatial resolution /m | 0.1‒4.5 | 0.03‒0.07 |
      | Object scale /(pixel×pixel) | 2×2‒1955×1750 | 4×6‒268×236 |
      | Percentage of small objects /% | 57 | 76 |
      | Percentage of medium objects /% | 41 | 24 |
      | Percentage of large objects /% | 2 | 0 |
    • Table 2. Experimental environment details

      | Configuration | Training | Inference |
      | Equipment | DELL T7910 Workstation | NVIDIA Jetson Orin Nano |
      | Operating system | Ubuntu 18.04 | Ubuntu 20.04 |
      | CPU (Central Processing Unit) | Intel(R) Xeon(R) CPU E5-2640 v4 | ARMv8 Processor rev 1 (v8l) |
      | GPU (Graphics Processing Unit) | Quadro RTX 8000 | NVIDIA Corporation Device 229e (rev a1) |
      | Power | 295 W | 15 W |
      | CUDA (Compute Unified Device Architecture) | 11.8 | 11.4 |
      | Deep learning framework | PyTorch 2.1.1 | PyTorch 1.11.0 |
    • Table 3. DCBLM ablation experiment on DOTA-v1.5 dataset

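      | C2f_DCFE | CFFM | MPDIoU | mAP /% | Params /10^6 | FLOPs /10^9 | Model size /MB |
      | × | × | × | 66.0 | 3.012 | 8.1 | 6.3 |
      | √ | × | × | 66.7 | 2.855 | 7.7 | 6.1 |
      | × | √ | × | 66.6 | 1.968 | 6.6 | 4.3 |
      | × | × | √ | 66.1 | 3.009 | 8.1 | 6.2 |
      | √ | √ | × | 66.6 | 1.821 | 6.4 | 4.0 |
      | √ | √ | √ | 66.8 | 1.821 | 6.3 | 4.0 |
      (× marks are from the source; the √ positions are not preserved in the source text and are inferred from the row order and parameter counts.)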
    • Table 4. DCBLM ablation experiment on UL22 dataset

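      | C2f_DCFE | CFFM | MPDIoU | mAP /% | Params /10^6 | FLOPs /10^9 | Model size /MB |
      | × | × | × | 92.5 | 3.006 | 8.1 | 6.3 |
      | √ | × | × | 94.1 | 2.860 | 7.7 | 6.1 |
      | × | √ | × | 93.6 | 1.965 | 6.6 | 4.3 |
      | × | × | √ | 94.0 | 3.006 | 8.1 | 6.2 |
      | √ | √ | × | 93.6 | 1.823 | 6.4 | 4.0 |
      | √ | √ | √ | 93.4 | 1.818 | 6.3 | 4.0 |
      (× marks are from the source; the √ positions are not preserved in the source text and are inferred from the row order and parameter counts.)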
    • Table 5. Performance comparison of DCBLM and mainstream lightweight models on DOTA-v1.5 dataset

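      | Model | mAP /% | Params /10^6 | FLOPs /10^9 | Model size /MB |
      | NanoDet-plus | 57.5 | 2.411 | 8.8 | 7.8 |
      | YOLOX-nano | 56.6 | 2.196 | 8.9 | 18.3 |
      | YOLOv7-tiny | 66.2 | 6.033 | 13.2 | 12.0 |
      | YOLOv8n | 66.0 | 3.012 | 8.1 | 6.3 |
      | Ghost-YOLOv8n | 63.4 | 1.843 | 5.1 | 6.3 |
      | Shufflenet-YOLOv8n | 64.2 | 2.815 | 7.4 | 5.9 |
      | Mobilenet-YOLOv8n | 65.3 | 4.341 | 8.0 | 6.2 |
      | YOLOv9t | 65.8 | 2.710 | 11.1 | 18.1 |
      | YOLOv10n | 66.2 | 2.701 | 8.2 | 6.1 |
      | YOLOv11n | 66.3 | 2.585 | 6.3 | 5.6 |
      | D-FINE-N | 61.1 | 3.729 | 7.1 | 60.7 |
      | DCBLM | 66.8 | 1.821 | 6.3 | 4.0 |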
    • Table 6. AP for each object on DOTA-v1.5 dataset with DCBLM and mainstream lightweight models

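      | Object | NanoDet-plus | YOLOX-nano | YOLOv8n | Ghost-YOLOv8n | YOLOv10n | YOLOv11n | D-FINE-N | DCBLM |
      | mAP | 57.5 | 56.6 | 66.0 | 63.4 | 66.2 | 66.3 | 61.1 | 66.8 |
      | Plane | 91.9 | 85.5 | 89.0 | 87.5 | 88.7 | 88.9 | 87.5 | 90.2 |
      | Ship | 61.1 | 83.6 | 86.9 | 86.0 | 86.9 | 86.6 | 81.3 | 89.3 |
      | Storage tank | 84.2 | 81.9 | 74.7 | 72.9 | 74.2 | 72.8 | 75.3 | 76.9 |
      | Baseball diamond | 60.4 | 58.8 | 74.6 | 70.0 | 76.8 | 74.5 | 67.1 | 76.3 |
      | Tennis court | 88.1 | 85.4 | 94.1 | 93.5 | 93.8 | 93.9 | 84.4 | 93.5 |
      | Basketball court | 67.2 | 57.5 | 63.2 | 56.4 | 64.1 | 63.8 | 61.7 | 69.6 |
      | Ground track field | 44.3 | 21.0 | 60.7 | 60.9 | 61.3 | 61.9 | 55.2 | 53.8 |
      | Harbor | 76.5 | 76.9 | 79.0 | 78.7 | 80.2 | 79.5 | 76.4 | 81.1 |
      | Bridge | 13.9 | 20.1 | 42.8 | 42.0 | 44.1 | 43.9 | 33.5 | 45.4 |
      | Large vehicle | 78.7 | 81.6 | 80.4 | 79.2 | 80.0 | 80.1 | 79.7 | 81.6 |
      | Small vehicle | 55.9 | 71.2 | 60.8 | 59.5 | 60.6 | 60.9 | 60.3 | 63.0 |
      | Helicopter | 30.9 | 35.9 | 52.8 | 39.6 | 51.7 | 53.5 | 47.1 | 46.1 |
      | Roundabout | 31.9 | 5.5 | 65.2 | 60.7 | 63.7 | 65.2 | 51.9 | 71.5 |
      | Soccer ball field | 56.3 | 45.8 | 59.4 | 57.6 | 59.1 | 58.0 | 52.1 | 48.6 |
      | Swimming pool | 70.2 | 76.5 | 70.4 | 68.1 | 69.2 | 71.1 | 60.3 | 69.0 |
      | Container crane | 8.4 | 19.1 | 2.6 | 1.3 | 4.8 | 6.2 | 2.8 | 12.6 |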
    • Table 7. Performance comparison of DCBLM and mainstream lightweight models on UL22 dataset

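      | Model | AP /% (Cattle) | AP /% (Horse) | AP /% (Sheep) | mAP /% | Params /10^6 | FLOPs /10^9 | Model size /MB |
      | YOLOX-nano | 84.7 | 75.7 | 72.8 | 77.7 | 0.901 |  |  |
      | YOLOX-x | 89.6 | 86.6 | 85.3 | 87.2 | 99.001 |  |  |
      | GLDM | 88.5 | 85.9 | 85.0 | 86.5 | 5.701 |  |  |
      | YOLOv7-tiny | 94.1 | 94.6 | 89.5 | 92.7 | 6.020 | 13.2 | 12.0 |
      | YOLOv8n | 94.4 | 93.3 | 89.8 | 92.5 | 3.006 | 8.1 | 6.3 |
      | Ghost-YOLOv8n | 92.2 | 91.1 | 87.8 | 90.3 | 1.921 | 5.1 | 6.3 |
      | YOLOv9t | 92.8 | 93.7 | 88.2 | 91.6 | 2.660 | 11.0 | 18.1 |
      | YOLOv10n | 93.9 | 94.0 | 89.6 | 92.5 | 2.696 | 8.2 | 6.1 |
      | YOLOv11n | 94.2 | 94.9 | 89.6 | 92.9 | 2.583 | 6.3 | 5.6 |
      | D-FINE-N | 89.3 | 88.1 | 86.3 | 87.9 | 3.703 | 7.1 | 60.7 |
      | DCBLM | 94.7 | 94.6 | 90.8 | 93.4 | 1.818 | 6.3 | 4.0 |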
    • Table 8. Data features of RSOD, NWPU VHR-10 and DIOR datasets

      | Variable | RSOD | NWPU VHR-10 | DIOR |
      | Total number of images | 936 | 650 | 23463 |
      | Number of object categories | 4 | 10 | 20 |
      | Total number of samples | 7400 | 3896 | 122670 |
      | Spatial resolution /m | 0.5‒2 | 0.5‒2 | 0.5‒30 |
      | Object scale /(pixel×pixel) | 8×10‒746×810 | 17×18‒418×513 | 1×2‒798×99 |
    • Table 9. Comparison of detection metrics between YOLOv8n and DCBLM on different datasets

      | Model | mAP /% (RSOD) | mAP /% (NWPU VHR-10) | mAP /% (DIOR) | Maximum GPU utilization /% |
      | YOLOv8n | 80.7 | 57.8 | 66.0 | 71.6 |
      | DCBLM | 82.9 | 60.8 | 68.0 | 62.2 |

    Yuhe Zhang, Jing Zhang, Xinfang Yuan, Xiaohui Li, Jiajia Zhu, Lin Mi, Binbin Chen, Guang Yang, Shuai Dou. Lightweight Model for Object Detection in Optical Remote Sensing Images Based on Deformable Convolution[J]. Acta Optica Sinica, 2025, 45(12): 1228014

    Paper Information

    Category: Remote Sensing and Sensors

    Received: Dec. 25, 2024

    Accepted: Mar. 25, 2025

    Published Online: Jun. 24, 2025

    Author Email: Jing Zhang (zhangjing@aoe.ac.cn), Xinfang Yuan (yuanxf@aircas.ac.cn), Xiaohui Li (xhli@aoe.ac.cn)

    DOI:10.3788/AOS241932

    CSTR:32393.14.AOS241932
