Lightweight Model for Object Detection in Optical Remote Sensing Images Based on Deformable Convolution

Yuhe Zhang; Jing Zhang; Xinfang Yuan; Xiaohui Li; Jiajia Zhu; Lin Mi; Binbin Chen; Guang Yang; Shuai Dou

doi:10.3788/AOS241932

Acta Optica Sinica, Vol. 45, Issue 12, 1228014 (2025)

Yuhe Zhang¹,², Jing Zhang¹,²,*, Xinfang Yuan¹,²,***, Xiaohui Li¹,**, Jiajia Zhu¹,², Lin Mi¹, Binbin Chen¹, Guang Yang¹, and Shuai Dou¹

Author Affiliations

¹Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China

²School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China

Figures & Tables(17)

Fig. 1. DCBLM structure

Fig. 2. Deformable convolution process

Fig. 3. C2f_DCFE module structure. (a) Bottleneck; (b) Bottleneck_DCFE; (c) C2f_DCFE

Fig. 4. CFFM network structure

Fig. 5. MPDIoU loss function

Fig. 6. Test results of YOLOv8n and DCBLM (DOTA-v1.5). (a)(c) YOLOv8n; (b)(d) DCBLM

Fig. 7. Test results of YOLOv8n and DCBLM (UL22). (a)(c) YOLOv8n; (b)(d) DCBLM

Fig. 8. NVIDIA Jetson Orin Nano
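
The deformable convolution process named in Fig. 2 can be illustrated with a minimal single-channel sketch. It assumes the standard formulation (a regular 3×3 sampling grid, a per-tap offset added to each grid position, and bilinear interpolation at the resulting fractional location); the page does not give the DCBLM configuration, so the function names, the zero padding, and the hand-written offsets below are illustrative assumptions rather than the authors' implementation. In a real network the offsets would be predicted by an extra convolution layer.

```python
import math


def bilinear_sample(image, y, x):
    """Sample a 2-D list at fractional (y, x); positions outside the image read as 0."""
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                # Bilinear weight: closer integer neighbours contribute more.
                weight = (1 - abs(y - yi)) * (1 - abs(x - xi))
                value += weight * image[yi][xi]
    return value


def deformable_conv2d(image, kernel, offsets):
    """Single-channel 3x3 deformable convolution, stride 1, zero padding 1.

    offsets[i][j] holds 9 (dy, dx) pairs, one per kernel tap, for output pixel (i, j).
    (Hypothetical layout chosen for readability.)
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for k in range(9):
                ky, kx = divmod(k, 3)
                dy, dx = offsets[i][j][k]
                # Regular grid position plus the offset for this tap.
                acc += kernel[ky][kx] * bilinear_sample(
                    image, i + ky - 1 + dy, j + kx - 1 + dx
                )
            out[i][j] = acc
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
centre_only = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
zero = [[[(0.0, 0.0)] * 9 for _ in range(3)] for _ in range(3)]
half_right = [[[(0.0, 0.5)] * 9 for _ in range(3)] for _ in range(3)]

print(deformable_conv2d(image, centre_only, zero))        # zero offsets: ordinary convolution
print(deformable_conv2d(image, centre_only, half_right))  # taps shifted half a pixel to the right
```

With all offsets set to zero the function reduces to an ordinary 3×3 convolution, which is a convenient sanity check for any deformable-convolution code.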

Table 1. Data features of DOTA-v1.5 dataset and UL22 dataset

Variable	DOTA-v1.5	UL22
Total number of images	2806	443
Number of object categories	16	3
Total number of samples	400000	54949
Spatial resolution /m	0.1‒4.5	0.03‒0.07
Object scale /(pixel×pixel)	2×2‒1955×1750	4×6‒268×236
Percentage of small objects /%	57	76
Percentage of medium objects /%	41	24
Percentage of large objects /%	2	0

Table 2. Experimental environment details

Configuration	Training	Inference
Equipment	DELL T7910 Workstation	NVIDIA Jetson Orin Nano
Operating system	Ubuntu 18.04	Ubuntu 20.04
CPU (Central Processing Unit)	Intel(R) Xeon(R) CPU E5-2640 v4	ARMv8 Processor rev 1 (v8l)
GPU (Graphics Processing Unit)	Quadro RTX 8000	NVIDIA Corporation Device 229e (rev a1)
Power	295 W	15 W
CUDA (Compute Unified Device Architecture)	11.8	11.4
Deep learning framework	PyTorch 2.1.1	PyTorch 1.11.0

Table 3. DCBLM ablation experiment on DOTA-v1.5 dataset
C2f_DCFE	CFFM	MPDIoU	mAP /%	Params /10⁶	FLOPs /10⁹	Model size /MB
×	×	×	66.0	3.012	8.1	6.3
√	×	×	66.7	2.855	7.7	6.1
×	√	×	66.6	1.968	6.6	4.3
×	×	√	66.1	3.009	8.1	6.2
√	√	×	66.6	1.821	6.4	4.0
√	√	√	66.8	1.821	6.3	4.0
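
The MPDIoU column in the ablation refers to the bounding-box regression loss shown in Fig. 5. The page does not reproduce the formula, so the sketch below assumes the commonly published definition: IoU minus the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalised by the squared diagonal of the input image, with the loss taken as one minus that value. The function name and the box layout `(x1, y1, x2, y2)` are illustrative choices, not taken from the paper.

```python
def mpdiou_loss(pred, target, img_w, img_h):
    """MPDIoU-style loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumed formulation: MPDIoU = IoU - d1^2 / (w^2 + h^2) - d2^2 / (w^2 + h^2),
    where d1 / d2 are the distances between the top-left / bottom-right corners
    and (w, h) is the input image size. The loss is 1 - MPDIoU.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection and union areas.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared corner distances, normalised by the squared image diagonal.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2
    norm = img_w ** 2 + img_h ** 2

    return 1.0 - (iou - d1_sq / norm - d2_sq / norm)


# Identical boxes give zero loss; disjoint boxes are still penalised by distance.
print(mpdiou_loss((0, 0, 10, 10), (0, 0, 10, 10), 100, 100))
print(mpdiou_loss((0, 0, 10, 10), (20, 20, 30, 30), 100, 100))
```

Unlike plain IoU, the corner-distance terms keep the loss informative when the boxes do not overlap, which matters for the very small objects that dominate both datasets.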

Table 4. DCBLM ablation experiment on UL22 dataset
C2f_DCFE	CFFM	MPDIoU	mAP /%	Params /10⁶	FLOPs /10⁹	Model size /MB
×	×	×	92.5	3.006	8.1	6.3
√	×	×	94.1	2.860	7.7	6.1
×	√	×	93.6	1.965	6.6	4.3
×	×	√	94.0	3.006	8.1	6.2
√	√	×	93.6	1.823	6.4	4.0
√	√	√	93.4	1.818	6.3	4.0

Table 5. Performance comparison of DCBLM and mainstream lightweight models on DOTA-v1.5 dataset

Model	mAP /%	Params /10⁶	FLOPs /10⁹	Model size /MB
NanoDet-plus	57.5	2.411	8.8	7.8
YOLOX-nano	56.6	2.196	8.9	18.3
YOLOv7-tiny	66.2	6.033	13.2	12.0
YOLOv8n	66.0	3.012	8.1	6.3
Ghost-YOLOv8n	63.4	1.843	5.1	6.3
Shufflenet-YOLOv8n	64.2	2.815	7.4	5.9
Mobilenet-YOLOv8n	65.3	4.341	8.0	6.2
YOLOv9t	65.8	2.710	11.1	18.1
YOLOv10n	66.2	2.701	8.2	6.1
YOLOv11n	66.3	2.585	6.3	5.6
D-FINE-N	61.1	3.729	7.1	60.7
DCBLM	66.8	1.821	6.3	4.0
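
The savings of DCBLM relative to its YOLOv8n baseline can be read directly off the two corresponding rows of Table 5. The short sketch below only redoes that arithmetic; the dictionary keys and the helper name are illustrative and do not come from the paper.

```python
# Rows copied from Table 5 (DOTA-v1.5).
yolov8n = {"mAP": 66.0, "params_M": 3.012, "flops_G": 8.1, "size_MB": 6.3}
dcblm = {"mAP": 66.8, "params_M": 1.821, "flops_G": 6.3, "size_MB": 4.0}


def relative_reduction(base, new):
    """Percentage by which `new` is lower than `base`."""
    return 100.0 * (base - new) / base


for key in ("params_M", "flops_G", "size_MB"):
    print(f"{key}: {relative_reduction(yolov8n[key], dcblm[key]):.1f}% lower")

print(f"mAP change: {dcblm['mAP'] - yolov8n['mAP']:+.1f} points")
```

On these figures DCBLM uses about 39.5% fewer parameters, 22.2% fewer FLOPs, and a 36.5% smaller model file than YOLOv8n, while mAP rises by 0.8 percentage points.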

Table 6. AP for each object on DOTA-v1.5 dataset with DCBLM and mainstream lightweight models

Object	NanoDet-plus	YOLOX-nano	YOLOv8n	Ghost-YOLOv8n	YOLOv10n	YOLOv11n	D-FINE-N	DCBLM
mAP	57.5	56.6	66.0	63.4	66.2	66.3	61.1	66.8
Plane	91.9	85.5	89.0	87.5	88.7	88.9	87.5	90.2
Ship	61.1	83.6	86.9	86.0	86.9	86.6	81.3	89.3
Storage tank	84.2	81.9	74.7	72.9	74.2	72.8	75.3	76.9
Baseball diamond	60.4	58.8	74.6	70.0	76.8	74.5	67.1	76.3
Tennis court	88.1	85.4	94.1	93.5	93.8	93.9	84.4	93.5
Basketball court	67.2	57.5	63.2	56.4	64.1	63.8	61.7	69.6
Ground track field	44.3	21.0	60.7	60.9	61.3	61.9	55.2	53.8
Harbor	76.5	76.9	79.0	78.7	80.2	79.5	76.4	81.1
Bridge	13.9	20.1	42.8	42.0	44.1	43.9	33.5	45.4
Large vehicle	78.7	81.6	80.4	79.2	80.0	80.1	79.7	81.6
Small vehicle	55.9	71.2	60.8	59.5	60.6	60.9	60.3	63.0
Helicopter	30.9	35.9	52.8	39.6	51.7	53.5	47.1	46.1
Roundabout	31.9	5.5	65.2	60.7	63.7	65.2	51.9	71.5
Soccer ball field	56.3	45.8	59.4	57.6	59.1	58.0	52.1	48.6
Swimming pool	70.2	76.5	70.4	68.1	69.2	71.1	60.3	69.0
Container crane	8.4	19.1	2.6	1.3	4.8	6.2	2.8	12.6
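
The mAP row of Table 6 is consistent with an unweighted mean of the 16 per-class AP values. The sketch below checks this for the DCBLM column and lists its weakest classes; the variable names are illustrative, and the equal-weight averaging is an assumption that the table's numbers happen to support.

```python
# DCBLM column of Table 6 (AP in %, DOTA-v1.5).
dcblm_ap = {
    "Plane": 90.2, "Ship": 89.3, "Storage tank": 76.9, "Baseball diamond": 76.3,
    "Tennis court": 93.5, "Basketball court": 69.6, "Ground track field": 53.8,
    "Harbor": 81.1, "Bridge": 45.4, "Large vehicle": 81.6, "Small vehicle": 63.0,
    "Helicopter": 46.1, "Roundabout": 71.5, "Soccer ball field": 48.6,
    "Swimming pool": 69.0, "Container crane": 12.6,
}

# Unweighted mean over classes.
mean_ap = sum(dcblm_ap.values()) / len(dcblm_ap)
print(f"mAP = {mean_ap:.1f}%")

# Three classes with the lowest AP.
weakest = sorted(dcblm_ap, key=dcblm_ap.get)[:3]
print("Weakest classes:", weakest)
```

The computed mean rounds to the 66.8% reported in the table, and the lowest-scoring classes are container crane, bridge, and helicopter.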

Table 7. Performance comparison of DCBLM and mainstream lightweight models on UL22 dataset

Model	AP (Cattle) /%	AP (Horse) /%	AP (Sheep) /%	mAP /%	Params /10⁶	FLOPs /10⁹	Model size /MB
YOLOX-nano	84.7	75.7	72.8	77.7	0.901	—	—
YOLOX-x	89.6	86.6	85.3	87.2	99.001	—	—
GLDM	88.5	85.9	85.0	86.5	5.701	—	—
YOLOv7-tiny	94.1	94.6	89.5	92.7	6.020	13.2	12.0
YOLOv8n	94.4	93.3	89.8	92.5	3.006	8.1	6.3
Ghost-YOLOv8n	92.2	91.1	87.8	90.3	1.921	5.1	6.3
YOLOv9t	92.8	93.7	88.2	91.6	2.660	11.0	18.1
YOLOv10n	93.9	94.0	89.6	92.5	2.696	8.2	6.1
YOLOv11n	94.2	94.9	89.6	92.9	2.583	6.3	5.6
D-FINE-N	89.3	88.1	86.3	87.9	3.703	7.1	60.7
DCBLM	94.7	94.6	90.8	93.4	1.818	6.3	4.0

Table 8. Data features of RSOD, NWPU VHR-10 and DIOR datasets

Variable	RSOD	NWPU VHR-10	DIOR
Total number of images	936	650	23463
Number of object categories	4	10	20
Total number of samples	7400	3896	122670
Spatial resolution /m	0.5‒2	0.5‒2	0.5‒30
Object scale /(pixel×pixel)	8×10‒746×810	17×18‒418×513	1×2‒798×99

Table 9. Comparison of detection metrics between YOLOv8n and DCBLM on different datasets
Model	mAP (RSOD) /%	mAP (NWPU VHR-10) /%	mAP (DIOR) /%	Maximum GPU utilization /%
YOLOv8n	80.7	57.8	66.0	71.6
DCBLM	82.9	60.8	68.0	62.2

Citation

Yuhe Zhang, Jing Zhang, Xinfang Yuan, Xiaohui Li, Jiajia Zhu, Lin Mi, Binbin Chen, Guang Yang, Shuai Dou. Lightweight Model for Object Detection in Optical Remote Sensing Images Based on Deformable Convolution[J]. Acta Optica Sinica, 2025, 45(12): 1228014


Paper Information

Category: Remote Sensing and Sensors

Received: Dec. 25, 2024

Accepted: Mar. 25, 2025

Published Online: Jun. 24, 2025

Author emails: Jing Zhang (zhangjing@aoe.ac.cn), Xinfang Yuan (yuanxf@aircas.ac.cn), Xiaohui Li (xhli@aoe.ac.cn)

DOI:10.3788/AOS241932

CSTR:32393.14.AOS241932

