Real-time object detection for UAV images based on improved YOLOv5s

YOLOv5s	Receptive field	Channels	YOLOv5sm	Receptive field	Channels
Focus	6	32	Conv 3*3 (stride:2)	3	24
			Conv 3*3 (dilation:2)	15	48
Down-sampling	10	64	Conv 3*3 (stride:2)	19	96
			Res-Block	27	96
C3_x1	18	64	Res-Dconv	51	96
Down-sampling	26	128	Conv 3*3 (stride:2)	59	192
C3_x3	74	128	C3_x3	107	192
Down-sampling	90	256	Conv 3*3 (stride:2)	123	384
C3_x3	186	256	C3_x3	219	384
Down-sampling	218	512	Conv 3*3 (stride:2)	251	768
SPP	218~634	512	SPP	251~667	768
C3_x1	282~698	512	C3_x1	315~731	768
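The receptive-field column can be followed with the standard layer-by-layer recurrence: each layer enlarges the receptive field by (effective kernel size - 1) times the current spacing between feature-map positions, and each stride multiplies that spacing. The sketch below is a minimal illustration, not the authors' calculation. The layer decomposition (`yolov5s_layers`, one 3*3 stride-2 convolution per down-sampling step and n 3*3 convolutions per C3_xn) and the helper name `receptive_fields` are assumptions made here; under them the recurrence reproduces the YOLOv5s column from the Focus row down to the last down-sampling row. The SPP and dilated-convolution rows are not modelled.

```python
from typing import List, Sequence, Tuple

# (kernel size, stride, dilation) for one convolution or pooling layer.
Layer = Tuple[int, int, int]


def receptive_fields(layers: Sequence[Layer], rf: int = 1, jump: int = 1) -> List[int]:
    """Return the receptive field (in input pixels) after each layer.

    rf   : receptive field of the tensor fed to the first layer
    jump : spacing, in input pixels, between adjacent positions of that tensor
    """
    fields = []
    for kernel, stride, dilation in layers:
        k_eff = dilation * (kernel - 1) + 1  # effective kernel size with dilation
        rf += (k_eff - 1) * jump             # growth uses the spacing before this layer
        jump *= stride                       # stride widens the spacing afterwards
        fields.append(rf)
    return fields


# Hypothetical decomposition of the YOLOv5s column, starting after Focus
# (rf=6, jump=2 as listed in the table). This layout is an assumption.
CONV = (3, 1, 1)  # 3*3 convolution, stride 1
DOWN = (3, 2, 1)  # 3*3 convolution, stride 2
yolov5s_layers = [DOWN, CONV, DOWN, CONV, CONV, CONV, DOWN, CONV, CONV, CONV, DOWN]
yolov5s_fields = receptive_fields(yolov5s_layers, rf=6, jump=2)
```

The values at the down-sampling and end-of-C3 positions of `yolov5s_fields` can be compared directly against the table rows.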

Table 2. Pre-setting anchors in response to the receptive field and down-sampling
Down-sampling factor	3	4	5
Maximum receptive field/pixel	111	255	731
Anchor range	8*8~37*37	32*32~85*85	96*96~365*365
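As a rough illustration of how these ranges partition object sizes across detection levels, the sketch below looks up which level's anchor range contains a given box side length. The ranges are copied from Table 2. The names `ANCHOR_RANGES`, `stride_of`, and `matching_levels` are hypothetical, and reading the down-sampling factor n as a stride of 2**n is an assumption, not something the table states.

```python
from typing import Dict, List, Tuple

# Anchor side-length ranges in pixels, copied from Table 2,
# keyed by down-sampling factor.
ANCHOR_RANGES: Dict[int, Tuple[int, int]] = {
    3: (8, 37),
    4: (32, 85),
    5: (96, 365),
}


def stride_of(factor: int) -> int:
    # Assumption: a factor of n means n stride-2 stages, i.e. a stride of 2**n.
    return 2 ** factor


def matching_levels(side: float) -> List[int]:
    """Return the down-sampling factors whose anchor range contains `side`."""
    return [
        factor
        for factor, (low, high) in sorted(ANCHOR_RANGES.items())
        if low <= side <= high
    ]
```

With the ranges as listed, sides between 32 and 37 pixels fall into two levels, while sides between 85 and 96 pixels fall into none.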

Table 3. Statistics of different types of objects
Object type	Small (0×0~32×32)	Mid (32×32~96×96)	Large (96×96~)
Count	44.44	18.63	1.704
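The size buckets in Table 3 can be expressed as a small classifier over box area. This is a sketch under one assumption: a box is assigned by its pixel area with half-open intervals (small below 32×32, large from 96×96 upward), as in the COCO evaluation convention. The table itself does not say how boundary cases are handled, and `size_category` is a hypothetical helper name.

```python
SMALL_MAX_AREA = 32 * 32  # upper bound of the Small bucket (exclusive, assumed)
MID_MAX_AREA = 96 * 96    # upper bound of the Mid bucket (exclusive, assumed)


def size_category(width: float, height: float) -> str:
    """Classify a bounding box as 'small', 'mid' or 'large' by its pixel area."""
    area = width * height
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MID_MAX_AREA:
        return "mid"
    return "large"
```

For example, a 20×40 box has area 800 and lands in the small bucket, although one side exceeds 32 pixels.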

Table 4. Performance comparison experiment results of depth and width models
Depth	Width	mAP50	mAP	BFLOPs
0.33	0.5	0.502	0.288	16.5
0.33	0.75	0.540	0.319	36.3
1.33	0.5	0.525	0.311	35.4
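The depth and width values in Table 4 act as multipliers on the number of repeated blocks and on the channel counts. The sketch below shows one common way such multipliers are applied, modelled on the scaling rule in the public YOLOv5 code as recalled here; treat the rounding details, the divisor of 8, and the helper names `scale_depth` and `scale_width` as assumptions. The base values used in the usage note (64 to 1024 channels, 9 repeats) are likewise assumed, not taken from the article.

```python
import math


def scale_depth(repeats: int, depth_multiple: float) -> int:
    """Scale the number of repeated blocks; single blocks are left unchanged."""
    if repeats <= 1:
        return repeats
    return max(round(repeats * depth_multiple), 1)


def scale_width(channels: int, width_multiple: float, divisor: int = 8) -> int:
    """Scale a channel count and round it up to a multiple of `divisor`."""
    return math.ceil(channels * width_multiple / divisor) * divisor
```

Under these assumptions, a width of 0.5 maps base widths 64, 128, 256, 512, 1024 to 32, 64, 128, 256, 512, and a width of 0.75 maps them to 48, 96, 192, 384, 768, which agrees with the channel columns listed for YOLOv5s and YOLOv5sm. A depth of 0.33 turns 9 repeats into 3.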

Table 5. Verification experiment results on Res-Dconv module
Baseline	Res-Dconv	mAP50	mAP	BFLOPs
√		0.502	0.288	16.5
√	√	0.516	0.299	19.8

Table 6. The ablation experiment results of our algorithm modules on the VisDrone dataset

Baseline	SM	SCAM	SDCM	mAP	mAP50	BFLOPs	Infer/ms	AP-small	AP-medium	AP-large
YOLOv5s				0.319	0.548	16.5	4.8	0.220	0.437	0.495
	√			0.358	0.589	30.1	8.3	0.280	0.476	0.495
√		√		0.324	0.555	14.7	3.8	0.225	0.446	0.511
√			√	0.333	0.555	19.5	4.9	0.250	0.448	0.482
	√		√	0.356	0.593	38.0	9.0	0.278	0.475	0.512
	√	√	√	0.360	0.596	30.8	7.7	0.281	0.479	0.505
Note: Bold font indicates the best value in each column.

Table 7. Detection performance of different algorithms on VisDrone dataset

Algorithm	mAP50	mAP	mAP75	AP-small	AP-mid	AP-large	BFLOPs	Infer/ms
YOLOv3	0.609	0.389	0.417	0.297	0.496	0.545	154.9	27.8
Scaled-YOLOv4	0.620	0.400	0.428	0.305	0.514	0.626	119.4	27.1
ClusDet^[1]	0.562	0.324	0.316	-	-	-	-	-
HRDNet^[1]	0.620	0.3551	0.351	-	-	-	-	-
YOLOv5s	0.548	0.319	0.317	0.220	0.437	0.495	16.5	4.8
YOLOv5m	0.595	0.365	0.372	0.285	0.482	0.525	50.4	9.8
YOLOX-s	0.535	0.314	0.317	0.225	0.415	0.485	41.65	5.1
MobileNetv3	0.554	0.329	0.329	0.245	0.443	0.495	23.8	8.0
MobileViT	0.555	0.333	0.337	0.249	0.442	0.418	-	13.7
YOLOv5sm+	0.596	0.360	0.369	0.281	0.479	0.505	30.8	7.7
YOLOv5sm+*	0.606	0.367	0.378	0.295	0.478	0.439	-	-
Note: + denotes a model with the improved modules added; * denotes multi-scale test results. Results reported in the cited literature are included.
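Since the comparison concerns real-time detection, it can help to read the Infer/ms column as a throughput figure. The sketch below converts per-image inference time to frames per second for a few rows of Table 7. It assumes the listed time is the whole per-image cost at batch size 1, which the table does not state, so the resulting numbers are upper bounds at best; `INFER_MS` and `frames_per_second` are hypothetical names.

```python
from typing import Dict

# Inference time per image in milliseconds, copied from Table 7.
INFER_MS: Dict[str, float] = {
    "YOLOv3": 27.8,
    "YOLOv5s": 4.8,
    "YOLOv5m": 9.8,
    "YOLOv5sm+": 7.7,
}


def frames_per_second(infer_ms: float) -> float:
    """Convert a per-image latency in milliseconds to frames per second."""
    if infer_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / infer_ms


# Throughput for each listed model, rounded to one decimal place.
FPS = {name: round(frames_per_second(ms), 1) for name, ms in INFER_MS.items()}
```

Under that assumption, 7.7 ms per image for YOLOv5sm+ corresponds to roughly 130 frames per second.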

Table 8. Detection performance of different algorithms on DIOR dataset

Model	Backbone	mAP50
Faster R-CNN^[33]	VGG16	0.541
PANet^[20]	ResNet50	0.638
Retina-Net^[24]	ResNet50	0.685
Ref. [32]	ResNet50	0.732
CAT-Net^[34]	ResNet50	0.763
YOLOv5sm+ (ours)	-	0.667
Note: Bold font indicates the best value in the column. Comparison results from other literature are included.

Xu Chen, Dongliang Peng, Yu Gu. Real-time object detection for UAV images based on improved YOLOv5s[J]. Opto-Electronic Engineering, 2022, 49(3): 210372-1
