Remote Sensing Small Target Detection Based on Multimodal Fusion

Fanfan Liu; Chengmei Zhu; Nana Zhao; Jinghua Wu

doi:10.3788/LOP241203

Laser & Optoelectronics Progress, Vol. 61, Issue 24, 2428010 (2024)

Fanfan Liu¹,², Chengmei Zhu², Nana Zhao², and Jinghua Wu²,*

Author Affiliations

¹School of Mechanical and Electrical Engineering, Anhui Jianzhu University, Hefei 230601, Anhui, China

²Changzhou Institute of Advanced Manufacturing Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Changzhou 213164, Jiangsu, China

Figures & Tables(14)

Fig. 1. Structure of YOLOv5 algorithm

Fig. 2. Structure of improved YOLOv5 algorithm

Fig. 3. Structure of MF module

Fig. 4. Structure of SE module

Fig. 5. Structure of RFSA module

Fig. 6. IoU value of ground truth with the same size and different shapes

Fig. 7. Schematic diagram of Shape-IoU

Fig. 8. Detected results of VEDAI and NWPU datasets. (a) VEDAI dataset; (b) NWPU dataset

Fig. 9. Comparison of detection results between the baseline model and the improved model
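
The point behind the Fig. 6 caption (ground-truth boxes of the same size but different shapes giving different IoU values) can be illustrated with a minimal sketch. It uses plain IoU only, not the paper's Shape-IoU, and the box coordinates and the 4-pixel offset are hypothetical values chosen for illustration.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shifted(box, dx, dy):
    """Return the same box translated by (dx, dy)."""
    x1, y1, x2, y2 = box
    return (x1 + dx, y1 + dy, x2 + dx, y2 + dy)


# Hypothetical ground-truth boxes with equal area (400 px^2) but different shapes.
wide_gt = (0, 0, 40, 10)
tall_gt = (0, 0, 10, 40)

# The "prediction" is the ground truth shifted by the same horizontal offset.
dx, dy = 4, 0
iou_wide = iou(wide_gt, shifted(wide_gt, dx, dy))
iou_tall = iou(tall_gt, shifted(tall_gt, dx, dy))

print(f"wide box IoU: {iou_wide:.3f}")
print(f"tall box IoU: {iou_tall:.3f}")
```

With an identical offset, the tall box loses far more overlap than the wide one, which is the kind of shape dependence a shape-aware loss is meant to account for.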

Table 1. Ablation experiment results

Number	Method	Parameters /M	FPS /(frame·s⁻¹)	P /%	R /%	mAP@0.5 /%
1	YOLOv5 (Focus)	4.83	41	41.63	67.41	54.93
2	YOLOv5 (no Focus)	4.83	37	68.91	60.22	64.43
3	YOLOv5+MF	4.85	49	80.45	63.41	67.34
4	YOLOv5+RFSA	4.83	38	70.19	66.27	68.31
5	YOLOv5+Shape-IoU	4.83	38	73.31	64.66	68.19
6	YOLOv5+MF+RFSA	4.84	48	67.71	67.48	66.24
7	YOLOv5+MF+Shape-IoU	4.85	47	68.26	65.18	68.43
8	YOLOv5+RFSA+Shape-IoU	4.83	37	78.58	65.45	70.58
9	YOLOv5+MF+RFSA+Shape-IoU	4.85	48	69.16	72.70	72.83
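
The ablation results are easier to compare as gains over a baseline. The sketch below recomputes the mAP@0.5 difference for each variant from the values in Table 1. Treating row 2, YOLOv5 (no Focus), as the baseline is an assumption made here, and the variable names are illustrative only.

```python
# mAP@0.5 (%) copied from Table 1, rows 2 to 9.
BASELINE = "YOLOv5 (no Focus)"  # assumed baseline for this comparison
map50 = {
    "YOLOv5 (no Focus)": 64.43,
    "YOLOv5+MF": 67.34,
    "YOLOv5+RFSA": 68.31,
    "YOLOv5+Shape-IoU": 68.19,
    "YOLOv5+MF+RFSA": 66.24,
    "YOLOv5+MF+Shape-IoU": 68.43,
    "YOLOv5+RFSA+Shape-IoU": 70.58,
    "YOLOv5+MF+RFSA+Shape-IoU": 72.83,
}

# Gain of each variant over the baseline, in percentage points.
gains = {
    method: round(value - map50[BASELINE], 2)
    for method, value in map50.items()
    if method != BASELINE
}

for method, gain in sorted(gains.items(), key=lambda item: item[1]):
    print(f"{method:<28} {gain:+.2f}")
```

On these numbers the full combination gains 8.40 points, while MF+RFSA without Shape-IoU gains less (1.81) than either module alone.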

Table 2. Experimental results with different numbers of detection heads

Method	Number of detection heads	Parameters /M	FPS /(frame·s⁻¹)	P /%	R /%	mAP@0.5 /%
YOLOv5 (Focus)	1	4.84	41	41.63	67.41	54.93
YOLOv5 (Focus)	2	5.29	45	77.42	41.83	53.38
YOLOv5 (Focus)	3	7.08	37	69.37	49.09	51.68
YOLOv5 (no Focus)	1	4.83	37	68.91	60.22	64.43
YOLOv5 (no Focus)	2	5.29	36	64.59	63.95	63.38
YOLOv5 (no Focus)	3	7.08	34	61.54	59.12	60.94
YOLOv5+MF+RFSA+Shape-IoU	1	4.85	48	69.16	72.70	72.83
YOLOv5+MF+RFSA+Shape-IoU	2	5.31	43	82.99	62.75	71.47
YOLOv5+MF+RFSA+Shape-IoU	3	7.13	39	78.73	67.80	71.11

Table 3. Experimental results of different parameter settings

Method	Input image type	Training/validation image size /(pixel×pixel)	Test image size /(pixel×pixel)	mAP₅₀ /%
YOLOv5	RGB	1024×1024	1024×1024	14.29
YOLOv5	RGB	1024×1024	512×512	54.93
YOLOv5	RGB	512×512	1024×1024	7.91
YOLOv5	RGB	512×512	512×512	50.64
YOLOv5	IR	1024×1024	1024×1024	10.82
YOLOv5	IR	1024×1024	512×512	44.81
YOLOv5	IR	512×512	1024×1024	4.87
YOLOv5	IR	512×512	512×512	39.99
Improved YOLOv5	RGB	1024×1024	1024×1024	16.12
Improved YOLOv5	RGB	1024×1024	512×512	62.41
Improved YOLOv5	RGB	512×512	1024×1024	4.67
Improved YOLOv5	RGB	512×512	512×512	51.98
Improved YOLOv5	IR	1024×1024	1024×1024	13.87
Improved YOLOv5	IR	1024×1024	512×512	56.05
Improved YOLOv5	IR	512×512	1024×1024	4.67
Improved YOLOv5	IR	512×512	512×512	44.59
Improved YOLOv5	RGB+IR+Fusion	1024×1024	512×512	72.83

Table 4. Performance comparison for different algorithms on VEDAI dataset

Algorithm	Backbone	mAP₅₀ /%
Faster R-CNN	ResNet-50	64.90
Fast R-CNN	VGG-16	39.80
SSD	VGG-16	46.10
FCOS	ResNet-50	49.60
YOLOv3	Darknet53	61.06
YOLOv4	CSPDarknet53	62.43
YOLOv5	CSPDarknet53	64.43
YOLOrs	ResNet	59.73
YOLOv8m	CSPDarknet53	68.60
Improved YOLOv5	CSPDarknet53	72.83

Table 5. Performance comparison for different algorithms on the NWPU dataset

Algorithm	AP (Plane)	AP (SH)	AP (ST)	AP (BD)	AP (TC)	AP (BC)	AP (GTF)	AP (Harbor)	AP (Bridge)	AP (Vehicle)	mAP@0.5
Faster R-CNN	94.6	82.3	65.3	95.5	81.9	89.7	92.4	72.4	57.5	77.8	80.9
RetinaNet	63.8	39.5	59.5	72.7	62.7	47.4	69.1	34.9	10.2	37.1	49.7
SSD512	90.4	60.9	79.8	89.9	82.6	80.6	98.3	73.4	76.7	52.1	78.5
FCOS	60.4	32.6	54.2	65.1	63.0	59.7	63.2	39.9	14.6	43.3	49.6
YOLOv3	88.3	81.4	87.4	82.4	81.6	76.1	66.6	62.3	64.5	65.4	75.6
YOLOv5	99.5	68.0	94.4	98.3	96.3	89.3	99.5	99.4	79.2	84.3	90.8
YOLOv8	86.4	99.0	86.5	95.3	96.9	85.2	54.5	99.5	90.8	67.9	86.2
Improved YOLOv5	99.5	68.5	98.6	98.9	95.2	96.3	97.8	98.8	92.7	89.0	93.5
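
The mAP@0.5 column is consistent with the arithmetic mean of the ten per-class AP values. The sketch below recomputes that mean for three rows of Table 5 and compares it with the reported figure. The 0.05 tolerance is an assumption that allows for rounding to one decimal place, and the names are illustrative.

```python
# (per-class AP values, reported mAP@0.5) copied from three rows of Table 5.
rows = {
    "Faster R-CNN": ([94.6, 82.3, 65.3, 95.5, 81.9, 89.7, 92.4, 72.4, 57.5, 77.8], 80.9),
    "YOLOv5": ([99.5, 68.0, 94.4, 98.3, 96.3, 89.3, 99.5, 99.4, 79.2, 84.3], 90.8),
    "Improved YOLOv5": ([99.5, 68.5, 98.6, 98.9, 95.2, 96.3, 97.8, 98.8, 92.7, 89.0], 93.5),
}


def mean_ap(per_class_ap):
    """Mean average precision as the unweighted mean of per-class AP."""
    return sum(per_class_ap) / len(per_class_ap)


recomputed = {name: mean_ap(aps) for name, (aps, _) in rows.items()}

for name, (_, reported) in rows.items():
    diff = abs(recomputed[name] - reported)
    status = "consistent" if diff <= 0.05 + 1e-9 else "mismatch"
    print(f"{name:<16} recomputed={recomputed[name]:.2f} reported={reported:.1f} {status}")
```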

Fanfan Liu, Chengmei Zhu, Nana Zhao, Jinghua Wu. Remote Sensing Small Target Detection Based on Multimodal Fusion[J]. Laser & Optoelectronics Progress, 2024, 61(24): 2428010

Paper Information

Category: Remote Sensing and Sensors

Received: Apr. 30, 2024

Accepted: May 20, 2024

Published Online: Dec. 10, 2024

Corresponding author email: Jinghua Wu (wjh@iamt.ac.cn)

CSTR:32186.14.LOP241203
