Target Detection in Remote Sensing Image Based on Deformable Transformer and Adaptive Detection Head

Haokang Peng; Yun Ge; Xiaoyu Yang; Changquan Hu

doi:10.3788/LOP231702

Laser & Optoelectronics Progress, Vol. 61, Issue 12, 1228006 (2024)

Haokang Peng¹, Yun Ge¹,²,*, Xiaoyu Yang¹, and Changquan Hu¹

Author Affiliations

¹School of Software, Nanchang Hangkong University, Nanchang 330063, Jiangxi, China

²Jiangxi Huihang Engineering Consulting Co., Ltd., Nanchang 330038, Jiangxi, China

Figures & Tables(19)

Fig. 1. Network framework of proposed method

Fig. 2. Feature fusion module

Fig. 3. Deformable Transformer module

Fig. 4. Task learning module

Fig. 5. Feature learning areas for different tasks. (a) Classification task; (b) localization task

Fig. 6. Adaptive detection head

Fig. 7. Comparison of different bounding-box structures with the same IoU

Fig. 8. Examples of remote sensing images

Fig. 9. Size distribution of ground-truth boxes for each category in different remote sensing datasets. (a) NWPU VHR-10; (b) RSOD

Fig. 10. Examples of detection results of the proposed method on the NWPU VHR-10 dataset

Fig. 11. Comparison of detection results of different methods on the RSOD dataset

Table 1. Effect of the number of TLMs in the ADH
Number of TLMs	mAP	mAP₅₀	mAP₇₅
0	58.8	92.8	65.3
1	58.7	93.2	65.4
2	60.2	93.9	68.1
3	59.2	93.3	64.9
4	59.0	93.0	64.2

Table 2. Ablation results of feature extraction network based on feature fusion and Deformable Transformer
Algorithm	mAP	mAP₅₀	mAP₇₅
ResNet50	58.6	91.6	67.5
ResNet50+Fusion module	58.7	92.2	67.2
ResNet50+Deformable Transformer	59.6	93.3	67.9
ResNet50+Fusion module+Deformable Transformer	60.2	93.9	68.1

Table 3. Ablation results of different modules
Module	mAP	mAP₅₀	mAP₇₅
Baseline	58.1	92.3	66.2
Baseline+L1-IoU	58.8	92.8	65.3
Baseline+ADH	58.9	92.6	65.0
Baseline+L1-IoU+ADH	60.2	93.9	68.1

Table 4. Comparison of different losses on the NWPU VHR-10 dataset
Loss	mAP	mAP₅₀	mAP₇₅
L1 loss	58.9	92.6	65.0
IoU loss	55.6	92.8	58.6
GIoU loss	‒	91.2	‒
CIoU loss	‒	92.4	‒
L1-IoU loss	60.2	93.9	68.1

Table 5. Comparison of different methods on the NWPU VHR-10 and RSOD datasets

Method	NWPU VHR-10 mAP	NWPU VHR-10 mAP₅₀	NWPU VHR-10 mAP₇₅	RSOD mAP	RSOD mAP₅₀	RSOD mAP₇₅
Faster R-CNN	60.3	87.6	68.5	57.7	92.0	66.1
Double Heads	62.5	87.6	71.5	60.9	92.4	71.5
RetinaNet	58.2	89.2	65.1	58.9	91.8	69.7
ATSS	58.2	90.2	64.1	59.5	92.8	68.7
Deformable DETR	58.7	91.5	64.7	59.1	94.1	65.4
Ours	60.2	93.9	68.1	61.1	95.0	69.1

Table 6. Comparison of parameter counts and computational cost of different methods
Method	Number of parameters /M	FLOPs /G
Faster R-CNN	41.1	134.4
Double Heads	46.7	408.6
RetinaNet	36.2	128.7
ATSS	31.9	126.0
Deformable DETR	38.3	122.2
Ours	36.8	154.0

Table 7. AP comparison of different categories on the NWPU VHR-10 dataset

Class	Faster R-CNN	Double Heads	RetinaNet	ATSS	Deformable DETR	Ours
Airplane	86.5	95.2	89.5	96.2	94.3	100.0
Ship	93.1	98.8	96.5	95.9	93.2	88.8
Storage tank	91.0	90.9	85.7	89.2	87.2	96.9
Baseball diamond	94.9	96.8	99.0	97.5	98.0	96.9
Tennis court	83.2	83.7	78.1	76.6	89.9	95.6
Basketball court	86.3	86.8	92.8	91.7	97.1	90.1
Ground track field	100.0	96.0	100.0	100.0	99.9	94.5
Harbor	88.1	78.9	89.0	91.9	83.8	98.9
Bridge	95.0	87.9	94.8	86.1	89.2	87.1
Vehicle	57.9	60.7	66.1	76.5	82.6	90.2
mAP₅₀	87.6	87.6	89.2	90.2	91.5	93.9
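
The mAP₅₀ row in Table 7 appears to be the unweighted mean of the ten per-class AP values in each column. The short Python sketch below checks this for the "Ours" column; the aggregation rule is an assumption inferred from the numbers in the table, and the names `ours_ap50` and `mean_ap` are hypothetical helpers introduced only for this illustration.

```python
# Per-class AP at IoU 0.5 for the "Ours" column of Table 7,
# listed in the table's row order (Airplane ... Vehicle).
ours_ap50 = [100.0, 88.8, 96.9, 96.9, 95.6, 90.1, 94.5, 98.9, 87.1, 90.2]


def mean_ap(per_class_ap):
    """Unweighted mean of per-class AP values (assumed aggregation rule)."""
    return sum(per_class_ap) / len(per_class_ap)


# Round to one decimal place, matching the precision used in the table.
ours_map50 = round(mean_ap(ours_ap50), 1)
print(ours_map50)  # agrees with the 93.9 reported in the mAP₅₀ row
```

The same check can be repeated for any other column of the table by swapping in that column's values.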

Table 8. AP comparison of different categories on the RSOD dataset

Class	Faster R-CNN	Double Heads	RetinaNet	ATSS	Deformable DETR	Ours
Aircraft	90.5	90.5	86.6	88.6	90.5	94.6
Overpass	89.5	85.4	87.5	84.1	88.8	90.7
Playground	96.0	100.0	98.8	100.0	99.9	96.9
Oiltank	92.1	93.8	94.3	98.4	97.1	97.8
mAP₅₀	92.0	92.4	91.8	92.8	94.1	95.0

Citation

Haokang Peng, Yun Ge, Xiaoyu Yang, Changquan Hu. Target Detection in Remote Sensing Image Based on Deformable Transformer and Adaptive Detection Head[J]. Laser & Optoelectronics Progress, 2024, 61(12): 1228006