Laser & Optoelectronics Progress, Vol. 62, Issue 10, 1037001 (2025)
Trans-YOLO: Improved YOLOv8 with RT-DETR Decoder & Head for Infrared Small Target Detection
Infrared detection is a crucial tool for remote search and surveillance and plays a significant role in many applications. To improve the accuracy of infrared small-target detection in complex backgrounds, a Trans-YOLO detection framework is proposed, based on an improved YOLOv8 model combined with RT-DETR. First, to prevent non-maximum suppression (NMS) in YOLOv8 from erroneously suppressing true targets, the Head component of YOLOv8 is replaced with the Decoder & Head from RT-DETR. Second, to address the weak signal strength and small size of infrared small targets, an RGCSPELAN module is designed that enables the detection network to process the input features at a finer granularity. Finally, to reduce the semantic disparity between deep and shallow features, a new feature fusion strategy, called the CAFM-based fusion (CAFMFusion) mechanism, is designed to facilitate the flow of different types of feature information within the network, thereby enhancing the model's ability to detect targets of varying sizes. Experimental results show that the proposed Trans-YOLO model achieves a mean average precision at an intersection over union (IoU) threshold of 0.5 of 86.1% and 99.5% on two public datasets with complex scenarios, improvements of 7.7 and 3.0 percentage points, respectively, over the original YOLOv8 model. The model also reaches processing speeds of 371.9 frame/s and 369.4 frame/s on the two datasets, respectively, effectively balancing accuracy and speed.
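The NMS failure mode mentioned in the abstract can be illustrated with a minimal sketch. This is a generic greedy NMS written for illustration only. It is not the YOLOv8 implementation and not the authors' code, and the boxes, scores, and threshold below are hypothetical values chosen to make the effect visible. The same IoU measure is the one used for the IoU=0.5 matching criterion in the reported mean average precision.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Keep boxes in descending score order, dropping any box whose IoU
    with an already kept box exceeds iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical scene: boxes 0 and 1 are two distinct, adjacent small targets
# (6x6 pixels, shifted by 2 pixels); box 2 is an isolated target.
boxes = [
    (10.0, 10.0, 16.0, 16.0),
    (12.0, 10.0, 18.0, 16.0),
    (40.0, 40.0, 46.0, 46.0),
]
scores = [0.90, 0.85, 0.80]

# Assumed threshold for this illustration only.
kept = greedy_nms(boxes, scores, iou_threshold=0.45)
print(kept)  # box 1 is dropped although it is a true target
```

Because small targets occupy only a few pixels, a shift of two pixels already gives the two true detections an IoU of 0.5, so the lower-scoring one is discarded. A decoder that predicts a fixed set of objects directly, as RT-DETR does, needs no such post-processing step.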
Jiannan Liu, Shuxian Liu, Hankiz Yilahun, Askar Hamdulla. Trans-YOLO: Improved YOLOv8 with RT-DETR Decoder & Head for Infrared Small Target Detection[J]. Laser & Optoelectronics Progress, 2025, 62(10): 1037001
Category: Digital Image Processing
Received: Aug. 27, 2024
Accepted: Oct. 28, 2024
Published Online: May 13, 2025
Author Email: Hamdulla Askar (askar@xju.edu.cn)
CSTR:32186.14.LOP241915