Objective Infrared object detection is of significant value in UAV applications, as it can enhance target recognition under low light, complex backgrounds, and extreme weather conditions. However, owing to challenges such as blurred target features, large scale differences among multiple targets, and dynamic viewing-angle changes in UAV infrared images, existing models struggle to balance high accuracy and real-time performance on resource-constrained UAV hardware. Therefore, this paper proposes a YOLOv8-based optimized model for UAV infrared object detection, aiming to improve detection performance under complex backgrounds and for dynamic targets while reducing computational resource usage, thereby better adapting to resource-constrained real-world environments.
Methods A lightweight UAV infrared object detection model, PSI-YOLO, is proposed based on multi-scale feature fusion and channel compression. First, to address the limited computational resources of UAVs and the loss of texture detail in infrared images, a multi-scale feature extraction backbone, PHGNet (Fig.2), is introduced. It integrates the HGNetV2 network with channel scaling (Fig.3) and a partial perceptual spatial attention mechanism (Fig.4), achieving a lightweight design while improving feature extraction accuracy. Second, to handle the target distortion caused by complex backgrounds and excessive angular changes in infrared images, a Slim-neck is designed that improves information flow through grouped convolutions and channel rearrangement (Fig.5), combined with cross-stage and partial residual connections (Fig.6) for feature fusion. Finally, the Inner-EIoU loss function (Fig.7) is introduced to accelerate model convergence and improve target localization accuracy, thereby strengthening object detection performance.
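The channel-rearrangement step that lets grouped convolutions in the Slim-neck exchange information across groups can be illustrated with a minimal channel-shuffle sketch. This is a pure-Python illustration of the rearrangement idea only, not the paper's implementation; the function name and the flat-list representation of channels are ours:

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups so that a subsequent grouped
    convolution mixes information from every group.
    `channels` is a flat list standing in for per-channel feature maps."""
    n = len(channels)
    assert n % groups == 0, "channel count must divide evenly into groups"
    per_group = n // groups
    # Conceptually: reshape to (groups, per_group), transpose, flatten.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]

# Six channels in two groups: [0,1,2 | 3,4,5] -> interleaved [0, 3, 1, 4, 2, 5]
print(channel_shuffle(list(range(6)), 2))
```

Without this rearrangement, each group of a grouped convolution would only ever see its own subset of channels; the shuffle restores cross-group information flow at negligible cost.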
Results and Discussions Experiments are conducted on the HIT-UAV dataset (Fig.8), which is mainly used for person and vehicle detection in thermal infrared images captured by high-altitude UAVs. The contribution of each module is verified through ablation experiments (Tab.2) and comparison experiments with different lightweight backbone networks (Tab.3); the results show that PHGNet achieves a better balance between lightweight design and detection accuracy. Next, different loss functions are evaluated (Tab.4), and the results show that Inner-EIoU converges faster and with less fluctuation (Fig.10). In addition, a comparison with other detection algorithms (Tab.5) shows that PSI-YOLO outperforms the baseline model in detection performance (Fig.11) while reducing the number of parameters, model size, and FLOPs by 35.5%, 25.4%, and 28.0%, respectively. Finally, heat maps (Fig.12) and detection results (Fig.13) further verify the effectiveness of the improved model in reducing missed and false detections.
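The Inner-EIoU loss evaluated above combines EIoU's separate penalties for center distance, width gap, and height gap with an IoU term computed on ratio-scaled auxiliary ("inner") boxes that share the original centers. The following is a minimal single-box sketch under our own assumptions (center-format `(cx, cy, w, h)` boxes, a default `ratio` of 0.75); the paper's batched implementation and hyperparameters may differ:

```python
def _corners(cx, cy, w, h):
    """Center-format box to (x1, y1, x2, y2)."""
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

def _iou(b1, b2):
    """Plain IoU of two (cx, cy, w, h) boxes."""
    ax1, ay1, ax2, ay2 = _corners(*b1)
    bx1, by1, bx2, by2 = _corners(*b2)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = b1[2] * b1[3] + b2[2] * b2[3] - inter
    return inter / union if union > 0 else 0.0

def inner_eiou_loss(pred, target, ratio=0.75):
    """EIoU penalties on the full boxes, with the IoU term replaced by
    the IoU of ratio-scaled auxiliary boxes sharing the same centers."""
    # Inner-IoU term: shrink both boxes about their centers by `ratio`.
    inner = _iou((pred[0], pred[1], pred[2] * ratio, pred[3] * ratio),
                 (target[0], target[1], target[2] * ratio, target[3] * ratio))
    px1, py1, px2, py2 = _corners(*pred)
    tx1, ty1, tx2, ty2 = _corners(*target)
    # Smallest box enclosing both full boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)
    diag2 = cw * cw + ch * ch
    # Squared center distance.
    rho2 = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    loss = 1.0 - inner
    if diag2 > 0:
        loss += rho2 / diag2          # center-distance penalty
    if cw > 0:
        loss += (pred[2] - target[2]) ** 2 / (cw * cw)  # width penalty
    if ch > 0:
        loss += (pred[3] - target[3]) ** 2 / (ch * ch)  # height penalty
    return loss
```

Shrinking the auxiliary boxes sharpens the gradient for high-IoU samples, which is consistent with the faster, smoother convergence reported above.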
Conclusions A lightweight object detection model, PSI-YOLO, is developed to address the significant feature loss, low recognition accuracy, and high computational cost caused by the lack of texture detail and by target deformation in UAV infrared images. The model incorporates a lightweight backbone network, PHGNet, to alleviate the feature loss resulting from missing texture detail. To resolve target deformation and stretching in infrared images, the Slim-neck module leverages grouped convolutions and cross-stage connections for efficient feature fusion. The loss function is refined to Inner-EIoU to speed convergence and improve localization accuracy. Experimental results validate the effectiveness and superiority of the algorithm for object detection in UAV infrared scenes.