Infrared and Laser Engineering, Vol. 54, Issue 8, 20250209 (2025)

Dynamic feature aggregation and multi-level collaboration for UAV infrared target instance segmentation

Zifen HE, Qigang WANG, Yinhui ZHANG*, Ying HUANG, Wei PENG, and Guangchen CHEN
Author Affiliations
  • Mechanical and Electrical Engineering, Kunming University of Science and Technology, Kunming 650500, China

    Objective
    UAVs equipped with infrared cameras can efficiently acquire and continuously track ground targets without being detected. To address the blurred image contours caused by long imaging distances in UAV infrared imaging and the loss of segmentation accuracy caused by changes in target scale, this paper proposes a UAV infrared target instance segmentation model with dynamic feature aggregation and multi-level collaboration.

    Methods
    DFANet, a UAV infrared target instance segmentation model with dynamic feature aggregation and multi-level perception, is proposed on the basis of the YOLOv8n algorithm (Fig.1). First, the standard convolution in the network backbone is replaced with a regional feature adaptive convolution module (Fig.2), which improves feature extraction. Second, the original up-sampling module is replaced with a redesigned feature-aware reorganized up-sampling module (Fig.3) to better extract target edge features from infrared images. Finally, a multi-scale context-aggregated feature extraction module is embedded in the backbone network to reduce the effect of target scale variation (Fig.4).

    Results and Discussions
    To evaluate the segmentation performance of the proposed model, several metrics are compared: mAP50, mAP50-95, model size, GFLOPs, and inference time. The ablation study in Table 2 shows that the average segmentation accuracy of the improved infrared target segmentation model increases from 67.6% to 78.4%, while the inference time increases from 5.5 ms to 10.8 ms. The comparisons in Tables 3 and 4 between the proposed modules and other modules of the same type show that the proposed modules give better results for infrared target instance segmentation. Finally, the comparison of different networks in Table 5 shows that the improved model outperforms the current state-of-the-art networks YOLOv11n and YOLOv12n in segmentation accuracy.

    Conclusions
    This study proposes a UAV infrared target instance segmentation model with dynamic feature aggregation and multi-level collaboration, targeting missing target detail in infrared imaging and low accuracy in multi-scale target recognition. The model rests on three core innovations: (1) a regional feature adaptive convolution module, based on a spatial-attention-guided dynamic weight allocation strategy, which strengthens feature focus on the key regions of the target; (2) a feature-aware restructuring up-sampling module, which reconstructs high-resolution features efficiently through content-driven dynamic kernel generation and local affine transformation; and (3) a multi-scale context aggregation module, which fuses dilated convolution with a feature pyramid structure to capture cross-level contextual dependencies. Experiments on the aerial infrared vehicle dataset show that DFANet achieves an mAP50 of 78.4% and an mAP50-95 of 51.1%, improving segmentation accuracy by 9.7% and 5.6% respectively over the baseline model, and achieves the best overall result in comparison with other mainstream instance segmentation networks.
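
    The mAP50 and mAP50-95 figures quoted above depend on how a predicted mask is matched to a ground-truth mask. The following is a minimal sketch, assuming the usual COCO-style convention: a prediction counts as correct at a given threshold when its mask IoU meets that threshold, mAP50 uses the single threshold 0.50, and mAP50-95 averages over 0.50 to 0.95 in steps of 0.05. The function names and the toy masks are hypothetical and are not taken from the paper or from any library; the sketch covers only the IoU matching step, not the full precision-recall integration.

```python
def mask_iou(pred, truth):
    """Intersection over union of two binary masks (equal-sized 2D lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks have no defined overlap; report 0.0 by convention here.
    return inter / union if union else 0.0


def coco_iou_thresholds():
    """The ten thresholds 0.50, 0.55, ..., 0.95 that mAP50-95 averages over."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(iou):
    """Thresholds at which a prediction with this IoU counts as a true positive."""
    return [t for t in coco_iou_thresholds() if iou >= t]


# Toy 4x4 masks (hypothetical data, not from the paper).
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]

iou = mask_iou(pred, truth)
print(iou)                     # 4 shared pixels over 6 covered pixels
print(thresholds_passed(iou))  # the thresholds this prediction satisfies
```

    In this toy case the prediction would count toward mAP50 but would fail the stricter thresholds from 0.70 upward, which is why mAP50-95 is typically lower than mAP50 for the same model.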


    Zifen HE, Qigang WANG, Yinhui ZHANG, Ying HUANG, Wei PENG, Guangchen CHEN. Dynamic feature aggregation and multi-level collaboration for UAV infrared target instance segmentation[J]. Infrared and Laser Engineering, 2025, 54(8): 20250209

    Paper Information

    Category: Optical imaging, display and information processing

    Received: Mar. 4, 2025

    Accepted: --

    Published Online: Aug. 29, 2025

    The Author Email: Yinhui ZHANG (zhangyinhui@kust.edu.cn)

    DOI:10.3788/IRLA20250209
