Infrared Technology, Vol. 47, Issue 7, 884 (2025)
Cross-Modal Multilevel Feature Fusion-Based Algorithm for Power-Equipment Detection
This paper proposes a cross-modal multilevel feature fusion algorithm, based on adaptive fusion and self-attention enhancement, to address two weaknesses of power-equipment detection in complex environments: low robustness and inaccurate small-target detection. First, a dual-stream feature-extraction network extracts multilevel target representations from visible-light and infrared images. Next, an adaptive fusion module captures complementary features from the visible-light and infrared branches. A Transformer-based self-attention mechanism then enhances the semantic spatial information of these complementary features. Finally, deep features at different scales are used to localize targets precisely. In experiments on a custom-built power-equipment dataset, the proposed algorithm achieved a mean average precision (mAP) of 91.7%. Compared with using only the visible-light branch or only the infrared branch, this is an improvement of 3.5% and 3.9%, respectively, indicating that the cross-modal information is fused effectively. The algorithm is also more robust than current mainstream object-detection algorithms.
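The abstract does not specify how the adaptive fusion module computes its weights, so the following is only a minimal sketch of one common way such a step could work, not the authors' implementation. It assumes a hypothetical gating rule: for each channel, the visible-light and infrared features are averaged spatially, the two averages are passed through a softmax, and the resulting weights blend the two feature maps. The function names `global_avg`, `softmax2`, and `adaptive_fuse` are illustrative and do not come from the paper.

```python
import math


def global_avg(feature_map):
    """Return the mean of all values in one channel (a 2-D list of floats)."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def softmax2(a, b):
    """Numerically stable softmax over two scalars; the results sum to 1."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)


def adaptive_fuse(vis, ir):
    """Fuse two feature tensors shaped [channels][height][width].

    Assumed (hypothetical) gating rule: per channel, the modality with the
    stronger mean activation receives the larger fusion weight.
    """
    fused = []
    for vis_ch, ir_ch in zip(vis, ir):
        w_vis, w_ir = softmax2(global_avg(vis_ch), global_avg(ir_ch))
        fused.append([
            [w_vis * v + w_ir * r for v, r in zip(vis_row, ir_row)]
            for vis_row, ir_row in zip(vis_ch, ir_ch)
        ])
    return fused


if __name__ == "__main__":
    # One channel of 2x2 features per modality (toy values, not real data).
    visible = [[[2.0, 2.0], [2.0, 2.0]]]
    infrared = [[[0.0, 0.0], [0.0, 0.0]]]
    print(adaptive_fuse(visible, infrared))
```

In a real detector the weights would typically be learned rather than derived directly from activation means, and the fused features would then pass to the Transformer-based self-attention stage that the abstract describes.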
LIU Shanfeng, MAO Wandeng, LI Miaomiao, ZHOU Qiankai, ZOU Wenjie, BAO Hua. Cross-Modal Multilevel Feature Fusion-Based Algorithm for Power-Equipment Detection[J]. Infrared Technology, 2025, 47(7): 884