Infrared Technology, Vol. 47, Issue 7, 813 (2025)
Method for Infrared and Visible Image Fusion Combining CNN and Transformer Feature Interaction
To address the uneven distribution of infrared features, indistinct contours, and loss of important background information in fused images that result from insufficient interaction between CNN-extracted and Transformer-extracted features, this paper proposes an infrared and visible image fusion network built on CNN–Transformer feature interaction. First, the network introduces a novel spatial-channel hybrid attention mechanism that improves the extraction of both global and local features and yields hybrid feature blocks. Second, feature interaction between the CNN and Transformer branches produces fused hybrid feature blocks, and a multiscale reconstruction network reconstructs the image features to generate the output. Finally, the proposed network is compared with nine other fusion networks on the TNO dataset. The experimental results show that the fused images produced by the new network offer excellent visual quality: infrared features and object contours are effectively highlighted while rich background texture details are preserved. The network achieves average improvements of approximately 64.73%, 8.17%, 69.05%, 66.34%, 15.39%, and 25.66% over the compared fusion networks on the EN, SD, AG, SF, SCD, and VIF metrics, respectively. Ablation experiments further validate the effectiveness of the new model.
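The paper's source code is not reproduced here; as a rough illustration of what the spatial-channel hybrid attention block described in the abstract could look like, the sketch below combines a squeeze-excitation-style channel branch with a CBAM-style spatial branch in PyTorch. The class name, reduction ratio, and 7x7 kernel size are illustrative assumptions, not the authors' published implementation.

```python
# Hypothetical sketch of a spatial-channel hybrid attention block.
# Module name, reduction ratio, and kernel size are assumptions for
# illustration; they are not taken from the paper.
import torch
import torch.nn as nn


class SpatialChannelHybridAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel branch: global average pooling followed by a small MLP
        # that produces per-channel weights (squeeze-and-excitation style).
        self.channel_mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial branch: a 7x7 convolution over channel-pooled statistics
        # that produces a per-pixel weight map (CBAM style).
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Re-weight channels using global context.
        x = x * self.channel_mlp(x)
        # Re-weight spatial locations using average- and max-pooled
        # channel statistics.
        avg_map = x.mean(dim=1, keepdim=True)
        max_map = x.amax(dim=1, keepdim=True)
        x = x * self.spatial_conv(torch.cat([avg_map, max_map], dim=1))
        return x


if __name__ == "__main__":
    block = SpatialChannelHybridAttention(channels=64)
    feats = torch.randn(1, 64, 128, 128)  # e.g. features from an IR/visible branch
    print(block(feats).shape)             # torch.Size([1, 64, 128, 128])
```

In such a design the attended feature maps from the CNN and Transformer branches would then be exchanged and fused before the multiscale reconstruction stage; the exact interaction scheme is described only at a high level in the abstract.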
ZHANG Deyin, ZHANG Yuyao, LI Juntong, WU Zhanghui. Method for Infrared and Visible Image Fusion Combining CNN and Transformer Feature Interaction[J]. Infrared Technology, 2025, 47(7): 813