Laser & Optoelectronics Progress, Vol. 60, Issue 16, 1610013 (2023)
Infrared and Visible Image Fusion with Convolutional Neural Network and Transformer
An image fusion model combining a convolutional neural network (CNN) and a Transformer is proposed to address two problems in infrared and visible image fusion: CNNs cannot model global semantic correlations within the source images, and image context information is underused. First, to compensate for the CNN's weakness in establishing long-range dependencies, a combined CNN and Transformer encoder is proposed, which strengthens the extraction of correlations among multiple local regions and improves the model's ability to extract local detail from images. Second, a fusion strategy based on the modal maximum disparity is proposed so that information from different regions of the source images is represented more adaptively during fusion, enhancing the contrast of the fused image. Finally, the proposed fusion model is validated on the public TNO dataset against multiple comparison methods. The experimental results show that the proposed model has clear advantages over existing fusion approaches in both subjective visual quality and objective evaluation metrics. In addition, ablation experiments examine the effectiveness of the combined encoder and the fusion strategy separately, and the results further support the design for infrared and visible image fusion tasks.
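The abstract does not spell out the fusion rule, so the following is only an illustrative sketch of what a disparity-driven fusion step could look like, not the authors' method. It assumes two equally sized feature maps, one per modality; the function name `fuse_by_disparity`, the `threshold` parameter, and the rule itself (keep the stronger response where the modalities disagree strongly, average elsewhere) are all hypothetical.

```python
def fuse_by_disparity(ir, vis, threshold=0.5):
    """Fuse two equally sized 2D feature maps (lists of lists of floats).

    Hypothetical rule, assumed for illustration only: where the two
    modalities disagree strongly (|ir - vis| > threshold), keep the value
    with the larger magnitude; elsewhere, average the two values.
    """
    if len(ir) != len(vis) or any(len(a) != len(b) for a, b in zip(ir, vis)):
        raise ValueError("feature maps must have the same shape")

    fused = []
    for row_ir, row_vis in zip(ir, vis):
        fused_row = []
        for a, b in zip(row_ir, row_vis):
            if abs(a - b) > threshold:
                # Large disparity: keep the stronger response.
                fused_row.append(a if abs(a) >= abs(b) else b)
            else:
                # Small disparity: blend both modalities equally.
                fused_row.append((a + b) / 2)
        fused.append(fused_row)
    return fused


# Toy feature maps standing in for infrared and visible encoder outputs.
ir = [[1.0, 0.25], [0.0, 0.5]]
vis = [[0.0, 0.75], [1.0, 0.5]]
fused = fuse_by_disparity(ir, vis)
```

In this toy case the positions where the modalities differ by more than the threshold keep the stronger response, which is one simple way a fusion rule could preserve contrast; a learned or region-adaptive weighting would replace the hard threshold in practice.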
Yang Yang, Zhennan Ren, Beichen Li. Infrared and Visible Image Fusion with Convolutional Neural Network and Transformer[J]. Laser & Optoelectronics Progress, 2023, 60(16): 1610013
Category: Image Processing
Received: Aug. 12, 2022
Accepted: Oct. 27, 2022
Published Online: Aug. 18, 2023
Author Email: Ren Zhennan (Ren2151311@163.com)