Mask-Guided Two-Stage Infrared and Visible Image Fusion Network

Xiaodong Zhang; Dianwei Zhang; Yuanyuan Li; Shanshan Peng; Long Zhang

doi:10.3788/AOS250859

Acta Optica Sinica, Vol. 45, Issue 15, 1510007 (2025)

Xiaodong Zhang¹, Dianwei Zhang¹,*, Yuanyuan Li², Shanshan Peng¹, and Long Zhang¹

¹Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Shandong Provincial Key Laboratory of Intelligent Oil & Gas Industrial Software, Qingdao 266580, Shandong, China

²Daqing Power Supply Company, State Grid Heilongjiang Electric Power Company Limited, Daqing 163000, Heilongjiang, China

Figures & Tables(15)

Fig. 1. Architecture of two-stage image fusion network

Fig. 2. Salient object detection network module. (a) RepVGG structural reparameterization process; (b) U2-Net structure

Fig. 3. Fusion module of proposed method. (a) Dual branch feature fusion module; (b) channel-spatial fusion process

Fig. 4. Overall architecture of MGRM. (a) MGRM; (b) channel attention module; (c) spatial attention module

Fig. 5. Qualitative comparison results of different methods on RoadScene dataset. (a) Infrared image; (b) visible image; (c) PIAFusion; (d) SOSMaskFuse; (e) SFCFusion; (f) GANMcC; (g) IFCNN; (h) DATFuse; (i) SeAFusion; (j) BTSFusion; (k) SwinFusion; (l) Ours

Fig. 6. Qualitative comparison results of different methods on FLIR dataset. (a) Infrared image; (b) visible image; (c) PIAFusion; (d) SOSMaskFuse; (e) SFCFusion; (f) GANMcC; (g) IFCNN; (h) DATFuse; (i) SeAFusion; (j) BTSFusion; (k) SwinFusion; (l) Ours

Fig. 7. Qualitative comparison results of ablation experiments for two salient object detection networks

Fig. 8. Visualized results of ablation experiments for different modules in the second stage

Fig. 9. Visualized results of ablation experiments for MGRM attention mechanism

Table 1. Quantitative results of proposed method and nine other methods based on RoadScene dataset

Method	EN	SD	MI	FMI_dct	FMI_ω	Q^AB/F	PSNR	VIF
PIAFusion	7.048	41.193	3.475	0.246	0.334	0.489	62.143	0.667
SOSMaskFuse	7.168	46.193	3.641	0.348	0.431	0.425	61.601	0.764
SFCFusion	7.108	39.558	2.786	0.361	0.430	0.601	65.157	0.754
GANMcC	7.241	42.896	3.273	0.337	0.367	0.419	62.298	0.607
IFCNN	7.116	38.864	3.253	0.331	0.395	0.551	66.538	0.677
DATFuse	6.819	32.841	3.413	0.259	0.309	0.497	65.051	0.687
SeAFusion	7.391	46.821	3.313	0.220	0.314	0.510	64.295	0.701
BTSFusion	7.119	39.799	2.558	0.225	0.276	0.457	66.703	0.548
SwinFusion	7.046	43.539	3.348	0.289	0.365	0.470	63.705	0.688
Ours	7.297	45.413	3.542	0.395	0.437	0.484	65.430	0.779

Table 2. Quantitative results of proposed method and nine other methods based on FLIR dataset

Method	EN	SD	MI	FMI_dct	FMI_ω	Q^AB/F	PSNR	VIF
PIAFusion	7.397	48.947	3.249	0.278	0.325	0.515	64.811	0.623
SOSMaskFuse	7.491	50.824	3.556	0.373	0.453	0.456	63.945	0.677
SFCFusion	7.394	44.950	2.361	0.383	0.469	0.544	66.170	0.674
GANMcC	7.189	40.881	2.852	0.368	0.398	0.312	63.153	0.463
IFCNN	7.374	45.333	2.661	0.374	0.416	0.513	65.807	0.571
DATFuse	6.056	37.870	3.287	0.276	0.417	0.469	64.443	0.644
SeAFusion	7.423	50.037	2.739	0.254	0.290	0.468	63.304	0.493
BTSFusion	7.397	45.925	2.201	0.260	0.265	0.423	65.555	0.401
SwinFusion	7.242	50.570	2.963	0.351	0.367	0.454	63.141	0.580
Ours	7.387	50.175	3.374	0.463	0.493	0.485	66.670	0.697
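
The page does not define the metric columns. As a minimal sketch, assuming EN denotes the Shannon entropy of the gray-level histogram and SD the standard deviation of pixel intensities (their usual meaning in image fusion evaluation), the two could be computed for an 8-bit grayscale fused image as follows. The function names and the random test image are hypothetical and not taken from the paper.

```python
import numpy as np


def entropy(img: np.ndarray) -> float:
    """Shannon entropy in bits of an 8-bit grayscale image (assumed EN definition)."""
    # Count occurrences of each of the 256 gray levels.
    hist = np.bincount(img.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]  # skip empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())


def standard_deviation(img: np.ndarray) -> float:
    """Population standard deviation of pixel intensities (assumed SD definition)."""
    return float(img.astype(np.float64).std())


# Hypothetical stand-in for a fused image; real use would load a fusion result.
rng = np.random.default_rng(0)
fused = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
en = entropy(fused)
sd = standard_deviation(fused)
print(f"EN = {en:.3f} bits, SD = {sd:.3f}")
```

Under these assumed definitions EN is bounded by 8 bits for 8-bit images, which is consistent with the values of about 7 reported in the tables.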

Table 3. Quantitative comparison results of ablation experiments for two salient object detection networks

Method	EN	SD	MI	FMI_dct	FMI_ω	Q^AB/F	PSNR	VIF
U2-Net	7.301	46.015	3.586	0.429	0.461	0.556	62.835	0.731
RepU2-Net	7.305	46.165	3.615	0.430	0.464	0.558	62.826	0.729

Table 4. Quantitative comparison results of different module ablation experiments in the second stage

Method	EN	SD	MI	FMI_dct	FMI_ω	Q^AB/F	PSNR	VIF
Net A	7.283	45.778	3.576	0.417	0.451	0.552	63.291	0.725
Net B	7.292	45.026	3.574	0.415	0.459	0.548	62.798	0.724
Net C	7.305	46.165	3.615	0.430	0.464	0.558	62.826	0.729

Table 5. Quantitative comparison results of ablation experiments for MGRM attention mechanism

Method	EN	SD	MI	FMI_dct	FMI_ω	Q^AB/F	PSNR	VIF
Module A	7.259	45.118	3.613	0.425	0.460	0.556	63.162	0.721
Module B	7.305	46.165	3.615	0.430	0.464	0.558	62.826	0.729

Table 6. Quantitative comparison results of loss function parameter

Method	EN	SD	MI	FMI_dct	FMI_ω	Q^AB/F	PSNR	VIF
α = 1	7.284	45.654	3.577	0.427	0.461	0.554	62.850	0.721
α = 10	7.283	45.364	3.563	0.427	0.462	0.551	62.569	0.715
α = 100	7.305	46.165	3.615	0.430	0.464	0.558	62.826	0.729
α = 1000	7.290	45.754	3.602	0.429	0.463	0.557	62.919	0.726

Citation

Xiaodong Zhang, Dianwei Zhang, Yuanyuan Li, Shanshan Peng, Long Zhang. Mask-Guided Two-Stage Infrared and Visible Image Fusion Network[J]. Acta Optica Sinica, 2025, 45(15): 1510007

Paper Information

Category: Image Processing

Received: Apr. 8, 2025

Accepted: May 12, 2025

Published Online: Aug. 15, 2025

Author Email: Dianwei Zhang (z23070050@s.upc.edu.cn)

DOI:10.3788/AOS250859

CSTR:32393.14.AOS250859
