Acta Optica Sinica, Vol. 45, Issue 15, 1510007 (2025)

Mask-Guided Two-Stage Infrared and Visible Image Fusion Network

Xiaodong Zhang1, Dianwei Zhang1,*, Yuanyuan Li2, Shanshan Peng1, and Long Zhang1
Author Affiliations
  • 1Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Shandong Provincial Key Laboratory of Intelligent Oil & Gas Industrial Software, Qingdao 266580, Shandong, China
  • 2Daqing Power Supply Company, State Grid Heilongjiang Electric Power Company Limited, Daqing 163000, Heilongjiang, China
    Figures & Tables(15)
    Architecture of two-stage image fusion network
    Salient object detection network module. (a) RepVGG structural reparameterization process; (b) U2-Net structure
    Fusion module of proposed method. (a) Dual branch feature fusion module; (b) channel-spatial fusion process
    Overall architecture of MGRM. (a) MGRM; (b) channel attention module; (c) spatial attention module
    Qualitative comparison results of different methods on RoadScene dataset. (a) Infrared image; (b) visible image; (c) PIAFusion; (d) SOSMaskFuse; (e) SFCFusion; (f) GANMcC; (g) IFCNN; (h) DATFuse; (i) SeAFusion; (j) BTSFusion; (k) SwinFusion; (l) Ours
    Qualitative comparison results of different methods on FLIR dataset. (a) Infrared image; (b) visible image; (c) PIAFusion; (d) SOSMaskFuse; (e) SFCFusion; (f) GANMcC; (g) IFCNN; (h) DATFuse; (i) SeAFusion; (j) BTSFusion; (k) SwinFusion; (l) Ours
    Qualitative comparison results of ablation experiments for two salient object detection networks
    Visualized results of ablation experiments for different modules in the second stage
    Visualized results of ablation experiments for MGRM attention mechanism
    • Table 1. Quantitative results of proposed method and nine other methods based on RoadScene dataset

      Method        EN      SD       MI      FMI_dct   FMI_ω   Q^AB/F   PSNR     VIF
      PIAFusion     7.048   41.193   3.475   0.246     0.334   0.489    62.143   0.667
      SOSMaskFuse   7.168   46.193   3.641   0.348     0.431   0.425    61.601   0.764
      SFCFusion     7.108   39.558   2.786   0.361     0.430   0.601    65.157   0.754
      GANMcC        7.241   42.896   3.273   0.337     0.367   0.419    62.298   0.607
      IFCNN         7.116   38.864   3.253   0.331     0.395   0.551    66.538   0.677
      DATFuse       6.819   32.841   3.413   0.259     0.309   0.497    65.051   0.687
      SeAFusion     7.391   46.821   3.313   0.220     0.314   0.510    64.295   0.701
      BTSFusion     7.119   39.799   2.558   0.225     0.276   0.457    66.703   0.548
      SwinFusion    7.046   43.539   3.348   0.289     0.365   0.470    63.705   0.688
      Ours          7.297   45.413   3.542   0.395     0.437   0.484    65.430   0.779
    • Table 2. Quantitative results of proposed method and nine other methods based on FLIR dataset

      Method        EN      SD       MI      FMI_dct   FMI_ω   Q^AB/F   PSNR     VIF
      PIAFusion     7.397   48.947   3.249   0.278     0.325   0.515    64.811   0.623
      SOSMaskFuse   7.491   50.824   3.556   0.373     0.453   0.456    63.945   0.677
      SFCFusion     7.394   44.950   2.361   0.383     0.469   0.544    66.170   0.674
      GANMcC        7.189   40.881   2.852   0.368     0.398   0.312    63.153   0.463
      IFCNN         7.374   45.333   2.661   0.374     0.416   0.513    65.807   0.571
      DATFuse       6.056   37.870   3.287   0.276     0.417   0.469    64.443   0.644
      SeAFusion     7.423   50.037   2.739   0.254     0.290   0.468    63.304   0.493
      BTSFusion     7.397   45.925   2.201   0.260     0.265   0.423    65.555   0.401
      SwinFusion    7.242   50.570   2.963   0.351     0.367   0.454    63.141   0.580
      Ours          7.387   50.175   3.374   0.463     0.493   0.485    66.670   0.697
    • Table 3. Quantitative comparison results of ablation experiments for two salient object detection networks

      Method        EN      SD       MI      FMI_dct   FMI_ω   Q^AB/F   PSNR     VIF
      U2-Net        7.301   46.015   3.586   0.429     0.461   0.556    62.835   0.731
      RepU2-Net     7.305   46.165   3.615   0.430     0.464   0.558    62.826   0.729
    • Table 4. Quantitative comparison results of different module ablation experiments in the second stage

      Method        EN      SD       MI      FMI_dct   FMI_ω   Q^AB/F   PSNR     VIF
      Net A         7.283   45.778   3.576   0.417     0.451   0.552    63.291   0.725
      Net B         7.292   45.026   3.574   0.415     0.459   0.548    62.798   0.724
      Net C         7.305   46.165   3.615   0.430     0.464   0.558    62.826   0.729
    • Table 5. Quantitative comparison results of ablation experiments for MGRM attention mechanism

      Method        EN      SD       MI      FMI_dct   FMI_ω   Q^AB/F   PSNR     VIF
      Module A      7.259   45.118   3.613   0.425     0.460   0.556    63.162   0.721
      Module B      7.305   46.165   3.615   0.430     0.464   0.558    62.826   0.729
    • Table 6. Quantitative comparison results of loss function parameter

      Method        EN      SD       MI      FMI_dct   FMI_ω   Q^AB/F   PSNR     VIF
      α=1           7.284   45.654   3.577   0.427     0.461   0.554    62.850   0.721
      α=10          7.283   45.364   3.563   0.427     0.462   0.551    62.569   0.715
      α=100         7.305   46.165   3.615   0.430     0.464   0.558    62.826   0.729
      α=1000        7.290   45.754   3.602   0.429     0.463   0.557    62.919   0.726
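
The tables above report entropy (EN), standard deviation (SD), and peak signal-to-noise ratio (PSNR), among other metrics. This page does not give the paper's formulas, so the following is only a minimal sketch using the textbook definitions. It assumes 8-bit grayscale images passed as flat lists of pixel values, and it assumes the fusion-style PSNR that averages the mean squared error against the two source images. The function names are hypothetical, and absolute values will not necessarily match the tables, since the paper's normalization is not stated here.

```python
import math
from collections import Counter


def entropy(pixels):
    """EN: Shannon entropy (in bits) of the gray-level histogram."""
    n = len(pixels)
    counts = Counter(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def standard_deviation(pixels):
    """SD: population standard deviation of pixel intensities."""
    n = len(pixels)
    mean = sum(pixels) / n
    return math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)


def mse(a, b):
    """Mean squared error between two equally sized images."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def fusion_psnr(fused, infrared, visible, peak=255):
    """PSNR in dB, with the MSE averaged over both source images.

    Assumption: this averaging convention is common in image fusion
    evaluation, but it is not confirmed to be the paper's definition.
    """
    avg_mse = 0.5 * (mse(fused, infrared) + mse(fused, visible))
    if avg_mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / avg_mse)


if __name__ == "__main__":
    # Toy 2x2 "images", flattened row by row.
    fused = [0, 0, 255, 255]
    infrared = [0, 10, 250, 255]
    visible = [5, 0, 255, 240]
    print(entropy(fused), standard_deviation(fused))
    print(fusion_psnr(fused, infrared, visible))
```

Higher EN and SD indicate more information content and contrast in the fused image, and higher PSNR indicates less distortion relative to the sources, which is how the columns in the tables are usually read.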
    Citation

    Xiaodong Zhang, Dianwei Zhang, Yuanyuan Li, Shanshan Peng, Long Zhang. Mask-Guided Two-Stage Infrared and Visible Image Fusion Network[J]. Acta Optica Sinica, 2025, 45(15): 1510007
    Paper Information

    Category: Image Processing

    Received: Apr. 8, 2025

    Accepted: May 12, 2025

    Published Online: Aug. 15, 2025

    Author Email: Dianwei Zhang (z23070050@s.upc.edu.cn)

    DOI:10.3788/AOS250859

    CSTR:32393.14.AOS250859
