Infrared and Laser Engineering, Vol. 54, Issue 8, 20250210 (2025)

Infrared and visible image fusion based on cross-modal feature interaction and multi-scale reconstruction

Rui YAO*, Kai WANG, Haofan GUO, Wentao HU, and Xiangrui TIAN
Author Affiliations
  • School of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
    Figures & Tables(11)
    Network structure of the proposed algorithm: convolutional attention enhancement module (CEM), encoder network (Encoder), cross-modal feature interactive fusion module (CFIM), and decoder network based on multi-scale reconstruction (Decoder)
    Structure of the CFIM
    Comparison of fusion results on the TNO dataset
    Comparison of fusion results on the LLVIP dataset
    Comparison of fusion results on the MSRS dataset
    • Table 1. Encoder network and decoder network parameter table

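      Network   Layer     Size   Stride   Input   Output   Activation
      Encoder   Conv0     3×3    1        1       16       ReLU
      Encoder   E_Conv1   3×3    1        16      16       ReLU
      Encoder   E_Conv2   3×3    2        16      32       ReLU
      Encoder   E_Conv3   3×3    2        32      64       ReLU
      Encoder   E_Conv4   3×3    2        64      128      ReLU
      Decoder   D_Conv1   3×3    1        256     256      ReLU
      Decoder   D_Conv2   3×3    1        128     128      ReLU
      Decoder   D_Conv3   3×3    1        64      64       ReLU
      Decoder   D_Conv4   3×3    1        32      1        ReLU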
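
The encoder rows of Table 1 can be checked with a few lines of arithmetic. The sketch below is illustrative only and rests on assumptions that the table does not state: the Input and Output columns are read as channel counts, the five encoder layers are applied one after another, padding is 1, and every convolution has a bias term. The names `ENCODER`, `PADDING`, `conv_out_size`, `conv_params`, and `trace` are hypothetical helpers, not part of the authors' code. The decoder is left out because its channel counts (256, 128, 64, 32) do not form a simple chain.

```python
# Encoder layer specs transcribed from Table 1:
# (name, kernel size, stride, input channels, output channels).
ENCODER = [
    ("Conv0",   3, 1,  1,  16),
    ("E_Conv1", 3, 1, 16,  16),
    ("E_Conv2", 3, 2, 16,  32),
    ("E_Conv3", 3, 2, 32,  64),
    ("E_Conv4", 3, 2, 64, 128),
]

PADDING = 1  # assumption: Table 1 does not list the padding


def conv_out_size(size, kernel, stride, padding=PADDING):
    """Standard convolution output size: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_params(kernel, c_in, c_out, bias=True):
    """Weight count k*k*c_in*c_out, plus one bias per output channel (assumed)."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def trace(layers, height, width):
    """Return (name, channels, height, width, params) after each layer in order."""
    rows = []
    for name, k, s, c_in, c_out in layers:
        height = conv_out_size(height, k, s)
        width = conv_out_size(width, k, s)
        rows.append((name, c_out, height, width, conv_params(k, c_in, c_out)))
    return rows
```

Calling `trace(ENCODER, 256, 256)` with a hypothetical 256×256 single-channel input shows the three stride-2 layers halving the spatial size each time, which is what yields feature maps at several scales for the decoder to reconstruct from.
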
    • Table 2. Objective index results of fused images on TNO dataset (Black bold: best, underline: second)

      Method      AG     EI      MI     PSNR    SD      SF     SSIM   VIF
      GANMcC      2.44   26.53   0.60   27.87   33.43   5.38   0.80   0.12
      SwinFuse    3.91   49.43   0.53   27.89   52.83   8.37   0.53   0.10
      U2 Fusion   4.21   51.53   1.35   27.98   37.19   8.42   1.31   0.79
      LapH        4.01   50.77   1.30   27.65   44.74   8.21   1.28   0.83
      MUFusion    4.21   56.11   0.45   27.90   47.47   7.75   0.53   0.12
      CMRFusion   3.41   39.11   2.52   27.82   36.66   7.44   1.30   0.87
      TUFusion    1.86   21.06   0.52   27.00   28.93   3.85   0.91   0.10
      Ours        4.77   66.35   1.91   27.92   63.28   9.92   1.08   1.01
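
Three of the no-reference columns in Table 2 (SD, SF, and AG, read here as standard deviation, spatial frequency, and average gradient) have short closed-form definitions. The sketch below uses definitions that are common in the image fusion literature. The exact formulas and normalizations used in the paper are not shown on this page, so treat the function names and the normalization choices as assumptions. Images are plain nested lists of grey levels, at least 2×2.

```python
import math


def std_dev(img):
    """SD: population standard deviation of all pixel intensities."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))


def spatial_frequency(img):
    """SF = sqrt(RF^2 + CF^2), from horizontal (RF) and vertical (CF) differences.

    Assumption: both squared-difference sums are normalized by the pixel count h*w.
    """
    h, w = len(img), len(img[0])
    rf_sq = sum((img[i][j] - img[i][j - 1]) ** 2
                for i in range(h) for j in range(1, w)) / (h * w)
    cf_sq = sum((img[i][j] - img[i - 1][j]) ** 2
                for i in range(1, h) for j in range(w)) / (h * w)
    return math.sqrt(rf_sq + cf_sq)


def average_gradient(img):
    """AG: mean of sqrt((dx^2 + dy^2) / 2) over forward differences.

    Assumption: averaged over the (h-1)*(w-1) positions where both differences exist.
    """
    h, w = len(img), len(img[0])
    total = 0.0
    for i in range(h - 1):
        for j in range(w - 1):
            dx = img[i][j + 1] - img[i][j]
            dy = img[i + 1][j] - img[i][j]
            total += math.sqrt((dx * dx + dy * dy) / 2)
    return total / ((h - 1) * (w - 1))
```

All three are zero for a constant image and grow with contrast and edge content, which is why higher values in these columns are read as better.
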
    • Table 3. Objective index results of fused images on LLVIP dataset (Black bold: best, underline: second)

      Method      AG     EI      MI     PSNR    SD      SF     SSIM   VIF
      GANMcC      1.60   19.91   0.58   27.99   31.24   4.19   0.91   0.24
      SwinFuse    1.07   17.00   0.48   28.28   26.52   3.63   0.58   0.13
      U2 Fusion   1.82   26.37   1.64   28.64   29.46   4.91   1.34   0.70
      LapH        2.35   37.73   1.58   28.65   44.35   5.67   1.35   1.02
      MUFusion    1.95   27.73   0.50   27.79   33.58   4.74   0.78   0.24
      CMRFusion   1.89   28.00   1.91   28.12   33.93   4.87   1.11   0.70
      TUFusion    1.12   12.66   0.57   26.77   29.60   2.85   0.97   0.22
      Ours        2.16   38.23   2.13   28.75   48.55   5.68   1.10   1.09
    • Table 4. Objective index of fused images on MSRS dataset (Black bold: best, underline: second)

      Method      AG     EI      MI     PSNR    SD      SF     SSIM   VIF
      GANMcC      1.64   19.70   0.49   28.20   26.55   3.97   0.87   0.11
      SwinFuse    1.00   16.34   0.30   28.59   28.47   3.18   0.49   0.05
      U2 Fusion   1.46   19.22   1.36   27.24   18.56   3.78   0.98   0.43
      LapH        2.47   37.19   1.53   28.49   40.75   5.52   1.30   0.78
      MUFusion    2.04   29.61   0.32   27.88   27.00   4.65   0.76   0.09
      CMRFusion   2.07   31.53   3.56   31.65   42.96   4.87   1.43   1.02
      TUFusion    1.39   16.62   0.47   26.78   26.13   3.50   0.86   0.09
      Ours        2.29   40.96   2.41   28.65   56.20   5.70   0.93   0.95
    • Table 5. CFIMRFusion ablation experiment results (black bold represents the highest value)

      Dataset   Methods         AG     EI      SD      SF     VIF
      TNO       CEM+CFIM        3.12   40.79   43.62   6.15   0.65
      TNO       CEM+JCDE        3.71   45.77   44.39   7.69   0.89
      TNO       CFIM+JCDE       4.56   58.56   52.86   9.45   0.90
      TNO       CEM+CFIM+JCDE   4.77   66.35   63.28   9.92   1.01
      LLVIP     CEM+CFIM        1.85   27.92   36.50   4.76   0.98
      LLVIP     CEM+JCDE        1.86   28.51   37.34   4.87   0.98
      LLVIP     CFIM+JCDE       2.08   34.31   44.54   5.45   1.05
      LLVIP     CEM+CFIM+JCDE   2.16   38.23   48.55   5.68   1.09
      MSRS      CEM+CFIM        2.04   31.39   41.93   4.81   0.91
      MSRS      CEM+JCDE        2.09   32.00   42.87   4.96   0.94
      MSRS      CFIM+JCDE       2.22   35.92   48.30   5.42   0.94
      MSRS      CEM+CFIM+JCDE   2.29   40.96   56.20   5.70   0.95
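
One way to read the ablation in Table 5 is as the percentage by which the full model (CEM+CFIM+JCDE) exceeds each two-module variant, so that the contribution of the missing module shows up directly. The sketch below does this for the TNO rows only. The values are transcribed from the table, while the names `METRICS`, `TNO`, and `gain_over` are hypothetical and the percentage view is an illustration, not an analysis taken from the paper.

```python
# TNO rows transcribed from Table 5: variant -> (AG, EI, SD, SF, VIF).
METRICS = ("AG", "EI", "SD", "SF", "VIF")
TNO = {
    "CEM+CFIM":      (3.12, 40.79, 43.62, 6.15, 0.65),
    "CEM+JCDE":      (3.71, 45.77, 44.39, 7.69, 0.89),
    "CFIM+JCDE":     (4.56, 58.56, 52.86, 9.45, 0.90),
    "CEM+CFIM+JCDE": (4.77, 66.35, 63.28, 9.92, 1.01),
}


def gain_over(variant, full="CEM+CFIM+JCDE", table=TNO):
    """Percent by which the full model exceeds an ablated variant, per metric."""
    return {
        metric: 100.0 * (f - v) / v
        for metric, f, v in zip(METRICS, table[full], table[variant])
    }
```

For example, `gain_over("CEM+CFIM")` isolates what the JCDE decoder adds on TNO, and `gain_over("CFIM+JCDE")` does the same for CEM.
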
    • Table 6. Comparison of average running time of different algorithms on three datasets

      Method        TNO/s    LLVIP/s   MSRS/s
      GANMcC        4.8812   22.9348   5.2128
      SwinFuse      5.1853   17.6771   5.3423
      U2 Fusion     4.2032   11.8176   2.5972
      LapH          2.0856   24.5213   1.9659
      MUFusion      5.2808   33.3692   4.3691
      CMRFusion     8.4142   42.8038   8.3331
      TUFusion      3.8353   3.9438    3.8626
      CFIMRFusion   0.5029   2.8176    0.4963
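
The running times in Table 6 are easier to compare as speedup ratios. The sketch below transcribes the table and computes, per dataset, how many times faster CFIMRFusion is than a chosen baseline, as well as which baseline is fastest. The names `DATASETS`, `RUNTIME`, `speedup`, and `fastest_baseline` are hypothetical helpers introduced for illustration.

```python
# Average running time in seconds, transcribed from Table 6:
# method -> (TNO, LLVIP, MSRS).
DATASETS = ("TNO", "LLVIP", "MSRS")
RUNTIME = {
    "GANMcC":      (4.8812, 22.9348, 5.2128),
    "SwinFuse":    (5.1853, 17.6771, 5.3423),
    "U2 Fusion":   (4.2032, 11.8176, 2.5972),
    "LapH":        (2.0856, 24.5213, 1.9659),
    "MUFusion":    (5.2808, 33.3692, 4.3691),
    "CMRFusion":   (8.4142, 42.8038, 8.3331),
    "TUFusion":    (3.8353, 3.9438, 3.8626),
    "CFIMRFusion": (0.5029, 2.8176, 0.4963),
}


def speedup(baseline, ours="CFIMRFusion"):
    """Baseline time divided by our time, per dataset (values > 1 mean ours is faster)."""
    return {
        dataset: b / o
        for dataset, b, o in zip(DATASETS, RUNTIME[baseline], RUNTIME[ours])
    }


def fastest_baseline(dataset, ours="CFIMRFusion"):
    """Name of the quickest competing method on one dataset."""
    idx = DATASETS.index(dataset)
    return min((m for m in RUNTIME if m != ours), key=lambda m: RUNTIME[m][idx])
```

Comparing against `fastest_baseline(dataset)` rather than the slowest method gives the most conservative estimate of the speed advantage on each dataset.
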

    Rui YAO, Kai WANG, Haofan GUO, Wentao HU, Xiangrui TIAN. Infrared and visible image fusion based on cross-modal feature interaction and multi-scale reconstruction[J]. Infrared and Laser Engineering, 2025, 54(8): 20250210

    Paper Information

    Category: Optical imaging, display and information processing

    Received: Apr. 3, 2025

    Accepted: --

    Published Online: Aug. 29, 2025

    The Author Email: Rui YAO (yaorui@nuaa.edu.cn)

    DOI:10.3788/IRLA20250210
