Optics and Precision Engineering, Volume 33, Issue 7, 1152 (2025)

Infrared and visible image fusion based on multi-scale spatial attention complementary

Yongxing ZHANG1,2,3, Bowen LIAN1,2,3, Naiting GU4,*, Fangzhao LI4, and Yang LI1,2,*
Author Affiliations
  • 1National Key Laboratory of Adaptive Optics, Chengdu 610209, China
  • 2Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu 610209, China
  • 3University of Chinese Academy of Sciences, Beijing 100049, China
  • 4College of Frontier Interdisciplinary Sciences, National University of Defense Technology, Changsha 410073, China
    Figures & Tables (17)
    Overall framework of the progressive infrared and visible image fusion network based on multi-level differential complementary multi-scale spatial attention
    Multi-scale spatial attention multi-level complementary module (MSAMCM)
    Multi-scale spatial attention unit
    Comparative experimental results for sample 559 from the MSRS dataset, daytime scene (the red-framed area is a zoomed-in view of trees, and the green-framed area is a zoomed-in view of houses and windows in the fused image; color figures are shown in the PDF file)
    Comparative experimental results for sample 808 from the MSRS dataset, nighttime scene (the red-framed area is a zoomed-in view of pedestrians and notice boards, and the green-framed area is a zoomed-in view of manhole covers in the fused image; color figures are shown in the PDF file)
    Comparison of the MI index of different image fusion methods under different cumulative distributions on the MSRS dataset
    Comparison of the VIF index of different image fusion methods under different cumulative distributions on the MSRS dataset
    Comparative experimental results for sample 08835 from the RoadScene dataset (the red-framed and green-framed areas are zoomed-in views of the road and bicycles, and of pedestrians, respectively, in the fused image; color figures are shown in the PDF file)
    Comparative experimental results for sample 08749 from the RoadScene dataset (the red-framed and green-framed areas are zoomed-in views of the road and of the rear door of a car, respectively, in the fused image; color figures are shown in the PDF file)
    Comparative experimental results for sample 05027 from the RoadScene dataset (the red-framed and green-framed areas are zoomed-in views of the road and of a manhole cover, respectively, in the fused image; color figures are shown in the PDF file)
    Comparative experimental results for sample 06832 from the RoadScene dataset (the red-framed and green-framed areas are zoomed-in views of pedestrians and of a car as a whole, respectively, in the fused image; color figures are shown in the PDF file)
    Comparison of the MI index of different fusion methods under different cumulative distributions on the RoadScene dataset
    Comparison of the VIF index of different fusion methods under different cumulative distributions on the RoadScene dataset
    Ablation results of fused images in three typical scenes
    • Table 1. Kernel size, output channel, and activation function of all convolutional layers in the feature extractor and image reconstructor (a layer-by-layer sketch follows this list)

      Layer  | Feature extractor                         | Image reconstructor
             | Kernel size | Output channel | Activation | Kernel size | Output channel | Activation
      Conv 1 | 1×1         | 16             | LeakyReLU  | 3×3         | 256            | LeakyReLU
      Conv 2 | 3×3         | 16             | LeakyReLU  | 3×3         | 128            | LeakyReLU
      Conv 3 | 3×3         | 32             | LeakyReLU  | 3×3         | 64             | LeakyReLU
      Conv 4 | 3×3         | 64             | LeakyReLU  | 3×3         | 32             | LeakyReLU
      Conv 5 | 3×3         | 128            | LeakyReLU  | 1×1         | 1              | Tanh
    • Table 2. Comparison of mean and standard deviation of the MI and VIF indicators between existing methods and the proposed method on the MSRS dataset (a minimal MI sketch follows this list)

      Method    | MI              | VIF
      Densefuse | 2.3928 ± 0.4862 | 0.8879 ± 0.1760
      FusionGAN | 1.9790 ± 0.2801 | 0.4984 ± 0.1424
      IFCNN     | 1.7368 ± 0.5064 | 0.7315 ± 0.0617
      LRRNet    | 2.2642 ± 0.8077 | 0.4895 ± 0.0549
      SDNet     | 1.9159 ± 0.3316 | 0.5772 ± 0.0597
      PIAFusion | 3.3707 ± 1.0091 | 0.9993 ± 0.0673
      IVF-MSAC  | 3.5152 ± 1.0136 | 1.0514 ± 0.0822
    • Table 3. Comparison of mean and standard deviation of the MI and VIF indicators between existing methods and the proposed method on the RoadScene dataset

      Method    | MI              | VIF
      Densefuse | 2.5229 ± 0.4197 | 0.4797 ± 0.0408
      FusionGAN | 2.4926 ± 0.3063 | 0.3202 ± 0.0232
      IFCNN     | 2.3471 ± 0.3916 | 0.4733 ± 0.0386
      LRRNet    | 2.6081 ± 0.4117 | 0.4372 ± 0.0644
      SDNet     | 2.9535 ± 0.4323 | 0.4930 ± 0.0335
      PIAFusion | 3.5197 ± 0.3319 | 0.5689 ± 0.0777
      IVF-MSAC  | 3.6781 ± 0.3785 | 0.5823 ± 0.0855
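The convolutional configuration in Table 1 maps directly onto a small encoder/decoder pair. Below is a minimal PyTorch sketch of that configuration only; the input channel counts (a single-channel source image for the feature extractor, 256 fused channels for the image reconstructor) and the "same" padding are assumptions not stated in the table, and the multi-scale spatial attention complementary module that sits between the two networks is omitted.

```python
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch, k, act):
    """One 'Conv + activation' row of Table 1; 'same' padding is an assumption."""
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, k, padding=k // 2), act)


class FeatureExtractor(nn.Module):
    """Feature extractor column of Table 1: 1x1 -> 16, then 3x3 convs to 16, 32, 64, 128, all LeakyReLU."""

    def __init__(self, in_ch=1):  # single-channel input is an assumption
        super().__init__()
        cfg = [(1, 16), (3, 16), (3, 32), (3, 64), (3, 128)]  # (kernel size, output channels)
        layers, ch = [], in_ch
        for k, out_ch in cfg:
            layers.append(conv_block(ch, out_ch, k, nn.LeakyReLU(inplace=True)))
            ch = out_ch
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


class ImageReconstructor(nn.Module):
    """Image reconstructor column of Table 1: 3x3 convs to 256, 128, 64, 32 (LeakyReLU), then 1x1 to 1 channel (Tanh)."""

    def __init__(self, in_ch=256):  # 256 fused input channels is an assumption
        super().__init__()
        cfg = [(3, 256), (3, 128), (3, 64), (3, 32)]
        layers, ch = [], in_ch
        for k, out_ch in cfg:
            layers.append(conv_block(ch, out_ch, k, nn.LeakyReLU(inplace=True)))
            ch = out_ch
        layers.append(conv_block(ch, 1, 1, nn.Tanh()))  # Conv 5: 1x1 -> 1, Tanh
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


if __name__ == "__main__":
    ir = torch.rand(1, 1, 224, 224)    # stand-in infrared image
    vis = torch.rand(1, 1, 224, 224)   # stand-in visible image
    extractor, reconstructor = FeatureExtractor(), ImageReconstructor()
    fused_feat = torch.cat([extractor(ir), extractor(vis)], dim=1)  # placeholder for the fusion step
    print(reconstructor(fused_feat).shape)  # torch.Size([1, 1, 224, 224])
```

With these settings the 3×3 layers keep the spatial resolution (padding 1) and the 1×1 layers change only the channel count, so the reconstructed image has the same size as the inputs.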
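For the evaluation indicators in Tables 2 and 3, MI is mutual information and VIF is visual information fidelity between the fused image and the two source images. As a hedged illustration of how the MI figure is conventionally computed in fusion work (the 256-bin histogram and the sum over both source images are assumptions, not details taken from this paper):

```python
import numpy as np


def mutual_information(a, b, bins=256):
    """Histogram-based mutual information (in bits) between two grayscale images."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()            # joint probability
    px = pxy.sum(axis=1, keepdims=True)  # marginal of a
    py = pxy.sum(axis=0, keepdims=True)  # marginal of b
    nz = pxy > 0                         # avoid log(0)
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())


def fusion_mi(ir, vis, fused):
    """MI fusion metric: information the fused image shares with each source, summed."""
    return mutual_information(ir, fused) + mutual_information(vis, fused)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ir = rng.integers(0, 256, (256, 256)).astype(np.float64)   # stand-in infrared image
    vis = rng.integers(0, 256, (256, 256)).astype(np.float64)  # stand-in visible image
    fused = 0.5 * (ir + vis)                                   # trivial stand-in fusion
    print(round(fusion_mi(ir, vis, fused), 4))
```

VIF is considerably more involved (a multi-scale fidelity measure built on a natural-scene statistics model), so it is not sketched here.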
    Citation
    Yongxing ZHANG, Bowen LIAN, Naiting GU, Fangzhao LI, Yang LI. Infrared and visible image fusion based on multi-scale spatial attention complementary[J]. Optics and Precision Engineering, 2025, 33(7): 1152

    Paper Information

    Received: Dec. 31, 2024

    Accepted: --

    Published Online: Jun. 23, 2025

    Corresponding author email: Naiting GU (gnt7328@163.com)

    DOI: 10.37188/OPE.20253307.1152
