Multi-exposure Image Fusion Based on Attention Mechanism

Bendu BAI; Junpeng LI

doi:10.3788/gzxb20225104.0410004

Acta Photonica Sinica, Volume 51, Issue 4, 0410004 (2022)

Bendu BAI and Junpeng LI^*

Author Affiliations

School of Communication and Information Engineering, Xi'an University of Posts & Telecommunications, Xi'an 710121, China

Abstract

Humans perceive and understand the world mainly by acquiring useful information, and the visual system has always been an important channel for obtaining it. Driven by advances in digital information technology and the demands of human vision, imaging equipment has improved greatly in terms of image resolution and dynamic response range, and in recent years imaging and image processing technologies have played a vital role in many fields. However, images captured by traditional cameras record only a limited dynamic range. Because scenes are transient and unrepeatable, a target of interest cannot be captured again, and only the existing images can be processed. Reconstructing a high dynamic range image from low-quality images to improve the visual quality of a scene is therefore a key problem in computer vision with significant research value. This paper focuses on high dynamic range image reconstruction for improving the image quality of static scenes. Ground-truth fused images are not available for supervised learning, and existing multi-exposure image fusion methods suffer from loss of edge features and blurred details. To address these problems, we propose an attention-guided network for multi-exposure image fusion. First, a dual-channel U-Net with independent weights is built to extract features from the under-exposed and over-exposed images of the target scene, yielding multi-scale, high-dimensional feature maps that express texture information strongly. Then, a visual attention mechanism attends to the local details and global features of the under- and over-exposed images and generates a logical mask of the target region of interest; the mask is superimposed on the high-dimensional multi-scale feature maps to highlight target features and suppress non-target regions.
Finally, during reconstruction, the filtered high-dimensional multi-scale features are concatenated and passed through a dilation residual dense block, which makes full use of features from different levels, retains more detail from the low dynamic range images, and enlarges the receptive field to predict details in saturated regions. To reconstruct the fused image more accurately in this end-to-end network, multiple loss functions constrain training: the L2 norm serves as the content loss and SSIM as the structural loss. Together they keep the similarity difference between the source image sequence and the fused image small, help the model converge more accurately, and allow unsupervised learning. To verify the effectiveness of the proposed algorithm, images selected from the MEFB benchmark dataset are used as the test set. The test set includes indoor, outdoor, daytime, nighttime, and other static scenes, covering a wide range of real-world conditions. The proposed algorithm is compared subjectively with three traditional algorithms and two deep learning algorithms, and evaluated objectively with five fused-image quality metrics and the average running time. Ablation studies examine the effectiveness of the attention mechanism module and the effect of the loss function hyperparameter $λ$. The experimental results show that the proposed algorithm captures more detail and structural information from source image sequences of static scenes and produces fused images with clear scenes and salient features that better match human visual perception.
Compared with other typical algorithms, the proposed algorithm overcomes the shortcomings of traditional algorithms, which cannot learn features adaptively and require hand-crafted fusion rules. It also introduces an attention mechanism and a dilation residual dense block, which make it easier to predict the details and structural information of saturated and under-exposed areas, yielding more comprehensive, reliable, and abundant scene information with stronger robustness.
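
The loss design described in the abstract can be illustrated with a minimal, framework-free Python sketch. The abstract does not give the exact formula, so the following are assumptions: each term is averaged over the under- and over-exposed sources, the structural term is 1 − SSIM, the two terms are combined as structural + λ · content, and SSIM is computed over a single global window rather than the sliding windows a practical implementation would likely use. The function names (`global_ssim`, `content_loss`, `fusion_loss`) and the default `lam=1.0` are hypothetical, and images are represented as flat lists of intensities in [0, 1].

```python
from math import fsum

# Standard SSIM stabilizing constants for intensities in [0, 1].
C1 = 0.01 ** 2
C2 = 0.03 ** 2


def mean(values):
    """Arithmetic mean of a non-empty sequence of floats."""
    return fsum(values) / len(values)


def global_ssim(x, y):
    """SSIM over one global window (assumption: no sliding windows)."""
    mx, my = mean(x), mean(y)
    var_x = mean([(a - mx) ** 2 for a in x])
    var_y = mean([(b - my) ** 2 for b in y])
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    numerator = (2 * mx * my + C1) * (2 * cov + C2)
    denominator = (mx ** 2 + my ** 2 + C1) * (var_x + var_y + C2)
    return numerator / denominator


def content_loss(x, y):
    """Mean squared (L2) difference between two images."""
    return mean([(a - b) ** 2 for a, b in zip(x, y)])


def fusion_loss(fused, under, over, lam=1.0):
    """Assumed combination: structural + lam * content, averaged over sources.

    No ground-truth fused image is needed; the fused output is compared
    only with the two source exposures, which is what makes it unsupervised.
    """
    sources = (under, over)
    structural = mean([1.0 - global_ssim(fused, s) for s in sources])
    content = mean([content_loss(fused, s) for s in sources])
    return structural + lam * content
```

Calling `fusion_loss(fused, under, over, lam=...)` with different `lam` values shows how the hyperparameter shifts the balance between structural and content fidelity, which is the kind of trade-off the ablation study on $λ$ examines.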

Keywords

Attention mechanism; Convolution Neural Network; High-dynamic range image; Multi-exposure; Unsupervised

Bendu BAI, Junpeng LI. Multi-exposure Image Fusion Based on Attention Mechanism[J]. Acta Photonica Sinica, 2022, 51(4): 0410004
