Laser & Optoelectronics Progress, Vol. 62, Issue 16, 1615001 (2025)

Super-Resolution Reconstruction and Denoising Tasks for Public Safety Scene Images Using the EnSwinIR Model

Qixiang Meng1, Fanliang Bu1,*, and Qiqi Kou2
Author Affiliations
  • 1School of Information Network Security, People's Public Security University of China, Beijing 100038, China
  • 2School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, Jiangsu, China

    Super-resolution reconstruction and denoising models facilitate the processing of low-resolution and noisy images in public safety scenarios. Although the SwinIR model has made notable progress in preserving image details and modeling features in complex scenes, it remains limited in gradient optimization stability, local feature extraction, and computational cost. To address these challenges, this study proposes EnSwinIR, an image super-resolution reconstruction model based on a hybrid residual attention module. For training, a perceptual loss function, NormRMSE, is designed to reduce the sensitivity of the original mean square error to the absolute magnitude of pixel values; by applying normalization and taking the square root, the function improves stability and learning efficiency. In the local feature extraction module, a four-directional shift convolution is introduced, with up, down, left, and right shifts. It reconstructs feature channels through displacement operations and thereby captures multidirectional contextual information. In addition, a skip connection combined with a residual module mitigates the vanishing-gradient problem often encountered in deep feature extraction. A grouped multi-scale self-attention method is incorporated in the later stages: input features are divided evenly along the channel dimension, and multi-scale sliding windows are applied to strike a balance between performance, parameter count, and computational complexity. The experimental results indicate that EnSwinIR significantly outperforms existing approaches in both performance metrics and visual quality on super-resolution reconstruction tasks. For 2× and 4× super-resolution reconstruction in multisource testing scenarios, the model achieves average peak signal-to-noise ratio (PSNR) gains of 1.9 dB and 2.9 dB, respectively, and average structural similarity index improvements of 0.029 and 0.050, respectively. Model complexity is also reduced: the number of parameters decreases by 27.52% and 27.26% for the 2× and 4× tasks, respectively, and the number of floating-point operations decreases by 30.72% and 35.65%, respectively. For denoising tasks on images with noise levels of 15, 25, and 50, the model also shows notable improvements in multisource testing scenarios, with average PSNR gains of 3.77, 3.24, and 3.81 dB, respectively. Images processed by EnSwinIR thus exhibit a more realistic visual appearance and better preservation of local details, demonstrating the model's potential for application in public safety scenarios.
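
    The four-directional shift described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the shift distance, the padding, or how channels are assigned to directions, so the one-pixel shift, zero filling, equal channel groups, and the names shift_channel and four_direction_shift are all assumptions made for illustration. In shift-convolution designs a pointwise convolution usually follows to mix the shifted channels; that step is omitted here.

```python
# Illustrative sketch only: one-pixel shifts, zero padding, and equal
# channel groups are assumptions, not details taken from the paper.

# (dy, dx) offsets for the four groups: up, down, left, right
DIRECTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def shift_channel(channel, dy, dx):
    """Shift a 2D map by (dy, dx) pixels, filling vacated cells with zeros."""
    height, width = len(channel), len(channel[0])
    shifted = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = channel[src_y][src_x]
    return shifted


def four_direction_shift(features):
    """Split channels into four equal groups and shift each group one way.

    `features` is a list of channels, each a 2D list (height x width).
    Channels left over when the count is not divisible by four are copied
    unchanged.
    """
    group_size = len(features) // len(DIRECTIONS)
    out = []
    for index, channel in enumerate(features):
        group = index // group_size if group_size else len(DIRECTIONS)
        if group < len(DIRECTIONS):
            dy, dx = DIRECTIONS[group]
            out.append(shift_channel(channel, dy, dx))
        else:
            out.append([row[:] for row in channel])
    return out


# Five identical 3x3 channels: four are shifted, the fifth is left as is.
grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
shifted = four_direction_shift([grid] * 5)
for name, channel in zip(["up", "down", "left", "right", "unchanged"], shifted):
    print(name, channel)
```

    After the shift, each spatial position holds values that originally belonged to its four neighbours in different channel groups, which is how a subsequent channel-mixing layer could see multidirectional context without a larger kernel.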

    Qixiang Meng, Fanliang Bu, Qiqi Kou. Super-Resolution Reconstruction and Denoising Tasks for Public Safety Scene Images Using the EnSwinIR Model[J]. Laser & Optoelectronics Progress, 2025, 62(16): 1615001

    Paper Information

    Category: Machine Vision

    Received: Dec. 5, 2024

    Accepted: Feb. 7, 2025

    Published Online: Aug. 18, 2025

    The Author Email: Fanliang Bu (20051257@ppsuc.edu.cn)

    DOI:10.3788/LOP242377

    CSTR:32186.14.LOP242377
