Acta Optica Sinica, Volume 45, Issue 15, 1510005 (2025)

Polarization-Aware Dual-Encoder Network for Image Dehazing

Jing Wu1,2, Rong Luo1,2, Feng Huang1,2,*, Zhewei Liu1,2, and Yunyi Chen1,2
Author Affiliations
  • 1School of Mechanical Engineering and Automation, Fuzhou University, Fuzhou 350116, Fujian, China
  • 2Institute of Advanced Technology Innovation, Fuzhou University, Fuzhou 350116, Fujian, China

    Objective

    Image dehazing represents a crucial research direction in low-level vision, aimed at restoring visibility and details in hazy images. This capability holds significant importance for applications including autonomous driving, surveillance systems, and remote sensing. While deep learning-based single-image dehazing algorithms have demonstrated notable advances in recent years, they continue to face adaptability challenges when processing real-world hazy scenes characterized by complex lighting conditions and diverse haze distributions. Traditional polarization-based dehazing methods demonstrate effectiveness in complex hazy environments, yet they frequently overlook the polarization degree of transmitted light and exhibit limited adaptability to global illumination changes, constraining their practical performance. Consequently, developing a more effective and adaptive image dehazing method that maximizes polarization information benefits while addressing existing methodological limitations remains essential.
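    To make the limitation concrete: the classic two-angle polarization dehazing model assumes the transmitted scene light is unpolarized and attributes all measured polarization to the airlight. A minimal sketch of that model is given below (Python/NumPy, with illustrative variable names; p_inf and a_inf are assumed to be estimated from a sky region, and none of this code comes from the paper).

        import numpy as np

        def classic_polarization_dehaze(i_max, i_min, p_inf, a_inf, eps=1e-6):
            """Classic two-angle polarization dehazing sketch (Schechner-style).

            i_max, i_min : images through a polarizer at the orientations that
                           maximize / minimize the observed airlight.
            p_inf        : degree of polarization of the airlight (0-1).
            a_inf        : airlight radiance at infinite distance.

            The transmitted light is assumed unpolarized, which is exactly the
            simplification criticized in the Objective above.
            """
            i_total = i_max + i_min                          # total observed intensity
            a_hat = (i_max - i_min) / (p_inf + eps)          # estimated airlight component
            t_hat = np.clip(1.0 - a_hat / (a_inf + eps), eps, 1.0)   # transmission map
            return np.clip((i_total - a_hat) / t_hat, 0.0, 1.0)      # recovered scene radiance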

    Methods

    This paper addresses these challenges by introducing a polarization-aware dual-encoder dehazing network that utilizes scene polarization information for image restoration. The network adopts a dual-encoder architecture (Fig. 1) consisting of two parallel branches: a convolutional neural network (CNN) encoding branch and a Transformer encoding branch. The CNN encoding branch captures local details and texture information, while the Transformer encoding branch models long-range dependencies and global contextual information. Within the CNN branch, a multi-angular polarization aggregation (MAPA) module embeds distinct position encodings into the multi-angular polarization information and compresses it, after which a dynamic large kernel (DLK) module extracts multi-scale local polarization features. The Transformer branch employs a retentive meet transformer (RMT) module to extract multi-scale global features and to integrate fine local features from the CNN branch, enhancing the Transformer module's local representation capability. An adaptive dynamic feature fusion (ADFF) module dynamically fuses features from different levels. The architecture concludes with a Transformer decoder that globally decodes the multi-level feature layers, progressively upsampling the resolution and restoring image details to produce the dehazed result.
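    A highly simplified PyTorch skeleton of the dual-encoder layout described above is sketched below. It assumes the four polarization-angle images are stacked into a 4-channel input, and all blocks are generic stand-ins (plain convolutions and one attention layer) rather than the paper's MAPA, DLK, RMT, or ADFF implementations.

        import torch
        import torch.nn as nn

        class DualEncoderDehazeSketch(nn.Module):
            """Structural sketch only: CNN branch + Transformer branch + fusion + residual decoder."""

            def __init__(self, in_ch=4, base_ch=32):
                super().__init__()
                # CNN branch: stand-in for MAPA aggregation + DLK large-kernel local features
                self.cnn_branch = nn.Sequential(
                    nn.Conv2d(in_ch, base_ch, 3, padding=1), nn.GELU(),
                    nn.Conv2d(base_ch, base_ch, 7, padding=3, groups=base_ch), nn.GELU(),
                )
                # Transformer branch: stand-in for the RMT global-context blocks
                self.embed = nn.Conv2d(in_ch, base_ch, kernel_size=4, stride=4)   # patchify
                self.attn = nn.TransformerEncoderLayer(d_model=base_ch, nhead=4,
                                                       batch_first=True)
                self.unpatch = nn.Upsample(scale_factor=4, mode="bilinear",
                                           align_corners=False)
                # Fusion: stand-in for ADFF (here a plain 1x1 conv over concatenated features)
                self.fuse = nn.Conv2d(base_ch * 2, base_ch, 1)
                # Decoder: predicts a residual added back onto the hazy input
                self.decode = nn.Conv2d(base_ch, in_ch, 3, padding=1)

            def forward(self, x):                    # x: (B, 4, H, W), H and W divisible by 4
                local_feat = self.cnn_branch(x)      # fine local details and texture
                tok = self.embed(x)                  # (B, C, H/4, W/4)
                b, c, h, w = tok.shape
                tok = self.attn(tok.flatten(2).transpose(1, 2))            # long-range context
                global_feat = self.unpatch(tok.transpose(1, 2).reshape(b, c, h, w))
                fused = self.fuse(torch.cat([local_feat, global_feat], dim=1))
                return x + self.decode(fused)        # residual restoration

        # Shape check on a dummy 4-angle stack:
        # DualEncoderDehazeSketch()(torch.randn(1, 4, 256, 256)).shape  # -> (1, 4, 256, 256)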

    Results and Discussions

    The experimental results demonstrate the algorithm's superior performance in both objective evaluation metrics and visual quality. For comprehensive evaluation, this study employs full-reference metrics including peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and visibility index (VI), alongside no-reference metrics such as the natural image quality evaluator (NIQE) and the blind/referenceless image spatial quality evaluator (BRISQUE), with assessments conducted across multiple datasets (Table 1). On the IHP dataset, the method achieves the best performance, with PSNR, SSIM, and VI values of 22.54 dB, 0.8379, and 0.8669, respectively, surpassing the second-best results by 0.85 dB, 0.012, and 0.0161. On the Cityscapes-DBF dataset, the method achieves the highest PSNR, exceeding FocalNet by 1.05 dB and ConvIR by 2.64 dB; because this synthetic dataset uses coarse semantic segmentation maps for probabilistic haze filling, the SSIM performance is only average. In outdoor real-world scenes, the method obtains NIQE and BRISQUE scores of 12.83 and 40.28, ranking first and second, respectively. Visually, the algorithm effectively removes haze while preserving crucial scene details and color information (Figs. 6–8). Extensive ablation studies, documented in Tables 2–4 and Fig. 9, confirm the effectiveness of the dual-encoder structure and key modules, showing decreased performance when single-branch structures are used or key components are removed.
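    For reference, two of the full-reference metrics reported in Table 1 (PSNR and SSIM) can be computed with scikit-image as sketched below, assuming float images in [0, 1]; VI, NIQE, and BRISQUE require dedicated implementations and are not shown here.

        from skimage.metrics import peak_signal_noise_ratio, structural_similarity

        def full_reference_scores(dehazed, ground_truth):
            """PSNR (dB) and SSIM between a dehazed result and its haze-free reference.
            Both arguments are float arrays in [0, 1] with shape (H, W, 3)."""
            psnr = peak_signal_noise_ratio(ground_truth, dehazed, data_range=1.0)
            ssim = structural_similarity(ground_truth, dehazed, data_range=1.0,
                                         channel_axis=-1)
            return psnr, ssim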

    Conclusions

    This paper presents an advanced deep learning-based dehazing method for polarized images. The research introduces three innovative modules: MAPA, DLK, and ADFF, integrating CNN and Transformer features to construct a dual-encoder–single-decoder polarization dehazing network that estimates residuals for image restoration from four polarization states. This approach extends polarization-based dehazing applications, operating independently of prior knowledge while utilizing semantic and contextual information to address spatially varying scattering phenomena. The method demonstrates state-of-the-art performance on the IHP and Cityscapes-DBF datasets, with proven robustness in real outdoor environments. Current limitations include reduced effectiveness in processing hazy images containing sky regions, due to training dataset constraints. Future research will focus on dataset enrichment to enhance algorithm performance across diverse environments.
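    As an illustration of the four-polarization-state input mentioned above, the helper below (an illustrative sketch, not the paper's code) stacks images captured at 0°, 45°, 90°, and 135° into a 4-channel array and derives the Stokes parameters and degree of linear polarization that polarization-aware dehazing relies on.

        import numpy as np

        def polarization_stack(i0, i45, i90, i135, eps=1e-6):
            """Build a 4-channel input and auxiliary polarization quantities from
            four polarizer-angle images (illustrative helper only)."""
            x = np.stack([i0, i45, i90, i135], axis=0)      # (4, H, W) network input
            s0 = 0.5 * (i0 + i45 + i90 + i135)              # total intensity (Stokes S0)
            s1 = i0 - i90                                    # Stokes S1
            s2 = i45 - i135                                  # Stokes S2
            dolp = np.sqrt(s1**2 + s2**2) / (s0 + eps)       # degree of linear polarization
            return x, s0, dolp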


    Get Citation


    Jing Wu, Rong Luo, Feng Huang, Zhewei Liu, Yunyi Chen. Polarization-Aware Dual-Encoder Network for Image Dehazing[J]. Acta Optica Sinica, 2025, 45(15): 1510005

    Paper Information

    Category: Image Processing

    Received: Apr. 7, 2025

    Accepted: May 8, 2025

    Published Online: Aug. 8, 2025

    The Author Email: Feng Huang (huangf@fzu.edu.cn)

    DOI: 10.3788/AOS250848

    CSTR: 32393.14.AOS250848
