Acta Photonica Sinica, Vol. 54, Issue 1, 0110003 (2025)

Spatial-spectral Collaborative Unrolling Network for Pansharpening

Jianwei ZHENG, Hongyi XIA, and Honghui XU*
Author Affiliations
  • School of Computer Science and Technology,Zhejiang University of Technology,Hangzhou 310023,China

    Pansharpening offers a computational way around the limitations of physical acquisition devices. It enhances the spatial resolution of Low-Resolution Multispectral (LRMS) images by integrating textural information from Panchromatic (PAN) images, thereby generating High-Resolution Multispectral (HRMS) images. Recently, a growing number of Deep Learning (DL)-based methods, with their stronger feature extraction capabilities, have markedly improved fusion quality. However, many of these methods still exhibit two notable shortcomings. First, the widely adopted black-box design limits model interpretability. Second, existing DL-based methods fail to capture local and global dependencies efficiently at the same time, which limits overall performance. By combining the merits of nonlinear network architectures and interpretable optimization schemes, the Deep Unfolding Network (DUN) has shed new light on pansharpening. However, current DUNs lack a dedicated design both for estimating the degradation matrices and for extracting intricate information in the proximal operator. To address these issues, we propose a novel Spatial-Spectral Collaborative Unrolling Network (SCUN). The resulting model is solved by alternating optimization based on Half-Quadratic Splitting (HQS), which gives rise to an elementary iteration mechanism. Guided by iterative optimization theory, the network achieves Adaptive Degradation Matrix Estimation (ADME) and spatial-spectral prior operator learning through multi-scale cascade strategies, point convolution operations, and Transformer technology. In the ADME step, the overall estimation is carried out by an end-to-end iterative block, allowing adaptive modeling of complex spatial and spectral structures.
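The alternating structure of HQS can be illustrated on a toy problem. The following is a minimal sketch, not the SCUN model: it assumes a hypothetical elementwise degradation `d`, a sparsity (L1) prior whose proximal step is soft thresholding, and a fixed penalty weight `mu`. In an unrolled network of the kind described above, each pass of the loop would correspond to one stage, with the degradation operators and the proximal step replaced by learned modules.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v| (illustrative stand-in for a learned prior)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def hqs(y, d, lam=0.5, mu=1.0, iters=60):
    """Toy HQS for min_x 0.5 * sum((y_i - d_i * x_i)^2) + lam * sum(|x_i|).

    Introducing an auxiliary variable z with a quadratic penalty on (x - z)
    gives two alternating sub-problems:
      data step : x = argmin 0.5*(y - d*x)^2 + 0.5*mu*(x - z)^2  (closed form)
      prior step: z = prox of (lam/mu)*|.| evaluated at x        (soft threshold)
    All names and default values are assumptions made for illustration.
    """
    z = [0.0] * len(y)
    x = z[:]
    for _ in range(iters):
        # Data-fidelity step: per-element closed-form least-squares update.
        x = [(di * yi + mu * zi) / (di * di + mu) for yi, di, zi in zip(y, d, z)]
        # Prior step: proximal operator applied to the current estimate.
        z = [soft_threshold(xi, lam / mu) for xi in x]
    return x, z


if __name__ == "__main__":
    x, z = hqs(y=[3.0, 0.2, -2.0], d=[1.0, 1.0, 1.0])
    print([round(v, 3) for v in x], [round(v, 3) for v in z])
```

With a fixed `mu`, `x` and `z` do not coincide at convergence; gradually increasing `mu` across iterations is a common way to tighten the coupling between them.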
On that basis, we employ customized multiscale convolution and point convolution to simulate the degradation processes described by the spatial and spectral degradation matrices. Moreover, the proposed convolution is reassigned in each unfolding iteration, which makes it highly adaptive. To address the limitations of prior operators, we propose a collaborative complementary mechanism that approximates the operators and jointly explores global-local and spatial-spectral features through a combination of convolutional layers and attention mechanisms. The entire prior module is designed as a U-shaped network that follows the process of “embedding-encoder-bottleneck layer-decoder-deembedding” to extract refined feature representations. Initially, the intermediate variables pass through an embedding layer, which segments them into non-overlapping patch tokens. These patch tokens are then fed into two Spatial-Spectral Collaborative Modules (SSCMs) and a bottleneck layer consisting of a single SSCM to explore comprehensive properties. Each SSCM is composed of three key components: Spatial-Spectral Collaborative Attention (SSCA), Scale-Aware Channel Collaboration (SACC), and a Mixed-Scale Feed-forward Layer (MSFL). Specifically, the SSCA component includes two Transformer blocks. The first is the Spatial Transformer Block, which primarily transfers high-frequency texture features from PAN images to HRMS images. The second is the Spectral Transformer Block, which focuses on transferring spectral features from LRMS images to HRMS images. After these two attention features are extracted, a multi-head self-attention mechanism is applied to deeply fuse the spatial and spectral information, thereby enhancing the collaboration and complementarity of the target information.
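The difference between attending over spatial positions and attending over spectral bands can be made concrete with a small sketch. The code below is only illustrative and is not the SSCA design: it assumes projection-free, single-head scaled dot-product self-attention on a feature map stored as a list of N pixels, each a list of C channel values, and all function names are hypothetical. Spatial attention treats pixels as tokens and yields an N x N weight map; spectral attention treats channels as tokens and yields a C x C weight map.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax_rows(a):
    """Numerically stable softmax applied to each row."""
    out = []
    for row in a:
        m = max(row)
        e = [math.exp(v - m) for v in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    dim = len(tokens[0])
    scores = matmul(tokens, transpose(tokens))
    scores = [[v / math.sqrt(dim) for v in row] for row in scores]
    weights = softmax_rows(scores)
    return matmul(weights, tokens), weights


def spatial_attention(feat):
    """Tokens are pixels: the weight map relates every pixel to every pixel."""
    return self_attention(feat)


def spectral_attention(feat):
    """Tokens are channels: the weight map relates every band to every band."""
    out_t, weights = self_attention(transpose(feat))
    return transpose(out_t), weights


if __name__ == "__main__":
    feat = [[0.1, 0.2, 0.3], [0.4, 0.1, 0.0], [0.3, 0.3, 0.9], [0.5, 0.6, 0.2]]
    _, w_spatial = spatial_attention(feat)
    _, w_spectral = spectral_attention(feat)
    print(len(w_spatial), len(w_spectral))  # sizes of the two weight maps
```

One reason to pair the two views is cost: the spatial weight map grows quadratically with the number of pixels, whereas the spectral one depends only on the number of channels.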
Within SACC, we dynamically aggregate and cross-fuse features from receptive fields of different sizes via multiscale convolution, while introducing channel attention to model the spectral dependency of multispectral images (MSIs). Similarly, to strengthen the nonlinear transformation of features produced by the attention layers, our MSFL adopts a mixed-scale strategy, followed by a cross-complementary mechanism that emphasizes the important components of the multiscale convolutions. With all modules assembled, the proposed network is the first attempt to systematically capture local-global and spatial-spectral information during model unfolding, yielding strong pansharpening performance. Experimental results on multiple remote sensing datasets demonstrate that the proposed method outperforms the compared methods, achieving a PSNR gain of 0.798 dB on the GF-2 dataset.
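To put the reported gain in context, the sketch below shows how PSNR is commonly computed and what a gain in dB implies for the mean squared error. It assumes images flattened to lists of values scaled to a peak of 1.0; the function names are illustrative, and the evaluation protocol of the paper may differ (for example in the peak value or in how results are averaged over bands and images).

```python
import math


def mse(ref, est):
    """Mean squared error between two equally long sequences of pixel values."""
    return sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)


def psnr(ref, est, peak=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    err = mse(ref, est)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


def mse_ratio_for_gain(gain_db):
    """MSE ratio (new / old) implied by a PSNR gain at a fixed peak value."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    print(round(mse_ratio_for_gain(0.798), 3))
```

Under these assumptions, a 0.798 dB gain on a single image would correspond to an MSE ratio of about 0.83, that is, roughly a 17% reduction in mean squared error.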

    Jianwei ZHENG, Hongyi XIA, Honghui XU. Spatial-spectral Collaborative Unrolling Network for Pansharpening[J]. Acta Photonica Sinica, 2025, 54(1): 0110003

    Paper Information

    Received: Jul. 2, 2024

    Accepted: Sep. 2, 2024

    Published Online: Mar. 5, 2025

    The Author Email: XU Honghui (xhh@zjut.edu.cn)

    DOI:10.3788/gzxb20255401.0110003
