Acta Optica Sinica, Volume. 45, Issue 6, 0628008(2025)

Dense Hybrid Attention Network for Remote Sensing Building Change Detection

Qinglin Tian1,*, Donghua Lu1, Yao Li2, and Chengkai Pei1
Author Affiliations
  • 1National Key Laboratory of Uranium Resources Exploration-Mining and Nuclear Remote Sensing, Beijing Research Institute of Uranium Geology, Beijing 100029, China
  • 2School of Geographical Sciences, Southwest University, Chongqing 400715, China

    Objective

    As a key element in geographic information systems, building change detection plays a crucial role in evaluating land use, urban development, and disaster damage assessment. Over the past decade, many methods have been proposed for change detection, evolving from pixel-based to object-based approaches that incorporate contextual information. However, traditional methods often struggle with the complexities of high-resolution remote sensing imagery, particularly in challenging scenes, which limits their accuracy. With the advent of deep learning, especially deep convolutional neural networks (CNNs), change detection in remote sensing has seen significant improvements. Despite these advancements, deep learning-based methods still face challenges such as insufficient utilization of multi-scale information, weak feature representation, and inadequate suppression of pseudo-changes. To address these limitations, we propose a novel method for building change detection in high-resolution remote sensing images, leveraging a dense hybrid attention network (DHANet).

    Methods

    The proposed DHANet utilizes an encoder-decoder architecture. During the encoding phase, a Siamese ResNet network with shared weights extracts multi-level, multi-scale features from bi-temporal images. In addition, dilated convolution (DC) is incorporated to enhance the receptive field of the ResNet, allowing for better feature extraction. A multi-scale feature aggregation module (MSA) is then utilized to effectively integrate the extracted multi-level and multi-scale features between the encoder and decoder, facilitating the detection of changed buildings of various shapes and sizes, while preserving spatial details. Furthermore, to fully exploit contextual information, reduce redundant feature interference, and generate more discriminative features for change detection, multi-level features are refined using a hybrid attention module (HAM), which combines the interlaced sparse self-attention module (ISSA) with the convolutional block attention module (CBAM). Finally, a deep supervision strategy is applied to optimize model performance. Multiple change prediction maps are generated at various stages during the feature fusion process, and the total loss value is obtained through weighted calculation.
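    The deep supervision strategy described above can be sketched as follows. This is a minimal illustration only: it assumes a binary cross-entropy loss per stage and hypothetical stage weights, since the paper's exact loss form and weighting coefficients are not given in this abstract.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross-entropy between a predicted change map and the label."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def deep_supervision_loss(stage_preds, target, weights):
    """Total loss as a weighted sum over the change prediction maps
    generated at each stage of the feature fusion (decoding) process."""
    assert len(stage_preds) == len(weights)
    return sum(w * bce(p, target) for p, w in zip(stage_preds, weights))

# Example: two decoder stages, the deeper (coarser) one weighted less.
target = np.array([[0.0, 1.0], [1.0, 0.0]])
coarse = np.full((2, 2), 0.5)               # uninformative early-stage map
fine = np.where(target == 1, 0.95, 0.05)    # confident final-stage map
total = deep_supervision_loss([coarse, fine], target, weights=[0.5, 1.0])
```

    Weighting the intermediate maps less than the final map is a common choice in deep supervision, so gradients reach early decoder stages without letting coarse predictions dominate training.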

    Results and Discussions

    The performance of DHANet is evaluated on two publicly available datasets: LEVIR-CD and WHU-CD. On the LEVIR-CD dataset, DHANet significantly outperforms models such as FC-EF, FC-Siam-Conc, FC-Siam-Diff, STANet, IFN, and BIT in both F1 score and intersection over union (IoU), with F1 score improvements of 7.64, 7.35, 4.73, 3.78, 1.49, and 0.89 percentage points, respectively. On the WHU-CD dataset, DHANet also surpasses the aforementioned models in F1 and IoU, with F1 score increases of 11.25, 8.51, 8.13, 7.19, 2.28, and 2.08 percentage points, respectively. Moreover, qualitative visual results demonstrate that DHANet achieves superior change detection outcomes, particularly in identifying buildings of varying shapes and sizes. The resulting change maps exhibit clearer building boundaries and maintain high internal compactness, closely aligning with the actual labels. To validate the effectiveness of the key modules (DC, MSA, and HAM), we conduct a series of ablation experiments on the LEVIR-CD dataset. The significant improvements shown in the quantitative results (Table 3) not only confirm the individual effectiveness of the DC, MSA, and HAM modules but also highlight their synergy in enhancing change detection performance.
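    For reference, the F1 score and IoU reported above are standard pixel-level metrics for the "changed" class; a generic sketch (not the authors' evaluation code) computes them from confusion counts as follows:

```python
def f1_and_iou(tp, fp, fn):
    """Pixel-level F1 score and intersection over union (IoU) for the
    positive ('changed') class, from true/false positive and false
    negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)  # equivalently, iou == f1 / (2 - f1)
    return f1, iou
```

    The closed-form relation IoU = F1 / (2 − F1) means the two metrics always rank methods identically; reporting both, as here, mainly aids comparison with prior work.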

    Conclusions

    In this paper, we propose a novel DHANet for building change detection in high-resolution remote sensing images. DHANet effectively integrates multi-scale feature extraction through a Siamese ResNet network with shared weights, attention mechanisms, and DC. The MSA enhances feature fusion, while the HAM refines features for improved discriminative power. A deep supervision strategy ensures the progressive refinement of change maps throughout the feature fusion process. Experimental results indicate that DHANet achieves superior performance compared to other mainstream methods and strikes a good balance between accuracy and computational complexity. Ablation studies further validate the effectiveness of the proposed modules, demonstrating the potential of DHANet for detecting building changes in complex scenes using high-resolution remote sensing data.


    Citation


    Qinglin Tian, Donghua Lu, Yao Li, Chengkai Pei. Dense Hybrid Attention Network for Remote Sensing Building Change Detection[J]. Acta Optica Sinica, 2025, 45(6): 0628008

    Paper Information

    Category: Remote Sensing and Sensors

    Received: Aug. 16, 2024

    Accepted: Sep. 30, 2024

    Published Online: Mar. 17, 2025

    Author Email: Qinglin Tian (736924158@qq.com)

    DOI: 10.3788/AOS241436
