Acta Optica Sinica, Volume. 45, Issue 6, 0628008(2025)

Dense Hybrid Attention Network for Remote Sensing Building Change Detection

Qinglin Tian1,*, Donghua Lu1, Yao Li2, and Chengkai Pei1
Author Affiliations
  • 1National Key Laboratory of Uranium Resources Exploration-Mining and Nuclear Remote Sensing, Beijing Research Institute of Uranium Geology, Beijing 100029, China
  • 2School of Geographical Sciences, Southwest University, Chongqing 400715, China

    Objective

    As a key element in geographic information systems, building change detection plays a crucial role in evaluating land use, urban development, and disaster damage assessment. Over the past decade, many methods have been proposed for change detection, evolving from pixel-based to object-based approaches that incorporate contextual information. However, traditional methods often struggle with the complexities of high-resolution remote sensing imagery, particularly in challenging scenes, which limits their accuracy. With the advent of deep learning, especially deep convolutional neural networks (CNNs), change detection in remote sensing has seen significant improvements. Despite these advancements, deep learning-based methods still face challenges such as insufficient utilization of multi-scale information, weak feature representation, and inadequate suppression of pseudo-changes. To address these limitations, we propose a novel method for building change detection in high-resolution remote sensing images, leveraging a dense hybrid attention network (DHANet).

    Methods

    The proposed DHANet utilizes an encoder-decoder architecture. During the encoding phase, a Siamese ResNet network with shared weights extracts multi-level, multi-scale features from bi-temporal images. In addition, dilated convolution (DC) is incorporated to enhance the receptive field of the ResNet, allowing for better feature extraction. A multi-scale feature aggregation module (MSA) is then utilized to effectively integrate the extracted multi-level and multi-scale features between the encoder and decoder, facilitating the detection of changed buildings of various shapes and sizes, while preserving spatial details. Furthermore, to fully exploit contextual information, reduce redundant feature interference, and generate more discriminative features for change detection, multi-level features are refined using a hybrid attention module (HAM), which combines the interlaced sparse self-attention module (ISSA) with the convolutional block attention module (CBAM). Finally, a deep supervision strategy is applied to optimize model performance. Multiple change prediction maps are generated at various stages during the feature fusion process, and the total loss value is obtained through weighted calculation.
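    The deep supervision strategy described above can be sketched as follows. This is a minimal illustration only: it assumes a binary cross-entropy loss per stage and hypothetical stage weights, since the paper's exact loss form and weighting coefficients are not given in this abstract.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross-entropy between a predicted change map and the label."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def deep_supervision_loss(stage_preds, target, weights):
    """Total loss as a weighted sum over the change prediction maps
    generated at each stage of the feature fusion (decoding) process."""
    assert len(stage_preds) == len(weights)
    return sum(w * bce(p, target) for p, w in zip(stage_preds, weights))

# Example: two decoder stages, the deeper (coarser) one weighted less.
target = np.array([[0.0, 1.0], [1.0, 0.0]])
coarse = np.full((2, 2), 0.5)               # uninformative early-stage map
fine = np.where(target == 1, 0.95, 0.05)    # confident final-stage map
total = deep_supervision_loss([coarse, fine], target, weights=[0.5, 1.0])
```

    Weighting the intermediate maps less than the final map is a common choice in deep supervision, so gradients reach early decoder stages without letting coarse predictions dominate training.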

    Results and Discussions

    The performance of DHANet is evaluated on two publicly available datasets: LEVIR-CD and WHU-CD. On the LEVIR-CD dataset, DHANet significantly outperforms models such as FC-EF, FC-Siam-Conc, FC-Siam-Diff, STANet, IFN, and BIT in both F1 score and intersection over union (IoU), with F1 score improvements of 7.64, 7.35, 4.73, 3.78, 1.49, and 0.89 percentage points, respectively. On the WHU-CD dataset, DHANet also surpasses the aforementioned models in F1 and IoU, with F1 score increases of 11.25, 8.51, 8.13, 7.19, 2.28, and 2.08 percentage points, respectively. Moreover, qualitative visual results demonstrate that DHANet achieves superior change detection outcomes, particularly in identifying buildings of varying shapes and sizes. The resulting change maps exhibit clearer building boundaries and maintain high internal compactness, closely aligning with the actual labels. To validate the effectiveness of the key modules (DC, MSA, and HAM), we conduct a series of ablation experiments on the LEVIR-CD dataset. The significant improvements shown in the quantitative results (Table 3) not only confirm the individual effectiveness of the DC, MSA, and HAM modules but also highlight their synergy in enhancing change detection performance.
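    For reference, the F1 score and IoU reported above are standard pixel-level metrics for the "changed" class; a generic sketch (not the authors' evaluation code) computes them from confusion counts as follows:

```python
def f1_and_iou(tp, fp, fn):
    """Pixel-level F1 score and intersection over union (IoU) for the
    positive ('changed') class, from true/false positive and false
    negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)  # equivalently, iou == f1 / (2 - f1)
    return f1, iou
```

    The closed-form relation IoU = F1 / (2 − F1) means the two metrics always rank methods identically; reporting both, as here, mainly aids comparison with prior work.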

    Conclusions

    In this paper, we propose a novel DHANet for building change detection in high-resolution remote sensing images. DHANet effectively integrates multi-scale feature extraction through a Siamese ResNet network with shared weights, attention mechanisms, and DC. The MSA enhances feature fusion, while the HAM refines features for improved discriminative power. A deep supervision strategy ensures the progressive refinement of change maps throughout the feature fusion process. Experimental results indicate that DHANet achieves superior performance compared to other mainstream methods and strikes a good balance between accuracy and computational complexity. Ablation studies further validate the effectiveness of the proposed modules, demonstrating the potential of DHANet for detecting building changes in complex scenes using high-resolution remote sensing data.


    Citation


    Qinglin Tian, Donghua Lu, Yao Li, Chengkai Pei. Dense Hybrid Attention Network for Remote Sensing Building Change Detection[J]. Acta Optica Sinica, 2025, 45(6): 0628008

    Paper Information

    Category: Remote Sensing and Sensors

    Received: Aug. 16, 2024

    Accepted: Sep. 30, 2024

    Published Online: Mar. 17, 2025

    Author Email: Qinglin Tian (736924158@qq.com)

    DOI: 10.3788/AOS241436
