Laser & Optoelectronics Progress, Vol. 62, Issue 16, 1628004 (2025)

High-Resolution Remote Sensing Semantic Segmentation Method Coupling ResNet and Transformer

Lei Zhang1, Xue Ding1,2,3,*, Jinliang Wang2,3,4, Shuangyun Peng4, and Rongxiang Luo1
Author Affiliations
  • 1School of Information Science and Technology, Yunnan Normal University, Kunming 650500, Yunnan, China
  • 2Key Laboratory of Resources and Environmental Remote Sensing for Universities in Yunnan, Kunming 650500, Yunnan, China
  • 3Yunnan Provincial Engineering and Technology Research Center for Geospatial Information Technology, Kunming 650500, Yunnan, China
  • 4Faculty of Geography, Yunnan Normal University, Kunming 650500, Yunnan, China

    Convolutional neural networks (CNNs) and vision Transformers are difficult to fuse effectively for semantic segmentation of high-resolution remote sensing images, and combining their global and local features often yields low segmentation accuracy. To address this, this paper proposes RTHNet, a tightly fused hybrid network. RTHNet adopts an encoder-decoder structure and uses ResNet50 as the backbone in the encoding stage to extract local features from remote sensing images. An attention adaptive fusion module (AAFM) is designed to integrate multi-level attention features between the encoder and decoder. In the decoding stage, a global-local contextual Transformer block (GLCTB) attends to global context and local details simultaneously. A detail enhancement module (DEM) at the end of the decoder refines the semantic consistency and spatial detail of the features to improve the accuracy of the segmentation results. Experimental results on the Potsdam, Vaihingen, and WHDLD datasets show that RTHNet reaches a mean intersection over union (mIoU) of 79.58%, 73.61%, and 60.37%, respectively. Compared with current mainstream segmentation networks such as MAResU-Net and UNetFormer, RTHNet significantly improves segmentation accuracy.
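
    For readers unfamiliar with the reported metric, the following is a minimal sketch of how mIoU is commonly computed from a predicted label map and a ground-truth label map. It is a generic illustration, not the authors' evaluation code: the function name `mean_iou`, the toy label maps, and the choice to skip classes absent from both maps are assumptions, and the abstract does not state which classes were evaluated or how results were aggregated over each dataset.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union (mIoU) for two 2D label maps.

    Hypothetical helper for illustration; classes that appear in neither
    map are skipped (an assumption, not taken from the paper).
    """
    # Flatten the 2D label maps into 1D lists of class indices.
    pred_flat = [c for row in pred for c in row]
    target_flat = [c for row in target for c in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("pred and target must have the same number of pixels")

    intersection = [0] * num_classes  # pixels where both maps agree on class k
    pred_count = [0] * num_classes    # pixels predicted as class k
    target_count = [0] * num_classes  # pixels labeled as class k
    for p, t in zip(pred_flat, target_flat):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for k in range(num_classes):
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = pred_count[k] + target_count[k] - intersection[k]
        if union > 0:
            ious.append(intersection[k] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x3 label maps with three classes (made-up data).
pred = [[0, 0, 1],
        [1, 1, 2]]
target = [[0, 1, 1],
          [1, 1, 2]]
print(mean_iou(pred, target, num_classes=3))  # per-class IoU: 0.5, 0.75, 1.0 -> 0.75
```

    In benchmark evaluations the per-class counts are usually accumulated over every image in the test set before the per-class ratios are averaged, rather than averaged image by image.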

    Lei Zhang, Xue Ding, Jinliang Wang, Shuangyun Peng, Rongxiang Luo. High-Resolution Remote Sensing Semantic Segmentation Method Coupling ResNet and Transformer[J]. Laser & Optoelectronics Progress, 2025, 62(16): 1628004

    Paper Information

    Category: Remote Sensing and Sensors

    Received: Feb. 5, 2025

    Accepted: Mar. 21, 2025

    Published Online: Jul. 25, 2025

    Corresponding Author Email: Xue Ding (4228@ynnu.edu.cn)

    DOI:10.3788/LOP250591

    CSTR:32186.14.LOP250591
