Optics and Precision Engineering, Volume 31, Issue 11, 1700 (2023)

Fast extraction of buildings from remote sensing images by fusion of CNN and Transformer

Yunzuo ZHANG1,2,*, Wei GUO1, and Cunyu WU1
Author Affiliations
  • 1School of Information Science and Technology, Shijiazhuang Tiedao University, Shijiazhuang 050043, China
  • 2Hebei Key Laboratory of Electromagnetic Environmental Effects and Information Processing, Shijiazhuang Tiedao University, Shijiazhuang 050043, China

    The efficient extraction of buildings from remote sensing images plays an important role in urban planning, disaster rescue, and military reconnaissance. Building extraction methods based on deep learning have made significant progress in accuracy, with the sparse token transformer network (STTNet) in particular achieving very high accuracy. However, these methods usually rely on complex convolution operations in very large network models, which makes extraction slow and makes it difficult to meet practical needs. Therefore, this study proposes a method for the fast extraction of buildings from remote sensing images. First, multi-scale convolution is introduced into the feature extraction network of the STTNet model, so that multi-scale features are extracted within the same convolution layer, further improving the feature extraction capability of the model. Second, channel attention is applied to weight the feature map so that the channel attention weights are learned effectively; this addresses the fluctuation of channel attention weights that occurs when they are learned from the feature map output by the backbone network. Finally, to reduce the number of model parameters and speed up the model, the STTNet structure is changed from parallel to serial. Experiments on the INRIA building dataset show that the proposed method outperforms current mainstream methods in terms of accuracy and intersection over union (IoU), and is 18.3% faster than STTNet.
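    For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how they are commonly computed on binary building masks. It assumes the standard definitions, IoU = TP / (TP + FP + FN) and accuracy = (TP + TN) / (all pixels); the function names and the toy masks are hypothetical illustrations and are not taken from the paper, which may compute its metrics differently.

```python
from typing import List, Tuple

# Binary mask: 1 = building pixel, 0 = background pixel.
Mask = List[List[int]]


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) counted over all pixels of two same-sized masks."""
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1      # building predicted, building present
            elif p:
                fp += 1      # building predicted, background present
            elif t:
                fn += 1      # background predicted, building present
            else:
                tn += 1      # background predicted, background present
    return tp, fp, fn, tn


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of the building class."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    union = tp + fp + fn
    # Assumption: two masks with no building pixels count as a perfect match.
    return tp / union if union else 1.0


def pixel_accuracy(pred: Mask, truth: Mask) -> float:
    """Fraction of pixels whose predicted label matches the ground truth."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("masks must contain at least one pixel")
    return (tp + tn) / total


if __name__ == "__main__":
    # Toy 2x3 masks (hypothetical data, for illustration only).
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    reference = [[1, 0, 0],
                 [0, 1, 1]]
    print(iou(predicted, reference))             # 2 / (2 + 1 + 1) -> 0.5
    print(pixel_accuracy(predicted, reference))  # 4 of 6 pixels agree
```

    On the toy masks, two pixels are true positives, one is a false positive, and one is a false negative, so the IoU is 0.5 even though four of the six pixels are labeled correctly. This is why IoU is usually reported alongside accuracy for building extraction, where background pixels tend to dominate.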


    Yunzuo ZHANG, Wei GUO, Cunyu WU. Fast extraction of buildings from remote sensing images by fusion of CNN and Transformer[J]. Optics and Precision Engineering, 2023, 31(11): 1700

    Paper Information

    Category: Information Sciences

    Received: Sep. 15, 2022

    Accepted: --

    Published Online: Jul. 4, 2023

    The Author Email: ZHANG Yunzuo (zhangyunzuo888@sina.com)

    DOI:10.37188/OPE.20233111.1700
