Laser & Optoelectronics Progress, Vol. 61, Issue 14, 1428002 (2024)

Classification of High-Resolution Remote Sensing Image Based on Swin Transformer and Convolutional Neural Network

Xiaoying He1,2,3, Weiming Xu1,2,3,*, Kaixiang Pan1,2,3, Juan Wang1,2,3, and Ziwei Li1,2,3
Author Affiliations
  • 1The Academy of Digital China, Fuzhou University, Fuzhou 350108, Fujian, China
  • 2Key Laboratory of Spatial Data Mining & Information Sharing, Ministry of Education, Fuzhou University, Fuzhou 350002, Fujian, China
  • 3National Engineering Research Center of Geospatial Information Technology, Fuzhou University, Fuzhou 350002, Fujian, China
    Figures & Tables(15)
    Overall architecture of SRAU-Net
    Standard Transformer block and Swin Transformer block. (a) Standard Transformer block; (b) Swin Transformer block
    Convolution block and residual block. (a) Convolution block; (b) residual block
    FFM
    FEM
    Visualization of segmentation results of different models on Vaihingen dataset
    Visualization of segmentation results of different models on GID dataset
    Visualization of the ablation experiment results. (a) baseline; (b) baseline+residual block; (c) baseline+residual block+FFM; (d) baseline+residual block+FEM; (e) baseline+residual block+FFM+FEM
    • Table 1. Class distribution and weight calculation of Vaihingen dataset

      Class name          Number of pixels  Class frequency  Weight
      Impervious surface  46792757          0.2780           0.7939
      Building            43779851          0.2601           0.8486
      Low vegetation      35766767          0.2125           1.0387
      Tree                38534748          0.2290           0.9641
      Car                 2096078           0.0125           17.7239
      Clutter/background  1317670           0.0078           28.1943
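
The page does not say how the weights in Table 1 were derived. The tabulated values are, however, reproduced to four decimals by median frequency balancing (weight = median class frequency / class frequency), so the following is a minimal sketch under that assumption. The function name `median_frequency_weights` is illustrative and does not come from the paper.

```python
from statistics import median

# Pixel counts copied from Table 1 (Vaihingen dataset).
pixel_counts = {
    "Impervious surface": 46792757,
    "Building": 43779851,
    "Low vegetation": 35766767,
    "Tree": 38534748,
    "Car": 2096078,
    "Clutter/background": 1317670,
}


def median_frequency_weights(counts):
    """Assumed scheme: weight_c = median(frequencies) / frequency_c."""
    total = sum(counts.values())
    freqs = {name: n / total for name, n in counts.items()}
    med = median(freqs.values())
    return {name: med / f for name, f in freqs.items()}


weights = median_frequency_weights(pixel_counts)
for name, w in weights.items():
    print(f"{name:20s} {w:.4f}")
```

Under this scheme, rare classes such as Car and Clutter/background receive weights far above 1, while the four dominant classes stay close to 1. Applying the same function to the GID pixel counts in Table 2 also appears to match the weights listed there.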
    • Table 2. Class distribution and weight calculation of GID dataset

      Class name  Number of pixels  Class frequency  Weight
      Building    651080927         0.0899           1.0996
      Water       780799058         0.1079           0.9169
      Forest      277330405         0.0383           2.5815
      Farmland    2222929336        0.3071           0.3221
      Meadow      144943831         0.0200           4.9394
      Others      3162486239        0.4368           0.2264
    • Table 3. Confusion matrix

      Actual     Predicted positive      Predicted negative
      Positive   True positive (N_TP)    False negative (N_FN)
      Negative   False positive (N_FP)   True negative (N_TN)
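
The result tables on this page report IoU, MIoU, OA, and F1, all of which are derived from confusion-matrix counts. The page does not reproduce the paper's formulas, so the sketch below uses the standard definitions as an assumption. The function names and the 3-class matrix are made up for illustration and are not data from the paper.

```python
def binary_metrics(n_tp, n_fp, n_fn, n_tn):
    """Standard IoU, OA, and F1 from the four confusion-matrix counts."""
    precision = n_tp / (n_tp + n_fp)
    recall = n_tp / (n_tp + n_fn)
    return {
        "IoU": n_tp / (n_tp + n_fp + n_fn),
        "OA": (n_tp + n_tn) / (n_tp + n_fp + n_fn + n_tn),
        "F1": 2 * precision * recall / (precision + recall),
    }


def per_class_counts(confusion, k):
    """One-vs-rest counts for class k.

    confusion[i][j] = number of pixels of actual class i predicted as class j.
    """
    n_tp = confusion[k][k]
    n_fn = sum(confusion[k]) - n_tp                # actual k, predicted other
    n_fp = sum(row[k] for row in confusion) - n_tp  # actual other, predicted k
    n_tn = sum(map(sum, confusion)) - n_tp - n_fn - n_fp
    return n_tp, n_fp, n_fn, n_tn


def overall_accuracy(confusion):
    """Multi-class OA: correctly classified pixels over all pixels."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / sum(map(sum, confusion))


# Hypothetical 3-class confusion matrix (illustrative values only).
confusion = [
    [50, 2, 3],
    [4, 40, 6],
    [1, 5, 39],
]

ious = [binary_metrics(*per_class_counts(confusion, k))["IoU"]
        for k in range(len(confusion))]
miou = sum(ious) / len(ious)  # MIoU as the unweighted mean of per-class IoU
print("per-class IoU:", [round(v, 4) for v in ious])
print("MIoU:", round(miou, 4), "OA:", round(overall_accuracy(confusion), 4))
```

Note that the multi-class OA is computed from the diagonal of the full matrix, which is not the same as the OA of any single one-vs-rest split.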
    • Table 4. Comparison of segmentation results of different models on Vaihingen dataset

      Method      IoU /%                                                      MIoU /%  OA /%  F1 /%
                  Impervious surface  Building  Low vegetation  Tree   Car
      FCN         65.96               86.34     67.09           75.63  67.17  72.44    85.67  76.56
      U-Net       80.40               88.07     70.32           77.78  74.16  78.15    89.57  82.59
      DeeplabV3+  81.15               88.62     69.97           78.32  73.95  78.40    89.39  82.93
      TransUNet   81.26               89.79     71.01           78.49  74.95  79.10    90.02  83.64
      SRAU-Net    83.88               90.97     73.22           80.95  77.71  81.35    92.60  86.90
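
As a quick consistency check on Table 4, the MIoU column appears to be the unweighted mean of the five listed per-class IoU values, with the clutter/background class left out. The sketch below recomputes it from the tabulated numbers. It is a reading of the table, not a statement of the authors' evaluation code.

```python
# Per-class IoU (%) and reported MIoU (%) copied from Table 4.
# Class order: impervious surface, building, low vegetation, tree, car.
table4 = {
    "FCN":        ([65.96, 86.34, 67.09, 75.63, 67.17], 72.44),
    "U-Net":      ([80.40, 88.07, 70.32, 77.78, 74.16], 78.15),
    "DeeplabV3+": ([81.15, 88.62, 69.97, 78.32, 73.95], 78.40),
    "TransUNet":  ([81.26, 89.79, 71.01, 78.49, 74.95], 79.10),
    "SRAU-Net":   ([83.88, 90.97, 73.22, 80.95, 77.71], 81.35),
}

# Recompute MIoU as the plain average of the five class IoUs.
mean_iou = {name: sum(ious) / len(ious) for name, (ious, _) in table4.items()}

for name, (_, reported) in table4.items():
    print(f"{name:11s} recomputed={mean_iou[name]:.3f} reported={reported:.2f}")
```

The recomputed values agree with the reported MIoU to within rounding for every method.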
    • Table 5. Comparison of segmentation results of different models on GID dataset

      Method      IoU /%                                     MIoU /%  OA /%  F1 /%
                  Building  Water  Forest  Farmland  Meadow
      FCN         47.03     72.50  46.72   49.11     43.08   51.69    70.85  65.11
      U-Net       62.21     81.22  51.07   68.79     64.08   65.47    82.80  69.05
      DeeplabV3+  63.22     79.49  53.23   67.53     60.93   64.88    80.66  67.14
      TransUNet   63.03     80.35  55.02   68.93     64.24   66.31    82.91  69.12
      SRAU-Net    64.58     80.82  55.73   70.11     65.52   67.35    84.26  71.84
    • Table 6. Ablation experiment setting of SRAU-Net

      Model name  Baseline  Residual block  FFM  FEM
      (a)         ✓         -               -    -
      (b)         ✓         ✓               -    -
      (c)         ✓         ✓               ✓    -
      (d)         ✓         ✓               -    ✓
      (e)         ✓         ✓               ✓    ✓
    • Table 7. Result of the ablation experiment on Vaihingen dataset

      Method      IoU /%                                                      MIoU /%  OA /%  F1 /%
                  Impervious surface  Building  Low vegetation  Tree   Car
      (a)         82.66               89.37     70.07           77.03  75.12  78.85    90.03  85.55
      (b)         83.68               90.09     73.02           80.67  76.33  80.76    91.25  86.38
      (c)         83.80               90.82     73.05           80.69  76.36  80.94    92.35  86.62
      (d)         83.78               90.87     73.14           80.72  76.43  80.98    92.37  86.68
      (e)         83.88               90.97     73.22           80.95  77.71  81.35    92.60  86.90
    Citation

    Xiaoying He, Weiming Xu, Kaixiang Pan, Juan Wang, Ziwei Li. Classification of High-Resolution Remote Sensing Image Based on Swin Transformer and Convolutional Neural Network[J]. Laser & Optoelectronics Progress, 2024, 61(14): 1428002

    Paper Information

    Category: Remote Sensing and Sensors

    Received: Aug. 29, 2023

    Accepted: Nov. 21, 2023

    Published Online: Jul. 8, 2024

    The Author Email: Weiming Xu (xwming2@126.com)

    DOI:10.3788/LOP232003

    CSTR:32186.14.LOP232003
