Laser & Optoelectronics Progress, Vol. 60, Issue 22, 2228005 (2023)

Remote Sensing Scene Classification Based on Local Selection Vision Transformer

Kai Yang1,2 and Xiaoqiang Lu1,*
Author Affiliations
  • 1Key Laboratory of Spectral Imaging Technology, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an 710119, Shaanxi, China
  • 2University of Chinese Academy of Sciences, Beijing 100049, China
    Figures & Tables (8)
    Remote-sensing image examples
    Network structure of LSViT
    • Table 1. Comparison of different patch divisions

      Patch division     OA /%    Training time /min
      Non-overlapping    95.21    59.82
      Overlapping        95.33    62.91
    • Table 2. Comparison results of different step sizes

      Step size    8        10       12       14       16
      OA /%        94.73    94.90    95.12    95.33    95.21
    • Table 3. Effect of the local selection module

      Method             OA /%
      ViT+overlapping    95.01
      LSViT              95.33
    • Table 4. Comparison results with the CNN model

      Method       OA /%
      VGG16        88.18
      GoogleNet    92.88
      ResNet-18    92.96
      ResNet-50    93.24
      ViT          94.90
      LSViT        95.33
    • Table 5. Comparison with state-of-the-art methods on AID dataset

      Method                     Backbone        OA /% (20% training data)    OA /% (50% training data)
      GBNet [19]                 VGG16           92.20±0.23                   95.48±0.12
      IBCNN [20]                 MobileNet-V2    94.23±0.16                   96.57±0.28
      ACR-MLFF [6]               ResNet-50       92.73±0.12                   95.06±0.33
      SAFF [21]                  VGG-VD16        90.25±0.29                   93.83±0.28
      Ji et al.+VGG-VD16 [7]     VGG-VD16        94.75±0.23                   96.93±0.16
      ResNet50+EAM [22]          ResNet50        93.64±0.25                   96.62±0.13
      CNN-GCN [23]               VGG16           94.93±0.31                   96.89±0.10
      SKAL [24]                  ResNet18        94.38±0.10                   96.76±0.20
      LSViT                      ViT             95.33±0.05                   97.55±0.12
    • Table 6. Comparison with state-of-the-art methods on NWPU-RESISC45 dataset

      Method                     Backbone        OA /% (10% training data)    OA /% (20% training data)
      IBCNN [20]                 MobileNet-V2    90.49±0.17                   93.33±0.21
      ACR-MLFF [6]               ResNet-50       90.01±0.33                   92.45±0.20
      SAFF [21]                  VGG-VD16        84.38±0.19                   87.86±0.14
      Ji et al.+VGG-VD16 [7]     VGG-VD16        91.08±0.24                   93.49±0.17
      ResNet50+EAM [22]          ResNet50        90.89±0.15                   93.51±0.12
      CNN-GCN [23]               VGG16           90.75±0.21                   90.92±0.24
      SKAL [24]                  ResNet18        90.04±0.15                   92.79±0.11
      LSViT                      ViT             92.41±0.15                   94.25±0.25
    Citation

    Kai Yang, Xiaoqiang Lu. Remote Sensing Scene Classification Based on Local Selection Vision Transformer[J]. Laser & Optoelectronics Progress, 2023, 60(22): 2228005

    Paper Information

    Category: Remote Sensing and Sensors

    Received: Jan. 30, 2023

    Accepted: Mar. 10, 2023

    Published Online: Nov. 6, 2023

    Corresponding author email: Xiaoqiang Lu (luxq666666@gmail.com)

    DOI: 10.3788/LOP230539
