Laser & Optoelectronics Progress, Vol. 60, Issue 14, 1410019 (2023)

Classification Method of High-Resolution Remote Sensing Scene Image Based on Dictionary Learning and Vision Transformer

Xiaojun He1, Xuan Liu1,2,*, and Xian Wei2
Author Affiliations
  • 1College of Software, Liaoning Technical University, Huludao 125105, Liaoning, China
  • 2Quanzhou Institute of Equipment Manufacturing, Haixi Institutes, Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences, Quanzhou 362216, Fujian, China
    Figures & Tables(17)
    Diagram of dictionary learning
    Flowchart of the proposed method
    Batch normalization and layer normalization
    Schematic of multilayer perceptron
    Flowchart of attention module method
    Attention module based on dictionary learning
    RSSCN7 dataset
    NWPU-RESISC45 dataset
    AID dataset
    Rate of change of classification accuracy on Gaussian noise images
    • Table 1. Introduction of datasets

      Dataset         Number of scene classes   Number of total images   Image size   Spatial resolution /m   Year
      RSSCN7          7                         2800                     400×400      -                       2015
      NWPU-RESISC45   45                        31500                    256×256      ~30-0.2                 2016
      AID             30                        10000                    600×600      ~8-0.5                  2017
    • Table 2. Laboratory environment

      Laboratory environment   Environment configuration
      Language                 Python 3.8.6
      Tool                     PyCharm 11.0.11
      Framework                PyTorch 1.9.1
      CUDA                     10.2
    • Table 3. Accuracy of different networks on RSSCN7 dataset

      Network            Accuracy /%
      AlexNet            82.230
      VGG                80.833
      ResNet50           89.048
      TNT                84.833
      ViT                89.643
      Proposed network   91.406
    • Table 4. Accuracy of different networks on NWPU-RESISC45 dataset

      Network                Accuracy /%
      Fine-tuned AlexNet     85.160
      Fine-tuned VGGNet-16   90.360
      Fine-tuned GoogLeNet   86.020
      TNT                    85.031
      ViT                    90.255
      Proposed network       91.576
    • Table 5. Accuracy of different networks on AID dataset

      Network            Accuracy /%
      CaffeNet           86.860
      VGG-VD-16          86.590
      ResNet152          89.130
      GoogLeNet          83.440
      TNT                80.450
      ViT                85.514
      Proposed network   89.218
    • Table 6. Parameter indicators of two methods on three datasets

      Parameter   RSSCN7                      NWPU-RESISC45               AID
                  ViT      Proposed method    ViT      Proposed method    ViT      Proposed method
      kappa       0.900    0.916              0.934    0.947              0.883    0.909
      F1          86.222   90.890             88.927   90.207             84.202   87.768
      recall      85.986   91.142             88.984   90.286             84.147   87.662
      precision   86.417   91.002             89.039   90.317             84.558   88.004
    • Table 7. Parameters of different classification frameworks

      Network           Number of parameters /10^6
      AlexNet           6
      VGG               13.3
      ResNet50          2.55
      TNT               2.25
      ViT               2.6
      Proposed method   1.84

    Xiaojun He, Xuan Liu, Xian Wei. Classification Method of High-Resolution Remote Sensing Scene Image Based on Dictionary Learning and Vision Transformer[J]. Laser & Optoelectronics Progress, 2023, 60(14): 1410019

    Paper Information

    Category: Image Processing

    Received: Jul. 26, 2022

    Accepted: Sep. 27, 2022

    Published Online: Jul. 17, 2023

    Author Email: Liu Xuan (preciousisgfc@163.com)

    DOI: 10.3788/LOP222166
