Laser & Optoelectronics Progress, Vol. 60, Issue 20, 2028006 (2023)

Remote Sensing Image Classification Method Based on Fusion of CNN and Transformer

Chuan Jin and Changqing Tong*
Author Affiliations
  • School of Sciences, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China
    Figures & Tables(13)
    MBConv structure embedded in the CA mechanism
    Overall structure of the proposed model
    Structure of the CA
    Multihead self-attention mechanism structure. (a) Multihead self-attention mechanism; (b) self-attention mechanism
    Scene instances of three datasets. (a) Airport; (b) beach; (c) center; (d) park; (e) mountain; (f) bridge; (g) church; (h) harbor; (i) overpass; (j) cloud; (k) river; (l) intersection; (m) aeroplane; (n) chaparral; (o) storage tank
    Heat map comparison with other related models. (a) Airplane; (b) center; (c) pond; (d) (e) (f) EfficientNet; (g) (h) (i) ResNet; (j) (k) (l) Swin-T; (m) (n) (o) proposed method
    Confusion matrix of the AID dataset at 50% training scale
    Confusion matrix of the NWPU-RESISC45 dataset at 20% training scale
    Confusion matrix of the VGoogle dataset at 20% training scale
    • Table 1. Characteristics of the datasets

      Dataset | Number of classes | Number of images per class | Total number of images | Resolution /m | Image size | Year
      AID | 30 | 220‒420 | 10000 | 0.5‒8 | 600×600 | 2017
      NWPU-RESISC45 | 45 | 700 | 31500 | 0.2‒30 | 256×256 | 2016
      VGoogle | 38 | 1502‒1847 | 59404 | 0.075‒9.555 | 256×256 | 2019
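To give a rough sense of what the training ratios used in the accuracy tables mean for datasets of these sizes, the following minimal sketch computes approximate training-set sizes from the totals in Table 1. The ratios (10%, 20%, 50%) are taken from the accuracy table headers; rounding down and ignoring any per-class stratification are assumptions, and the names `DATASET_TOTALS`, `TRAIN_PERCENTS`, and `split_sizes` are hypothetical.

```python
# Total image counts per dataset, as listed in Table 1.
DATASET_TOTALS = {"AID": 10000, "NWPU-RESISC45": 31500, "VGoogle": 59404}

# Training percentages per dataset, as named in the accuracy table headers.
TRAIN_PERCENTS = {"AID": (20, 50), "NWPU-RESISC45": (10, 20), "VGoogle": (10, 20)}


def split_sizes():
    """Return {dataset: {percent: approximate number of training images}}.

    Assumption: the split is taken over the whole dataset and rounded down;
    the source does not state how the split is stratified or rounded.
    """
    return {
        name: {pct: total * pct // 100 for pct in TRAIN_PERCENTS[name]}
        for name, total in DATASET_TOTALS.items()
    }


if __name__ == "__main__":
    for name, sizes in split_sizes().items():
        for pct, n_train in sizes.items():
            n_test = DATASET_TOTALS[name] - n_train
            print(f"{name}: {pct}% -> {n_train} training, {n_test} test images")
```

Integer arithmetic is used so that the counts do not depend on floating-point rounding.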
    • Table 2. Parameter settings for model training

      Parameter | Value | Parameter | Value
      Epoch | 100 | Drop rate | 0.2
      Batch size | 64 | Optimiser | AdamW
      Learning rate | 0.000005 | Warmup | 10
      Weight decay | 0.0005 | Random seed | 42
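The settings in Table 2 could be collected in a small configuration object, as in the sketch below. It is only an illustration: the table does not say whether "Warmup 10" counts epochs or steps, nor what schedule follows the warmup, so the linear per-epoch warmup followed by a constant rate is an assumption, and `TrainConfig` and `lr_at_epoch` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Training settings copied from Table 2."""
    epochs: int = 100
    batch_size: int = 64
    learning_rate: float = 5e-6   # listed as 0.000005
    weight_decay: float = 5e-4    # listed as 0.0005
    drop_rate: float = 0.2
    optimizer: str = "AdamW"
    warmup_epochs: int = 10       # assumption: "Warmup 10" is counted in epochs
    seed: int = 42


def lr_at_epoch(cfg: TrainConfig, epoch: int) -> float:
    """Learning rate for a 0-indexed epoch.

    Assumption: linear warmup up to the base rate over `warmup_epochs`,
    then a constant rate; the source does not specify the schedule.
    """
    if epoch < cfg.warmup_epochs:
        return cfg.learning_rate * (epoch + 1) / cfg.warmup_epochs
    return cfg.learning_rate


if __name__ == "__main__":
    cfg = TrainConfig()
    for epoch in (0, 4, 9, 10, 99):
        print(f"epoch {epoch:3d}: lr = {lr_at_epoch(cfg, epoch):.2e}")
```

The frozen dataclass keeps the settings immutable, which makes it harder to change a hyperparameter by accident in the middle of a run.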
    • Table 3. Accuracy of different models on three datasets

      Method | Number of parameters /10^6 | AID, 20% training data | AID, 50% training data | NWPU-RESISC45, 10% training data | NWPU-RESISC45, 20% training data | VGoogle, 10% training data | VGoogle, 20% training data
      VGG-16 | 134.4 | 86.59±0.29 | 89.64±0.36 | 76.47±0.18 | 79.79±0.65 | 72.41±0.22 | 76.74±0.16
      GoogLeNet | 54.4 | 83.44±0.40 | 86.39±0.55 | 76.19±0.38 | 78.48±0.26 | 77.33±0.57 | 86.79±0.47
      EfficientNet-B0 | 4.1 | 83.69±0.11 | 86.17±0.16 | 79.96±0.27 | 82.89±0.16 | 78.30±0.26 | 88.38±0.29
      ResNet-50 | 23.6 | 92.39±0.15 | 94.96±0.19 | 86.23±0.41 | 88.93±0.12 | 88.02±0.15 | 92.99±0.10
      LGRIN | 4.6 | 94.74±0.23 | 97.65±0.25 | 91.91±0.15 | 94.43±0.16 | - | -
      ViT-Base | 85.8 | 91.16±0.41 | 94.44±0.28 | 87.59±0.21 | 90.87±0.17 | 86.22±0.33 | 91.42±0.17
      PVT-Medium | 43.3 | 92.84±0.19 | 95.93±0.17 | 90.51±0.13 | 92.66±0.14 | 86.60±0.14 | 92.32±0.22
      Swin-Base | 86.8 | 94.86±0.22 | 97.80±0.15 | 91.80±0.16 | 94.04±0.11 | 88.48±0.12 | 93.19±0.13
      TRS | 46.3 | 95.54±0.18 | 98.48±0.06 | 93.06±0.11 | 95.56±0.20 | - | -
      TSTNet | 173.0 | 97.20±0.22 | 98.70±0.12 | 94.08±0.24 | 95.70±0.10 | - | -
      Proposed method | 20.4 | 97.81±0.08 | 98.95±0.06 | 94.82±0.04 | 96.00±0.07 | 91.27±0.02 | 95.01±0.14
    • Table 4. Accuracy of ablation experiments

      Method | AID, 20% training data | AID, 50% training data | NWPU-RESISC45, 10% training data | NWPU-RESISC45, 20% training data | VGoogle, 10% training data | VGoogle, 20% training data
      Without CA+Transformer | 85.17±0.57 | 92.97±0.09 | 77.81±0.16 | 86.55±0.13 | 86.32±0.16 | 92.83±0.10
      With CA | 86.57±0.85 | 93.42±0.09 | 79.25±0.25 | 87.24±0.09 | 86.91±0.20 | 92.79±0.20
      With Transformer | 77.79±0.17 | 90.45±0.32 | 72.06±0.32 | 83.88±0.27 | 76.97±0.30 | 86.65±0.22
      With CA+Transformer | 97.81±0.08 | 98.95±0.06 | 94.82±0.04 | 96.00±0.07 | 91.27±0.02 | 95.01±0.14
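One way to read the ablation results is as differences from the variant without CA and Transformer. The sketch below computes those differences from the mean accuracies in Table 4; it ignores the reported ± spreads, and the names `SETTINGS`, `ABLATION`, and `gain_over_baseline` are hypothetical.

```python
# Column order of the ablation table: dataset and training percentage.
SETTINGS = (
    "AID 20%", "AID 50%",
    "NWPU-RESISC45 10%", "NWPU-RESISC45 20%",
    "VGoogle 10%", "VGoogle 20%",
)

# Mean accuracies copied from Table 4 (the ± spreads are omitted here).
ABLATION = {
    "Without CA+Transformer": (85.17, 92.97, 77.81, 86.55, 86.32, 92.83),
    "With CA": (86.57, 93.42, 79.25, 87.24, 86.91, 92.79),
    "With Transformer": (77.79, 90.45, 72.06, 83.88, 76.97, 86.65),
    "With CA+Transformer": (97.81, 98.95, 94.82, 96.00, 91.27, 95.01),
}


def gain_over_baseline(variant, baseline="Without CA+Transformer"):
    """Accuracy difference (percentage points) of `variant` minus `baseline`."""
    return {
        setting: round(acc - base, 2)
        for setting, acc, base in zip(SETTINGS, ABLATION[variant], ABLATION[baseline])
    }


if __name__ == "__main__":
    for variant in ("With CA", "With Transformer", "With CA+Transformer"):
        print(variant, gain_over_baseline(variant))
```

On these numbers, the combined variant gains between 2.18 points (VGoogle, 20%) and 17.01 points (NWPU-RESISC45, 10%) over the baseline, while the Transformer-only variant falls below the baseline in every setting.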

    Chuan Jin, Changqing Tong. Remote Sensing Image Classification Method Based on Fusion of CNN and Transformer[J]. Laser & Optoelectronics Progress, 2023, 60(20): 2028006

    Paper Information

    Category: Remote Sensing and Sensors

    Received: Nov. 24, 2022

    Accepted: Jan. 4, 2023

    Published Online: Sep. 28, 2023

    The Author Email: Changqing Tong (tongchangqing@hdu.edu.cn)

    DOI:10.3788/LOP223154
