Laser & Optoelectronics Progress, Vol. 60, Issue 22, 2228005 (2023)

Remote Sensing Scene Classification Based on Local Selection Vision Transformer

Kai Yang1,2 and Xiaoqiang Lu1,*
Author Affiliations
  • 1Key Laboratory of Spectral Imaging Technology, Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an 710119, Shaanxi, China
  • 2University of Chinese Academy of Sciences, Beijing 100049, China

    Remote sensing scene classification aims to assign specific semantic labels to aerial images and is a fundamental task in remote sensing image interpretation. Existing studies have used convolutional neural networks (CNNs) to learn global and local features and improve the discriminative representation of networks. However, the limited receptive field of CNN-based approaches restricts their ability to model long-range dependencies among local features. In recent years, Vision Transformer (ViT) has shown strong performance in conventional classification tasks. Its self-attention mechanism connects each patch with a classification token and captures the contextual relationships between image pixels by considering global information in the spatial domain. In this paper, we propose a remote sensing scene classification network based on local selection ViT. An input image is first split into small patches, which are flattened and converted into a sequence with position encoding; the sequence is then fed into an encoder. In addition, a local selection module is added before the input of the last layer to learn local discriminative features: tokens with discriminative properties are selected as the input of that layer to obtain the final classification output. The experimental results show that the proposed method achieves good results on two large remote sensing scene classification datasets (AID and NWPU).
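    The pipeline described in the abstract (patch splitting, then token selection before the last layer) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not state how discriminative tokens are chosen, so the sketch assumes they are ranked by a per-patch score such as the attention weight from the classification token. The function names `patchify`, `select_tokens`, and `last_layer_input`, the single-channel image, and the score values are all hypothetical. Position encoding and the encoder itself are omitted.

```python
from typing import List, Sequence

Image = List[List[float]]  # single-channel image stored as rows of pixel values


def patchify(image: Image, patch: int) -> List[List[float]]:
    """Split an H x W image into non-overlapping patch x patch blocks,
    each flattened row by row into one token vector."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([image[r][c]
                           for r in range(top, top + patch)
                           for c in range(left, left + patch)])
    return tokens


def select_tokens(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring patch tokens, in original order.

    ASSUMPTION: `scores` stands in for whatever criterion the local selection
    module uses; here it is imagined as class-token attention per patch."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def last_layer_input(cls_token: List[float], tokens: List[List[float]],
                     scores: Sequence[float], k: int) -> List[List[float]]:
    """Build the reduced sequence: the class token plus the k selected patch tokens."""
    keep = select_tokens(scores, k)
    return [cls_token] + [tokens[i] for i in keep]


# Toy 4 x 4 image split into four 2 x 2 patches.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
tokens = patchify(image, 2)
scores = [0.1, 0.4, 0.2, 0.3]  # hypothetical attention scores, one per patch
selected = last_layer_input([0.0] * 4, tokens, scores, k=2)
```

    In this toy run, patches 1 and 3 carry the highest scores, so only those two tokens and the class token would be passed to the final layer.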

    Kai Yang, Xiaoqiang Lu. Remote Sensing Scene Classification Based on Local Selection Vision Transformer[J]. Laser & Optoelectronics Progress, 2023, 60(22): 2228005

    Paper Information

    Category: Remote Sensing and Sensors

    Received: Jan. 30, 2023

    Accepted: Mar. 10, 2023

    Published Online: Nov. 6, 2023

    Author Email: Lu Xiaoqiang (luxq666666@gmail.com)

    DOI:10.3788/LOP230539
