Acta Photonica Sinica, Vol. 50, Issue 11, 1128002 (2021)
Scene Classification of Optical High-resolution Remote Sensing Images Using Vision Transformer and Graph Convolutional Network
Most existing optical remote sensing scene classification methods based on convolutional neural networks mainly perform global feature learning and fail to consider local features in the scene, so they cannot effectively address large intraclass differences and high interclass similarity. Therefore, a novel remote sensing scene classification method based on two branches, a vision transformer and a graph convolutional network, is proposed. First, the scene image is divided into patches, and positional encoding and the vision transformer are then used to encode the patches so that long-range dependencies can be mined. In parallel, the scene image is converted into superpixels. The convolutional neural network features of each superpixel are pooled and used to represent the nodes of the graph structure, and the graph convolutional network is then applied to model the spatial topological relationships. Finally, the feature representation of the scene image is formed from the features of the two branches. Experimental results on optical remote sensing image datasets demonstrate the effectiveness of the method.
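The two-branch idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the superpixel algorithm, the CNN backbone, the positional encoding, or how the branches are fused, so the sketch assumes a regular grid as a stand-in for superpixels, raw pixel values as a stand-in for CNN features, a random additive positional table, a single self-attention step in place of a full vision transformer, and fusion by concatenation. All function names are hypothetical.

```python
import numpy as np

def split_into_patches(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches."""
    h, w, c = image.shape
    gh, gw = h // patch_size, w // patch_size
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    return patches.transpose(0, 2, 1, 3, 4).reshape(gh * gw, -1)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product attention over all patch tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v

def superpixel_nodes(feature_map, labels, num_nodes):
    """Average-pool per-pixel features inside each superpixel (one node each)."""
    flat = feature_map.reshape(-1, feature_map.shape[-1])
    lab = labels.reshape(-1)
    sums = np.zeros((num_nodes, flat.shape[1]))
    np.add.at(sums, lab, flat)
    counts = np.bincount(lab, minlength=num_nodes)[:, None]
    return sums / np.maximum(counts, 1)

def superpixel_adjacency(labels, num_nodes):
    """Connect two superpixels if they share a horizontal or vertical border."""
    adj = np.zeros((num_nodes, num_nodes))
    pairs = ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :]))
    for a, b in pairs:
        border = a != b
        adj[a[border], b[border]] = 1.0
        adj[b[border], a[border]] = 1.0
    return adj

def gcn_layer(node_features, adjacency, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adjacency + np.eye(adjacency.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ node_features @ weight, 0.0)

def two_branch_features(image, rng, patch_size=8, dim=16):
    """Return the concatenated features of both branches (assumed fusion)."""
    # Branch 1: patches -> linear embedding -> positional table -> attention.
    patches = split_into_patches(image, patch_size)
    tokens = patches @ rng.normal(scale=0.02, size=(patches.shape[1], dim))
    tokens = tokens + rng.normal(scale=0.02, size=tokens.shape)  # assumed encoding
    w_q, w_k, w_v = (rng.normal(scale=0.1, size=(dim, dim)) for _ in range(3))
    vit_feature = self_attention(tokens, w_q, w_k, w_v).mean(axis=0)

    # Branch 2: grid "superpixels" (stand-in) -> pooled nodes -> GCN.
    h, w, _ = image.shape
    rows, cols = np.indices((h, w))
    grid_w = w // patch_size
    labels = (rows // patch_size) * grid_w + cols // patch_size
    num_nodes = (h // patch_size) * grid_w
    nodes = superpixel_nodes(image, labels, num_nodes)
    adjacency = superpixel_adjacency(labels, num_nodes)
    w_gcn = rng.normal(scale=0.1, size=(nodes.shape[1], dim))
    gcn_feature = gcn_layer(nodes, adjacency, w_gcn).mean(axis=0)

    return np.concatenate([vit_feature, gcn_feature])
```

Calling `two_branch_features(image, np.random.default_rng(0))` on a 32 x 32 RGB array yields one vector per image, with the first half coming from the attention branch and the second half from the graph branch. In practice the weights would be learned and the vector would feed a classifier.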
Jianan WANG, Yue GAO, Jun SHI, Ziqi LIU. Scene Classification of Optical High-resolution Remote Sensing Images Using Vision Transformer and Graph Convolutional Network[J]. Acta Photonica Sinica, 2021, 50(11): 1128002
Category: Remote Sensing and Sensors
Received: May 25, 2021
Accepted: Jul. 26, 2021
Published Online: Dec. 2, 2021
The Author Email: GAO Yue (bjlguniversity@163.com)