Laser & Optoelectronics Progress, Vol. 61, Issue 22, 2237012 (2024)
High-Resolution Slope Scene Image Classification Based on SwinT-MFPN
This paper proposes SwinT-MFPN, a slope scene image classification model that combines the Swin-Transformer with a feature pyramid network (FPN) to balance classification performance, inference speed, and convergence speed. The model addresses the rapid growth in computational complexity and the slow convergence that arise when processing high-resolution images. First, the Mish activation function is introduced into the FPN to construct an MFPN structure, which extracts features from the original high-resolution image and produces a feature map of reduced dimensions, discarding redundant low-level feature information and enhancing key features. The Swin-Transformer, known for its robust deep-level feature extraction, is then employed as the backbone feature extraction network. The original cross-entropy loss function of the Swin-Transformer is replaced by a weighted cross-entropy loss function to mitigate the effect of class imbalance on model predictions. In addition, a root mean square error evaluation index for accuracy is proposed. The stability of the proposed model is verified on a self-constructed dam slope dataset. Experimental results show that the proposed model achieves a mean average precision of 95.48%, with a 3.01% improvement in time performance compared to most mainstream models, demonstrating its applicability and effectiveness.
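Two of the components named in the abstract, the Mish activation and the weighted cross-entropy loss, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state how the class weights are chosen, so the inverse-frequency weighting below is an assumption, and the function names `mish`, `inverse_frequency_weights`, and `weighted_cross_entropy` are hypothetical. The loss is averaged over the sum of the applied weights, which is one common convention.

```python
import math


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    # Numerically stable softplus: log(1 + e^x) = max(x, 0) + log1p(e^{-|x|})
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def inverse_frequency_weights(labels, num_classes):
    """Class weights inversely proportional to class frequency.

    ASSUMPTION: the abstract does not specify the weighting scheme;
    inverse frequency is used here purely for illustration.
    """
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    n = len(labels)
    # weight_c = n / (num_classes * count_c); absent classes get weight 0
    return [n / (num_classes * c) if c > 0 else 0.0 for c in counts]


def weighted_cross_entropy(logits, labels, weights):
    """Weighted cross-entropy, normalized by the sum of applied weights."""
    total = 0.0
    weight_sum = 0.0
    for row, y in zip(logits, labels):
        # log-softmax via the log-sum-exp trick for numerical stability
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        log_prob = row[y] - log_sum_exp
        total += -weights[y] * log_prob
        weight_sum += weights[y]
    return total / weight_sum


if __name__ == "__main__":
    # Toy imbalanced label set: three samples of class 0, one of class 1
    labels = [0, 0, 0, 1]
    logits = [[2.0, 0.5], [1.5, 0.2], [0.3, 0.9], [0.1, 1.2]]
    w = inverse_frequency_weights(labels, num_classes=2)
    print("class weights:", w)
    print("weighted CE:", weighted_cross_entropy(logits, labels, w))
    print("mish(-1.0):", mish(-1.0))
```

With these weights, a misclassified sample from the rare class contributes more to the loss than one from the frequent class, which is the general mechanism by which a weighted loss counteracts class imbalance.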
Yin Tu, Denghua Li, Yong Ding. High-Resolution Slope Scene Image Classification Based on SwinT-MFPN[J]. Laser & Optoelectronics Progress, 2024, 61(22): 2237012
Category: Digital Image Processing
Received: Feb. 29, 2024
Accepted: Apr. 14, 2024
Published Online: Nov. 19, 2024
Author Email: Li Denghua (dhli@nhri.cn)
CSTR:32186.14.LOP240769