Laser & Optoelectronics Progress, Volume 61, Issue 22, 2237010 (2024)

Fine-Grained Image Classification Based on Feature Fusion and Ensemble Learning

Wenli Zhang1,2 and Wei Song1,2,*
Author Affiliations
  • 1School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, Jiangsu, China
  • 2Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Jiangnan University, Wuxi 214122, Jiangsu, China

    Fine-grained image classification aims to accurately recognize subcategories within a given superclass; however, it faces the challenges of large intra-class differences, small inter-class differences, and limited training samples. Most current methods build on the Vision Transformer to improve classification performance, but three issues remain: ignoring the complementary information in classification tokens from different layers leads to incomplete global feature extraction; inconsistent performance across the heads of the multi-head self-attention mechanism leads to inaccurate part localization; and limited training samples make the model prone to overfitting. In this study, a fine-grained image classification network based on feature fusion and ensemble learning is proposed to address these issues. The network consists of three modules: the multi-level feature fusion module integrates complementary information to obtain more complete global features; the multi-expert part voting module votes for part tokens through ensemble learning to enhance the representation ability of part features; and the attention-guided mixup augmentation module alleviates overfitting and improves classification accuracy. The classification accuracy on the CUB-200-2011, Stanford Dogs, NABirds, and IP102 datasets is 91.92%, 93.10%, 90.98%, and 76.21%, respectively, which is 1.42, 1.50, 1.08, and 2.81 percentage points higher than that of the original Vision Transformer model, and the proposed network outperforms the other fine-grained image classification methods compared.
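
The abstract does not state the voting rule used by the multi-expert part voting module, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each attention head acts as an "expert" that nominates the patch tokens it attends to most strongly from the classification token, and that the patches with the most nominations are kept as part tokens. The function name `vote_part_tokens`, the parameters `top_k` and `num_parts`, and the toy attention values are all hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def vote_part_tokens(cls_attention: Sequence[Sequence[float]],
                     top_k: int, num_parts: int) -> List[int]:
    """Select patch-token indices by majority vote across attention heads.

    cls_attention[h][p] is the (assumed) attention weight that head h
    assigns from the classification token to patch token p.
    """
    votes: Counter = Counter()
    for head_weights in cls_attention:
        # Each head ("expert") nominates its top_k most-attended patches.
        ranked = sorted(range(len(head_weights)),
                        key=lambda p: head_weights[p], reverse=True)
        votes.update(ranked[:top_k])
    # Keep the num_parts patches with the most votes; ties go to the lower index.
    ordered = sorted(votes, key=lambda p: (-votes[p], p))
    return ordered[:num_parts]


# Toy attention weights (made up for illustration): 3 heads, 5 patch tokens.
attention = [
    [0.05, 0.40, 0.30, 0.10, 0.15],  # head 0 nominates patches 1 and 2
    [0.10, 0.35, 0.05, 0.45, 0.05],  # head 1 nominates patches 3 and 1
    [0.45, 0.15, 0.25, 0.10, 0.05],  # head 2 nominates patches 0 and 2
]
selected = vote_part_tokens(attention, top_k=2, num_parts=2)
print(selected)  # [1, 2]
```

In this toy case, patches 1 and 2 each receive two votes, while patches 0 and 3 receive one vote each, so a single head that focuses on an unusual patch does not decide the selection on its own. This is the kind of robustness to inconsistent heads that the abstract attributes to ensemble voting.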

    Wenli Zhang, Wei Song. Fine-Grained Image Classification Based on Feature Fusion and Ensemble Learning[J]. Laser & Optoelectronics Progress, 2024, 61(22): 2237010

    Paper Information

    Category: Digital Image Processing

    Received: Feb. 28, 2024

    Accepted: Apr. 11, 2024

    Published Online: Nov. 19, 2024

    Author Email: Wei Song (songwei@jiangnan.edu.cn)

    DOI: 10.3788/LOP240759

    CSTR: 32186.14.LOP240759
