Chinese Journal of Lasers, Volume 49, Issue 20, 2007205 (2022)

Fundus Image Classification Research Based on Ensemble Convolutional Neural Network and Vision Transformer

Yuan Yuan, Minghui Chen*, Shuting Ke, Teng Wang, Longxi He, Linjie Lü, Hao Sun, and Jiannan Liu
Author Affiliations
  • Shanghai Engineering Research Center of Interventional Medical, Ministry of Education of Medical Optical Engineering Center, School of Health Sciences and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
    Figures & Tables(10)
    Overview of Vit model
    Structure of MBConv and Fused-MBConv. (a) MBConv; (b) Fused-MBConv
    SimAM 3D attention weights
    Confusion matrices of Vit and EfficientNetV2-S models. (a) Vit; (b) EfficientNetV2-S
    Heatmaps of abnormal fundus images in dataset. (a) Abnormalities in optic disc; (b) abnormalities in macula
    • Table 1. EfficientNetV2-S architecture

      | Stage | Operator | Stride | Number of channels | Number of layers |
      |-------|----------|--------|--------------------|------------------|
      | 0 | Conv 3×3 | 2 | 24 | 1 |
      | 1 | Fused-MBConv1, k 3×3 | 1 | 24 | 2 |
      | 2 | Fused-MBConv4, k 3×3 | 2 | 48 | 4 |
      | 3 | Fused-MBConv4, k 3×3 | 2 | 64 | 4 |
      | 4 | MBConv4, k 3×3, SimAM | 2 | 128 | 6 |
      | 5 | MBConv6, k 3×3, SimAM | 1 | 160 | 9 |
      | 6 | MBConv6, k 3×3, SimAM | 2 | 272 | 15 |
      | 7 | Conv 1×1 & Pooling & FC | - | 1792 | 1 |
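    • Table 2. Fundus dataset

      | Disease category | Number of training images | Number of testing images | Total number of images |
      |------------------|---------------------------|--------------------------|------------------------|
      | Normal | 2818 | 409 | 3227 |
      | DR | 1389 | 203 | 1592 |
      | ARMD | 147 | 49 | 196 |
      | Myopia | 234 | 35 | 269 |
      | Cataract | 262 | 43 | 305 |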
    • Table 3. Accuracy, precision, and specificity of Vit, EfficientNetV2-S, and EfficientNet-Vit models

      | Model | Accuracy /% | Precision /% | Specificity /% | Training time /h |
      |-------|-------------|--------------|----------------|------------------|
      | Vit | 91.1 | 86.4 | 97.2 | 11.0 |
      | EfficientNetV2-S | 92.2 | 87.6 | 97.5 | 9.2 |
      | EfficientNet-Vit | 92.7 | 88.3 | 98.1 | - |
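The metrics reported in Table 3 can be derived from a multi-class confusion matrix. The sketch below is illustrative only: it assumes a one-vs-rest treatment of each class with macro averaging (this page does not state how the authors averaged), the function name `summarize_confusion_matrix` is hypothetical, and the 3-class matrix is invented toy data, not the paper's results.

```python
from typing import List, Tuple


def summarize_confusion_matrix(cm: List[List[int]]) -> Tuple[float, float, float]:
    """Return (accuracy, macro precision, macro specificity).

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Macro averaging is an assumption of this sketch.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total

    precisions = []
    specificities = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted otherwise
        tn = total - tp - fp - fn
        precisions.append(tp / (tp + fp) if (tp + fp) else 0.0)
        specificities.append(tn / (tn + fp) if (tn + fp) else 0.0)

    return accuracy, sum(precisions) / n, sum(specificities) / n


# Toy 3-class confusion matrix (made-up numbers, for illustration only).
toy_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]
toy_accuracy, toy_precision, toy_specificity = summarize_confusion_matrix(toy_cm)
```

Specificity stays high when most samples are true negatives for each class, which is one plausible reason the specificity column sits well above the precision column for every model.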
    • Table 4. Comparison of accuracy indexes of different models

      | Model | Accuracy /% |
      |-------|-------------|
      | ResNet50 | 87.3 |
      | DenseNet121 | 89.5 |
      | ResNeSt-101 | 90.7 |
      | EfficientNet-B0 | 91.3 |
      | TNT-B | 91.1 |
      | EfficientNet-Vit | 92.7 |
    • Table 5. Comparison of accuracy indexes of models with different weighted factors

      | Weighted factor | Accuracy /% |
      |-----------------|-------------|
      | 0.3, 0.7 | 92.0 |
      | 0.4, 0.6 | 92.7 |
      | 0.5, 0.5 | 91.6 |
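One common way to combine two classifiers with a pair of weighted factors is a weighted average of their class probabilities (soft voting). The sketch below assumes that scheme; this page does not confirm that it is the authors' fusion rule, nor which model receives which factor. The function names, the assignment of 0.4 and 0.6, and the probability vectors are all hypothetical.

```python
from typing import List, Sequence

# Class order taken from the dataset table on this page.
CLASSES = ["Normal", "DR", "ARMD", "Myopia", "Cataract"]


def weighted_soft_vote(
    probs_a: Sequence[float],
    probs_b: Sequence[float],
    weight_a: float,
    weight_b: float,
) -> List[float]:
    """Fuse two probability vectors by a weighted sum (assumed fusion rule)."""
    if len(probs_a) != len(probs_b):
        raise ValueError("probability vectors must have the same length")
    return [weight_a * pa + weight_b * pb for pa, pb in zip(probs_a, probs_b)]


def predict_label(fused: Sequence[float]) -> str:
    """Return the class name with the highest fused score."""
    best = max(range(len(fused)), key=lambda i: fused[i])
    return CLASSES[best]


# Made-up softmax outputs for a single image from two hypothetical models.
model_a_probs = [0.50, 0.30, 0.10, 0.05, 0.05]
model_b_probs = [0.20, 0.60, 0.10, 0.05, 0.05]

# Which model gets 0.4 and which gets 0.6 is an assumption of this sketch.
fused_probs = weighted_soft_vote(model_a_probs, model_b_probs, 0.4, 0.6)
fused_label = predict_label(fused_probs)
```

In this toy case the two models disagree on the top class, and the heavier weight on the second model decides the fused prediction. When the weights sum to 1 and both inputs are valid probability vectors, the fused vector also sums to 1.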
    Citation

    Yuan Yuan, Minghui Chen, Shuting Ke, Teng Wang, Longxi He, Linjie Lü, Hao Sun, Jiannan Liu. Fundus Image Classification Research Based on Ensemble Convolutional Neural Network and Vision Transformer[J]. Chinese Journal of Lasers, 2022, 49(20): 2007205

    Paper Information

    Category: Biomedical Optical Imaging

    Received: Mar. 4, 2022

    Accepted: Jun. 6, 2022

    Published Online: Aug. 10, 2022

    Corresponding Author Email: Chen Minghui (cmhui.43@163.com)

    DOI:10.3788/CJL202249.2007205
