Laser & Optoelectronics Progress, Vol. 60, Issue 4, 0410013 (2023)

Assistant Diagnosis of Pediatric Pneumonia Based on Vision Transformer

Shuang Zhao1, Guohui Wei2, Wenhua Zhao2,*, and Zhiqing Ma2
Author Affiliations
  • 1Laboratory Management Office, Shandong University of Traditional Chinese Medicine, Jinan 250355, Shandong, China
  • 2College of Intelligence and Information Engineering, Shandong University of Traditional Chinese Medicine, Jinan 250355, Shandong, China
    Figures & Tables (14)
    Overall network architecture
    U-Net architecture
    Sample graph. (a) U-Net sub-module; (b) residual module
    ViT model architecture
    Encoder module architecture
    Self-attention mechanism
    Sample of Chest X-Ray dataset. (a) Pneumonia image; (b) normal image
    Segmentation results. (a) Original images; (b) masks; (c) prediction images
    Segmentation results. (a) Original image; (b) mask; (c) prediction image
    Confusion matrix
    • Table 1. Network structure of R50 feature extraction stage

      Output size | R50
      112×112     | stdConv, 7×7, 64, stride 2
      56×56       | max pool, 3×3, stride 2
                  | [stdConv, 1×1, 64; stdConv, 3×3, 64; stdConv, 1×1, 256] ×3
      28×28       | [stdConv, 1×1, 128; stdConv, 3×3, 128; stdConv, 1×1, 512] ×4
      14×14       | [stdConv, 1×1, 256; stdConv, 3×3, 256; stdConv, 1×1, 1024] ×9
      14×14       | Conv, 1×1, 768, stride 1
    • Table 2. Network structure of R-MSA32 feature extraction stage

      Output size | R-MSA32
      112×112     | stdConv, 7×7, 64, stride 2
      56×56       | max pool, 3×3, stride 2
                  | [stdConv, 1×1, 64; stdConv, 3×3, 64; stdConv, 1×1, 256] ×3
      28×28       | [stdConv, 1×1, 128; stdConv, 3×3, 128; stdConv, 1×1, 512] ×4
      14×14       | [stdConv, 1×1, 256; stdConv, MSA, 256; stdConv, 1×1, 1024] ×3
      14×14       | Conv, 1×1, 768, stride 1
    • Table 3. Comparison between ViT network model and hybrid ViT network model

      Model           | Accuracy | Precision | Recall
      ResNet50        | 94.70    | 94.32     | 96.49
      ViT             | 95.21    | 96.49     | 96.94
      R50+ViT         | 96.08    | 97.42     | 97.20
      ResUNet+R50+ViT | 96.58    | 98.13     | 97.22
      Proposed model  | 97.26    | 98.60     | 97.69
    • Table 4. Comparison with existing research results

      Model          | Accuracy
      DenseNet       | 90.46
      AlexNet_S      | 94.87
      GIV3           | 96.77
      Proposed model | 97.26
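
The accuracy, precision, and recall columns above, together with the confusion matrix figure, can be related through the standard binary-classification definitions. The following is a minimal sketch only: it assumes pneumonia is the positive class and that the tabulated values are percentages, neither of which this page states. The function name `classification_metrics` and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) in percent from confusion-matrix counts.

    tp: positives predicted positive   fp: negatives predicted positive
    fn: positives predicted negative   tn: negatives predicted negative
    Assumption: standard definitions; the paper's exact formulas are not shown here.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = 100.0 * (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    recall = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts, chosen only to illustrate the arithmetic.
acc, prec, rec = classification_metrics(tp=90, fp=10, fn=5, tn=95)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

With these made-up counts, accuracy is (90 + 95) / 200, precision is 90 / 100, and recall is 90 / 95, each scaled to a percentage.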

    Shuang Zhao, Guohui Wei, Wenhua Zhao, Zhiqing Ma. Assistant Diagnosis of Pediatric Pneumonia Based on Vision Transformer[J]. Laser & Optoelectronics Progress, 2023, 60(4): 0410013

    Paper Information

    Category: Image Processing

    Received: Nov. 22, 2021

    Accepted: Jan. 5, 2022

    Published Online: Feb. 13, 2023

    Author Email: Wenhua Zhao (zhaowh0621@163.com)

    DOI: 10.3788/LOP213019
