Laser Journal, Volume 45, Issue 9, 132 (2024)

Chest X-ray multi-label disease classification model based on vision transformer

LI Min1, WANG Yue1, ZHANG Yuchuan1, JI Zhuohao1 and HU Nan2,*
Author Affiliations
  • 1School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
  • 2Chongqing Qinying Technology Co., Ltd. Chongqing 400010, China
    References (26)

    [1] Wang X, Peng Y, Lu L, et al. ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases[C]//IEEE Conference on Computer Vision and Pattern Recognition, 2017: 3462-3471.

    [2] Krizhevsky A, Sutskever I, Hinton GE. ImageNet Classification with Deep Convolutional Neural Networks[J]. Communications of the ACM, 2017, 60(6): 84-90.

    [3] Simonyan K, Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition[C]//International Conference on Learning Representations, 2015.

    [4] Szegedy C, Liu W, Jia YQ, et al. Going Deeper with Convolutions[C]//IEEE Conference on Computer Vision and Pattern Recognition, 2015: 1-9.

    [5] He KM, Zhang XY, Ren SQ, et al. Deep Residual Learning for Image Recognition[C]//IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770-778.

    [6] Yao L. Weakly Supervised Medical Diagnosis and Localization from Multiple Resolutions. https://arxiv.org/abs/1803.07703, 2019-3-21.

    [7] Huang G, Liu Z, van der Maaten L, et al. Densely Connected Convolutional Networks[C]//IEEE Conference on Computer Vision and Pattern Recognition, 2017: 2261-2269.

    [8] Hochreiter S, Schmidhuber J. Long Short-Term Memory[J]. Neural Computation, 1997, 9(8): 1735-1780.

    [9] Zhang ZR, Li Q, Guan X. Multilabel chest X-ray disease classification based on a dense squeeze-and-excitation network[J]. Journal of Image and Graphics, 2020, 25(10): 2238-2248.

    [10] Valsson S, Arandjelović O. Nuances of interpreting X-ray analysis by deep learning and lessons for reporting experimental findings[J]. Sci, 2022, 4(1): 3.

    [11] Dosovitskiy A, Beyer L, Kolesnikov A, et al. An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale[C]//International Conference on Learning Representations, 2021.

    [12] Taslimi S, Taslimi S, Fathi N, et al. SwinCheX: Multi-label classification on chest X-ray images with transformers[J]. arXiv preprint arXiv:2206.04246, 2022.

    [13] Wang J, Yu X, Gao Y. Feature Fusion Vision Transformer for Fine-Grained Visual Categorization. https://arxiv.org/abs/2107.02341, 2021-5-31.

    [14] Ridnik T, Ben-Baruch E, Zamir N, et al. Asymmetric Loss for Multi-Label Classification[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 82-91.

    [15] Dosovitskiy A, Beyer L, Kolesnikov A, et al. An Image is Worth 16×16 Words: Transformers for Image Recognition at Scale[C]//International Conference on Learning Representations, 2021.

    [16] Vaswani A, Shazeer N, Parmar N, et al. Attention Is All You Need[J]. Advances in Neural Information Processing Systems, 2017, 30: 6000-6010.

    [17] Cordonnier JB, Loukas A, Jaggi M. On the Relationship Between Self-Attention and Convolutional Layers[C]//International Conference on Learning Representations, 2020.

    [18] Devlin J, Chang MW, Lee K, et al. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. https://arxiv.org/abs/1810.04805, 2018-4-12.

    [19] Tolstikhin IO, Houlsby N, Kolesnikov A, et al. MLP-Mixer: An All-MLP Architecture for Vision[J]. Advances in Neural Information Processing Systems, 2021, 34: 24261-24272.

    [20] Hendrycks D, Gimpel K. Bridging Nonlinearities and Stochastic Regularizers with Gaussian Error Linear Units. https://arxiv.org/abs/1606.08415, 2016-7-9.

    [21] Hendrycks D, Gimpel K. Gaussian Error Linear Units (GELUs)[J]. arXiv preprint arXiv:1606.08415, 2016.

    [22] Chaudhari S, Mithal V, Polatkan G, et al. An Attentive Survey of Attention Models[J]. ACM Transactions on Intelligent Systems and Technology, 2021, 12(5): 1-32.

    [23] Lin TY, Goyal P, Girshick R, et al. Focal Loss for Dense Object Detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2): 318-327.

    [24] Wang X, Peng Y. ChestX-ray8: Hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases[C]//IEEE Conference on Computer Vision and Pattern Recognition, 2017: 3462-3471.

    [25] Deng J, Dong W, Socher R, et al. ImageNet: A Large-Scale Hierarchical Image Database[C]//IEEE Conference on Computer Vision and Pattern Recognition, 2009: 248-255.

    [26] Mandrekar JN. Receiver Operating Characteristic Curve in Diagnostic Test Assessment[J]. Journal of Thoracic Oncology, 2010, 5(9): 1315-1316.

    Get Citation

    LI Min, WANG Yue, ZHANG Yuchuan, JI Zhuohao, HU Nan. Chest X-ray multi-label disease classification model based on vision transformer[J]. Laser Journal, 2024, 45(9): 132

    Paper Information

    Received: Feb. 21, 2024

    Accepted: Dec. 20, 2024

    Published Online: Dec. 20, 2024

    Corresponding author email: HU Nan (dplnan@126.com)

    DOI: 10.14016/j.cnki.jgzz.2024.09.132
