Chinese Journal of Lasers, Vol. 51, Issue 21, 2107102 (2024)
Application of Segment Anything Model in Medical Image Segmentation
The application of deep neural networks to image segmentation is one of the most prevalent topics in medical imaging. As an initial step in computer-aided detection processes, medical image segmentation aims to identify contours or regions of interest within images, thereby providing valuable assistance to clinicians in image interpretation, surgical planning, and clinical decision-making. Deep neural networks, which leverage their powerful ability to learn complex image features, have demonstrated outstanding performance in medical image segmentation. However, the use of deep neural networks for medical image segmentation has two significant limitations. First, different medical imaging modalities and specific segmentation tasks exhibit diverse image characteristics, leading to the low generalization capabilities of deep neural networks, which are often tailored to specific tasks. Second, increasingly complex network architectures with notable segmentation efficacy demand significant amounts of annotated image data, particularly those that require laborious manual annotation by medical experts.
With the rapid advancement of large-scale pretrained foundation models (LPFMs) in the field of artificial intelligence, an increasing number of tasks have achieved superior results through the fine-tuning of LPFMs. LPFMs are generic models that are trained on massive amounts of data and thereby acquire foundational, versatile representational capabilities that can be transferred across different domains. Consequently, a universal model can be readily fine-tuned for various downstream tasks. Considering the challenges in medical image segmentation, including low model generalization and difficulty in dataset acquisition, universal LPFMs are urgently needed in the field of medical image segmentation to facilitate breakthroughs in artificial intelligence applied to medical imaging.
Since its introduction as a foundational large model in the field of natural image segmentation, the segment anything model (SAM) has been applied across various domains with remarkable results. Although SAM has demonstrated powerful capabilities in natural image segmentation, its direct application to medical image segmentation tasks has yielded less-than-satisfactory outcomes. This can be attributed to two main factors. First, the training datasets contain shortcomings. SAM lacks sufficient representation of medical images in its training data, and medical images often exhibit blurry edges, which differ significantly from the clear edges present in natural images. Second, the characteristics of SAM prompts play a crucial role in segmentation performance. Only by judiciously selecting prompt strategies can the full potential of SAM be realized.
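To make the notion of a prompt strategy concrete, the following is a minimal sketch of how box and point prompts might be derived from a reference mask, as is commonly done when simulating prompts for evaluation. It is an illustration under stated assumptions, not a method described in the reviewed work: the function names `box_prompt` and `point_prompt`, the `margin` parameter, and the toy mask are all hypothetical, and no SAM library call is made.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary mask, mask[row][col] is 0 or 1


def foreground_pixels(mask: Mask) -> List[Tuple[int, int]]:
    """Return the (x, y) coordinates of every foreground pixel."""
    return [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]


def box_prompt(mask: Mask, margin: int = 0) -> Tuple[int, int, int, int]:
    """Bounding box (x_min, y_min, x_max, y_max), padded by `margin` and clipped to the image."""
    pixels = foreground_pixels(mask)
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    height, width = len(mask), len(mask[0])
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return (
        max(min(xs) - margin, 0),
        max(min(ys) - margin, 0),
        min(max(xs) + margin, width - 1),
        min(max(ys) + margin, height - 1),
    )


def point_prompt(mask: Mask) -> Tuple[int, int]:
    """Foreground pixel closest to the centroid, so the point always lies on the object."""
    pixels = foreground_pixels(mask)
    if not pixels:
        raise ValueError("mask has no foreground pixels")
    cx = sum(x for x, _ in pixels) / len(pixels)
    cy = sum(y for _, y in pixels) / len(pixels)
    return min(pixels, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)


# Toy 4x5 mask with a 2x3 foreground rectangle (hypothetical data).
toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
tight_box = box_prompt(toy_mask)
loose_box = box_prompt(toy_mask, margin=1)
center_point = point_prompt(toy_mask)
```

Comparing segmentation quality under `tight_box`, `loose_box`, and `center_point` is one simple way to probe how sensitive a promptable model is to the choice of prompt.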
For these two reasons, significant efforts have been directed toward fine-tuning SAM, adapting SAM to three-dimensional (3D) medical datasets, expanding SAM functionalities, and optimizing prompting strategies. Comprehensive review articles have summarized these endeavors, such as the study by Zhang et al., which extensively outlined advancements in fine-tuning SAM, expanding its functionalities, and optimizing prompting strategies, and distilled the challenges faced by SAM in the field of medical image segmentation. However, a systematic summary of methods for applying SAM to 3D medical datasets is lacking. Zhang et al. primarily elaborated on the fine-tuning of SAM, its application to 3D medical datasets, and related automatic prompting strategies. Nevertheless, as research on SAM deepens and its performance across various datasets improves, efforts in fine-tuning SAM, adapting it to 3D datasets, and optimizing prompting strategies have become more sophisticated. In addition, SAM has been extended to integrate semi-supervised learning methods and has been applied to novel directions such as interactive clinical healthcare. To comprehensively summarize the progress of SAM adaptation to medical image segmentation, address existing challenges, and provide directions for further research, a review that specifically focuses on the application of SAM to medical image segmentation is essential.
This study reviewed more than one hundred articles on the use of SAM for medical image segmentation. First, it described the SAM architecture in detail and summarized SAM's direct application to medical image datasets (Table 1). Then, it analyzed SAM's adaptation to medical image segmentation in depth, emphasizing refinements in fine-tuning techniques, the extension of SAM to 3D medical datasets, and its combination with semi-supervised learning methods (Fig. 3), alongside other emerging directions. Experimental evaluations on two proprietary medical image datasets validated the enhanced generalization capabilities of the large models after extensive data fine-tuning (Table 2). In addition, the study confirmed the effectiveness of combining SAM with semi-supervised networks in generating high-quality pseudo-labels, thereby improving segmentation performance (Table 3). Finally, the study examined the current limitations, identified areas requiring improvement, described the challenges encountered in SAM's adaptation to medical image segmentation, and proposed future directions, including the construction of large-scale datasets, enhancement of multi-modal and multi-scale information processing, integration of SAM with semi-supervised network structures, and expansion of SAM's application in clinical settings.
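As an illustration of how SAM output could help screen pseudo-labels in a semi-supervised pipeline, the sketch below keeps only those unlabeled samples for which a segmentation network's prediction and a SAM-generated mask agree closely, measured by the Dice coefficient. This is an assumed, simplified scheme rather than the procedure used in the study: the function `select_pseudo_labels`, the 0.8 threshold, and the toy masks are hypothetical, and only the Dice formula itself, 2|A∩B| / (|A| + |B|), is standard.

```python
from typing import List

Mask = List[List[int]]  # binary mask, mask[row][col] is 0 or 1


def dice(a: Mask, b: Mask) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|); defined as 1.0 when both masks are empty."""
    intersection = sum(p and q for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    total = sum(map(sum, a)) + sum(map(sum, b))
    return 1.0 if total == 0 else 2.0 * intersection / total


def select_pseudo_labels(
    network_masks: List[Mask], sam_masks: List[Mask], threshold: float = 0.8
) -> List[int]:
    """Indices of samples whose network and SAM masks agree with Dice >= threshold.

    The agreement rule and the threshold are assumptions made for illustration.
    """
    return [
        i
        for i, (net, sam) in enumerate(zip(network_masks, sam_masks))
        if dice(net, sam) >= threshold
    ]


# Toy data (hypothetical): sample 0 agrees perfectly, sample 1 only partially.
network_masks = [
    [[1, 1], [0, 0]],
    [[1, 1], [0, 0]],
]
sam_masks = [
    [[1, 1], [0, 0]],
    [[1, 0], [0, 0]],
]
partial_agreement = dice(network_masks[1], sam_masks[1])
kept_indices = select_pseudo_labels(network_masks, sam_masks)
```

A stricter threshold yields fewer but cleaner pseudo-labels; a looser one yields more training signal at the cost of more label noise.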
SAM is progressively being established as a potent asset in the field of medical image segmentation. In summary, although the integration of SAM into medical image segmentation holds great promise, it continues to face many challenges. Addressing these challenges requires a more comprehensive investigation and more refined approach, thus paving the way for effective implementation and further evolution of large-scale models in the domain of medical segmentation.
Tong Wu, Haoji Hu, Yang Feng, Qiong Luo, Dong Xu, Weizeng Zheng, Neng Jin, Chen Yang, Jincao Yao. Application of Segment Anything Model in Medical Image Segmentation[J]. Chinese Journal of Lasers, 2024, 51(21): 2107102
Category: Biomedical Optical Imaging
Received: Feb. 26, 2024
Accepted: May 10, 2024
Published Online: Oct. 31, 2024
The Author Email: Hu Haoji (haoji_hu@zju.edu.cn)
CSTR:32183.14.CJL240614