Computer Engineering, Volume 51, Issue 8, 16 (2025)
Review of Application of SAM and Its Improved Models in Image Segmentation
[1] BOMMASANI R, HUDSON D A, ADELI E, et al. On the opportunities and risks of foundation models[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2108.07258v3.
[2] DUBEY A, JAUHRI A, PANDEY A, et al. The Llama 3 herd of models[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2407.21783.
[3] ACHIAM J, ADLER S, AGARWAL S, et al. GPT-4 technical report[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2303.08774.
[5] WANG H Y, GUO S Z, YE J, et al. SAM-Med3D: towards general-purpose segmentation models for volumetric medical images[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2310.15161v3.
[6] PANDEY S, CHEN K F, DAM E B. Comprehensive multimodal segmentation in medical imaging: combining YOLOv8 with SAM and HQ-SAM models[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Washington D.C., USA: IEEE Press, 2023: 2584-2590.
[7] PARULEKAR B, SINGH N, RAMIYA A M. Evaluation of Segment Anything Model (SAM) for automated labelling in machine learning classification of UAV geospatial data[J]. Earth Science Informatics, 2024, 17(5): 4407-4418.
[8] HETANG C R, XUE H R, LE C, et al. Segment anything model for road network graph extraction[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Washington D.C., USA: IEEE Press, 2024: 2556-2566.
[9] ZHAO X Q, WU Z, CHEN Y B, et al. Fine-grained high-resolution remote sensing image change detection by SAM-U-Net change detection model[J]. Remote Sensing, 2024, 16(19): 3620.
[10] ZHANG J J, BAI C J, HE H R, et al. SAM-E: leveraging visual foundation model with sequence imitation for embodied manipulation[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2405.19586v1.
[11] CHENG Y M, LI L L, XU Y Y, et al. Segment and track anything[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2305.06558v1.
[12] AHMADI M, LONBAR A G, NAEINI H K, et al. Application of segment anything model for civil infrastructure defect assessment[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2304.12600v2.
[13] KIRILLOV A, MINTUN E, RAVI N, et al. Segment anything[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Washington D.C., USA: IEEE Press, 2023: 3992-4003.
[14] ZHANG Y C, SHEN Z R, JIAO R S. Segment anything model for medical image segmentation: current applications and future directions[J]. Computers in Biology and Medicine, 2024, 171: 108238.
[17] ALI M, WU T, HU H J, et al. A review of the Segment Anything Model (SAM) for medical image analysis: accomplishments and perspectives[J]. Computerized Medical Imaging and Graphics, 2025, 119: 102473.
[18] DOSOVITSKIY A. An image is worth 16 × 16 words: Transformers for image recognition at scale[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2010.11929.
[19] RONNEBERGER O, FISCHER P, BROX T. U-Net: convolutional networks for biomedical image segmentation[M]. Berlin, Germany: Springer International Publishing, 2015.
[20] CHEN L C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation[EB/OL]. [2024-10-11]. https://arxiv.org/abs/1706.05587v3.
[21] HE K M, CHEN X L, XIE S N, et al. Masked autoencoders are scalable vision learners[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Washington D.C., USA: IEEE Press, 2022: 15979-15988.
[22] ZHAO X, DING W C, AN Y Q, et al. Fast segment anything[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2306.12156v1.
[23] ZHANG C N, HAN D S, QIAO Y, et al. Faster segment anything: towards lightweight SAM for mobile applications[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2306.14289v2.
[24] ZHANG C N, HAN D S, ZHENG S, et al. MobileSAMv2: faster segment anything to everything[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2312.09579v1.
[25] XIONG Y Y, VARADARAJAN B, WU L M, et al. EfficientSAM: leveraged masked image pretraining for efficient segment anything[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Washington D.C., USA: IEEE Press, 2024: 16111-16121.
[26] ZHANG Z Y, CAI H, HAN S. EfficientViT-SAM: accelerated segment anything model without performance loss[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Washington D.C., USA: IEEE Press, 2024: 7859-7863.
[27] ZHOU C, LI X T, LOY C C, et al. EdgeSAM: prompt-in-the-loop distillation for on-device deployment of SAM[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2312.06660v2.
[28] KE L, YE M, DANELLJAN M, et al. Segment anything in high quality[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2306.01567.
[29] SONG Y, ZHOU Q, LI X, et al. BA-SAM: scalable bias-mode attention mask for segment anything model[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Washington D.C., USA: IEEE Press, 2024: 3162-3173.
[30] LI F, ZHANG H, SUN P Z, et al. Semantic-SAM: segment and recognize anything at any granularity[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2307.04767v1.
[31] FENG Z S, ZHANG Y L, CHEN Y H, et al. SwinSAM: fine-grained polyp segmentation in colonoscopy images via segment anything model integrated with a Swin Transformer decoder[J]. Biomedical Signal Processing and Control, 2025, 100: 107055.
[32] CHEN W T, VONG Y J, KUO S Y, et al. RobustSAM: segment anything robustly on degraded images[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2406.09627v1.
[33] ZHANG L, LIANG Y, ZHANG R, et al. BLO-SAM: bi-level optimization based finetuning of the segment anything model for overfitting-preventing semantic segmentation[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2402.16338.
[34] JIANG M Z, ZHOU J Y, WU J D, et al. Uncertainty-Aware Adapter: adapting Segment Anything Model (SAM) for ambiguous medical image segmentation[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2403.10931v2.
[35] WU J D, JI W, LIU Y P, et al. Medical SAM adapter: adapting segment anything model for medical image segmentation[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2304.12620v7.
[36] MA J, HE Y, LI F, et al. Segment anything in medical images[J]. Nature Communications, 2024, 15(1): 654.
[37] ZHANG K D, LIU D. Customized segment anything model for medical image segmentation[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2304.13785v2.
[38] GAO Y F, XIA W, HU D D, et al. DeSAM: decoupled segment anything model for generalizable medical image segmentation[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2306.00499v2.
[39] CHEN T R, ZHU L Y, DING C T, et al. SAM-Adapter: adapting segment anything in underperformed scenes[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW). Washington D.C., USA: IEEE Press, 2023: 3359-3367.
[40] HUANG Y, LAI W B, JI J Y, et al. HRSAM: efficient interactive segmentation in high-resolution images[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2407.02109v2.
[41] LI B, XIAO H K, TANG L. ASAM: boosting segment anything model with adversarial tuning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Washington D.C., USA: IEEE Press, 2024: 3699-3710.
[43] CHEN K Y, LIU C Y, CHEN H, et al. RSPrompter: learning to prompt for remote sensing instance segmentation based on visual foundation model[J]. IEEE Transactions on Geoscience and Remote Sensing, 2024, 62: 4701117.
[44] YUE W X, ZHANG J, HU K, et al. SurgicalSAM: efficient class promptable surgical instrument segmentation[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto, USA: AAAI Press, 2024: 6890-6898.
[45] SUN Y P, CHEN J H, ZHANG S, et al. VRP-SAM: SAM with visual reference prompt[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Washington D.C., USA: IEEE Press, 2024: 23565-23574.
[46] MO S T, TIAN Y P. AV-SAM: segment anything model meets audio-visual localization and segmentation[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2305.01836v1.
[47] ZHANG Y X, CHENG T H, ZHU L H, et al. EVF-SAM: early vision-language fusion for text-prompted segment anything model[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2406.20076v5.
[48] RAJIČ F, KE L, TAI Y W, et al. Segment anything meets point tracking[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2307.01197v2.
[49] CHEN P F, XIE L X, HUO X Y, et al. SAM-CP: marrying SAM with composable prompts for versatile segmentation[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2407.16682v1.
[50] ZHANG R R, JIANG Z K, GUO Z Y, et al. Personalize segment anything model with one shot[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2305.03048v2.
[51] ZHOU C P, NING K J, SHEN Q Q, et al. SAM-SP: self-prompting makes SAM great again[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2408.12364v1.
[52] XU Y S, TANG J Q, MEN A D, et al. EviPrompt: a training-free evidential prompt generation method for segment anything model in medical images[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2311.06400v1.
[53] CHEN Z, XU Q, LIU X Y, et al. UN-SAM: universal prompt-free segmentation for generalized nuclei images[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2402.16663v1.
[55] LENG T A, ZHANG Y M, HAN K, et al. Self-sampling meta SAM: enhancing few-shot medical image segmentation with meta-learning[C]//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Washington D.C., USA: IEEE Press, 2024: 7910-7920.
[56] QI X Y, WU Y F, MAO Y Q, et al. Self-guided few-shot semantic segmentation for remote sensing imagery based on large vision models[M]. Berlin, Germany: Springer, 2024.
[57] HE C, LI K, ZHANG Y, et al. Weakly-supervised concealed object segmentation with SAM-based pseudo labeling and multi-scale feature grouping[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2305.11003.
[58] HU M Z, LI Y H, YANG X F. SkinSAM: empowering skin cancer segmentation with segment anything model[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2304.13973v1.
[59] CAO Y K, XU X H, SUN C, et al. Segment any anomaly without training via hybrid prompt regularization[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2305.10724v1.
[60] CUI C, DENG R N, LIU Q, et al. All-in-SAM: from weak annotation to pixel-wise nuclei segmentation with prompt-based finetuning[J]. Journal of Physics: Conference Series, 2024, 2722(1): 012012.
[61] DAI H X, MA C, YAN Z L, et al. SAMAug: point prompt augmentation for segment anything model[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2307.01187v4.
[62] WU K, ZHANG J N, PENG H W, et al. TinyViT: fast pretraining distillation for small vision transformers[M]. Berlin, Germany: Springer, 2022.
[63] ZHANG H J, SU Y Y, XU X, et al. Improving the generalization of segmentation foundation model under distribution shift via weakly supervised adaptation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Washington D.C., USA: IEEE Press, 2024: 23385-23395.
[64] SAHOO P, SINGH A K, SAHA S, et al. A systematic survey of prompt engineering in large language models: techniques and applications[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2402.07927v2.
[65] ANTONIOU A, EDWARDS H, STORKEY A. How to train your MAML[EB/OL]. [2024-10-11]. https://arxiv.org/abs/1810.09502.
[66] SUN W X, LIU Z Y, ZHANG Y H, et al. An alternative to WSSS? An empirical study of the Segment Anything Model (SAM) on weakly-supervised semantic segmentation problems[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2305.01586v2.
[67] HO J, JAIN A, ABBEEL P. Denoising diffusion probabilistic models[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2006.11239.
[68] XU Q, LI J X, HE X J, et al. ESP-MedSAM: efficient self-prompting SAM for universal domain-generalized medical image segmentation[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2407.14153v4.
[69] YILDIZ Z, GU H, ZHANG J, et al. SegmentWithSAM: 3D slicer extension for Segment Anything Model (SAM)[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2408.15224.
[70] WANG D, ZHANG J, DU B, et al. SAMRS: scaling-up remote sensing segmentation dataset with segment anything model[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2305.02034.
[72] ZHANG J, YANG X B, JIANG R, et al. RSAM-Seg: a SAM-based approach with prior knowledge integration for remote sensing image semantic segmentation[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2402.19004v1.
[73] LEE H, KIM K, LEE K. Application of Geo-Segment Anything Model (SAM) scheme to water body segmentation: an experiment study using CAS500-1 images[J]. Korean Journal of Remote Sensing, 2024, 40(4): 343-350.
[74] ZHANG X, LIU Y, LIN Y M, et al. UV-SAM: adapting segment anything model for urban village identification[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto, USA: AAAI Press, 2024: 22520-22528.
[75] XI L D, YU J C, GE D Q, et al. SAM-CFFNet: SAM-based cross-feature fusion network for intelligent identification of landslides[J]. Remote Sensing, 2024, 16(13): 2334.
[76] GIANNAKIS I, BHARDWAJ A, SAM L, et al. Deep learning universal crater detection using Segment Anything Model (SAM)[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2304.07764v1.
[77] ZHANG S M, LU Q H. Innovative integration of visual foundation model with a robotic arm on a mobile platform[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2404.18720v1.
[78] MOENCK K, WENDT A, PRÜNTE P, et al. Industrial segment anything—a case study in aircraft manufacturing, intralogistics, maintenance, repair, and overhaul[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2307.12674v1.
[79] LIANG W, MA X G. Group-Mix SAM: lightweight solution for industrial assembly line applications[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2403.10053v1.
[80] LI Z S, HUO D, MEURER M, et al. Efficient cutting tool wear segmentation based on segment anything model[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2407.01211.
[81] YANG Y H, WU X Y, HE T, et al. SAM3D: segment anything in 3D scenes[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2306.03908v1.
[82] CORDTS M, OMRAN M, RAMOS S, et al. The Cityscapes dataset for semantic urban scene understanding[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Washington D.C., USA: IEEE Press, 2016: 3213-3223.
[83] NEUHOLD G, OLLMANN T, BULÒ S R, et al. The Mapillary Vistas dataset for semantic understanding of street scenes[C]//Proceedings of the IEEE International Conference on Computer Vision (ICCV). Washington D.C., USA: IEEE Press, 2017: 5000-5009.
[84] LAKHANI P, MONGAN J, SINGHAL C, et al. The 2021 SIIM-FISABIO-RSNA machine learning COVID-19 challenge: annotation and standard exam classification of COVID-19 chest radiographs[J]. Journal of Digital Imaging, 2023, 36(1): 365-372.
[85] LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[M]. Berlin, Germany: Springer, 2014.
[86] EVERINGHAM M, VAN GOOL L, WILLIAMS C K I, et al. The PASCAL Visual Object Classes (VOC) challenge[J]. International Journal of Computer Vision, 2010, 88(2): 303-338.
[87] ZHOU B L, ZHAO H, PUIG X, et al. Semantic understanding of scenes through the ADE20K dataset[J]. International Journal of Computer Vision, 2019, 127(3): 302-321.
[88] MARTIN D, FOWLKES C, TAL D, et al. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics[C]//Proceedings of the 8th IEEE International Conference on Computer Vision. Washington D.C., USA: IEEE Press, 2001: 416-423.
[89] DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2009: 248-255.
[90] WANG L J, LU H C, WANG Y F, et al. Learning to detect salient objects with image-level supervision[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Washington D.C., USA: IEEE Press, 2017: 3796-3805.
[91] WANG J J, ZHENG Z, MA A L, et al. LoveDA: a remote sensing land-cover dataset for domain adaptive semantic segmentation[EB/OL]. [2024-10-11]. https://arxiv.org/abs/2110.08733v6.
[92] LECLERC S, SMISTAD E, PEDROSA J, et al. Deep learning for segmentation using an open large-scale dataset in 2D echocardiography[J]. IEEE Transactions on Medical Imaging, 2019, 38(9): 2198-2210.
[93] ZHANG J, FAN D P, DAI Y C, et al. RGB-D saliency detection via cascaded mutual information minimization[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Washington D.C., USA: IEEE Press, 2021: 4318-4327.
[94] TU Z Z, XIA T, LI C L, et al. RGB-T image saliency detection via collaborative graph learning[J]. IEEE Transactions on Multimedia, 2020, 22(1): 160-173.
[95] QIN X B, DAI H, HU X B, et al. Highly accurate dichotomous image segmentation[M]. Berlin, Germany: Springer, 2022.
[96] FAN D P, JI G P, SUN G L, et al. Camouflaged object detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Washington D.C., USA: IEEE Press, 2020: 2777-2787.
[97] VICENTE T F Y, HOU L, YU C P, et al. Large-scale training of shadow detectors with noisily-annotated shadow examples[M]. Berlin, Germany: Springer International Publishing, 2016.
[98] FAN D P, JI G P, XU P, et al. Advances in deep concealed scene understanding[J]. Visual Intelligence, 2023, 1(1): 16.
[99] TAJBAKHSH N, GURUDU S R, LIANG J M. Automated polyp detection in colonoscopy videos using shape and context information[J]. IEEE Transactions on Medical Imaging, 2016, 35(2): 630-644.
[100] SHUMAILOV I, SHUMAYLOV Z, ZHAO Y R, et al. AI models collapse when trained on recursively generated data[J]. Nature, 2024, 631(8022): 755-759.
Mayilamu Musideke, GAO Yuxin, ZHANG Situo, FENG Ke, Abudukelimu Abulizi, Halidanmu Abudukelimu. Review of Application of SAM and Its Improved Models in Image Segmentation[J]. Computer Engineering, 2025, 51(8): 16
Received: Nov. 18, 2024
Accepted: Aug. 26, 2025
Published Online: Aug. 26, 2025
The Author Email: Halidanmu Abudukelimu (abdklmhldm@gmail.com)