Chinese Journal of Lasers, Volume. 51, Issue 21, 2107110(2024)
Full‐Automatic Brain Tumor Segmentation Based on Multimodal Feature Recombination and Scale Cross Attention Mechanism
Brain tumors pose a significant threat to human health, and fully automatic segmentation of brain tumors and their subregions in magnetic resonance imaging (MRI) is fundamental to computer-aided clinical diagnosis. In deep-learning-based brain MRI segmentation, tumors occupy only a small fraction of the image volume, have blurred boundaries, and may appear with any shape and at any location in the brain, all of which pose significant challenges to brain tumor segmentation. In this study, the morphological and anatomical characteristics of brain tumors are integrated, and a UNet with a multimodal feature recombination module and scale cross attention (MR-SC-UNet) is proposed. The MR-SC-UNet adopts a multitask segmentation framework in which a multimodal feature recombination module is designed for segmenting the different subregions, namely the whole tumor (WT), tumor core (TC), and enhancing tumor (ET). The learned weights are used to effectively integrate information from the different modalities, thereby yielding more targeted lesion features. This design reflects the fact that different MRI modalities highlight different subregions of brain tumor lesions.
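The abstract does not include implementation details; purely as an illustration of the modality-weighting idea described above, the following minimal PyTorch sketch learns one attention weight per MRI modality and recombines the per-modality feature maps. All class names, shapes, and the pooling-based scoring are our assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn

class ModalityRecombination(nn.Module):
    """Learns one attention weight per modality and recombines the feature maps."""
    def __init__(self, channels: int = 16):
        super().__init__()
        # Squeeze each modality's feature map to a single scalar score.
        self.score = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),   # (B, C, D, H, W) -> (B, C, 1, 1, 1)
            nn.Flatten(),              # -> (B, C)
            nn.Linear(channels, 1),    # -> (B, 1): one score per modality
        )

    def forward(self, feats):
        # feats: list of per-modality feature maps, each of shape (B, C, D, H, W)
        scores = torch.stack([self.score(f) for f in feats], dim=1)  # (B, M, 1)
        weights = torch.softmax(scores, dim=1)  # attention over the M modalities
        # Weighted sum: the learned weights emphasize the modality most
        # informative for the current subregion branch (WT, TC, or ET).
        recombined = sum(w.view(-1, 1, 1, 1, 1) * f
                         for w, f in zip(weights.unbind(dim=1), feats))
        return recombined, weights

# Usage with four modalities (e.g., T1, T1ce, T2, FLAIR) already encoded to features:
feats = [torch.randn(2, 16, 8, 8, 8) for _ in range(4)]
recombined, weights = ModalityRecombination(channels=16)(feats)
print(recombined.shape, weights.squeeze(-1))  # torch.Size([2, 16, 8, 8, 8]) and (2, 4) weights

The softmax makes the weights comparable across modalities, so a subregion-specific branch (e.g., the ET branch) can learn to emphasize whichever modality is most informative for it.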
To address the differing feature requirements for segmenting the different subregions of brain tumors, a segmentation framework was proposed that treats the segmentation of the three lesion regions as independent subtasks. In this framework, the complementary and shared information among the modalities is fully considered: a multimodal feature recombination (MR) module was designed to automatically learn an attention weight for each modality, and the recombined features obtained by integrating these learned weights with the conventionally extracted features are then fed into the segmentation network. Treating the three lesion regions as independent subtasks enables accurate glioma segmentation and resolves the problem that different regions require different multimodal information. To address the inability of a 3D UNet to fully extract global features and fuse multiscale information, a U-shaped network based on scale cross attention (SC-UNet) was proposed. Specifically, a scale cross attention (SC) module was designed and incorporated into the deep skip connections of the 3D UNet; by leveraging the global modeling capability of the transformer, the SC module extracts global image features and fully integrates multiscale information.
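For readers unfamiliar with cross-scale attention, the sketch below shows one plausible way to let a skip-connection feature attend to a coarser-scale feature using standard multi-head attention. It is a simplified stand-in for the idea, not the published SC module, and all names and defaults are hypothetical.

import torch
import torch.nn as nn

class ScaleCrossAttention(nn.Module):
    """Fine-scale tokens query coarse-scale tokens to gain global, multiscale context."""
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, fine, coarse):
        # fine:   (B, C, D, H, W)  features from the current skip connection
        # coarse: (B, C, d, h, w)  features from a deeper (coarser) scale
        B, C = fine.shape[:2]
        q = fine.flatten(2).transpose(1, 2)     # (B, N_fine, C) queries
        kv = coarse.flatten(2).transpose(1, 2)  # (B, N_coarse, C) keys/values
        out, _ = self.attn(self.norm(q), self.norm(kv), self.norm(kv))
        out = out + q                           # residual connection
        return out.transpose(1, 2).reshape(B, C, *fine.shape[2:])

# Usage: fuse a 16^3 skip feature with an 8^3 deeper feature
fine, coarse = torch.randn(1, 32, 16, 16, 16), torch.randn(1, 32, 8, 8, 8)
sc = ScaleCrossAttention(channels=32)
print(sc(fine, coarse).shape)  # torch.Size([1, 32, 16, 16, 16])

Because attention is computed over all coarse-scale tokens, every fine-scale position receives globally aggregated context, which is the property the abstract attributes to the transformer-based SC module.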
Figure 7 shows the ablation results for different configurations of the SC module. When the SC module is added to the third to fifth skip connections, the network best integrates deep multiscale features, enhancing the feature extraction capability of the model: the average Dice coefficient over the three regions reaches 87.98%, and the mean 95% Hausdorff distance is 5.82 mm, the best performance among the tested configurations. Table 1 lists the ablation results; the best results are obtained when the proposed MR and SC modules are used together, with the Dice coefficients of the three subregions increasing by 1.34, 2.33, and 7.08 percentage points. Table 2 compares the proposed method with six state-of-the-art methods, showing superior performance on most metrics. Figures 8 and 9 show segmentation visualizations: the improved model identifies tumor tissue more accurately and produces smoother segmentation boundaries. Additionally, by integrating multiscale features, the model gains a larger receptive field, reducing the implausible segmentations caused by a single scale and a limited receptive field. The segmentation results are therefore closer to the annotated images, with few false-positive regions.
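For reference, the Dice coefficient reported above measures volumetric overlap between the predicted and annotated masks, Dice = 2|A∩B| / (|A| + |B|). A minimal NumPy version (our own, for illustration only):

import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """Dice overlap between two binary masks (e.g., predicted vs. annotated WT)."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / (pred.sum() + target.sum() + eps))

# Toy example: two partially overlapping cubes in a 3D volume
pred = np.zeros((32, 32, 32)); pred[8:20, 8:20, 8:20] = 1
gt   = np.zeros((32, 32, 32)); gt[10:22, 10:22, 10:22] = 1
print(f"Dice = {dice_coefficient(pred, gt):.4f}")  # ~0.5787 for this toy case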
In this study, a deep learning framework, MR-SC-UNet, is proposed and applied to glioma segmentation. On the BraTS2019 dataset, the proposed method achieves average Dice scores of 91.13%, 87.46%, and 87.98% for the WT, TC, and ET regions, respectively, demonstrating its feasibility and effectiveness. In clinical practice, accurate tumor segmentation can substantially aid radiologists and neurosurgeons in disease assessment and provide a scientific basis for precise treatment planning and patient risk assessment.
Citation: Hengyi Tian, Yu Wang, Hongbing Xiao. Full‐Automatic Brain Tumor Segmentation Based on Multimodal Feature Recombination and Scale Cross Attention Mechanism[J]. Chinese Journal of Lasers, 2024, 51(21): 2107110
Category: Biomedical Optical Imaging
Received: Apr. 16, 2024
Accepted: Jul. 9, 2024
Published Online: Oct. 31, 2024
Author Emails: Wang Yu (wangyu@btbu.edu.cn), Xiao Hongbing (x.hb@163.com)
CSTR:32183.14.CJL240779