Acta Optica Sinica, Vol. 45, Issue 8, 0810001 (2025)
Dual‑Branch Multimodal Medical Image Fusion Based on Local and Global Information Collaboration
Medical image fusion is a crucial technology for assisting doctors in making accurate diagnoses. However, existing medical image fusion techniques suffer from blurred lesion boundaries, loss of detailed information, and high texture similarity between normal tissues and lesion regions. To address these problems, we propose a dual-branch multimodal medical image fusion method based on the collaboration of local and global information. The method not only reduces the loss of detailed information but also improves the clarity and accuracy of the fused images, enabling more precise identification of lesion regions and thereby providing more reliable support for medical image diagnosis.
We propose a dual-branch multimodal medical image fusion model based on the collaboration of local and global information. First, the model uses a multi-scale depthwise-separable convolutional network to extract feature information with different receptive fields from the input images. The extracted features are then fed into a dual-branch structure consisting of two modules: a deep local feature enhancement module and a global information extraction module. The local feature enhancement module focuses on enhancing image details, especially those in lesion areas, to improve the clarity of these regions. The global information extraction module, in contrast, captures the global structural context of the input images, ensuring overall consistency and the integrity of tissue structures during fusion. To further optimize the feature fusion process, we introduce two fusion units: the Multidimensional Joint Local Fusion Unit (MLFU) and the Cross-Global Fusion Unit (CGFU). The MLFU efficiently fuses the local features extracted by the two branches, ensuring that important fine-grained information is retained and enhanced during fusion. The CGFU fuses the global features, facilitating information sharing and complementarity between modalities. Finally, a convolutional layer adjusts and reconstructs the fused features to output a fused image with richer details and higher clarity.
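The abstract does not specify the exact layer configuration, but the overall data flow can be illustrated with a minimal PyTorch-style sketch. The channel widths, kernel sizes, and the internals of the MLFU and CGFU below are placeholder assumptions for illustration only, not the configuration reported in the paper:

# Minimal sketch of the dual-branch fusion pipeline described above.
# Channel widths, kernel sizes, and the MLFU/CGFU internals are assumptions
# made for illustration; they are not the paper's actual configuration.
import torch
import torch.nn as nn


class DepthwiseSeparableConv(nn.Module):
    """Depthwise convolution followed by a pointwise (1x1) convolution."""
    def __init__(self, channels, kernel_size):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, kernel_size,
                                   padding=kernel_size // 2, groups=channels)
        self.pointwise = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


class DualBranchFusion(nn.Module):
    """Multi-scale extraction -> local/global branches -> fusion -> reconstruction."""
    def __init__(self, channels=32):
        super().__init__()
        self.embed = nn.Conv2d(1, channels, 3, padding=1)
        # Multi-scale depthwise-separable convolutions with different receptive fields.
        self.scales = nn.ModuleList(
            [DepthwiseSeparableConv(channels, k) for k in (3, 5, 7)]
        )
        # Local branch: stand-in for the deep local feature enhancement module.
        self.local_branch = nn.Sequential(
            nn.Conv2d(3 * channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Global branch: stand-in for the global information extraction module.
        self.global_branch = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(3 * channels, channels, 1), nn.ReLU(),
        )
        # Placeholder fusion units (MLFU / CGFU internals are not given in the
        # abstract): features of both modalities are concatenated and mixed.
        self.local_fuse = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.global_fuse = nn.Conv2d(2 * channels, channels, 1)
        # Reconstruction layer producing the single-channel fused image.
        self.reconstruct = nn.Conv2d(2 * channels, 1, 3, padding=1)

    def extract(self, x):
        f = self.embed(x)
        ms = torch.cat([s(f) for s in self.scales], dim=1)  # multi-scale features
        return self.local_branch(ms), self.global_branch(ms)

    def forward(self, img_a, img_b):
        la, ga = self.extract(img_a)
        lb, gb = self.extract(img_b)
        local = self.local_fuse(torch.cat([la, lb], dim=1))
        glob = self.global_fuse(torch.cat([ga, gb], dim=1))
        glob = glob.expand_as(local)  # broadcast global context over spatial dims
        return self.reconstruct(torch.cat([local, glob], dim=1))


if __name__ == "__main__":
    model = DualBranchFusion()
    mri, pet = torch.rand(1, 1, 256, 256), torch.rand(1, 1, 256, 256)
    print(model(mri, pet).shape)  # torch.Size([1, 1, 256, 256])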
The effectiveness of the proposed model in medical image fusion tasks is validated through extensive comparison and ablation experiments. The experimental results demonstrate that our model significantly outperforms nine other mainstream methods in several key objective evaluation metrics on the Harvard public medical image dataset. Specifically, our model achieves improvements of 3.14%, 0.95%, 13.66%, 16.81%, and 1.12% in EN, SD, SF, AG, and CC, respectively (Table 4). The model enhances local feature extraction through the deep local feature enhancement module, which accurately captures subtle differences in the input images and significantly improves the clarity of lesion boundaries. To optimize the fusion results, the model applies different fusion strategies to different feature types, effectively integrating local features with global information and achieving more efficient information complementarity and collaboration between modalities. As a result, the fused images exhibit richer texture details and clearer structural features, significantly enhancing image readability and diagnostic value (Figs. 6, 7, and 8). Ablation experiments further validate the effectiveness of each module. The results show that removing the deep local feature enhancement module leads to a noticeable decline in lesion boundary clarity and texture detail, particularly in high-contrast lesion areas, where fusion quality deteriorates. In addition, removing the global information fusion module causes a significant loss of global consistency and information complementarity between modalities, leaving the fused result without the necessary global coherence (Figs. 11, 12, and 13).
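For reference, the objective metrics cited above can be computed with the standard formulations used in the image fusion literature; the sketch below shows EN, SD, SF, and AG (CC is omitted because it additionally requires the source images), and the paper's exact implementations may differ:

# Standard definitions of EN (entropy), SD (standard deviation), SF (spatial
# frequency), and AG (average gradient) as commonly used in fusion studies;
# these are illustrative and may differ from the paper's exact formulations.
import numpy as np


def entropy(img):
    """Shannon entropy of the 8-bit grayscale histogram."""
    hist, _ = np.histogram(img, bins=256, range=(0, 255))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))


def spatial_frequency(img):
    """SF = sqrt(RF^2 + CF^2) from row/column first differences."""
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))
    return np.sqrt(rf ** 2 + cf ** 2)


def average_gradient(img):
    """AG: mean magnitude of the local intensity gradient."""
    gx = np.diff(img, axis=1)[:-1, :]
    gy = np.diff(img, axis=0)[:, :-1]
    return np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2))


fused = np.random.rand(256, 256) * 255  # stand-in for a fused image
print(f"EN={entropy(fused):.3f}, SD={fused.std():.3f}, "
      f"SF={spatial_frequency(fused):.3f}, AG={average_gradient(fused):.3f}")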
The proposed algorithm effectively integrates local and global information, achieving excellent detail preservation and structural representation in medical image fusion tasks. It not only accurately fuses normal tissue structures to ensure global consistency but also highlights the details of abnormal lesion areas, improving the visibility and recognizability of lesions. By combining deep local feature extraction with global context information, the algorithm preserves local details while enhancing the texture features and boundary clarity of lesions, so that the fused images exhibit richer texture details and clearer structural features. Extensive comparative experiments demonstrate that the algorithm effectively improves the diagnostic accuracy of medical images. Compared with other mainstream methods, it performs outstandingly in multiple key objective evaluation metrics, especially in detail preservation, structural clarity, and lesion prominence.
Yu Shen, Jiaying Liu, Jiarong Yan, Ruoxuan Wang, Yukun Ma, Jiangcheng Li, Shan Bai, Ziyi Wei, Yangyang Li, Zhenkai Qiang. Dual‑Branch Multimodal Medical Image Fusion Based on Local and Global Information Collaboration[J]. Acta Optica Sinica, 2025, 45(8): 0810001
Category: Image Processing
Received: Jan. 6, 2025
Accepted: Feb. 18, 2025
Published Online: Apr. 27, 2025
Author Email: Liu Jiaying (1640264144@qq.com)
CSTR:32393.14.AOS250443