Chinese Journal of Lasers, Volume 52, Issue 3, 0307104 (2025)

Large Kernel Convolution and Transformer Parallelism Based 3D Medical Image Registration Modeling

Jing Peng, Jiarong Yan*, Yu Shen, Jiaying Liu, Ziyi Wei, Shan Bai, Jiangcheng Li, Yukun Ma, and Ruoxuan Wang
Author Affiliations
  • School of Electronics and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, Gansu, China

    Objective

    Medical image registration is essential for surgical guidance and lesion monitoring. However, existing deep learning-based registration models typically rely on a single architecture, which prevents them from combining the complementary strengths of convolutional neural networks and Transformers. This often leads to suboptimal registration accuracy and difficulty in preserving the topology of the original image. To address these challenges, we propose PLKCT-UNet, a registration model that runs a large kernel multi-scale convolution encoder and a Transformer encoder in parallel.

    Methods

    We develop PLKCT-UNet, a three-dimensional (3D) medical image registration model that combines large kernel convolution and a Transformer in a parallel architecture. The encoder incorporates three key components. First, a large kernel multi-scale convolution module enhances the extraction of local detail and handles large deformations effectively. Second, a 3D Swin Transformer module improves the model's ability to capture long-range dependencies, thereby enhancing generalization performance. Finally, a multi-scale attention aggregation strategy refines the features after dual-encoder channel fusion, further boosting registration accuracy.
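
    The window-based attention behind a Swin-style module can be illustrated with a small sketch. Swin Transformers restrict self-attention to non-overlapping local windows and shift the window grid between successive blocks so that information can cross window borders. The grid size, window size, shift, and the helper name `window_id` below are illustrative assumptions, not the configuration used in PLKCT-UNet, and the attention masking that real implementations apply to wrapped-around tokens is omitted.

```python
def window_id(coord, grid, window, shift=0):
    """Return the 3D index of the attention window that holds a token.

    coord  : (z, y, x) position of the token in the feature grid
    grid   : (D, H, W) size of the feature grid (assumed divisible by window)
    window : edge length of a cubic attention window
    shift  : cyclic shift applied before partitioning (0 = regular windows)
    """
    return tuple(((c - shift) % g) // window for c, g in zip(coord, grid))


# Hypothetical settings chosen only for illustration.
grid = (8, 8, 8)
window = 4
token_a = (3, 3, 3)
token_b = (4, 4, 4)  # direct neighbor of token_a across a window border

regular = (window_id(token_a, grid, window), window_id(token_b, grid, window))
shifted = (
    window_id(token_a, grid, window, shift=window // 2),
    window_id(token_b, grid, window, shift=window // 2),
)

print("regular windows:", regular)
print("shifted windows:", shifted)
```

    Under these assumed settings, the two neighboring tokens fall in different windows in the regular layout but share a window after the shift, which is how stacked blocks extend the effective range of attention.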

    Results and Discussions

    To verify the effectiveness of PLKCT-UNet, experiments were conducted on the OASIS and LPBA40 datasets. In the comparative experiments, the OASIS dataset was used to measure the overlap between the segmentation masks of the registered moving images and the fixed images for seven comparison methods and the proposed method. The results show that the proposed algorithm significantly improves registration performance while preserving the integrity of brain structures and retaining local and spatial information. It achieves higher registration accuracy and maintains the continuity and consistency of anatomical structures, even under complex deformations. In the ablation experiments, the OASIS dataset was used to assess the contributions of the large kernel convolution (LKC) module, the 3D Swin Transformer module, and the multi-scale attention aggregation (MSAA) module. The results indicate that each module contributes to the overall network performance. Generalization experiments were performed on the LPBA40 dataset to validate the robustness of PLKCT-UNet across datasets. Comparisons with six mainstream algorithms show that PLKCT-UNet achieves higher registration accuracy and generates smoother deformation fields, thereby improving the overall registration quality. These experiments confirm the stability and generalization capability of PLKCT-UNet and highlight its advantages in handling complex deformations.
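
    The mask-overlap evaluation described above can be sketched as follows, assuming the overlap measure is the Dice coefficient, which this abstract does not name explicitly. The function names and the toy label maps are hypothetical and serve only to illustrate the computation; real evaluations operate on full 3D label volumes, usually with an array library.

```python
def dice_score(warped_labels, fixed_labels, label):
    """Dice overlap for one anatomical label between two flattened label maps."""
    if len(warped_labels) != len(fixed_labels):
        raise ValueError("label maps must contain the same number of voxels")
    in_warped = sum(1 for v in warped_labels if v == label)
    in_fixed = sum(1 for v in fixed_labels if v == label)
    overlap = sum(
        1 for w, f in zip(warped_labels, fixed_labels) if w == label and f == label
    )
    total = in_warped + in_fixed
    # Convention assumed here: a label absent from both maps counts as a perfect match.
    return 2.0 * overlap / total if total else 1.0


def mean_dice(warped_labels, fixed_labels, labels):
    """Average the per-label Dice scores over the structures of interest."""
    scores = [dice_score(warped_labels, fixed_labels, l) for l in labels]
    return sum(scores) / len(scores)


# Toy flattened label maps (0 = background, 1 and 2 = two structures).
warped = [0, 1, 1, 2, 2, 2, 0, 1]  # segmentation of the moving image after warping
fixed = [0, 1, 1, 1, 2, 2, 0, 0]   # segmentation of the fixed image

print("Dice, label 1:", dice_score(warped, fixed, 1))
print("Dice, label 2:", dice_score(warped, fixed, 2))
print("Mean Dice:", mean_dice(warped, fixed, [1, 2]))
```

    A score of 1 means the warped and fixed masks coincide exactly for that structure, and a score of 0 means they do not overlap at all.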

    Conclusions

    This study presents PLKCT-UNet, a novel registration model that combines LKC and a Transformer in parallel. The LKC module addresses receptive field size limitations, balancing detailed and global structures while employing kernel decomposition to reduce computational cost. The Swin Transformer module effectively captures long-range dependencies, enhancing the model's generalization ability. The MSAA module refines spatial and channel features through an attention aggregation strategy, improving dual-encoder feature fusion. On the OASIS dataset, the proposed model demonstrates superior registration performance compared to mainstream methods. Generalization experiments on the LPBA40 dataset further confirm its robustness and versatility. These results establish PLKCT-UNet as a state-of-the-art solution for unimodal medical image registration with broad application potential. Future work will focus on extending the algorithm to multimodal medical image registration and exploring more efficient optimization schemes to further enhance its practicality.
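
    To see why kernel decomposition can reduce cost, the sketch below counts weights for one commonly used scheme: a depthwise convolution, followed by a depthwise dilated convolution, followed by a 1×1×1 pointwise convolution. This scheme is an assumption made for illustration; the abstract does not state which decomposition PLKCT-UNet uses, and the kernel sizes, dilation, and channel count are hypothetical.

```python
def dense_params(kernel, channels):
    """Weights in a dense 3D convolution with a cubic kernel (bias ignored)."""
    return kernel ** 3 * channels * channels


def receptive_field(local_kernel, dilated_kernel, dilation):
    """Extent covered along one axis by a local conv followed by a dilated conv."""
    return local_kernel + (dilated_kernel - 1) * dilation


def decomposed_params(local_kernel, dilated_kernel, channels):
    """Weights in the assumed depthwise + depthwise-dilated + pointwise stack."""
    depthwise = channels * local_kernel ** 3
    depthwise_dilated = channels * dilated_kernel ** 3
    pointwise = channels * channels
    return depthwise + depthwise_dilated + pointwise


# Hypothetical configuration chosen only for illustration.
channels = 32
local_kernel, dilated_kernel, dilation = 5, 7, 3

span = receptive_field(local_kernel, dilated_kernel, dilation)
dense = dense_params(span, channels)
small = decomposed_params(local_kernel, dilated_kernel, channels)

print("receptive field per axis:", span)
print("dense kernel weights:", dense)
print("decomposed weights:", small)
```

    Under these assumptions the decomposed stack covers the same extent per axis as the dense kernel with a small fraction of the weights, because the cubic growth in kernel volume is paid per channel rather than per channel pair.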

    Citation

    Jing Peng, Jiarong Yan, Yu Shen, Jiaying Liu, Ziyi Wei, Shan Bai, Jiangcheng Li, Yukun Ma, Ruoxuan Wang. Large Kernel Convolution and Transformer Parallelism Based 3D Medical Image Registration Modeling[J]. Chinese Journal of Lasers, 2025, 52(3): 0307104

    Paper Information

    Category: Biomedical Optical Imaging

    Received: Oct. 15, 2024

    Accepted: Nov. 11, 2024

    Published Online: Jan. 20, 2025

    The Author Email: Yan Jiarong (yjr08140917@163.com)

    DOI:10.3788/CJL241269

    CSTR:32183.14.CJL241269
