Acta Optica Sinica, Volume. 45, Issue 7, 0717001(2025)

LTDA‐Mamba: Retinal Vessel Segmentation Based on a Hybrid CNN‐Mamba Network

Yuanyuan Peng1,*, Haoyang Li1, Wen Li1, and Yuejin Zhang2
Author Affiliations
  • 1School of Electrical and Automation Engineering, East China Jiaotong University, Nanchang 330000, Jiangxi, China
  • 2School of Information and Software Engineering, East China Jiaotong University, Nanchang 330000, Jiangxi, China

    Objective

    Retinal vessel segmentation is a crucial task in ophthalmology, as it aids in the early detection and monitoring of various eye diseases, such as glaucoma, diabetic retinopathy, and hypertension-related retinopathy. Accurate segmentation can provide valuable insights into the microvasculature of the eye, which is essential for diagnosing and managing these conditions. However, retinal vessel segmentation remains challenging due to the complexity and variability of retinal images, including factors like low contrast, illumination variations, and vessel thickness discrepancies. Therefore, the objective of this study is to develop a robust and accurate segmentation algorithm that can effectively address these challenges.

    Methods

    To achieve this objective, we propose a novel CNN-Mamba network that integrates local intensity order transformation (LIOT) and dual cross-attention mechanisms. The proposed architecture consists of three main components: a convolutional neural network (CNN) encoder for feature extraction, a series of Mamba blocks that incorporate dual cross-attention mechanisms to capture dependencies between distant regions of the image, and a segmentation head that produces the final vessel segmentation mask.

    In the preprocessing stage, LIOT is applied to the input retinal image to enhance its contrast and detail. LIOT encodes the relative intensity order of pixels within a local window, so that the resulting representation reflects the underlying vessel structure and is robust to variations in absolute intensity. This preprocessing step highlights the edges and contours of the vessels and thus facilitates feature extraction by the CNN encoder.

    The CNN encoder extracts local features from the preprocessed image and consists of a series of convolutional layers, batch normalization layers, and ReLU activation functions. Its output is a set of feature maps that capture texture, edges, and shapes in the retinal image.

    The Mamba blocks are the core of the proposed network. Each block contains two parallel branches: a pixel-level selective structured state space model (PiM) and a patch-level selective structured state space model (PaM). The PiM branch processes local features and captures neighboring-pixel information, while the PaM branch models remote dependencies and global patch interactions. The dual cross-attention mechanisms within the Mamba blocks enable the network to capture complex dependencies between distant regions in the image, improving its ability to segment fine vascular structures.

    Finally, the segmentation head consists of a series of convolutional layers and a sigmoid activation function, which produce the final vessel segmentation mask.
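    As a concrete illustration of the preprocessing idea, an intensity-order transform in the spirit of LIOT can be sketched as follows. This is a simplified sketch under assumed conventions (four axial directions, comparisons packed into one byte per direction), not the authors' exact implementation; the function name `liot_like` is hypothetical.

```python
import numpy as np

def liot_like(img, radius=8):
    """Simplified intensity-order transform sketch (not the exact LIOT).

    For each of four axial directions, compare the centre pixel with the
    `radius` pixels along that direction and pack the binary results
    (centre > neighbour) into one byte, yielding a 4-channel order-based
    representation that is invariant to monotonic intensity changes.
    """
    h, w = img.shape
    pad = np.pad(img.astype(np.int32), radius, mode="edge")
    out = np.zeros((4, h, w), dtype=np.uint8)
    # Offsets for the four directions: up, down, left, right.
    dirs = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    for c, (dy, dx) in enumerate(dirs):
        for k in range(1, radius + 1):
            # Neighbour plane at distance k along direction (dy, dx).
            shifted = pad[radius + k * dy: radius + k * dy + h,
                          radius + k * dx: radius + k * dx + w]
            # Set bit (k - 1) where the centre pixel is brighter.
            out[c] |= (img > shifted).astype(np.uint8) << (k - 1)
    return out
```

    Because only intensity *order* is encoded, the output is unchanged under any strictly increasing intensity remapping, which is why such a transform is robust to illumination and contrast variations.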

    Results and Discussions

    Experimental results on benchmark retinal vessel segmentation datasets demonstrate the effectiveness of the proposed CNN-Mamba network, which achieves superior accuracy, sensitivity, and specificity compared with state-of-the-art methods. In particular, the integration of LIOT and the dual cross-attention mechanisms significantly improves the network’s ability to segment fine vascular structures, even in challenging cases with low contrast or high variability in vessel thickness. Ablation studies analyzing the contributions of LIOT and the dual cross-attention mechanisms show that both components are essential for optimal segmentation performance: LIOT enhances the contrast and detail of the input image, facilitating feature extraction by the CNN encoder, while the dual cross-attention mechanisms enable the network to capture complex dependencies between distant regions in the image, which is crucial for segmenting fine vascular structures. LTDA-Mamba demonstrates excellent vessel segmentation and vessel-pixel identification capabilities, reducing the subjectivity associated with manual labeling. Overall, LTDA-Mamba outperforms other state-of-the-art methods, particularly in sensitivity. On the DRIVE, CHASE_DB1, and STARE datasets, the accuracy is 0.9689, 0.9741, and 0.9792, the sensitivity is 0.7868, 0.7697, and 0.7488, and the F1 score is 0.8151, 0.8043, and 0.8219, respectively.

    Conclusions

    In conclusion, the proposed CNN-Mamba network, incorporating LIOT and dual cross-attention mechanisms, represents a significant advancement in retinal vessel segmentation. The network demonstrates the ability to accurately and consistently segment fine vascular structures, even in challenging cases. This capability suggests its potential for early disease detection, patient monitoring, and treatment planning in ophthalmology. The integration of LIOT and dual cross-attention mechanisms further enhances the network’s robustness and accuracy, which makes it a powerful tool for ophthalmic image analysis. Future work will focus on optimizing the network architecture and exploring additional preprocessing steps to further strengthen segmentation performance.



    Yuanyuan Peng, Haoyang Li, Wen Li, Yuejin Zhang. LTDA‐Mamba: Retinal Vessel Segmentation Based on a Hybrid CNN‐Mamba Network[J]. Acta Optica Sinica, 2025, 45(7): 0717001

    Paper Information

    Category: Medical optics and biotechnology

    Received: Dec. 13, 2024

    Accepted: Jan. 16, 2025

    Published Online: Mar. 19, 2025

    The Author Email: Peng Yuanyuan (pengmi467347713@126.com)

    DOI: 10.3788/AOS241887
