Spectroscopy and Spectral Analysis, Volume 45, Issue 9, 2632 (2025)

Cross-Modal Dual-Channel Camouflaged Object Detection Method for Visible-Spectrum Image

CHENG Yu-hu, WU Shi-jia, WANG Hao-yu, and WANG Xue-song*
Author Affiliations
  • School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China

    The camouflaged object detection (COD) task for visible-spectrum images aims to use visible-spectrum information to detect camouflaged objects that are visually consistent with their surrounding environment. This visual consistency makes object boundaries hard to distinguish and discriminative features hard to learn, which limits the effectiveness of existing object detection methods for COD. A Cross-modal Dynamic Collaborative Dual-channel Network (CDCDN) is proposed to explore the potential of global-local multi-level visual perception and vision-language models (VLMs) in COD. First, to address the difficulty of distinguishing object boundaries, a dynamic collaborative dual-channel module is designed. Through the dual channels, the detection process is decoupled into global information localization and local feature refinement, enabling object detection and optimization from a multi-level visual perspective. A dynamic information collaboration and fusion mechanism is established, through which global and local information complement and correct each other by means of global gating constraints and local perception correction. This enhances the spatial capture capability of the model in scenarios with blurred object boundaries. Second, to address the difficulty of learning discriminative features, a cross-modal scene-object matching module is designed. By incorporating a pre-trained VLM, this module establishes cross-modal interactions between the visual and language modalities, thereby sharpening the distinction between objects and backgrounds in the feature space and improving the model's semantic discrimination in scenes with limited discriminative features. CDCDN is evaluated on the MHCD2022 and COD10K datasets using the mAP@0.5, mAP@0.5:0.95, and mAP@0.75 metrics. CDCDN achieves scores of 67.6%, 42.6%, and 48.4% on the MHCD2022 dataset, and 67.9%, 40.6%, and 41.0% on the COD10K dataset, respectively.
Compared with five mainstream object detection methods (Faster R-CNN, DETR, Lite-DETR, YOLOv5, and YOLOv10), CDCDN achieves the best detection accuracy across all three metrics. Visualization of detection results in four common camouflaged scenes (barren land, grassland, woodland, and snowfield) demonstrates the adaptability of CDCDN to various scenes. In an ablation study, the key components of CDCDN are incrementally removed to systematically evaluate their contributions, with results showing that each component significantly enhances the model's detection performance. Comprehensive experimental results indicate that CDCDN can accurately detect camouflaged objects with high visual consistency with their surroundings, providing a novel solution for COD.
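
The three metrics named in the abstract differ only in how strictly a predicted box must overlap a ground-truth box before it counts as a correct detection. The sketch below illustrates that matching criterion, assuming the usual COCO-style convention in which mAP@0.5:0.95 averages over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The abstract does not state the evaluation protocol, so this convention is an assumption, and the helper names `iou`, `COCO_THRESHOLDS`, and `matches_at` are hypothetical. The code does not compute average precision itself, only the overlap test that underlies it.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
COCO_THRESHOLDS = [(10 + i) / 20 for i in range(10)]


def matches_at(pred_box, gt_box, thresholds):
    """For each IoU threshold, report whether the prediction counts as a hit."""
    overlap = iou(pred_box, gt_box)
    return {t: overlap >= t for t in thresholds}


# Illustrative boxes (not data from the paper): a prediction shifted
# 10 pixels to the right of a 100x100 ground-truth box.
gt = (0, 0, 100, 100)
pred = (10, 0, 110, 100)
hits = matches_at(pred, gt, COCO_THRESHOLDS)
print(f"IoU = {iou(pred, gt):.3f}")
print("hit at 0.5:", hits[0.5], "| hit at 0.75:", hits[0.75])
print("thresholds passed:", sum(hits.values()), "of", len(COCO_THRESHOLDS))
```

A detection that passes at IoU 0.5 can still fail at the stricter thresholds, which is why the mAP@0.5 scores reported above are higher than the mAP@0.75 and mAP@0.5:0.95 scores on both datasets.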

    CHENG Yu-hu, WU Shi-jia, WANG Hao-yu, WANG Xue-song. Cross-Modal Dual-Channel Camouflaged Object Detection Method for Visible-Spectrum Image[J]. Spectroscopy and Spectral Analysis, 2025, 45(9): 2632

    Paper Information

    Received: Dec. 31, 2024

    Accepted: Sep. 19, 2025

    Published Online: Sep. 19, 2025

    The Author Email: WANG Xue-song (wangxuesongcumt@163.com)

    DOI:10.3964/j.issn.1000-0593(2025)09-2632-10
