Spectroscopy and Spectral Analysis, Vol. 45, Issue 9, 2632 (2025)
Cross-Modal Dual-Channel Camouflaged Object Detection Method for Visible-Spectrum Image
The camouflaged object detection (COD) task for visible-spectrum images aims to use visible-spectrum information to detect camouflaged objects that are visually consistent with their surrounding environment. This visual consistency makes object boundaries hard to distinguish and discriminative features hard to learn, which limits the effectiveness of existing object detection methods for COD. A Cross-modal Dynamic Collaborative Dual-channel Network (CDCDN) is proposed to explore the potential of global-local multi-level visual perception and vision-language models (VLMs) in COD. First, to address the difficulty of distinguishing object boundaries, a dynamic collaborative dual-channel module is designed. The dual channels decouple the detection process into global information localization and local feature refinement, enabling object detection and optimization from a multi-level visual perspective. A dynamic information collaboration and fusion mechanism is established, in which global gating constraints and local perception correction allow global and local information to complement and correct each other. This enhances the model's spatial capture capability in scenarios with blurred object boundaries. Second, to address the difficulty of learning discriminative features, a cross-modal scene-object matching module is designed. By incorporating a pre-trained VLM, this module establishes cross-modal interactions between the visual and language modalities, thereby sharpening the distinction between objects and backgrounds in the feature space and improving the model's semantic discrimination in scenes with limited discriminative features. CDCDN is evaluated on the MHCD2022 and COD10K datasets using the mAP@0.5, mAP@0.5:0.95, and mAP@0.75 metrics. CDCDN achieves scores of 67.6%, 42.6%, and 48.4% on the MHCD2022 dataset and 67.9%, 40.6%, and 41.0% on the COD10K dataset, respectively.
Compared with five mainstream object detection methods (Faster R-CNN, DETR, Lite-DETR, YOLOv5, and YOLOv10), CDCDN achieves the best detection accuracy on all three metrics. Visualization of detection results in four common camouflaged scenes (barren land, grassland, woodland, and snowfield) demonstrates the adaptability of CDCDN to various scenes. In an ablation study, the key components of CDCDN are incrementally removed to systematically evaluate their contributions, with results showing that each component significantly enhances the model's detection performance. Comprehensive experimental results indicate that CDCDN can accurately detect camouflaged objects with high visual consistency to their surroundings, providing a novel solution for COD.
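For readers unfamiliar with the reported metrics, the following minimal sketch illustrates the IoU thresholds that conventionally underlie mAP@0.5, mAP@0.75, and mAP@0.5:0.95 (the last averaging over thresholds from 0.50 to 0.95 in steps of 0.05, as in the COCO convention). It is an illustration only and not the authors' evaluation code; the function names `iou`, `coco_iou_thresholds`, and `is_true_positive` and the example boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def coco_iou_thresholds():
    """The ten IoU thresholds averaged by mAP@0.5:0.95: 0.50, 0.55, ..., 0.95."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]


def is_true_positive(pred_box, gt_box, threshold):
    """A prediction counts as a match when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    # Hypothetical boxes: the prediction is shifted 2 px from the ground truth.
    gt = (0, 0, 10, 10)
    pred = (2, 0, 12, 10)
    print(f"IoU = {iou(pred, gt):.3f}")
    for t in (0.5, 0.75):
        print(f"match at IoU >= {t}: {is_true_positive(pred, gt, t)}")
```

In this example the same prediction counts as correct at the 0.5 threshold but not at 0.75, which is why mAP@0.75 and mAP@0.5:0.95 are stricter than mAP@0.5 and reward tighter localization of blurred object boundaries.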
CHENG Yu-hu, WU Shi-jia, WANG Hao-yu, WANG Xue-song. Cross-Modal Dual-Channel Camouflaged Object Detection Method for Visible-Spectrum Image[J]. Spectroscopy and Spectral Analysis, 2025, 45(9): 2632
Received: Dec. 31, 2024
Accepted: Sep. 19, 2025
Published Online: Sep. 19, 2025
The Author Email: WANG Xue-song (wangxuesongcumt@163.com)