Laser & Optoelectronics Progress, Volume 59, Issue 24, 2410006 (2022)
Cross-Modal Hash Method Based on Multi-Scale Fusion and Projection Matching Constraint
Most deep-learning-based cross-modal Hash methods learn unified Hash codes for data of different modalities directly through neural networks. However, these methods ignore the fact that different scales of single-modality data carry different semantic information, which affects the feature representation of the data, as well as the importance of low-dimensional features in bridging the "heterogeneity" gap. To address these problems, we propose a new cross-modal Hash retrieval method based on multi-scale fusion and projection matching constraint (MFPMC), which obtains low-dimensional features of different-modality data through a designed image multi-scale fusion network and a text multi-scale fusion network. Moreover, it introduces a low-dimensional feature projection matching constraint and adversarial training to ensure the distribution consistency of low-dimensional features across modalities. The low-dimensional features, which contain rich semantic information, are then used as inputs to the Hash function. Furthermore, inter-modal Hash code, intra-modal Hash code, quantization, and label-embedding losses are constructed to constrain the learning of the Hash function and Hash codes, ensuring the generation of discriminative discrete binary Hash codes. Experiments on two benchmark cross-modal retrieval datasets (MIRFlickr-25K and NUS-WIDE) show that the proposed method outperforms comparison methods in retrieval performance.
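The pipeline the abstract describes — fusing features from multiple scales into a low-dimensional representation, then binarizing through a Hash function — can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation; the projection weights, dimensions, and tanh activations are assumptions chosen only to show the data flow.

```python
import numpy as np

rng = np.random.default_rng(0)

def multi_scale_fusion(features_per_scale, w_fuse):
    """Fuse features extracted at several scales by concatenation
    followed by a learned linear projection (hypothetical weights),
    yielding a low-dimensional representation."""
    concat = np.concatenate(features_per_scale, axis=1)
    return np.tanh(concat @ w_fuse)

def hash_codes(low_dim_features, w_hash):
    """Hash function: project the fused low-dimensional features and
    binarize with sign() to obtain discrete {-1, +1} codes."""
    continuous = np.tanh(low_dim_features @ w_hash)
    return np.sign(continuous)

# Toy batch of 4 samples with features at three scales (dims assumed)
scales = [rng.standard_normal((4, d)) for d in (64, 128, 256)]
w_fuse = rng.standard_normal((64 + 128 + 256, 32)) * 0.1  # fusion projection
w_hash = rng.standard_normal((32, 16)) * 0.1              # 16-bit Hash layer

fused = multi_scale_fusion(scales, w_fuse)  # (4, 32) low-dim features
codes = hash_codes(fused, w_hash)           # (4, 16) binary Hash codes
```

In the actual method, one such fusion network exists per modality (image and text), the projection matching constraint and adversarial training align the two fused feature distributions, and the quantization loss penalizes the gap between the continuous outputs and their signs.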
Wanyu Deng, Yina Zhao, Wanzhen Yang, Bo Zhang, Hao Li, Shuqi Ye. Cross-Modal Hash Method Based on Multi-Scale Fusion and Projection Matching Constraint[J]. Laser & Optoelectronics Progress, 2022, 59(24): 2410006
Category: Image Processing
Received: Sep. 17, 2021
Accepted: Oct. 27, 2021
Published Online: Oct. 31, 2022
The Author Email: Zhao Yina (3065783275@qq.com)