Cross-Modal Pedestrian Re-Identification Combining Frequency-Domain Attention and Modal Co-Feature Optimization

Taizhe Tan; Mengrou Li; Zhuo Yang; Zhiyuan Gong

doi:10.3788/LOP242267

Laser & Optoelectronics Progress, Volume 62, Issue 12, 1215010 (2025)

Taizhe Tan¹,², Mengrou Li¹,*, Zhuo Yang¹,³, and Zhiyuan Gong¹

Author Affiliations

¹School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, Guangdong, China

²Heyuan Bay Area Digital Economy Technology Development Co., Ltd., Heyuan 517400, Guangdong, China

³Guangdong Key Laboratory of Human Sports Performance Science, Guangzhou 510500, Guangdong, China

Abstract

Existing cross-modal pedestrian re-identification methods overlook both noise interference and the identity-relevant information contained in modal-specific features. To address this, a cross-modal pedestrian re-identification network combining frequency-domain attention and modal co-feature optimization is proposed. It aims to suppress noise interference in the different modal spaces and to mine and exploit the implicit identity-discriminative information in modal-specific features. First, a two-stream network integrated with a frequency-domain attention mechanism filters noise and extracts modal-shared and modal-specific features. Second, the extracted modal-specific features are purified and restored to reduce modal-style differences, while modality-independent identity-discriminative information is extracted and strengthened. Third, this implicit discriminative information is used to guide the modal-shared features, improving the model's discriminative ability. Finally, a variance aggregation loss is introduced to minimize the modal differences among the enhanced modal-shared features. Extensive experiments show that the proposed method significantly improves performance on three public datasets. In particular, it achieves a Rank-1 accuracy of 82.14% and a mean average precision of 81.59% in the all-search mode on the SYSU-MM01 dataset.
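
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how Rank-1 accuracy and mean average precision are commonly computed for a retrieval task. It is not the authors' evaluation code: the function name `rank1_and_map`, the distance-matrix input, and the toy data are assumptions made for illustration, and the sketch omits any dataset-specific protocol details (such as camera-based gallery filtering) that a SYSU-MM01 evaluation would apply.

```python
from typing import List, Sequence, Tuple


def rank1_and_map(
    dist: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
) -> Tuple[float, float]:
    """Return (Rank-1 accuracy, mean average precision).

    dist[i][j] is the distance between query i and gallery item j;
    a smaller distance means a more similar pair. (Hypothetical helper,
    not taken from the paper.)
    """
    rank1_hits = 0
    average_precisions: List[float] = []
    for i, qid in enumerate(query_ids):
        # Rank gallery items from most to least similar to this query.
        order = sorted(range(len(gallery_ids)), key=lambda j: dist[i][j])
        matches = [gallery_ids[j] == qid for j in order]
        if not any(matches):
            continue  # a query with no true match cannot be scored
        rank1_hits += int(matches[0])
        # Average precision: mean of the precision at each correct match.
        hits = 0
        precision_sum = 0.0
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precision_sum += hits / rank
        average_precisions.append(precision_sum / hits)
    if not average_precisions:
        raise ValueError("no query has a matching gallery identity")
    n = len(average_precisions)
    return rank1_hits / n, sum(average_precisions) / n


# Toy example: 2 queries, 3 gallery images, identities 1 and 2.
toy_dist = [[0.1, 0.5, 0.9], [0.8, 0.2, 0.3]]
rank1, mean_ap = rank1_and_map(toy_dist, query_ids=[1, 2], gallery_ids=[1, 2, 1])
```

In this toy case both queries retrieve a correct identity first, so Rank-1 is 1.0, while the first query finds its second match only at rank 3, which lowers the mean average precision below 1.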

Keywords

cross-modal difference; cross-modal pedestrian re-identification; frequency-domain attention; modal-specific feature; pedestrian re-identification

Citation: Taizhe Tan, Mengrou Li, Zhuo Yang, Zhiyuan Gong. Cross-Modal Pedestrian Re-Identification Combining Frequency-Domain Attention and Modal Co-Feature Optimization[J]. Laser & Optoelectronics Progress, 2025, 62(12): 1215010

Paper Information

Category: Machine Vision

Received: Nov. 15, 2024

Accepted: Jan. 6, 2025

Published Online: Jun. 9, 2025

Corresponding Author Email: Mengrou Li (2794653908@qq.com)

DOI:10.3788/LOP242267
