Optics and Precision Engineering, Vol. 33, Issue 2, 324 (2025)

Special attribute-based cross-modal interactive fusion network for RGBT tracking

Xiaoqiang SHAO, Hao LI*, Zhiyue LÜ, Bo MA, Mingqian LIU, and Zehui HAN
Author Affiliations
  • College of Electrical and Control Engineering, Xi’an University of Science and Technology, Xi’an 710054, China

    RGBT target tracking is widely applied in fields such as video surveillance and autonomous driving because it is robust to illumination changes and occlusion. By exploiting the challenging attributes shared by infrared and visible-light images and enabling full interaction between the two modalities, an effective tracking network was constructed that can overcome the adverse scenarios encountered during tracking. The network was composed of three modules: a specific attribute fusion module, a common attribute fusion module, and a cross-modality interaction module. The specific attribute fusion module enabled the network to extract modality-specific challenging attributes, so that the advantages of each modality were used effectively. The common attribute fusion module extracted the features that matched in both modalities during tracking and adaptively aggregated them, assigning a corresponding weight to each common challenging attribute and thereby enhancing the tracker’s adaptability. The cross-modality interaction module incorporated the common modality information into the modality-specific information of the infrared and visible-light images, improving the network's robustness. To address information loss across modalities, the traditional cross-entropy loss was modified to strengthen the focus on each modality and accelerate network convergence. The proposed network was tested on the GTOT, RGBT234, and LasHeR datasets, achieving an accuracy of 84.1% and a precision of 57.3% on RGBT234, and an accuracy of 52.3% and a precision of 39.1% on LasHeR. These results validate the effectiveness of the proposed method.
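
    The adaptive aggregation described for the common attribute fusion module can be illustrated with a minimal sketch. The abstract does not say how the per-attribute weights are computed, so the softmax over scalar scores used here is an assumption, and the names `softmax` and `aggregate_attribute_features` are hypothetical; this is not the authors' implementation.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_attribute_features(features, scores):
    """Fuse per-attribute feature vectors by a weighted sum.

    features: one feature vector (list of floats) per challenging attribute.
    scores:   one raw relevance score per attribute (assumed to come from a
              learned branch; how they are produced is not in the abstract).
    Returns the fused vector and the weights that were applied.
    """
    if not features or len(features) != len(scores):
        raise ValueError("need one score per non-empty feature list")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")

    weights = softmax(scores)
    fused = [0.0] * dim
    for weight, feature in zip(weights, features):
        for i, value in enumerate(feature):
            fused[i] += weight * value
    return fused, weights
```

    With equal scores the weights are uniform and the fused vector is the plain average of the attribute features; a larger score shifts the result toward the corresponding attribute.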

    Citation

    Xiaoqiang SHAO, Hao LI, Zhiyue LÜ, Bo MA, Mingqian LIU, Zehui HAN. Special attribute-based cross-modal interactive fusion network for RGBT tracking[J]. Optics and Precision Engineering, 2025, 33(2): 324

    Paper Information

    Received: Jul. 9, 2024

    Accepted: --

    Published Online: Apr. 30, 2025

    The Author Email: Hao LI (2670815399@qq.com)

    DOI:10.37188/OPE.20253302.0324
