Optics and Precision Engineering, Vol. 33, Issue 12, 1940 (2025)

RGB-T tracking network based on multi-modal feature fusion

Jing JIN, Jianqin LIU*, and Fengwen ZHAI
Author Affiliations
  • School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
    Figures & Tables (17)
    Overall structure diagram of the MMFFTN network
    Structure diagram of channel feature fusion module
    Structure diagram of local aggregation module
    Structure diagram of cross-modal feature fusion module
    Precision and success plots of MMFFTN and the compared algorithms on the GTOT dataset
    Precision and success plots of MMFFTN and the compared algorithms on the RGBT234 dataset
    Precision, success, and normalized precision plots of MMFFTN and the compared algorithms on the LasHeR dataset
    Loss curves during training
    Precision rate plot under various attributes on the RGBT234 dataset
    Precision and success rate plots under different attributes on the LasHeR dataset
    Visual comparison of tracking results on the LasHeR dataset
    Heat-map visualization of cross-modal image feature fusion
    • Table 1. Main datasets

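      Dataset        Sequences  Resolution  Min. frames  Max. frames  Avg. frames  Total frames  Categories  Attributes
      GTOT [19]      50         384×288     40           376          157          7.8K          9           7
      RGBT234 [20]   234        630×460     40           4,140        498          114.7K        22          12
      LasHeR [21]    1,224      630×480     57           12,862       600          734.8K        32          19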
    • Table 2. PR/SR scores based on attribute challenges on the GTOT dataset

      Attribute  OCC        LSV        FM         LI         TC         SO         DEF        ALL
      MACNet     86.7/67.6  84.2/66.2  84.4/63.3  90.1/70.7  89.3/68.6  93.2/67.8  92.2/74.4  88.5/69.8
      SiamCDA    82.2/69.4  91.5/74.8  86.2/71.9  92.4/96.4  82.6/68.5  87.9/72.7  94.7/76.5  87.7/73.2
      APFNet     90.3/71.3  87.7/71.2  86.3/68.4  91.4/74.8  90.4/71.6  94.6/71.3  94.5/78    90.5/73.7
      DFAT       86.3/68.7  92.4/75    89.1/74    92.2/74.1  89.1/70.7  94.4/71.0  91.9/73.5  89.3/72.3
      TBSI       87.5/72.5  94.3/78    85.3/72    89.9/74.9  85.4/71.3  86.4/68.6  85.7/71.5  88.3/74
      MMFFTN     90.7/74.2  94.8/78.5  86.4/73.7  93.0/76.8  88.7/73.1  89.7/71.1  90.1/74.1  91/75.8
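
For readers unfamiliar with the PR/SR notation in Table 2, the following is a minimal sketch of how precision rate and success rate are commonly computed on RGB-T tracking benchmarks. It is not the authors' evaluation code. It assumes boxes in (x, y, w, h) format, a center-error threshold supplied by the caller (benchmarks typically use 5 pixels on GTOT and 20 pixels on RGBT234 and LasHeR), and 21 evenly spaced overlap thresholds. All function names are hypothetical.

```python
import math


def center_error(a, b):
    """Euclidean distance in pixels between the centers of two (x, y, w, h) boxes."""
    ax, ay = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx, by = b[0] + b[2] / 2, b[1] + b[3] / 2
    return math.hypot(ax - bx, ay - by)


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    iw = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    ih = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    inter = max(iw, 0.0) * max(ih, 0.0)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def precision_rate(preds, gts, threshold=20.0):
    """Fraction of frames whose center error is within `threshold` pixels."""
    if not gts:
        raise ValueError("empty sequence")
    hits = sum(center_error(p, g) <= threshold for p, g in zip(preds, gts))
    return hits / len(gts)


def success_rate(preds, gts, steps=21):
    """Area under the success curve: mean, over overlap thresholds in [0, 1],
    of the fraction of frames whose IoU exceeds the threshold."""
    if not gts:
        raise ValueError("empty sequence")
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (steps - 1) for i in range(steps)]
    curve = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(curve) / len(curve)
```

The functions return fractions in [0, 1]; the tables on this page report the same quantities as percentages.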
    • Table 3. Increase in metrics on the LasHeR dataset with the addition of different modules

      Method              PR    NPR   SR
      Baseline            65.4  61.5  51.8
      Baseline+CFFM       67.6  63.7  54.2
      Baseline+CFFM+CMFM  69.8  65.8  55.5
    • Table 4. Ablation of different modules of CFFM on LasHeR dataset

      CFFM          PR    NPR   SR
      CBAM   LAM
                    66.5  62.6  52.5
                    67.5  64.8  54.0
                    69.8  65.8  55.5
    • Table 5. Effect of different weights on model performance

      λ_giou  λ_l1  PR    SR
      1       1     65.0  51.3
      2       3     67.5  53.4
      5       2     66.2  52.3
      2       5     69.8  55.5
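
The weights in Table 5 can be read as coefficients in a weighted sum of box-regression losses. The sketch below illustrates one plausible form, assuming λ_giou scales a GIoU loss and λ_l1 scales an L1 loss summed over the four box coordinates. The page lists only the weights, so the exact loss (including any classification term the tracker may use) is an assumption here, and the function names are hypothetical. Boxes are taken to be valid, non-degenerate (x1, y1, x2, y2) tuples in normalized coordinates.

```python
def giou(a, b):
    """Generalized IoU of two (x1, y1, x2, y2) boxes with positive area."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    iw = max(min(a[2], b[2]) - max(a[0], b[0]), 0.0)
    ih = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
    inter = iw * ih
    union = area_a + area_b - inter
    # Area of the smallest axis-aligned box enclosing both inputs.
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (enclose - union) / enclose


def l1_distance(a, b):
    """Sum of absolute coordinate differences (assumed form of the L1 term)."""
    return sum(abs(p - t) for p, t in zip(a, b))


def box_loss(pred, target, lambda_giou=2.0, lambda_l1=5.0):
    """Assumed combination: lambda_giou * (1 - GIoU) + lambda_l1 * L1.
    Defaults follow the last row of Table 5, which has the highest PR and SR."""
    return lambda_giou * (1.0 - giou(pred, target)) + lambda_l1 * l1_distance(pred, target)


if __name__ == "__main__":
    pred = (0.0, 0.0, 1.0, 1.0)
    target = (0.5, 0.0, 1.5, 1.0)
    print(round(box_loss(pred, target), 4))
```

Under this reading, increasing λ_l1 relative to λ_giou penalizes raw coordinate offsets more strongly than overlap quality, which is the trade-off the rows of Table 5 vary.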

    Jing JIN, Jianqin LIU, Fengwen ZHAI. RGB-T tracking network based on multi-modal feature fusion[J]. Optics and Precision Engineering, 2025, 33(12): 1940

    Paper Information

    Received: Nov. 25, 2024

    Accepted: --

    Published Online: Aug. 15, 2025

    Author email: Jianqin LIU (1970477938@qq.com)

    DOI:10.37188/OPE.20253312.1940
