Optics and Precision Engineering, Volume 32, Issue 3, 435 (2024)

Attention interaction based RGB-T tracking method

Wei WANG, Feiya FU, Hao LEI, and Zili TANG*
Author Affiliations
  • The 63870 Unit of PLA, Weinan 714299, China
    References (32)

    [1] LI C L, LIANG X Y, LU Y J et al. RGB-T object tracking: benchmark and baseline[J]. Pattern Recognition, 96, 106977 (2019).

    [2] ZHANG X C, YE P, LEUNG H et al. Object fusion tracking based on visible and infrared images: a comprehensive review[J]. Information Fusion, 63, 166-187 (2020).

    [3] LI C L, XUE W L, JIA Y Q et al. LasHeR: a large-scale high-diversity benchmark for RGBT tracking[J]. IEEE Transactions on Image Processing, 31, 392-404 (2022).

    [4] LI C L, CHENG H, HU S Y et al. Learning collaborative sparse representation for grayscale-thermal tracking[J]. IEEE Transactions on Image Processing, 25, 5743-5756 (2016).

    [5] ZHANG P Y, ZHAO J, BO C J et al. Jointly modeling motion and appearance cues for robust RGB-T tracking[J]. IEEE Transactions on Image Processing, 30, 3335-3347 (2021).

    [6] LI C L, WU X H, ZHAO N et al. Fusing two-stream convolutional neural networks for RGB-T object tracking[J]. Neurocomputing, 281, 78-85 (2018).

    [7] HENRIQUES J F, CASEIRO R, MARTINS P et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37, 583-596 (2015).

    [8] ZHANG L C, DANELLJAN M, GONZALEZ-GARCIA A et al. Multi-modal fusion for end-to-end RGB-T tracking[C], 2252-2261 (2019).

    [9] BHAT G, DANELLJAN M, VAN GOOL L et al. Learning discriminative model prediction for tracking[C], 6181-6190 (2019).

    [10] YAN B, PENG H W, FU J L et al. Learning spatio-temporal transformer for visual tracking[C], 10428-10437 (2021).

    [11] LI B, WU W, WANG Q et al. SiamRPN++: evolution of Siamese visual tracking with very deep networks[C], 4277-4286 (2019).

    [12] ZHANG T L, LIU X R, ZHANG Q et al. SiamCDA: complementarity- and distractor-aware RGB-T tracking based on Siamese network[J]. IEEE Transactions on Circuits and Systems for Video Technology, 32, 1403-1417 (2022).

    [13] NAM H, HAN B. Learning multi-domain convolutional neural networks for visual tracking[C], 4293-4302 (2016).

    [14] LI C L, LU A D, ZHENG A H et al. Multi-adapter RGBT tracking[C], 2262-2270 (2019).

    [15] ZHANG P Y, WANG D, LU H C et al. Learning adaptive attribute-driven representation for real-time RGB-T tracking[J]. International Journal of Computer Vision, 129, 2714-2729 (2021).

    [16] LI C L, LIU L, LU A D et al. Challenge-aware RGBT tracking[M]. Computer Vision-ECCV 2020, 222-237 (2020).

    [17] XIAO Y, YANG M M, LI C L et al. Attribute-based progressive fusion network for RGBT tracking[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 36, 2831-2838 (2022).

    [18] WANG C Q, XU C Y, CUI Z et al. Cross-modal pattern-propagation for RGB-T tracking[C], 7062-7071 (2020).

    [19] XU C Y, CUI Z, WANG C Q et al. Learning cross-modal interaction for RGB-T tracking[J]. Science China Information Sciences, 66, 119103 (2022).

    [20] VASWANI A, SHAZEER N M, PARMAR N et al. Attention is all you need[C], 5998-6008 (2017).

    [21] DOSOVITSKIY A, BEYER L, KOLESNIKOV A et al. An image is worth 16×16 words: transformers for image recognition at scale[C] (2021).

    [22] LIU Z, LIN Y T, CAO Y et al. Swin Transformer: hierarchical vision transformer using shifted windows[C], 9992-10002 (2021).

    [23] ZHU X Z, SU W J, LU L W et al. Deformable DETR: deformable transformers for end-to-end object detection[C] (2021).

    [24] CHEN B Y, LI P X, BAI L et al. Backbone is all your need: a simplified architecture for visual object tracking[C], 375-392 (2022).

    [25] ZHU Y B, LI C L, TANG J et al. Quality-aware feature aggregation network for robust RGBT tracking[J]. IEEE Transactions on Intelligent Vehicles, 6, 121-130 (2021).

    [26] MEI J T, ZHOU D M, CAO J D et al. HDINet: hierarchical dual-sensor interaction network for RGBT tracking[J]. IEEE Sensors Journal, 21, 16915-16926 (2021).

    [27] TU Z Z, LIN C, ZHAO W et al. M5L: multi-modal multi-margin metric learning for RGBT tracking[J]. IEEE Transactions on Image Processing, 31, 85-98 (2022).

    [28] GAO Y, LI C L, ZHU Y B et al. Deep adaptive fusion network for high performance RGBT tracking[C], 91-99 (2019).

    [29] ZHU Y B, LI C L, LUO B et al. Dense feature aggregation and pruning for RGBT tracking[C], 465-472 (2019).

    [30] ZHANG H, ZHANG L, ZHUO L et al. Object tracking in RGB-T videos using modal-aware attention network and competitive learning[J]. Sensors, 20, 393 (2020).

    [31] LU A D, QIAN C, LI C L et al. Duality-gated mutual condition network for RGBT tracking[J]. IEEE Transactions on Neural Networks and Learning Systems (2022).

    [32] LI C L, ZHU C L, HUANG Y et al. Cross-modal ranking with soft consistency and noisy labels for robust RGB-T tracking[C], 831-847 (2018).

    CLP Journals

    [1] Jing JIN, Jianqin LIU, Fengwen ZHAI. RGB-T tracking network based on multi-modal feature fusion[J]. Optics and Precision Engineering, 2025, 33(12): 1940


    Citation
    Wei WANG, Feiya FU, Hao LEI, Zili TANG. Attention interaction based RGB-T tracking method[J]. Optics and Precision Engineering, 2024, 32(3): 435

    Paper Information

    Received: Jul. 19, 2023

    Accepted: --

    Published Online: Apr. 2, 2024

    The Author Email: Zili TANG (tang_zili@qq.com)

    DOI: 10.37188/OPE.20243203.0435
