Optics and Precision Engineering, Vol. 33, Issue 4, 653 (2025)
A Transformer-based visual tracker via knowledge distillation
To achieve high-precision, real-time tracking with limited computing resources, a Transformer-based visual tracker via knowledge distillation was proposed. By introducing an image dynamic correction module, the tracker fused the search image of the current frame with an image predicted from optical flow, which effectively handled challenges such as fast motion and motion blur. To reduce model complexity, a knowledge distillation learning strategy was adopted to compress the model. By introducing homoscedastic uncertainty into the loss function, the loss weights of the different subtasks were learned by the network itself, avoiding cumbersome and difficult manual parameter tuning. Additionally, a random blurring strategy was employed during training of the student network to enhance model robustness. Two tracking frameworks of different complexities, named KTransT-T and KTransT, were proposed and compared with 12 algorithms on 5 public datasets. Experimental results show that KTransT-T has significant advantages in precision and success rate, while KTransT has lower model complexity and competitive tracking performance. KTransT runs at up to 158 frames per second, meeting the requirements of real-time tracking.
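The abstract mentions learning the loss weights of the different subtasks through homoscedastic uncertainty. As a minimal sketch of that idea, the snippet below implements the standard homoscedastic-uncertainty weighting of Kendall et al. in PyTorch; the class name, the number of subtasks, and the usage are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Combine sub-task losses (e.g. classification and box regression)
    with learnable homoscedastic-uncertainty weights:
        L = sum_i exp(-s_i) * L_i + s_i,
    where s_i = log(sigma_i^2) is learned jointly with the network,
    so no manual tuning of per-task loss coefficients is required."""

    def __init__(self, num_tasks: int = 2):
        super().__init__()
        # one log-variance parameter per sub-task, initialised to 0 (sigma = 1)
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, *task_losses: torch.Tensor) -> torch.Tensor:
        total = torch.zeros((), device=self.log_vars.device)
        for s, loss in zip(self.log_vars, task_losses):
            # exp(-s) down-weights noisier tasks; the +s term keeps sigma from growing unboundedly
            total = total + torch.exp(-s) * loss + s
        return total

# Hypothetical usage with two sub-task losses (names are placeholders):
# criterion = UncertaintyWeightedLoss(num_tasks=2)
# total_loss = criterion(cls_loss, reg_loss)
# total_loss.backward()
```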
Na LI, Mengqiao LIU, Jinting PAN, Kai HUANG, Xingxuan JIA. A Transformer-based visual tracker via knowledge distillation[J]. Optics and Precision Engineering, 2025, 33(4): 653
Received: Sep. 24, 2024
Accepted: --
Published Online: May 20, 2025
The Author Email: Na LI (lina114@xupt.edu.cn)