Acta Optica Sinica, Volume 37, Issue 11, 1115005 (2017)

Target Scale Adaptive Robust Tracking Based on Fusion of Multilayer Convolutional Features

Xin Wang*, Zhiqiang Hou, Wangsheng Yu, Zefenfen Jin, and Xianxiang Qin
Author Affiliations
  • Information and Navigation College, Air Force Engineering University, Xi'an, Shaanxi 710077, China
    Figures & Tables (14)
    Schematic of the VGG-Net-19 deep convolutional network
    Visualizations of different convolutional layers of VGG-Net-19. (a) Input images; (b) Conv3-4; (c) Conv4-4; (d) Conv5-4
    Construction of the target scale pyramid by multi-scale sampling
    Flow chart of the proposed algorithm
    Comparison of partial tracking results of the seven trackers
    Center location error curves on the eight test sequences
    Overlap rate curves on the eight test sequences
    (a) Success rate curves and (b) precision curves on the 28 test sequences
    Tracking performance analysis with different feature combinations. (a) Success rate curves; (b) precision curves
    • Table 1. Scale adaptive robust tracker based on fusion of multilayer convolutional features

      Input: image sequence I1, I2, …, In; initial target position p0 = (x0, y0); initial target scale s0 = (w0, h0).
      Output: estimated target position pt = (xt, yt) and estimated target scale st = (wt, ht).
      For t = 1, 2, …, n, do:
      1 Locate the center of the target
        1.1 Crop the ROI image in frame t centered at pt-1, and extract the hierarchical convolutional features;
        1.2 Compute the correlation response map of each convolutional layer using Eq. (5) and Eq. (7);
        1.3 Fuse the multiple correlation response maps using Eq. (8) to obtain the composite response map;
        1.4 Locate the center of the target pt in frame t using Eq. (9).
      2 Estimate the scale of the target
        2.1 Obtain the multi-scale sample images Is = {Is1, …, Ism} in frame t based on pt and st-1;
        2.2 Build the scale filters by extracting HOG features from the multi-scale sample images;
        2.3 Compute the correlation response scores using Eq. (10) and Eq. (11);
        2.4 Estimate the optimal target scale st in frame t using Eq. (12).
      3 Update the model
        3.1 Update the position filters using Eq. (13);
        3.2 Update the scale filters using Eq. (14).
      Until the end of the image sequence.
      (Minimal Python sketches of the position-estimation, scale-estimation, and model-update steps are given after Table 5.)
    • Table 2. Comparison of tracking precision of the algorithms on sequences with different attributes
      (SV: scale variation; IV: illumination variation; OCC: occlusion; BC: background clutter; DEF: deformation; MB: motion blur; FM: fast motion; IPR: in-plane rotation; OPR: out-of-plane rotation; OV: out of view; LR: low resolution; the number of test sequences with each attribute is given in parentheses)

      Algorithm   SV(28)   IV(15)   OCC(16)  BC(11)   DEF(9)   MB(8)    FM(12)   IPR(18)  OPR(23)  OV(4)    LR(3)
      Proposed    0.880    0.838    0.841¯   0.861¯   0.932    0.870    0.772    0.879    0.855¯   0.702    0.873
      HCF         0.880    0.858    0.847    0.867    0.927¯   0.844¯   0.757¯   0.873¯   0.857    0.656    0.863¯
      FCNT        0.830¯   0.779    0.737    0.713    0.925    0.740    0.715    0.774    0.798    0.691¯   0.686
      CNN-SVM     0.827    0.751    0.733    0.689    0.890    0.725    0.685    0.793    0.800    0.650    0.606
      CNT         0.662    0.521    0.667    0.463    0.686    0.479    0.477    0.583    0.630    0.481    0.410
      DSST        0.740    0.681    0.785    0.610    0.733    0.635    0.539    0.714    0.725    0.453    0.402
      KCF         0.680    0.632    0.744    0.578    0.734    0.679    0.586    0.619    0.678    0.639    0.233
    • Table 3. Comparison of tracking success rate of the algorithms on sequences with different attributes
      (attribute abbreviations and sequence counts as in Table 2)

      Algorithm   SV(28)   IV(15)   OCC(16)  BC(11)   DEF(9)   MB(8)    FM(12)   IPR(18)  OPR(23)  OV(4)    LR(3)
      Proposed    0.600    0.556    0.582    0.586    0.629    0.591¯   0.554    0.591    0.579    0.527    0.574
      HCF         0.531    0.509    0.514    0.573¯   0.589    0.594    0.545¯   0.532¯   0.525    0.522    0.497¯
      FCNT        0.558¯   0.551¯   0.517¯   0.506    0.628¯   0.552    0.533    0.504    0.539¯   0.573    0.451
      CNN-SVM     0.513    0.477    0.473    0.500    0.594    0.535    0.513    0.480    0.504    0.536¯   0.373
      CNT         0.508    0.425    0.506    0.372    0.541    0.426    0.411    0.442    0.475    0.417    0.342
      DSST        0.451    0.412    0.462    0.421    0.491    0.457    0.411    0.441    0.446    0.405    0.238
      KCF         0.427    0.389    0.458    0.398    0.501    0.512    0.450    0.383    0.425    0.520    0.209
    • Table 4. Tracking speed of the proposed algorithm on the eight test videos (frame/s)

      Video           CarScale  Dog1  Doll  Ironman  MotorRolling  Skiing  Soccer  Walking2  Average
      Tracking speed  9.0       8.3   9.7   6.7      3.1           12.1    4.7     9.6       7.9
    • Table 5. Comparison of average tracking speed of the deep-learning-based trackers (frame/s)
      (Code: M denotes Matlab, C denotes C/C++)

      Tracker                 Proposed  CNT      FCNT     CNN-SVM  HCF      MDNet    DeepTrack[29]  STCT[30]
      Code                    M+C       M        M        C+M      M+C      M        M              C+M
      Platform                CPU+GPU   CPU      CPU+GPU  CPU+GPU  GPU      CPU+GPU  CPU+GPU        CPU+GPU
      Average tracking speed  8.5       5        3        -        10       1        2.5            2.5
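
    Sketch of the position-estimation step (Table 1, steps 1.1-1.4). One correlation filter is learned on the convolutional features of each layer (Conv3-4, Conv4-4, Conv5-4 of VGG-Net-19), one response map is computed per layer, and the maps are fused into a composite map whose peak gives the new target center. The Python sketch below illustrates that idea with a standard ridge-regression correlation filter; it is not the authors' implementation, and the exact forms of Eqs. (5)-(9) may differ. The layer weights layer_weights, the regularization lam, and the Gaussian label width sigma are assumed values, and random arrays stand in for real convolutional features.

    import numpy as np

    def gaussian_label(h, w, sigma=2.0):
        """Desired correlation output: a Gaussian peak, shifted so the peak sits at (0, 0)."""
        ys, xs = np.mgrid[0:h, 0:w]
        cy, cx = h // 2, w // 2
        g = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))
        return np.roll(np.roll(g, -cy, axis=0), -cx, axis=1)

    def learn_filter(feat, label_fft, lam=1e-4):
        """Learn a multi-channel linear correlation filter in the Fourier domain (ridge regression)."""
        feat_fft = np.fft.fft2(feat, axes=(0, 1))                 # H x W x C
        num = np.conj(feat_fft) * label_fft[..., None]            # per-channel numerator
        den = np.sum(feat_fft * np.conj(feat_fft), axis=2) + lam  # shared denominator
        return num, den

    def layer_response(num, den, feat):
        """Correlation response of one convolutional layer on the current frame."""
        feat_fft = np.fft.fft2(feat, axes=(0, 1))
        resp_fft = np.sum(num * feat_fft, axis=2) / den
        return np.real(np.fft.ifft2(resp_fft))

    def fuse_and_locate(responses, weights):
        """Weighted sum of the per-layer response maps, then peak localization."""
        fused = sum(w * r for w, r in zip(weights, responses))
        dy, dx = np.unravel_index(np.argmax(fused), fused.shape)
        return fused, (dy, dx)   # indices wrap: values past half the window mean negative shifts

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        # Random stand-ins for Conv3-4 / Conv4-4 / Conv5-4 features resized to a common 48 x 48 grid.
        feats_prev = [rng.standard_normal((48, 48, c)) for c in (256, 512, 512)]
        feats_curr = [f + 0.05 * rng.standard_normal(f.shape) for f in feats_prev]

        label_fft = np.fft.fft2(gaussian_label(48, 48))
        filters = [learn_filter(f, label_fft) for f in feats_prev]
        responses = [layer_response(num, den, f) for (num, den), f in zip(filters, feats_curr)]

        layer_weights = (0.25, 0.5, 1.0)   # assumed weights, larger for deeper layers
        fused, peak = fuse_and_locate(responses, layer_weights)
        print("displacement of the target center (rows, cols):", peak)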
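
    Sketch of the scale-estimation step (Table 1, steps 2.1-2.4). A pyramid of samples is cropped at multiple scales around the estimated center and scored with a one-dimensional scale correlation filter learned from per-sample features; the scale with the highest score is taken as the estimate. The sketch below illustrates the idea and is not the authors' implementation of Eqs. (10)-(12). The paper extracts HOG features from each sample, whereas a simple gradient-magnitude feature is used here as a stand-in to keep the example dependency-free; the scale step a, the number of scales S, and the regularization lam are assumed values.

    import numpy as np

    def crop_resize(image, center, size, out_size=32):
        """Crop a (size x size) patch around `center` and resize it by nearest-neighbor sampling."""
        cy, cx = center
        half = size // 2
        ys = np.clip(np.arange(cy - half, cy + half), 0, image.shape[0] - 1)
        xs = np.clip(np.arange(cx - half, cx + half), 0, image.shape[1] - 1)
        patch = image[np.ix_(ys, xs)]
        idx = np.linspace(0, patch.shape[0] - 1, out_size).astype(int)
        return patch[np.ix_(idx, idx)]

    def sample_feature(patch):
        """Gradient-magnitude feature: a stand-in for the HOG features used in the paper."""
        gy, gx = np.gradient(patch.astype(float))
        return np.hypot(gy, gx).ravel()

    def scale_scores(image, center, base_size, num, den, a=1.02, S=17, lam=1e-4):
        """Score a pyramid of S scales with a learned 1-D scale correlation filter."""
        scales = a ** (np.arange(S) - S // 2)
        feats = np.stack([sample_feature(crop_resize(image, center, int(base_size * s)))
                          for s in scales])                      # S x D feature matrix
        feats_fft = np.fft.fft(feats, axis=0)                    # correlate along the scale axis
        resp = np.real(np.fft.ifft(np.sum(num * feats_fft, axis=1) / (den + lam)))
        return scales, resp

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        frame = rng.random((240, 320))
        center, base_size = (120, 160), 64

        # Train the scale filter with a Gaussian label centered on the current scale.
        S = 17
        scales = 1.02 ** (np.arange(S) - S // 2)
        feats = np.stack([sample_feature(crop_resize(frame, center, int(base_size * s)))
                          for s in scales])
        label = np.exp(-0.5 * ((np.arange(S) - S // 2) / 1.5) ** 2)
        feats_fft = np.fft.fft(feats, axis=0)
        label_fft = np.fft.fft(label)
        num = np.conj(feats_fft) * label_fft[:, None]
        den = np.sum(feats_fft * np.conj(feats_fft), axis=1)

        scales, resp = scale_scores(frame, center, base_size, num, den)
        print("estimated scale factor relative to the previous frame:", scales[np.argmax(resp)])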
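
    Sketch of the model-update step (Table 1, steps 3.1-3.2). The position and scale filters are updated online so the model adapts to appearance changes. The exact update rule of Eqs. (13) and (14) is not reproduced here; the sketch below shows the linear-interpolation update that is common in correlation-filter trackers, with an assumed learning rate eta.

    import numpy as np

    def update_filter(num_old, den_old, num_new, den_new, eta=0.01):
        """Blend the filter numerator/denominator from the current frame into the model.
        eta is an assumed learning rate; Eqs. (13) and (14) define the actual updates
        for the position and scale filters in the paper."""
        num = (1.0 - eta) * num_old + eta * num_new
        den = (1.0 - eta) * den_old + eta * den_new
        return num, den

    if __name__ == "__main__":
        rng = np.random.default_rng(2)
        num_model, den_model = rng.random((48, 48, 256)), rng.random((48, 48))
        num_frame, den_frame = rng.random((48, 48, 256)), rng.random((48, 48))
        num_model, den_model = update_filter(num_model, den_model, num_frame, den_frame)
        print("updated filter shapes:", num_model.shape, den_model.shape)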

    Citation: Xin Wang, Zhiqiang Hou, Wangsheng Yu, Zefenfen Jin, Xianxiang Qin. Target Scale Adaptive Robust Tracking Based on Fusion of Multilayer Convolutional Features[J]. Acta Optica Sinica, 2017, 37(11): 1115005.

    Paper Information

    Category: Machine Vision

    Received: Jun. 21, 2017

    Accepted: --

    Published Online: Sep. 7, 2018

    Author Email: Xin Wang (wangxiin@foxmail.com)

    DOI: 10.3788/AOS201737.1115005
