Optics and Precision Engineering, Vol. 33, Issue 6, 928 (2025)

Multi-level feature fusion for camera pose regression

Junwen SI and Ziwei ZHOU*
Author Affiliations
  • College of Computer and Software Engineering, University of Science and Technology Liaoning, Anshan 114000, China
    Figures & Tables(19)
    Structure of ResGraphLoc network
    Block structure diagram of CoordiBo
    Flowchart of coordinate attention
    Comparison of FReLU and ReLU functions
    Framework of the feature fusion module
    Block diagram of image graph definition
    Block diagram of node updating
    Comparison results of TransBoNet and the proposed model on the FULL subset, with a focus on cases A and B.
    Experimental trajectory plots of different algorithmic models on the Oxford RobotCar dataset
    Feature-Guided heatmap in complex scenarios
    ResGraphLoc pose visualization in Office and RedKitchen
    • Table 1. Comparison of feature encoder structure parameters


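      | Convolution stage | Output size | ResNet101 | Visual encoder |
      |---|---|---|---|
      | Conv layer 1 | 512×512 | 7×7, 64, stride=2 | 7×7, 64, stride=2 |
      | Conv layer 2 | 256×256 | 3×3 max pool, stride=2 | 3×3 max pool, stride=2 |
      | Conv layer 3 | 128×128 | [1×1, 128; 3×3, 128; 1×1, 512] × 4 | [1×1, 128; 3×3, 128; 1×1, 512; CA, 512] × 4 |
      | Conv layer 4 | 64×64 | [1×1, 256; 3×3, 256; 1×1, 1024] × 23 | [1×1, 256; 3×3, 256; 1×1, 1024; CA, 1024] × 23 |
      | Conv layer 5 | 32×32 | [1×1, 512; 3×3, 512; 1×1, 2048] × 3 | [1×1, 512; 3×3, 512; 1×1, 2048; CA, 2048] × 3 |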
    • Table 2. Training and test sets in Oxford RobotCar


      | Sequence | Time | Tag | Mode |
      |---|---|---|---|
      | - | 2014-06-26-08-53-56 | Overcast | Training |
      | - | 2014-06-26-09-24-58 | Overcast | Training |
      | LOOP1 | 2014-06-23-15-41-25 | Sunny | Testing |
      | LOOP2 | 2014-06-23-15-36-04 | Sunny | Testing |
      | - | 2014-11-28-12-07-13 | Overcast | Training |
      | - | 2014-12-02-15-30-08 | Overcast | Training |
      | FULL1 | 2014-12-09-13-21-02 | Overcast | Testing |
      | FULL2 | 2014-12-12-10-45-15 | Overcast | Testing |
    • Table 3. Experimental results of different algorithms on the Oxford RobotCar dataset (position error, m)


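      | Method | LOOP1 mean | LOOP1 median | LOOP2 mean | LOOP2 median | FULL1 mean | FULL1 median | FULL2 mean | FULL2 median | Average mean | Average median |
      |---|---|---|---|---|---|---|---|---|---|---|
      | ResGraphLoc | 7.18 | 3.92 | 7.08 | 3.32 | 16.96 | 7.43 | 36.24 | 6.78 | 16.87 | 5.36 |
      | PoseNet | 25.29 | 6.88 | 28.81 | 5.80 | 125.6 | 107.6 | 131.06 | 101.8 | 77.70 | 55.52 |
      | MapNet | 8.76 | 5.79 | 9.84 | 4.91 | 41.4 | 17.94 | 59.3 | 20.04 | 29.8 | 12.17 |
      | LsG | 9.07 | - | 9.19 | - | 31.65 | - | 53.45 | - | 25.84 | - |
      | AtLoc | 8.61 | 5.68 | 8.86 | 5.05 | 29.6 | 11.1 | 48.2 | 12.2 | 23.8 | 8.54 |
      | GNNMapNet | 7.76 | - | 8.15 | - | 17.35 | - | 37.81 | - | 17.77 | - |
      | AtLoc+ | 7.82 | 4.34 | 7.24 | 3.78 | 21.0 | 6.40 | 42.6 | 7.00 | 19.7 | 5.38 |
      | TransBoNet | - | - | - | - | 30.03 | 6.46 | 52.42 | 8.77 | 41.23 | 7.62 |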
    • Table 4. Experimental results of different algorithms on the Oxford RobotCar dataset (orientation error, °)


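      | Method | LOOP1 mean | LOOP1 median | LOOP2 mean | LOOP2 median | FULL1 mean | FULL1 median | FULL2 mean | FULL2 median | Average mean | Average median |
      |---|---|---|---|---|---|---|---|---|---|---|
      | ResGraphLoc | 2.48 | 1.95 | 2.74 | 1.86 | 3.16 | 1.26 | 6.52 | 1.32 | 3.73 | 1.60 |
      | PoseNet | 17.45 | 2.06 | 19.62 | 2.05 | 27.1 | 22.5 | 26.05 | 20.1 | 22.56 | 11.7 |
      | MapNet | 3.46 | 1.54 | 3.96 | 1.67 | 12.5 | 6.68 | 14.8 | 6.39 | 8.68 | 4.07 |
      | LsG | 3.31 | - | 3.53 | - | 4.51 | - | 8.60 | - | 4.99 | - |
      | AtLoc | 4.58 | 2.23 | 4.67 | 2.01 | 12.4 | 5.28 | 11.1 | 4.63 | 8.19 | 3.54 |
      | GNNMapNet | 2.54 | - | 2.57 | - | 3.47 | - | 7.55 | - | 4.03 | - |
      | AtLoc+ | 3.62 | 1.92 | 3.60 | 2.04 | 6.15 | 1.50 | 9.95 | 1.48 | 5.83 | 1.74 |
      | TransBoNet | - | - | - | - | 8.18 | 1.25 | 10.98 | 1.69 | 9.58 | 1.47 |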
    • Table 5. Median position/orientation estimation errors on the 7-Scenes dataset


      | Method | Chess | Fire | Heads | Office | Pumpkin | Kitchen | Stairs | Avg |
      |---|---|---|---|---|---|---|---|---|
      | GPoseNet | 0.20 m, 7.11° | 0.38 m, 12.3° | 0.21 m, 13.8° | 0.28 m, 8.83° | 0.37 m, 6.94° | 0.35 m, 8.15° | 0.37 m, 12.5° | 0.31 m, 9.95° |
      | AtLoc | 0.10 m, 4.07° | 0.25 m, 11.4° | 0.16 m, 11.8° | 0.17 m, 5.34° | 0.21 m, 4.37° | 0.23 m, 5.42° | 0.26 m, 10.5° | 0.20 m, 7.56° |
      | AtLoc+ | 0.10 m, 3.18° | 0.26 m, 10.8° | 0.14 m, 11.4° | 0.17 m, 5.16° | 0.20 m, 3.94° | 0.16 m, 4.90° | 0.29 m, 10.2° | 0.19 m, 7.08° |
      | GNNMapNet | 0.08 m, 2.82° | 0.26 m, 8.94° | 0.17 m, 11.41° | 0.18 m, 5.08° | 0.15 m, 2.77° | 0.25 m, 4.48° | 0.23 m, 8.78° | 0.19 m, 6.33° |
      | MS-Transformer | 0.11 m, 4.66° | 0.24 m, 9.6° | 0.14 m, 12.1° | 0.17 m, 5.66° | 0.18 m, 4.44° | 0.17 m, 5.94° | 0.26 m, 8.45° | 0.18 m, 7.28° |
      | IRPNet | 0.13 m, 5.64° | 0.25 m, 9.67° | 0.15 m, 13.1° | 0.24 m, 6.33° | 0.22 m, 5.78° | 0.30 m, 7.29° | 0.34 m, 11.6° | 0.23 m, 8.49° |
      | DA-model | 0.11 m, 6.06° | 0.26 m, 11.6° | 0.15 m, 12.8° | 0.17 m, 8.34° | 0.21 m, 6.37° | 0.19 m, 8.90° | 0.26 m, 12.2° | 0.19 m, 9.48° |
      | TransBoNet | 0.11 m, 4.48° | 0.25 m, 12.4° | 0.18 m, 14.0° | 0.20 m, 5.08° | 0.19 m, 4.77° | 0.17 m, 5.35° | 0.30 m, 13.0° | 0.20 m, 8.45° |
      | ResGraphLoc | 0.08 m, 3.25° | 0.24 m, 9.2° | 0.15 m, 11.2° | 0.15 m, 4.55° | 0.17 m, 2.63° | 0.14 m, 3.20° | 0.21 m, 8.34° | 0.16 m, 6.05° |
    • Table 6. Mean and median position/orientation estimation errors on the 4Seasons dataset


      | Method | Business Campus mean | Business Campus median | Neighborhood mean | Neighborhood median | Old Town mean | Old Town median |
      |---|---|---|---|---|---|---|
      | ResGraphLoc | 5.25 m, 2.20° | 2.48 m, 1.45° | 1.40 m, 0.76° | 1.22 m, 0.65° | 25.48 m, 2.73° | 6.76 m, 1.27° |
      | MapNet | 10.35 m, 3.78° | 5.66 m, 1.83° | 2.81 m, 1.05° | 1.89 m, 0.92° | 46.56 m, 7.14° | 16.52 m, 2.12° |
      | AtLoc | 11.53 m, 4.84° | 5.81 m, 1.50° | 2.80 m, 1.16° | 1.83 m, 0.93° | 84.17 m, 7.81° | 17.10 m, 1.73° |
      | AtLoc+ | 13.70 m, 6.41° | 5.58 m, 1.94° | 2.33 m, 1.39° | 1.61 m, 0.88° | 68.40 m, 5.51° | 14.52 m, 1.69° |
      | GNNMapNet | 7.69 m, 4.34° | 5.52 m, 2.16° | 3.02 m, 2.92° | 2.14 m, 1.45° | 41.54 m, 7.30° | 19.23 m, 3.26° |
      | IRPNet | 10.95 m, 5.38° | 5.91 m, 1.82° | 3.17 m, 2.85° | 1.98 m, 0.90° | 55.86 m, 6.97° | 17.33 m, 3.11° |
      | CoordiNet | 11.52 m, 3.44° | 6.44 m, 1.38° | 1.72 m, 0.86° | 1.37 m, 0.69° | 43.68 m, 3.58° | 11.83 m, 1.36° |
    • Table 7. Model computational complexity analysis


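      | Method | FLOPs (G) | Activations (millions) | Params (millions) | Memory (MB) |
      |---|---|---|---|---|
      | MapNet | 3.672 | 3.7 | 22.34 | 889.5 |
      | AtLoc | 3.068 | 3.8 | 24.44 | 897.9 |
      | MS-Transformer | 5.578 | 52.2 | 18.54 | 274.5 |
      | DA-model | 0.237 | 4.4 | 3.84 | 714.2 |
      | ResGraphLoc | 0.638 | 3.6 | 4.01 | 521.7 |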
    • Table 8. Result of ablation experiment


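      | Network configuration | Mean error |
      |---|---|
      | Baseline model | 8.42 m, 3.68° |
      | Baseline model + CoordiBo | 7.72 m, 2.84° |
      | Baseline model + graph attention layer | 7.68 m, 2.72° |
      | Baseline model + multi-level feature fusion | 7.84 m, 3.05° |
      | Baseline model + CoordiBo + graph attention layer | 7.33 m, 2.52° |
      | Baseline model + CoordiBo + multi-level feature fusion | 7.65 m, 2.69° |
      | ResGraphLoc | 7.18 m, 2.48° |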
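
The Average columns of Tables 3 and 4 appear to be the unweighted mean of the four test sequences (LOOP1, LOOP2, FULL1, FULL2). The sketch below checks that reading against the ResGraphLoc rows only. The numbers are copied from the tables; the variable and function names are illustrative and do not come from the paper, and the 0.01 tolerance is an assumption that allows for two-decimal rounding.

```python
from statistics import fmean

# ResGraphLoc per-sequence errors, ordered LOOP1, LOOP2, FULL1, FULL2.
# Position values are from Table 3, orientation values from Table 4.
per_sequence = {
    "position_mean": [7.18, 7.08, 16.96, 36.24],
    "position_median": [3.92, 3.32, 7.43, 6.78],
    "orientation_mean": [2.48, 2.74, 3.16, 6.52],
    "orientation_median": [1.95, 1.86, 1.26, 1.32],
}

# Values printed in the Average columns of the same rows.
reported_average = {
    "position_mean": 16.87,
    "position_median": 5.36,
    "orientation_mean": 3.73,
    "orientation_median": 1.60,
}


def matches_reported(values, reported, tol=0.01):
    """Return True if the unweighted mean of `values` is within `tol` of `reported`."""
    return abs(fmean(values) - reported) < tol


# Hypothetical helper output: one True/False flag per metric.
consistency = {
    name: matches_reported(values, reported_average[name])
    for name, values in per_sequence.items()
}

for name, ok in consistency.items():
    print(f"{name}: recomputed={fmean(per_sequence[name]):.4f} "
          f"reported={reported_average[name]} consistent={ok}")
```

The same check can be repeated for any other row by swapping in its values; rows with missing medians (marked "-") would need those entries skipped.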
    Junwen SI, Ziwei ZHOU. Multi-level feature fusion for camera pose regression[J]. Optics and Precision Engineering, 2025, 33(6): 928

    Paper Information

    Received: Jul. 13, 2024

    Accepted: --

    Published Online: Jun. 16, 2025

    DOI:10.37188/OPE.20253306.0928
