Chinese Journal of Lasers, Volume. 52, Issue 17, 1704001(2025)
Accurate Camera Pose Estimation for Model Scale Scaling Interference
The perspective-n-point (PnP) problem is a critical research task in computer vision, with extensive applications in robotics, augmented reality, autonomous driving, and other domains, and it holds significant academic and practical value. Traditional PnP methods typically rely on precise 3D models and impose stringent requirements on the accuracy of the 3D coordinates; as a result, estimation precision degrades significantly when the input 3D model undergoes scale variations. Although conventional approaches have made substantial progress, their core assumption, namely that the object's 3D model is fully accurate and of known scale, often fails in practical scenarios, and this constitutes a key bottleneck limiting algorithmic robustness and precision. Furthermore, most traditional methods are predicated on an idealized projection model and neglect the influence of projection noise; lacking a rigorous analysis of the estimator's statistical properties, they produce systematically biased estimates. To address these limitations, this study proposes a linear method that mitigates the interference of imprecise 3D model inputs and yields unbiased, accurate estimates.
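The sensitivity described above can be illustrated with a minimal sketch (not taken from the paper; the camera intrinsics, pose, and point cloud below are arbitrary illustrative values): projecting the same points through the same pinhole camera at the true scale and at a mis-estimated scale shows how far the reprojections drift, which is exactly the discrepancy a scale-unaware PnP solver must absorb into a biased pose.

```python
import numpy as np

def project(K, R, t, X):
    """Project 3D points X (N,3) to pixel coordinates (N,2) with a pinhole camera."""
    Xc = (R @ X.T).T + t          # camera-frame coordinates
    uv = (K @ Xc.T).T             # homogeneous image coordinates
    return uv[:, :2] / uv[:, 2:3]

rng = np.random.default_rng(0)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])
X = rng.uniform(-1.0, 1.0, size=(6, 3))   # model points at the true scale

u_true = project(K, R, t, X)              # observations from the true-scale model
u_scaled = project(K, R, t, 0.5 * X)      # same pose, model shrunk by half
# The mis-scaled model projects to clearly different pixels, so a solver that
# trusts the stated model scale must absorb the discrepancy into its pose estimate.
print("max pixel shift from halving the model scale:",
      np.abs(u_true - u_scaled).max())
```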
This study incorporates imprecise 3D models into the estimation framework and proposes a statistically grounded, consistent PnP solver. The method proceeds as follows. First, the unknown scaling factor and the camera pose are treated as parameters to be jointly estimated; by refining the measurement model and eliminating variables, a system of linear equations is formulated from the original projection model, and its least-squares solution is derived. Next, a generalized eigenvalue problem is solved to obtain a consistent estimate of the projection-noise variance, which is used to eliminate the estimator's bias, yielding debiased, accurate estimates of the scaling factor and camera pose. Finally, Gauss–Newton iteration refines the solution, further enhancing estimation precision.
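The final Gauss–Newton refinement stage can be sketched generically. The snippet below is not the authors' solver: it refines only the translation with the rotation held fixed (the full method also updates rotation and the scale factor), but it shows the structure of a reprojection-error Gauss–Newton step, i.e. stacking residuals, forming the analytic Jacobian, and solving the normal equations.

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of 3D points X (N,3) to pixels (N,2)."""
    Xc = (R @ X.T).T + t
    uv = (K @ Xc.T).T
    return uv[:, :2] / uv[:, 2:3]

def gauss_newton_translation(K, R, t0, X, u_obs, iters=10):
    """Refine the translation by Gauss-Newton on the reprojection residual.
    Rotation is held fixed to keep the sketch short; a full solver would
    also update the rotation (and, in the paper's setting, the scale)."""
    fx, fy = K[0, 0], K[1, 1]
    t = np.asarray(t0, dtype=float).copy()
    for _ in range(iters):
        Xc = (R @ X.T).T + t
        x, y, z = Xc[:, 0], Xc[:, 1], Xc[:, 2]
        r = (project(K, R, t, X) - u_obs).ravel()       # stacked residuals (2n,)
        J = np.zeros((2 * len(X), 3))                   # Jacobian wrt t
        J[0::2] = np.column_stack([fx / z, np.zeros_like(z), -fx * x / z**2])
        J[1::2] = np.column_stack([np.zeros_like(z), fy / z, -fy * y / z**2])
        t -= np.linalg.solve(J.T @ J, J.T @ r)          # normal-equations step
    return t
```

Given a reasonable initial pose from the linear stage, a handful of such iterations drives the reprojection residual to numerical precision on noise-free data.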
In synthetic data experiments, results show that as the standard deviation of input noise increases, the proposed method consistently maintains minimal errors with a slower growth rate compared to other state-of-the-art approaches (Fig. 2). When benchmarked against APnP, our method achieves lower errors in both pose estimation and scaling factor recovery (Fig. 3). Error statistics reveal that the proposed method improves the accuracy of rotation matrix and translation vector estimation by at least 8.67% and 11.71%, respectively (Table 1), while simultaneously achieving the shortest computation time, improving operational efficiency by at least 14.36% (Table 2).
In real data experiments using the KeypointNet dataset, the proposed method exhibits superior alignment with ground truth and significantly outperforms APnP in error metrics (Fig. 4). For experiments on the ETH3D Benchmark dataset, four representative images were selected as test cases (Fig. 5). Results indicate that the proposed method achieves the lowest errors and shortest runtime among comparative methods (Fig. 6), with a minimum of 10.06% improvement in rotation matrix accuracy and at least 14.19% enhancement in translation vector estimation across relevant scenarios (Tables 3 and 4). Furthermore, in extreme scaling factor experiments, the proposed method exhibits the lowest error margins compared to other methods (Fig. 7), confirming its robust performance under extreme scaling conditions.
To address the dual challenges of input 3D model inaccuracy and estimator bias in PnP problems, this study proposes a novel solver that resolves both issues simultaneously, enhancing pose estimation accuracy while improving computational efficiency. The method employs linearization techniques to streamline the solving process, significantly improving computational performance. Furthermore, grounded in statistical principles, a bias elimination mechanism is introduced to mitigate deviations induced by projection noise, further refining pose estimation precision.
For rigorous validation, the proposed method is benchmarked against state-of-the-art PnP algorithms. Comprehensive experiments on both synthetic and real-world datasets demonstrate that it achieves superior accuracy with the lowest error margins and highest computational efficiency, conclusively validating its methodological advancements.
Xiaoyan Zhou, Futao Lu, Qida Yu, Bo Ni, Guili Xu. Accurate Camera Pose Estimation for Model Scale Scaling Interference[J]. Chinese Journal of Lasers, 2025, 52(17): 1704001
Category: Measurement and metrology
Received: Feb. 24, 2025
Accepted: Apr. 24, 2025
Published Online: Sep. 17, 2025
The Author Email: Qida Yu (003550@nuist.edu.cn)
CSTR:32183.14.CJL250554