Chinese Journal of Lasers, Volume. 52, Issue 17, 1704001(2025)

Accurate Camera Pose Estimation for Model Scale Scaling Interference

Xiaoyan Zhou1, Futao Lu1, Qida Yu1,*, Bo Ni1, and Guili Xu2
Author Affiliations
  • 1School of Electronic and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 201144, Jiangsu, China
  • 2School of Automation, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, Jiangsu, China

    Objective

    The perspective-n-point (PnP) problem is a critical research task in computer vision, with extensive applications in robotics, augmented reality, autonomous driving, and other domains, holding significant academic and practical value. Traditional PnP methodologies typically rely on precise 3D models and impose stringent requirements on the accuracy of 3D coordinates. This dependency leads to significant degradation in estimation precision when input 3D models undergo scale variations. Although substantial progress has been made in conventional approaches, their core assumption—that the object’s 3D model must be fully accurate and of known scale—often fails in practical scenarios, constituting a key bottleneck that limits algorithmic robustness and precision. Furthermore, most traditional methods are predicated on idealized projection models, neglecting the influence of projection noise. Consequently, they lack rigorous analysis of the statistical properties of the estimator, resulting in systematic bias in the estimation results. To address these limitations, this study proposes a linear methodology that mitigates interference from imprecise 3D model inputs and achieves unbiased, precise estimation results.
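    To make the scale-interference problem concrete, the sketch below (illustrative NumPy code, not the authors' solver; all numeric values are hypothetical) shows why a mis-scaled 3D model defeats a naive PnP solver: scaling the model by a factor s while scaling the translation by the same factor produces exactly the same image points, so a solver that ignores the scale absorbs it into a wrong translation.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def project(K, R, t, X):
        """Pinhole projection: x ~ K (R X + t), returned as inhomogeneous pixels."""
        p = (X @ R.T + t) @ K.T      # homogeneous image points, one per row
        return p[:, :2] / p[:, 2:]

    # Hypothetical intrinsics and ground-truth pose.
    K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
    c, s_ = np.cos(0.3), np.sin(0.3)
    R = np.array([[c, -s_, 0], [s_, c, 0], [0, 0, 1]])
    t = np.array([0.2, -0.1, 6.0])

    X = rng.uniform(-1, 1, (20, 3))   # true 3D model
    s = 1.7                           # unknown model scaling factor

    # pi(R(sX) + st) = pi(s(RX + t)) = pi(RX + t): the scaled model with a
    # scaled translation is image-indistinguishable from the true configuration.
    x_true  = project(K, R, t, X)
    x_scale = project(K, R, s * t, s * X)
    print(np.max(np.abs(x_true - x_scale)))   # effectively zero (float noise)
    ```

    This projective ambiguity is why the scaling factor must be modeled explicitly, as the proposed method does, rather than left to corrupt the translation estimate.
    
    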

    Methods

    This study incorporates imprecise 3D models into the framework and proposes a statistically grounded consistent PnP solver. The methodology proceeds as follows: First, both the unknown scaling factor and camera pose are treated as parameters to be jointly estimated. By refining the measurement model and employing variable elimination, a system of linear equations is formulated based on the original projection model, from which the least-squares solution is derived. Subsequently, a generalized eigenvalue problem is solved to obtain a consistent estimate of the projection noise variance, which is used to eliminate the estimator’s bias, yielding debiased and accurate estimates of the scaling factor and camera pose. Finally, Gauss-Newton iteration is applied to refine the solution, further enhancing estimation precision.
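    The final Gauss-Newton refinement stage can be illustrated with a generic sketch. The example below refines only the translation against the reprojection residual (rotation and scale held fixed for brevity); it is a minimal stand-in for the full joint refinement described above, and all names and values are illustrative rather than the authors' implementation.

    ```python
    import numpy as np

    def project(K, R, t, X):
        """Pinhole projection to inhomogeneous pixel coordinates."""
        p = (X @ R.T + t) @ K.T
        return p[:, :2] / p[:, 2:]

    def gauss_newton_translation(K, R, t0, X, x_obs, iters=20):
        """Refine t by Gauss-Newton on the reprojection error
        (the full solver would also update the rotation and scale)."""
        t = t0.astype(float).copy()
        for _ in range(iters):
            p = (X @ R.T + t) @ K.T                    # homogeneous points
            r = (x_obs - p[:, :2] / p[:, 2:]).ravel()  # stacked residuals
            # Jacobian of pi(K(RX + t)) w.r.t. t, two rows per point:
            # d(px/pz)/dt = (K_row0 * pz - px * K_row2) / pz^2, etc.
            J = np.zeros((2 * len(X), 3))
            for i, (px, py, pz) in enumerate(p):
                J[2 * i]     = (K[0] * pz - px * K[2]) / pz**2
                J[2 * i + 1] = (K[1] * pz - py * K[2]) / pz**2
            t += np.linalg.lstsq(J, r, rcond=None)[0]  # GN update step
        return t
    ```

    In a full solver this loop would run from the debiased linear estimate, so only a few iterations are needed to reach a local minimum of the reprojection error.
    
    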

    Results and Discussions

    In synthetic data experiments, results show that as the standard deviation of input noise increases, the proposed method consistently maintains minimal errors with a slower growth rate compared to other state-of-the-art approaches (Fig. 2). When benchmarked against APnP, our method achieves lower errors in both pose estimation and scaling factor recovery (Fig. 3). Error statistics reveal that the proposed method improves the accuracy of rotation matrix and translation vector estimation by at least 8.67% and 11.71%, respectively (Table 1), while simultaneously achieving the shortest computation time, improving operational efficiency by at least 14.36% (Table 2).

    In real data experiments using the KeypointNet dataset, the proposed method exhibits superior alignment with ground truth and significantly outperforms APnP in error metrics (Fig. 4). For experiments on the ETH3D Benchmark dataset, four representative images were selected as test cases (Fig. 5). Results indicate that the proposed method achieves the lowest errors and shortest runtime among comparative methods (Fig. 6), with a minimum of 10.06% improvement in rotation matrix accuracy and at least 14.19% enhancement in translation vector estimation across relevant scenarios (Tables 3 and 4). Furthermore, in extreme scaling factor experiments, the proposed method exhibits the lowest error margins compared to other methods (Fig. 7), confirming its robust performance under extreme scaling conditions.

    Conclusions

    To address the dual challenges of input 3D model inaccuracy and estimator bias in PnP problems, this study proposes a novel solver capable of resolving both issues simultaneously, aiming to enhance pose estimation accuracy while optimizing computational efficiency. The methodology employs linearization techniques to streamline the solving process, significantly improving computational performance. Furthermore, grounded in statistical principles, a bias elimination mechanism is introduced to mitigate deviations induced by projection noise, further refining pose estimation precision.

    For rigorous validation, the proposed method is benchmarked against state-of-the-art PnP algorithms. Comprehensive experiments on both synthetic and real-world datasets demonstrate that it achieves superior accuracy with the lowest error margins and highest computational efficiency, conclusively validating its methodological advancements.

    Xiaoyan Zhou, Futao Lu, Qida Yu, Bo Ni, Guili Xu. Accurate Camera Pose Estimation for Model Scale Scaling Interference[J]. Chinese Journal of Lasers, 2025, 52(17): 1704001

    Paper Information

    Category: Measurement and metrology

    Received: Feb. 24, 2025

    Accepted: Apr. 24, 2025

    Published Online: Sep. 17, 2025

    Corresponding Author Email: Qida Yu (003550@nuist.edu.cn)

    DOI: 10.3788/CJL250554

    CSTR: 32183.14.CJL250554
